# Semantic Kernel Introduction - Enhanced Edition

This notebook provides an introduction to Microsoft Semantic Kernel, a powerful framework for building AI-powered applications. We'll explore the core concepts, compare different approaches, and understand how Semantic Kernel simplifies AI integration.

## Setup and Configuration

First, let's load our configuration settings. This includes API keys, endpoints, and model configurations that we'll use throughout this notebook.

If you haven't done so already, run the notebook [0-AI-settings](./0-AI-settings.ipynb), which collects the necessary settings interactively.

In [5]:
// Load helper functions and configuration settings
// This imports utility functions for loading API keys, endpoints, and other settings from settings.json
#!import config/Settings.cs 

// Load configuration from settings.json file
// This returns a tuple with all the necessary configuration values
var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId, embeddingEndpoint, embeddingApiKey) = Settings.LoadFromFile();

Console.WriteLine($"Configuration loaded:");
Console.WriteLine($"- Model: {model}");
Console.WriteLine($"- Endpoint: {azureEndpoint}");
Console.WriteLine($"- API Key configured: {!string.IsNullOrEmpty(apiKey)}");

Configuration loaded:
- Model: gpt-4o
- Endpoint: https://eastus.api.cognitive.microsoft.com/
- API Key configured: True


## Approach 1: Direct AI API Calls with Microsoft.Extensions.AI

Before diving into Semantic Kernel, let's see how you would typically make direct calls to AI services using Microsoft.Extensions.AI. This will help us understand what Semantic Kernel abstracts away and why it's valuable.

### Microsoft.Extensions.AI Overview

Microsoft.Extensions.AI is a unified abstraction layer for AI services that provides:
- Consistent interfaces across different AI providers
- Built-in logging, telemetry, and caching
- Dependency injection integration
- A composable middleware pipeline for layering on cross-cutting behavior such as rate limiting or retries


In [6]:
// Import required NuGet packages for Microsoft.Extensions.AI
#r "nuget: Azure.Core, 1.49.0"
#r "nuget: Microsoft.Extensions.AI.Abstractions, 9.10.0.0"
#r "nuget: Microsoft.Extensions.AI, 9.7.1"
#r "nuget: Azure.AI.OpenAI, 2.5.0-beta.1"
#r "nuget: Microsoft.Extensions.AI.OpenAI, 9.7.1-preview.1.25365.4"


Console.WriteLine("Microsoft.Extensions.AI packages loaded successfully.");

Microsoft.Extensions.AI packages loaded successfully.


In [7]:
// Create a chat client using Microsoft.Extensions.AI

using System.ClientModel;
using Microsoft.Extensions.AI; 
using Azure.AI.OpenAI;

IChatClient chatClient = new AzureOpenAIClient(
        new Uri(azureEndpoint),           // Azure OpenAI endpoint URL
        new ApiKeyCredential(apiKey))     // API key for authentication
    .GetChatClient(model)                // Get chat client for specific model
    .AsIChatClient();                    // Convert to Microsoft.Extensions.AI interface

// Make a direct API call to the AI service
// This is a simple request-response pattern
ChatResponse response = await chatClient.GetResponseAsync(
    new ChatMessage(ChatRole.User, "how do i get the k8s pod that consume the most resources"));

Console.WriteLine("Direct API Response:");
Console.WriteLine("====================");
Console.WriteLine(response.Text);

// Note: With direct API calls, you typically need to:
// 1. Construct and configure the client (endpoint, credentials) yourself
// 2. Handle conversation history yourself
// 3. Implement prompt templating from scratch
// 4. Wire up plugin/function calling yourself

Direct API Response:
To identify the Kubernetes pod consuming the most resources (CPU or memory), you can use a combination of `kubectl top pods` and other commands to analyze the resource utilization. Below are some methods to achieve this:

### Prerequisites
1. Ensure that the `metrics-server` is deployed in your Kubernetes cluster, as it is required to pull resource usage metrics from pods.
2. Install `kubectl` CLI and ensure you have access to your Kubernetes cluster.

---

### Steps to Find the Pod Consuming the Most Resources

#### 1. Check Resource Usage Across All Namespaces
```bash
kubectl top pods --all-namespaces
```
This will display resource usage for all pods across all namespaces, including CPU and memory consumption.

#### 2. Sort by CPU Usage
To find the pod consuming the most CPU:
```bash
kubectl top pods --all-namespaces --sort-by=cpu
```

#### 3. Sort by Memory Usage
To find the pod consuming the most memory:
```bash
kubectl top pods --all-namespaces --sort-by=memor
```

## Approach 2: Using Semantic Kernel

Now let's see how Semantic Kernel simplifies AI integration and provides additional capabilities.

### Why Semantic Kernel?

While direct API calls work, Semantic Kernel provides several advantages:

1. **Higher-level abstractions**: Focus on your business logic, not AI plumbing
2. **Built-in prompt templating**: Powerful template system with variable substitution
3. **Plugin system**: Easy integration of custom functions and external APIs
4. **Orchestration capabilities**: Automatic function calling lets the model select and chain plugin functions for multi-step workflows
5. **Memory management**: Built-in conversation history and context management
6. **Enterprise features**: Logging, telemetry, security, and scalability


In [9]:
// Import Semantic Kernel
#r "nuget: Microsoft.SemanticKernel, 1.67.1"
using Microsoft.SemanticKernel;
using Kernel = Microsoft.SemanticKernel.Kernel;  // Alias to avoid a clash with the notebook's own Microsoft.DotNet.Interactive.Kernel

Console.WriteLine("Semantic Kernel package loaded successfully.");

Semantic Kernel package loaded successfully.


### Creating and Configuring a Kernel

The `Kernel` is the central orchestrator in Semantic Kernel. Think of it as the "brain" that coordinates all AI operations, manages services, and executes prompts and functions.

In [10]:
// Create a Kernel builder - this is the factory for creating Kernel instances
IKernelBuilder kernelBuilder = Kernel.CreateBuilder();

// Add Azure OpenAI chat completion service to the kernel
// This registers the AI service with the dependency injection container
kernelBuilder.AddAzureOpenAIChatCompletion(
    model,          // The Azure OpenAI deployment name (here "gpt-4o", the same as the model name)
    azureEndpoint,  // Your Azure OpenAI endpoint
    apiKey          // Your API key
);

// Build the kernel instance
// This creates a fully configured kernel ready for use
Kernel kernel = kernelBuilder.Build();

Console.WriteLine("Kernel created and configured successfully.");

Kernel created and configured successfully.


### Simple Prompt Execution

Now let's execute the same query using Semantic Kernel. Notice that the whole request is a single call on the kernel: there is no client to construct and no message object to build.

In [11]:
// Execute a simple prompt using Semantic Kernel
// InvokePromptAsync is a high-level method that handles all the complexity
var result = await kernel.InvokePromptAsync("how do i get the k8s pod that consume the most resources");

Console.WriteLine("Semantic Kernel Response:");
Console.WriteLine("=========================");
Console.WriteLine(result);

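// Benefits of using InvokePromptAsync:
// 1. Automatic service resolution (no need to manually get the chat service)
// 2. The prompt is treated as a template, so variables and function calls can be embedded in it
// 3. Logging and telemetry hooks, once a logger factory is registered with the kernel builder
// 4. Consistent API regardless of the underlying AI service
// Note: retries for transient failures come from the underlying client SDK, not from InvokePromptAsync itself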

Semantic Kernel Response:
To identify the Kubernetes pod that is consuming the most resources (CPU and/or memory), you can use commands or tools to extract and analyze the resource consumption data. Here's how you can do that:

---

## CLI Approach Using `kubectl top`:
The `kubectl top` command fetches the resource consumption metrics for nodes or pods. It requires the `metrics-server` to be installed in your Kubernetes cluster.

### Steps:
1. Make sure the `metrics-server` is deployed in your cluster. If it's not, you can install it using:
   ```bash
   kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
   ```

2. To get resource usage for all pods in a specific namespace:
   ```bash
   kubectl top pods -n <namespace> --sort-by=<resource>
   ```
   Replace `<namespace>` with the desired namespace and `<resource>` with either `cpu` or `memory`.

   Example:
   ```bash
   kubectl top pods -n default --sort-by=cpu
   ```

3. To lis
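
`InvokePromptAsync` also accepts a `KernelArguments` instance, which can carry execution settings such as temperature or a token cap. The following is a minimal sketch, assuming the `kernel` built above and the `OpenAIPromptExecutionSettings` type from the OpenAI connector (which the Azure OpenAI connector also accepts). The variable names `settings` and `tunedResult` and the specific values are illustrative assumptions, not recommendations.

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Illustrative values (assumptions): a low temperature for more repeatable
// answers and a cap on the number of generated tokens.
var settings = new OpenAIPromptExecutionSettings
{
    Temperature = 0.2,
    MaxTokens = 300
};

// KernelArguments can carry execution settings alongside template variables.
var tunedResult = await kernel.InvokePromptAsync(
    "how do i get the k8s pod that consume the most resources",
    new KernelArguments(settings));

Console.WriteLine(tunedResult.GetValue<string>());
```

Passing settings per call keeps the kernel configuration unchanged, so other prompts in the notebook continue to use the service defaults.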

### Working with Chat Completion Services Directly

Sometimes you need more control over the AI interaction. Semantic Kernel allows you to access the underlying chat completion service while still benefiting from its abstractions.

In [12]:
// Import the chat completion namespace
using Microsoft.SemanticKernel.ChatCompletion;

// Get the chat completion service from the kernel's dependency injection container
// This gives you direct access to the underlying AI service
IChatCompletionService completionService = kernel.GetRequiredService<IChatCompletionService>();

Console.WriteLine($"Chat completion service type: {completionService.GetType().Name}");
Console.WriteLine($"Service attributes: {string.Join(", ", completionService.Attributes.Select(x => $"{x.Key}={x.Value}"))}");

Chat completion service type: AzureOpenAIChatCompletionService
Service attributes: DeploymentName=gpt-4o


In [13]:
// Use the chat completion service directly
// This gives you more control over the request/response cycle
var chatResult = await completionService.GetChatMessageContentAsync(
    "how do i get the k8s pod that consume the most resources");

Console.WriteLine("Direct Chat Completion Service Response:");
Console.WriteLine("========================================");
Console.WriteLine(chatResult);

// You can also access additional properties:
Console.WriteLine($"\nResponse metadata:");
Console.WriteLine($"- Role: {chatResult.Role}");
Console.WriteLine($"- Model ID: {chatResult.ModelId}");
Console.WriteLine($"- Content length: {chatResult.Content?.Length ?? 0} characters");

// This approach is useful when you need:
// 1. Access to response metadata
// 2. Custom message formatting
// 3. Advanced conversation management
// 4. Integration with existing chat systems

Direct Chat Completion Service Response:
To identify the Kubernetes (k8s) pod consuming the most resources (CPU or memory), you can rely on various tools and approaches, such as `kubectl`, Kubernetes metrics server, and monitoring solutions like Prometheus or Grafana. Below are some methods to achieve this:

---

### **1. Using `kubectl` and Metrics Server**

To use these commands, ensure the Kubernetes Metrics Server is installed in your cluster.

#### **Step 1: Find the Pod Consuming the Most CPU**
```bash
kubectl top pod --all-namespaces --sort-by=cpu
```

#### **Step 2: Find the Pod Consuming the Most Memory**
```bash
kubectl top pod --all-namespaces --sort-by=memory
```

The `kubectl top pod` command retrieves resource utilization (CPU and memory) metrics for pods in your cluster. Using the `--sort-by` flag, you can sort the results by the resources being consumed.

---

### **2. Using Custom `kubectl` Commands to Filter and Sort**
If the `kubectl top pod` output is verbose, you c
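
The "advanced conversation management" mentioned in the cell above usually means keeping a `ChatHistory` and resending it on every turn. The following is a minimal sketch, assuming the `completionService` obtained earlier. The system message, the follow-up question, and the variable names are made up for illustration.

```csharp
using Microsoft.SemanticKernel.ChatCompletion;

// ChatHistory is an ordered list of messages; the constructor argument is the system message.
var history = new ChatHistory("You are a concise Kubernetes assistant.");
history.AddUserMessage("how do i get the k8s pod that consume the most resources");

// First turn: send the whole history, then append the assistant's reply to it.
var firstReply = await completionService.GetChatMessageContentAsync(history);
history.Add(firstReply);

// Second turn: the follow-up relies on the earlier turns being resent with it.
history.AddUserMessage("Show only the single command for memory.");
var secondReply = await completionService.GetChatMessageContentAsync(history);
history.Add(secondReply);

Console.WriteLine(secondReply.Content);
Console.WriteLine($"Messages in history: {history.Count}");
```

The service itself is stateless, so the history object is the only place the conversation lives. Trimming or summarizing it is left to the caller.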

## Advanced Prompt Templating

One of Semantic Kernel's most powerful features is its prompt templating system. This allows you to create reusable, parameterized prompts that can be dynamically filled with data.

### Template Syntax

Semantic Kernel uses a simple but powerful templating syntax:
- `{{$variableName}}` - Substitutes a variable value
- `{{functionName}}` - Calls a function
- `{{plugin.functionName}}` - Calls a function from a specific plugin
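
The variable and plugin-function forms above can be tried without calling a model, because rendering a template is a local operation. The following is a minimal sketch, assuming the Semantic Kernel package is already loaded. The `text` plugin, its `CountWords` function, and the variable names are hypothetical and exist only for this illustration.

```csharp
using System;
using Microsoft.SemanticKernel;

// A kernel with no AI service registered: rendering a template does not call a model.
var renderKernel = new Kernel();

// Hypothetical "text" plugin with one function, built from a plain lambda.
var countWords = KernelFunctionFactory.CreateFromMethod(
    (string input) => input.Split(' ', StringSplitOptions.RemoveEmptyEntries).Length.ToString(),
    functionName: "CountWords");
renderKernel.ImportPluginFromFunctions("text", new[] { countWords });

// {{$input}} substitutes a variable; {{text.CountWords $input}} calls the
// plugin function and passes the variable as its first parameter.
var template = new KernelPromptTemplateFactory().Create(
    new PromptTemplateConfig("Text: {{$input}} | Words: {{text.CountWords $input}}"));

string rendered = await template.RenderAsync(
    renderKernel,
    new KernelArguments { ["input"] = "A robot may not injure a human being" });

Console.WriteLine(rendered);
```

Rendering separately like this is also a convenient way to inspect the exact prompt text before it is sent to a model.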

Let's see this in action with a text summarization example.

In [14]:
// Define a prompt template with variable substitution
// The {{$input}} placeholder will be replaced with actual content
string skPrompt = @"
{{$input}}

Give me the TLDR in 5 words.
";

// Sample text to summarize (Asimov's Three Laws of Robotics)
var textToSummarize = @"
    1) A robot may not injure a human being or, through inaction,
    allow a human being to come to harm.

    2) A robot must obey orders given it by human beings except where
    such orders would conflict with the First Law.

    3) A robot must protect its own existence as long as such protection
    does not conflict with the First or Second Law.
";

Console.WriteLine("Prompt template created:");
Console.WriteLine("=======================");
Console.WriteLine(skPrompt);
Console.WriteLine("\nText to summarize:");
Console.WriteLine("==================");
Console.WriteLine(textToSummarize);

Prompt template created:

{{$input}}

Give me the TLDR in 5 words.


Text to summarize:

    1) A robot may not injure a human being or, through inaction,
    allow a human being to come to harm.

    2) A robot must obey orders given it by human beings except where
    such orders would conflict with the First Law.

    3) A robot must protect its own existence as long as such protection
    does not conflict with the First or Second Law.



In [15]:
// Execute the prompt template with variable substitution
// The kernel will replace {{$input}} with the value from the arguments dictionary
var result = await kernel.InvokePromptAsync(skPrompt, new() { ["input"] = textToSummarize });

Console.WriteLine("Summarization Result:");
Console.WriteLine("=====================");
Console.WriteLine(result);

// Benefits of prompt templating:
// 1. Reusable prompts - write once, use many times
// 2. Dynamic content - inject data at runtime
// 3. Maintainable - centralized prompt management
// 4. Flexible - variables are resolved from KernelArguments at runtime (note: they are not checked at compile time)
// 5. Composable - combine multiple templates and functions

Summarization Result:
Don't harm, obey, self-preserve.
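
To make the "write once, use many times" point concrete, the template can be wrapped in a named `KernelFunction` and invoked with different arguments. The following is a minimal sketch, assuming the `kernel`, `skPrompt`, and `textToSummarize` variables from the cells above. The function name `Summarize` and the second sample text are made up for illustration.

```csharp
using Microsoft.SemanticKernel;

// Wrap the template in a named function once...
KernelFunction summarize = kernel.CreateFunctionFromPrompt(skPrompt, functionName: "Summarize");

// ...then invoke it with different inputs. The second text is an invented sample.
string[] texts =
{
    textToSummarize,
    "Semantic Kernel lets .NET applications call large language models through a common kernel object."
};

foreach (var text in texts)
{
    FunctionResult summary = await kernel.InvokeAsync(
        summarize,
        new KernelArguments { ["input"] = text });
    Console.WriteLine(summary.GetValue<string>());
}
```

A function created this way can also be added to a plugin, which makes it callable from other templates through the `{{plugin.functionName}}` syntax.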


## Interoperability: Microsoft.Extensions.AI ↔ Semantic Kernel

One of the great features of the Microsoft AI ecosystem is interoperability. You can easily convert between Microsoft.Extensions.AI interfaces and Semantic Kernel services, allowing you to:

- Use existing Microsoft.Extensions.AI code with Semantic Kernel
- Leverage Semantic Kernel's advanced features in Microsoft.Extensions.AI applications
- Gradually migrate between the two approaches

**Important Note**: The conversion methods are currently experimental and marked with `SKEXP0001`.

In [16]:
// Disable the experimental API warning for demonstration purposes
#pragma warning disable SKEXP0001

// Convert Microsoft.Extensions.AI IChatClient to Semantic Kernel IChatCompletionService
IChatCompletionService chatCompletionServiceFromChatClient = chatClient.AsChatCompletionService();

// Convert Semantic Kernel IChatCompletionService to Microsoft.Extensions.AI IChatClient
IChatClient chatClientFromChatCompletionService = completionService.AsChatClient();

Console.WriteLine("Interoperability demonstration:");
Console.WriteLine("===============================");
Console.WriteLine($"Original IChatClient type: {chatClient.GetType().Name}");
Console.WriteLine($"Converted to IChatCompletionService: {chatCompletionServiceFromChatClient.GetType().Name}");
Console.WriteLine($"Original IChatCompletionService type: {completionService.GetType().Name}");
Console.WriteLine($"Converted to IChatClient: {chatClientFromChatCompletionService.GetType().Name}");

// Use cases for interoperability:
// 1. Legacy code migration - gradually adopt Semantic Kernel
// 2. Mixed architectures - use both frameworks in the same application
// 3. Testing - easier to mock and test with familiar interfaces
// 4. Third-party integration - work with libraries expecting specific interfaces

Interoperability demonstration:
Original IChatClient type: OpenAIChatClient
Converted to IChatCompletionService: ChatClientChatCompletionService
Original IChatCompletionService type: AzureOpenAIChatCompletionService
Converted to IChatClient: ChatCompletionServiceChatClient
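
The testing use case listed above can be sketched with a hand-written test double. The example below assumes the packages loaded earlier in the notebook. `CannedChatClient` is a hypothetical class written for this illustration (it is not part of either library), and it returns a fixed reply without any network call.

```csharp
#pragma warning disable SKEXP0001 // AsChatCompletionService is experimental

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;
using Microsoft.SemanticKernel.ChatCompletion;

// Hypothetical test double: always answers with the same text.
sealed class CannedChatClient : IChatClient
{
    private readonly string _reply;
    public CannedChatClient(string reply) => _reply = reply;

    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => Task.FromResult(new ChatResponse(new ChatMessage(ChatRole.Assistant, _reply)));

    public IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
        => throw new NotSupportedException("Streaming is not needed for this sketch.");

    public object? GetService(Type serviceType, object? serviceKey = null)
        => serviceKey is null && serviceType.IsInstanceOfType(this) ? this : null;

    public void Dispose() { }
}

// Expose the fake through the Semantic Kernel interface and call it like a real service.
IChatCompletionService fakeService = new CannedChatClient("pong").AsChatCompletionService();
var fakeReply = await fakeService.GetChatMessageContentAsync("ping");
Console.WriteLine(fakeReply.Content);
```

Code written against `IChatCompletionService` can then be exercised in unit tests without credentials or per-token cost.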


## Working with Local Models

Semantic Kernel isn't limited to cloud-based AI services. You can also use it with local models running on your machine. This is particularly useful for:

- **Privacy-sensitive applications** - Keep data on-premises
- **Cost optimization** - Avoid per-token charges
- **Offline scenarios** - Work without internet connectivity
- **Development and testing** - Use local models for development

### Example: Using LM Studio with Llama 3.2

[LM Studio](https://lmstudio.ai/) is a popular tool for running local language models. It provides an OpenAI-compatible API, making it easy to use with Semantic Kernel.

**Prerequisites**:
1. Install LM Studio
2. Download a compatible model (e.g., Llama 3.2)
3. Start the local server (typically on `http://127.0.0.1:1234`)

In [17]:
// Example configuration for local models using LM Studio
// Note: This will only work if you have LM Studio running locally

using Microsoft.SemanticKernel;
using Kernel = Microsoft.SemanticKernel.Kernel;

// Create a new kernel builder for local model
IKernelBuilder localKernelBuilder = Kernel.CreateBuilder();

// Add OpenAI chat completion pointing to local LM Studio server
// LM Studio provides an OpenAI-compatible API endpoint
localKernelBuilder.AddOpenAIChatCompletion(
    "llama-3.2-3b-instruct",           // Model name (as configured in LM Studio)
    new Uri("http://127.0.0.1:1234/v1"), // Local LM Studio endpoint
    ""                                   // No API key needed for local models
);

// Build the kernel for local use
Kernel localKernel = localKernelBuilder.Build();

Console.WriteLine("Local model kernel configuration:");
Console.WriteLine("==================================");
Console.WriteLine("Model: llama-3.2-3b-instruct");
Console.WriteLine("Endpoint: http://127.0.0.1:1234/v1");
Console.WriteLine("Status: Configured (requires LM Studio to be running)");

// Note: The following execution will only work if LM Studio is running
// Uncomment the lines below to test with a running local model:

/*
try
{
    var localResult = await localKernel.InvokePromptAsync("how do i get the k8s pod that consume the most resources");
    Console.WriteLine("\nLocal Model Response:");
    Console.WriteLine("=====================");
    Console.WriteLine(localResult);
}
catch (Exception ex)
{
    Console.WriteLine($"\nLocal model not available: {ex.Message}");
    Console.WriteLine("Make sure LM Studio is running with the specified model.");
}
*/

Local model kernel configuration:
Model: llama-3.2-3b-instruct
Endpoint: http://127.0.0.1:1234/v1
Status: Configured (requires LM Studio to be running)
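
Before uncommenting the call above, it can help to check whether the local server is actually listening. The following is a minimal sketch using only `HttpClient`. It assumes LM Studio serves the OpenAI-compatible model listing at `/v1/models`, and the helper name `IsLocalServerUpAsync` and its default timeout are made up for this illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Returns true when an OpenAI-compatible server answers GET {baseUrl}/models.
// Assumption: the local server exposes the model listing under its /v1 base path.
async Task<bool> IsLocalServerUpAsync(string baseUrl, int timeoutSeconds = 2)
{
    using var http = new HttpClient { Timeout = TimeSpan.FromSeconds(timeoutSeconds) };
    try
    {
        using var response = await http.GetAsync($"{baseUrl.TrimEnd('/')}/models");
        return response.IsSuccessStatusCode;
    }
    catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException)
    {
        // Connection refused or timed out: treat the server as unavailable.
        return false;
    }
}

bool serverUp = await IsLocalServerUpAsync("http://127.0.0.1:1234/v1");
Console.WriteLine(serverUp
    ? "Local server reachable; the commented-out call above should work."
    : "Local server not reachable; start LM Studio before invoking localKernel.");
```

A short timeout keeps the notebook responsive when the server is not running.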
