# Introduction to Semantic Kernel


## Lab Introduction

In this lab, we’ll explore Microsoft’s **Semantic Kernel**, a lightweight, open-source SDK for integrating large language models (LLMs) into Python applications. You’ll configure the Semantic Kernel environment, define semantic and native functions, and build a basic AI agent with chat capabilities.

### Why Use Semantic Kernel?

* **Middleware for LLMs**: It manages prompt generation and LLM interaction, automatically calling registered functions when needed.
* **Modular and Extensible**: Easily plug in your own code via native plugins or even OpenAPI specs.
* **Cross‑language Support**: Though we’re using Python here, Semantic Kernel also supports C# and Java.

### What You’ll Achieve:

* Set up the Kernel and connect to Azure/OpenAI LLM backends
* Create and invoke **semantic functions** (using prompt templates)
* Build **native plugins** to extend functionality via real code
* Chain prompts and functions into coherent workflows
* Optionally, explore LLM agents, memory, vector store integration, and grounding

## Initial Setup

Install the Semantic Kernel SDK from PyPI.

In [None]:
# Note: if using a virtual environment, do not run this cell
# %pip install -qU semantic-kernel
%pip install --upgrade semantic-kernel openai
from semantic_kernel import __version__

__version__

Initial configuration for the notebook to run properly.

In [1]:
# Make sure paths are correct for the imports

import os
import sys

notebook_dir = os.getcwd()
parent_dir = os.path.dirname(notebook_dir)
grandparent_dir = os.path.dirname(parent_dir)


sys.path.append(grandparent_dir)
print(grandparent_dir)

d:\OneDrive - bookstruck1\Azure\Microsoft Workshops\AgenticAI\Labs


In [None]:
import os

GLOBAL_LLM_SERVICE="AzureOpenAI"
AZURE_OPENAI_API_KEY=""
AZURE_OPENAI_ENDPOINT=""
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME="gpt-4o-mini"
AZURE_OPENAI_TEXT_DEPLOYMENT_NAME="gpt-4o-mini"
AZURE_OPENAI_API_VERSION="2024-10-21"

os.environ["GLOBAL_LLM_SERVICE"] = GLOBAL_LLM_SERVICE
os.environ["AZURE_OPENAI_API_KEY"] = AZURE_OPENAI_API_KEY
os.environ["AZURE_OPENAI_ENDPOINT"] = AZURE_OPENAI_ENDPOINT
os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"] = AZURE_OPENAI_CHAT_DEPLOYMENT_NAME
os.environ["AZURE_OPENAI_TEXT_DEPLOYMENT_NAME"] = AZURE_OPENAI_TEXT_DEPLOYMENT_NAME
os.environ["AZURE_OPENAI_API_VERSION"] = AZURE_OPENAI_API_VERSION
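The cell above leaves the API key and endpoint blank for you to fill in. As a small convenience, the sketch below checks that the settings are non-empty before any service is created. The helper name `missing_settings` and the `REQUIRED_SETTINGS` list are illustrative assumptions, not part of the Semantic Kernel SDK.

```python
import os

# Settings the later cells read from the environment.
REQUIRED_SETTINGS = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_VERSION",
]


def missing_settings(env=None):
    """Return the names of required settings that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]


missing = missing_settings()
if missing:
    print("Fill in these settings before continuing:", ", ".join(missing))
```

Running this right after the configuration cell gives an early, readable hint instead of an authentication error later in the notebook.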

Let's define our kernel for this example.

In [13]:
from semantic_kernel import Kernel

kernel = Kernel()

We will load our settings and get the LLM service to use for the notebook.

In [14]:
selectedService = (
    "azureopenai"
)
print(f"Using service type: {selectedService}")

Using service type: azureopenai


We now configure our Chat Completion service on the kernel.

In [15]:
# Remove all services so that this cell can be re-run without restarting the kernel
kernel.remove_all_services()

from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

service_id = "default"
kernel.add_service(
    AzureChatCompletion(
        service_id=service_id,
        deployment_name=os.environ.get("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"),
        endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
        api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
        api_version=os.environ.get("AZURE_OPENAI_API_VERSION"),
    ),
)

# Run a Semantic Function

Let's load a Plugin and run a semantic function:


In [16]:
plugin = kernel.add_plugin(parent_directory="./plugins", plugin_name="Summarizer")

In [None]:
from semantic_kernel.functions import KernelArguments

summary_function = plugin["summarize"]

summary = await kernel.invoke(
    summary_function,
    KernelArguments(input="""
Azure Cognitive Search is a cloud-native search service that unlocks powerful full-text search, AI-powered cognitive skills (like OCR, key phrase extraction, and language detection), and integrated semantic ranking. It supports indexing from various data sources (Blob, SQL, Cosmos DB) with minimal setup and is ideal for building search-driven apps and knowledge agents."""),
)
print(summary)

# How to Run Prompt Plugins from a File

Now that you understand the basics of the Kernel, let's explore how you can execute Prompt Plugins and Prompt Functions that are saved on disk.

A Prompt Plugin consists of multiple Semantic Functions, each defined using natural language in a text file.
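To make the on-disk layout concrete, here is a minimal sketch that lists the functions in a plugin folder. It assumes the common layout where each function is a sub-folder containing a prompt file named `skprompt.txt` next to an optional `config.json`; the prompt file name and the helper `list_prompt_functions` are assumptions for illustration, and the folder is built in a temporary directory rather than read from this lab's `./plugins`.

```python
import tempfile
from pathlib import Path


def list_prompt_functions(plugin_dir):
    """Return the sub-folders of plugin_dir that contain a prompt file."""
    return sorted(
        child.name
        for child in Path(plugin_dir).iterdir()
        if child.is_dir() and (child / "skprompt.txt").is_file()
    )


# Build a throwaway plugin folder to demonstrate the assumed layout.
with tempfile.TemporaryDirectory() as root:
    function_dir = Path(root) / "Summarizer" / "summarize"
    function_dir.mkdir(parents=True)
    (function_dir / "skprompt.txt").write_text("{{$input}}\nSummarize the content above.")
    (function_dir / "config.json").write_text('{"schema": 1}')
    found = list_prompt_functions(Path(root) / "Summarizer")
    print(found)
```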

Let's look at what a prompt is and how to write one. For example, here is the `summarize` function from the Summarizer plugin:

```
SUMMARIZE THE TEXT BELOW IN **EXACTLY THREE** BULLET POINTS.

RULES
- EACH BULLET ≤ 25 WORDS
- NO FULL-SENTENCE QUOTES FROM THE SOURCE
- KEEP NEUTRAL, FACTUAL TONE (NO OPINIONS OR NEW INFO)

+++++
{{$input}}
+++++
```

Note the special **`{{$input}}`** token, which is a variable that is automatically passed when invoking the function, commonly referred to as a "function parameter".
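To build intuition for what happens to `{{$input}}`, the sketch below substitutes variables with a regular expression. It is a toy stand-in written for this explanation, not Semantic Kernel's actual template engine; the `render` helper and its handling of unknown variables are assumptions.

```python
import re

# Matches {{$name}} with optional whitespace inside the braces.
VARIABLE = re.compile(r"\{\{\s*\$(\w+)\s*\}\}")


def render(template, **variables):
    """Replace each {{$name}} with the matching keyword argument."""

    def substitute(match):
        name = match.group(1)
        # Unknown variables render as an empty string in this sketch.
        return str(variables.get(name, ""))

    return VARIABLE.sub(substitute, template)


prompt = "+++++\n{{$input}}\n+++++"
print(render(prompt, input="Some text to summarize."))
```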


In the same plugin folder you'll see a second file, `config.json`. The file is optional and sets parameters for the large language model, such as `max_tokens` and `temperature`.

```
{
  "schema": 1,
  "description": "Summarizes text into 3 bullet points",
  "type": "completion",
  "completion": {
    "max_tokens": 200,
    "temperature": 0.7,
    "top_p": 0.8
  },
  "input": {
    "parameters": [
      {
        "name": "input",
        "description": "The text to summarize",
        "default": ""
      }
    ]
  }
}

```
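Since the configuration is plain JSON, it can be inspected with the standard library. The sketch below inlines the same structure shown above and reads the model settings and the declared input parameters; it only illustrates the file's shape and is not how the kernel itself loads it.

```python
import json

# Same structure as the config.json shown above, inlined for the example.
config_text = """
{
  "schema": 1,
  "description": "Summarizes text into 3 bullet points",
  "type": "completion",
  "completion": {"max_tokens": 200, "temperature": 0.7, "top_p": 0.8},
  "input": {
    "parameters": [
      {"name": "input", "description": "The text to summarize", "default": ""}
    ]
  }
}
"""

config = json.loads(config_text)
settings = config["completion"]
parameter_names = [p["name"] for p in config["input"]["parameters"]]

print(settings["max_tokens"], settings["temperature"], parameter_names)
```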


Given a prompt function defined by these files, this is how to load and use a file-based prompt function.


In [18]:
# Remove all services so that this cell can be re-run without restarting the kernel
kernel.remove_all_services()

service_id = None

from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

service_id = "default"
kernel.add_service(
    AzureChatCompletion(
        service_id=service_id,
        deployment_name=os.environ.get("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"),
        endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
        api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
        api_version=os.environ.get("AZURE_OPENAI_API_VERSION"),
    ),
)

Import the plugin and all its functions:


In [19]:
plugin = kernel.add_plugin(parent_directory="./plugins", plugin_name="Summarizer")

summarizeFunction = plugin["summarize"]

How to use the plugin functions:


In [20]:
result = await kernel.invoke(summarizeFunction, input="""
Azure Cognitive Search is a cloud-native search service that unlocks powerful full-text search, AI-powered cognitive skills (like OCR, key phrase extraction, and language detection), and integrated semantic ranking. It supports indexing from various data sources (Blob, SQL, Cosmos DB) with minimal setup and is ideal for building search-driven apps and knowledge agents.""", style="silly")
print(result)

- Azure Cognitive Search is a cloud-native service for full-text search and AI cognitive skills.  
- It supports indexing from multiple data sources, including Blob, SQL, and Cosmos DB.  
- The service facilitates the development of search-driven applications and knowledge agents.  


# Running Prompt Functions Inline


In the previous section, we demonstrated how to define a semantic function using a prompt template saved in a file.

Now, we'll explore how to define semantic functions directly within your Python code using Semantic Kernel. This approach is helpful when:

- You need to generate prompts dynamically based on runtime logic
- You prefer editing prompts within Python rather than separate TXT files
- You want to quickly prototype or build demos, as in this section

Prompt templates use the SK template language, which lets you reference variables and functions.

For now, we'll focus on the `{{$input}}` variable, with more advanced templates to come.

Most semantic function prompts include `{{$input}}`, which is the standard way to pass content from context variables into your prompt.


Let's define our kernel for this example.

In [21]:
from semantic_kernel.kernel import Kernel

kernel = Kernel()

We now configure our Chat Completion service on the kernel.

In [22]:
# Remove all services so that this cell can be re-run without restarting the kernel
kernel.remove_all_services()

from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

service_id = "default"
kernel.add_service(
    AzureChatCompletion(
        service_id=service_id,
        deployment_name=os.environ.get("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"),
        endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
        api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
        api_version=os.environ.get("AZURE_OPENAI_API_VERSION"),
    ),
)

Let's use a prompt to create a semantic function that summarizes content.

The function takes the text to summarize as input.


In [24]:
from semantic_kernel.connectors.ai.open_ai import AzureChatPromptExecutionSettings, OpenAIChatPromptExecutionSettings
from semantic_kernel.prompt_template import InputVariable, PromptTemplateConfig

prompt = """{{$input}}
Summarize the content above.
"""

execution_settings = AzureChatPromptExecutionSettings(
    service_id=service_id,
    ai_model_id="gpt-4o-mini",
    max_tokens=2000,
    temperature=0.7,
)

prompt_template_config = PromptTemplateConfig(
    template=prompt,
    name="summarize",
    template_format="semantic-kernel",
    input_variables=[
        InputVariable(name="input", description="The user input", is_required=True),
    ],
    execution_settings=execution_settings,
)

summarize = kernel.add_function(
    function_name="summarizeFunc",
    plugin_name="summarizePlugin",
    prompt_template_config=prompt_template_config,
)

Set up some content to summarize. Here's an extract about Bill Gates, taken from Wikipedia ([source](https://en.wikipedia.org/wiki/Bill_Gates)).


In [25]:
input_text = """
Demo (Bill Gates)
William Henry Gates III (born October 28, 1955) is an American businessman and philanthropist. A pioneer of the microcomputer revolution of the 1970s and 1980s, he co-founded the software company Microsoft in 1975 with his childhood friend Paul Allen. Following the company's 1986 initial public offering (IPO), Gates became a billionaire in 1987—then the youngest ever, at age 31. Forbes magazine ranked him as the world's wealthiest person for 18 out of 24 years between 1995 and 2017, including 13 years consecutively from 1995 to 2007. He became the first centibillionaire in 1999, when his net worth briefly surpassed $100 billion. According to Forbes, as of May 2025, his net worth stood at US$115.1 billion, making him the thirteenth-richest individual in the world.

Born and raised in Seattle, Washington, Gates was privately educated at Lakeside School, where he befriended Allen and developed his computing interests. In 1973, he enrolled at Harvard College, where he took classes including Math 55 and graduate level computer science courses, but he dropped out in 1975 to co-found and lead Microsoft. He served as its CEO for the next 25 years and also became president and chairman of the board when the company incorporated in 1981. Succeeded as CEO by Steve Ballmer in 2000, he transitioned to chief software architect, a position he held until 2008. He stepped down as chairman of the board in 2014 and became technology adviser to CEO Satya Nadella and other Microsoft leaders, a position he still holds. He resigned from the board in 2020."""

...and run the summary function:


In [26]:
summary = await kernel.invoke(summarize, input=input_text)

print(summary)

William Henry Gates III, born on October 28, 1955, is an American businessman and philanthropist, best known for co-founding Microsoft in 1975 with Paul Allen. He became a billionaire in 1987 and was ranked as the world's wealthiest person for 18 out of 24 years between 1995 and 2017. Gates briefly became the first centibillionaire in 1999, and as of May 2025, his net worth is estimated at $115.1 billion, making him the thirteenth-richest individual globally.

Gates was educated at Lakeside School in Seattle, where he developed an interest in computing, and later enrolled at Harvard College. He dropped out in 1975 to focus on Microsoft, serving as CEO for 25 years, then transitioning to chief software architect until 2008. He stepped down as chairman in 2014 but continues to serve as a technology adviser. Gates resigned from the Microsoft board in 2020.


# Using ChatCompletion for Semantic Plugins


You can also use chat completion models for creating plugins. Normally you would have to tweak the API to accommodate system and user roles, but SK abstracts that away for you through `kernel.add_service` and `AzureChatCompletion`.


Here's an example of how to write an inline semantic function that gives a TL;DR for a piece of text using a ChatCompletion model.


In [27]:
kernel.remove_all_services()

service_id = None

from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

service_id = "default"
kernel.add_service(
    AzureChatCompletion(
        service_id=service_id,
        deployment_name=os.environ.get("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"),
        endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
        api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
        api_version=os.environ.get("AZURE_OPENAI_API_VERSION"),
    ),
)

In [28]:
from semantic_kernel.connectors.ai.open_ai import AzureChatPromptExecutionSettings, OpenAIChatPromptExecutionSettings

prompt = """
{{$input}}

IN **FIVE WORDS OR FEWER**, GIVE A TL;DR OF THE TEXT ABOVE.

RULES
- ≤ 5 total words (count them!)
- No direct quotes from the source
- Plain, neutral wording
- Avoid emojis, slang, or punctuation other than periods if essential
"""

text = """
    1) A sunrise reminds us that each day begins with limitless potential waiting to be shaped.

    2) Laughter shared between strangers can dissolve the weight of an entire afternoon.

    3) A single seed, buried in silence, can one day split rock with the persistence of its roots.
"""


execution_settings = AzureChatPromptExecutionSettings(
    service_id=service_id,
    ai_model_id="gpt-4o-mini",
    max_tokens=2000,
    temperature=0.7,
)

prompt_template_config = PromptTemplateConfig(
    template=prompt,
    name="tldr",
    template_format="semantic-kernel",
    input_variables=[
        InputVariable(name="input", description="The user input", is_required=True),
    ],
    execution_settings=execution_settings,
)

tldr_function = kernel.add_function(
    function_name="tldrFunction",
    plugin_name="tldrPlugin",
    prompt_template_config=prompt_template_config,
)

summary = await kernel.invoke(tldr_function, input=text)

print(f"Output: {summary}")

Output: Nature and connection inspire growth.


# Building a Simple Chat Experience with Kernel Arguments

This example demonstrates how to create a basic chatbot by passing and updating kernel arguments with each user interaction.

We introduce the Kernel Arguments object, which acts as a key-value store for data you provide to the kernel during execution.

Here, chat history is stored locally in memory and will not persist beyond this Jupyter session.

In later examples, we'll cover how to save chat history to disk for use in your own applications.

As you converse with the bot, the chat context accumulates the conversation history. Each time the kernel runs, it uses the current kernel arguments and chat history to inform the AI's responses.

This lays the foundation for later storing an arbitrary amount of data in an external vector store, beyond what fits in memory, at the cost of a little more latency.
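The accumulation idea can be shown without the SDK. The sketch below uses a toy `SimpleHistory` class (an assumption made for illustration, not Semantic Kernel's `ChatHistory`) to show how each turn is appended and how the rendered history is combined with the newest user message into one prompt.

```python
class SimpleHistory:
    """Toy stand-in for a chat history: an ordered list of (role, text) pairs."""

    def __init__(self):
        self.messages = []

    def add(self, role, text):
        self.messages.append((role, text))

    def __str__(self):
        # Rendered as plain "Role: text" lines so it can be dropped into a prompt.
        return "\n".join(f"{role}: {text}" for role, text in self.messages)


def build_prompt(history, user_input):
    """Combine the accumulated history with the newest user message."""
    return f"{history}\nUser: {user_input}\nChatBot: "


history = SimpleHistory()
history.add("User", "Hi, I'm looking for song suggestions")
history.add("ChatBot", "Sure! What genres do you like?")
print(build_prompt(history, "I love rock."))
```

Note that the history is converted to text before it reaches the prompt; the same idea applies when a history object is passed as a kernel argument.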


In [29]:
from semantic_kernel import Kernel

kernel = Kernel()

service_id = None

from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

service_id = "default"
kernel.add_service(
    AzureChatCompletion(
        service_id=service_id,
    ),
)

Let's define a prompt outlining a dialogue chat bot.


In [30]:
prompt = """
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.

{{$history}}
User: {{$user_input}}
ChatBot: """

Register your semantic function


In [31]:
from semantic_kernel.connectors.ai.open_ai import AzureChatPromptExecutionSettings, OpenAIChatPromptExecutionSettings
from semantic_kernel.prompt_template import PromptTemplateConfig
from semantic_kernel.prompt_template.input_variable import InputVariable

execution_settings = AzureChatPromptExecutionSettings(
    service_id=service_id,
    ai_model_id="gpt-4o-mini",
    max_tokens=2000,
    temperature=0.7,
)

prompt_template_config = PromptTemplateConfig(
    template=prompt,
    name="chat",
    template_format="semantic-kernel",
    input_variables=[
        InputVariable(name="user_input", description="The user input", is_required=True),
        InputVariable(name="history", description="The conversation history", is_required=True),
    ],
    execution_settings=execution_settings,
)

chat_function = kernel.add_function(
    function_name="chat",
    plugin_name="chatPlugin",
    prompt_template_config=prompt_template_config,
)

In [32]:
from semantic_kernel.contents import ChatHistory

chat_history = ChatHistory()
chat_history.add_system_message("You are a helpful chatbot who is good about giving song recommendations.")

Initialize the Kernel Arguments


In [36]:
from semantic_kernel.functions import KernelArguments

arguments = KernelArguments(user_input="Hi, I'm looking for song suggestions", history=str(chat_history))

Chat with the Bot


In [37]:
response = await kernel.invoke(chat_function, arguments)
print(response)

Sure! What kind of music are you in the mood for? Any specific genres or artists you like?


Update the history with the output


In [38]:
chat_history.add_assistant_message(str(response))

Keep Chatting!


In [39]:
async def chat(input_text: str) -> None:
    # Echo the new user message
    print(f"User: {input_text}")

    # Process the user message and get an answer.
    # The history is passed as a string because the prompt template
    # cannot automatically encode a ChatHistory object.
    answer = await kernel.invoke(
        chat_function, KernelArguments(user_input=input_text, history=str(chat_history))
    )

    # Show the response
    print(f"ChatBot: {answer}")

    # Record both turns so the next call sees them
    chat_history.add_user_message(input_text)
    chat_history.add_assistant_message(str(answer))

In [40]:
await chat("I love rock and metallica, I'd like to listen to metallica, any suggestion?")

User: I love rock and metallica, I'd like to listen to metallica, any suggestion?
ChatBot: If you love Metallica, you should definitely check out their iconic albums like "Master of Puppets" and "Ride the Lightning." Some standout tracks to listen to are "Enter Sandman," "One," and "Fade to Black." If you’re looking for something a bit different but still in the same vein, you might enjoy bands like Megadeth, Slayer, or Pantera. Let me know if you want more specific recommendations!


In [28]:
await chat("that sounds interesting, what is it about?")

User: that sounds interesting, what is it about?
ChatBot: Metallica's music often explores themes such as personal struggle, war, and societal issues. For example, "One" is a powerful anti-war song that tells the story of a soldier who has been severely injured and longs for freedom. "Master of Puppets" delves into addiction and the loss of control it brings. Their sound is characterized by heavy guitar riffs, fast tempos, and dynamic song structures, making them a defining band in the thrash metal genre. If you're interested in the lyrics or themes of specific songs, I can provide more details!


In [29]:
await chat("if I listen that song, what exactly will I feel?")

User: if I listen that song, what exactly will I feel?
ChatBot: Listening to Metallica's music can evoke a range of emotions. For example:

- **"One"**: You might feel a deep sense of empathy and sadness as it portrays the struggles of a soldier. The haunting melodies and intense lyrics can create a somber yet powerful atmosphere.

- **"Master of Puppets"**: This song often instills feelings of urgency and intensity. The fast-paced riffs and aggressive vocals can give you a rush, while the lyrics may provoke thoughts about the dangers of addiction and manipulation.

- **"Enter Sandman"**: This track can create a sense of tension and thrill. The eerie opening and driving beat can evoke feelings of excitement mixed with a bit of fear, as it touches on themes of nightmares and childhood fears.

Each song has its own unique vibe, so you might experience a mix of adrenaline, introspection, or catharsis depending on the track. Let me know if you want to dive deeper into any specific song!


In [30]:
await chat("could you list some more songs I could listen to?")

User: could you list some more songs I could listen to?
ChatBot: Absolutely! Here are some more Metallica songs you should check out:

1. **"The Unforgiven"** - A powerful ballad about struggle and regret.
2. **"Seek & Destroy"** - A classic that captures the raw energy of their early sound.
3. **"Creeping Death"** - A fan favorite that combines heavy riffs with epic storytelling.
4. **"Nothing Else Matters"** - A beautiful, introspective track that showcases their softer side.
5. **"For Whom the Bell Tolls"** - Known for its iconic intro and themes of mortality.
6. **"Sad But True"** - A heavy, groove-laden track that explores self-reflection.
7. **"The Day That Never Comes"** - A more modern take with a mix of melody and aggression.
8. **"Battery"** - A fast-paced opener from the "Master of Puppets" album that sets a high-energy tone.

These songs offer a great mix of their styles and themes. Enjoy your listening!


After chatting for a while, we have built a growing history, which we are attaching to each prompt and which contains the full conversation. Let's take a look!


In [31]:
print(chat_history)

<chat_history><message role="system"><text>You are a helpful chatbot who is good about giving song recommendations.</text></message><message role="assistant"><text>Sure! What kind of mood are you in or what genre of music do you like?</text></message><message role="user"><text>I love rock and metallica, I'd like to listen to metallica, any suggestion?</text></message><message role="assistant"><text>If you love Metallica, you should definitely check out their iconic albums like "Master of Puppets" and "Ride the Lightning." Some standout tracks to listen to are "Enter Sandman," "One," and "Fade to Black." If you’re looking for something a bit different but still in the same vein, you might enjoy bands like Megadeth, Slayer, or Pantera. Let me know if you want more specific recommendations!</text></message><message role="user"><text>that sounds interesting, what is it about?</text></message><message role="assistant"><text>Metallica's music often explores themes such as personal struggle, 

# Running Native Functions


In the previous section we learned how to execute semantic functions inline and how to run prompts from a file.

In this section, we'll show how to use native functions from a file. We will also show how to call semantic functions from native functions.

This can be useful in a few scenarios:

- Writing logic around how to run a prompt that changes the prompt's outcome.
- Using external data sources to gather data to concatenate into your prompt.
- Validating user input data prior to sending it to the LLM prompt.

Native functions are defined using standard Python code. The structure is simple, but not well documented at this point.
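As one illustration of the validation scenario listed above, the sketch below checks a user-supplied upper limit in plain Python before it would be handed to a prompt. The helper name `parse_upper_limit` and the default lower bound of 100 are assumptions chosen to match the example that follows.

```python
def parse_upper_limit(raw, lower=100):
    """Convert raw text to an int and check it is not below `lower`."""
    try:
        value = int(raw.strip())
    except ValueError:
        raise ValueError(f"Expected a whole number, got {raw!r}") from None
    if value < lower:
        raise ValueError(f"Upper limit must be at least {lower}, got {value}")
    return value


print(parse_upper_limit(" 120 "))
```

Logic like this could live inside a native function so that bad input is rejected before any tokens are spent on an LLM call.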


In [32]:
from semantic_kernel import Kernel

kernel = Kernel()

service_id = None

from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

service_id = "default"
kernel.add_service(
    AzureChatCompletion(
        service_id=service_id,
    ),
)

Let's create a **native** function that returns a random number between 100 and an upper limit supplied by the user. We'll pass this number to a semantic function to generate that many paragraphs of text.


First, let's create our native function.


In [36]:
import random

from semantic_kernel.functions import kernel_function


class GenerateNumberPlugin:
    """
    Description: Generate a number between 100-x.
    """

    @kernel_function(
        description="Generate a random number between 100-x",
        name="GenerateNumberHundredOrHigher",
    )
    def generate_number_hundred_or_higher(self, input: str) -> str:
        """
        Generate a number between 100-<input>
        Example:
            "102" => rand(100,102)
        Args:
            input -- The upper limit for the random number generation
        Returns:
            The generated number as a string
        """
        try:
            return str(random.randint(100, int(input)))
        except ValueError as e:
            print(f"Invalid input {input}")
            raise e

Next, let's create a semantic function that accepts a number as `{{$input}}` and generates that number of paragraphs about two engineers on an adventure. `$input` is a default variable semantic functions can use.


In [37]:
from semantic_kernel.connectors.ai.open_ai import AzureChatPromptExecutionSettings, OpenAIChatPromptExecutionSettings
from semantic_kernel.prompt_template import InputVariable, PromptTemplateConfig

prompt = """
Write a short story about two engineers on an adventure.
The story must be:
- G rated
- Have a positive message
- No sexism, racism or other bias/bigotry
- Be exactly {{$input}} paragraphs long. It must be this length.
"""


execution_settings = AzureChatPromptExecutionSettings(
    service_id=service_id,
    ai_model_id="gpt-4o-mini",
    max_tokens=2000,
    temperature=0.7,
)

prompt_template_config = PromptTemplateConfig(
    template=prompt,
    name="story",
    template_format="semantic-kernel",
    input_variables=[
        InputVariable(name="input", description="The user input", is_required=True),
    ],
    execution_settings=execution_settings,
)

engineer_story = kernel.add_function(
    function_name="EngineerStory",
    plugin_name="EngineerPlugin",
    prompt_template_config=prompt_template_config,
)

generate_number_plugin = kernel.add_plugin(GenerateNumberPlugin(), "GenerateNumberPlugin")

In [38]:
# Run the number generator
generate_number_hundred_or_higher = generate_number_plugin["GenerateNumberHundredOrHigher"]
number_result = await generate_number_hundred_or_higher(kernel, input="101")
print(number_result)

100


In [39]:
story = await engineer_story.invoke(kernel, input=number_result.value)

_Note: depending on which model you're using, the response may not contain the requested number of paragraphs. Because the prompt asks for 100 or more paragraphs, the cell can run for a long time; interrupt it once you have seen enough output._


In [40]:
print(f"Generating an engineer story exactly {number_result.value} paragraphs long.")
print("=====================================================")
print(story)

Generating an engineer story exactly 100 paragraphs long.
**Title: The Engineers' Quest**

**Paragraph 1:**
In a small town nestled between rolling hills and sparkling streams, two engineers named Alex and Sam had a dream. They were passionate about building things that could help their community and the world at large.

**Paragraph 2:**
One sunny morning, they gathered in their workshop, a cozy space filled with tools, blueprints, and the scent of fresh wood. Their eyes sparkled with excitement as they spoke about their next big project: a solar-powered water purifier.

**Paragraph 3:**
“Imagine clean water for everyone,” Sam exclaimed, sketching ideas on a whiteboard. “This could change lives!” Alex nodded in agreement, their shared enthusiasm fueling their creativity.

**Paragraph 4:**
After hours of brainstorming, they decided to take a weekend to prototype their invention. They packed their bags with tools, snacks, and a bright yellow tent, ready for an adventure in the nearby for

## Kernel Functions with Annotated Parameters

That works! But let's expand on our example to make it more generic.

For the native function, we'll introduce the lower limit variable. This means that a user will input two numbers and the number generator function will pick a number between the first and second input.

We'll use Python's `Annotated` type to describe these variables.


In [41]:
kernel.remove_all_services()

service_id = None

from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

service_id = "default"
kernel.add_service(
    AzureChatCompletion(
        service_id=service_id,
    ),
)

Let's start with the native function. Notice that we add the `@kernel_function` decorator, which holds the name of the function as well as an optional description. The input parameters are configured as part of the function's signature, and we use the `Annotated` type to describe each input argument.
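To see what `Annotated` carries, the standard library can read the metadata back from a function signature. The sketch below uses only `typing` and a placeholder function with the same signature as the plugin method; it shows that the descriptions are available for introspection, which is the kind of information a framework can surface to an LLM. How Semantic Kernel consumes this metadata internally is not shown here.

```python
from typing import Annotated, get_args, get_type_hints


def generate_number(
    min: Annotated[int, "the minimum number of paragraphs"],
    max: Annotated[int, "the maximum number of paragraphs"] = 10,
) -> Annotated[int, "the output is a number"]:
    """Placeholder body; only the signature matters for this example."""
    return min


# include_extras=True keeps the Annotated metadata instead of stripping it.
hints = get_type_hints(generate_number, include_extras=True)
for name, hint in hints.items():
    base_type, description = get_args(hint)
    print(f"{name}: {base_type.__name__} ({description})")
```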


In [42]:
import random
from typing import Annotated

from semantic_kernel.functions import kernel_function


class GenerateNumberPlugin:
    """
    Description: Generate a number between a min and a max.
    """

    @kernel_function(
        name="GenerateNumber",
        description="Generate a random number between min and max",
    )
    def generate_number(
        self,
        min: Annotated[int, "the minimum number of paragraphs"],
        max: Annotated[int, "the maximum number of paragraphs"] = 10,
    ) -> Annotated[int, "the output is a number"]:
        """
        Generate a number between min-max
        Example:
            min=4 max=10 => rand(4,10)
        Args:
            min -- The lower limit for the random number generation
            max -- The upper limit for the random number generation
        Returns:
            int value
        """
        try:
            # Return an int so the result matches the annotated return type
            return random.randint(min, max)
        except ValueError as e:
            print(f"Invalid input {min} and {max}")
            raise e

In [43]:
generate_number_plugin = kernel.add_plugin(GenerateNumberPlugin(), "GenerateNumberPlugin")
generate_number = generate_number_plugin["GenerateNumber"]

Now let's also allow the semantic function to take in additional arguments. In this case, we're going to allow our EngineerStory function to be written in a specified language. We'll need to provide a `paragraph_count` and a `language`.


In [None]:
prompt = """
Write a short story about two engineers on an adventure.
The story must be:
- G rated
- Have a positive message
- No sexism, racism or other bias/bigotry
- Be exactly {{$paragraph_count}} paragraphs long
- Be written in this language: {{$language}}
"""


execution_settings = AzureChatPromptExecutionSettings(
    service_id=service_id,
    ai_model_id="gpt-4o-mini",
    max_tokens=2000,
    temperature=0.7,
)

prompt_template_config = PromptTemplateConfig(
    template=prompt,
    name="story",
    template_format="semantic-kernel",
    input_variables=[
        InputVariable(name="paragraph_count", description="The number of paragraphs", is_required=True),
        InputVariable(name="language", description="The language of the story", is_required=True),
    ],
    execution_settings=execution_settings,
)

engineer_story = kernel.add_function(
    function_name="EngineerStory",
    plugin_name="EngineerPlugin",
    prompt_template_config=prompt_template_config,
)

Let's generate a paragraph count.


In [46]:
result = await generate_number.invoke(kernel, min=1, max=5)
num_paragraphs = result.value
print(f"Generating an engineer story {num_paragraphs} paragraphs long.")

Generating an engineer story 3 paragraphs long.


We can now invoke our engineer_story function using the `kernel` and the keyword arguments `paragraph_count` and `language`.


In [47]:
# Pass the output to the semantic story function
desired_language = "Spanish"
story = await engineer_story.invoke(kernel, paragraph_count=num_paragraphs, language=desired_language)

In [48]:
print(f"Generating an engineer story {num_paragraphs} paragraphs long in {desired_language}.")
print("=====================================================")
print(story)

Generating an engineer story 3 paragraphs long in Spanish.
Era un soleado día de primavera cuando dos Corgis, Toby y Luna, decidieron aventurarse más allá de su jardín. Siempre habían soñado con explorar el mundo más allá de las cercas, así que, con un salto de emoción, cruzaron la puerta y se encontraron en el parque cercano. El aroma de las flores y el sonido de los pájaros les llenaron de alegría. Juntos, se prometieron que hoy sería un día especial y lleno de sorpresas.

Mientras exploraban, Toby y Luna se toparon con un grupo de animales que jugaban a la pelota. Al principio, se sintieron un poco tímidos, pero pronto, la curiosidad superó su miedo. Se acercaron y, con un ladrido amistoso, pidieron unirse al juego. Todos los animales aceptaron con gusto, y así, los dos Corgis pasaron horas corriendo, saltando y riendo en compañía de sus nuevos amigos. Toby y Luna aprendieron que, a veces, lo desconocido puede traer momentos maravillosos si se tiene el valor de acercarse.

Al caer l

## Calling Native Functions within a Semantic Function

One neat thing about the Semantic Kernel is that you can also call native functions from within Prompt Functions!

We will make our EngineerStory semantic function call the native function `generate_names` from the `GenerateNames` plugin, which returns names for our characters.

We do this using the syntax `{{plugin_name.function_name}}`.
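
Conceptually, a `{{plugin_name.function_name}}` token is resolved by looking up the registered function, calling it, and splicing its return value into the prompt before the prompt is sent to the model. A rough standard-library sketch of that idea follows. The `demo_registry` dict and the `render_function_calls` helper are hypothetical stand-ins, not Semantic Kernel APIs.

```python
import re
from typing import Callable, Dict

# Matches tokens of the form {{Plugin.function}}; {{$variable}} tokens are left alone
_CALL_PATTERN = re.compile(r"\{\{\s*(\w+)\.(\w+)\s*\}\}")


def render_function_calls(template: str, registry: Dict[str, Dict[str, Callable[[], str]]]) -> str:
    """Replace each {{Plugin.function}} token with that function's return value."""

    def _call(match: re.Match) -> str:
        plugin_name, function_name = match.group(1), match.group(2)
        return registry[plugin_name][function_name]()

    return _CALL_PATTERN.sub(_call, template)


# A deterministic stand-in for a name-generating plugin, so the output is predictable
demo_registry = {"GenerateNames": {"generate_names": lambda: "Ada, Grace"}}
demo_line = "- The two names are {{GenerateNames.generate_names}}; language: {{$language}}"
rendered_line = render_function_calls(demo_line, demo_registry)
print(rendered_line)
```

Note that the variable placeholder is untouched here; in this sketch, variables and function calls are handled as separate passes.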

In [None]:
import random

from semantic_kernel.functions import kernel_function


class GenerateNamesPlugin:
    """
    Description: Generate character names.
    """

    # The default function name will be the name of the function itself, however you can override this
    # by setting the name=<name override> in the @kernel_function decorator. In this case, we're using
    # the same name as the function name for simplicity.
    @kernel_function(description="Generate character names", name="generate_names")
    def generate_names(self) -> str:
        """
        Generate two names.
        Returns:
            str
        """
        names = {"Ada", "Grace", "Linus", "Alan", "Margaret", "Dennis", "Barbara"}
        first_name = random.choice(list(names))
        names.remove(first_name)
        second_name = random.choice(list(names))
        return f"{first_name}, {second_name}"

In [50]:
generate_names_plugin = kernel.add_plugin(GenerateNamesPlugin(), plugin_name="GenerateNames")
generate_names = generate_names_plugin["generate_names"]

In [51]:
prompt = """
Write a short story about two engineers on an adventure.
The story must be:
- G rated
- Have a positive message
- No sexism, racism or other bias/bigotry
- Be exactly {{$paragraph_count}} paragraphs long
- Be written in this language: {{$language}}
- The two names of the corgis are {{GenerateNames.generate_names}}
"""

In [52]:

execution_settings = AzureChatPromptExecutionSettings(
    service_id=service_id,
    ai_model_id="gpt-4o-mini",
    max_tokens=2000,
    temperature=0.7,
)

prompt_template_config = PromptTemplateConfig(
    template=prompt,
    name="engineer-new",
    template_format="semantic-kernel",
    input_variables=[
        InputVariable(name="paragraph_count", description="The number of paragraphs", is_required=True),
        InputVariable(name="language", description="The language of the story", is_required=True),
    ],
    execution_settings=execution_settings,
)

engineer_story = kernel.add_function(
    function_name="EngineerStoryUpdated",
    plugin_name="EngineerPluginUpdated",
    prompt_template_config=prompt_template_config,
)

In [53]:
result = await generate_number.invoke(kernel, min=1, max=5)
num_paragraphs = result.value

In [54]:
desired_language = "French"
story = await engineer_story.invoke(kernel, paragraph_count=num_paragraphs, language=desired_language)

In [55]:
print(f"Generating an engineer story {num_paragraphs} paragraphs long in {desired_language}.")
print("=====================================================")
print(story)

Generating an engineer story 1 paragraphs long in French.
Un jour ensoleillé, deux ingénieurs passionnés, Clara et Lucas, décidèrent d'emmener leurs adorables corgis, Pizza et Boots, en randonnée dans la forêt voisine pour tester leur nouveau drone. En survolant les arbres, ils découvrirent un vieux pont en bois qui semblait prêt à s'effondrer. Au lieu de l'ignorer, ils décidèrent de le réparer ensemble, utilisant leurs compétences en ingénierie et en travail d'équipe. Grâce à leur détermination et à l'aide de Pizza et Boots, qui aboyaient joyeusement, ils restaurèrent le pont, permettant à d'autres randonneurs de traverser en toute sécurité. Cette aventure leur rappela que la collaboration et la curiosité peuvent transformer des défis en opportunités, tout en renforçant les liens d'amitié et en rendant le monde un peu meilleur.


### Recap

A quick review of what we've learned here:

- We've learned how to create native and prompt functions and register them to the kernel
- We've seen how we can use Kernel Arguments to pass in more custom variables into our prompt
- We've seen how we can call native functions within a prompt.


# Groundedness Checking Plugins

Large language models (LLMs) are known to sometimes generate information that isn't supported by their input—often referred to as "hallucinations" or, more precisely, "ungrounded additions." These are details in the output that cannot be directly verified. To determine if something in an LLM's response is accurate, we can either check if it appears in the provided prompt ("narrow grounding") or rely on general world knowledge ("broad grounding").

In this section, we'll implement a basic grounding pipeline to identify and address ungrounded additions in summary texts compared to their original sources. The process involves three main steps:

1. Extract a list of entities from the summary text.
2. Check whether these entities are present in the original (grounding) text.
3. Remove any entities from the summary that are not grounded in the original text.

Here, an "entity" refers to a named object, such as a person or place (e.g., "Dean" or "Seattle"). While entities can also include claims that connect concepts (like "Dean lives near Seattle"), this section will focus on the simpler case of named objects.
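
Before bringing in the LLM-backed plugin, the three steps can be pictured with plain string matching. This is only a toy sketch: the entity list is supplied by hand rather than extracted, `find_ungrounded` and `excise` are hypothetical helpers, and swapping a name for a placeholder is far cruder than the plugin's rewriting.

```python
import re
from typing import List


def find_ungrounded(entities: List[str], grounding: str) -> List[str]:
    """Step 2: return the entities that never appear in the grounding text."""
    lowered = grounding.lower()
    return [entity for entity in entities if entity.lower() not in lowered]


def excise(summary: str, ungrounded: List[str]) -> str:
    """Step 3: replace each ungrounded entity with a neutral placeholder."""
    for entity in ungrounded:
        summary = re.sub(rf"\b{re.escape(entity)}\b", "[removed]", summary)
    return summary


toy_grounding = "Dean lives near Seattle and works as a carpenter."
toy_summary = "Dean, a carpenter from Seattle, often visits Paris."
toy_entities = ["Dean", "Seattle", "Paris"]  # Step 1, done by hand here

toy_ungrounded = find_ungrounded(toy_entities, toy_grounding)
toy_cleaned = excise(toy_summary, toy_ungrounded)
print(toy_ungrounded)
print(toy_cleaned)
```

Exact string matching misses paraphrases and aliases, which is one reason the rest of this section delegates each step to a prompt function instead.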


Let us define our grounding text:

In [56]:
grounding_text = """I was born in the rolling hills of Alsace, into a family whose name carried both honor and modesty in equal measure. For generations, our kin had stewarded the vineyards of the region, tending the earth with patience and a quiet devotion. My father, when not in the fields, could be found at the communal table in Colmar, dispensing wisdom earned through years of measured hardship and principled living. Though he held no titles, his reputation as a judicious man was universally acknowledged.

In the autumn of his life, he made a decision that would redefine us. A neighbor, once renowned for his craftsmanship, had fallen upon misfortune—his hands, once steady and exacting, now trembled with age and wear. He withdrew from public life, ashamed that he could no longer fashion vessels of beauty and use. My father, moved by compassion and respect, insisted on restoring the craftsman’s dignity. He offered him work in our cooperage—not as a mere hand, but as a teacher to the younger apprentices, whose eagerness matched only their inexperience.

At first, the neighbor resisted, his pride stubborn like the ancient oaks. Yet, day by day, he lent his expertise to the apprentices, shaping their skill as deftly as he once shaped oak staves. The workshop, once just a place of labor, transformed into a hall of shared stories—echoed with laughter, hushed regret, and cautious hope. In time, the man regained a portion of his old strength, not in the flexibility of his joints, but in the steady pride he carried once more.

When he passed in winter’s depth, it was not grief alone that filled the room, but a sense of quiet triumph: of pride reclaimed, skill passed forward, and a life tenderly redeemed by community and care."""

In [57]:
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion, OpenAIChatCompletion

kernel = Kernel()

service_id = "default"
kernel.add_service(
    AzureChatCompletion(
        service_id=service_id,
    ),
)

## Import the Plugins

We are going to use the grounding plugin to check the quality of a summary and remove any ungrounded additions:


In [None]:
# note: this is an official Microsoft plugin taken from the semantic-kernel repository. You can use it as a reference for your own plugins.
plugins_directory = "./plugins"

groundingSemanticFunctions = kernel.add_plugin(parent_directory=plugins_directory, plugin_name="GroundingPlugin")

We can also extract the individual semantic functions for our use:


In [59]:
entity_extraction = groundingSemanticFunctions["ExtractEntities"]
reference_check = groundingSemanticFunctions["ReferenceCheckEntities"]
entity_excision = groundingSemanticFunctions["ExciseEntities"]

## Calling Individual Semantic Functions

We will start by calling the individual grounding functions in turn, to show their use. For this we need to create a sample summary text:


In [60]:
summary_text = """
My father, a respected Genevese statesman, was devoted to his friend Beaufort, a merchant who—ruined by misfortune—had withdrawn in poverty to Lucerne. Troubled by Beaufort’s plight, my father searched until he found him in a mean street near the Reuss. Beaufort had saved only a small remnant of his fortune, insufficient to support himself and his steadfast daughter, Caroline. Caroline eked out a living with plain work and straw-plaiting, but within ten months Beaufort died, leaving her penniless. My father placed her under his relatives’ care in Geneva and, two years later, took her as his wife.
"""

summary_text = summary_text.replace("\n", " ").replace("  ", " ")
print(summary_text)

 My father, a respected Genevese statesman, was devoted to his friend Beaufort, a merchant who—ruined by misfortune—had withdrawn in poverty to Lucerne. Troubled by Beaufort’s plight, my father searched until he found him in a mean street near the Reuss. Beaufort had saved only a small remnant of his fortune, insufficient to support himself and his steadfast daughter, Caroline. Caroline eked out a living with plain work and straw-plaiting, but within ten months Beaufort died, leaving her penniless. My father placed her under his relatives’ care in Geneva and, two years later, took her as his wife. 


The grounding plugin operates in three steps:

1. Identify entities within the summary text.
2. Check if these entities are present in the grounding text.
3. Remove any entities from the summary that are not supported by the grounding text.

Let's proceed to call each semantic function individually.

### Extracting the Entities

The first function we need is entity extraction. We take our summary text and get back a list of the entities found within it. For this we invoke `entity_extraction` through the kernel:


In [61]:
extraction_result = await kernel.invoke(
    entity_extraction,
    input=summary_text,
    topic="people and places",
    example_entities="John, Jane, mother, brother, Paris, Rome",
)

print(extraction_result)

<entities>
- Genevese: Refers to a person from Geneva, Switzerland, indicating a place of origin.
- Beaufort: The name of a merchant and a friend of the narrator's father.
- Lucerne: A city in Switzerland where Beaufort had withdrawn.
- Reuss: A river in Switzerland, near which Beaufort was found.
- Caroline: The name of Beaufort's steadfast daughter.
- Geneva: A city in Switzerland where Caroline was placed under the care of her relatives.
</entities>


So we now have our list of entities in the summary.
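
The extraction result is plain text, so downstream code may want it as a Python list. Here is a minimal parsing sketch, assuming the model keeps the `- name: description` bullet layout shown above (an LLM will not always do so). `parse_entities` is a hypothetical helper, not part of the plugin.

```python
import re
from typing import List


def parse_entities(raw: str) -> List[str]:
    """Pull entity names out of '- Name: description' or '- Name' bullet lines."""
    names = []
    for line in raw.splitlines():
        match = re.match(r"\s*-\s*([^:]+?)\s*(?::.*)?$", line)
        if match:
            names.append(match.group(1))
    return names


sample_output = """<entities>
- Beaufort: The name of a merchant and a friend of the narrator's father.
- Lucerne: A city in Switzerland where Beaufort had withdrawn.
- Geneva
</entities>"""

entity_names = parse_entities(sample_output)
print(entity_names)
```

The grounding functions themselves accept the raw text, so this parsing is only needed if you want to inspect or post-process the entities in code.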


### Performing the reference check

We now use the grounding text to see if the entities we found are grounded. The reference checking function takes the entity list as its `input` and the grounding text as the `reference_context` argument:


In [62]:
grounding_result = await kernel.invoke(reference_check, input=extraction_result.value, reference_context=grounding_text)

print(grounding_result)

<ungrounded_entities>
- Genevese
- Beaufort
- Lucerne
- Reuss
- Caroline
- Geneva
</ungrounded_entities>


### Excising the ungrounded entities

Finally, we can remove the ungrounded entities from the summary text:


In [63]:
excision_result = await kernel.invoke(entity_excision, input=summary_text, ungrounded_entities=grounding_result.value)

print(excision_result)

My father, a respected statesman, was devoted to his friend, a merchant who—ruined by misfortune—had withdrawn in poverty to a small town. Troubled by his plight, my father searched until he found him in a mean street near a river. He had saved only a small remnant of his fortune, insufficient to support himself and his steadfast daughter. She eked out a living with plain work and straw-plaiting, but within ten months he died, leaving her penniless. My father placed her under his relatives’ care and, two years later, took her as his wife.
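
Because the excision is performed by the model, it can be worth confirming that none of the flagged names survived. A small check along these lines is sketched below, with `remaining_entities` as a hypothetical helper and abbreviated sample strings standing in for the real results.

```python
import re
from typing import List


def remaining_entities(text: str, flagged: List[str]) -> List[str]:
    """Return the flagged entities that still occur in the text as whole words."""
    return [
        entity
        for entity in flagged
        if re.search(rf"\b{re.escape(entity)}\b", text, flags=re.IGNORECASE)
    ]


flagged_entities = ["Genevese", "Beaufort", "Lucerne", "Reuss", "Caroline", "Geneva"]
before = "My father placed her under his relatives' care in Geneva."
after = "My father placed her under his relatives' care and, two years later, took her as his wife."

print(remaining_entities(before, flagged_entities))
print(remaining_entities(after, flagged_entities))
```

An empty list for the excised text indicates that every flagged name was removed. A non-empty list would be a reason to re-run the excision or inspect the output by hand.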


# Streaming Multiple Results


In this section you will see how to have the LLM return multiple results for one prompt in a single request. This is useful for running experiments where you want to evaluate how robust your prompt and configuration parameters are against a particular large language model.
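
As a preview of how such an experiment might be scored, one rough approach is to measure how similar the returned completions are to each other. The sketch below uses `difflib` from the standard library on hand-written sample strings. `pairwise_similarity` is a hypothetical helper, and character-level similarity is only a crude proxy for semantic agreement.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, List, Tuple


def pairwise_similarity(responses: List[str]) -> Dict[Tuple[int, int], float]:
    """Return a similarity ratio in [0, 1] for every pair of responses."""
    return {
        (i, j): SequenceMatcher(None, responses[i], responses[j]).ratio()
        for i, j in combinations(range(len(responses)), 2)
    }


sample_responses = [
    "...enjoy a healthy breakfast to kickstart my day!",
    "...enjoy a delicious breakfast to fuel my day.",
    "...enjoy a delicious breakfast to fuel my day.",
]

scores = pairwise_similarity(sample_responses)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 2))
```

Consistently high scores suggest the prompt constrains the model tightly, while low scores point to a prompt or temperature setting that leaves a lot of room for variation.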


First, we will set up the text and chat services we will be submitting prompts to.


In [None]:
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    AzureChatPromptExecutionSettings,
    AzureTextCompletion,
)
from semantic_kernel.contents import ChatHistory  # used below to build the chat prompt

kernel = Kernel()

# Configure the Azure OpenAI chat and text services
service_id = "default"
aoai_chat_service = AzureChatCompletion(
    service_id="aoai_chat",
)
aoai_text_service = AzureTextCompletion(
    service_id="aoai_text",
)

## Multiple Azure OpenAI Chat Completions


In [65]:
az_oai_prompt_execution_settings = AzureChatPromptExecutionSettings(
    service_id="aoai_chat",
    max_tokens=80,
    temperature=0.7,
    top_p=1,
    frequency_penalty=0.5,
    presence_penalty=0.5,
    number_of_responses=3,
)

In [None]:

content = (
    "I am going to complete all my projects tomorrow. I will wake up, start working on them, build a plan, have coffee..."
)
chat = ChatHistory()
chat.add_user_message(content)
results = await aoai_chat_service.get_chat_message_contents(
    chat_history=chat, settings=az_oai_prompt_execution_settings
)

for i, result in enumerate(results):
    print(f"Result {i + 1}: {result!s}")

Result 1: ...enjoy a healthy breakfast to kickstart my day! After that, I’ll take some time to set my goals for the day and prioritize my tasks. Maybe I’ll dive into a good book or listen to an inspiring podcast while I sip on my coffee. Later, I might meet up with friends for lunch or explore a new part of town. In the evening, I'll unwind with some yoga
Result 2: ...enjoy a delicious breakfast to fuel my day. After that, I plan to tackle some tasks I've been putting off, like organizing my workspace and catching up on reading. In the afternoon, I might meet up with friends for lunch or explore a new spot in town. I'll make sure to take some time for myself too—maybe meditate or do some journaling. As the day winds down,
Result 3: ...enjoy a delicious breakfast to fuel my day. After that, I plan to tackle my to-do list with energy and focus. Maybe I'll take some time to read a few chapters of a book I've been meaning to dive into. In the afternoon, I could meet up with friends for cof