# Semantic Kernel Tool Use Example

This document walks through the code for a Semantic Kernel agent that uses Azure AI Search for Retrieval-Augmented Generation (RAG). The agent retrieves travel documents from an Azure AI Search index, augments each user query with the search results, and streams detailed travel recommendations.

## Initializing the Environment

### Importing Packages
The following code imports the necessary packages:

In [1]:
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import SearchIndex, SimpleField, SearchFieldDataType, SearchableField

from openai import AsyncOpenAI

from semantic_kernel.kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.functions import kernel_function
from semantic_kernel.functions.kernel_arguments import KernelArguments
from semantic_kernel.connectors.ai import FunctionChoiceBehavior
from semantic_kernel.contents.function_call_content import FunctionCallContent
from semantic_kernel.contents.function_result_content import FunctionResultContent
from semantic_kernel.agents import ChatCompletionAgent

from typing import Annotated

### Creating the Semantic Kernel and AI Service

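An `AsyncOpenAI` client is created with the `GITHUB_TOKEN` environment variable as its API key and `https://models.inference.ai.azure.com/` as its base URL. The client backs an `OpenAIChatCompletion` service for the `gpt-4o-mini` model, which is registered on a new `Kernel` instance under the service ID `agent` and used to generate responses.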

In [2]:
# Initialize the asynchronous OpenAI client
client = AsyncOpenAI(
    api_key=os.environ["GITHUB_TOKEN"],
    base_url="https://models.inference.ai.azure.com/"
)

# Create a Semantic Kernel instance and add an OpenAI chat completion service.
kernel = Kernel()
chat_completion_service = OpenAIChatCompletion(
    ai_model_id="gpt-4o-mini",
    async_client=client,
    service_id="agent",
)
kernel.add_service(chat_completion_service)
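
The cell above reads `GITHUB_TOKEN` with `os.environ[...]`, which raises a bare `KeyError` when the variable is missing. A minimal sketch of one way to fail early with a clearer message follows; `require_env` is a hypothetical helper, not part of Semantic Kernel or the OpenAI SDK:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a descriptive error.

    Hypothetical helper for illustration; not part of any SDK used here.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable '{name}' is not set. "
            "Set it before running the notebook."
        )
    return value
```

For example, `api_key=require_env("GITHUB_TOKEN")` could stand in for the direct `os.environ` lookup.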

### Defining the Prompt Plugin

The `PromptPlugin` is a native plugin that defines a function to build an augmented prompt from the user query and the retrieval context.

In [3]:
class PromptPlugin:
    @kernel_function(
        name="build_augmented_prompt",
        description="Build an augmented prompt using retrieval context or function results.",
    )
    @staticmethod
    def build_augmented_prompt(query: str, retrieval_context: str) -> str:
        return (
            f"Retrieved Context:\n{retrieval_context}\n\n"
            f"User Query: {query}\n\n"
            "First review the retrieved context. If it does not answer the query, try calling an available plugin function that might give you an answer. If no context is available, say so."
        )

# Register the plugin with the kernel.
kernel.add_plugin(PromptPlugin(), plugin_name="promptPlugin")

KernelPlugin(name='promptPlugin', description=None, functions={})
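
To see what the agent receives, the sketch below mirrors the template's structure (context, then query, then instructions) as a plain function, without the `@kernel_function` decorator, so it runs without Semantic Kernel installed. The instruction text is shortened for illustration, and the context string is one of the sample documents used in this walkthrough:

```python
def build_augmented_prompt(query: str, retrieval_context: str) -> str:
    """Standalone illustration of the plugin's prompt layout."""
    return (
        f"Retrieved Context:\n{retrieval_context}\n\n"
        f"User Query: {query}\n\n"
        "Review the retrieved context before answering."  # shortened instruction
    )


prompt = build_augmented_prompt(
    query="Can you explain Contoso's travel insurance coverage?",
    retrieval_context=(
        "Document: Contoso's travel insurance covers medical emergencies, "
        "trip cancellations, and lost baggage."
    ),
)
print(prompt)
```

Printing the prompt this way is a quick check that the retrieval context and the query land in the expected positions before the text is sent to the model.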

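### Defining the Weather Plugin

The `WeatherInfoPlugin` is a second native plugin. It returns the average temperature for a fixed set of travel destinations and gives the agent a function to call when the retrieved context does not answer the query.

In [4]: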
class WeatherInfoPlugin:
    """A Plugin that provides the average temperature for a travel destination."""

    def __init__(self):
        # Dictionary of destinations and their average temperatures
        self.destination_temperatures = {
            "maldives": "82°F (28°C)",
            "swiss alps": "45°F (7°C)",
            "african safaris": "75°F (24°C)"
        }

    @kernel_function(description="Get the average temperature for a specific travel destination.")
    def get_destination_temperature(self, destination: str) -> Annotated[str, "Returns the average temperature for the destination."]:
        """Get the average temperature for a travel destination."""
        # Normalize the input destination (lowercase)
        normalized_destination = destination.lower()

        # Look up the temperature for the destination
        if normalized_destination in self.destination_temperatures:
            return f"The average temperature in {destination} is {self.destination_temperatures[normalized_destination]}."
        else:
            return f"Sorry, I don't have temperature information for {destination}. Available destinations are: Maldives, Swiss Alps, and African safaris."
        
kernel.add_plugin(WeatherInfoPlugin(), plugin_name="weatherplugin")

KernelPlugin(name='weatherplugin', description=None, functions={'get_destination_temperature': KernelFunctionFromMethod(metadata=KernelFunctionMetadata(name='get_destination_temperature', plugin_name='weatherplugin', description='Get the average temperature for a specific travel destination.', parameters=[KernelParameterMetadata(name='destination', description=None, default_value=None, type_='str', is_required=True, type_object=<class 'str'>, schema_data={'type': 'string'}, include_in_function_choices=True)], is_prompt=False, is_asynchronous=False, return_parameter=KernelParameterMetadata(name='return', description='Returns the average temperature for the destination.', default_value=None, type_='str', is_required=True, type_object=<class 'str'>, schema_data={'type': 'string', 'description': 'Returns the average temperature for the destination.'}, include_in_function_choices=True), additional_properties={}), invocation_duration_histogram=<opentelemetry.metrics._internal.instrument._Pro
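
The lookup in the plugin is an exact match on the lowercased name, so input with surrounding whitespace or a leading article (for example `"the Swiss Alps"`) falls through to the fallback message. A small sketch of a more forgiving normalization follows, written as standalone functions so it runs without Semantic Kernel. The `normalize_destination` helper and the leading-article rule are assumptions, not part of the original plugin:

```python
DESTINATION_TEMPERATURES = {
    "maldives": "82°F (28°C)",
    "swiss alps": "45°F (7°C)",
    "african safaris": "75°F (24°C)",
}


def normalize_destination(destination: str) -> str:
    """Lowercase, collapse runs of whitespace, and drop a leading 'the '."""
    cleaned = " ".join(destination.lower().split())
    if cleaned.startswith("the "):
        cleaned = cleaned[len("the "):]
    return cleaned


def get_destination_temperature(destination: str) -> str:
    """Look up the temperature using the normalized destination as the key."""
    key = normalize_destination(destination)
    label = destination.strip()
    if key in DESTINATION_TEMPERATURES:
        return f"The average temperature in {label} is {DESTINATION_TEMPERATURES[key]}."
    return f"Sorry, I don't have temperature information for {label}."
```

Whether this extra leniency is worthwhile depends on how consistently the model formats the `destination` argument when it calls the function.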

## Vector Database Initialization

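We connect to an Azure AI Search service, create the `travel-documents` index if it does not already exist, and upload a small set of sample documents. The index has a string key (`id`) and a single searchable text field (`content`), so retrieval here relies on full-text search rather than vector embeddings. Azure AI Search stores the documents persistently and returns those that provide context for generating accurate responses.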

In [5]:
# Initialize Azure AI Search with persistent storage
search_service_endpoint = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
search_api_key = os.getenv("AZURE_SEARCH_API_KEY")
index_name = "travel-documents"

search_client = SearchClient(
    endpoint=search_service_endpoint,
    index_name=index_name,
    credential=AzureKeyCredential(search_api_key)
)

index_client = SearchIndexClient(
    endpoint=search_service_endpoint,
    credential=AzureKeyCredential(search_api_key)
)

# Define the index schema
fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True),
    SearchableField(name="content", type=SearchFieldDataType.String)
]

index = SearchIndex(name=index_name, fields=fields)

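# Check whether the index already exists; if not, create it.
# Only ResourceNotFoundError triggers creation; other failures
# (bad credentials, network errors) propagate to the caller.
from azure.core.exceptions import ResourceNotFoundError

try:
    existing_index = index_client.get_index(index_name)
    print(f"Index '{index_name}' already exists, using the existing index.")
except ResourceNotFoundError:
    # Create the index if it doesn't exist
    print(f"Creating new index '{index_name}'...")
    index_client.create_index(index)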


# Enhanced sample documents
documents = [
    {"id": "1", "content": "Contoso Travel offers luxury vacation packages to exotic destinations worldwide."},
    {"id": "2", "content": "Our premium travel services include personalized itinerary planning and 24/7 concierge support."},
    {"id": "3", "content": "Contoso's travel insurance covers medical emergencies, trip cancellations, and lost baggage."},
    {"id": "4", "content": "Popular destinations include the Maldives, Swiss Alps, and African safaris."},
    {"id": "5", "content": "Contoso Travel provides exclusive access to boutique hotels and private guided tours."}
]

# Add documents to the index
search_client.upload_documents(documents)

Index 'travel-documents' already exists, using the existing index.


[<azure.search.documents._generated.models._models_py3.IndexingResult at 0x1a3d8bf6ff0>,
 <azure.search.documents._generated.models._models_py3.IndexingResult at 0x1a3d8ce4ad0>,
 <azure.search.documents._generated.models._models_py3.IndexingResult at 0x1a3d8ce4b60>,
 <azure.search.documents._generated.models._models_py3.IndexingResult at 0x1a3d8ce4b30>,
 <azure.search.documents._generated.models._models_py3.IndexingResult at 0x1a3d8ce4b00>]
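
The raw `IndexingResult` objects in the output do not show whether each upload succeeded. Each result exposes `key`, `succeeded`, `status_code`, and `error_message` attributes, so a short check can report failures. The sketch below uses a stand-in dataclass (`FakeIndexingResult`, hypothetical, with made-up values) so that it runs without an Azure connection; with the real SDK you would pass the list returned by `search_client.upload_documents(documents)` instead:

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class FakeIndexingResult:
    """Hypothetical stand-in exposing the attributes read below."""
    key: str
    succeeded: bool
    status_code: int
    error_message: Optional[str] = None


def failed_keys(results: Iterable) -> List[str]:
    """Return the document keys whose indexing did not succeed."""
    return [r.key for r in results if not r.succeeded]


# Made-up results for illustration only.
sample = [
    FakeIndexingResult(key="1", succeeded=True, status_code=201),
    FakeIndexingResult(key="2", succeeded=False, status_code=422,
                       error_message="invalid document"),
]
print(failed_keys(sample))
```

An empty list from `failed_keys` would indicate that every document in the batch was accepted.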

A helper function `get_retrieval_context` is defined to query the index and return the top two relevant documents based on the user query:

In [6]:
def get_retrieval_context(query: str) -> str:
    # Limit the search to the two highest-ranked matches.
    results = search_client.search(query, top=2)
    context_strings = []
    for result in results:
        context_strings.append(f"Document: {result['content']}")
    return "\n\n".join(context_strings) if context_strings else "No results found"
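
To try the helper's formatting logic without a live index, the sketch below swaps in a tiny in-memory stand-in for the search client. `InMemorySearchClient` is hypothetical and ranks by naive word overlap, which is not how Azure AI Search scores documents; only the shape of the returned context string matches the helper above:

```python
from typing import Dict, List, Optional


class InMemorySearchClient:
    """Hypothetical stand-in: ranks documents by shared lowercase words."""

    def __init__(self, documents: List[Dict[str, str]]):
        self.documents = documents

    def search(self, query: str, top: Optional[int] = None) -> List[Dict[str, str]]:
        query_words = set(query.lower().split())
        scored = []
        for doc in self.documents:
            overlap = len(query_words & set(doc["content"].lower().split()))
            if overlap:
                scored.append((overlap, doc))
        # Highest overlap first; the sort is stable, so ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        ranked = [doc for _, doc in scored]
        return ranked if top is None else ranked[:top]


def get_retrieval_context(client: InMemorySearchClient, query: str, top: int = 2) -> str:
    """Format the top matches the same way as the notebook helper."""
    results = client.search(query, top=top)
    context_strings = [f"Document: {r['content']}" for r in results]
    return "\n\n".join(context_strings) if context_strings else "No results found"


client = InMemorySearchClient([
    {"id": "3", "content": "Contoso's travel insurance covers medical emergencies, trip cancellations, and lost baggage."},
    {"id": "4", "content": "Popular destinations include the Maldives, Swiss Alps, and African safaris."},
])
print(get_retrieval_context(client, "travel insurance"))
```

Passing the client as a parameter, rather than reading a module-level `search_client`, is a design choice made here only to keep the sketch easy to test.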

## Setting the Function Choice Behavior 

Semantic Kernel gives you some control over how the agent selects functions through the `FunctionChoiceBehavior` class.

The code below sets it to `Auto`, which lets the agent choose among the available functions or call none of them.

It can also be set to:
- `FunctionChoiceBehavior.Required`: requires the agent to choose at least one function.
- `FunctionChoiceBehavior.NoneInvoke`: instructs the agent not to choose any function (useful for testing).
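
With an OpenAI-style chat completion service, these three behaviors appear to correspond to the `tool_choice` request values `auto`, `required`, and `none` of the Chat Completions API. The sketch below spells out that assumed mapping in plain Python so it runs without Semantic Kernel. `TOOL_CHOICE_BY_BEHAVIOR` and `build_tool_request` are hypothetical names, and the tool definition is a hand-written example rather than what the kernel generates:

```python
from typing import Any, Dict, List

# Assumed correspondence between the behaviors described above and the
# `tool_choice` values accepted by the OpenAI Chat Completions API.
TOOL_CHOICE_BY_BEHAVIOR = {
    "Auto": "auto",          # model may call functions or answer directly
    "Required": "required",  # model must call at least one function
    "NoneInvoke": "none",    # model is told not to call any function
}


def build_tool_request(behavior: str, tools: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the tool-related request fields for a given behavior name."""
    if behavior not in TOOL_CHOICE_BY_BEHAVIOR:
        raise ValueError(f"Unknown behavior: {behavior}")
    return {"tools": tools, "tool_choice": TOOL_CHOICE_BY_BEHAVIOR[behavior]}


# Hand-written tool definition for illustration.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_destination_temperature",
        "description": "Get the average temperature for a specific travel destination.",
        "parameters": {
            "type": "object",
            "properties": {"destination": {"type": "string"}},
            "required": ["destination"],
        },
    },
}

print(build_tool_request("Auto", [weather_tool])["tool_choice"])
```

In the notebook itself none of this is written by hand; assigning `settings.function_choice_behavior` is the only step required.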

In [7]:
settings = kernel.get_prompt_execution_settings_from_service_id("agent")
settings.function_choice_behavior = FunctionChoiceBehavior.Auto()
arguments = KernelArguments(settings=settings)

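The agent is created as a `ChatCompletionAgent` that combines the kernel, a name, its instructions, and the arguments holding the function choice settings:

In [8]: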
AGENT_NAME = "TravelAgent"
AGENT_INSTRUCTIONS = (
    "Answer travel queries using the provided tools and context. If context is provided, do not say 'I have no context for that.'"
)
agent = ChatCompletionAgent(
    kernel=kernel,
    name=AGENT_NAME,
    instructions=AGENT_INSTRUCTIONS,
    arguments=arguments,
)

A helper function `get_augmented_prompt` builds the augmented prompt for every query. Rather than leaving that choice to the model, it calls the static plugin method directly:

In [9]:
async def get_augmented_prompt(query: str) -> str:
    retrieval_context = get_retrieval_context(query)
    return PromptPlugin.build_augmented_prompt(query, retrieval_context)

### Running the Agent with Streaming Chat History
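The main asynchronous loop creates a chat history for the conversation. For each user input, it adds the user message to the chat history, then adds the augmented prompt as a system message so that the agent sees the retrieval context. The agent is then invoked with streaming: text chunks and any function calls or results are collected as they arrive, and the complete exchange is rendered as formatted HTML. If any function results were returned, the agent is invoked a second time so that its final answer incorporates them.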

In [10]:
from IPython.display import display, HTML


async def main():
    # Create a chat history.
    chat_history = ChatHistory()

    user_inputs = [
        # Retrieval context available.
        "Can you explain Contoso's travel insurance coverage?",
        "What is the average temperature of the Maldives?",
        "What is a good cold destination offered by Contoso and what is its average temperature?",
        # "What is Neural Network?"  # No retrieval context available.
    ]

    for user_input in user_inputs:
        # Add the user message to chat history
        chat_history.add_user_message(user_input)
        augmented_prompt = await get_augmented_prompt(user_input)
        
        chat_history.add_system_message(
            f"Here is relevant information to help answer the user's question: {augmented_prompt}")


        # Display the augmented prompt in a collapsible section
        html_output = f"<div style='margin-bottom:10px'>"
        html_output += f"<details>"
        html_output += f"<summary style='cursor:pointer; font-weight:bold; color:#0066cc;'>RAG Context (click to expand)</summary>"
        html_output += f"<div style='margin:10px; padding:10px; background-color:#f8f8f8; border:1px solid #ddd; border-radius:4px; white-space:pre-wrap;'>{augmented_prompt}</div>"

        html_output += f"</details>"
        html_output += f"</div>"

        # Show user query
        html_output += f"<div style='margin-bottom:10px'>"
        html_output += f"<div style='font-weight:bold'>User:</div>"
        html_output += f"<div style='margin-left:20px'>{user_input}</div>"
        html_output += f"</div>"

        agent_name: str | None = None
        full_response = ""
        function_calls = []
        function_results = {}

        # Collect the agent's response with improved content handling
        async for content in agent.invoke_stream(chat_history):
            if not agent_name and hasattr(content, 'name'):
                agent_name = content.name

            # Track function calls and results
            for item in content.items:
                if isinstance(item, FunctionCallContent):
                    call_info = f"Calling: {item.function_name}({item.arguments})"
                    function_calls.append(call_info)
                elif isinstance(item, FunctionResultContent):
                    result_info = f"Result: {item.result}"
                    function_calls.append(result_info)
                    # Store function results to possibly add to chat history
                    function_results[item.function_name] = item.result

            # Better content extraction - make sure we're getting the actual text
            if hasattr(content, 'content') and content.content and content.content.strip():
                # Check if this is a regular text message (not function related)
                if not any(isinstance(item, (FunctionCallContent, FunctionResultContent))
                           for item in content.items):
                    full_response += content.content

        # Add function call info to chat history
        if function_results:
            # Even if we have some response text, we want to make sure function results are incorporated
            function_results_message = "Function calls completed with the following results: " + \
                str(function_results)
            chat_history.add_system_message(function_results_message)

            # Get final response from agent that incorporates the function results
            collected_response = ""
            async for content in agent.invoke_stream(chat_history):
                if hasattr(content, 'content') and content.content and content.content.strip():
                    collected_response += content.content

            if collected_response:
                full_response = collected_response

        # Add function calls to HTML if any occurred
        if function_calls:
            html_output += f"<div style='margin-bottom:10px'>"
            html_output += f"<details>"
            html_output += f"<summary style='cursor:pointer; font-weight:bold; color:#0066cc;'>Function Calls (click to expand)</summary>"
            html_output += f"<div style='margin:10px; padding:10px; background-color:#f8f8f8; border:1px solid #ddd; border-radius:4px; white-space:pre-wrap;'>"
            html_output += "<br>".join(function_calls)
            html_output += f"</div></details></div>"

        # Add agent response to HTML - make sure we have a valid response
        html_output += f"<div style='margin-bottom:20px'>"
        html_output += f"<div style='font-weight:bold'>{agent_name or 'Assistant'}:</div>"
        html_output += f"<div style='margin-left:20px; white-space:pre-wrap'>{full_response}</div>"
        html_output += f"</div>"
        html_output += "<hr>"

        # Add agent's response to chat history
        if full_response:
            chat_history.add_assistant_message(full_response)

        # Display formatted HTML
        display(HTML(html_output))

await main()
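
The core of the loop above is an accumulation pattern: iterate over an async stream, append each non-blank text chunk, and use the joined text once the stream ends. The sketch below isolates that pattern with a stand-in async generator (`fake_stream`, hypothetical) in place of `agent.invoke_stream`, so it runs without any model access. It uses `asyncio.run` because a plain script, unlike a notebook, has no running event loop for a top-level `await`:

```python
import asyncio
from typing import AsyncIterator, List


async def fake_stream(chunks: List[str]) -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming agent response."""
    for chunk in chunks:
        await asyncio.sleep(0)  # yield control, as a real network stream would
        yield chunk


async def collect_response(stream: AsyncIterator[str]) -> str:
    """Concatenate the non-blank chunks of an async stream."""
    parts: List[str] = []
    async for chunk in stream:
        if chunk and chunk.strip():
            parts.append(chunk)
    return "".join(parts)


response = asyncio.run(
    collect_response(fake_stream(["The Maldives ", "", "averages ", "82°F."]))
)
print(response)
```

Skipping whitespace-only chunks, as the notebook loop also does, can drop a chunk that consists solely of a space or newline; whether that matters depends on how the model splits its output.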