# Introduction to Using Semantic Kernel with MLflow

Welcome to this interactive tutorial, which introduces [Semantic Kernel](https://learn.microsoft.com/en-us/semantic-kernel/overview/) and its integration with MLflow. The tutorial is structured as a notebook to provide a hands-on, practical learning experience that focuses on tracing.

Note that MLflow currently supports only tracing for Semantic Kernel.

### Setup

First, we install the required dependencies, enable MLflow autologging, and provide an OpenAI API key. This tutorial uses OpenAI for simplicity; however, other LLM providers can be used instead.

In [None]:

%pip install mlflow -qU
%pip install semantic_kernel openai nest_asyncio -qU

Note: you may need to restart the kernel to use updated packages.


In [None]:
import mlflow
import os
from getpass import getpass

# Enable MLflow autologging for Semantic Kernel
mlflow.semantic_kernel.autolog()

# Set the OpenAI API key as an environment variable
os.environ["OPENAI_API_KEY"] = getpass("openai_api_key: ")
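When the notebook is re-run, prompting for the key every time can be tedious. The sketch below shows one way to prompt only when the variable is not already set. The helper name `ensure_api_key` and its parameters are hypothetical and are not part of MLflow, Semantic Kernel, or the OpenAI SDK.

```python
import os
from getpass import getpass


def ensure_api_key(env=os.environ, var_name="OPENAI_API_KEY", prompt=getpass):
    """Return the API key stored in `env`, prompting only if it is missing.

    `ensure_api_key` is a hypothetical helper written for this tutorial.
    `env` is any dict-like mapping; `prompt` is any callable that takes a
    prompt string and returns the user's input.
    """
    key = env.get(var_name, "").strip()
    if not key:
        # Nothing usable is set yet, so ask the user without echoing input
        key = prompt(f"{var_name}: ").strip()
        if not key:
            raise ValueError(f"{var_name} must not be empty")
        env[var_name] = key
    return key
```

Calling `ensure_api_key()` with no arguments reads and writes `os.environ`, so it could stand in for the `getpass` line in the cell above.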

### Quickstart

Next, we will run a simple example to demonstrate MLflow tracing.

MLflow tracing logs granular telemetry to MLflow, which makes it possible to debug individual agentic steps. At each step, both Semantic Kernel and MLflow metadata are logged, which speeds up development and debugging.

In [None]:
import asyncio
import nest_asyncio # This library is needed to run async calls in a Jupyter notebook

import openai
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from semantic_kernel.functions.function_result import FunctionResult

# Allow nested event loops (needed in e.g. notebooks or certain test runners)
nest_asyncio.apply()

# Create a basic OpenAI client
openai_client = openai.AsyncOpenAI()

# Create a Semantic Kernel instance and register the OpenAI chat completion service
kernel = Kernel()
kernel.add_service(
    OpenAIChatCompletion(
        service_id="chat-gpt",
        ai_model_id="gpt-4o-mini",
        async_client=openai_client,
    )
)

# Define an async function that invokes your prompt
async def run_query() -> FunctionResult:
    return await kernel.invoke_prompt("Is sushi the best food ever?")


# Invoke the prompt inside an MLflow run so that the trace is associated with that run.
# `asyncio.run()` works in a notebook here because `nest_asyncio.apply()` was called above;
# in a notebook you can alternatively use a top-level `await run_query()`.
# In a plain script, `asyncio.run()` is the standard entry point.
with mlflow.start_run(run_name="semantic_kernel simple example"):
    answer = asyncio.run(run_query())
print("AI says:", answer)



AI says: Whether sushi is the "best" food ever is highly subjective and depends on individual tastes and preferences. Many people love sushi for its unique flavors, textures, and the skill involved in its preparation. Sushi can also be seen as a healthier option, often featuring fresh fish and vegetables. 

However, others may prefer different cuisines or dishes based on their cultural background, dietary restrictions, or personal preferences. Ultimately, the "best" food is a matter of personal opinion, and there are countless culinary delights to explore around the world!
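The quickstart cell calls `nest_asyncio.apply()` before `asyncio.run()`. The sketch below illustrates why, using only the standard library: `asyncio.run()` works when no event loop is running, but raises `RuntimeError` when called from inside a running loop, which is the situation in a Jupyter notebook. The coroutine `fake_invoke_prompt` is a hypothetical stand-in for `kernel.invoke_prompt` so that the example runs without an API key.

```python
import asyncio


async def fake_invoke_prompt(prompt: str) -> str:
    # Hypothetical stand-in for `kernel.invoke_prompt`; no network call is made
    await asyncio.sleep(0)
    return f"echo: {prompt}"


async def run_query() -> str:
    return await fake_invoke_prompt("Is sushi the best food ever?")


async def call_run_inside_running_loop() -> str:
    """Simulate a notebook cell, where an event loop is already running."""
    coro = run_query()
    try:
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # the coroutine was never started; close it explicitly
        return f"RuntimeError: {exc}"
    return "no error"


# Case 1: no loop is running (a plain script), so asyncio.run() succeeds
answer = asyncio.run(run_query())
print(answer)

# Case 2: a loop is already running, so a nested asyncio.run() is rejected
nested_result = asyncio.run(call_run_inside_running_loop())
print(nested_result)
```

`nest_asyncio.apply()` patches asyncio so that the nested call in the second case is permitted, which is why the quickstart can use `asyncio.run()` inside a notebook.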


### Explore Traces

Next, let's open the MLflow UI and explore the logged traces. In the cell below, we will open up the MLflow UI in an iFrame for easy access.

To find our trace, see the rendered iFrame below and follow these steps.
1. Click on the MLflow experiment. If this is not the first time running the notebook, there may be multiple experiments. In this case, please find the most recently logged experiment. 
2. Click on your run of interest.
3. Click on your trace.

In [None]:
import subprocess

from IPython.display import IFrame

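# Start the MLflow UI in a background process.
# `start_new_session=True` runs the server in its own session (the equivalent of
# calling os.setsid in the child), and output is discarded so that an unread pipe
# cannot fill up and block the server.
mlflow_ui_command = ["mlflow", "ui", "--port", "5000"]
subprocess.Popen(
    mlflow_ui_command,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
    start_new_session=True,
)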

<Popen: returncode: None args: ['mlflow', 'ui', '--port', '5000']>
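The MLflow UI takes a moment to start, and rendering the iFrame too early shows a connection error. As a minimal sketch, the hypothetical helper `wait_for_port` below polls a TCP port until it accepts connections or a timeout expires. It uses only the standard library; the default timeout and polling interval are arbitrary choices.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once `host:port` accepts a TCP connection, False on timeout.

    `wait_for_port` is a hypothetical helper written for this tutorial.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connection means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or the attempt timed out); retry after a pause
            time.sleep(interval)
    return False
```

For this notebook, `wait_for_port("localhost", 5000)` could be called before the iFrame cell. An open port only shows that the server process is listening, not that the UI has finished loading, so a refresh may still be needed.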

In [None]:
# Wait for the MLflow server to start then run the following command
# Note that cached results don't render, so you need to run this to see the UI
IFrame(src="http://localhost:5000", width=1000, height=600)

If everything ran correctly, you should see a single MLflow trace. The trace should have two spans:
1. A top-level **parent** span with a name similar to `execute_tool bjmtaiGXxiTwWooE`. This top-level span represents the Semantic Kernel invocation.
2. A **child** span with our chat payload. This span represents the single call made through our `OpenAIChatCompletion` Semantic Kernel service and contains detailed metadata about the invocation.
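The parent/child relationship described above can also be inspected in code. The sketch below renders a flat list of spans as an indented tree. It assumes each span exposes `name`, `span_id`, and `parent_id` attributes, which is how MLflow span objects are understood to behave; the demo data uses `SimpleNamespace` stand-ins, and the child span name is a placeholder rather than a name MLflow actually emits.

```python
from types import SimpleNamespace


def format_span_tree(spans):
    """Return one indented line per span, with children nested under parents.

    Assumes each span has `name`, `span_id`, and `parent_id` attributes,
    and that root spans have `parent_id` set to None.
    """
    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            lines.append("  " * depth + span.name)
            walk(span.span_id, depth + 1)

    walk(None, 0)  # start from the root spans
    return lines


# Stand-in spans that mirror the two-span trace described above
demo_spans = [
    SimpleNamespace(span_id="a", parent_id=None, name="execute_tool bjmtaiGXxiTwWooE"),
    SimpleNamespace(span_id="b", parent_id="a", name="child span (placeholder)"),
]
tree_lines = format_span_tree(demo_spans)
print("\n".join(tree_lines))
```

With a real trace, the span list would typically come from `trace.data.spans` on a trace object returned by MLflow, for example via `mlflow.get_trace(...)`.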

Because Semantic Kernel typically involves complex, asynchronous agentic calls, MLflow tracing is an invaluable tool for determining how internal calls affect the overall Kernel invocation.

For more about tracing, please see the [MLflow tracing docs](https://mlflow.org/docs/latest/genai/tracing).