# 1. Install the Required libraries

In [68]:
%pip install semantic-kernel==0.9.6b1

Note: you may need to restart the kernel to use updated packages.


# 2. Create your environment variables .env file

Fill in your own values below, then run the cell to create the *.env* file that holds your environment variables.

In [81]:
%%writefile .env
# Environment variables obtained from Azure OpenAI
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME="chat-model"
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME="text-embedding-ada-002"
AZURE_OPENAI_DEPLOYMENT_NAME="<your-azure-openai-deployment-name>"
AZURE_OPENAI_ENDPOINT="https://<your-resource-name>.openai.azure.com/"
AZURE_OPENAI_API_KEY="<your-azure-openai-api-key>"
# Environment variable obtained from Azure Cosmos DB for MongoDB vCore
AZCOSMOS_CONNSTR="mongodb+srv://<user>:<password>@<your-cluster-name>.mongocluster.cosmos.azure.com/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
# Environment variables you set to be used by the code
AZCOSMOS_DATABASE_NAME="ragdatabase"
AZCOSMOS_CONTAINER_NAME="ragcontainer"



Overwriting .env


# 3. Load the environment variables

In [69]:
# load the environment variables file
from dotenv import load_dotenv
import os

load_dotenv()

True
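
Because `load_dotenv()` returns `True` as soon as it finds a file, it does not tell you whether every variable was actually set. The following is a minimal sketch of a check you could run at this point. The helper `find_missing_vars` and the list `REQUIRED_VARS` are hypothetical names introduced for illustration; the variable names themselves come from the *.env* file above.

```python
import os

# Variable names taken from the .env file created in step 2.
REQUIRED_VARS = [
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
    "AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZCOSMOS_CONNSTR",
    "AZCOSMOS_DATABASE_NAME",
    "AZCOSMOS_CONTAINER_NAME",
]


def find_missing_vars(names, environ=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = find_missing_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Failing early here is usually easier to debug than a connection error raised later by one of the services.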

Some of the parameters needed by [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/vector-search) to create the vector search index are handled by semantic kernel.

In this guide, we use the `text-embedding-ada-002` embedding model, which produces 1536-dimensional embedding vectors.

`num_lists` is an integer that sets the number of clusters that the inverted file (IVF) index uses to group the vector data.

The `similarity` metric used with the IVF index here is `COS` (cosine distance), but you can also try `L2` (Euclidean distance) and `IP` (inner product). For more information, see the [Understand embeddings in Azure OpenAI Service article](https://learn.microsoft.com/azure/ai-services/openai/concepts/understand-embeddings#cosine-similarity).

In [70]:
# collection name will be used multiple times in the code so we store it in a variable
collection_name = os.environ.get("AZCOSMOS_CONTAINER_NAME")

# Vector search index parameters
index_name = "VectorSearchIndex"
vector_dimensions = 1536  # text-embedding-ada-002 uses a 1536-dimensional embedding vector
num_lists = 1
similarity = "COS"  # cosine distance
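
To make the three `similarity` options more concrete, here is a small, self-contained sketch that computes each metric on toy two-dimensional vectors. It is plain Python written for illustration only and is not how Azure Cosmos DB computes them internally; the function names and the sample vectors are assumptions.

```python
import math


def inner_product(a, b):
    """IP: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """L2: straight-line distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """COS: inner product of the vectors after normalizing their lengths."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


query = [1.0, 0.0]
same_direction = [2.0, 0.0]  # points the same way, but is twice as long
orthogonal = [0.0, 3.0]  # points in an unrelated direction

print("COS:", cosine_similarity(query, same_direction), cosine_similarity(query, orthogonal))
print("L2: ", euclidean_distance(query, same_direction), euclidean_distance(query, orthogonal))
print("IP: ", inner_product(query, same_direction), inner_product(query, orthogonal))
```

Cosine ignores vector length and compares direction only, whereas Euclidean distance and inner product are both affected by magnitude.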

# 4. Create Helper Functions

This function takes a JSON file of NoSQL records and checks, by record id, whether each record already exists in the database. Existing records are skipped; for new records it generates embeddings and uploads the record along with its embedding.

The `save_information` function does two things: it generates the embeddings and uploads the data to your database.

Learn more about the semantic kernel memory store [here](https://learn.microsoft.com/semantic-kernel/memories/) and the embeddings [here](https://learn.microsoft.com/semantic-kernel/memories/embeddings).

In [71]:
import json
from semantic_kernel.memory.semantic_text_memory import SemanticTextMemory
from semantic_kernel.memory.memory_store_base import MemoryStoreBase


async def upsert_data_to_memory_store(memory: SemanticTextMemory, store: MemoryStoreBase, data_file_path: str) -> None:
    """
    Insert the records from a JSON data file into the memory store, skipping records that already exist.

    Args:
        memory (SemanticTextMemory): The semantic text memory that generates embeddings and saves records.
        store (MemoryStoreBase): The memory store that is checked for existing records.
        data_file_path (str): The path to the JSON data file that contains the records.

    Returns:
        None. New records are written to the memory store.
    """
    with open(file=data_file_path, mode="r", encoding="utf-8") as f:
        data = json.load(f)

    for n, item in enumerate(data, start=1):
        # check if the item already exists in the memory store
        # if the id doesn't exist, it throws an exception
        try:
            already_created = bool(await store.get(collection_name, item["id"], with_embedding=True))
        except Exception:
            already_created = False
        # if the record doesn't exist, we generate embeddings and save it to the database
        if not already_created:
            await memory.save_information(
                collection=collection_name,
                id=item["id"],
                # the embedding is generated from the text field
                text=item["content"],
                description=item["title"],
            )
            print(
                "Generating embeddings and saving new item:",
                n,
                "/",
                len(data),
                end="\r",
            )
        else:
            print("Skipping item that already exists:", n, "/", len(data), end="\r")
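
The check-then-insert pattern used by the helper can be tried without any Azure resources. The sketch below swaps the real memory store for a hypothetical `InMemoryStore` class backed by a dictionary, and it skips embedding generation entirely; `upsert_items` and the sample records are likewise assumptions made for illustration. It only demonstrates why running the helper twice does not duplicate records.

```python
import asyncio


class InMemoryStore:
    """Hypothetical stand-in for a memory store; keeps records in a dict."""

    def __init__(self):
        self.records = {}

    async def get(self, key):
        # Like the behavior noted in the helper, a missing id raises an exception.
        return self.records[key]

    async def save(self, key, text, description):
        self.records[key] = {"text": text, "description": description}


async def upsert_items(store, items):
    """Save only items whose id is not stored yet; return (saved, skipped)."""
    saved = skipped = 0
    for item in items:
        try:
            await store.get(item["id"])
            exists = True
        except KeyError:
            exists = False
        if exists:
            skipped += 1
        else:
            await store.save(item["id"], item["content"], item["title"])
            saved += 1
    return saved, skipped


sample_items = [
    {"id": "1", "title": "Movie A", "content": "Plot summary of movie A."},
    {"id": "2", "title": "Movie B", "content": "Plot summary of movie B."},
]


async def main():
    store = InMemoryStore()
    print("First run: ", await upsert_items(store, sample_items))
    print("Second run:", await upsert_items(store, sample_items))


if __name__ == "__main__":
    # In a notebook cell, use `await main()` instead.
    asyncio.run(main())
```

The first run saves both records and the second run skips both, which is the same behavior the helper relies on when a notebook cell is re-executed.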

# 5. Add the Chat and Embedding models to the Semantic Kernel

Import and initialize the semantic kernel.

In [72]:
from semantic_kernel import Kernel

# Initialize the kernel
kernel = Kernel()

Import the needed libraries.

We need the chat completion service to hold a conversation and the text embedding service to generate embeddings.

In [73]:
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    AzureTextEmbedding,
)

Load the chat deployment name, initialize the chat completions with the required parameters, and add the created chat service to the semantic kernel instance.

In [79]:
# adding azure openai chat service
chat_model_deployment_name = os.environ.get("AZURE_OPENAI_CHAT_DEPLOYMENT_NAME")
endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
api_key = os.environ.get("AZURE_OPENAI_API_KEY")

kernel.add_service(
    AzureChatCompletion(
        service_id="chat_completion",
        deployment_name=chat_model_deployment_name,
        endpoint=endpoint,
        api_key=api_key,
    )
)
print("Added Azure OpenAI Chat Service...")

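Note: the output below comes from re-running this cell. Adding a service whose `service_id` is already registered on the same `kernel` instance raises an error; create a new `Kernel()` before re-running.

KernelFunctionAlreadyExistsError: Service with service_id 'chat_completion' already exists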

Load the embeddings deployment name, initialize the text embedding service with the required parameters, and add it to the semantic kernel instance.

In [75]:
# adding azure openai text embedding service
embedding_model_deployment_name = os.environ.get("AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME")

kernel.add_service(
    AzureTextEmbedding(
        service_id="text_embedding",
        deployment_name=embedding_model_deployment_name,
        endpoint=endpoint,
        api_key=api_key,
    )
)
print("Added Azure OpenAI Embedding Generation Service...")

Added Azure OpenAI Embedding Generation Service...


# 6. Create or Update Azure Cosmos DB for MongoDB

The semantic kernel can handle the creation of the database, collection, and index.

Import the Azure CosmosDB memory store and initialize it with the parameters defined before.

If the database, collection, and index already exist, they are not overwritten.

In [76]:
from semantic_kernel.connectors.memory.azure_cosmosdb import (
    AzureCosmosDBMemoryStore,
)

print("Creating or updating Azure Cosmos DB Memory Store...")
# create azure cosmos db for mongo db vcore api store and collection with vector ivf
# currently, semantic kernel only supports the ivf vector kind
store = await AzureCosmosDBMemoryStore.create(
    cosmos_connstr=os.environ.get("AZCOSMOS_CONNSTR"),
    cosmos_api="mongo-vcore",
    database_name=os.environ.get("AZCOSMOS_DATABASE_NAME"),
    collection_name=collection_name,
    index_name=index_name,
    vector_dimensions=vector_dimensions,
    num_lists=num_lists,
    similarity=similarity,
)
print("Finished updating Azure Cosmos DB Memory Store...")

Creating or updating Azure Cosmos DB Memory Store...
Finished updating Azure Cosmos DB Memory Store...


Add the created memory store to the semantic kernel instance.

In [77]:
from semantic_kernel.memory.semantic_text_memory import SemanticTextMemory
from semantic_kernel.core_plugins.text_memory_plugin import TextMemoryPlugin

memory = SemanticTextMemory(storage=store, embeddings_generator=kernel.get_service("text_embedding"))
kernel.add_plugin(TextMemoryPlugin(memory), "TextMemoryPluginACDB")
print("Registered Azure Cosmos DB Memory Store...")

Registered Azure Cosmos DB Memory Store...


# 7. Generate embeddings and Create Database records

Call the helper function with the JSON data file to generate embeddings and create or update the database records.

If a record already exists, it is skipped.

Records are identified by their ids.

The data used here is dummy data, which you can replace with your own.

**Note that each record needs `id`, `content`, and `title` fields, which the helper function maps to the `id`, `text`, and `description` of the memory record.
The text field is what gets converted to embeddings.**

See the helper function definition for more information.
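
If you bring your own data, it can help to check its shape before any embeddings are generated. The sketch below assumes the record keys that the helper function reads (`item["id"]`, `item["content"]`, and `item["title"]`); the function `find_invalid_records` and the sample records are hypothetical and exist only for illustration.

```python
# Keys read by the helper function defined in step 4.
REQUIRED_FIELDS = ("id", "title", "content")


def find_invalid_records(records):
    """Return (index, missing_fields) pairs for records lacking a required field."""
    problems = []
    for index, record in enumerate(records):
        missing = [field for field in REQUIRED_FIELDS if not record.get(field)]
        if missing:
            problems.append((index, missing))
    return problems


sample_records = [
    {"id": "1", "title": "Movie A", "content": "Plot summary of movie A."},
    {"id": "2", "title": "Movie B"},  # no content, so nothing to embed
]

for index, missing in find_invalid_records(sample_records):
    print(f"Record {index} is missing: {', '.join(missing)}")
```

With your own file, you would load it with `json.load` first and pass the resulting list to the check.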

In [78]:
# cleaned-top-movies-chunked.json contains the top 344 movies from the IMDB movies dataset
# You can also try text-sample.json, which contains 107 Azure services.
# To do so, replace the file name cleaned-top-movies-chunked.json with text-sample.json

print("Upserting data to Azure Cosmos DB Memory Store...")
await upsert_data_to_memory_store(memory, store, "./src/data/cleaned-top-movies-chunked.json")

Upserting data to Azure Cosmos DB Memory Store...
Skipping item already exits: 165 / 344

ServiceResponseException: ("<class 'semantic_kernel.connectors.ai.open_ai.services.azure_text_embedding.AzureTextEmbedding'> service failed to generate embeddings", APIConnectionError('Connection error.'))
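
A connection error like the one above interrupts the upload partway through. Since the helper skips records that already exist, re-running the cell resumes where it stopped. If you would rather retry automatically, the following is a minimal sketch of a retry wrapper with exponential backoff. The name `retry_async` and its parameters are assumptions made for illustration and are not part of Semantic Kernel.

```python
import asyncio


async def retry_async(operation, attempts=3, base_delay=1.0, retry_on=(Exception,)):
    """Await operation() up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts, surface the last error
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical usage in the notebook, with the names defined in earlier cells:
# await retry_async(
#     lambda: upsert_data_to_memory_store(memory, store, "./src/data/cleaned-top-movies-chunked.json")
# )
```

Passing a lambda means a fresh coroutine is created on every attempt, which matters because a coroutine object can be awaited only once.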

# 8. Test the Vector Database

The search function converts the query_term to a vector embedding and finds the similarity between it and the database records.

In [66]:
# each time it calls the embedding model to generate embeddings from your query
query_term = "What do you know about the godfather?"
result = await memory.search(collection_name, query_term)

ServiceResponseException: ("<class 'semantic_kernel.connectors.ai.open_ai.services.azure_text_embedding.AzureTextEmbedding'> service failed to generate embeddings", APIConnectionError('Connection error.'))

In [11]:
print(
    f"Result is: {result[0].text}\nRelevance Score: {result[0].relevance}\nFull Record: {result[0].additional_metadata}"
)

Result is: The Godfather: The aging patriarch of an organized crime dynasty transfers control of his clandestine empire to his reluctant son.
Relevance Score: 0.875003419815884
Full Record: {"text": "The Godfather: The aging patriarch of an organized crime dynasty transfers control of his clandestine empire to his reluctant son.", "description": "The Godfather", "additional_metadata": null}
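
Judging by the printed output, the full record appears to be a JSON string rather than a dictionary. Assuming that is the case, the sketch below parses it so that individual fields can be used on their own. The string is copied from the output above, and `parse_record` is a hypothetical helper.

```python
import json

# Copied from the "Full Record" output above.
full_record = (
    '{"text": "The Godfather: The aging patriarch of an organized crime dynasty '
    'transfers control of his clandestine empire to his reluctant son.", '
    '"description": "The Godfather", "additional_metadata": null}'
)


def parse_record(raw_record):
    """Parse the JSON string and return its (description, text) fields."""
    record = json.loads(raw_record)
    return record["description"], record["text"]


title, text = parse_record(full_record)
print("Title:", title)
print("Text: ", text)
```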


# 9. Create chat function with Azure OpenAI chat model

In [12]:
prompt = """
    You are a chatbot that can have conversations about any topic related to the provided context.
    Give explicit answers from the provided context or say 'I don't know' if the context does not contain an answer.
    provided context: {{$db_record}}

    User: {{$query_term}}
    Chatbot:"""
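
The `{{$db_record}}` and `{{$query_term}}` placeholders are filled in when the function is invoked. As a rough mental model only, the sketch below performs the same kind of substitution with a regular expression. It is not Semantic Kernel's template engine, and `render_template` and the sample values are assumptions made for illustration.

```python
import re

# Matches placeholders of the form {{$name}}.
VARIABLE_PATTERN = re.compile(r"\{\{\$(\w+)\}\}")


def render_template(template, variables):
    """Replace every {{$name}} placeholder with the matching value."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"Missing template variable: {name}")
        return str(variables[name])

    return VARIABLE_PATTERN.sub(substitute, template)


template = "provided context: {{$db_record}}\n\nUser: {{$query_term}}\nChatbot:"
rendered = render_template(
    template,
    {
        "db_record": "The Godfather: The aging patriarch of an organized crime dynasty...",
        "query_term": "What do you know about the godfather?",
    },
)
print(rendered)
```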

In [13]:
from semantic_kernel.connectors.ai.open_ai import OpenAITextPromptExecutionSettings

execution_settings = OpenAITextPromptExecutionSettings(
    service_id="chat_completion", ai_model_id=chat_model_deployment_name, max_tokens=500, temperature=0.0, top_p=0.5
)

In [14]:
from semantic_kernel.prompt_template import PromptTemplateConfig
from semantic_kernel.prompt_template.input_variable import InputVariable

chat_prompt_template_config = PromptTemplateConfig(
    template=prompt,
    name="grounded_response",
    template_format="semantic-kernel",
    input_variables=[
        InputVariable(name="db_record", description="The database record", is_required=True),
        InputVariable(name="query_term", description="The user input", is_required=True),
    ],
    execution_settings=execution_settings,
)

In [15]:
chat_function = kernel.add_function(
    function_name="ChatGPTFunc", plugin_name="chatGPTPlugin", prompt_template_config=chat_prompt_template_config
)

In [16]:
from semantic_kernel.functions import KernelArguments


completions_result = await kernel.invoke(
    chat_function, KernelArguments(query_term=query_term, db_record=result[0].additional_metadata)
)

In [17]:
print(completions_result)

The Godfather is a movie about the aging patriarch of an organized crime dynasty who transfers control of his clandestine empire to his reluctant son.


# 10. Testing the RAG flow 

In [18]:
import asyncio

while True:
    query_term = input("Enter a query: ")
    # stop before spending an embedding call and a chat call on the exit command
    if query_term == "exit":
        break
    search_result = await memory.search(collection_name, query_term)
    completions_result = kernel.invoke_stream(
        chat_function, KernelArguments(query_term=query_term, db_record=search_result[0].additional_metadata)
    )
    print(f"Question:\n{query_term}\nResponse:")
    async for completion in completions_result:
        print(str(completion[0]), end="")
    print("\n")
    # pause between queries without blocking the event loop
    await asyncio.sleep(5)

Question:
Hey
Response:
Hello! How can I assist you today?

Question:
Do you know any crime dynasty movies?
Response:
Yes, "The Godfather" is a classic crime dynasty movie.

Question:
can you recommend me movies like the god father?
Response:
Sure, if you enjoyed The Godfather, you might also like movies such as Goodfellas, The Departed, Scarface, and The Sopranos (TV series).

Question:
thanks, bye!
Response:
You're welcome! Goodbye!



# **[Optional]** Adding Chat History

This chat history is local (that is, held in your computer's RAM) and is not persisted beyond the life of this Jupyter session.

In this chat scenario, as the user talks back and forth with the bot, the chat history accumulates the conversation. On each invocation, the kernel arguments pass the prompt variables, including the chat history, to the model.
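
Because the whole history is attached to every prompt, the prompt grows with each turn. The sketch below illustrates the idea with a hypothetical `SimpleChatHistory` class that stores messages in a list and optionally keeps only the most recent ones. It is a simplified stand-in written for illustration and does not reproduce Semantic Kernel's `ChatHistory` class or its output format.

```python
class SimpleChatHistory:
    """Hypothetical, simplified chat history kept in local memory."""

    def __init__(self, system_message, max_messages=None):
        self.system_message = system_message
        self.max_messages = max_messages  # None means unbounded
        self.messages = []  # list of (role, content) tuples

    def add(self, role, content):
        self.messages.append((role, content))
        if self.max_messages is not None and len(self.messages) > self.max_messages:
            # Drop the oldest messages so the prompt stays bounded.
            self.messages = self.messages[len(self.messages) - self.max_messages:]

    def render(self):
        """Return the history as plain text, one message per line."""
        lines = [f"system: {self.system_message}"]
        lines += [f"{role}: {content}" for role, content in self.messages]
        return "\n".join(lines)


history = SimpleChatHistory("You are a helpful chatbot.", max_messages=2)
history.add("user", "Hey")
history.add("assistant", "Hello! How can I assist you today?")
history.add("user", "Do you know any comedy movies?")
print(history.render())
```

Trimming trades recall of early turns for a smaller prompt; whether that trade is worthwhile depends on the model's context window and on how long your conversations run.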

In [19]:
history_prompt = """
    You are a chatbot that can have conversations about any topic related to the provided context.
    Give explicit answers from the provided context or say 'I don't know' if the context does not contain an answer.
    provided context: {{$db_record}}

    {{$history}}
    
    User: {{$query_term}}
    Chatbot:"""

In [20]:
chat_prompt_hist_template_config = PromptTemplateConfig(
    template=history_prompt,
    name="grounded_response_history",
    template_format="semantic-kernel",
    input_variables=[
        InputVariable(name="db_record", description="The database record", is_required=True),
        InputVariable(name="query_term", description="The user input", is_required=True),
        InputVariable(name="history", description="The chat history", is_required=True),
    ],
    execution_settings=execution_settings,
)

chat_history_function = kernel.add_function(
    function_name="ChatGPTFuncHist", plugin_name="chatGPTPluginHist", prompt_template_config=chat_prompt_hist_template_config
)

In [21]:
from semantic_kernel.contents import ChatHistory

chat_history = ChatHistory()
chat_history.add_system_message("You are a helpful chatbot who is good about giving movie recommendations.")

In [22]:
import asyncio

while True:
    query_term = input("Enter a query: ")
    # stop before spending an embedding call and a chat call on the exit command
    if query_term == "exit":
        break

    search_result = await memory.search(collection_name, query_term)  # vector search

    completions_result = await kernel.invoke(
        chat_history_function,
        KernelArguments(query_term=query_term, db_record=search_result[0].additional_metadata, history=chat_history),
    )  # RAG
    # record the exchange after the call, because the prompt already contains the current query
    chat_history.add_user_message(query_term)
    chat_history.add_assistant_message(str(completions_result))

    print(f"Question:\n{query_term}\nResponse:")
    print(str(completions_result), end="")
    print("\n")
    # pause between queries without blocking the event loop
    await asyncio.sleep(5)

Question:
Hey
Response:
Hello! How can I assist you today?

Question:
Do you know any comedy movies?
Response:
Yes, I can recommend some comedy movies for you. What type of comedy are you in the mood for? Romantic comedy, slapstick, satire, parody, or something else?

Question:
paradoy
Response:
Great choice! Here are some parody movies that you might enjoy:

1. Airplane! (1980)
2. The Naked Gun: From the Files of Police Squad! (1988)
3. This Is Spinal Tap (1984)
4. Austin Powers: International Man of Mystery (1997)
5. Spaceballs (1987)
6. Robin Hood: Men in Tights (1993)
7. Hot Shots! (1991)
8. Scary Movie (2000)
9. Shaun of the Dead (2004)
10. The Princess Bride (1987)

I hope you find one that you enjoy!

Question:
can you tell me more about the first movie?
Response:
Sure! "Airplane!" is a classic comedy movie from 1980 that is a parody of disaster movies. The movie follows the story of a former fighter pilot named Ted Striker, who is traumatized by his experiences in the war and i

After chatting for a while, we have built a growing history, which we are attaching to each prompt and which contains the full conversation. Let's take a look!

In [23]:
print(chat_history)

<chat_history><message role="system">You are a helpful chatbot who is good about giving movie recommendations.</message><message role="user">Hey</message><message role="assistant">Hello! How can I assist you today?</message><message role="user">Do you know any comedy movies?</message><message role="assistant">Yes, I can recommend some comedy movies for you. What type of comedy are you in the mood for? Romantic comedy, slapstick, satire, parody, or something else?</message><message role="user">paradoy</message><message role="assistant">Great choice! Here are some parody movies that you might enjoy:

1. Airplane! (1980)
2. The Naked Gun: From the Files of Police Squad! (1988)
3. This Is Spinal Tap (1984)
4. Austin Powers: International Man of Mystery (1997)
5. Spaceballs (1987)
6. Robin Hood: Men in Tights (1993)
7. Hot Shots! (1991)
8. Scary Movie (2000)
9. Shaun of the Dead (2004)
10. The Princess Bride (1987)

I hope you find one that you enjoy!</message><message role="user">can you t