

#### Lab 04b: Semantic Kernel Advanced Topics (Chat with Your Data Scenario Using Vector Search)

In this lab, we will learn how to create a virtual agent using Semantic Kernel, Azure OpenAI models, and Azure AI Search as a vector store.

**Create Resources**

First, create an Azure OpenAI service with a gpt-4 or gpt-35-turbo deployment using the Azure Portal. For better results, use a gpt-4 model.

In the same Azure OpenAI service, create a text-embedding-ada-002 (version 2) model deployment named text-embedding-ada-002. The code in this lab refers to the embedding deployment by that name.

Finally, create an Azure AI Search service.

**Kernel Configuration**

Configure the environment variables that Semantic Kernel will use to connect to these services by creating an **.env** file.

Use **.env.template** as a starting point: rename it to **.env** and replace the variables with the values for your resources.
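
Before running the notebook, it can help to confirm that every value in the **.env** file is filled in. The following is a minimal sketch using only the standard library. The variable names in `REQUIRED_KEYS` are assumptions about what the Semantic Kernel dotenv helpers read, so check them against **.env.template**; `parse_env` and `missing_keys` are hypothetical helper names.

```python
from pathlib import Path
from typing import Dict, List

# Assumed variable names; confirm them against .env.template.
REQUIRED_KEYS = [
    "AZURE_OPENAI_DEPLOYMENT_NAME",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_AISEARCH_URL",
    "AZURE_AISEARCH_API_KEY",
]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required: List[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        print("Missing or empty:", missing_keys(env_file.read_text()))
    else:
        print("No .env file found in the current directory")
```

An empty list means every expected variable has a value; it does not verify that the keys or endpoints are valid.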

**Initialize the Kernel**

Initialize the kernel and register Azure AI Search as a persistent **Semantic Memory** store.

In [1]:
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion, AzureTextEmbedding
from semantic_kernel.connectors.memory.azure_cognitive_search import AzureCognitiveSearchMemoryStore
from semantic_kernel.core_skills import ConversationSummarySkill, TextMemorySkill
import time

# Initialize the kernel

kernel = sk.Kernel()

# Configure AI service used by the kernel

deployment, api_key, endpoint, api_version = sk.azure_openai_settings_from_dot_env(include_api_version=True)
kernel.add_chat_service(
    "chat-completion",
    AzureChatCompletion(deployment_name=deployment, endpoint=endpoint, api_key=api_key, api_version=api_version),
)
kernel.add_text_embedding_generation_service(
    "ada",
    AzureTextEmbedding("text-embedding-ada-002", endpoint=endpoint, api_key=api_key, api_version=api_version),
)

# register AI Search as a memory store

azure_ai_search_api_key, azure_ai_search_url = sk.azure_aisearch_settings_from_dot_env()
kernel.register_memory_store(
    memory_store=AzureCognitiveSearchMemoryStore(
        vector_size=1536,
        search_endpoint=azure_ai_search_url,
        admin_key=azure_ai_search_api_key
    )
)

**Create the RAGChat Plugin**

In [2]:
!mkdir -p plugins/RAGChat/Chat

In [3]:
%%writefile plugins/RAGChat/Chat/config.json
{
     "schema": 1,
     "type": "completion",
     "description": "Based on the user ask, conversation history, search the memory for sources and answer the user.",
     "completion": {
          "max_tokens": 200,
          "temperature": 0.8,
          "top_p": 0.0,
          "presence_penalty": 0.0,
          "frequency_penalty": 0.0
     },
     "input": {
          "parameters": [
               {
                    "name": "ask",
                    "description": "The user's ask.",
                    "defaultValue": "",
                    "required": true
               },
               {
                    "name": "chat_history",
                    "description": "The conversation history.",
                    "defaultValue": "",
                    "required": true
               }                
          ]
     }
}

Overwriting plugins/RAGChat/Chat/config.json


In [4]:
%%writefile plugins/RAGChat/Chat/skprompt.txt
## Task Goal
The task goal is to generate an ANSWER based on the user QUESTION and the provided SOURCES.
 
## Task instructions
 
You will be given a list of SOURCES that you can use to ANSWER the QUESTION. 
You will be given a conversation HISTORY to give you more context. 
You must use the SOURCES to ANSWER the QUESTION. 
You must not use any other SOURCES.
You must not use your own knowledge to ANSWER the QUESTION.
Do not include the word "ANSWER" in your response.
Always include the SOURCE name for each fact in the response, referencing it with square brackets, e.g., [info1.txt]. 
Do not combine SOURCES; list each source separately, e.g., [info1.txt][info2.pdf].

## Task Input:
"QUESTION": "{{$ask}}"
"HISTORY": "{{ConversationSummaryPlugin.SummarizeConversation $chat_history}}"
"SOURCES": "{{TextMemoryPlugin.recall $ask}}"
 
## Task Output:


Overwriting plugins/RAGChat/Chat/skprompt.txt
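
The `{{...}}` tokens in the prompt are filled in before the text is sent to the model: `{{$ask}}` inserts a variable, while `{{TextMemoryPlugin.recall $ask}}` calls a function with that variable and inserts its result. The toy renderer below illustrates the idea using only the standard library. It is not Semantic Kernel's template engine; `render` and the stub functions are hypothetical and handle only the two token shapes used in this prompt.

```python
import re
from typing import Callable, Dict

# Matches {{$name}} or {{Plugin.function $name}}.
TOKEN = re.compile(r"\{\{\s*(?:(?P<func>[\w.]+)\s+)?\$(?P<var>\w+)\s*\}\}")


def render(
    template: str,
    variables: Dict[str, str],
    functions: Dict[str, Callable[[str], str]],
) -> str:
    """Replace each token with a variable value or a function result."""

    def substitute(match: "re.Match[str]") -> str:
        value = variables.get(match.group("var"), "")
        func_name = match.group("func")
        if func_name is None:
            return value  # plain variable, e.g. {{$ask}}
        return functions[func_name](value)  # function call on the variable

    return TOKEN.sub(substitute, template)


if __name__ == "__main__":
    template = (
        '"QUESTION": "{{$ask}}"\n'
        '"HISTORY": "{{ConversationSummaryPlugin.SummarizeConversation $chat_history}}"\n'
        '"SOURCES": "{{TextMemoryPlugin.recall $ask}}"'
    )
    # Stub functions standing in for the real plugins.
    stubs = {
        "ConversationSummaryPlugin.SummarizeConversation": lambda text: text.strip(),
        "TextMemoryPlugin.recall": lambda query: f"[stub source for: {query}]",
    }
    print(render(template, {"ask": "What is Semantic Kernel?", "chat_history": ""}, stubs))
```

Because the function names in the prompt must match the names under which the plugins are registered, a typo in either place would leave the corresponding field unresolved.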


**Load Documents into Memory**

In [5]:
print("Loading documents into memory")

await kernel.memory.save_information_async(
    "kb", id="https://learn.microsoft.com/en-us/semantic-kernel/overview/", text="[https://learn.microsoft.com/en-us/semantic-kernel/overview/] Semantic Kernel is an open-source SDK that lets you easily combine AI services like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C# and Python to build LLM AI models. By doing so, you can create AI apps that combine the best of both worlds."
)
await kernel.memory.save_information_async(
    "kb", id="https://learn.microsoft.com/en-us/semantic-kernel/prompt-engineering/", text="[https://learn.microsoft.com/en-us/semantic-kernel/prompt-engineering/] Effective prompt design is essential to achieving desired outcomes with LLM AI models. Prompt engineering, also known as prompt design, is an emerging field that requires creativity and attention to detail. It involves selecting the right words, phrases, symbols, and formats that guide the model in generating high-quality and relevant texts."
)
await kernel.memory.save_information_async(
    "kb", id="https://learn.microsoft.com/en-us/semantic-kernel/prompt-engineering/llm-models/", text="[https://learn.microsoft.com/en-us/semantic-kernel/prompt-engineering/llm-models/] A GPT model is a type of neural network that uses the transformer architecture to learn from large amounts of text data. The model has two main components: an encoder and a decoder. The encoder processes the input text and converts it into a sequence of vectors, called embeddings, that represent the meaning and context of each word. The decoder generates the output text by predicting the next word in the sequence, based on the embeddings and the previous words. The model uses a technique called attention to focus on the most relevant parts of the input and output texts, and to capture long-range dependencies and relationships between words. The model is trained by using a large corpus of texts as both the input and the output, and by minimizing the difference between the predicted and the actual words. The model can then be fine-tuned or adapted to specific tasks or domains, by using smaller and more specialized datasets."
)
await kernel.memory.save_information_async(
    "kb", id="https://learn.microsoft.com/en-us/semantic-kernel/memories/", text="[https://learn.microsoft.com/en-us/semantic-kernel/memories/] Embeddings are a way of representing words or other data as vectors in a high-dimensional space. Vectors are like arrows that have a direction and a length. High-dimensional means that the space has many dimensions, more than we can see or imagine. The idea is that similar words or data will have similar vectors, and different words or data will have different vectors. This helps us measure how related or unrelated they are, and also perform operations on them, such as adding, subtracting, multiplying, etc. Embeddings are useful for AI models because they can capture the meaning and context of words or data in a way that computers can understand and process."
)

print("Done")

Loading documents into memory


Done
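
The four calls above follow the same pattern: the URL serves as the record id and is also prefixed to the text in square brackets so that the model can cite it. A minimal sketch of that pattern as a loop is shown below. `save_documents` is a hypothetical helper, and it assumes the save function accepts a collection name plus `id` and `text` keyword arguments, as in the cell above. The demo uses a stand-in coroutine instead of a real memory store.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple


async def save_documents(
    save: Callable[..., Awaitable[object]],
    collection: str,
    documents: Dict[str, str],
) -> int:
    """Save each document, prefixing its text with the source URL in brackets."""
    for url, text in documents.items():
        await save(collection, id=url, text=f"[{url}] {text}")
    return len(documents)


if __name__ == "__main__":
    saved: List[Tuple[str, str, str]] = []

    async def fake_save(collection: str, id: str, text: str) -> None:
        # Stand-in for a real memory store; it only records the call.
        saved.append((collection, id, text))

    docs = {
        "https://learn.microsoft.com/en-us/semantic-kernel/overview/": "Semantic Kernel is an open-source SDK.",
    }
    count = asyncio.run(save_documents(fake_save, "kb", docs))
    print(f"Saved {count} document(s)")
```

In the notebook, the same helper could presumably be called as `await save_documents(kernel.memory.save_information_async, "kb", docs)`.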


**Load Plugins**

In [6]:
# Import plugins and initialize context

text_memory_skill = kernel.import_skill(TextMemorySkill(), skill_name="TextMemoryPlugin")
conversation_summary_plugin = kernel.import_skill(ConversationSummarySkill(kernel=kernel), skill_name="ConversationSummaryPlugin")
ragchat_plugin = kernel.import_semantic_skill_from_directory("./plugins", "RAGChat")

context = kernel.create_new_context()
context[sk.core_skills.TextMemorySkill.COLLECTION_PARAM] = "kb"
context[sk.core_skills.TextMemorySkill.RELEVANCE_PARAM] = 0.8
context[sk.core_skills.TextMemorySkill.LIMIT_PARAM] = 3
context["chat_history"] = ""

**Test Memory Search**

In [7]:
ask = "What are LLM AI Apps?"

# first test the recall function.
variables = sk.ContextVariables()
variables["collection"] = "kb"
variables["relevance"] = 0.8
variables["limit"] = 3
variables["input"] = ask
output_context = await kernel.run_async(
    text_memory_skill["recall"],
    input_vars = variables
)
if output_context.error_occurred:
    print(output_context.last_error_description)
else:
    print("Recall:", output_context.result)

# memory search
result = await kernel.memory.search_async("kb", ask)
print(f"Memory search: {result[0].text}\n")

Recall: ["[https://learn.microsoft.com/en-us/semantic-kernel/overview/] Semantic Kernel is an open-source SDK that lets you easily combine AI services like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C# and Python to build LLM AI models. By doing so, you can create AI apps that combine the best of both worlds.", "[https://learn.microsoft.com/en-us/semantic-kernel/prompt-engineering/] Effective prompt design is essential to achieving desired outcomes with LLM AI models. Prompt engineering, also known as prompt design, is an emerging field that requires creativity and attention to detail. It involves selecting the right words, phrases, symbols, and formats that guide the model in generating high-quality and relevant texts.", "[https://learn.microsoft.com/en-us/semantic-kernel/prompt-engineering/llm-models/] A GPT model is a type of neural network that uses the transformer architecture to learn from large amounts of text data. The model has two main
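
The `relevance` and `limit` settings control which stored records reach the prompt: records scoring below the relevance threshold are dropped, and at most `limit` of the remainder are returned. The sketch below illustrates this filtering with cosine similarity over toy 3-dimensional vectors (text-embedding-ada-002 produces 1536-dimensional vectors, which is why `vector_size=1536` is used above). It is a conceptual illustration only: the actual scoring is done by Azure AI Search and depends on the index configuration, and `cosine_similarity` and `recall` here are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recall(
    query: Sequence[float],
    store: Dict[str, Sequence[float]],
    min_relevance: float = 0.8,
    limit: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to `limit` (id, score) pairs scoring at least `min_relevance`."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    kept = [item for item in scored if item[1] >= min_relevance]
    kept.sort(key=lambda item: item[1], reverse=True)  # best match first
    return kept[:limit]


if __name__ == "__main__":
    # Toy vectors standing in for real embeddings.
    store = {
        "overview": [1.0, 0.0, 0.0],
        "prompt-engineering": [0.9, 0.1, 0.0],
        "memories": [0.0, 1.0, 0.0],
    }
    for doc_id, score in recall([1.0, 0.0, 0.0], store):
        print(f"{doc_id}: {score:.3f}")
```

When no stored vector clears the threshold, the result is an empty list, and the SOURCES field of the prompt would then be empty.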

**Create the Chat Conversation Loop**

In [8]:
async def chat(
    kernel: sk.Kernel, chat_func: sk.SKFunctionBase, context: sk.SKContext
) -> bool:
    try:
        user_input = input("User:> ")
        context["ask"] = user_input
        print(f"User:> {user_input}")

    except KeyboardInterrupt:
        print("\n\nExiting chat...")
        return False
    
    except Exception:
        print("\n\nExiting chat...")
        return False

    if user_input == "exit":
        print("\n\nExiting chat...")
        return False

    answer = await kernel.run_async(chat_func, input_vars=context.variables)
    if answer.error_occurred:
        answer = answer.last_error_description
    
    context["chat_history"] += f"\nUser:> {user_input}\nChatBot:> {answer}\n"

    print(f"ChatBot:> {answer}")
    
    return True
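
The `chat` function mixes console I/O with the turn logic, which makes it awkward to test without typing input. One possible refactoring, sketched here under the assumption that the model call can be passed in as a plain function, separates the two. `handle_turn` and `echo` are hypothetical names; in the notebook the answer function would wrap the `kernel.run_async` call.

```python
from typing import Callable, Optional, Tuple


def handle_turn(
    user_input: str,
    history: str,
    answer_fn: Callable[[str, str], str],
) -> Tuple[Optional[str], str]:
    """Process one chat turn.

    Returns (answer, updated_history). The answer is None when the user
    typed "exit", in which case the history is returned unchanged.
    """
    if user_input == "exit":
        return None, history
    answer = answer_fn(user_input, history)
    # Same history format as the chat function above.
    history += f"\nUser:> {user_input}\nChatBot:> {answer}\n"
    return answer, history


if __name__ == "__main__":
    def echo(ask: str, history: str) -> str:
        # Stand-in for the model call.
        return f"You asked: {ask}"

    history = ""
    for text in ["hello", "what is semantic kernel?", "exit"]:
        answer, history = handle_turn(text, history, echo)
        if answer is None:
            print("Exiting chat...")
            break
        print(f"ChatBot:> {answer}")
```

Since the full history string grows with every turn, the prompt passes it through the conversation summary function rather than including it verbatim.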

**Run the Chat Loop**

In [9]:
chatting = True
while chatting:
    chatting = await chat(kernel, ragchat_plugin["Chat"], context)

User:> hello
ChatBot:> Since there are no SOURCES provided and the QUESTION "hello" does not request specific information, a factual answer cannot be generated. If you have a specific question or need information, please provide more context or sources for reference.
User:> what is semanti kernel?
ChatBot:> Semantic Kernel is an open-source Software Development Kit (SDK) that enables the integration of AI services such as OpenAI, Azure OpenAI, and Hugging Face with traditional programming languages, including C# and Python. This integration facilitates the development of Large Language Model (LLM) AI applications that leverage the strengths of both AI services and conventional programming approaches [https://learn.microsoft.com/en-us/semantic-kernel/overview/].
User:> and who is pele?
ChatBot:> Since there are no SOURCES provided, I cannot give you a sourced answer about who Pelé is. If you can provide a source, I'd be happy to help with information from it.
User:> ok
ChatBot:> Since t

**Run It in a Terminal**

In [1]:
!mkdir -p src/

In [3]:
%%writefile src/ChatBot.py

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion, AzureTextEmbedding
from semantic_kernel.connectors.memory.azure_cognitive_search import AzureCognitiveSearchMemoryStore
from semantic_kernel.core_skills import ConversationSummarySkill, TextMemorySkill
import asyncio

# Initialize the kernel

kernel = sk.Kernel()

# Configure AI service used by the kernel

deployment, api_key, endpoint, api_version = sk.azure_openai_settings_from_dot_env(include_api_version=True)
kernel.add_chat_service(
    "chat-completion",
    AzureChatCompletion(deployment_name=deployment, endpoint=endpoint, api_key=api_key, api_version=api_version),
)
kernel.add_text_embedding_generation_service(
    "ada",
    AzureTextEmbedding("text-embedding-ada-002", endpoint=endpoint, api_key=api_key, api_version=api_version),
)

# register AI Search as a memory store

azure_ai_search_api_key, azure_ai_search_url = sk.azure_aisearch_settings_from_dot_env()
kernel.register_memory_store(
    memory_store=AzureCognitiveSearchMemoryStore(
        vector_size=1536,
        search_endpoint=azure_ai_search_url,
        admin_key=azure_ai_search_api_key
    )
)

# Import plugins and initialize context

text_memory_skill = kernel.import_skill(TextMemorySkill(), skill_name="TextMemoryPlugin")
conversation_summary_plugin = kernel.import_skill(ConversationSummarySkill(kernel=kernel), skill_name="ConversationSummaryPlugin")
ragchat_plugin = kernel.import_semantic_skill_from_directory("./plugins", "RAGChat")

context = kernel.create_new_context()
context[sk.core_skills.TextMemorySkill.COLLECTION_PARAM] = "kb"
context[sk.core_skills.TextMemorySkill.RELEVANCE_PARAM] = 0.8
context[sk.core_skills.TextMemorySkill.LIMIT_PARAM] = 3
context["chat_history"] = ""

# Chat flow

async def chat(
    kernel: sk.Kernel, chat_func: sk.SKFunctionBase, context: sk.SKContext
) -> bool:
    try:
        user_input = input("User:> ")
        context["ask"] = user_input

    except KeyboardInterrupt:
        print("\n\nExiting chat...")
        return False
    
    except Exception:
        print("\n\nExiting chat...")
        return False

    if user_input == "exit":
        print("\n\nExiting chat...")
        return False

    answer = await kernel.run_async(chat_func, input_vars=context.variables)
    if answer.error_occurred:
        answer = answer.last_error_description
    
    context["chat_history"] += f"\nUser:> {user_input}\nChatBot:> {answer}\n"

    print(f"ChatBot:> {answer}")
    
    return True

async def main():
    chatting = True
    while chatting:
        chatting = await chat(kernel, ragchat_plugin["Chat"], context)

# Run the chat loop until the user exits.
# asyncio.run creates the event loop, runs main() to completion, and closes the loop.
if __name__ == "__main__":
    asyncio.run(main())


Writing src/ChatBot.py


Run this in your terminal:

```
cd lesson_04
python src/ChatBot.py
```