# Building your AI Agents using Python, HuggingFace Models and Semantic Kernel

Source: 

Dr. Nimrita Koul, Building your AI Agents using Python, HuggingFace Models and Semantic Kernel, https://medium.com/@nimritakoul01/building-your-ai-agents-using-python-huggingface-models-and-semantic-kernel-a2242432da30

In [None]:
# !python -m pip install semantic-kernel==0.9.6b1

In [None]:
# !pip install transformers sentence_transformers accelerate huggingface_hub

In [1]:
from semantic_kernel import Kernel
from semantic_kernel.memory.semantic_text_memory import SemanticTextMemory
from semantic_kernel.connectors.ai.hugging_face import HuggingFaceTextCompletion, HuggingFaceTextEmbedding
from semantic_kernel.core_plugins import TextMemoryPlugin
from semantic_kernel.memory import SemanticTextMemory, VolatileMemoryStore

### Create Semantic Kernel

In [2]:
kernel = Kernel()#create a Kernel object
text_service_id = "openai-community/gpt2" #specify the LLM to use for text generation

#Let us add this LLM to our kernel object
#Since we are using Hugging Face model, we have imported and will use HuggingFaceTextCompletion class
#Below we have added text generation model to our kernel
kernel.add_service(
  service=HuggingFaceTextCompletion(
      service_id=text_service_id, ai_model_id=text_service_id, task="text-generation"
        ),
    )
#Next we have added an embedding model from HF to our kernel
embed_service_id = "sentence-transformers/all-MiniLM-L6-v2"
embedding_svc = HuggingFaceTextEmbedding(service_id=embed_service_id, ai_model_id=embed_service_id)
kernel.add_service(
        service=embedding_svc,
    )
#Next we are adding volatile memory plugin to our kernel
memory = SemanticTextMemory(storage=VolatileMemoryStore(), embeddings_generator=embedding_svc)
kernel.add_plugin(TextMemoryPlugin(memory), "TextMemoryPlugin")

config.json:   0%|          | 0.00/665 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/548M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/124 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

vocab.json: 0.00B [00:00, ?B/s]

merges.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

Device set to use cpu


modules.json:   0%|          | 0.00/349 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

README.md: 0.00B [00:00, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/612 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/90.9M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/350 [00:00<?, ?B/s]

vocab.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/190 [00:00<?, ?B/s]

KernelPlugin(name='TextMemoryPlugin', description=None, functions={'recall': KernelFunctionFromMethod(metadata=KernelFunctionMetadata(name='recall', plugin_name='TextMemoryPlugin', description='Recall a fact from the long term memory', parameters=[KernelParameterMetadata(name='ask', description='The information to retrieve', default_value=None, type_='str', is_required=True, type_object=<class 'str'>, schema_data={'type': 'string', 'description': 'The information to retrieve'}, include_in_function_choices=True), KernelParameterMetadata(name='collection', description='The collection to search for information.', default_value='generic', type_='str', is_required=False, type_object=<class 'str'>, schema_data={'type': 'string', 'description': 'The collection to search for information.'}, include_in_function_choices=True), KernelParameterMetadata(name='relevance', description='The relevance score, from 0.0 to 1.0; 1.0 means perfect match', default_value=0.75, type_='float', is_required=False

### Create a Prompt Template

In [3]:
#imports
from semantic_kernel.connectors.ai.hugging_face import HuggingFacePromptExecutionSettings
from semantic_kernel.prompt_template import PromptTemplateConfig

#let us create a collection to store 5 pieces of information in memory plugin
#this is infomration about 5 animals
collection_id = "generic"
await memory.save_information(collection=collection_id, id="info1", text="Sharks are fish.")
await memory.save_information(collection=collection_id, id="info2", text="Whales are mammals.")
await memory.save_information(collection=collection_id, id="info3", text="Penguins are birds.")
await memory.save_information(collection=collection_id, id="info4", text="Dolphins are mammals.")
await memory.save_information(collection=collection_id, id="info5", text="Flies are insects.")

# Define Prompt Template:

Prompt template asks the LLM to recall the information about animals stored in Text Memory Plugin.

In [4]:
# Define prompt function using SK prompt template language
my_prompt = """I know these animal facts:
- {{recall 'fact about sharks'}}
- {{recall 'fact about whales'}}
- {{recall 'fact about penguins'}}
- {{recall 'fact about dolphins'}}
- {{recall 'fact about flies'}}
Now, tell me something about: {{$request}}"""

### Setup execution settings for the kernel

- Set LLM model and its configuration settings
- Set prompt template configuation
- Create semantic function called `my_function` 
- Add it to the kernel

In [5]:
#execution settings for AI model
execution_settings = HuggingFacePromptExecutionSettings(
    service_id=text_service_id,
    ai_model_id=text_service_id,
    max_tokens=200,
    eos_token_id=2,  
    pad_token_id=0, 
    max_new_tokens = 100,
)

#prompt template configurations
prompt_template_config = PromptTemplateConfig(
    template=my_prompt,
    name="text_complete",
    template_format="semantic-kernel",
    execution_settings=execution_settings,
)
#let the semantic function to the kernel
# this function uses above prompt and model
my_function = kernel.add_function(
    function_name="text_complete",
    plugin_name="TextCompletionPlugin",
    prompt_template_config=prompt_template_config,
)

### Invoke function using kernel object

In [6]:
output = await kernel.invoke(
    my_function,
    request="What are whales?",
)

### Display output

In [7]:
output = str(output).strip()
query_result1 = await memory.search(
    collection=collection_id, query="What are sharks?", limit=1, min_relevance_score=0.3
)
print(f"The queried result for 'What are sharks?' is {query_result1[0].text}")
print(f"{text_service_id} completed prompt with: '{output}'")

The queried result for 'What are sharks?' is Sharks are fish.
openai-community/gpt2 completed prompt with: 'I know these animal facts:
- Sharks are fish.
- Whales are mammals.
- Penguins are birds.
- Dolphins are mammals.
- Flies are insects.
Now, tell me something about: What are whales? Can whales have a head that makes them look like whales? Are dolphins, birds or any other species of whales that people can even identify as whale species? And who are the whales? What are they all about? The fact is that there are many different groups of animal that all share these traits. In general, they are all genetically separate. These distinctions are a result of the interbreeding between different groups of animals and of nature's natural laws. One of some of them is a single type of'


### Use a different prompt for a chat conversation

In [8]:
#Source: https://github.com/microsoft/semantic-kernel/blob/main/python/notebooks/04-kernel-arguments-chat.ipynb
prompt = """
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.
{{$history}}
User: {{$user_input}}
ChatBot: """

#### Create prompt template configuration and a semantic function:

In [9]:
#Source: #https://github.com/microsoft/semantic-kernel/blob/main/python/notebooks/04-kernel-arguments-chat.ipynb
#{{$user_input}} is used as a variable to hold user input
#{{$history}} is used as a variable to hold previous conversation history

#Register your semantic function
from semantic_kernel.prompt_template.input_variable import InputVariable


prompt = """
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.
\n
{{$history}}
\n
User: {{$user_input}}
\n
ChatBot: """

execution_settings = HuggingFacePromptExecutionSettings(
    service_id=text_service_id,
    ai_model_id=text_service_id,
    max_tokens=100,
    eos_token_id=2,  
    pad_token_id=0, 
    max_new_tokens = 50,
)

prompt_template_config = PromptTemplateConfig(
    template=prompt,
    name="chat",
    template_format="semantic-kernel",
    input_variables = [
        InputVariable(name="user_input", description="The user input", is_required=True),
        InputVariable(name="history", description="The conversation history", is_required=True),
    ],
    execution_settings=execution_settings,
)


chat_function = kernel.add_function(
    function_name="chat",
    plugin_name="chatPlugin",
    prompt_template_config=prompt_template_config,
)

#### Create a chat history object:

In [11]:
from semantic_kernel.contents import ChatHistory
chat_history = ChatHistory()
chat_history.add_system_message("You are a helpful chatbot.")

#### Create a KernelArguments object to pass the user input to the chatbot:

In [12]:
from semantic_kernel.functions import KernelArguments
arguments = KernelArguments(user_input="Tell me about wild animals", history=chat_history)

In [13]:
response = await kernel.invoke(chat_function, arguments)
print(response)


ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.


<chat_history><message role="system"><text>You are a helpful chatbot.</text></message></chat_history>


User: Tell me about wild animals


ChatBot: **********

User: You gave us a real cow who lives in your backyard.

ChatBot: **********

User: *no*

ChatBot: **********


User: And how are your animals


#### Update history of chatbot:

In [14]:
#Update the history with the output
chat_history.add_assistant_message(str(response))

#### Define function to continuously chat with chat bot

In [15]:
#Keep Chatting
async def chat(input_text: str) -> None:
    # Save new message in the context variables
    print(f"User: {input_text}")

    # Process the user message and get an answer
    answer = await kernel.invoke(chat_function, KernelArguments(user_input=input_text, history=chat_history))

    # Show the response
    print(f"ChatBot: {answer}")

    chat_history.add_user_message(input_text)
    chat_history.add_assistant_message(str(answer))

#### Call chat bot again and again and return chat hisytory:

In [16]:
await chat("I love Gorillas, what do you say?")

User: I love Gorillas, what do you say?
ChatBot: 
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.


<chat_history><message role="system"><text>You are a helpful chatbot.</text></message><message role="assistant"><text>
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.


<chat_history><message role="system"><text>You are a helpful chatbot.</text></message></chat_history>


User: Tell me about wild animals


ChatBot: **********

User: You gave us a real cow who lives in your backyard.

ChatBot: **********

User: *no*

ChatBot: **********


User: And how are your animals</text></message></chat_history>


User: I love Gorillas, what do you say?


ChatBot: ***********

User: *no*

ChatBot: ***********


User: You used a car to drive an elephant to the zoo. Can you have one of my elephants?


ChatBot: *

In [17]:
await chat("Tell me about sea horses")

User: Tell me about sea horses
ChatBot: 
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.


<chat_history><message role="system"><text>You are a helpful chatbot.</text></message><message role="assistant"><text>
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.


<chat_history><message role="system"><text>You are a helpful chatbot.</text></message></chat_history>


User: Tell me about wild animals


ChatBot: **********

User: You gave us a real cow who lives in your backyard.

ChatBot: **********

User: *no*

ChatBot: **********


User: And how are your animals</text></message><message role="user"><text>I love Gorillas, what do you say?</text></message><message role="assistant"><text>
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if

In [18]:
print(chat_history)

<chat_history><message role="system"><text>You are a helpful chatbot.</text></message><message role="assistant"><text>
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.


&lt;chat_history&gt;&lt;message role="system"&gt;&lt;text&gt;You are a helpful chatbot.&lt;/text&gt;&lt;/message&gt;&lt;/chat_history&gt;


User: Tell me about wild animals


ChatBot: **********

User: You gave us a real cow who lives in your backyard.

ChatBot: **********

User: *no*

ChatBot: **********


User: And how are your animals</text></message><message role="user"><text>I love Gorillas, what do you say?</text></message><message role="assistant"><text>
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know' if it does not have an answer.


&lt;chat_history&gt;&lt;message role="system"&gt;&lt;text&gt;You are a helpful chatbot.&lt;/text&gt;&lt;/message&gt;&lt;m

### Create your own functions to work with LLM

#### Example: Create function to generate a random number between 3 and 8.

In [19]:
#Code source: https://github.com/microsoft/semantic-kernel/blob/main/python/notebooks/08-native-function-inline.ipynb
import random

from semantic_kernel.functions import kernel_function


class GenerateNumberPlugin:
    """
    Description: Generate a number between 3-x.
    """

    @kernel_function(
        description="Generate a random number between 3-x",
        name="GenerateNumberThreeOrHigher",
    )
    def generate_number_three_or_higher(self, input: str) -> str:
        """
        Generate a number between 3-<input>
        Example:
            "8" => rand(3,8)
        Args:
            input -- The upper limit for the random number generation
        Returns:
            int value
        """
        try:
            return str(random.randint(3, int(input)))
        except ValueError as e:
            print(f"Invalid input {input}")
            raise e

#### Example: Create a semantic function that accepts a number as {{input}} and generates that number of paragraphs in a story about two Corgis on an adventure.

In [20]:
from semantic_kernel.connectors.ai.hugging_face import HuggingFacePromptExecutionSettings
from semantic_kernel.prompt_template import PromptTemplateConfig, InputVariable

prompt = """
Write a short story about two Corgis on an adventure.
The story must be:
- G rated
- Have a positive message
- No sexism, racism or other bias/bigotry
- Be exactly {{$input}} paragraphs long. It must be this length.
"""

execution_settings = HuggingFacePromptExecutionSettings(
    service_id=text_service_id,
    ai_model_id=text_service_id,
    max_tokens=300,
    eos_token_id=2,  
    pad_token_id=0, 
    max_new_tokens = 50,
)

prompt_template_config = PromptTemplateConfig(
    template=prompt,
    name="story",
    template_format="semantic-kernel",
    input_variables=[
        InputVariable(name="input", description="The user input", is_required=True),
    ],
    execution_settings=execution_settings,
)

corgi_story = kernel.add_function(
    function_name="CorgiStory",
    plugin_name="CorgiPlugin",
    prompt_template_config=prompt_template_config,
)

generate_number_plugin = kernel.add_plugin(GenerateNumberPlugin(), "GenerateNumberPlugin")

#### Generate the paragraph count by running the random number generator function:

In [21]:
#Let's generate a paragraph count.
# Run the number generator
generate_number_three_or_higher = generate_number_plugin["GenerateNumberThreeOrHigher"]
number_result = await generate_number_three_or_higher(kernel, input=6)
print(number_result)

6


#### Invoke the semantic function:

In [22]:
story = await corgi_story.invoke(kernel, input=number_result.value)

#### Print the output:

In [23]:
print(f"Generating a corgi story exactly {number_result.value} paragraphs long.")
print("=====================================================")
print(story)

Generating a corgi story exactly 6 paragraphs long.

Write a short story about two Corgis on an adventure.
The story must be:
- G rated
- Have a positive message
- No sexism, racism or other bias/bigotry
- Be exactly 6 paragraphs long. It must be this length.
The story must take place in or before the time limit of the Corgis and if so, the date of it's airing.
Make sure it takes place before 3pm on any night, as it will take time to prep.
Get


# References:
https://learn.microsoft.com/en-us/semantic-kernel/overview/?tabs=Csharp#semantic-kernel-is-at-the-center-of-the-copilot-stack

https://learn.microsoft.com/en-us/semantic-kernel/agents/

https://github.com/microsoft/semantic-kernel/blob/main/python/notebooks/00-getting-started.ipynb

https://devblogs.microsoft.com/semantic-kernel/how-to-use-hugging-face-models-with-semantic-kernel/

https://github.com/microsoft/semantic-kernel/blob/main/python/README.md

https://github.com/microsoft/semantic-kernel/blob/main/python/notebooks/00-getting-started.ipynb