# Tech Bytes 2: Let's build a chatbot!

Today, we're going to learn how to build a chatbot in Python. We won't have time to build a new UI, but we will write some functions that let you call OpenAI from Python :)
Let's recap a little of what we saw last time.

Last time we talked about requests and how they are the backbone of internet communication. ChatGPT is no different: we interact with it by sending a request to a specific endpoint. There are some differences, however. We need to send authentication as well as some parameters that influence the model's response.

But let's start by importing the libraries we need:

In [None]:
import requests
import math

In [None]:
BASE_URL = "https://oai-tech-bytes.openai.azure.com/"
URL = f"{BASE_URL}openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-08-01-preview"
AZURE_KEY = ""

In [None]:
payload = {
    "messages": [{
        "role": "user",
        "content": "Tell me a joke about mango",
    }],

}

headers = {
    "Content-Type": "application/json",
    "api-key": AZURE_KEY,
}

resp = requests.post(URL, headers=headers, json=payload)

resp.json()

As you can see, there's a lot of information in the response. The most important part is the `choices` key, which contains the response from the model.
It is a list of choices, and each choice holds a `message` with a `role` (here `assistant`) and `content` (the text of the message).
To extract the text of the first choice we do something like this:

In [None]:
reply = resp.json()["choices"][0]["message"]["content"]
print(reply)
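The indexing above assumes the request succeeded. Here is a minimal sketch of a more defensive version, assuming the response body has the shape described above. `extract_reply` is a hypothetical helper name, and the sample dictionary is hand-written for illustration, not real API output.

```python
def extract_reply(data):
    """Return the text of the first choice, or None if the shape is unexpected."""
    choices = data.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")


# Hand-written sample with the same shape as a chat completion response.
sample = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Hello!"}}
    ]
}
print(extract_reply(sample))

# An error body has no "choices" key, so we get None instead of a KeyError.
print(extract_reply({"error": {"message": "bad key"}}))
```

With the real response you would call `extract_reply(resp.json())` and check for `None` before printing.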

# Using OpenAI library
OpenAI has a library that lets us call the model in a more convenient, or `pythonic`, way. Here is how we can use it.

In [None]:
from openai import AzureOpenAI

In [None]:
client = AzureOpenAI(
    api_key=AZURE_KEY,
    api_version="2024-08-01-preview",
    azure_endpoint=BASE_URL,
)

In [None]:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": "Tell me a joke about mango",
    }]
)

In [None]:
reply = response.choices[0].message.content
print(reply)

# What else can you do?
The API not only accepts messages, but also other inputs such as temperature, max_tokens, etc... You can find the full list [here](https://platform.openai.com/docs/api-reference/chat/create). Let's play around with them.

## Temperature
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

In [None]:
prompt = "If the Mona Lisa were alive today, which social media platform would she use?"

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": prompt,
    }],
    temperature=0.9
)

reply = response.choices[0].message.content
print(reply)
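To build some intuition for what temperature does, here is a toy sketch that does not touch the API at all. It assumes the common formulation where the model's scores (logits) are divided by the temperature before a softmax. The scores are made up, and `softmax_with_temperature` is a hypothetical helper.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature > 0."""
    # The API accepts temperature=0, but this toy version would divide by zero.
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this toy version")
    scaled = [x / temperature for x in logits]
    biggest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - biggest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate next tokens.
toy_logits = [2.0, 1.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability lands on the top-scoring token, while a higher temperature spreads it out, which is why the output gets more varied.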

## n (choices)
How many completions to generate for each prompt.


In [None]:
prompt = "Tell me a joke about engineers"

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": prompt,
    }],
    n=3
)

# Each element of response.choices is one complete answer
for choice in response.choices:
    print("New choice")
    reply = choice.message.content
    print(reply)


## logprobs
Whether to return the log probabilities of the output tokens. When `logprobs` is true, the API returns the log probability of each token in the reply. `top_logprobs` (an integer between 0 and 20) additionally returns that many of the most likely tokens at each position, each with its own log probability.

In [None]:
prompt = "Answer only with yes or no. Do you have a conscience?"

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": prompt,
    }],
    logprobs=True,
    top_logprobs=3
)

reply = response.choices[0].message.content
print(reply)

Let's see the different probabilities for the first token of the model's response.
The API returns log probabilities, so we convert each one to a percentage by exponentiating it and multiplying by 100.

In [None]:
first_token = response.choices[0].logprobs.content[0]
for probabilities in first_token.top_logprobs:
    print(f"{probabilities.token}: {math.exp(probabilities.logprob)*100}%")
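The conversion is just `exp` applied to the log probability. Here is a small sketch with made-up numbers; `logprob_to_percent` is a hypothetical helper, and the values below are invented rather than real model output.

```python
import math


def logprob_to_percent(logprob):
    """Convert a natural-log probability into a percentage."""
    return math.exp(logprob) * 100


# Invented log probabilities for two candidate tokens.
toy_logprobs = {"Yes": math.log(0.25), "No": math.log(0.75)}

for token, logprob in toy_logprobs.items():
    print(f"{token}: {logprob_to_percent(logprob):.2f}%")
```

A log probability of 0 corresponds to 100%, and the more negative the value, the less likely the token.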

# Utilities

There are way too many moving parts in the code so far. Let's simplify it by creating a utility function that handles the API call for us. You can modify it to accept some of the parameters we saw before.

In [None]:
def query_chatgpt(prompt):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": prompt,
        }]
    )
    return response.choices[0].message.content

In [None]:
reply = query_chatgpt("Tell me a joke about mango")
print(reply)
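As a sketch of the modification suggested above, one option is to collect the optional parameters in a dictionary and only pass along the ones that were set. `build_request_kwargs` is a hypothetical helper; it only builds the arguments and does not call the API itself.

```python
def build_request_kwargs(prompt, model="gpt-4o-mini", temperature=None, n=None):
    """Build keyword arguments for client.chat.completions.create."""
    kwargs = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    # Only include optional parameters the caller actually set,
    # so the API defaults apply otherwise.
    if temperature is not None:
        kwargs["temperature"] = temperature
    if n is not None:
        kwargs["n"] = n
    return kwargs


kwargs = build_request_kwargs("Tell me a joke about mango", temperature=0.9)
print(kwargs)

# With the client defined earlier, this could be used as:
# response = client.chat.completions.create(**kwargs)
```

Keeping the argument building separate from the call makes it easy to inspect what would be sent before spending any tokens.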

# GPT memory
Let's do some experiments to see if GPT has memory. We will ask GPT to remember a fact and then ask a question about it.

In [None]:
reply = query_chatgpt("Remember: my favorite artist is Taylor Swift")
print(reply)

In [None]:
reply = query_chatgpt("Who is my favorite artist?")
print(reply)

What happened? Did it forget what we just talked about? No... It's simpler than that. Our little pal is just quite forgetful: it doesn't know anything about our previous conversations, because each request is handled on its own.

So how can we solve this? Essentially, we need to send our entire conversation history to the model every time we want to chat with it. This way, it can keep track of the context and give us more coherent responses.

There are smarter ways to do this, but for now let's keep it simple and send the entire conversation history to the model. This means that the longer the conversation, the more tokens we use and the more likely the model is to get stuck in weird loops.

In [None]:
GLOBAL_CONVERSATION = []

In [None]:
def query_chatgpt_with_memory(prompt):
    # We save the new message in our global conversation
    GLOBAL_CONVERSATION.append(
        {
            "role": "user",
            "content": prompt,
        }
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=GLOBAL_CONVERSATION
    )
    reply = response.choices[0].message.content
    
    # We save the new response in our global conversation as well!
    GLOBAL_CONVERSATION.append(
        {
            "role": "assistant",
            "content": reply,
        }
    )
    return reply

Let's try the previous examples again with the new function:

In [None]:
reply = query_chatgpt_with_memory("Remember: my favorite artist is Taylor Swift")
print(reply)

In [None]:
reply = query_chatgpt_with_memory("Who is my favorite artist?")
print(reply)

Let's check our memory:

In [None]:
for message in GLOBAL_CONVERSATION:
    print(f"{message['role']}: {message['content']}")

# Your turn!
How can you start a new conversation? Can you imagine a smarter way to keep track of the memory?
Does this explain some issues you have seen with ChatGPT? How would you fix them?

In [None]:
# Your code goes here
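If you would like a starting point, here is one possible sketch, and certainly not the only answer. It assumes that keeping only the most recent messages is acceptable for your use case. The `Conversation` class and its method names are hypothetical, and nothing here calls the API.

```python
class Conversation:
    """Holds one chat history; call reset() or create a new instance to start over."""

    def __init__(self, max_messages=6):
        if max_messages < 1:
            raise ValueError("max_messages must be at least 1")
        self.max_messages = max_messages
        self.messages = []

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})

    def window(self):
        """Return only the most recent messages to send to the model."""
        return self.messages[-self.max_messages:]

    def reset(self):
        self.messages = []


chat = Conversation(max_messages=2)
chat.add("user", "Remember: my favorite artist is Taylor Swift")
chat.add("assistant", "Got it!")
chat.add("user", "Who is my favorite artist?")
print(chat.window())
```

Notice the trade-off: with a window of two messages the fact about the favorite artist has already been dropped, so the model would forget it again. Passing `chat.window()` as `messages` saves tokens at the cost of older context.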

# All done!

And that's it :) Hope you learned more about chatbots and how to use them with Azure OpenAI! If you have any questions, feel free to ask in the comments below. Happy coding!