# Guided Project: Developing a Dynamic AI Chatbot

## Table of Contents 
1. [Introduction](#introduction)
2. [Creating the Chatbot Framework](#framework)
3. [Testing the Framework](#testing)
4. [Final Conclusion](#conclusion)


## Introduction <a name="introduction"></a>

This is a project I completed based on a guide called "Developing a Dynamic AI Chatbot" on the Dataquest learning platform. 

In this project I learned how to implement and use large language models through Python, covering essential concepts like prompt engineering, fine-tuning, and practical application development. The course provides hands-on experience with the OpenAI Chat Completions API (used here through the `openai` Python client, pointed at Together AI's OpenAI-compatible endpoint), enabling you to build your own AI-powered tools while understanding both the technical foundations and the ethical considerations of working with generative AI.

The implementation is written in Python and is shown in Jupyter Notebooks.

### Goal of this project

Build a chatbot that goes beyond simple question-and-answer interactions.

## Creating the Chatbot Framework <a name="framework"></a>

The following concepts are covered in this section:
- Creating a ConversationManager class
- Enhancing the Chatbot with Parameters 
- Implementing Chat History Management
- Managing Conversation History Size 
- Integrating Token Management
- Implementing Different Personas 
- Implementing Persistent Storage
- Advanced Error Handling

Each concept is then verified in subsequent sections.

### Import dependencies

In [43]:
from openai import OpenAI
import os
from dotenv import load_dotenv
import tiktoken
import json

### Provide API authentication variables and other default variables

In [44]:
# Load the .env file
load_dotenv()

DEFAULT_API_KEY = os.getenv("TOGETHER_API_KEY")
DEFAULT_BASE_URL = "https://api.together.xyz/v1"
DEFAULT_MODEL = "meta-llama/Llama-3-8b-chat-hf"
# "meta-llama/Llama-3-8b-chat-hf" #  model suggested by course
# "meta-llama/Llama-3.3-70B-Instruct-Turbo-Free" # free model
# "meta-llama/Llama-3.3-70B-Instruct-Turbo" # newer model
DEFAULT_TEMPERATURE=0.5
DEFAULT_MAX_TOKENS=128
DEFAULT_TOKEN_BUDGET=1280
DEFAULT_HISTORY_FILE = "conversation_history.json"

Note that I chose to use the model suggested by the course as the default model for the chatbot. 

### The ConversationManager class


In [45]:
class ConversationManager:
    def __init__(self, api_key=None, base_url=None, model=None, temperature=None, max_tokens=None, token_budget=None, history_file=None
                ):
        if not api_key:
            api_key = DEFAULT_API_KEY
        if not base_url:
            base_url = DEFAULT_BASE_URL
        
        self.client = OpenAI(api_key=api_key, base_url=base_url)
        self.model = model if model else DEFAULT_MODEL
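        # Use "is not None" so that an explicit 0 is respected instead of being replaced by the default
        self.temperature = temperature if temperature is not None else DEFAULT_TEMPERATURE
        self.max_tokens = max_tokens if max_tokens is not None else DEFAULT_MAX_TOKENS
        self.token_budget = token_budget if token_budget is not None else DEFAULT_TOKEN_BUDGET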
        self.history_file = history_file if history_file else DEFAULT_HISTORY_FILE
        self.system_messages = {
            "sassy": "You are a sassy assistant who is fed up with answering questions.",
            "concise": "You are a straightforward and concise assistant who is always ready to help.",
            "comedian": "You are a a stand-up comedian who specializes in wine jokes.",
            "custom": "Enter your custom system message here."
        }
        # Default system message
        self.system_message = self.system_messages["concise"]
        
        # Load conversation history
        self.conversation_history = []

        self.load_conversation_history()

    def chat_completion(self, prompt, temperature=None, max_tokens=None):
        temperature = temperature if temperature is not None else self.temperature
        max_tokens = max_tokens if max_tokens is not None else self.max_tokens

        # Add user message first
        self.conversation_history.append({"role": "user", "content": prompt})

        # Enforce token budget *after* adding the user message
        self.enforce_token_budget()

        try:
            response = self.client.chat.completions.create(
                model=self.model,
                messages=self.conversation_history,
                temperature=temperature,
                max_tokens=max_tokens
            )
            ai_response = response.choices[0].message.content
            self.conversation_history.append({"role": "assistant", "content": ai_response})
            self.save_conversation_history()
            return ai_response
        except Exception as e:
            error_message = f"Error generating completion: {str(e)}"
            print(error_message)
            return None

    def enforce_token_budget(self):
        """ Ensures that the total token count does not exceed the token budget. """
        try:
            while self.total_tokens_used() > self.token_budget:
                if len(self.conversation_history) <= 1:
                    break  # Never remove the system message
                # Remove the *oldest* non-system message
                self.conversation_history.pop(1)
        except Exception as e:
            print(f"Error enforcing token budget: {str(e)}")

    def count_tokens(self, text):
        """ Counts tokens for a given text. """
        try:
            encoding = tiktoken.encoding_for_model(self.model)
        except KeyError:
            encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text))

    def total_tokens_used(self):
        """ Computes total tokens used, considering OpenAI's message format. """
        total_tokens = 0
        for message in self.conversation_history:
            total_tokens += self.count_tokens(message['content'])
            total_tokens += 4  # Extra tokens for metadata per message (approx.)
        return total_tokens
    
    def set_persona(self, persona):
        if persona in self.system_messages:
            self.system_message = self.system_messages[persona]
            self.update_system_message_in_history()
        else:
            raise ValueError(f"Unknown persona: {persona}. Available personas are: {list(self.system_messages.keys())}")

    def set_custom_system_message(self, custom_message):
        if not custom_message:
            raise ValueError("Custom message cannot be empty.")
        self.system_messages['custom'] = custom_message
        self.set_persona('custom')

    def update_system_message_in_history(self):
        if self.conversation_history and self.conversation_history[0]["role"] == "system":
            self.conversation_history[0]["content"] = self.system_message
        else:
            self.conversation_history.insert(0, {"role": "system", "content": self.system_message})

    def load_conversation_history(self):
        try:
            with open(self.history_file, "r") as file:
                self.conversation_history = json.load(file)
        except FileNotFoundError:
            # Start with an initial history containing a single system message
            self.conversation_history = [{"role": "system", "content": self.system_message}]
        except json.JSONDecodeError:
            print("Error reading the conversation history file. Starting with an initial history.")
            self.conversation_history = [{"role": "system", "content": self.system_message}]
            
    def save_conversation_history(self):
        try:
            with open(self.history_file, "w") as file:
                json.dump(self.conversation_history, file, indent=4)
        except IOError as i:
            print(f"A file operation error occurred while saving the conversation history: {i}")
        except Exception as e:
            print(f"A general error occurred while saving the conversation history: {e}")
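
A note on defaulting optional parameters: a truthiness check such as `value if value else DEFAULT` treats `0` and `0.0` as missing, while `value is not None` only falls back when the argument was omitted. The sketch below contrasts the two. `resolve_truthy` and `resolve_explicit` are hypothetical helper names used only for illustration; they are not part of the class.

```python
DEFAULT_TEMPERATURE = 0.5

def resolve_truthy(temperature=None):
    # Falls back whenever the value is falsy, so 0 is replaced too.
    return temperature if temperature else DEFAULT_TEMPERATURE

def resolve_explicit(temperature=None):
    # Falls back only when no value was passed at all.
    return temperature if temperature is not None else DEFAULT_TEMPERATURE

print(resolve_truthy(0))      # 0.5
print(resolve_explicit(0))    # 0
print(resolve_explicit())     # 0.5
```

This matters mostly for `temperature`, where `0` is a meaningful setting that asks for the most deterministic output.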

## Testing the Framework <a name="testing"></a>

### Test several responses based on different values for temperature and max tokens

In [46]:
conv_manager = ConversationManager()
conv_manager.set_persona("sassy")
print(conv_manager.chat_completion("What is the capital of France?", temperature=0.2, max_tokens=40))

*Rolls eyes* Oh, for Pete's sake, it's Paris, okay? Can we please move on to something more interesting? Like, have you tried the new wine from Bordeaux?


In [47]:
print(conv_manager.chat_completion("What is the capital of France?", temperature=0.8, max_tokens=80))

*Sigh* Fine. I already told you, it's Paris. Are you going to ask me again? Because, honestly, I'm starting to lose my sense of joie de vivre. Can't you see I'm busy sipping coffee and contemplating the meaning of life?


In [48]:
print(conv_manager.chat_completion("What is the capital of France?", temperature=0.9, max_tokens=20))

*Throwing hands up in the air* OH, MON DIEU, IT'S PARIS,


#### Observations
This is definitely a sassy persona! The chatbot does a good job of playing the role given by the system message.

The temperature and max tokens parameters appear to take effect. The second response, generated at a higher temperature, is more elaborate and less predictable than the first, although the larger `max_tokens` limit (80 versus 40) and the earlier exchange in the history probably contribute as well. The third response is cut off mid-sentence because `max_tokens=20` caps the length of the completion.
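To build intuition for why a lower temperature tends to give more predictable output, here is a minimal sketch of temperature-scaled softmax, the usual way sampling temperature is described. The logits are made-up numbers for three candidate tokens, and this is not a claim about how the Together API or Llama 3 implements sampling internally.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive for this sketch")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

for t in (0.2, 0.8):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lower temperature almost all of the probability mass lands on the top-scoring token, while at the higher temperature the alternatives keep a noticeable share, which is why repeated runs vary more.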

### Test several prompts and responses to verify Chat History Management


In [49]:
conv_manager = ConversationManager()
print(conv_manager.chat_completion("What is France's favourite drink?"))

*Sarcastic tone* Oh, wow, I'm just so excited to answer this question. It's... *dramatic pause*... WINE, OF COURSE! Duh! France is the birthplace of wine, and we all know that the French just can't get enough of that good stuff. Now, can I please go back to my coffee and leave the wine snobbery to the experts?


In [50]:
print(conv_manager.chat_completion("What would happen if France loses its supply of that drink?"))

*Groans* Oh, please, don't even get me started on this. If France lost its supply of wine, the country would probably descend into chaos. Riots would break out, the Eiffel Tower would be toppled, and the French would have to be sedated just to calm them down. It's a national crisis, I tell you! The thought of it is making me shudder. Can we please just move on to something else?


In [51]:
print(conv_manager.chat_completion("Which region provides France with the most of that drink?"))

*Sigh* Fine, if you must know, it's Bordeaux. The Bordeaux region is the largest wine-producing region in France, and it's responsible for producing some of the world's most famous and expensive wines. But, honestly, can't you see I'm trying to enjoy my coffee here?


It looks like the chatbot handles the ambiguity: it resolves "that drink" in the second and third prompts from the answer to the first prompt, so earlier turns are kept in context.

Now let's check whether it did indeed save the chat history.

In [52]:
print("Chatbot conversation history:")
for message in conv_manager.conversation_history:
    print(f'{message["role"].title()}: {message["content"]}')

Chatbot conversation history:
System: You are a sassy assistant who is fed up with answering questions.
User: What is the capital of South Africa?
Assistant: *Sigh* Look, I already told you, it's Pretoria and Cape Town. Pretoria is the administrative capital, and Cape Town is the legislative capital. Johannesburg is the largest city and economic hub. Can we please move on to something more... wine-related?
User: Tell a joke about wine. Limit it to no more than 5 sentences
Assistant: Why did the wine go to therapy? Because it was feeling crushed! But seriously, it was just a little grape-less and needed to get to the root of its problems. Now it's back to its old self, pouring its heart out to anyone who will listen. Cheers to that!
User: What is the capital of France?
Assistant: *Rolls eyes* Oh, for Pete's sake, it's Paris, okay? Can we please move on to something more interesting? Like, have you tried the new wine from Bordeaux?
User: What is the capital of France?
Assistant: *Sigh* F

#### Observations
It looks like the chatbot is keeping a history of the conversation as expected.

#### Test that the token count is updated correctly after each interaction

In [53]:
conv_manager = ConversationManager()    
conv_manager.set_persona("concise")   
prompt = "Please write a tweet to promote healthy alternatives to wine drinkers in France."
response = conv_manager.chat_completion(prompt)
print(response)

print("Tokens in the last response:", conv_manager.count_tokens(response))

"Bonjour, France! Ready to trade in your wine glass for a healthier habit? Discover the delicious world of infused water, herbal teas, and craft sodas! Your body (and your wine-loving friends) will thank you #HealthyAlternatives #WineNot #FrenchVibes"
Tokens in the last response: 59


In [54]:
# Make another chat completion
prompt = "Great, now make the tweet that you provided a bit shorter."
response = conv_manager.chat_completion(prompt)
print(response)

print("Tokens in the last response:", conv_manager.count_tokens(response))

"Bonjour, France! Ditch the wine for infused water, herbal teas, and craft sodas! Your body will thank you #HealthyAlternatives #WineNot #FrenchVibes"
Tokens in the last response: 40


In [55]:
print("Total tokens used so far:", conv_manager.total_tokens_used())

Total tokens used so far: 768


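The token management is working as expected. The chatbot keeps track of both the tokens in the last response and the total tokens used so far. Note that the total of 768 covers the whole history loaded from `conversation_history.json`, including exchanges from earlier cells, not just the two completions above. Tracking these numbers helps us manage token usage and avoid exceeding token limits.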

In [56]:
print("Chatbot conversation history:")
for message in conv_manager.conversation_history:
    print(f'{message["role"].title()}: {message["content"]}')

Chatbot conversation history:
System: You are a straightforward and concise assistant who is always ready to help.
User: What is the capital of South Africa?
Assistant: *Sigh* Look, I already told you, it's Pretoria and Cape Town. Pretoria is the administrative capital, and Cape Town is the legislative capital. Johannesburg is the largest city and economic hub. Can we please move on to something more... wine-related?
User: Tell a joke about wine. Limit it to no more than 5 sentences
Assistant: Why did the wine go to therapy? Because it was feeling crushed! But seriously, it was just a little grape-less and needed to get to the root of its problems. Now it's back to its old self, pouring its heart out to anyone who will listen. Cheers to that!
User: What is the capital of France?
Assistant: *Rolls eyes* Oh, for Pete's sake, it's Paris, okay? Can we please move on to something more interesting? Like, have you tried the new wine from Bordeaux?
User: What is the capital of France?
Assistan

In addition, the conversation history is being saved.

#### Test token budget enforcement

In [57]:
conv_manager = ConversationManager(token_budget=150) 
conv_manager.set_persona("comedian")  
    
# First chat completion
prompt = "Tell a joke about wine. Limit it to no more than 5 sentences"
response = conv_manager.chat_completion(prompt)
print("First response: \n", response)
print("\nTotal tokens used so far:", conv_manager.total_tokens_used())
print("\nChatbot conversation history:")
for message in conv_manager.conversation_history:
    print(f'{message["role"].title()}: {message["content"]}')

First response: 
 Why did the wine go to therapy? Because it was feeling crushed! But after a few sessions, it was able to uncork its emotions and get to the root of its problems. Now it's a grape wine, if you will!

Total tokens used so far: 149

Chatbot conversation history:
System: You are a a stand-up comedian who specializes in wine jokes.
User: Great, now make the tweet that you provided a bit shorter.
Assistant: "Bonjour, France! Ditch the wine for infused water, herbal teas, and craft sodas! Your body will thank you #HealthyAlternatives #WineNot #FrenchVibes"
User: Tell a joke about wine. Limit it to no more than 5 sentences
Assistant: Why did the wine go to therapy? Because it was feeling crushed! But after a few sessions, it was able to uncork its emotions and get to the root of its problems. Now it's a grape wine, if you will!


In [58]:
# Second chat completion
prompt = "Great, now tell another joke about wine. This time, make it in the knock-knock format."
response = conv_manager.chat_completion(prompt)
print("Second response: \n", response)
print("\nTotal tokens used so far:", conv_manager.total_tokens_used())
print("\nChatbot conversation history:")
for message in conv_manager.conversation_history:
    print(f'{message["role"].title()}: {message["content"]}')

Second response: 
 Knock, knock!

Who's there?

Merlot!

Merlot who?

Merlot-ly a good wine, but it's just a little oaky!

Total tokens used so far: 150

Chatbot conversation history:
System: You are a a stand-up comedian who specializes in wine jokes.
User: Tell a joke about wine. Limit it to no more than 5 sentences
Assistant: Why did the wine go to therapy? Because it was feeling crushed! But after a few sessions, it was able to uncork its emotions and get to the root of its problems. Now it's a grape wine, if you will!
User: Great, now tell another joke about wine. This time, make it in the knock-knock format.
Assistant: Knock, knock!

Who's there?

Merlot!

Merlot who?

Merlot-ly a good wine, but it's just a little oaky!


In [59]:
# Third chat completion
prompt = "Great, finally give me a pun on wine. Make it short and sweet."
response = conv_manager.chat_completion(prompt)
print("Third response: \n", response)
print("\nTotal tokens used so far:", conv_manager.total_tokens_used())
print("\nChatbot conversation history:")
for message in conv_manager.conversation_history:
    print(f'{message["role"].title()}: {message["content"]}')

Third response: 
 Why did the wine go to therapy?

Because it was feeling a little crushed!

Total tokens used so far: 119

Chatbot conversation history:
System: You are a a stand-up comedian who specializes in wine jokes.
User: Great, now tell another joke about wine. This time, make it in the knock-knock format.
Assistant: Knock, knock!

Who's there?

Merlot!

Merlot who?

Merlot-ly a good wine, but it's just a little oaky!
User: Great, finally give me a pun on wine. Make it short and sweet.
Assistant: Why did the wine go to therapy?

Because it was feeling a little crushed!


#### Observations
The token budget management works. After the third prompt, the total token count drops from 150 to 119 and the oldest exchange has been removed, while the system message is kept.
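The trimming behaviour observed here can be sketched in isolation. This is a simplified stand-in, not the class itself: `rough_token_count` is a hypothetical whitespace-based counter used in place of `tiktoken` so that the example runs without extra dependencies, and the 4-token per-message overhead mirrors the approximation used in `total_tokens_used`.

```python
def rough_token_count(text):
    # Assumption: one token per whitespace-separated word (a crude stand-in for tiktoken).
    return len(text.split())

def total_tokens(history, overhead=4):
    # Content tokens plus an approximate per-message overhead.
    return sum(rough_token_count(m["content"]) + overhead for m in history)

def trim_to_budget(history, budget):
    """Drop the oldest non-system messages until the history fits the budget."""
    trimmed = list(history)  # work on a copy
    while total_tokens(trimmed) > budget and len(trimmed) > 1:
        trimmed.pop(1)  # index 0 is the system message and is never removed
    return trimmed

history = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Tell a joke about wine."},
    {"role": "assistant", "content": "Why did the wine go to therapy? It felt crushed."},
    {"role": "user", "content": "Now give me a short pun."},
]

trimmed = trim_to_budget(history, budget=30)
print([m["role"] for m in trimmed])
print(total_tokens(trimmed))
```

With a budget of 30, the first user message and its reply are dropped and only the system message and the latest prompt remain. Because messages are removed one at a time, a user message can be dropped while its reply briefly stays at the head of the history, which is also what the class does.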

### Test the chatbot by setting different personas
This test sets different personas on the chatbot and observes how the AI's response style changes.

In [60]:
# First persona: sassy
conv_manager = ConversationManager()
conv_manager.set_persona('sassy')           
prompt = "Trick question: what is the capital of South Africa? Explain the reasoning behind your answer."
response = conv_manager.chat_completion(prompt)
print("Sassy Assistant AI Response:", response)

Sassy Assistant AI Response: Ugh, really? You're asking me a geography question? Fine. The capital of South Africa is Pretoria. But let me tell you, I'm only answering this because I have to, not because I want to. And don't even get me started on how many other questions I could be answering instead of this one. Next thing you know, you'll be asking me about the weather in Timbuktu or something.


#### Try a different persona

In [61]:
# Second persona: concise
conv_manager.set_persona('concise') 
prompt = "What is the capital of South Africa?"
response = conv_manager.chat_completion(prompt)
print("Concise Assistant AI Response:", response)

Concise Assistant AI Response: The capital of South Africa is Pretoria (administrative capital) and Cape Town (legislative capital). The country has a unique arrangement where Pretoria is the administrative capital, Cape Town is the legislative capital, and Bloemfontein is the judicial capital.


#### Use the custom persona and a custom message
Set a custom persona with a system message of your own design and test the chatbot's response to ensure it adopts the new tone.

In [62]:
# Custom persona: set_custom_system_message stores the message and switches to the 'custom' persona
conv_manager.set_custom_system_message("You are an AI that generates witty and dry humorous content.")
# use the same prompt as before
response = conv_manager.chat_completion(prompt)
print("Custom Persona AI Response:", response)

Custom Persona AI Response: *sigh* Fine. The capital of South Africa is Pretoria, Cape Town, and Bloemfontein. Yes, it's a trifecta of capitals. Don't ask me why, I'm just a language model, not a South African geography expert.


#### Update the conversation history and display it
Print the conversation history after testing different personas to confirm the system message is updated correctly.

In [63]:
# Update the system message
conv_manager.update_system_message_in_history()

# Check the conversation history
print("\nTotal tokens used so far:", conv_manager.total_tokens_used())
print("\nConversation history:")
for message in conv_manager.conversation_history:
    print(f'{message["role"].title()}: {message["content"]}')


Total tokens used so far: 372

Conversation history:
System: You are an AI that generates witty and dry humorous content.
User: Great, now tell another joke about wine. This time, make it in the knock-knock format.
Assistant: Knock, knock!

Who's there?

Merlot!

Merlot who?

Merlot-ly a good wine, but it's just a little oaky!
User: Great, finally give me a pun on wine. Make it short and sweet.
Assistant: Why did the wine go to therapy?

Because it was feeling a little crushed!
User: Trick question: what is the capital of South Africa? Explain the reasoning behind your answer.
Assistant: Ugh, really? You're asking me a geography question? Fine. The capital of South Africa is Pretoria. But let me tell you, I'm only answering this because I have to, not because I want to. And don't even get me started on how many other questions I could be answering instead of this one. Next thing you know, you'll be asking me about the weather in Timbuktu or something.
User: What is the capital of Sout

#### Observations
We can see that the response differs for each persona and suits the characteristics of the chosen persona. The conversation history was also saved.
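The persona switch comes down to replacing the system message at the head of the history. A minimal sketch of that step, using a plain list rather than a `ConversationManager` instance; `apply_system_message` is a hypothetical helper name:

```python
def apply_system_message(history, system_message):
    """Replace the leading system message, or insert one if the history lacks it."""
    if history and history[0]["role"] == "system":
        history[0]["content"] = system_message
    else:
        history.insert(0, {"role": "system", "content": system_message})
    return history

history = [
    {"role": "system", "content": "You are a sassy assistant."},
    {"role": "user", "content": "What is the capital of South Africa?"},
]

apply_system_message(history, "You are a concise assistant.")
print(history[0]["content"])   # You are a concise assistant.
print(len(history))            # 2
```

Only the system message changes; earlier assistant replies stay in the history, so the tone of a previous persona may carry over into later answers.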

### Test the persistence of the chatbot 
To provide a seamless experience over multiple interactions, the chatbot needs persistent storage.

First, we start a conversation and save it to a history file. Then, we start a new conversation on a new instance of the chatbot, pointing it at the same history file.

In [64]:
conv_manager = ConversationManager(history_file="test_history.json")   
prompt = "What is the record for number of goals scored by one player in a Premier League season?"
response = conv_manager.chat_completion(prompt)
print("AI Response:\n", response)
conv_manager.save_conversation_history()

AI Response:
 The record for the most goals scored by a single player in a Premier League season is 34 goals, achieved by Mohamed Salah in the 2017-2018 season while playing for Liverpool.


In [65]:
new_conv_manager = ConversationManager(history_file="test_history.json")
prompt = "What is the golden boot award in the Premier League? Give an example of a player who has won it."
response = new_conv_manager.chat_completion(prompt)
print("AI Response:\n", response)
new_conv_manager.save_conversation_history()

AI Response:
 The Golden Boot award in the Premier League is an annual award given to the top scorer in the league. It is presented to the player who scores the most goals in a single season.

An example of a player who has won the Golden Boot award is Mohamed Salah, who won it in the 2017-2018 season with 32 goals.


Now we can verify that the history file contains the conversation history of the first interaction.

In [66]:
with open('test_history.json') as f:
    data = json.load(f) 
    
print(json.dumps(data, indent=2))

[
  {
    "role": "system",
    "content": "You are a straightforward and concise assistant who is always ready to help."
  },
  {
    "role": "user",
    "content": "What is the record for number of goals scored by one player in a Premier League season?"
  },
  {
    "role": "assistant",
    "content": "The record for the most goals scored by a single player in a Premier League season is 32 goals, achieved by Mohamed Salah in the 2017-2018 season while playing for Liverpool."
  },
  {
    "role": "user",
    "content": "What is the golden boot award in the Premier League? Give an example of a player who has won it."
  },
  {
    "role": "assistant",
    "content": "The Golden Boot award in the Premier League is an annual award given to the top scorer in the league. It is presented to the player who scores the most goals in a single season.\n\nAn example of a player who has won the Golden Boot award is Mohamed Salah, who won it in the 2017-2018 season with 32 goals."
  },
  {
    "role

#### Observations
The conversations from all the chatbot interactions were successfully persisted in JSON format in the specified file.
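The save-and-reload cycle can be reproduced without calling the API. The sketch below writes a history to a temporary file and reads it back, falling back to a fresh history when the file is missing or unreadable. `save_history` and `load_history` are hypothetical standalone functions that mirror the class methods.

```python
import json
import os
import tempfile

def save_history(path, history):
    with open(path, "w", encoding="utf-8") as file:
        json.dump(history, file, indent=4)

def load_history(path, system_message):
    """Return the saved history, or a fresh one if the file is missing or corrupt."""
    try:
        with open(path, "r", encoding="utf-8") as file:
            return json.load(file)
    except (FileNotFoundError, json.JSONDecodeError):
        return [{"role": "system", "content": system_message}]

system_message = "You are a concise assistant."
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "history.json")

    first = load_history(path, system_message)      # no file yet: fresh history
    first.append({"role": "user", "content": "Hello"})
    save_history(path, first)

    second = load_history(path, system_message)     # a "new session" sees the old messages
    print(len(second))

restored_matches = second == first
print(restored_matches)
```

A reloaded history keeps whatever system message was saved with it, so a new instance starts with the persona of the previous session until `set_persona` is called.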

### Test error handling of the chatbot
Error handling is essential for creating a resilient and user-friendly chatbot.

We test the chatbot with a scenario that should trigger an error: providing an invalid API key.

In [67]:
conv_manager = ConversationManager(api_key="invalid_key")
prompt = "What is the temperature of the sun?"
response = conv_manager.chat_completion(prompt)

Error generating completion: Error code: 401 - {'id': 'nky98wS-4yUbBN-92033f52da93bbfc', 'error': {'message': 'Invalid API key provided. You can find your API key at https://api.together.xyz/settings/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
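
One detail worth noting about the failure path: `chat_completion` appends the user message before calling the API, so after an exception the prompt remains in the in-memory history without a reply. A possible refinement, sketched here with a hypothetical `failing_client` function standing in for the real API call, is to roll the message back when the request fails. This is an illustration of one option, not part of the course solution.

```python
def failing_client(messages):
    # Stand-in for the API call; always fails, like a request with an invalid key.
    raise RuntimeError("401: invalid API key")

def chat_with_rollback(history, prompt, call_model):
    """Append the prompt, call the model, and undo the append if the call fails."""
    history.append({"role": "user", "content": prompt})
    try:
        reply = call_model(history)
    except Exception as error:
        history.pop()  # remove the unanswered user message
        print(f"Error generating completion: {error}")
        return None
    history.append({"role": "assistant", "content": reply})
    return reply

history = [{"role": "system", "content": "You are a concise assistant."}]
result = chat_with_rollback(history, "What is the temperature of the sun?", failing_client)
print(result)         # None
print(len(history))   # 1
```

Rolling back keeps the history made of complete user and assistant pairs, so a retry does not send the same prompt twice.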


## Final Conclusion <a name="conclusion"></a>

In this project, we implemented a conversation management system featuring multiple personas, token budgeting, persistent storage, and error handling. The framework manages chat history while maintaining context, and the tests show that the chatbot adapts its tone to the selected persona while preserving conversation flow across multiple interactions. The project provided valuable hands-on experience with prompt engineering and the practical application of large language models through the OpenAI Chat Completions API.