<a href="https://www.kaggle.com/code/mubasherbajwa/conversational-chatbot-using-mistral?scriptVersionId=201689882" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# 1.0 Introduction
In this notebook, I will build a chatbot that can maintain the conversation history over multiple interactions and in different sessions.

- Key steps involved in building the conversational chatbot are:
    1. Define a function to create memory store
    2. Define a function to take user inputs
    3. Define a function to initialize random session
    4. Define a function to get chat history
    5. Define a function to generate response from LLM
    6. Take input and generate response
    7. Generate multiple responses
    8. Retrieve history of multiple generated responses

# 2.0 Getting the HuggingFace Token

In [1]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
secret_value_0 = user_secrets.get_secret("hf_key")
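
The `kaggle_secrets` module is only available inside a Kaggle kernel. As a minimal sketch for running the notebook elsewhere, the helper below falls back to an environment variable. The function name `get_hf_token` and the choice of `HF_TOKEN` as the variable name are assumptions made for illustration, not part of the original notebook.

```python
import os


def get_hf_token(secret_name="hf_key", env_var="HF_TOKEN"):
    """Return the Hugging Face token from Kaggle secrets, else from the environment."""
    try:
        # Only importable inside a Kaggle kernel
        from kaggle_secrets import UserSecretsClient
    except ImportError:
        # Assumed fallback for local runs: read an environment variable instead
        return os.environ.get(env_var)
    return UserSecretsClient().get_secret(secret_name)
```

Outside Kaggle, the function returns `None` when the environment variable is not set, so the caller can decide how to handle a missing token.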

# 3.0 Downloading Mistral and Tokenizer
- Pass the token in the `token` argument of both the `AutoModelForCausalLM` and `AutoTokenizer`.

## 3.1 Mistral-7B-Instruct-v0.3
The Mistral-7B-Instruct-v0.3 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.3 generative text model. This is the version loaded in the code below.

In [2]:
## Importing necessary libraries
from transformers import pipeline ## For sequential text generation
from transformers import AutoModelForCausalLM, AutoTokenizer # For loading the model and tokenizer from the Hugging Face repository
import warnings
warnings.filterwarnings("ignore") ## To remove warning messages from output


## Providing the huggingface model repository name for mistral 7B
model_name = "mistralai/Mistral-7B-Instruct-v0.3"

## Downloading the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(model_name, token = secret_value_0,  device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(model_name, token = secret_value_0)


# 4.0 In-memory store for storing conversation history

Now we will create an in-memory store that holds the conversation history for each session.
- `conv_store` is the main dictionary; it maps each session id to that session's list of messages
- The same structure could also be backed by an external database for persistent storage

In [3]:
conv_store = {}

def get_session_specific_chat(session_id) : 
    if session_id not in conv_store: 
        conv_store[session_id]  = [{"role": "system", "content": "You are an AI assistant."}] ## Every new session starts with the system message, as expected by the chat template
    return conv_store[session_id]
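
To illustrate the point about persistent storage, here is a minimal sketch that saves and loads a session's message list with SQLite from the standard library. The table name `conversations` and the helpers `open_store`, `save_session` and `load_session` are hypothetical names chosen for this example; the notebook itself keeps everything in the `conv_store` dictionary.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a SQLite database with one row per session."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS conversations ("
        "session_id TEXT PRIMARY KEY, messages TEXT NOT NULL)"
    )
    return conn


def save_session(conn, session_id, messages):
    """Serialize the message list to JSON and insert or replace the session row."""
    conn.execute(
        "INSERT OR REPLACE INTO conversations (session_id, messages) VALUES (?, ?)",
        (session_id, json.dumps(messages)),
    )
    conn.commit()


def load_session(conn, session_id):
    """Return the stored message list, or None if the session is unknown."""
    row = conn.execute(
        "SELECT messages FROM conversations WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# Usage with an in-memory database; pass a file path to keep data across runs
conn = open_store()
save_session(conn, "demo", [{"role": "system", "content": "You are an AI assistant."}])
print(load_session(conn, "demo"))
```

Storing the whole list as one JSON string keeps the sketch short. A table with one row per message would suit longer conversations better.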

# 5.0 User inputs to the in-memory store
- Define a function that appends the user's input to the session history in the message format that the Mistral tokenizer's chat template expects

In [4]:
## Query (the prompt) to be passed to the model

def prompt_template(user_input , history ):
    chats = history
    chats.append({"role": "user", "content": f"{user_input}"})
    return chats
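
Note that `prompt_template` appends to the list it receives rather than building a new one. The short sketch below repeats the function so that it runs on its own and shows the difference between passing a copy and passing the stored list itself. The variable `stored` is a stand-in for one entry of `conv_store`.

```python
def prompt_template(user_input, history):
    # Same logic as the function above: appends to the list it receives
    chats = history
    chats.append({"role": "user", "content": f"{user_input}"})
    return chats


stored = [{"role": "system", "content": "You are an AI assistant."}]

# Passing a copy leaves the stored list untouched
messages = prompt_template("Hello", history=stored.copy())
print(len(stored), len(messages))  # prints: 1 2

# Passing the list itself modifies it in place
messages = prompt_template("Hello", history=stored)
print(len(stored), messages is stored)  # prints: 2 True
```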


## 5.1 Generating random session ids if chat history is cleared out
- After clearing the chat history (when the user asks for it), a new session needs to be created. For that we generate a new unique session_id using the `uuid` library

In [5]:
# Import the UUID module for generating unique identifiers
import uuid

# Generate a random UUID
x = uuid.uuid1()

# Convert the UUID to a string
str(x)

# Set a default session ID (can be replaced with a more suitable value)
session_id_1 = 'a19ea3ce1c1ef-92af-0242ac130202'

# Create a list to store previous session IDs
session_ids = []

def clear_chat_history():
    "To assign another session id to the user"
    global session_id_1
    # Store the current session ID in the list of previous session IDs
    session_ids.append(session_id_1)

    # Generate a new random UUID for the new session ID
    session_id_1 = str(uuid.uuid1())

In [6]:
clear_chat_history()
session_id_1

'7e64d064-8c83-11ef-912a-0242ac130202'
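
One detail worth knowing: `uuid.uuid1()` derives the id from the host identifier and the current time, while `uuid.uuid4()` is generated from random numbers. The sketch below is one possible alternative that uses `uuid4` and keeps the session state in a small class instead of a global variable. The class name `SessionManager` and its attributes are hypothetical and not part of the notebook.

```python
import uuid


class SessionManager:
    """Tracks the active session id and the ids of earlier sessions."""

    def __init__(self):
        self.previous_ids = []
        self.current_id = str(uuid.uuid4())  # uuid4 is randomly generated

    def clear_chat_history(self):
        """Archive the active id and start a fresh session."""
        self.previous_ids.append(self.current_id)
        self.current_id = str(uuid.uuid4())
        return self.current_id


manager = SessionManager()
first_id = manager.current_id
second_id = manager.clear_chat_history()
print(first_id != second_id, manager.previous_ids == [first_id])
```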

# 6.0 Chat History
Defining a function to get chat history

In [7]:
def get_chat_history(session_id):
    """Fetch all the messages of a particular session."""
    try:
        return conv_store[session_id]
    except KeyError:  ## Raised when no conversation is stored under this session id
        print('Session does not exist')

# 7.0 Generate Response from LLM

Now we will set up the Large Language Model for inference to get the response.
- The model must receive the initial message with the system prompt attached to it
- Afterwards, the messages should follow the user-assistant-user-assistant structure expected by the Mistral chat template
- The user's input and the model's response are stored sequentially for as long as the session continues

In [8]:


def print_llm_response(user_input_1, session_id) : 
    """Expects the user's input and the particular session id"""
    chat_history = get_session_specific_chat(session_id).copy() ## Copy the list so that appending the new user message does not modify conv_store directly
    if chat_history == [{"role": "system", "content": "You are an AI assistant."}]  : 
        messages = prompt_template(user_input_1, history = [chat_history[0]]) 
    else  :  
        messages = prompt_template(user_input_1, history = chat_history )
        
    inputs = tokenizer.apply_chat_template(
        messages,  # Passing the initial prompt or conversation context as a list of messages. 
        add_generation_prompt=True,  # Append the tokens that mark the start of an assistant reply, so the model answers instead of continuing the user's message.
        return_dict=True,  # Return the results in dictionary format, which allows easier access to tokenized data, inputs, and other outputs.
        return_tensors="pt"  # Specifies that the output should be returned as PyTorch tensors. This is useful if you're working with models in a PyTorch-based environment.
    )

    inputs = {k: v.to(model.device) for k, v in inputs.items()} #  Moves all the input tensors to the same device (CPU/GPU) as the model.
    outputs = model.generate(**inputs, max_new_tokens=128)
    
    response = tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=True)# Decodes the model's output tokens back into human-readable text.
    print(response)
    conv_store[session_id].append(messages[-1]) ## Saving the user input
    conv_store[session_id].append({"role": "assistant", "content": f"{response}"}) ## Saving the model's output
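
The user-assistant alternation described above can be checked before the messages reach the tokenizer. The helper below is a minimal sketch of such a check, assuming the rule is an optional system message followed by strictly alternating `user` and `assistant` roles. The name `roles_alternate` is hypothetical.

```python
def roles_alternate(messages):
    """Return True if roles follow [system,] user, assistant, user, ..."""
    # An optional system message may come first
    turns = messages[1:] if messages and messages[0]["role"] == "system" else messages
    expected = "user"
    for message in turns:
        if message["role"] != expected:
            return False
        expected = "assistant" if expected == "user" else "user"
    return True


good = [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": "What are LLMs"},
    {"role": "assistant", "content": "Large Language Models are ..."},
    {"role": "user", "content": "What was my previous question"},
]
bad = [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": "What are LLMs"},
    {"role": "user", "content": "Hello?"},
]
print(roles_alternate(good), roles_alternate(bad))  # prints: True False
```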

    

# 8.0 Getting Initial input
- The user initiates the conversation with the model
- The input is passed to the model along with a unique session id

In [9]:
user_input_2  = 'What are conversational chatbots'
print_llm_response(user_input_2, session_id = session_id_1)

Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)


Conversational chatbots are software applications designed to simulate human conversation with users. They are programmed to understand and respond to text or voice inputs, allowing them to engage in natural, back-and-forth dialogues. These chatbots can be used in various applications, such as customer service, sales, and entertainment, to provide quick and efficient responses to user inquiries, guide users through processes, or entertain them.

Conversational chatbots use natural language processing (NLP) and machine learning algorithms to understand and generate human-like responses. They can be trained on large datasets of convers


In [10]:
user_input_3  = 'What are LLMs'
print_llm_response(user_input_3, session_id = session_id_1)

Setting `pad_token_id` to `eos_token_id`:None for open-end generation.


In the context of artificial intelligence (AI), LLMs (Large Language Models) refer to AI models that are trained on a large corpus of text data. These models are designed to generate human-like text based on the input they receive. They are used in various applications, such as chatbots, language translation, summarization, and content generation.

Large Language Models are trained using unsupervised learning techniques, where the model learns patterns and relationships in the data without being explicitly told what to learn. They are typically pre-trained on a large dataset of text, such as books, websites, and other


In [11]:
user_input_4  = 'What was my previous question'
print_llm_response(user_input_4, session_id = session_id_1)

Setting `pad_token_id` to `eos_token_id`:None for open-end generation.


Your previous question was: "What are LLMs?" In the context of artificial intelligence (AI), LLMs (Large Language Models) refer to AI models that are trained on a large corpus of text data. These models are designed to generate human-like text based on the input they receive. They are used in various applications, such as chatbots, language translation, summarization, and content generation. Large Language Models are trained using unsupervised learning techniques, where the model learns patterns and relationships in the data without being explicitly told what to learn. They are typically pre-trained on a large dataset of text


# 9.0 Retrieving Chat History
- All the interactions and responses are appended to the same conversation list, which is passed to the model on every turn.
- This is why the prompt size grows as the conversation progresses

In [12]:
get_chat_history(session_id_1)

[{'role': 'system', 'content': 'You are an AI assistant.'},
 {'role': 'user', 'content': 'What are conversational chatbots'},
 {'role': 'assistant',
  'content': 'Conversational chatbots are software applications designed to simulate human conversation with users. They are programmed to understand and respond to text or voice inputs, allowing them to engage in natural, back-and-forth dialogues. These chatbots can be used in various applications, such as customer service, sales, and entertainment, to provide quick and efficient responses to user inquiries, guide users through processes, or entertain them.\n\nConversational chatbots use natural language processing (NLP) and machine learning algorithms to understand and generate human-like responses. They can be trained on large datasets of convers'},
 {'role': 'user', 'content': 'What are LLMs'},
 {'role': 'assistant',
  'content': 'In the context of artificial intelligence (AI), LLMs (Large Language Models) refer to AI models that are t
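
Because the whole list is sent to the model on every turn, a long conversation can eventually exceed the model's context window. One common mitigation, shown here only as a sketch, is to keep the system message plus the most recent turns. The helper `trim_history` and its `max_turns` parameter are hypothetical, and the trade-off is that the model can no longer recall the turns that were dropped.

```python
def trim_history(messages, max_turns=2):
    """Keep the system message plus the last `max_turns` user/assistant pairs."""
    system = [m for m in messages[:1] if m["role"] == "system"]
    rest = messages[len(system):]
    kept = rest[-2 * max_turns:] if max_turns > 0 else []
    # Make sure the kept part still starts with a user message
    while kept and kept[0]["role"] != "user":
        kept = kept[1:]
    return system + kept


history = [{"role": "system", "content": "You are an AI assistant."}]
for i in range(1, 4):
    history.append({"role": "user", "content": f"q{i}"})
    history.append({"role": "assistant", "content": f"a{i}"})

trimmed = trim_history(history, max_turns=2)
print([m["content"] for m in trimmed])
```

The full history can stay in `conv_store`; only the list passed to the tokenizer would need to be trimmed.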

In [13]:
clear_chat_history()

# 10.0 Resetting Previous Session_id
- Clearing the chat history generated a new session id and reassigned the global `session_id_1`
- The earlier session ids remain in the `session_ids` list and can be used to fetch their conversations from `conv_store`

In [14]:
get_chat_history(session_id_1)

Session does not exist
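
The earlier conversation is not lost: it is still stored under the archived id. The sketch below uses stand-in data (`demo_store`, `demo_session_ids` and `demo_current_id` are hypothetical names that mirror `conv_store`, `session_ids` and `session_id_1`) to show how the last archived id can be used to look it up.

```python
# Stand-in data mirroring the notebook's state after clear_chat_history()
demo_store = {
    "old-session": [
        {"role": "system", "content": "You are an AI assistant."},
        {"role": "user", "content": "What are LLMs"},
    ]
}
demo_session_ids = ["old-session"]  # ids archived when the history was cleared
demo_current_id = "new-session"     # fresh id with no messages stored yet

# The new session has no entry in the store
print(demo_store.get(demo_current_id))  # prints: None

# The most recently archived id still points at the earlier conversation
previous_chat = demo_store.get(demo_session_ids[-1])
print(len(previous_chat))  # prints: 2
```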


# 11.0 Conclusion
In this notebook, we have built a chatbot that can maintain the conversation history over multiple interactions and in different sessions.

**Your feedback means a lot to me!**

**Please feel free to share your valuable input or suggestions so that I can improve my code.**