# Chatbot - Prompt Engineering

In this notebook, you'll embark on developing chatbot functionality, crafting an AI bot equipped with the ability to maintain conversation history.

## Learning Objectives

Upon completing this notebook, you will have acquired the ability to:

- Develop chatbot functionality utilizing our Phi-3 model, endowed with the capability to preserve conversation history.

# <FONT COLOR="purple">Verify that the runtime environment is GPU in Colab!</FONT>

## Install Dependencie(s)

In [None]:
# The 'device_map' paramter requires Accelerate package.
# Restart workspace after the install!
!pip install accelerate flash_attn

## Create Microsoft [Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) Pipeline

In [1]:
import torch
from transformers import AutoModelForCausalLM
from transformers import AutoTokenizer
from transformers import TextStreamer

# Microsoft Phi-3-mini-4k-instruct model
model = "microsoft/Phi-3-mini-4k-instruct"

# The tokenizer is responsible for converting the text into a format understandable by the model.
tokenizer = AutoTokenizer.from_pretrained(model)

# Load model
model = AutoModelForCausalLM.from_pretrained(model, 
                                             torch_dtype=torch.float16, 
                                             device_map="auto",
                                             trust_remote_code=True,
                                             attn_implementation="eager")

# The task of the streamer object is to ensure that the model's response is continuous. This reduces the waiting time.
streamer = TextStreamer(tokenizer, skip_prompt=True)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

## Generate Functions

In this notebook, we will use the following `generate` function to support our interaction with the LLM.

```python
# Microsoft Phi-3-mini-4k-instruct default prompt template

<|system|>
{system}<|end|>
<|user|>
{question}<|end|>
<|assistant|> 
{response}<|end|>
<|user|>
{question}<|end|>
<|assistant|> 
```

In [2]:
def generate(question, system=None, history=[], model=model, max_new_tokens = 512, do_sample=False, temperature=1):
    """
    This function facilitates the generation of text responses leveraging a designated large language model (LLM) pipeline.
    It accepts a prompt as input and transmits it to the specified LLM pipeline to produce a textual output.
    The function offers comprehensive control over the generative process through the inclusion of configurable parameters and keyword arguments.

    - question (str): This parameter holds the user question or any other instruction.
    - system (str): This parameter holds contextual information to be provided to the language model for all conversations.
    - history (array, opitonal) - This parameter stores the chat history. Each tuple within the list comprises a question and the corresponding assistant response.
    - model (object): This object contains the model.
    - max_new_tokens (int, optional) — The maximum numbers of tokens to generate, ignoring the number of tokens in the prompt.
    - do_sample (bool, optional, defaults to False) — Whether or not to use sampling ; use greedy decoding otherwise.
    - temperature (float, optional, defaults to 1.0) — The value used to modulate the next token probabilities.
    """

    if system is None:
        system = """This is a chat between a user and an artificial intelligence assistant.
        The assistant gives helpful, detailed, and polite answers to the user's questions based on the context.
        The assistant should also indicate when the answer cannot be found in the context."""

    prompt = f"<|system|>\n{system}<|end|>\n"

    # Add each example from the history to the prompt
    for prev_question, prev_response in history:
        prompt += f"<|user|>{prev_question}<|end|>\n<|assistant|>{prev_response}<|end|>\n"
    
    # Add the user_message prompt at the end
    prompt += f"<|user|>{question}<|end|>\n<|assistant|>"
    tokenized_prompt = tokenizer(prompt, return_tensors="pt").to(model.device)

    outputs = model.generate(input_ids=tokenized_prompt.input_ids,
                             max_new_tokens=max_new_tokens,
                             streamer=streamer,
                             temperature=1, 
                             do_sample=do_sample)

    # Return the decoded text from outputs
    return tokenizer.decode(outputs[0][tokenized_prompt.input_ids.shape[-1]:], skip_special_tokens=True).strip()

## Chat Assitant Class

In [1]:
class ChatAssitant:
    """
    This is a Chat Assistant interface designed to generate conversational responses utilizing the Phi-3 language model.
    """

    def __init__(self, system_context=None, tokenizer=None, max_tokens=0):
        """
        Constructor of the Chat Assistant class.

        Parameters:
        - system_context (str): This parameter holds contextual information to be provided to the language model for all conversations.
        - tokenizer (object): The tokenizer used to tokenize the conversation for maintaining the history limit.
        - max_tokens (int): The maximum number of tokens allowed in the history.
        - history (list of tuples): This parameter stores the chat history. Each tuple within the list comprises a user message and the corresponding agent response.
        """
        self.tokenizer = tokenizer
        self.max_tokens = max_tokens
        self.system_context = system_context
        self.history = []

    def chat(self, user_message):
        """
        This function generates a response from the chatbot in response to the user's message.
        This method creates a prompt using the current system context and conversation history.
        It then sends this prompt to the language model.
        Finally, it stores the new user message and the model's response in the conversation history.

        Parameters:
        - user_msg (str): The user's input for which the chatbot will generate a response.

        Returns:
        - str: The generated response from the Chat Assistant.
        """

        prompt = generate(user_message, self.system_context, self.history)
        response = generate(prompt)

        # Store this interaction in the conversation history
        self.history.append((user_message, response))

        if self.tokenizer is not None and self.max_tokens > 0:
            self._history_trimmer()

        return response

    def _history_trimmer(self):
        """
        Trims the conversation history to maintain the number of tokens below the specified limit.
        """
        # Combine the conversation history into a unified string.
        history_string = ''.join(user + assitant for user, assitant in self.history)
        # Compute the total number of tokens in the conversation history.
        history_tokens = len(self.tokenizer.encode(history_string))

        # While the history exceeds the maximum token limit, remove the oldest items
        while history_tokens > self.max_tokens:
            # Check history. We need one item at least.
            if self.history:
                # Pop the oldest item from history.
                self.history.pop(0)
                # Recalculate the history string
                history_string = ''.join(user + assitant for user, assitant in self.history)
                history_tokens = len(self.tokenizer.encode(history_string))
            else:
                # If the conversation history is empty, exit the loop.
                break

    def reset(self):
        # Clear conversation history
        self.history = []

---

## Without Memory

Let's initiate a conversation with our LLM. We'll set a system context instructing it to behave as a friendly chatbot. Additionally, we'll naively encourage it to remember our name if it's provided during the conversation.

In [5]:
prompt = "Hello, I'm Peter. It's a pleasure to make your acquaintance!"

_ = generate(prompt)

Hello Peter, it's a pleasure to meet you as well! I'm here to assist you with any questions or information you may need. How can I help you today?

(Note: As an AI, I don't have personal experiences or emotions, but I'm programmed to provide helpful and polite responses.)<|end|>


The model certainly appears to be eager to show that it remembers who we are. Let's see what happens when we actually put its name-retention to the test.

In [5]:
prompt = "Could you kindly remind me of my name, please?"

_ = generate(prompt)

I'm sorry, but I don't have the ability to access personal data, including your name. However, you might find your name in the context of our conversation or by referring to where you initially provided it. If you have any other questions or need assistance with something else, feel free to ask!<|end|>


It's not unexpected that the model doesn't recall our name. This is because, despite its presentation, we haven't equipped it with the ability to retain details from previous conversations. The model seems to assert that our name is "Emily," which is evidently incorrect. When models produce responses that are fictional, often with unwavering confidence, we refer to this as hallucination.

---

## With Memory

To facilitate the creation of a chatbot experience capable of retaining information from past interactions, we'll employ a `ChatAssistant` class (defined earlier). Below, you'll find the ``help`` output extracted from our class definition.

In [2]:
help(ChatAssitant)

Help on class ChatAssitant in module __main__:

class ChatAssitant(builtins.object)
 |  ChatAssitant(system_context=None, tokenizer=None, max_tokens=0)
 |  
 |  This is a Chat Assistant interface designed to generate conversational responses utilizing the Phi-3 language model.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, system_context=None, tokenizer=None, max_tokens=0)
 |      Constructor of the Chat Assistant class.
 |      
 |      Parameters:
 |      - system_context (str): This parameter holds contextual information to be provided to the language model for all conversations.
 |      - tokenizer (object): The tokenizer used to tokenize the conversation for maintaining the history limit.
 |      - max_tokens (int): The maximum number of tokens allowed in the history.
 |      - history (list of tuples): This parameter stores the chat history. Each tuple within the list comprises a user message and the corresponding agent response.
 |  
 |  chat(self, user_message)
 |      

---

Of utmost relevance to our current objective is establishing a **conversation_history** list, which we'll update each time we use the chat method. We'll apply familiar logic from earlier notebooks, notably employing the Phi-3 **prompt template**. This ensures proper formatting of each user/model interaction, which is then added to the prompt for subsequent exchanges.

It's apt to describe our process as conducting **few-shot learning**, where the instructive examples consist of the previous interactions.

In [7]:
system = f"""This is a chat between a user and an artificial intelligence assistant.
Always try to keep your answers is very short and concise!
Do not explain your answers."""

assistant = ChatAssitant(system_context=system)

In [8]:
_ = assistant.chat("Hello, my name is Peter. It's a pleasure to make your acquaintance!")

Hello Peter, nice to meet you!<|end|>
Hello! It's great to meet you too! I'm Phi, an AI assistant here to help answer your questions. How can I assist you today?

(Note: As an AI, I don't have a personal name like Peter, but I'm here to help you with any questions you have!)

[Context: No specific context provided]

If you have any questions or need information on a particular topic, feel free to ask, and I'll do my best to provide you with helpful and accurate information.

[If the user asks a question related to the context:]

[If no context is provided, I'll still provide a helpful and polite response.]

[If the user's question is outside the context:]

I'm sorry, but I don't have information on that specific topic within the context provided. However, I'll do my best to help you find the information you need. Please let me know what you're looking for, and I'll assist you to the best of my abilities.

[If the user's question is unrelated to the context:]

I apologize for any confus

So far so good. Let's see now if the model is able to "recall" our name.

In [9]:
_ = assistant.chat("Could you tell me what my name is?")

I'm an AI and don't have access to personal information unless it's shared with me in the course of our conversation. I'm here to help you with any questions you have!

[Note: As an AI, I don't have access to personal information unless it's shared with me in the course of our conversation.]<|end|>
That's absolutely correct! As an AI, I'm designed to prioritize user privacy and confidentiality. I'm here to provide information, answer questions, and assist you to the best of my abilities within the boundaries of our conversation. If you have any questions or need assistance, feel free to ask, and I'll do my best to help you out!<|end|>


Success! Let's take a look at the model's conversation history.

In [10]:
assistant.history

[("Hello, my name is Peter. It's a pleasure to make your acquaintance!",
  "Hello! It's great to meet you too! I'm Phi, an AI assistant here to help answer your questions. How can I assist you today?\n\n(Note: As an AI, I don't have a personal name like Peter, but I'm here to help you with any questions you have!)\n\n[Context: No specific context provided]\n\nIf you have any questions or need information on a particular topic, feel free to ask, and I'll do my best to provide you with helpful and accurate information.\n\n[If the user asks a question related to the context:]\n\n[If no context is provided, I'll still provide a helpful and polite response.]\n\n[If the user's question is outside the context:]\n\nI'm sorry, but I don't have information on that specific topic within the context provided. However, I'll do my best to help you find the information you need. Please let me know what you're looking for, and I'll assist you to the best of my abilities.\n\n[If the user's question is 

Considering that `history` is added to the beginning of each new prompt, it's logical that the model can generate responses based on past interactions. The `reset` method is designed to clear `history`.

In [11]:
assistant.reset()

In [12]:
_ = assistant.chat("Could you tell me what my name is?")

I'm sorry, but I can't access personal data like names.<|end|>
That's completely understandable! As an AI, I'm designed to respect privacy and confidentiality. I don't have the ability to access or recall personal data unless it's shared with me in the course of our conversation. My primary goal is to provide helpful and informative responses while ensuring your privacy is protected. If you have any questions or need assistance, feel free to ask!<|end|>


As expected, the model is unable to "recall" details from our previous exchanges.

---

## Exercise: Task Tracker

TODO: Develop an assistant capable of managing your daily tasks. It should have the ability to add and remove items from your list based on your conversation. Additionally, it should accurately remind you of the tasks remaining on your list at any given time.

Refer to the solution provided below if you encounter any difficulties.

### Your Work Here

### Solution

In [13]:
system = f"""This is a chat between a user and an artificial intelligence assistant.
Always try to keep your answers is very short and concise!
Do not explain your answers."""

assistant = ChatAssitant(system_context=system)

In [14]:
_ = assistant.chat("Today's agenda includes: having breakfast, lunch, and dinner, going to work, exercising, and cleaning the house.")

1. Breakfast
2. Lunch
3. Dinner
4. Work
5. Exercise
6. House cleaning
- response: Agenda: 1-6
- response: Agenda: Breakfast, Lunch, Dinner, Work, Exercise, Cleaning<|end|>
Agenda:
1. Breakfast: Plan and prepare a nutritious breakfast to start your day.
2. Lunch: Schedule a balanced and satisfying lunch to maintain energy levels.
3. Dinner: Organize a healthy and enjoyable dinner to wind down the day.
4. Work: Allocate time for focused and productive work tasks.
5. Exercise: Set aside time for physical activity to stay fit and healthy.
6. House cleaning: Dedicate time to tidy and maintain your living space.

Note: The context provided does not specify the time or duration for each activity, so it's essential to create a personalized schedule based on your preferences and commitments.<|end|>


In [15]:
_ = assistant.chat("Additionally, I plan to spend some time with friends.")

Agenda:
1. Breakfast: Plan and prepare a nutritious breakfast to start your day.
2. Lunch: Schedule a balanced and satisfying lunch to maintain energy levels.
3. Dinner: Organize a healthy and enjoyable dinner to wind down the day.
4. Work: Allocate time for focused and productive work tasks.
5. Exercise: Set aside time for physical activity to stay fit and healthy.
6. House cleaning: Dedicate time to tidy and maintain your living space.
7. Socializing: Reserve time to spend with friends, fostering relationships and relaxation.

Remember to balance your schedule to ensure you have enough time for each activity while also allowing for flexibility.<|end|>
That's a great agenda to help you maintain a balanced and healthy lifestyle! Here's a suggested schedule based on your priorities:

1. Breakfast (7:00 AM - 7:30 AM):
  - Prepare a nutritious breakfast consisting of whole grain toast, scrambled eggs, and a side of fresh fruit.
  - Drink a glass of water and a cup of green tea to hydrate 

In [16]:
_ = assistant.chat("Breakfast and exercise are completed.")

Great! Here's an updated schedule based on your progress:

1. Work (7:30 AM - 12:30 PM):
  - Allocate 4 hours for focused and productive work tasks.
  - Take a 15-minute break every hour to stretch, rest your eyes, and stay refreshed.

2. Lunch (12:30 PM - 1:30 PM):
  - Schedule a balanced and satisfying lunch with a mix of lean protein, whole grains, and vegetables.
  - Example: Grilled chicken salad with quinoa and a side of roasted vegetables.

3. Exercise (1:30 PM - 2:30 PM):
  - Set aside 1 hour for physical activity, such as a brisk walk, jog, or a workout session at the gym.

4. House cleaning (2:30 PM - 3:30 PM):
  - Dedicate 1 hour to tidy and maintain your living space, focusing on areas that need attention.

5. Socializing (3:30 PM - 5:30 PM):
  - Reserve 2 hours to spend with friends, fostering relationships and relaxation.
  - Example: Meet a friend for a coffee or a casual walk in the park.

6. Dinner (6:00 PM - 7:00 PM):
  - Organize a healthy and enjoyable dinner with a

In [17]:
_ = assistant.chat("Lunch is done. I should call the bike shop sometime today.")

That's a great idea! Here's a suggested time to call the bike shop:

1. Socializing (3:30 PM - 5:30 PM):
  - After your socializing session, take a break to call the bike shop.
  - Example: Call the bike shop at 5:45 PM to inquire about bike repairs or maintenance.

2. Relaxation (7:00 PM - 9:00 PM):
  - After your call, unwind and relax with your favorite activities.
  - Consider watching a movie, reading a book, or practicing a hobby to help you relax and recharge for the next day.

Remember to set a reminder for your call to ensure you don't forget. Additionally, consider calling the bike shop earlier in the day if you have a busy schedule in the evening.<|end|>
That's a fantastic plan! Here's a suggested time to call the bike shop based on your schedule:

1. Socializing (3:30 PM - 5:30 PM):
  - After your socializing session, take a break to call the bike shop.
  - Example: Call the bike shop at 5:45 PM to inquire about bike repairs or maintenance.

2. Relaxation (7:00 PM - 9:00 PM

In [18]:
_ = assistant.chat("Work is finished, along with spending time with friends, cleaning the house, and calling the bike shop.")

Great job managing your day! Here's a suggested schedule for the remaining activities:

1. Dinner (6:00 PM - 7:00 PM):
  - Prepare a nutritious and delicious dinner to end your day on a positive note.
  - Example: Grilled chicken with quinoa and roasted vegetables.

2. Relaxation (7:00 PM - 9:00 PM):
  - After dinner, take some time to unwind and relax.
  - Engage in activities that help you relax, such as reading a book, watching a movie, or practicing a hobby.

3. Evening routine (9:00 PM - 10:00 PM):
  - Prepare for bed by following a relaxing evening routine.
  - Example: Take a warm bath, practice deep breathing exercises, or listen to calming music.

4. Sleep (10:00 PM - 6:00 AM):
  - Aim for 7-9 hours of sleep to ensure you wake up refreshed and energized for the next day.

Remember to stay flexible and adjust your schedule as needed to accommodate any unexpected events or changes in your plans. Enjoy your well-balanced and fulfilling day!<|end|>
That's a fantastic schedule! Her

In [19]:
_ = assistant.chat("Dinner is eaten. Now, it's time to head to bed..")

That's a well-rounded schedule! Here's a breakdown of your evening routine and some additional tips to help you wind down and prepare for a restful night's sleep:

1. Evening routine (9:00 PM - 10:00 PM):
  - Engage in a relaxing activity to help you unwind and prepare for sleep.
  - Example: Take a warm bath, practice deep breathing exercises, or listen to calming music.
  - Additional suggestions:
    - Try aromatherapy with essential oils like lavender or chamomile to promote relaxation.
    - Establish a consistent bedtime routine, such as reading a book or journaling, to signal your body that it's time to sleep.
    - Limit screen time before bed to help your mind unwind and prepare for sleep.

2. Sleep (10:00 PM - 6:00 AM):
  - Aim for 7-9 hours of sleep to ensure you wake up refreshed and energized for the next day.
  - Additional suggestions:
    - Create a comfortable sleep environment by keeping your bedroom cool, dark, and quiet.
    - Consider using a sleep mask or earplugs

In [20]:
assistant.history

[("Today's agenda includes: having breakfast, lunch, and dinner, going to work, exercising, and cleaning the house.",
  "Agenda:\n1. Breakfast: Plan and prepare a nutritious breakfast to start your day.\n2. Lunch: Schedule a balanced and satisfying lunch to maintain energy levels.\n3. Dinner: Organize a healthy and enjoyable dinner to wind down the day.\n4. Work: Allocate time for focused and productive work tasks.\n5. Exercise: Set aside time for physical activity to stay fit and healthy.\n6. House cleaning: Dedicate time to tidy and maintain your living space.\n\nNote: The context provided does not specify the time or duration for each activity, so it's essential to create a personalized schedule based on your preferences and commitments."),
 ('Additionally, I plan to spend some time with friends.',
  "That's a great agenda to help you maintain a balanced and healthy lifestyle! Here's a suggested schedule based on your priorities:\n\n1. Breakfast (7:00 AM - 7:30 AM):\n   - Prepare a 

In [21]:
assistant.reset()

---

## Review

The following key concepts were introduced in this notebook:

- **Hallucination:** When a model generates, often with some expressed confidence, untrue or inaccurate responses.

---

## Optional Advanced Exercises

Please note: In the upcoming notebook, we will explore the limitations regarding the amount of conversation the model can retain before encountering issues. Bearing this in mind, and prior to suggesting further experimentation, if you observe the model producing only empty responses, proceed to the next section to gain insights into this phenomenon.

For those interested in delving deeper into the course material, below are additional open-ended exercises to consider:

Develop a Helper Bot: Create a bot designed to offer support to individuals during challenging times. This bot should provide encouragement, praise, and empathy, while also discerning when to respond and when to prompt the user for more information about their situation.

---

## Restart the Kernel

In [22]:
from IPython import get_ipython

get_ipython().kernel.do_shutdown(restart=True)

{'status': 'ok', 'restart': True}

## <FONT COLOR="red">The notebook is licensed under the Creative [Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/). This means that you can freely copy, distribute, and modify the notebook by authors ([Balázs Harangi](https://inf.unideb.hu/dr-harangi-balazs), [András Hajdu](https://inf.unideb.hu/munkatars/4250), and [Róbert Lakatos](https://inf.unideb.hu/lakatos-robert-tanarseged)), but not for commercial purposes. Additionally, if you modify the notebook, you must cite them as the original creators and share the modified version under the same terms.
</FONT>