# Project 2: Making Chatbots!

## Part 1: Rule Based Chatbot (ELIZA)
In this section, you will implement a rule-based chatbot, ELIZA, using the provided eliza.py file. The eliza.py file contains the rules for the model to follow; you need to complete the code that uses it to run a chat agent and save the chat history.

In [4]:
import eliza

### Load the Eliza model from eliza.py

In [5]:
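# Note: this rebinds the name `eliza` from the module to the instance,
# so re-run `import eliza` before re-running this cell.
eliza = eliza.Eliza()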

### Define and open the file to save chat history

In [14]:
from datetime import datetime

# Get the current date and time
now = datetime.now()

# Format the date and time to create the file name
file_name = f"ELIZA_CHAT_{now.strftime('%Y_%m_%d_%H_%M_%S')}.txt"
f = open(file_name, "w")

### Define the regex for exit
Define a regular expression that searches the user input for words such as 'bye' or 'exit', so that the agent can recognize when the user wants to end the chat.

In [11]:
import re

def is_end(input_string):
    # Match 'bye' or 'exit' as whole words, ignoring case (so 'Bye' also ends the chat)
    pattern = re.compile(r'\b(bye|exit)\b', re.IGNORECASE)
    return pattern.search(input_string) is not None

In [12]:
#Testing is_end
print(is_end("and bye exit a good day"))

True
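
The word boundaries (`\b`) in the pattern decide which inputs end the chat, and this is easy to get wrong. The following is a minimal sketch, not the required solution; the names `EXIT_PATTERN` and `wants_to_end` are hypothetical and chosen only for this illustration. It shows that a standalone "bye" matches while "goodbye" and "exits" do not, which you may or may not want for your agent.

```python
import re

# Hypothetical names for illustration; the pattern is compiled once and reused.
EXIT_PATTERN = re.compile(r"\b(bye|exit)\b", re.IGNORECASE)

def wants_to_end(text):
    """Return True if the text contains 'bye' or 'exit' as a whole word."""
    return EXIT_PATTERN.search(text) is not None

examples = [
    "bye",                    # whole word
    "Bye for now!",           # case-insensitive, punctuation is a boundary
    "I want to EXIT",         # upper case
    "goodbye",                # 'bye' is preceded by a letter, so no boundary
    "the exits are blocked",  # 'exit' is followed by a letter, so no boundary
]
for s in examples:
    print(repr(s), wants_to_end(s))
```

If "goodbye" should also end the chat, it would have to be added to the alternation explicitly, for example `(bye|goodbye|exit)`.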


### Define a chat loop
In this part, you need to define a loop that asks for user input and prints the agent's response until the user wants to end the chat.

In [15]:
print("Hi - what would you like to talk about today?")
while True:
    user_input = input()
    if is_end(user_input):
        # Output the goodbye message, write the last inputs and outputs to the log and close the file
        goodbye_message = "Goodbye! Thank you for talking to Eliza."
        print(goodbye_message)
        f.write("USER: " + user_input + "\n") # Log the user's last input
        f.write("ELIZA: " + goodbye_message + "\n") # Log the agent's goodbye message
        f.close()
        break
    response = eliza.respond(user_input)
    print("ELIZA:", response)
    f.write("USER: " + user_input + "\n") # Log the user's input
    f.write("ELIZA: " + response + "\n") # Log the agent's response

Hi - what would you like to talk about today?
Let's talk about you
ELIZA: Can you elaborate on that?
I want to talk about you
ELIZA: What would it mean to you if you got to talk about me?
I think i can understand you better
ELIZA: Do you really think so?
yes i think it will help me
ELIZA: You seem quite sure.
i am quite sure
ELIZA: Did you come to me because you are quite sure?
i came to you because i was bored
ELIZA: you came to me because you were bored.
yes i did
ELIZA: You seem quite sure.
i am pretty surre
ELIZA: How long have you been pretty surre?
i have been pretty sure for a while now
ELIZA: Why do you tell me that you've been pretty sure for a while now?
because you asked me
ELIZA: If me asked you, what else must be true?
i think many things
ELIZA: Do you doubt many things?
no i dont
ELIZA: How does that make you feel?
bye
Goodbye! Thank you for talking to Eliza.
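
The loop above mixes reading input, generating a response, and logging. One possible way to make it easier to test is sketched below, assuming a hypothetical helper `run_chat` that takes the response function, the inputs, and the log as arguments. The stand-in functions `echo_respond` and `simple_is_end` are made up for the demonstration and do not replace `eliza.respond` or `is_end`.

```python
import io

def run_chat(respond, inputs, log, is_end, agent_name="ELIZA"):
    """Run a chat over an iterable of user inputs, writing every turn to `log`."""
    for user_input in inputs:
        log.write(f"USER: {user_input}\n")
        if is_end(user_input):
            farewell = f"Goodbye! Thank you for talking to {agent_name}."
            log.write(f"{agent_name}: {farewell}\n")
            return farewell
        response = respond(user_input)
        log.write(f"{agent_name}: {response}\n")
    return None  # inputs ran out before the user said goodbye

# Stand-ins for demonstration only (hypothetical, not part of eliza.py).
def echo_respond(text):
    return f"Tell me more about '{text}'."

def simple_is_end(text):
    return text.strip().lower() in {"bye", "exit"}

# An in-memory buffer plays the role of the log file here.
log = io.StringIO()
run_chat(echo_respond, ["hello", "bye"], log, simple_is_end)
print(log.getvalue())
```

In the notebook, the log could instead be a file opened in a `with open(file_name, "w") as f:` block, which closes the file even if the cell is interrupted before the user types "bye".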


Congrats, you are done with part 1. You now need to simply test out your model for 5 chat conversations (minimum 10 utterances in each conversation) and report the results of the human survey.

## Part 2: Corpus Based Chatbot

In this section, you will implement a corpus-based chatbot using the given dialogues.csv corpus. As a part of this task, you will first load the dataset, compute the sentence embeddings for the corpus sentences using the SentenceTransformer Library and then utilize these embeddings for retrieving the most appropriate response.

Note: Because it uses transformer architectures, this part will be slow to run in a CPU-based environment (up to 5 minutes), but it should be very fast in a Colab GPU environment (close to 5 seconds).

In [16]:
import pandas as pd
from sentence_transformers import SentenceTransformer, util
import numpy as np


### Load the dataset
Load the dialogues.csv file using the pandas library.

In [17]:
data = pd.read_csv('dialogues.csv')
data.head()

| Unnamed: 0 | emotion | User | Agent |
|---|---|---|---|
| 0 | sentimental | I remember going to see the fireworks with my ... | Was this a friend you were in love with, or ju... |
| 1 | sentimental | This was a best friend. I miss her. | Where has she gone? |
| 2 | sentimental | We no longer talk. | Oh was this something that happened because of... |
| 3 | sentimental | Was this a friend you were in love with, or ju... | This was a best friend. I miss her. |
| 4 | sentimental | Where has she gone? | We no longer talk. |


### Load the SentenceTransformer model
docs: https://sbert.net/docs/sentence_transformer/usage/usage.html

Load the ```all-MiniLM-L6-v2``` sentence transformer model for computing the contextual embeddings.

In [18]:
model = SentenceTransformer("all-MiniLM-L6-v2")


### Compute the sentence embeddings
For the 'User' column of the dataset, compute the sentence embeddings using the sentence transformer model.

In [19]:
user_dialogues = data['User'].tolist()
user_embeddings = model.encode(user_dialogues)

### Retrieve the agent response
In the get_response() function, use user_embeddings to retrieve the most similar instance in the dataset by cosine similarity. For the selected data point, return the corresponding entry in the 'Agent' column as the agent's response.

In [20]:
def get_response(user_input, data, model, user_embeddings):
    # Convert the input of the user to its sentence embedding
    input_embedding = model.encode(user_input)

    # Compute cosine similarities
    cosine_scores = util.pytorch_cos_sim(input_embedding, user_embeddings)

    # Find the index of the highest cosine similarity using np.argmax.
    best_match_idx = np.argmax(cosine_scores.numpy())

    # Return the corresponding string for the 'Agent' column
    return data['Agent'][best_match_idx]


In [21]:
user_input = "How are you"
print(get_response(user_input, data, model, user_embeddings))

fine you?
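
To see what the retrieval step computes, here is a minimal sketch in plain Python. It uses made-up 3-dimensional vectors in place of real sentence embeddings, and the helper names `cosine_similarity` and `best_match` are hypothetical. The notebook itself relies on `util.pytorch_cos_sim` and `np.argmax`; this sketch only spells out the same idea by hand.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(query, candidates):
    """Return the index of the most similar candidate and all scores."""
    scores = [cosine_similarity(query, c) for c in candidates]
    best_idx = max(range(len(scores)), key=scores.__getitem__)
    return best_idx, scores

# Made-up "embeddings" for three corpus utterances (assumption for illustration;
# real all-MiniLM-L6-v2 embeddings have 384 dimensions).
corpus_user = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
corpus_agent = ["reply A", "reply B", "reply C"]

query = [0.5, 0.9, 0.1]  # points mostly in the direction of the second vector
idx, scores = best_match(query, corpus_user)
print(idx, corpus_agent[idx])  # → 1 reply B
```

Because the agent always returns the reply paired with the single closest 'User' utterance, it answers even when the best score is low, which is one reason a corpus-based conversation can drift off topic.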


### Define and open the file to save chat history

In [27]:
from datetime import datetime

# Get the current date and time
now = datetime.now()

# Format the date and time to create the file name
file_name = f"CORPUS_CHAT_{now.strftime('%Y_%m_%d_%H_%M_%S')}.txt"
f = open(file_name, "w")

### Define a chat loop
In this part, you need to define a loop that asks for user input and prints the agent's response until the user wants to end the chat. Use the same regular expression as before to identify when the user wants to end the chat.

In [28]:
print("Hi - what would you like to talk about today?")
while True:
    user_input = input()
    if is_end(user_input):
        # Output the goodbye message, write the last inputs and outputs to the log and close the file
        goodbye_message = "Goodbye! Thank you for talking to Chat Corpus Bot."
        print(goodbye_message)
        f.write("USER: " + user_input + "\n") # Log the user's last input
        f.write("CORPUS AGENT: " + goodbye_message + "\n") # Log the agent's goodbye message
        f.close()
        break
    response = get_response(user_input, data, model, user_embeddings)
    print("CORPUS AGENT: ", response)
    f.write("USER: " + user_input + "\n") # Log the user's input
    f.write("CORPUS AGENT: " + response + "\n") # Log the agent's response

Hi - what would you like to talk about today?
Can we talk about the weather
CORPUS AGENT:  I did have that, and I was right
what do you mean
CORPUS AGENT:  I was so worried about a little girl that was wandering around the neighborhood without her parents.
what happened to her
CORPUS AGENT:  She died in a car accident.
oh no how did that happen
CORPUS AGENT:  I don't know. But now I am upset because I wasn't able to make it to my favorite gym!
what about the girl?
CORPUS AGENT:  Ah ok, yeah my partner has had similar issues recently. I really hope it all works out for you!
you have a partner?
CORPUS AGENT:  yes I do. Do you?
i do yes
CORPUS AGENT:  Me too! I am very sad today, wanna guess why?
why?
CORPUS AGENT:  My grandmother just died.
oh no im so sorry
CORPUS AGENT:  yea i was quite mad
you were mad at her death?
CORPUS AGENT:  Not really, just a little scared. She was only like 7 at the time. I felt like the worst mother ever!
what your grandmother was 7?
CORPUS AGENT:  She is the

Congrats, you are done with part 2. You now need to simply test out your model for 5 chat conversations (minimum 10 utterances in each conversation) and report the results of the human survey.