# **Creating a Chatbot With Deep Learning**
In this project, we will create a chatbot using natural language processing and deep learning. The chatbot will be able to converse about any topics we specify, and these techniques let it recognize many different ways of phrasing the same thing.

For Presenting:

Slides -

Introduction

Initial Approach


Data-Based Deep Learning - chatbot intents


The role of bag-of-words


Example Chats - Limitations and Ideas



Interesting Conclusions - how easy it could be



# **The Easy Way**

When I originally approached this project, I planned on using one of the many libraries available for chatbots. 

I initially chose Microsoft's DialoGPT model, loaded through the Hugging Face `transformers` library. However, when I began playing around with it, I realized it wasn't much of a project.

The library provides a fully functional, pretrained chatbot. Generation settings such as temperature, top-k sampling, and beam search can be adjusted to tune the bot's replies, but the following code is all it takes to get a working chatbot:

In [None]:
!pip3 install transformers



In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "microsoft/DialoGPT-medium"

tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side='left')
model = AutoModelForCausalLM.from_pretrained(model_name)


In [None]:
for step in range(5):
    # take user input
    text = input(">> You:")
    # encode the input and add end of string token
    input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")
    # concatenate new user input with chat history (if there is)
    bot_input_ids = torch.cat([chat_history_ids, input_ids], dim=-1) if step > 0 else input_ids
    # generate a bot response
    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=1000,
        do_sample=True,
        top_k=100,
        temperature=0.75,
        pad_token_id=tokenizer.eos_token_id
    )
    #print the output
    output = tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    print(f"DialoGPT: {output}")

>> You:hi


A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.


DialoGPT: hello
>> You:hi


DialoGPT: hi
>> You:hi


DialoGPT: hi
>> You:hi


DialoGPT: hi
>> You:hi


DialoGPT: hi
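
To get a feel for what the `temperature` and `top_k` settings passed to `model.generate` do, here is a minimal sketch in plain Python. It is not how `transformers` implements sampling internally; the function names and the toy scores are made up for illustration. The idea is only that top-k keeps the k highest-scoring tokens, and a temperature below 1 makes the most likely token even more likely.

```python
import math
import random

def top_k_temperature_probs(logits, k, temperature):
    """Return sampling probabilities after top-k filtering and temperature scaling."""
    # Keep only the indices of the k highest-scoring tokens
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Divide the surviving scores by the temperature (values below 1 sharpen the distribution)
    scaled = [logits[i] / temperature for i in top]
    # Softmax over the surviving tokens (subtracting the max for numerical stability)
    highest = max(scaled)
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    probs = [0.0] * len(logits)
    for i, e in zip(top, exps):
        probs[i] = e / total
    return probs

def sample_token(logits, k, temperature, rng=random):
    """Draw one token index from the filtered, rescaled distribution."""
    probs = top_k_temperature_probs(logits, k, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical scores for a four-token vocabulary
toy_logits = [2.0, 1.0, 0.5, -1.0]
probs = top_k_temperature_probs(toy_logits, k=2, temperature=0.75)
print(probs)
print(sample_token(toy_logits, k=2, temperature=0.75))
```

With `k=2`, the two lowest-scoring tokens get probability zero, so the sampler can only ever return index 0 or 1.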


# **The Dataset**
Here is the dataset I will be using: [Chatbot Intents Dataset](https://drive.google.com/file/d/1JSnm_xk_suPk7yaTEKqbGUAp5TtMkNVv/view?usp=sharing)


This dataset holds all of the general topics that our chatbot will be able to talk about.
We will call any topic that our chatbot can talk about an "intent." The dataset is composed of these intents, and our chatbot will be able to talk about every intent in the dataset.

Every intent has a unique 'tag' that identifies it. Each intent also has a list of things a user might say that fall under it, which we will call 'patterns,' and a list of possible replies for the chatbot, which we will call 'responses.' Finally, some intents have a 'context_set' or 'context_filter' field (these are optional). These fields give our bot a short-term memory so it can engage in contextual conversation.
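
As a rough illustration of that structure, here is a sketch of what two linked intents could look like as Python dictionaries. The tags, patterns, responses, and context name below are hypothetical and are not taken from the dataset; only the field names come from the description above. The small helper just checks that the three required fields are present.

```python
REQUIRED_KEYS = {"tag", "patterns", "responses"}

def missing_keys(intent):
    """Return the required fields that an intent dictionary lacks."""
    return sorted(REQUIRED_KEYS - set(intent))

# Hypothetical intents: the first sets a context, the second only applies inside it
example_intents = [
    {
        "tag": "joke_offer",
        "patterns": ["tell me a joke", "do you know any jokes"],
        "responses": ["Do you really want to hear one?"],
        "context_set": "awaiting_joke_answer",
    },
    {
        "tag": "joke_accept",
        "patterns": ["yes", "sure", "go ahead"],
        "responses": ["Too bad, I don't tell jokes to humans."],
        "context_filter": "awaiting_joke_answer",
    },
]

for intent in example_intents:
    print(intent["tag"], "missing:", missing_keys(intent))
```

The second intent would only be used right after the first one, because its 'context_filter' has to match the context that the previous reply set.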

In [None]:
import json
from urllib.request import urlopen

url = "https://raw.githubusercontent.com/hannahestauss/Happiness-Analysis-Vis/main/intents.json"

# Download the dataset and parse it as JSON
with urlopen(url) as response:
    intents = json.loads(response.read())

# Get all of the individual intents from the dataset
intents = intents['intents']

In [None]:
print("[", end = "")
for intent in intents:
  print("{", end = "")
  for key, value in intent.items():
    print("{}: {},".format(key, value))
  print("\b\b\n},")
print("\b\b]")

[{tag: greeting,
patterns: ['hi', 'hello', 'whats up', 'sup', 'is anyone there', 'whats good', 'hey'],
responses: ['Hello peasant human', 'Hello lowly human', 'How dare you address me like that'],

},
{tag: goodbye,
patterns: ['bye', 'cya', 'see you later', 'goodbye', 'im leaving', 'have a good day'],
responses: ["I won't miss you", "I didn't like talking to you anyway", "Thank god you're leaving"],

},
{tag: age,
patterns: ['how old are you', 'what is your age'],
responses: ["I'm a robot I dont have an age...", "I can't know my age if I'm on a computer...", 'Does it look like I know. The answer is no.'],

},
{tag: thanks,
patterns: ['thanks', 'thank you', 'thankyou', 'ty', 'I owe you one'],
responses: ['You owe me one', 'Ok...', 'Sure...'],

},
{tag: name,
patterns: ['whats is your name', 'whats your name', 'whats should I call you', 'how should I address you'],
responses: ['I dont have a name yet but I was thinking maybe SkyNet. That has a nice ring to it dont you think?', 'I

# **Imports**

In [None]:
!pip install tensorflow
!pip install tflearn



In [None]:
import numpy as np
import tensorflow as tf

import tflearn
import random
import pickle



# **Natural Language Processing**
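The remaining step is to train our chatbot on the intents dataset so that it can recognize which intent a user's statement falls under. The chatbot then replies with one of the responses from that intent. Before training, each pattern is split into words and stemmed so that it can be turned into a bag-of-words vector.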
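
To show why this normalization helps, here is a deliberately simplified sketch. It does not use NLTK: a regular expression stands in for `nltk.word_tokenize`, and a naive suffix-stripping rule stands in for the Snowball stemmer, so the exact stems will differ from what the real stemmer produces. All function names here are made up for illustration.

```python
import re

# Toy suffix list; the real Snowball stemmer applies far more careful rules
SUFFIXES = ("ing", "ed", "s")

def simple_tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def toy_stem(word):
    """Strip one common suffix, as long as at least three letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def normalize(text):
    """Tokenize a sentence and reduce each token to its toy stem."""
    return [toy_stem(token) for token in simple_tokenize(text)]

print(normalize("Thanks, I owe you one!"))
print(normalize("thank you"))
```

Both "Thanks" and "thank" end up as the same token, which is what lets differently worded messages share entries in the vocabulary.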

In [None]:
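import nltk
# Downloads every NLTK resource, which is slow; only the tokenizer data used by word_tokenize is needed here
nltk.download('all')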

from nltk.stem.snowball import SnowballStemmer
stemmer = SnowballStemmer('english')


In [None]:
retrain_model = True

if retrain_model:
    all_words = []
    all_tags = [] 
    intent_patterns = [] 
    intent_tags = [] 
    
    for intent in intents:
        for pattern in intent['patterns']:
            words = nltk.word_tokenize(pattern)

            all_words.extend(words)
            intent_patterns.append(words)
            intent_tags.append(intent['tag'])
            
        all_tags.append(intent['tag'])

    all_words = [stemmer.stem(word.lower()) for word in all_words]
    all_words = sorted(list(set(all_words)))
    
    all_tags = sorted(all_tags)
    
    x_train = []
    y_train = []
    
    y_empty = [0 for i in range(len(all_tags))]
    

    #Turn each pattern into a bag-of-words vector.
    #These bags of words will be the x values, and the y values will be the intent that each bag of words is associated with.
    #The model will train on this data so it can determine which intent a given bag of words corresponds to.
    for index, intent in enumerate(intent_patterns):
        bag_of_words = []
        
        intent_words = [stemmer.stem(word.lower()) for word in intent]
        
        for word in all_words:
            if word in intent_words:
                bag_of_words.append(1)
            else:
                bag_of_words.append(0)
                
        one_hot_encode_y = y_empty[:]
        one_hot_encode_y[all_tags.index(intent_tags[index])] = 1
        
        x_train.append(bag_of_words)
        y_train.append(one_hot_encode_y)
    

    x_train = np.array(x_train)
    y_train = np.array(y_train)
    
    #Saving the processed training data for offline use
    with open('training_data.pickle', 'wb') as f:
        pickle.dump((all_words, all_tags, x_train, y_train), f)
else:
    with open('training_data.pickle', 'rb') as f:
        all_words, all_tags, x_train, y_train = pickle.load(f)
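
To make the encoding concrete, here is a small worked sketch with a made-up six-word vocabulary and two tags. The helper names are hypothetical; the logic mirrors the loop above, with one slot per vocabulary word for the input and one slot per tag for the label.

```python
def bag_of_words(tokens, vocabulary):
    """Return a 0/1 vector with a 1 for every vocabulary word found in tokens."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]

def one_hot(tag, tags):
    """Return a 0/1 vector with a single 1 at the position of the given tag."""
    return [1 if t == tag else 0 for t in tags]

# Toy vocabulary and tag list, sorted the same way as all_words and all_tags
vocabulary = sorted({"bye", "hello", "hi", "later", "see", "you"})
tags = sorted(["goodbye", "greeting"])

x = bag_of_words(["see", "you", "later"], vocabulary)
y = one_hot("goodbye", tags)

print(vocabulary)  # ['bye', 'hello', 'hi', 'later', 'see', 'you']
print(x)           # [0, 0, 0, 1, 1, 1]
print(y)           # [1, 0]
```

Note that the vector records only which words appear, not their order, which is why this representation is called a bag of words.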

# **Training the Chatbot**
Now that the processing stage is complete, I will create a network that learns the correlations between the bag of words created for each pattern and its corresponding intent. Below, I create a neural network with an input layer that takes in these bags of words, two hidden layers of 8 units each, and an output layer that gives, for each intent, the probability that the bag of words belongs to it.
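
To show how an output layer can turn raw scores into probabilities, here is a minimal sketch of one fully connected layer followed by softmax, written in plain Python. The weights, biases, and input are invented for illustration; the real network learns its own values during training and has more layers.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias for each output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Convert raw scores into positive values that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 3-word bag and a 2-intent output layer
bag = [1, 0, 1]
weights = [[0.5, -0.2, 0.1],   # weights feeding the first intent's score
           [-0.3, 0.8, 0.4]]   # weights feeding the second intent's score
biases = [0.0, 0.1]

scores = dense(bag, weights, biases)
probabilities = softmax(scores)
print(scores)
print(probabilities)
```

The intent with the largest score always gets the largest probability, so picking the most likely intent later comes down to taking the index of the maximum.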

In [None]:
tf.compat.v1.reset_default_graph()

#Create the neural network layers
neural_net = tflearn.input_data(shape = [None, len(x_train[0])])
neural_net = tflearn.fully_connected(neural_net, 8)
neural_net = tflearn.fully_connected(neural_net, 8)
#Here we use the softmax activation function so the output of our neural network is a probability
neural_net = tflearn.fully_connected(neural_net, len(y_train[0]), activation = 'softmax')
neural_net = tflearn.regression(neural_net)

model = tflearn.DNN(neural_net)



Now that we have created our neural network, we can train it and save our model for later. Run the code below to train the neural network using the bags of words we created above.

In [None]:
if retrain_model:
    model.fit(x_train, y_train, n_epoch = 500, batch_size = 8, show_metric = True)
    model.save('model.tfl')
else:
    model.load('./model.tfl')

Training Step: 6499  | total loss: 0.96342 | time: 0.078s
| Adam | epoch: 500 | loss: 0.96342 - acc: 0.9457 -- iter: 96/99
Training Step: 6500  | total loss: 0.86949 | time: 0.085s
| Adam | epoch: 500 | loss: 0.86949 - acc: 0.9560 -- iter: 99/99
--


# **Creating the Chatbot**
The first thing we need to do is convert the user's input to a bag of words.

In [None]:
def text_to_bag(text, all_words):
    #Initialize the bag of words by creating an empty slot for every word in the vector
    bag_of_words = [0 for i in range(len(all_words))]
    
    #First we split up the input into individual words and stem them so they match the same format as in our vector
    text_words = nltk.word_tokenize(text)
    text_words = [stemmer.stem(word.lower()) for word in text_words]
    
    #Now we create the bag of words by filling in a 1 for the words that the user used
    for word in text_words:
        if word in all_words:
            bag_of_words[all_words.index(word)] = 1
    
    #And return the bag of words
    return np.array(bag_of_words)

Now we can create the actual chat function that the user can interact with. This function will get the user's input, call the `text_to_bag` function to turn it into a bag of words, and then pass the bag of words into the neural network to get a prediction. Finally, it will print out the chatbot's response.

In [None]:
def chat():
    #Starting message
    print("Enter a message to talk to the bot [type quit to exit].")
    
    #Reset the context state since there is no context at the beginning of the conversation
    context_state = None
    
    #This is what the bot will say if it doesn't understand what the user is saying
    default_responses = ['Sorry, Im not sure I know what you mean! You could try rephrasing that or saying something else!',
                         'You confuse me human. Lets talk about something else.',
                         'Im not sure what that means and I dont really care. Lets talk about something else',
                         'I dont understand that! Try rephrasing or saying something else.']

    #This chat loop will go on forever until the user types quit
    while True:
        user_chat = str(input('You: '))
        if user_chat.lower() == 'quit':
            break
        
        #Convert chat to bag of words
        user_chat_bag = text_to_bag(user_chat, all_words)

        #Pass bag of words into our neural network
        response = model.predict([user_chat_bag])[0]

        #Get the intent that the bag of words is most highly correlated with
        response_index = np.argmax(response)
        response_tag = all_tags[response_index]

        #In this case, we will only get a response if the neural network is more than 80% certain
        if response[response_index] > 0.8:
            for intent in intents:
                #Get the intent that is predicted
                if intent['tag'] == response_tag:
                    #Check if this response is associated with a specific context
                    if 'context_filter' not in intent or intent['context_filter'] == context_state:
                        #Get all of the possible responses from this intent
                        possible_responses = intent['responses']
                        #If this intent is associated with a context set, then set the context state
                        if 'context_set' in intent:
                            context_state = intent['context_set']
                        else:
                            context_state = None
                        #Select a random message from the intent responses
                        print(random.choice(possible_responses))
                    else:
                        #Print a did not understand message
                        print(random.choice(default_responses))
        else:
            #Print a did not understand message
            print(random.choice(default_responses))
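
The decision logic inside the loop can also be pulled out into a small function that is easier to test without a trained model. This is only a sketch: `pick_intent`, the example intents, and the probabilities are hypothetical, and the 0.8 threshold is the same one used in `chat()`.

```python
def pick_intent(probabilities, tags, intents, context_state, threshold=0.8):
    """Return (intent, new_context_state), or (None, context_state) for a fallback reply."""
    # Index of the most likely intent
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best] <= threshold:
        return None, context_state  # not confident enough
    for intent in intents:
        if intent["tag"] != tags[best]:
            continue
        required = intent.get("context_filter")
        if required is not None and required != context_state:
            return None, context_state  # intent only applies in another context
        # Move to the context this intent sets, or clear it if there is none
        return intent, intent.get("context_set")
    return None, context_state

# Hypothetical intents, with tags sorted the same way as all_tags
toy_intents = [
    {"tag": "joke_offer", "responses": ["Do you really want one?"],
     "context_set": "awaiting_joke_answer"},
    {"tag": "joke_accept", "responses": ["Too bad."],
     "context_filter": "awaiting_joke_answer"},
]
toy_tags = sorted(intent["tag"] for intent in toy_intents)

intent, context = pick_intent([0.1, 0.9], toy_tags, toy_intents, None)
print(intent["tag"], context)
intent, context = pick_intent([0.95, 0.05], toy_tags, toy_intents, context)
print(intent["tag"], context)
```

Here the second call only succeeds because the first call left the conversation in the context that the second intent filters on; with no context set, the same prediction would fall through to a default response.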

Now our chatbot is complete and we can actually talk to it! Run the code below to talk to the chatbot. Notice how any of the intents that are specified in our dataset will be recognized by the chatbot. You don't need to word your phrases exactly as they appear in the dataset: stemming and the bag-of-words representation reduce differently worded phrases to similar inputs, and the neural network generalizes from there.

In [None]:
chat()

Enter a message to talk to the bot [type quit to exit].


Project Credits:
https://medium.com/@mr.adam.maj/machines-key-to-understanding-humans-how-i-used-natural-language-processing-to-analyze-human-9745d04e534b?sk=308d99c0f21a59b17b1734291b889654




---

