<a href="https://colab.research.google.com/github/AceroMike/Natural-Language-Processing/blob/main/Building_a_Chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook I will build a chatbot using the [Cornell Movie-Dialogs Corpus](http://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html). Chatbots have been around for a long time, and by now we have all seen examples of them, from Amazon Echo to Siri. We will not build anything complicated here. We will use [ChatterBot](https://github.com/gunthercox/ChatterBot), a popular Python package that makes building chatbots easier.

Now to import what we will need. 

In [8]:
# Text Preprocessing
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import nltk
import spacy
import re
import random

import warnings
warnings.filterwarnings(action="ignore")

!python -m spacy download en

# Chatterbot
!pip install chatterbot
!pip install chatterbot-corpus
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer, ChatterBotCorpusTrainer
from chatterbot.conversation import Statement

✔ Download and installation successful
You can now load the model via spacy.load('en_core_web_sm')
✔ Linking successful
/usr/local/lib/python3.6/dist-packages/en_core_web_sm -->
/usr/local/lib/python3.6/dist-packages/spacy/data/en
You can now load the model via spacy.load('en')
Collecting chatterbot
  Downloading https://files.pythonhosted.org/packages/7c/21/85c2b114bd9dfabdd46ba58fc4519acdaed45d8c70898d40079e37a45e67/ChatterBot-1.0.8-py2.py3-none-any.whl (63kB)
     |████████████████████████████████| 71kB 4.1MB/s 
Collecting mathparse<0.2,>=0.1
  Downloading https://files.pythonhosted.org/packages/c3/e5/4910fb85950cb960fcf3f5aabe1c8e55f5c9201788a1c1302b570a7e1f84/mathparse-0.1.2-py3-none-any.whl
Installing collected packages: mathparse, chatterbot
Successfully installed chatterbot-1.0.8 mathparse-0.1.2
Collecting chatterbot-corpus
  Downloading https://files.pythonhosted.org/packages/ed/19/f8b41daf36fe4b0f43e283a820362ffdb2c1128600ab4ee18

The data was loaded by accessing a server, so the loading code is not shown. The Cornell Movie Dialogs data is stored in a DataFrame called `dialogs_df`.
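Since the loading code is not shown, here is a rough sketch of how the utterances could be pulled out of a local copy of the corpus instead. It assumes the `movie_lines.txt` layout, where fields are separated by ` +++$+++ ` and the last field is the spoken text. The helper name `extract_utterances` and the sample rows are made up for illustration; this is not the code used in this notebook.

```python
FIELD_SEP = " +++$+++ "  # separator used between fields in movie_lines.txt

def extract_utterances(raw_lines):
    """Return the utterance text from each well-formed movie_lines.txt row."""
    utterances = []
    for line in raw_lines:
        fields = line.rstrip("\n").split(FIELD_SEP)
        # Assumed layout: lineID, characterID, movieID, character name, utterance
        if len(fields) == 5 and fields[4].strip():
            utterances.append(fields[4].strip())
    return utterances

# Two made-up rows in the corpus layout (IDs and speaker names are illustrative only)
sample_rows = [
    "L1 +++$+++ u0 +++$+++ m0 +++$+++ SPEAKER A +++$+++ Can we make this quick?\n",
    "L2 +++$+++ u1 +++$+++ m0 +++$+++ SPEAKER B +++$+++ Forget it.\n",
]
print(extract_utterances(sample_rows))
```

With a real copy of the file, the returned list could be wrapped as `pd.DataFrame({"dialogs": utterances})` to get a `dialogs` column like the one used below. As far as I know the file is not UTF-8, so opening it with `encoding="iso-8859-1"` is a reasonable starting point.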

In [6]:
dialogs_df.head()

Unnamed: 0,index,dialogs
0,0,Can we make this quick? Roxanne Korrine and A...
1,1,"Well, I thought we'd start with pronunciation,..."
2,2,Not the hacking and gagging and spitting part....
3,3,Okay... then how 'bout we try out some French ...
4,4,You're asking me out. That's so cute. What's ...


As we can see, the dialogs are still in raw form. As usual, we will want to process the text data so it is ready to use with ChatterBot.

In [20]:
nlp = spacy.load('en_core_web_sm', disable=['parser', 'ner'])
nlp.add_pipe(nlp.create_pipe('sentencizer'))
nlp.max_length = 20000000
doc = nlp(" ".join(dialogs_df.dialogs))

In [21]:
doc[0:100]

Can we make this quick?  Roxanne Korrine and Andrew Barrett are having an incredibly horrendous public break- up on the quad.  Again. Well, I thought we'd start with pronunciation, if that's okay with you. Not the hacking and gagging and spitting part.  Please. Okay... then how 'bout we try out some French cuisine.  Saturday?  Night? You're asking me out.  That's so cute. What's your name again? Forget it. No, no, it

To make the Chatbot work faster, let's split the document into sentences.

In [29]:
dialogs = [sent.text for sent in doc.sents if len(sent.text) > 1]
dialogs

['Can we make this quick?',
 ' Roxanne Korrine and Andrew Barrett are having an incredibly horrendous public break- up on the quad.',
 ' Again.',
 "Well, I thought we'd start with pronunciation, if that's okay with you.",
 'Not the hacking and gagging and spitting part.',
 ' Please.',
 "Okay... then how 'bout we try out some French cuisine.",
 ' Saturday?',
 ' Night?',
 "You're asking me out.",
 " That's so cute.",
 "What's your name again?",
 'Forget it.',
 "No, no, it's my fault -- we didn't have a proper introduction --- Cameron.",
 "The thing is, Cameron -- I'm at the mercy of a particularly hideous breed of loser.",
 ' My sister.',
 " I can't date until she does.",
 'Seems like she could get a date easy enough... Why?',
 'Unsolved mystery.',
 ' She used to be really popular when she started high school, then it was just like she got sick of it or something.',
 "That's a shame.",
 'Gosh, if only we could find Kat a boyfriend... Let me see what I can do.',
 "C'esc ma tete.",
 'This 
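Several sentences in the output above start with a stray space (for example `' Again.'`), a leftover from the spacing in the raw text. A minimal sketch of an optional cleanup step follows. `clean_sentences` is a hypothetical helper, not part of spaCy or ChatterBot, and the notebook itself trains on the uncleaned list.

```python
def clean_sentences(sentences, min_chars=2):
    """Collapse runs of whitespace, trim the ends, and drop very short sentences."""
    cleaned = []
    for sentence in sentences:
        text = " ".join(sentence.split())  # split() with no arguments discards all extra whitespace
        if len(text) >= min_chars:
            cleaned.append(text)
    return cleaned

sample = ['Can we make this quick?', ' Again.', ' ', "That's  so cute."]
print(clean_sentences(sample))
```

The `min_chars=2` default mirrors the `len(sent.text) > 1` filter used above, but applies it after trimming, so whitespace-only sentences are dropped too.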

In [38]:
# Training on only first 2000 sentences. 
dialogs_temp = dialogs[0:2000]

Now we can train a Chatbot on the dialogs!

In [39]:
# Create a chatbot
chatbot = ChatBot('Dialogs')
# This is to remove the accumulated knowledge base
chatbot.storage.drop()

# Create a new trainer for the chatbot
trainer = ListTrainer(chatbot)

# Train the chatbot on the movie dialog sentences
trainer.train(dialogs_temp)

List Trainer: [####################] 100%
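It may help to picture what the trainer learns from a flat list. As far as I understand ChatterBot's `ListTrainer`, it treats the list as one conversation, so each sentence is stored as a response to the one before it. The sketch below does not call ChatterBot at all. `consecutive_pairs` is a hypothetical helper that only illustrates this pairing on a few sentences from the output above.

```python
def consecutive_pairs(sentences):
    """Pair each sentence with the one that follows it: (statement, response)."""
    return list(zip(sentences, sentences[1:]))

sample_dialogs = [
    "You're asking me out.",
    "That's so cute.",
    "What's your name again?",
    "Forget it.",
]

pairs = consecutive_pairs(sample_dialogs)
for statement, response in pairs:
    print(f"{statement!r} -> {response!r}")
```

Notice that the first three sentences come from the same row of `dialogs_df`, so under this pairing a speaker's own follow-up sentence is treated as a reply. That may be one reason the responses drift off topic.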


First, let's define some greetings to get the chatbot started. 

In [41]:
greeting_inputs = ['hello', 'hi', 'greetings', "what's up", 'hey']
greeting_responses = ['hello', 'hi', 'hey', 'hi there']

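def greeting(sentence):
  # Return a random greeting if the input contains a greeting, else None
  text = sentence.lower()
  for phrase in greeting_inputs:
    # Match single words as tokens and multi-word greetings ("what's up") as substrings
    if phrase in text.split() or (' ' in phrase and phrase in text):
      return random.choice(greeting_responses)
  return None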
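One limitation worth noting: because the input is split on whitespace only, a token such as `hi!` or `Hey,` does not match any entry in `greeting_inputs`. Here is a small sketch of one way to handle that, by stripping punctuation from each token before comparing. `has_greeting_word` and `GREETINGS` are hypothetical names used only for this example.

```python
import string

GREETINGS = {'hello', 'hi', 'greetings', 'hey'}  # single-word greetings from the list above

def has_greeting_word(sentence, greetings=GREETINGS):
    """Return True if any punctuation-stripped, lowercased token is a greeting."""
    tokens = (word.strip(string.punctuation).lower() for word in sentence.split())
    return any(token in greetings for token in tokens)

print(has_greeting_word("Hi, Chatbot!"))
print(has_greeting_word("This again"))
```

Comparing whole tokens rather than substrings keeps words like "This" (which contains "hi") from being mistaken for a greeting.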

In [44]:
print("Chatbot: Let's see how well I do as a chatbot! If you want to exit, type bye.")

# Below is the chatting
while True:

    user_input = input("User: ").lower()

    if user_input == 'bye':
        print("Chatbot: Bye! It was a great chat.")
        break
    if user_input in ('thanks', 'thank you'):
        # Reply before leaving the loop; a print placed after break never runs
        print("Chatbot: You're welcome.")
        break

    # Call greeting() once so the check and the reply use the same random choice
    reply = greeting(user_input)
    if reply is not None:
        print("Chatbot: " + reply)
    else:
        print("Chatbot: ", end="")
        print(chatbot.get_response(user_input))

Chatbot: Let's see how well I do as a chatbot! If you want to exit, type bye.
User: hi
Chatbot: hey
User: How are you?
Chatbot: As a gentleman You're sweet.
User: Why thank you
Chatbot: Roxanne Korrine and Andrew Barrett are having an incredibly horrendous public break- up on the quad.
User: Are they?!
Chatbot: As a gentleman You're sweet.
User: I am?
Chatbot: You never wanted to go out with 'me, did you?
User: Maybe
Chatbot: No, you weren't Then why'd you ask?
User: I am sorry
Chatbot: You never wanted to go out with 'me, did you?
User: I did
Chatbot: Seems like she could get a date easy enough... Why?
User: She is a cool person
Chatbot: Before I had kids when friends talked about their kids, I wanted to vomit.
User: Do you like kids?
Chatbot: Then I guess we're doin' this the right way, aren't we?
User: You are not making sense
Chatbot: That's me.
User: Now that makes sense
Chatbot: A down payment now, then a bonus for sleeping with me?
User: Take it easy there Chatbot. 
Chatbot: You

I hope you found my imaginary conversation with the Chatbot entertaining. The Chatbot didn't do so well (it often got off topic or gave weird responses), likely because we trained it on only a few sentences. Try adjusting the number of sentences and see if it does better! Let's find out.

In [45]:
# Training on only first 10000 sentences. 
dialogs_temp = dialogs[0:10000]

In [46]:
# Create a chatbot
chatbot = ChatBot('Dialogs')
# This is to remove the accumulated knowledge base
chatbot.storage.drop()

# Create a new trainer for the chatbot
trainer = ListTrainer(chatbot)

# Train the chatbot on the movie dialog sentences
trainer.train(dialogs_temp)

List Trainer: [####################] 100%


In [47]:
print("Chatbot: Let's see how well I do as a chatbot! If you want to exit, type bye.")

# Below is the chatting
while True:

    user_input = input("User: ").lower()

    if user_input == 'bye':
        print("Chatbot: Bye! It was a great chat.")
        break
    if user_input in ('thanks', 'thank you'):
        # Reply before leaving the loop; a print placed after break never runs
        print("Chatbot: You're welcome.")
        break

    # Call greeting() once so the check and the reply use the same random choice
    reply = greeting(user_input)
    if reply is not None:
        print("Chatbot: " + reply)
    else:
        print("Chatbot: ", end="")
        print(chatbot.get_response(user_input))

Chatbot: Let's see how well I do as a chatbot! If you want to exit, type bye.
User: Hi Chatbot
Chatbot: hi
User: How are you?
Chatbot: Takin' a bath.
User: With bubbles?
Chatbot: They booby trapped her apartment!
User: On no! 
Chatbot: Ever been to Club Skunk?
User: No I have not, have you?
Chatbot: As a gentleman You're sweet.
User: Have you been to Club Skunk?
Chatbot: I don't give a damn.
User: How rude
Chatbot: That is great.
User: Very very rude
Chatbot: Why?
User: Because I said so 
Chatbot: They do to!
User: What are you talking about?
Chatbot: She's partial to Joey, not me Sure Then, go get her What'd you do to her?
User: Who is Joey?
Chatbot: You never wanted to go out with 'me, did you?
User: This again 
Chatbot: Again?
User: Last chatbot asked me the same!
Chatbot: Oh, so now you think you know me?
User: I created you
Chatbot: Roxanne Korrine and Andrew Barrett are having an incredibly horrendous public break- up on the quad.
User: thanks


There does not seem to be much of a difference, although I was a bit more entertained by this chatbot, and in some cases it did a better job of keeping up with the conversation. I wonder what more sophisticated chatbots can do and how you build one. Stay tuned!