
In this assignment, I build a chatbot from the Cornell Movie--Dialogs Corpus, a dataset released by Cornell University that contains conversations from more than 600 movies. The dataset is read from the Thinkful database using the credentials shown below.

In [5]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sqlalchemy import create_engine
import nltk
import spacy
import re

import warnings
warnings.filterwarnings(action="ignore")

!python -m spacy download en

Collecting en_core_web_sm==2.1.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.1.0/en_core_web_sm-2.1.0.tar.gz (11.1MB)
     |████████████████████████████████| 11.1MB 608kB/s
Building wheels for collected packages: en-core-web-sm
  Building wheel for en-core-web-sm (setup.py) ... done
  Created wheel for en-core-web-sm: filename=en_core_web_sm-2.1.0-cp36-none-any.whl size=11074435 sha256=5e20ea22bc8eae306d32b0c9bde381327a503b1d17125ce95efebd4cdc36aabf
  Stored in directory: /tmp/pip-ephem-wheel-cache-1g8o0edk/wheels/39/ea/3b/507f7df78be8631a7a3d7090962194cf55bc1158572c0be77f
Successfully built en-core-web-sm
Installing collected packages: en-core-web-sm
  Found existing installation: en-core-web-sm 2.2.5
    Uninstalling en-core-web-sm-2.2.5:
      Successfully uninstalled en-core-web-sm-2.2.5
Successfully installed en-core-web-sm-2.1.0
✔ Download and installation successful
You can now load the model

In [6]:
postgres_user = 'dsbc_student'
postgres_pw = '7*.8G9QH21'
postgres_host = '142.93.121.174'
postgres_port = '5432'
postgres_db = 'cornell_movie_dialogs'

engine = create_engine('postgresql://{}:{}@{}:{}/{}'.format(
    postgres_user, postgres_pw, postgres_host, postgres_port, postgres_db))

dialogs_df = pd.read_sql_query('select * from dialogs',con=engine)

# no need for an open connection, as we're only doing a single query
engine.dispose()

dialogs_df.head(5)

Unnamed: 0,index,dialogs
0,0,Can we make this quick? Roxanne Korrine and A...
1,1,"Well, I thought we'd start with pronunciation,..."
2,2,Not the hacking and gagging and spitting part....
3,3,Okay... then how 'bout we try out some French ...
4,4,You're asking me out. That's so cute. What's ...
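
The connection cell above writes the credentials directly into the notebook. As a minimal sketch of an alternative, the helper below assembles the same kind of URL from environment-style settings. The `PG_*` variable names and the `build_postgres_url` helper are hypothetical and not part of the original assignment; the defaults simply mirror the values used above.

```python
import os
from urllib.parse import quote


def build_postgres_url(env=os.environ):
    """Assemble a PostgreSQL URL from environment-style settings.

    The PG_* names are hypothetical; the defaults mirror the values used above.
    """
    user = env.get("PG_USER", "dsbc_student")
    password = env["PG_PASSWORD"]  # no default: the secret stays out of the notebook
    host = env.get("PG_HOST", "142.93.121.174")
    port = env.get("PG_PORT", "5432")
    db = env.get("PG_DB", "cornell_movie_dialogs")
    # percent-encode the password so characters such as "@" or "/" cannot break the URL
    return "postgresql://{}:{}@{}:{}/{}".format(
        user, quote(password, safe=""), host, port, db)


# Example with a made-up password, passed as a plain dict instead of os.environ
example_url = build_postgres_url({"PG_PASSWORD": "p@ss/word"})
print(example_url)
```

The resulting string could be passed to `create_engine` in place of the formatted URL in the cell above.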


In [0]:
nlp = spacy.load('en', disable=['parser', 'ner'])

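# the parser is disabled, so add a rule-based sentencizer to mark sentence boundaries
nlp.add_pipe(nlp.create_pipe('sentencizer'))
# raise spaCy's default character limit so the whole corpus can be processed as one text
nlp.max_length = 20000000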

# all the processing work is done below, so it may take a while
dialogs_doc = nlp(" ".join(dialogs_df.dialogs))

In [8]:
# let's explore the objects we've built.
print("The dialogs_doc object is a {} object.".format(type(dialogs_doc)))
print("It is {} tokens long".format(len(dialogs_doc)))
print("The first three tokens are '{}'".format(dialogs_doc[:3]))
print("The type of each token is {}".format(type(dialogs_doc[0])))

The dialogs_doc object is a <class 'spacy.tokens.doc.Doc'> object.
It is 4272659 tokens long
The first three tokens are 'Can we make'
The type of each token is <class 'spacy.tokens.token.Token'>


In [0]:
# group the text into sentences,
# keeping only sentences that are longer than one character
dialog_sents = [sent.text for sent in dialogs_doc.sents if len(sent.text) > 1]
#dialog_sents

**Building a chatbot using ChatterBot**

In [0]:
# import libraries
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer, ChatterBotCorpusTrainer
from chatterbot.conversation import Statement

In [11]:
# create a chatbot
chatbot = ChatBot("Dialogs")
# clear any knowledge stored by earlier training runs
chatbot.storage.drop()

# create a new trainer for the chatbot
trainer = ListTrainer(chatbot)

# train the chatbot on the Cornell Movie--Dialogs sentences;
# ListTrainer treats the list as one conversation, each item answering the previous one
trainer.train(dialog_sents)

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
List Trainer: [####################] 100%
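
`ListTrainer` reads a list as a single conversation in which each item is a possible response to the item before it. The sketch below reproduces that pairing in plain Python to show what the bot is asked to learn. The `consecutive_pairs` helper and the sample sentences are made up for illustration and are not ChatterBot code.

```python
from typing import List, Tuple


def consecutive_pairs(sentences: List[str]) -> List[Tuple[str, str]]:
    """Pair every sentence with the sentence that follows it."""
    return list(zip(sentences, sentences[1:]))


# Made-up sample: the first two lines and the last two lines belong to different dialogs
sample = ["Hi.", "Hello.", "Where is the car?", "In the garage."]
pairs = consecutive_pairs(sample)

for prompt, response in pairs:
    print("{!r} -> {!r}".format(prompt, response))
```

Because all dialogs were joined into one text before sentence splitting, the last sentence of one dialog is paired with the first sentence of the next (here `'Hello.' -> 'Where is the car?'`). This may be one reason some of the bot's replies seem unrelated to the question.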


In [0]:
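import random
import re

GREETING_INPUTS = ["hello", "hi", "greetings", "what's up", "hey"]
GREETING_RESPONSES = ["hello", "hi", "hey", "hi there"]

def greeting(sentence):
    """Return a random greeting if the sentence contains one, otherwise None."""
    text = sentence.lower()
    for phrase in GREETING_INPUTS:
        # word boundaries match "hello!" and "what's up" but not the "hi" in "this"
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return random.choice(GREETING_RESPONSES)
    return None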

In [14]:
print("ChatterBot: I will try to respond reasonably. If you want to exit, please type bye.")

# below is the chatting
while True:
    user_input = input("User: ").lower()

    if user_input == 'bye':
        print("ChatterBot: Bye! It was a great chat.")
        break
    if user_input in ('thanks', 'thank you'):
        # print the reply before leaving the loop
        print("ChatterBot: You're welcome.")
        break

    # call greeting() once so the same random reply is tested and printed
    greeting_reply = greeting(user_input)
    if greeting_reply is not None:
        print("ChatterBot: " + greeting_reply)
    else:
        print("ChatterBot: ", end="")
        print(chatbot.get_response(user_input))

ChatterBot: I will try to respond reasonably. If you want to exit, please type bye.
User: hello
ChatterBot: hey
User: Are you like to watch movie?
ChatterBot: Fine.
User: Oh! Well, what kind of movie you like?
ChatterBot: Fine.
User: So how are you today?
ChatterBot: I'd like to commission a work.
User: What kind of work?
ChatterBot: Buttholus extremus.
User: Can we talk more about movie?
ChatterBot: She's partial to Joey, not me Sure Then, go get her What'd you do to her?
User: Oh no. I prefer talking about movies.
ChatterBot: Why?
User: Because, I'm kind of a descent person and like to respect the rule of law.
ChatterBot: Who?
User: me.
ChatterBot: This endless ...blonde babble.
User: Anyway, do you know the movie Superman?
ChatterBot: I believe we share an art instructor Have fun tonight?
User: Do you know something about technology?
ChatterBot: I believe we share an art instructor Have fun tonight?
User: No. What about artificial intelligence?
ChatterBot: Something, apperently,
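
The chat loop mixes console input with the decision about what to answer, which makes it hard to check without typing by hand. As a rough sketch, the per-turn logic can be moved into a function that takes the reply source as an argument. The `respond` function and the `echo_bot` stand-in are hypothetical; in the notebook, `chatbot.get_response` would be passed in place of `echo_bot`.

```python
import random

GREETINGS = {"hello", "hi", "greetings", "hey"}
GREETING_RESPONSES = ["hello", "hi", "hey", "hi there"]


def respond(user_input, bot_reply):
    """Return (reply_text, keep_chatting) for one user turn.

    bot_reply is any callable that maps a string to a reply.
    """
    text = user_input.strip().lower()
    if text == "bye":
        return "Bye! It was a great chat.", False
    if text in ("thanks", "thank you"):
        return "You're welcome.", False
    # strip trailing punctuation so that "hello!" still counts as a greeting
    if any(word.strip(".,!?") in GREETINGS for word in text.split()):
        return random.choice(GREETING_RESPONSES), True
    return str(bot_reply(text)), True


# Stand-in for chatbot.get_response so the sketch runs without ChatterBot
def echo_bot(text):
    return "You said: " + text


for line in ["hello", "Do you like movies?", "thanks"]:
    answer, keep_chatting = respond(line, echo_bot)
    print("User:", line)
    print("ChatterBot:", answer)
    if not keep_chatting:
        break
```

With this split, the interactive loop only needs to read a line, call `respond`, print the answer, and stop when the second return value is `False`.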