<a href="https://colab.research.google.com/github/sudhirtakke/Building-a-Chatbot-and-Deploying-as-a-Flask-Web-App/blob/main/Building_a_Chatbot_and_Deploying_as_a_Flask_Web_App.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# <center>Building a Chatbot and Deploying as a Flask Web App</center>

## Table of Contents

1. [Importing Libraries](#section1)<br><br>
2. [Importing Data](#section2)<br><br>
3. [Preprocessing the Data](#section3)
  - 3.1 [Saving the clean_conversations list into a pickle file](#section301)<br><br>
4. [Building the Model](#section4)
  - 4.1 [Importing the clean_conversations pickle file](#section401)<br><br>
  - 4.2 [Training](#section402)<br><br>
5. [Chatting](#section5)<br><br>
6. [Deploying as a Flask Web App](#section6)

<br><br>

<center><img src="https://raw.githubusercontent.com/insaid2018/DeepLearning/master/images/chatbot.gif" width="350" height="400"/></center>

<br><br>

<a id=section1></a>
## 1. Importing Libraries

- **Installing the required packages**

- The **`requirements.txt`** file will be present along with the notebook.

In [3]:
!pip install -r requirements.txt

Collecting chatterbot_corpus
  Cloning git://github.com/gunthercox/chatterbot-corpus (to revision master) to /tmp/pip-install-k0k49iec/chatterbot-corpus
  Running command git clone -q git://github.com/gunthercox/chatterbot-corpus /tmp/pip-install-k0k49iec/chatterbot-corpus
Collecting chatterbot
  Downloading https://files.pythonhosted.org/packages/7c/21/85c2b114bd9dfabdd46ba58fc4519acdaed45d8c70898d40079e37a45e67/ChatterBot-1.0.8-py2.py3-none-any.whl (63kB)
Collecting mathparse<0.2,>=0.1
  Downloading https://files.pythonhosted.org/packages/c3/e5/4910fb85950cb960fcf3f5aabe1c8e55f5c9201788a1c1302b570a7e1f84/mathparse-0.1.2-py3-none-any.whl
Collecting sqlalchemy<1.4,>=1.3
  Downloading https://files.pythonhosted.org/packages/31/44/a86070dda790ce94cd7d9fb9281cd614c7d30850ed774ace9a84d0d5d491/SQLAlchemy-1.3.24-cp37-cp37m-manylinux2010_x86_64.whl (1.3MB)
Buil

In [4]:
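import urllib.request  # urlopen lives in the urllib.request submodule, which a bare "import urllib" does not reliably load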
import re
import random

In [5]:
import pickle


- **Importing chatterbot's chatbot libraries**

<center><img src="https://raw.githubusercontent.com/insaid2018/DeepLearning/master/images/chatterbot.png" width="600" height="250"></center>


<br> 

- We use the **ChatterBot** library to build our **chatbot** because it makes it **easy** to **train** a conversational bot with very little code.

<br>


<br> 

In [6]:
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer
from chatterbot.trainers import ChatterBotCorpusTrainer

- **Importing Flask**

In [7]:
from flask import Flask, render_template, request

<a id=section2></a>
## 2. Importing Data

This corpus comes from the paper, "**Chameleons in imagined conversations**: *A new approach to understanding coordination of linguistic style in dialogs*" by **Cristian Danescu-Niculescu-Mizil** and **Lillian Lee**.

The paper and up-to-date data can be found here: [http://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html](http://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html)

Please see the **README** for more information on the authors' collection procedures.

In [8]:
# Importing the dataset from github.
response = urllib.request.urlopen('https://raw.githubusercontent.com/insaid2018/DeepLearning/master/Data/movie_lines.txt')
lines = response.read()
    
lines = lines.decode('utf8', errors='ignore').split('\n')

In [9]:
type(lines)

list

In [10]:
response = urllib.request.urlopen('https://raw.githubusercontent.com/insaid2018/DeepLearning/master/Data/movie_conversations.txt')
conversations = response.read()
    
conversations = conversations.decode('utf8', errors='ignore').split('\n')

In [11]:
type(conversations)

list

- Checking a few **samples** from the **dataset**.


- Each element of the **lines** list contains a line of **dialog** spoken by a **character** in a movie.

In [12]:
print(lines[0])
print(lines[1])
print(lines[2])

L1045 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ They do not!
L1044 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ They do to!
L985 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ I hope so.


- Each element of the **conversations** list contains a **list of interactions** between different **characters**.

In [13]:
print(conversations[0])
print(conversations[1])
print(conversations[2])

u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L194', 'L195', 'L196', 'L197']
u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L198', 'L199']
u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L200', 'L201', 'L202', 'L203']
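The samples above show that both files separate their fields with ` +++$+++ `. As a minimal sketch of how one record of each kind could be split into named parts, the snippet below uses two sample records copied from the output above. The names `MovieLine`, `parse_line` and `parse_conversation_ids` are illustrative helpers and are not part of the notebook.

```python
import ast
from typing import NamedTuple, Optional

SEPARATOR = " +++$+++ "


class MovieLine(NamedTuple):
    """One record of movie_lines.txt (field names chosen for illustration)."""
    line_id: str
    character_id: str
    movie_id: str
    character_name: str
    text: str


def parse_line(record: str) -> Optional[MovieLine]:
    """Split a movie_lines.txt record; return None if it is malformed."""
    fields = record.split(SEPARATOR)
    if len(fields) != 5:
        return None
    return MovieLine(*fields)


def parse_conversation_ids(record: str) -> list:
    """Return the list of line ids stored in the last field of a record."""
    # The last field is written as a Python list literal, so literal_eval can read it.
    return ast.literal_eval(record.split(SEPARATOR)[-1])


sample_line = "L1045 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ They do not!"
sample_conversation = "u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L194', 'L195', 'L196', 'L197']"

print(parse_line(sample_line).text)
print(parse_conversation_ids(sample_conversation))
```

Returning `None` for records that do not have exactly five fields mirrors the `len(_line) == 5` check used later in the preprocessing step.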


- Checking the **number of samples** in the dataset.

In [14]:
len(lines)

304714

In [15]:
len(conversations)

83098

<a id=section3></a>
## 3. Preprocessing the Data

- Creating a dictionary that **maps** each line **id** to the **text** of that line.

In [16]:
id2line = {}
for line in lines:
    _line = line.split(' +++$+++ ')
    if len(_line) == 5:
        id2line[_line[0]] = _line[4]

In [17]:
id2line

{'L1045': 'They do not!',
 'L1044': 'They do to!',
 'L985': 'I hope so.',
 'L984': 'She okay?',
 'L925': "Let's go.",
 'L924': 'Wow',
 'L872': "Okay -- you're gonna need to learn how to lie.",
 'L871': 'No',
 'L870': 'I\'m kidding.  You know how sometimes you just become this "persona"?  And you don\'t know how to quit?',
 'L869': 'Like my fear of wearing pastels?',
 'L868': 'The "real you".',
 'L867': 'What good stuff?',
 'L866': "I figured you'd get to the good stuff eventually.",
 'L865': 'Thank God!  If I had to hear one more story about your coiffure...',
 'L864': "Me.  This endless ...blonde babble. I'm like, boring myself.",
 'L863': 'What crap?',
 'L862': 'do you listen to this crap?',
 'L861': 'No...',
 'L860': 'Then Guillermo says, "If you go any lighter, you\'re gonna look like an extra on 90210."',
 'L699': 'You always been this selfish?',
 'L698': 'But',
 'L697': "Then that's all you had to say.",
 'L696': 'Well, no...',
 'L695': "You never wanted to go out with 'me, did y

- Creating a list of **all** of the **conversations**.

In [18]:
conversations_ids = []
for conversation in conversations[:-1]:
    _conversation = conversation.split(' +++$+++ ')[-1][1:-1].replace("'", "").replace(" ", "")
    conversations_ids.append(_conversation.split(','))

In [19]:
conversations_ids[:5]

[['L194', 'L195', 'L196', 'L197'],
 ['L198', 'L199'],
 ['L200', 'L201', 'L202', 'L203'],
 ['L204', 'L205', 'L206'],
 ['L207', 'L208']]

- Creating a single flat list of all **conversation lines** in **textual** format, in the **sequence** in which they take place.

In [20]:
conversations = []
for conversation in conversations_ids:
    for line_id in conversation:
        conversations.append(id2line[line_id])

In [21]:
conversations[:5]

['Can we make this quick?  Roxanne Korrine and Andrew Barrett are having an incredibly horrendous public break- up on the quad.  Again.',
 "Well, I thought we'd start with pronunciation, if that's okay with you.",
 'Not the hacking and gagging and spitting part.  Please.',
 "Okay... then how 'bout we try out some French cuisine.  Saturday?  Night?",
 "You're asking me out.  That's so cute. What's your name again?"]

In [22]:
conversations[-5:]

['Lord Chelmsford seems to want me to stay back with my Basutos.',
 'I think Chelmsford wants a good man on the border Why he fears a flanking attack and requires a steady Commander in reserve.',
 'Well I assure you, Sir, I have no desire to create difficulties. 45',
 "And I assure you, you do not In fact I'd be obliged for your best advice. What have your scouts seen?",
 'So far only their scouts. But we have had reports of a small Impi farther north, over there. ']

In [23]:
len(conversations)

304713
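The list built above is flat, so the last line of one conversation is immediately followed by the first line of the next. If conversation boundaries matter for training, one possible alternative is to keep one list of texts per conversation. The sketch below uses toy data, and `conversations_as_text` is a hypothetical helper rather than something the notebook defines.

```python
def conversations_as_text(conversation_ids, id_to_line):
    """Return one list of line texts per conversation, keeping boundaries intact."""
    grouped = []
    for ids in conversation_ids:
        # Skip ids that are missing from the lookup table.
        texts = [id_to_line[line_id] for line_id in ids if line_id in id_to_line]
        # A conversation needs at least one statement and one response.
        if len(texts) >= 2:
            grouped.append(texts)
    return grouped


# Toy stand-ins for id2line and conversations_ids.
toy_id2line = {"L1": "Hi", "L2": "Hello", "L3": "Bye", "L4": "See you"}
toy_ids = [["L1", "L2"], ["L3", "L4"], ["L5"]]

print(conversations_as_text(toy_ids, toy_id2line))
```

Each inner list could then be cleaned and passed to a trainer separately, at the cost of one training call per conversation.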

- *Function* to perform **cleaning** of the **texts**.

In [24]:
def clean_text(text):
    """Lowercase the text, expand common contractions and strip punctuation."""
    text = text.lower().strip()                               # Lowercasing and trimming white space from both ends.
    text = re.sub(r"i'm", "i am", text)                       # Expanding specific contractions first
    text = re.sub(r"he's", "he is", text)
    text = re.sub(r"she's", "she is", text)
    text = re.sub(r"that's", "that is", text)
    text = re.sub(r"what's", "what is", text)
    text = re.sub(r"where's", "where is", text)
    text = re.sub(r"how's", "how is", text)
    text = re.sub(r"\'s", " is", text)                        # Generic suffixes (note: a possessive 's also becomes " is")
    text = re.sub(r"\'ll", " will", text)
    text = re.sub(r"\'ve", " have", text)
    text = re.sub(r"\'re", " are", text)
    text = re.sub(r"\'d", " would", text)
    text = re.sub(r"won't", "will not", text)                 # Irregular negations must come before the generic n't rule
    text = re.sub(r"can't", "cannot", text)
    text = re.sub(r"n't", " not", text)
    text = re.sub(r"([-?.!,/\"])", r" \1 ", text)             # Adding spaces before and after -?.!,/\" so adjoining words stay separate
    text = re.sub(r"[-()\"#/@;:<>{}`+=~|.!?,']", "", text)    # Removing -()\"#/@;:<>{}`+=~|.!?,'
    text = re.sub(r"[ ]+", " ", text)                         # Replacing more than 1 space with a single space
    text = text.strip()                                       # Trimming white space left behind by the removals.

    return text

- **Cleaning** the **conversations**.

In [25]:
clean_conversations = []
for conversation in conversations:
    clean_conversations.append(clean_text(conversation))

In [26]:
clean_conversations[:5]

['can we make this quick roxanne korrine and andrew barrett are having an incredibly horrendous public break up on the quad again',
 'well i thought we would start with pronunciation if that is okay with you',
 'not the hacking and gagging and spitting part please',
 'okay then how bout we try out some french cuisine saturday night',
 'you are asking me out that is so cute what is your name again']
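The order of the substitutions in `clean_text` matters: the irregular forms `won't` and `can't` have to be handled before the generic `n't` rule. The short sketch below illustrates this with a reduced rule set. The helper `expand` and the two rule lists are made up for this illustration.

```python
import re


def expand(text, rules):
    """Apply (pattern, replacement) pairs to text in the given order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


# Same three rules, applied in two different orders.
specific_first = [(r"won't", "will not"), (r"can't", "cannot"), (r"n't", " not")]
generic_first = [(r"n't", " not"), (r"won't", "will not"), (r"can't", "cannot")]

sentence = "i won't go and i can't stay"

print(expand(sentence, specific_first))  # i will not go and i cannot stay
print(expand(sentence, generic_first))   # i wo not go and i ca not stay
```

With the generic rule first, `won't` has already been turned into `wo not` by the time the specific rule runs, so the specific rule never matches.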

<a id=section301></a>
### 3.1 Saving the clean_conversations list into a pickle file

- We are performing this step so that we can avoid the **data loading** and **preprocessing** steps every time we run the notebook. 


- We can **resume** directly from *Section 4: Building the Model* after **importing** the *libraries*.


- Also, this allows us to **export** the preprocessed list of **clean_conversations** to any system we want.

In [27]:
with open('clean_conversations.pickle', 'wb') as fp:
    pickle.dump(clean_conversations, fp)
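As a quick sanity check of the save step, the sketch below writes a small stand-in list to a temporary file and reads it back. The `sample` list and the temporary directory are assumptions made for the illustration, while the notebook itself writes to `clean_conversations.pickle` in the working directory.

```python
import os
import pickle
import tempfile

# Small stand-in for the clean_conversations list.
sample = ["hello there", "hi how are you"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "clean_conversations.pickle")

    # Write the list in binary mode, as in the cell above.
    with open(path, "wb") as fp:
        pickle.dump(sample, fp)

    # Read it back and compare with the original.
    with open(path, "rb") as fp:
        restored = pickle.load(fp)

print(restored == sample)  # True
```

Note that `pickle.load` can execute arbitrary code, so only pickle files from a trusted source should be loaded.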

<a id=section4></a>
## 4. Building the Model

<a id=section401></a>
### 4.1 Importing the clean_conversations pickle file

- **Loading** the preprocessed list **clean_conversations**.

In [28]:
with open('clean_conversations.pickle', 'rb') as fp:
    clean_conversations = pickle.load(fp)

In [29]:
clean_conversations[:5]

['can we make this quick roxanne korrine and andrew barrett are having an incredibly horrendous public break up on the quad again',
 'well i thought we would start with pronunciation if that is okay with you',
 'not the hacking and gagging and spitting part please',
 'okay then how bout we try out some french cuisine saturday night',
 'you are asking me out that is so cute what is your name again']

In [30]:
len(clean_conversations)

304713

<a id=section402></a>
### 4.2 Training

- **Creating the bot**.
<br><br> 
  - First we create the **Flask** app and store it in a variable named **app**.
<br><br> 
  - Then we create an object **bot** of **ChatBot** class and name the chatbot **Abot**.

In [31]:
app = Flask(__name__)
bot = ChatBot("Abot")

- **Training** the bot on **Chatterbot's English corpus**. 

<br> 

<center><img src="https://raw.githubusercontent.com/insaid2018/DeepLearning/master/images/chatbot3.gif" width="500" height="300"></center>

<br> 

  - Here we create a **trainer** object of **ChatterBotCorpusTrainer** class and pass in our ChatBot object **bot** to it.
<br><br>   
  - Then we use the **train** method and pass **chatterbot.corpus.english** into it.
<br><br>   
  - This will train the chatbot on **ChatterBot's** built-in **English** corpus.

In [32]:
!pip install chatterbot
!pip install chatterbot_corpus



In [33]:
trainer = ChatterBotCorpusTrainer(bot)
trainer.train("chatterbot.corpus.english")

Training ai.yml: [####################] 100%
Training botprofile.yml: [####################] 100%
Training computers.yml: [####################] 100%
Training conversations.yml: [####################] 100%
Training emotion.yml: [####################] 100%
Training food.yml: [####################] 100%
Training gossip.yml: [####################] 100%
Training greetings.yml: [####################] 100%
Training health.yml: [####################] 100%
Training history.yml: [####################] 100%
Training humor.yml: [####################] 100%
Training literature.yml: [####################] 100%
Training money.yml: [####################] 100%
Training movies.yml: [####################] 100%
Training politics.yml: [####################] 100%
Training psychology.yml: [####################] 100%
Training science.yml: [####################] 100%
Training sports.yml: [####################] 100%
Training trivia.yml: [####################] 100%


- **Training** the bot on some **basic statements**.
<br><br>
  - Then we change the trainer into a **ListTrainer**.
<br><br>   
  - It trains the bot on a **list** of statements that represents one **conversation**, treating each statement as a response to the one before it.
<br><br>
  - Then we pass in some lists containing some conversations to the trainer's **train** method and train the bot on them.

In [34]:
trainer = ListTrainer(bot)
trainer.train(['hello', 'Hi', 'Hello', 'hi'])
trainer.train(['What is your name?', 'My name is Abot'])
trainer.train(['Who are you?', 'I am a bot' ])
trainer.train(['Who created you?', 'A Human', 'You?'])

List Trainer: [####################] 100%
List Trainer: [####################] 100%
List Trainer: [####################] 100%
List Trainer: [####################] 100%


- **Training** the bot on the **movie conversations**.

  - Here we pass the **clean_conversations** list to our trainer's **train** method and train the bot.

  - This will take some time to train. Please be patient.

In [None]:
trainer.train(clean_conversations)
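Since training on all 304713 lines is slow, it may help to try a smaller slice first and to feed it to the trainer in batches. The sketch below only shows the batching. `iter_batches` and the toy `demo_lines` list are illustrative, and the `print` call stands in for where `trainer.train(batch)` would go in the notebook.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Toy stand-in for clean_conversations.
demo_lines = ["line %d" % i for i in range(7)]

# Use only the first 5 lines, in batches of 2.
for batch in iter_batches(demo_lines[:5], batch_size=2):
    print(batch)  # in the notebook, this is where trainer.train(batch) would be called
```

One trade-off of this approach is that the last statement of a batch and the first statement of the next batch are never seen together, so that pair is not learned.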

<a id=section5></a>
## 5. Chatting

- Creating **end_list** and **chatbot_bye** lists which contain some **partings** from user and from bot respectively.

In [36]:
end_list = ['bye', 'goodbye', 'see you later', 'see you soon', 'ciao', 'bi', 'bie', 'talk to you later']
chatbot_bye = ['Bye', 'See you soon', 'Goodbye']

- **To chat** with the bot in the **notebook**:

<br> 

<center><img src="https://raw.githubusercontent.com/insaid2018/DeepLearning/master/images/chatbot_reply.jpg" width="500" height="300"></center>

<br>

  - In this **loop**, we can chat with the bot.
<br><br>
  - Here, we use the **get_response** method of our chatbot object **bot** to get the output from the bot.
<br><br>
  - To **end** the conversation, just type in any of the partings from the **end_list**.

In [37]:
while True:
    message = input('You: ')
    if message.strip().lower() in end_list:
        print('Abot:', random.choice(chatbot_bye))
        break
    reply = bot.get_response(message)
    print('Abot:', reply)

You: hello
Abot: Hi
You: how are you
Abot: I am on the Internet.
You: Ok, how's the day today
Abot: DO YOU PLAY SOCCER
You: No
Abot: So what's your favorite color?
You: Green
Abot: global organization promoting environmental activism.
You: yeah
Abot: It all makes sense to my artificial mind.
You: bye
Abot: Goodbye
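The loop above mixes input handling with the decision of what to reply. One possible refactoring, sketched below, moves that decision into a function so it can be checked without typing into a prompt. `chat_turn` and `echo` are hypothetical names, and `echo` is only a stand-in for `bot.get_response`.

```python
import random

end_list = ['bye', 'goodbye', 'see you later', 'see you soon', 'ciao', 'bi', 'bie', 'talk to you later']
chatbot_bye = ['Bye', 'See you soon', 'Goodbye']


def chat_turn(message, respond):
    """Return (reply, finished) for a single user message."""
    if message.strip().lower() in end_list:
        return random.choice(chatbot_bye), True
    return str(respond(message)), False


def echo(message):
    """Stand-in for bot.get_response, used only for this illustration."""
    return "You said: " + message


print(chat_turn("hello", echo))
reply, finished = chat_turn("  BYE ", echo)
print(finished)
```

The same function could be called from both the notebook loop and the web app, so the parting logic lives in one place.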


<br>

<center><img src="https://raw.githubusercontent.com/insaid2018/DeepLearning/master/images/chatbot2.png" width="500" height="300"></center>

<br>

<a id=section6></a>
## 6. Deploying as a Flask Web App

<br>

<center><img src="https://raw.githubusercontent.com/insaid2018/DeepLearning/master/images/flask.png" width="350" height="180"></center>

<br>

- Defining the **Flask** web app for our chatbot.
<br><br> 
  - `@app.route("/")` maps the **root URL** (`/`) of the web app to the **home** function defined below it.
<br><br>   
  - The **home** function returns the **HTML** page for our web app.
<br><br>     
    - We use the **render_template** function to render our HTML file **chatbot.html**.
<br><br>    
    - The **chatbot.html** file must be located in the **templates** folder in the root directory of our Flask app.
<br><br>     
    - Because the app was created with `Flask(__name__)` in this notebook, its **root** directory is the same folder in which this notebook is located.

In [38]:
@app.route("/")
def home():
    return render_template("chatbot.html")

- Here we make the **connection** of our chatbot with the Flask app.
<br><br> 
  - `@app.route("/get")` will allow us to get the **user input** provided in the web app.
<br><br>   
  - The **get_bot_response** function allows us to send replies to the user input.
<br><br>   
    - First, we save the user input into a variable **user_text**.
<br><br>   
    - If the *user_text* value is present in *end_list*, we set **reply** to a randomly chosen value from the *chatbot_bye* list.
<br><br>  
    - If not, then we set **reply** equal to the bot response as a string, after passing the **user_text** to the **get_response** method of our chatbot object **bot**.
<br><br> 
    - Finally, we **return** the reply value.

In [39]:
@app.route("/get")
def get_bot_response():
    user_text = request.args.get('msg', '')  # Default to '' so a request without 'msg' does not fail on .strip()
    
    if user_text.strip().lower() in end_list:
        reply = random.choice(chatbot_bye)
    else:
        reply = str(bot.get_response(user_text))

    return reply

- Then we run the Flask app using **app.run()**.

In [None]:
if __name__ == "__main__":
    app.run()

<br>

- The **chatbot** can be accessed in your **web browser** at the **address** printed by `app.run()` above (by default, `http://127.0.0.1:5000/`).
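The `/get` route reads the user message from the `msg` query parameter, so the running app can also be queried without the HTML page. The sketch below only builds the request URL. `build_get_url` is an illustrative helper, and the base address assumes Flask's default host and port.

```python
from urllib.parse import urlencode


def build_get_url(base_url, message):
    """Build the URL that sends message to the /get route as the msg parameter."""
    return base_url.rstrip("/") + "/get?" + urlencode({"msg": message})


url = build_get_url("http://127.0.0.1:5000", "how are you")
print(url)  # http://127.0.0.1:5000/get?msg=how+are+you
```

While the app is running, passing this URL to `urllib.request.urlopen` and decoding the response body should give the bot's reply as plain text.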