# Digital Assistant: chat-bots

In this case study we build a chatbot prototype that uses NLP and
ML to understand the user’s intent and respond based on the underlying logic.


## Content

* [1. Problem Definition](#1)
* [2. Getting Started - Load Libraries and Dataset](#2)
    * [2.1. Load Libraries](#2.1)
* [3. Training a default chatbot](#3)
* [4. Data Preparation for customized chatbot](#4)
* [5. Model Construction and Training](#5)
    * [5.1. Model Construction](#5.1)
    * [5.2. Building Custom Logic Adapter](#5.2)
    * [5.3. Training the model](#5.3)
* [6. Model Testing and Usage](#6)

<a id='1'></a>
# 1. Problem Definition

The goal of this case study is to build a basic prototype of a conversational chatbot
powered by NLP. The primary purpose of this chatbot is to retrieve a financial
ratio for the company the user is asking about. Chatbots designed to quickly
retrieve the summary of a stock or instrument may help the user make a trading
decision.

<a id='2'></a>
# 2. Getting Started - Loading the data and Python packages


<a id='2.1'></a>
## 2.1. Loading the python packages
For this case study we use the Python package ChatterBot, a library for creating a simple chatbot with minimal programming.
Let us check whether the ChatterBot package is present and, if not, install it. This package is checked separately because it is not included in the requirement.txt of this book's repository, as it is not used in any other case study of the book.

In [5]:
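import pkg_resources

# Names of all installed distributions (keys are lower-case)
installedPackages = {pkg.key for pkg in pkg_resources.working_set}
if 'chatterbot' not in installedPackages:
    # Notebook shell command: install the version used in this case study
    !pip install ChatterBot==1.0.5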
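
In recent setuptools releases `pkg_resources` is deprecated, so the same check can also be written with the standard library. The following is a minimal sketch, assuming Python 3.8 or later; the helper name `is_installed` is hypothetical and not part of this case study.

```python
from importlib import metadata


def is_installed(dist_name):
    """Return True if a distribution with this name is installed."""
    try:
        metadata.version(dist_name)
        return True
    except metadata.PackageNotFoundError:
        return False


# Hypothetical usage: report the install command instead of running it
if not is_installed("ChatterBot"):
    print("ChatterBot is missing; install it with: pip install ChatterBot==1.0.5")
```

Inside a notebook, the `print` call could be swapped for the `!pip install` line used in the cell above.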

Let us load the chatterbot package

In [1]:
# Load libraries
from chatterbot import ChatBot
from chatterbot.logic import LogicAdapter
from chatterbot.trainers import ChatterBotCorpusTrainer
from chatterbot.trainers import ListTrainer

In [2]:
# Disable the warnings
import warnings
warnings.filterwarnings('ignore')

Before we move to the customised chatbot, let us develop a chatbot using the default features and logic adapters of the ChatterBot package.

<a id='3'></a>
# 3. Training a default chatbot

ChatterBot, like many other chatbot packages, comes with a data utility module that can be used to train chatbots.

The following is a simple example to get started with ChatterBot in Python, using these components:
* **preprocessors** 
* **logic_adapters** 
* **corpus training** 
* **list training** 

In [3]:
chatB = ChatBot("Trader",
                preprocessors=['chatterbot.preprocessors.clean_whitespace'],
                logic_adapters=['chatterbot.logic.BestMatch',
                                'chatterbot.logic.MathematicalEvaluation'])

# Corpus training
trainerCorpus = ChatterBotCorpusTrainer(chatB)

# Train on the full English corpus
trainerCorpus.train(
    "chatterbot.corpus.english"
)
# Train on the English greetings corpus
trainerCorpus.train("chatterbot.corpus.english.greetings")

# Train on the English conversations corpus
trainerCorpus.train("chatterbot.corpus.english.conversations")

# List training on short custom conversations
trainerConversation = ListTrainer(chatB)

trainerConversation.train([
    'Help!',
    'Please go to google.com',
    'What is Bitcoin?',
    'It is a decentralized digital currency'
])

# You can train with a second list of data to add response variations
trainerConversation.train([
    'What is Bitcoin?',
    'Bitcoin is a cryptocurrency.'
])


Training ai.yml: [####################] 100%
Training botprofile.yml: [####################] 100%
Training computers.yml: [####################] 100%
Training conversations.yml: [####################] 100%
Training emotion.yml: [####################] 100%
Training food.yml: [####################] 100%
Training gossip.yml: [####################] 100%
Training greetings.yml: [####################] 100%
Training health.yml: [####################] 100%
Training history.yml: [####################] 100%
Training humor.yml: [####################] 100%
Training literature.yml: [####################] 100%
Training money.yml: [####################] 100%
Training movies.yml: [####################] 100%
Training politics.yml: [####################] 100%
Training psychology.yml: [####################] 100%
Training science.yml: [####################] 100%
Training sports.yml: [####################] 100%
Training trivia.yml: [####################] 100%
Training greetings.yml: [####################] 
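
The list trainer treats each statement in a list as a response to the statement before it. The pairing can be sketched in plain Python as follows; `conversation_pairs` is a hypothetical helper written for illustration and is not a ChatterBot function.

```python
def conversation_pairs(conversation):
    """Pair each statement with the statement that follows it."""
    return list(zip(conversation, conversation[1:]))


# The first training list from the cell above
conversation = [
    'Help!',
    'Please go to google.com',
    'What is Bitcoin?',
    'It is a decentralized digital currency',
]

for prompt, response in conversation_pairs(conversation):
    print(f"{prompt!r} -> {response!r}")
```

Under this reading, the middle pair links 'Please go to google.com' to 'What is Bitcoin?', so it may be cleaner to train unrelated exchanges as separate lists.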

In [4]:
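def converse(quit="quit"):
    """Chat with chatB on standard input until the quit word is entered."""
    user_input = ""
    while user_input != quit:
        # Default to the quit word so that end-of-input ends the loop
        user_input = quit
        try:
            user_input = input(">")
        except EOFError:
            print(user_input)
        # Drop trailing "!" and "." (safe even if nothing is left)
        user_input = user_input.rstrip("!.")
        if user_input:
            print(chatB.get_response(user_input))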

In [5]:
converse()

>Hi
How are you doing?
>I am doing well.
That is good to hear
>What is 78964 plus 5970
78964 plus 5970 = 84934
>what is a dollar
dollar: unit of currency in the united states.
>What is Bitcoin?
It is a decentralized digital currency
>Help!
Please go to google.com
>Tell me a joke
Did you hear the one about the mountain goats in the andes? It was "ba a a a a a d".
>What is Bitcoin?
It is a decentralized digital currency
>What is Bitcoin?
It is a decentralized digital currency
>What is Bitcoin?
Bitcoin is a cryptocurrency.
>quit
no.


In this example, the chatbot gives a reasonable response to each input. The first two responses come from training on the English greetings and conversations corpora. The responses to "Tell me a joke" and "what is a dollar" come from training on the full English corpus. The computation in response to "What is 78964 plus 5970" is produced by the MathematicalEvaluation logic adapter, which needs no training. The responses to "Help!" and "What is Bitcoin?" come from the customised list training; repeating "What is Bitcoin?" eventually returns the second trained variation.
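
To make the arithmetic response concrete, the sketch below evaluates a phrase such as "78964 plus 5970" in plain Python. It is a simplified stand-in and not ChatterBot's implementation; the function name and the three supported operator words are assumptions made for illustration.

```python
import operator
import re

# Hypothetical mapping from operator words to arithmetic functions
WORD_OPERATORS = {
    "plus": operator.add,
    "minus": operator.sub,
    "times": operator.mul,
}


def evaluate_phrase(text):
    """Find '<int> <operator word> <int>' in the text and evaluate it."""
    match = re.search(r"(\d+)\s+(plus|minus|times)\s+(\d+)", text.lower())
    if match is None:
        return None  # no arithmetic found in this input
    left, word, right = match.groups()
    result = WORD_OPERATORS[word](int(left), int(right))
    return f"{left} {word} {right} = {result}"


print(evaluate_phrase("What is 78964 plus 5970"))
```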

Now that we have a working chatbot built from the default features, we move on to create a chatbot that returns the financial ratios of a company through a customised logic adapter.

<a id='4'></a>
# 4. Data Preparation for customized chatbot

The data prepared in this step is used to train the model behind the custom logic adapter. Details on creating a logic adapter are at https://chatterbot.readthedocs.io/en/stable/logic/create-a-logic-adapter.html. Because the logic adapter needs to be in a separate file from the chatbot, the data preparation is performed in the module **financial_ratio_adapter.py**, where the logic adapter is created.
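
One way to prepare such data is to fill sentence patterns with known companies and ratios and record where each entity sits in the text. The sketch below does this in plain Python; the templates, the two-item lists, and the `(text, {"entities": ...})` layout (modelled on spaCy-style NER training data) are assumptions for illustration, not the contents of financial_ratio_adapter.py.

```python
import itertools

# Hypothetical vocabularies and sentence patterns
COMPANIES = ["Citibank", "Delta"]
RATIOS = ["ROE", "PE"]
TEMPLATES = [
    "What is {ratio} for {company}?",
    "Tell me {ratio} for {company}",
]


def make_example(template, ratio, company):
    """Fill one template and record the character span of each entity."""
    text = template.format(ratio=ratio, company=company)
    # index() finds the first occurrence, which is enough for these templates
    ratio_start = text.index(ratio)
    company_start = text.index(company)
    entities = [
        (ratio_start, ratio_start + len(ratio), "RATIO"),
        (company_start, company_start + len(company), "COMPANY"),
    ]
    return text, {"entities": entities}


# One example per combination of template, ratio, and company
TRAIN_DATA = [
    make_example(template, ratio, company)
    for template, ratio, company in itertools.product(TEMPLATES, RATIOS, COMPANIES)
]

print(len(TRAIN_DATA), "examples, e.g.", TRAIN_DATA[0])
```

A real dataset would need far more companies, ratios, and phrasings than this toy set.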


<a id='5'></a>
# 5. Model construction and training

<a id='5.1'></a>
## 5.1 and 5.2 Model construction and building custom logic adapter
Steps 5.1 and 5.2 are shown in the module **financial_ratio_adapter.py**, because the logic adapter needs to be in a separate file from the chatbot. In the next step we train the chatbot, which now includes the customised logic adapter.

<a id='5.3'></a>
## 5.3. Training the model

In this step we combine all the components (i.e. the preprocessor, the built-in logic adapters, and the list and corpus trainers) with the custom logic adapter (financial_ratio_adapter.FinancialRatioAdapter) that we have created.

In [6]:
# Here we add the custom logic adapter ahead of the built-in ones
chatbot = ChatBot(
    "My ChatterBot",
    preprocessors=['chatterbot.preprocessors.clean_whitespace'],
    logic_adapters=[
        'financial_ratio_adapter.FinancialRatioAdapter',
        'chatterbot.logic.MathematicalEvaluation',
        'chatterbot.logic.BestMatch'
    ]
)

# Bind the corpus trainer to the new chatbot (not to chatB)
trainerCorpus = ChatterBotCorpusTrainer(chatbot)

# Train on the full English corpus
trainerCorpus.train(
    "chatterbot.corpus.english"
)
# Train on the English greetings corpus
trainerCorpus.train("chatterbot.corpus.english.greetings")

# Train on the English conversations corpus
trainerCorpus.train("chatterbot.corpus.english.conversations")

# List training on short custom conversations, also bound to the new chatbot
trainerConversation = ListTrainer(chatbot)

trainerConversation.train([
    'Help!',
    'Please go to google.com',
    'What is Bitcoin?',
    'It is a decentralized digital currency'
])

# You can train with a second list of data to add response variations
trainerConversation.train([
    'What is Bitcoin?',
    'Bitcoin is a cryptocurrency.'
])


Losses {'ner': 250.5925518564882}
Losses {'ner': 86.34306746371182}
Losses {'ner': 9.912364617525238}
Losses {'ner': 0.007054564759577683}
Losses {'ner': 0.002342427745124589}
Losses {'ner': 0.17200641879483095}
Losses {'ner': 0.00014026589302679004}
Losses {'ner': 0.04666429370491898}
Losses {'ner': 0.0005265609668528584}
Losses {'ner': 0.00029906058166727796}
Losses {'ner': 5.9895766629850823e-05}
Losses {'ner': 0.0006064481033172622}
Losses {'ner': 1.0745628683613567e-05}
Losses {'ner': 9.724242475936387e-06}
Losses {'ner': 1.7436667959465367e-06}
Losses {'ner': 5.097320584206234e-07}
Losses {'ner': 1.5063773009800355e-06}
Losses {'ner': 3.463751450599309e-05}
Losses {'ner': 8.846712629581901e-06}
Losses {'ner': 5.9018098284142235e-05}
Losses {'ner': 6.828183680571441e-07}
Losses {'ner': 0.0001549424831125363}
Losses {'ner': 0.00011724383958802145}
Losses {'ner': 2.327508621099159e-06}
Losses {'ner': 2.080900377673051e-05}
Losses {'ner': 6.029163538041867e-07}
Losses {'ner': 1.51605

Training ai.yml: [####################] 100%
Training botprofile.yml: [####################] 100%
Training computers.yml: [####################] 100%
Training conversations.yml: [####################] 100%
Training emotion.yml: [####################] 100%
Training food.yml: [####################] 100%
Training gossip.yml: [####################] 100%
Training greetings.yml: [####################] 100%
Training health.yml: [####################] 100%
Training history.yml: [####################] 100%
Training humor.yml: [####################] 100%
Training literature.yml: [####################] 100%
Training money.yml: [####################] 100%
Training movies.yml: [####################] 100%
Training politics.yml: [####################] 100%
Training psychology.yml: [####################] 100%
Training science.yml: [####################] 100%
Training sports.yml: [####################] 100%
Training trivia.yml: [####################] 100%
Training greetings.yml: [####################] 

As we can see, the training covered not only the FinancialRatioAdapter but also the list and corpus trainers. Let us move on to model testing.
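
With three logic adapters configured, the chatbot has to settle on a single reply. As a rough mental model, each adapter that can handle the input proposes a response with a confidence score, and the most confident proposal is returned. The sketch below shows that idea in plain Python; the toy adapter functions and their scores are hypothetical, and ChatterBot's own selection logic has additional rules not shown here.

```python
def ratio_adapter(text):
    """Toy adapter: fully confident when the input mentions ROE."""
    if "roe" in text.lower().split():
        return 1.0, "(link to the ROE chart)"
    return None  # cannot process this input


def small_talk_adapter(text):
    """Toy adapter: always answers, but with low confidence."""
    return 0.2, "How are you doing?"


def select_response(text, adapters):
    """Return the response that was proposed with the highest confidence."""
    proposals = (adapter(text) for adapter in adapters)
    candidates = [proposal for proposal in proposals if proposal is not None]
    if not candidates:
        return None
    confidence, response = max(candidates, key=lambda candidate: candidate[0])
    return response


adapters = [ratio_adapter, small_talk_adapter]
print(select_response("What is ROE for Citibank ?", adapters))
print(select_response("Hi", adapters))
```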

<a id='6'></a>
# 6. Model Testing and Usage

In [7]:
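def converse(quit="quit"):
    """Chat with chatbot on standard input until the quit word is entered."""
    user_input = ""
    while user_input != quit:
        # Default to the quit word so that end-of-input ends the loop
        user_input = quit
        try:
            user_input = input(">")
        except EOFError:
            print(user_input)
        # Drop trailing "!" and "." (safe even if nothing is left)
        user_input = user_input.rstrip("!.")
        if user_input:
            print(chatbot.get_response(user_input))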

In [13]:
converse()

>What is ROE for Citibank ?
https://www.zacks.com/stock/chart/C/fundamental/return-on-equity-ttm
					  
>Tell me PE for Delta?
https://www.zacks.com/stock/chart/DAL/fundamental/pe-ratio-ttm
					  
>What is Bitcoin?
It is a decentralized digital currency
>Help!
Please go to google.com
>What is 786940 plus 75869
786940 plus 75869 = 862809
>Do you like dogs?
Sorry! Could not figure out what the user wants
>Quit
Sorry! Could not figure out what the user wants
>quit
Sorry! Could not figure out what the user wants


The custom logic adapter for our chatbot looks for a RATIO and a COMPANY in the sentence using our NLP model. If the model finds exactly one COMPANY and exactly one RATIO, it constructs a URL to guide the user. The other logic adapters, such as mathematical evaluation, and the corpus and list training work as expected as well.
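
The URL construction described above can be sketched as follows. The ticker and ratio mappings are taken from the two example responses in the session; the dictionary names, the function name, and the assumption that entities arrive as plain lists of strings are hypothetical, and the actual adapter in financial_ratio_adapter.py may be organised differently.

```python
# Mappings inferred from the two example responses in the session above
TICKERS = {"citibank": "C", "delta": "DAL"}
RATIO_SLUGS = {"roe": "return-on-equity-ttm", "pe": "pe-ratio-ttm"}

FALLBACK = "Sorry! Could not figure out what the user wants"


def build_response(companies, ratios):
    """Build a chart URL when exactly one company and one ratio were found."""
    if len(companies) != 1 or len(ratios) != 1:
        return FALLBACK
    ticker = TICKERS.get(companies[0].lower())
    slug = RATIO_SLUGS.get(ratios[0].lower())
    if ticker is None or slug is None:
        return FALLBACK  # entity recognised but not in the lookup tables
    return f"https://www.zacks.com/stock/chart/{ticker}/fundamental/{slug}"


print(build_response(["Citibank"], ["ROE"]))
print(build_response([], ["PE"]))
```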

**Conclusion**

In this case study, we learned how to build a chatbot in Python using the ChatterBot library. We also learned how to build a custom NLP model based on NER (Named Entity Recognition) and use it in a chatbot.


In order to train a blank model, one must have a substantial training dataset. In this
case study, we looked at patterns available to us and used them to generate training
samples. 

This case study is a demo project, and significant enhancements can be made for each
component to extend it for a wide variety of tasks. Additional preprocessing steps can
be added to have cleaner data to work with. 

Overall, this case study provides an introduction to the main aspects of chatbot development. Although it is a very simple bot, it’s a good starting point for using NLP to create
chatbots.


