# Digital Assistant: chat-bots

Chatbots are virtual assistants that help customers transact or solve problems. These automated programmes use NLP to interact with clients in natural language (by text or voice), and use machine learning algorithms to improve over time. Chatbots are being introduced by a range of financial services firms, often in their mobile apps or on social media. While many are still in the trial phase, there is potential for growth as chatbots gain usage and become more sophisticated. The current generation of chatbots used by financial services firms is simple, generally providing balance information or alerts to customers, or answering simple questions. It is worth observing that the increasing usage of chatbots is correlated with the increased usage of messaging applications.

Chatbots can be divided into two categories:

* **Rule-Based Approach** – In this approach, a bot is trained according to rules. Based on this a bot can answer simple queries but sometimes fails to answer complex queries.

* **Self-Learning Approach** – These bots follow the machine learning approach which is rather more efficient and is further divided into two more categories.

    * Retrieval-Based Models – In this approach, the bot retrieves the best response from a list of responses according to the user input.
    
    * Generative Models – These models generate answers rather than selecting from a fixed set of answers, which makes them more flexible than retrieval-based bots.
    
In this case study, we will touch upon both approaches to chatbot development.
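To make the difference between the rule-based and retrieval-based approaches concrete, here is a minimal sketch using only the Python standard library. The rules, the stored question and answer pairs, and the function names are illustrative assumptions and are not part of ChatterBot.

```python
import re
from difflib import SequenceMatcher

# Rule-based: hand-written patterns mapped to fixed replies (illustrative rules).
RULES = [
    (re.compile(r"\b(hi|hello)\b", re.IGNORECASE), "Hello! How can I help?"),
    (re.compile(r"\bbalance\b", re.IGNORECASE), "Your balance is shown in the app."),
]


def rule_based_reply(text):
    """Return the reply of the first matching rule, or None if no rule matches."""
    for pattern, reply in RULES:
        if pattern.search(text):
            return reply
    return None


# Retrieval-based: pick the stored question that is most similar to the input.
KNOWN_PAIRS = {
    "What is Bitcoin?": "It is a decentralized digital currency",
    "How do I reset my password?": "Use the 'Forgot password' link.",
}


def retrieval_reply(text):
    """Return the response paired with the closest known question."""
    def similarity(question):
        return SequenceMatcher(None, text.lower(), question.lower()).ratio()

    best_question = max(KNOWN_PAIRS, key=similarity)
    return KNOWN_PAIRS[best_question]


print(rule_based_reply("Hi there"))
print(rule_based_reply("Explain derivatives"))
print(retrieval_reply("what is bitcoin"))
```

The rule-based function has no answer for anything outside its patterns, while the retrieval function always returns the closest stored answer, even when the match is poor. A real retrieval bot would normally also apply a minimum similarity threshold.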

The focus of this case study is as follows:
* Understand and build a chatbot
* Understand the data preparation required for building a chatbot
* Understand the Python packages required for chatbot development

## Content

* [1. Problem Definition](#1)
* [2. Getting Started - Load Libraries and Dataset](#2)
    * [2.1. Load Libraries](#2.1) 
* [3. Training a default chatbot ](#3)  
* [4. Data Preparation for customized chatbot](#4)
* [5. Model Construction and Training](#5)
    * [5.1. Model Construction](#5.1)
    * [5.2. Building Custom Logic Adapter](#5.2)
    * [5.3. Training the model](#5.3)       
* [6. Model Testing and Usage](#6)

<a id='1'></a>
# 1. Problem Definition

The problem statement is to build a chatbot that uses NLP to understand user inputs and intent, and retrieves the financial ratio for the company the user is asking about.

<a id='2'></a>
# 2. Getting Started - Loading the data and Python packages


<a id='2.1'></a>
## 2.1. Loading the python packages
For this case study we will use two text-based libraries: spaCy and ChatterBot. spaCy is a Python library that makes it easier to analyze text and build custom natural language models. ChatterBot is a Python library for creating a simple chatbot with minimal programming.

In [1]:
# Load libraries
from chatterbot import ChatBot
from chatterbot.logic import LogicAdapter
from chatterbot.trainers import ChatterBotCorpusTrainer
from chatterbot.trainers import ListTrainer

In [2]:
# Disable the warnings
import warnings
warnings.filterwarnings('ignore')

Before we move to the customised chatbot, let us develop a chatbot using the default features and logic adapters of the ChatterBot package.

<a id='3'></a>
# 3. Training a default chatbot

Before building a chatbot with a customised function, we train one using the utilities available in ChatterBot. ChatterBot and many other chatbot packages come with a data utility module that can be used to train chatbots.

Following is a simple example to get started with ChatterBot in Python, using the following components.
* **preprocessors** : ChatterBot’s preprocessors are simple functions that modify the input statement a chatbot receives before the statement is processed by the logic adapter. Preprocessors can be customised to perform steps such as tokenization and lemmatisation, so that clean, processed data is available for further processing. In the example below, the default preprocessor for cleaning white space, "clean_whitespace", is used.

* **logic_adapters** : Logic adapters determine how ChatterBot selects a response to a given input statement. Any number of logic adapters can be supplied for the bot to use. In the example below we use two built-in adapters: "Best Match", which returns the best known response, and "Mathematical Evaluation", which performs mathematical computation. In a later section we will build our own customised logic adapter, trained using machine learning to perform a specific task.

* **corpus training** : ChatterBot comes with corpus data and a utility module that make it easy to quickly train the bot to communicate. We use the existing corpora english, english.greetings and english.conversations to train the chatbot. The chatbot can also be trained on a customised corpus.


* **list training** : As with corpus training, we can train the chatbot on conversations supplied as lists using ListTrainer. In the example below, we train the chatbot on a few sample exchanges defined in the code. The chatbot can also be trained on a much larger set of conversations.
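As an illustration of the preprocessor idea described above, the sketch below shows what a custom preprocessor could look like: a plain function that receives a statement, modifies its text and returns it. The `Statement` class here is a minimal stand-in so that the snippet runs without ChatterBot installed, and `strip_trailing_punctuation` is a hypothetical name, not a function shipped with the library.

```python
import re
from dataclasses import dataclass


@dataclass
class Statement:
    """Minimal stand-in for ChatterBot's statement object (text field only)."""
    text: str


def strip_trailing_punctuation(statement):
    """Hypothetical preprocessor: collapse whitespace, drop trailing '!' or '.'."""
    statement.text = re.sub(r"\s+", " ", statement.text).strip()
    statement.text = statement.text.rstrip("!.")
    return statement


cleaned = strip_trailing_punctuation(Statement("  What   is Bitcoin?!  "))
print(cleaned.text)
```

With ChatterBot itself, such a function would presumably be registered by adding its dotted import path to the `preprocessors` list, in the same way as `chatterbot.preprocessors.clean_whitespace`.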


In [3]:
chatB = ChatBot("Trader",
                preprocessors=['chatterbot.preprocessors.clean_whitespace'],
                logic_adapters=['chatterbot.logic.BestMatch',
                                'chatterbot.logic.MathematicalEvaluation'])

# Corpus Training
trainerCorpus = ChatterBotCorpusTrainer(chatB)

#Train based on English Corpus
trainerCorpus.train(
    "chatterbot.corpus.english"
)
# Train based on english greetings corpus
trainerCorpus.train("chatterbot.corpus.english.greetings")

# Train based on the english conversations corpus
trainerCorpus.train("chatterbot.corpus.english.conversations")

# List training: train on sample conversations
trainerConversation = ListTrainer(chatB)

trainerConversation.train([
    'Help!',
    'Please go to google.com',
    'What is Bitcoin?',
    'It is a decentralized digital currency'
])

# You can train with a second list of data to add response variations
trainerConversation.train([
    'What is Bitcoin?',
    'Bitcoin is a cryptocurrency.'
])


Training ai.yml: [                    ] 1%

[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\tatsa\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\tatsa\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Training ai.yml: [####################] 100%
Training botprofile.yml: [####################] 100%
Training computers.yml: [####################] 100%
Training conversations.yml: [####################] 100%
Training emotion.yml: [####################] 100%
Training food.yml: [####################] 100%
Training gossip.yml: [####################] 100%
Training greetings.yml: [####################] 100%
Training health.yml: [####################] 100%
Training history.yml: [####################] 100%
Training humor.yml: [####################] 100%
Training literature.yml: [####################] 100%
Training money.yml: [####################] 100%
Training movies.yml: [####################] 100%
Training politics.yml: [####################] 100%
Training psychology.yml: [####################] 100%
Training science.yml: [####################] 100%
Training sports.yml: [####################] 100%
Training trivia.yml: [####################] 100%
Training greetings.yml: [####################] 

In [4]:
def converse(quit="quit"):
    user_input = ""
    while user_input != quit:
        user_input = quit
        try:
            user_input = input(">")
        except EOFError:
            print(user_input)
        # Strip trailing "!" and "." (rstrip is safe even if nothing else remains)
        user_input = user_input.rstrip("!.")
        if user_input:
            print(chatB.get_response(user_input))

In [5]:
converse()

>Hi
How are you doing?
>I am doing well.
That is good to hear
>What is 78964 plus 5970
78964 plus 5970 = 84934
>what is a dollar
dollar: unit of currency in the united states.
>What is Bitcoin?
It is a decentralized digital currency
>Help!
Please go to google.com
>Tell me a joke
Did you hear the one about the mountain goats in the andes? It was "ba a a a a a d".
>What is Bitcoin?
It is a decentralized digital currency
>What is Bitcoin?
It is a decentralized digital currency
>What is Bitcoin?
Bitcoin is a cryptocurrency.
>quit
no.


In this example, we see a fairly good chatbot that responds sensibly to the inputs we have given. The first two responses are due to the training on the English greetings and conversations corpora. The responses to "Tell me a joke" and "what is a dollar" are due to the training on the English corpus. The computation in the fourth line is the result of the Mathematical Evaluation logic adapter rather than of any training. The responses to "Help!" and "What is Bitcoin?" are the result of the customised list trainer; the latter returns either of the two trained answers because the second training list added a response variation.

Now that we have a working chatbot built from the default components, we move on to create a chatbot designed to give us the financial ratios of a company using a customised logic adapter.

<a id='4'></a>
# 4. Data Preparation for customized chatbot

The purpose of data preparation is to produce the data used to train the model inside the logic adapter. Details on creating a logic adapter are at https://chatterbot.readthedocs.io/en/stable/logic/create-a-logic-adapter.html. Because the logic adapter needs to be in a separate file from the chatbot, the data preparation step is performed in the module financial_ratio_adapter.py, where the logic adapter is created.
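Since financial_ratio_adapter.py is not reproduced here, the following is only a sketch of how training samples for an entity recogniser could be generated from sentence patterns. The company names, ratio names and templates are illustrative assumptions. The `(text, {"entities": [...]})` layout with character offsets mirrors the annotation style commonly used for spaCy NER training; whether the module uses exactly this layout is also an assumption.

```python
from itertools import product

# Illustrative vocabulary and patterns; the real lists may differ.
COMPANIES = ["Citibank", "Delta"]
RATIOS = ["ROE", "PE"]
TEMPLATES = ["What is {ratio} for {company}?", "Tell me {ratio} for {company}"]


def make_sample(template, company, ratio):
    """Fill a template and record the character offsets of each entity."""
    text = template.format(company=company, ratio=ratio)
    entities = []
    for value, label in ((ratio, "RATIO"), (company, "COMPANY")):
        # index() finds the first occurrence; the templates are chosen so
        # that no entity value also appears in the fixed template text.
        start = text.index(value)
        entities.append((start, start + len(value), label))
    return text, {"entities": entities}


train_data = [make_sample(t, c, r) for t, c, r in product(TEMPLATES, COMPANIES, RATIOS)]
print(len(train_data))
print(train_data[0])
```

Generating samples from a handful of patterns keeps the labelling effort small, at the cost of a model that may generalise poorly to phrasings outside those patterns.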


<a id='5'></a>
# 5. Model construction and training

<a id='5.1'></a>
## 5.1 and 5.2. Model construction and building custom logic adapter
Steps 5.1 and 5.2 are shown in the module financial_ratio_adapter.py, since the logic adapter needs to be in a separate file from the chatbot. In the next step we train the chatbot, which includes training the model used by the customised logic adapter.

<a id='5.3'></a>
## 5.3. Training the model

In this step we combine the preprocessor, the built-in logic adapters, and the list and corpus trainers with the custom logic adapter (financial_ratio_adapter.FinancialRatioAdapter) that we have created.
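When several logic adapters are combined, the bot has to decide whose answer to return. The sketch below is a simplified model of that choice, not ChatterBot's actual code: each adapter either declines or returns a reply with a confidence score, and the reply with the highest confidence wins, with earlier adapters taking priority on ties. The two toy adapters and all function names are assumptions made for illustration.

```python
def math_adapter(text):
    """Toy stand-in for a maths adapter: handles only 'A plus B'."""
    words = text.split()
    if (len(words) == 3 and words[1] == "plus"
            and words[0].isdigit() and words[2].isdigit()):
        return f"{text} = {int(words[0]) + int(words[2])}", 1.0
    return None  # this adapter cannot process the input


def fallback_adapter(text):
    """Toy stand-in for a best-match adapter with a low-confidence default."""
    return "I am not sure.", 0.1


def select_response(text, adapters):
    """Return the highest-confidence reply; earlier adapters win ties."""
    best = None
    for adapter in adapters:
        result = adapter(text)
        if result is not None and (best is None or result[1] > best[1]):
            best = result
    return best[0] if best else None


print(select_response("2 plus 3", [math_adapter, fallback_adapter]))
print(select_response("hello", [math_adapter, fallback_adapter]))
```

Under this model, a custom adapter only takes over when it reports a higher confidence than the built-in ones, so the confidence it assigns to its own answers matters as much as the answers themselves.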

In [6]:
# Build the chatbot with the custom logic adapter alongside the built-in ones
chatbot = ChatBot(
    "My ChatterBot",
    preprocessors=['chatterbot.preprocessors.clean_whitespace'],
    logic_adapters=[
        'financial_ratio_adapter.FinancialRatioAdapter',
        'chatterbot.logic.MathematicalEvaluation',
        'chatterbot.logic.BestMatch'
    ]
)

# Corpus training: bind the trainer to the new bot instead of reusing the one tied to chatB
trainerCorpus = ChatterBotCorpusTrainer(chatbot)

# Train based on the English corpus
trainerCorpus.train(
    "chatterbot.corpus.english"
)
# Train based on the English greetings corpus
trainerCorpus.train("chatterbot.corpus.english.greetings")

# Train based on the English conversations corpus
trainerCorpus.train("chatterbot.corpus.english.conversations")

# List training: train the new bot on sample conversations
trainerConversation = ListTrainer(chatbot)

trainerConversation.train([
    'Help!',
    'Please go to google.com',
    'What is Bitcoin?',
    'It is a decentralized digital currency'
])

# You can train with a second list of data to add response variations
trainerConversation.train([
    'What is Bitcoin?',
    'Bitcoin is a cryptocurrency.'
])


Losses {'ner': 250.5925518564882}
Losses {'ner': 86.34306746371182}
Losses {'ner': 9.912364617525238}
Losses {'ner': 0.007054564759577683}
Losses {'ner': 0.002342427745124589}
Losses {'ner': 0.17200641879483095}
Losses {'ner': 0.00014026589302679004}
Losses {'ner': 0.04666429370491898}
Losses {'ner': 0.0005265609668528584}
Losses {'ner': 0.00029906058166727796}
Losses {'ner': 5.9895766629850823e-05}
Losses {'ner': 0.0006064481033172622}
Losses {'ner': 1.0745628683613567e-05}
Losses {'ner': 9.724242475936387e-06}
Losses {'ner': 1.7436667959465367e-06}
Losses {'ner': 5.097320584206234e-07}
Losses {'ner': 1.5063773009800355e-06}
Losses {'ner': 3.463751450599309e-05}
Losses {'ner': 8.846712629581901e-06}
Losses {'ner': 5.9018098284142235e-05}
Losses {'ner': 6.828183680571441e-07}
Losses {'ner': 0.0001549424831125363}
Losses {'ner': 0.00011724383958802145}
Losses {'ner': 2.327508621099159e-06}
Losses {'ner': 2.080900377673051e-05}
Losses {'ner': 6.029163538041867e-07}
Losses {'ner': 1.51605

[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\tatsa\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\tatsa\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Training ai.yml: [####################] 100%
Training botprofile.yml: [####################] 100%
Training computers.yml: [####################] 100%
Training conversations.yml: [####################] 100%
Training emotion.yml: [####################] 100%
Training food.yml: [####################] 100%
Training gossip.yml: [####################] 100%
Training greetings.yml: [####################] 100%
Training health.yml: [####################] 100%
Training history.yml: [####################] 100%
Training humor.yml: [####################] 100%
Training literature.yml: [####################] 100%
Training money.yml: [####################] 100%
Training movies.yml: [####################] 100%
Training politics.yml: [####################] 100%
Training psychology.yml: [####################] 100%
Training science.yml: [####################] 100%
Training sports.yml: [####################] 100%
Training trivia.yml: [####################] 100%
Training greetings.yml: [####################] 

As we can see, training covered not only the FinancialRatioAdapter but also the list and corpus trainers. Let us move on to model testing.

<a id='6'></a>
# 6. Model Testing and Usage

In [7]:
def converse(quit="quit"):
    user_input = ""
    while user_input != quit:
        user_input = quit
        try:
            user_input = input(">")
        except EOFError:
            print(user_input)
        # Strip trailing "!" and "." (rstrip is safe even if nothing else remains)
        user_input = user_input.rstrip("!.")
        if user_input:
            print(chatbot.get_response(user_input))

In [13]:
converse()

>What is ROE for Citibank ?
https://www.zacks.com/stock/chart/C/fundamental/return-on-equity-ttm
					  
>Tell me PE for Delta?
https://www.zacks.com/stock/chart/DAL/fundamental/pe-ratio-ttm
					  
>What is Bitcoin?
It is a decentralized digital currency
>Help!
Please go to google.com
>What is 786940 plus 75869
786940 plus 75869 = 862809
>Do you like dogs?
Sorry! Could not figure out what the user wants
>Quit
Sorry! Could not figure out what the user wants
>quit
Sorry! Could not figure out what the user wants


The custom logic adapter for our ChatterBot looks for a RATIO and a COMPANY in the sentence using our NLP model. If the model finds exactly one COMPANY and exactly one RATIO, it constructs a URL to guide the user. The other logic adapters, such as Mathematical Evaluation, and the corpus and list trainers work as expected as well.
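The "exactly one COMPANY and exactly one RATIO" rule can be sketched as follows. This is not the code from financial_ratio_adapter.py: simple keyword matching stands in for the trained NER model, and the lookup tables are hypothetical. Only the two tickers, the two URL endings and the fallback message are taken from the sample conversation above.

```python
# Hypothetical lookup tables; the dictionary-based design is an assumption.
TICKERS = {"citibank": "C", "delta": "DAL"}
RATIO_SLUGS = {"roe": "return-on-equity-ttm", "pe": "pe-ratio-ttm"}
FALLBACK = "Sorry! Could not figure out what the user wants"


def find_entities(text):
    """Keyword matching as a stand-in for the trained NER model."""
    words = [w.strip("?!.,").lower() for w in text.split()]
    companies = [w for w in words if w in TICKERS]
    ratios = [w for w in words if w in RATIO_SLUGS]
    return companies, ratios


def ratio_url(text):
    """Build a URL only when exactly one company and one ratio are found."""
    companies, ratios = find_entities(text)
    if len(companies) == 1 and len(ratios) == 1:
        return ("https://www.zacks.com/stock/chart/"
                f"{TICKERS[companies[0]]}/fundamental/{RATIO_SLUGS[ratios[0]]}")
    return FALLBACK


print(ratio_url("What is ROE for Citibank ?"))
print(ratio_url("Do you like dogs?"))
```

Requiring exactly one entity of each type keeps the behaviour predictable: an ambiguous question with two companies or no ratio falls through to the fallback message instead of producing a guessed URL.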

**Conclusion**

In this case study, we have learned how to make a chatbot in Python using the ChatterBot library. We learnt how to build a custom NLP model and use it in a chatbot.

The chatbot understands the intent of your messages with the help of NLP and either holds a successful conversation or retrieves the relevant information. NLP and ML are used to parse user messages, collect relevant parameters from words and sentences, and map those to actions to take.

In order to train a blank model, one must have a substantial training dataset. In this case study, we looked at the patterns available to us and used them to generate training samples. Getting the right amount of training data is usually the hardest part of constructing a custom model.

Using the ChatterBot library in Python allows us to build a simple interface to resolve user inputs.

Significant enhancements can be made to each component, depending on the specific tasks required of the chatbot. Additional preprocessing steps can be added to produce cleaner data. To generate a response from our bot for input questions, the logic can be refined further to incorporate the concept of text similarity. The chatbot can be trained on a bigger dataset using more advanced ML techniques. A series of custom logic adapters can be used to construct a more sophisticated ChatterBot. This can be generalized to more interesting tasks such as retrieving information from a database or asking the user for more input.

Nevertheless, this case study provides an introduction to all the aspects of chatbot development. Although it is a very simple bot, it is a good starting point for using NLP to create chatbots.


