# Deep Learning Based Simple CHATBOT

A chatbot is a piece of software that conducts a conversation via auditory or textual methods. 

![nn.png](nn.png)

### Code Flow: 

1. Install Packages
2. Corpus
3. Data Preprocessing
4. Implement fully connected Neural Net model
5. Train fully connected Neural Net model
6. Design the interactive chatbot terminal
7. Launch chatbot terminal

#### 1. Install Packages

In [1]:
# Natural Language Toolkit (NLTK)
import nltk
from nltk.stem.lancaster import LancasterStemmer
stemmer = LancasterStemmer()  # Lancaster stemmer, reduces words to their stems

# TensorFlow
import tensorflow as tf

# Numeric computation and random response selection
import numpy
import random

#### 2. Corpus

1. Intents:
    1. Tag: Target variable (the intent label the model predicts)
    2. Patterns: Training data for the bot (example user utterances)
    3. Responses: Bot replies for that tag


In [2]:
corpus = {'intents': [
    {'tag': 'greeting', 
     'patterns': ['Hi', 'How are you', 'Is anyone there?', 'Hello', 'Good day', 'Whats up'], 
     'responses': ['Hello! \nPlease help me with your location, \nhow can i help you?'], 
     'context_set': ''}, 
    {'tag': 'goodbye', 
     'patterns': ['thank you', 'thanks', 'cya', 'See you later', 'Goodbye', 'I am Leaving', 'Have a Good day'], 
     'responses': ['I hope I was able to assist you, Good Bye'], 
     'context_set': ''}, 
    {'tag': 'location', 
     'patterns': ['i work at bangalore office', 'i sit in bangalore office', 'bangalore', 'India'], 
     'responses': ['I understand you are at Bangalore office, India', 'Got your location @Bangalore'], 
     'context_set': ''}, 
    {'tag': 'name', 
     'patterns': ['what is your name', 'what should I call you', 'Who is this', 'who i am talking to'], 
     'responses': ['You can call me Ramos: your virtual assistant \nHow can I help you ?'], 
     'context_set': ''}, 
    {'tag': 'catalog', 
     'patterns': ['I would like to see your catalog', 'whats on the help menu', 'could i get some information on your catalog'], 
     'responses': ['We support for all access issues related to email and computer'], 
     'context_set': ''}, 
    {'tag': 'message', 
     'patterns': ['not able to access email', 'unable to access email', 'no email access', 'locked out of email'], 
     'responses': ['Please use this LINK [www.emailhelp.com] to get step by step solution. \nThank you !'], 
     'context_set': ''},
    {'tag': 'machine', 
     'patterns': ['not able to access computer', 'unable to access computer', 'no computer access', 'locked out of computer'], 
     'responses': ['Please use this LINK [www.computerhelp.com] to get step by step solution. \nThank you !'], 
     'context_set': ''}, 
    {'tag': 'hours', 
     'patterns': ['when are you guys open', 'what are your hours', 'hours of operation'], 
     'responses': ['We are open 24x7 Monday-Friday! @ Bangalore office'], 
     'context_set': ''}]}
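
Since every intent pairs one tag with a list of responses, the corpus can be turned into a tag-to-responses lookup table. The following is a minimal sketch, assuming a hypothetical two-intent `mini_corpus` with the same shape as the corpus above; `build_response_index` is an illustrative helper, not part of the notebook.

```python
import random

# Hypothetical two-intent corpus with the same shape as the one above.
mini_corpus = {'intents': [
    {'tag': 'greeting',
     'patterns': ['Hi', 'Hello'],
     'responses': ['Hello!'],
     'context_set': ''},
    {'tag': 'hours',
     'patterns': ['what are your hours'],
     'responses': ['We are open 24x7'],
     'context_set': ''},
]}


def build_response_index(corpus):
    """Map each tag to its list of candidate responses."""
    return {intent['tag']: intent['responses'] for intent in corpus['intents']}


response_index = build_response_index(mini_corpus)

# Pick one reply at random for a given tag.
reply = random.choice(response_index['greeting'])
print(reply)
```

A dictionary like this avoids scanning the whole intent list each time a reply is needed.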

#### 3. Data Preprocessing

In [3]:
# # This will solve the broken link issue with nltk using ssl.
# import ssl
# try:
#     _create_unverified_https_context = ssl._create_unverified_context
# except AttributeError:
#     pass
# else:
#     ssl._create_default_https_context = _create_unverified_https_context
# # Download "punkt" if missing
# nltk.download('punkt')

In [4]:
# Extract data
W = [] # Tokens 
L = [] # Identified Tags or Labels
doc_x = [] # Tokenised words
doc_y = [] # Tags or Labels

for intent in corpus['intents']:
    for pattern in intent['patterns']:
        w_temp = nltk.word_tokenize(pattern)
        W.extend(w_temp)
        doc_x.append(w_temp)
        doc_y.append(intent["tag"])
    
    # Add the tag if it is missing from the label list
    if intent['tag'] not in L:
        L.append(intent['tag'])

In [5]:
# Stemming
W = [stemmer.stem(w.lower()) for w in W if w != "?"] # Stemming or learning the root word
W = sorted(list(set(W))) # Sorted words
L = sorted(L) # Sorted list of tags or labels

In [6]:
# Words
W

['a',
 'abl',
 'access',
 'am',
 'anyon',
 'ar',
 'at',
 'bang',
 'cal',
 'catalog',
 'comput',
 'could',
 'cya',
 'day',
 'email',
 'get',
 'good',
 'goodby',
 'guy',
 'hav',
 'hello',
 'help',
 'hi',
 'hour',
 'how',
 'i',
 'in',
 'ind',
 'inform',
 'is',
 'lat',
 'leav',
 'lik',
 'lock',
 'menu',
 'nam',
 'no',
 'not',
 'of',
 'off',
 'on',
 'op',
 'out',
 'see',
 'should',
 'sit',
 'som',
 'talk',
 'thank',
 'the',
 'ther',
 'thi',
 'to',
 'un',
 'up',
 'what',
 'when',
 'who',
 'work',
 'would',
 'yo',
 'you']

In [7]:
# Tags
L

['catalog',
 'goodbye',
 'greeting',
 'hours',
 'location',
 'machine',
 'message',
 'name']

#### Bag Of Words

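The bag-of-words model is a simplifying representation used in natural language processing and information retrieval. In this model, a text is represented as the bag (multiset) of its words, disregarding grammar and word order but keeping multiplicity. This notebook uses a binary variant: each position in the vector records only whether the corresponding vocabulary word occurs in the sentence (1) or not (0).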

![bow.png](bow.png)
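
The difference between a count-based bag of words and a presence-only encoding can be sketched in plain Python. The vocabulary and token list here are hypothetical and kept tiny for illustration; no tokenisation or stemming is applied.

```python
from collections import Counter


def count_bag(tokens, vocabulary):
    """Bag of words that keeps multiplicity: one count per vocabulary word."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def binary_bag(tokens, vocabulary):
    """Presence-only variant: 1 if the word occurs at least once, else 0."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


vocabulary = ['access', 'email', 'no', 'to']  # hypothetical sorted vocabulary
tokens = ['no', 'email', 'no', 'access']      # hypothetical tokenised sentence

print(count_bag(tokens, vocabulary))   # [1, 1, 2, 0]
print(binary_bag(tokens, vocabulary))  # [1, 1, 1, 0]
```

Both vectors have the same length as the vocabulary; they differ only where a word repeats.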

In [9]:
train = [] # Training data for NN
target = [] # Target data for NN

out_empty = [0 for _ in range(len(L))]

# Build a binary bag of words per pattern: 1 if the vocabulary word occurs in it, else 0
for x, doc in enumerate(doc_x):
    bag = []

    w_temp = [stemmer.stem(w.lower()) for w in doc]

    for w in W:
        if w in w_temp:
            bag.append(1)
        else:
            bag.append(0)

    output_row = out_empty[:]
    output_row[L.index(doc_y[x])] = 1

    train.append(bag) # List
    target.append(output_row) # List
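
The target rows built in this loop are one-hot vectors: all zeros except a 1 at the index of the pattern's tag. A minimal standalone sketch of that encoding, assuming a hypothetical three-tag label list and an illustrative `one_hot` helper:

```python
labels = ['catalog', 'goodbye', 'greeting']  # hypothetical subset of the sorted tags


def one_hot(tag, labels):
    """Return a list of zeros with a single 1 at the position of `tag`."""
    row = [0] * len(labels)
    row[labels.index(tag)] = 1
    return row


print(one_hot('goodbye', labels))  # [0, 1, 0]
```

Because the labels are sorted once and reused, the same index maps back to the same tag at prediction time.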

In [11]:
# convert training data and output to numpy arrays

train = numpy.array(train) # List to numpy array
target = numpy.array(target) # List to numpy array

In [12]:
train[:5]

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        

#### 4. Implement fully connected Neural Net model

In [40]:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(8,input_dim=train.shape[1],activation='relu')) # fully connected  layer with 8 hidden neurons
#model.add(tf.keras.layers.BatchNormalization()) # BatchNormalization layer 
model.add(tf.keras.layers.Dense(8,activation='relu')) # fully connected  layer with 8 hidden neurons
#model.add(tf.keras.layers.Dropout(0.2)) # Dropout layer
model.add(tf.keras.layers.Dense(len(L),activation='softmax')) # output layer

model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_14 (Dense)             (None, 8)                 504       
_________________________________________________________________
dense_15 (Dense)             (None, 8)                 72        
_________________________________________________________________
dense_16 (Dense)             (None, 8)                 72        
Total params: 648
Trainable params: 648
Non-trainable params: 0
_________________________________________________________________
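
The parameter counts in the summary can be checked by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit. The sketch below reproduces the numbers, assuming the 62-word vocabulary and 8 tags listed earlier; `dense_params` is an illustrative helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


vocab_size = 62  # length of W shown earlier
n_tags = 8       # length of L shown earlier

layer_params = [
    dense_params(vocab_size, 8),  # first hidden layer
    dense_params(8, 8),           # second hidden layer
    dense_params(8, n_tags),      # softmax output layer
]

print(layer_params, sum(layer_params))  # [504, 72, 72] 648
```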


#### 5. Train fully connected Neural Net model

In [41]:
model.fit(train, target, epochs=200, batch_size=8, verbose=1)

Train on 35 samples
Epoch 1/200
...
Epoch 200/200


<tensorflow.python.keras.callbacks.History at 0x13c4cc6a0>

#### 6. Design the interactive chatbot terminal

Chatbot Procedure:
1. Run the chat function
2. Input user's string
3. Convert it into Bag Of Words
4. Call the trained Neural Net model to make a prediction
5. Get the tag with the highest probability (softmax output)
6. Output a response for the chosen tag (see the corpus)
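
Step 5 of the procedure above amounts to an argmax over the softmax output. A minimal sketch in plain Python, assuming a hypothetical three-tag label list and a hand-written probability row standing in for one row of the model's prediction; `top_tag` is an illustrative helper.

```python
def top_tag(probabilities, tags):
    """Return (tag, probability) for the class with the highest probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return tags[best_index], probabilities[best_index]


tags = ['goodbye', 'greeting', 'hours']  # hypothetical sorted tags
probabilities = [0.05, 0.90, 0.05]       # stand-in for one softmax output row

tag, confidence = top_tag(probabilities, tags)
print(tag, confidence)  # greeting 0.9
```

Returning the probability alongside the tag makes it possible to inspect how confident the model was, although the chatbot in this notebook uses only the tag.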

In [44]:
def bag_of_words(s, W):
    bag = [0 for _ in range(len(W))]

    s_words = nltk.word_tokenize(s)
    s_words = [stemmer.stem(word.lower()) for word in s_words]

    for se in s_words:
        for i, w in enumerate(W):
            if w == se:
                bag[i] = 1
            
    return numpy.array(bag)


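def chat():
    print("Chat with NNChatbot (type: stop to quit)")
    print("If answer is not right (type: *)")
    while True:
        inp = input("\n\nYou: ")
        if inp.strip() == "*":
            # The user flagged the last answer as wrong; ask again without predicting
            print("BOT: Please rephrase your question and try again")
            continue
        if inp.lower() in ("stop", "quit"):
            # Accept the advertised "stop" as well as "quit"
            break
        # Encode the input as a 1 x len(W) bag-of-words row
        input_q = bag_of_words(inp, W).reshape(1, -1)
        results = model.predict(input_q)
        results_index = numpy.argmax(results)  # index of the most probable tag
        tag = L[results_index]

        # Look up the responses registered for the predicted tag
        for tg in corpus["intents"]:
            if tg['tag'] == tag:
                responses = tg['responses']

        print(random.choice(responses))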

In [45]:
chat()

Chat with NNChatbot (type: stop to quit)
If answer is not right (type: *)


You: hello
Hello! 
Please help me with your location, 
how can i help you?


You: bangalore
I understand you are at Bangalore office, India


You: I would like to see your catalog
We support for all access issues related to email and computer


You: unable to access email
Please use this LINK [www.emailhelp.com] to get step by step solution. 
Thank you !


You: when are you guys open
We are open 24x7 Monday-Friday! @ Bangalore office


You: Goodbye
I hope I was able to assist you, Good Bye


You: quit
