# Final Project

https://data-flair.training/blogs/python-chatbot-project/

Notes

Retrieval-based chatbots: predefined inputs and responses, with a heuristic (a rule-of-thumb, trial & error way of thinking) choosing which response to use
- simplifies complex problems
- "shortcuts to help make quick decisions"
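
A minimal sketch of the retrieval idea above, assuming a hypothetical keyword table (`RESPONSES`, `FALLBACK`, and `retrieve_response` are illustrative names, not from the tutorial). The heuristic here is plain keyword overlap; the notebook below uses a neural network classifier for that step instead.

```python
# Hypothetical pattern/response table; a real bot would load this from a file.
RESPONSES = {
    "greeting": (["hello", "hi", "hey"], "Hello! How can I help?"),
    "goodbye": (["bye", "goodbye"], "See you later."),
}
FALLBACK = "Sorry, I did not understand that."


def retrieve_response(message):
    """Return the canned response whose keywords overlap the message the most."""
    tokens = set(message.lower().split())
    best_tag, best_overlap = None, 0
    for tag, (keywords, _response) in RESPONSES.items():
        overlap = len(tokens & set(keywords))
        if overlap > best_overlap:
            best_tag, best_overlap = tag, overlap
    # No keyword matched at all: fall back instead of guessing.
    return RESPONSES[best_tag][1] if best_tag else FALLBACK


print(retrieve_response("hi there"))
print(retrieve_response("what is the weather"))
```

The bot never writes new text; it only picks among answers that already exist, which is what makes it retrieval-based.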

Generative Based Chatbots: based on deep neural networks
- "transform input into output"
- same idea as machine translation

Training requires the libraries Keras and NLTK
- Keras: basically library for neural networks
- NLTK: breaking down and analyzing text

In [1]:
# !pip install tensorflow
# !pip install keras
# %pip install nltk
# pickle is part of the Python standard library, so there is nothing to pip install
# !pip install --user -U nltk

In [2]:
# %pip install pickle

In [3]:
import nltk
from nltk.stem import WordNetLemmatizer
import pickle
import json

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.optimizers import SGD
import random
print("successful imports")

successful imports


In [4]:
lemmatizer = WordNetLemmatizer() # reduces words to their dictionary base form ("cats" -> "cat"; "better" -> "good" only when pos="a" is passed)

In [5]:
words=[]
classes = []
documents = []
ignore_words = ['?', '!']
# read the intents file and close it again once it is parsed
with open('intents.json') as data_file:
    intents = json.load(data_file)
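
The real `intents.json` is not shown in these notes. Judging by the keys the code reads (`intents`, `tag`, `patterns`, `responses`), its layout is probably close to the hypothetical two-intent sample below; the sample text itself is made up for illustration.

```python
import json

# Hypothetical file content; only the key names are taken from the notebook code.
sample_json = """
{
  "intents": [
    {"tag": "greeting",
     "patterns": ["Hi there", "Hello"],
     "responses": ["Hello!", "Good to see you"]},
    {"tag": "goodbye",
     "patterns": ["Bye", "See you later"],
     "responses": ["Goodbye!"]}
  ]
}
"""

sample_intents = json.loads(sample_json)

# Pair every example pattern with the tag of the intent it belongs to.
pairs = [
    (pattern, intent["tag"])
    for intent in sample_intents["intents"]
    for pattern in intent["patterns"]
]
print(pairs)
```

Each pattern becomes one training example and each tag becomes one class, so the patterns are what the model learns from and the responses are only used at reply time.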

In [6]:
nltk.download('punkt') # tokenizer data from NLTK that word_tokenize needs in order to work

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\attic\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

In [7]:
for intent in intents['intents']:
    for pattern in intent['patterns']:
        #tokenize each word
        w = nltk.word_tokenize(pattern)
        words.extend(w)
        #add documents in the corpus
        documents.append((w, intent['tag']))
        # add to our classes list
        if intent['tag'] not in classes:
            classes.append(intent['tag'])

In [8]:
# lemmatize, lower each word and remove duplicates
words = [lemmatizer.lemmatize(w.lower()) for w in words if w not in ignore_words]
words = sorted(list(set(words)))
# sort classes
classes = sorted(list(set(classes)))
# documents = combination between patterns and intents
print (len(documents), "documents")
# classes = intents
print (len(classes), "classes", classes)
# words = all words, vocabulary
print (len(words), "unique lemmatized words", words)
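# save vocabulary and classes; the with-blocks close each file after writing
with open('words.pkl','wb') as f:
    pickle.dump(words, f)
with open('classes.pkl','wb') as f:
    pickle.dump(classes, f)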

52 documents
11 classes ['adverse_drug', 'blood_pressure', 'blood_pressure_search', 'goodbye', 'greeting', 'greeting_extra', 'hospital_search', 'options', 'pharmacy_search', 'thanks', 'whoami']
91 unique lemmatized words ["'s", ',', 'a', 'adverse', 'all', 'anyone', 'are', 'awesome', 'be', 'behavior', 'blood', 'by', 'bye', 'can', 'causing', 'chatbot', 'chatting', 'check', 'could', 'data', 'day', 'detail', 'do', 'dont', 'drug', 'entry', 'find', 'for', 'give', 'good', 'goodbye', 'have', 'hello', 'help', 'helpful', 'helping', 'hey', 'hi', 'history', 'hola', 'hospital', 'how', 'human', 'i', 'id', 'is', 'later', 'list', 'load', 'locate', 'log', 'looking', 'lookup', 'management', 'me', 'module', 'nearby', 'next', 'nice', 'of', 'offered', 'open', 'patient', 'pharmacy', 'pressure', 'provide', 'reaction', 'related', 'result', 'search', 'searching', 'see', 'show', 'suitable', 'support', 'task', 'thank', 'thanks', 'that', 'there', 'till', 'time', 'to', 'transfer', 'up', 'want', 'what', 'which', 'w

In [9]:
# create our training data
training = []
# create an empty array for our output
output_empty = [0] * len(classes)

In [10]:
for doc in documents:
    # initialize our bag of words
    bag = []
    # list of tokenized words for the pattern
    pattern_words = doc[0]
    # lemmatize each word - create base word, in attempt to represent related words
    pattern_words = [lemmatizer.lemmatize(word.lower()) for word in pattern_words]
    # create our bag of words array with 1, if word match found in current pattern
    for w in words:
        bag.append(1 if w in pattern_words else 0)
    # output is a '0' for each tag and '1' for current tag (for each pattern)
    output_row = list(output_empty)
    output_row[classes.index(doc[1])] = 1
    training.append([bag, output_row])

In [11]:
print(training[0][0])
print(training[0][1])
print("----------------------")
print(training[1][0])
print(training[1][1])


[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
----------------------
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
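
The two kinds of rows printed above (a bag of words and a one-hot label) can be reproduced on a toy vocabulary. This is a sketch with made-up data; `encode_bag` and `encode_label` are illustrative helper names, not functions from the notebook.

```python
def encode_bag(vocabulary, tokens):
    """Return a 0/1 list marking which vocabulary words occur in tokens."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


def encode_label(class_names, tag):
    """Return a one-hot list with a 1 at the index of tag."""
    row = [0] * len(class_names)
    row[class_names.index(tag)] = 1
    return row


# Toy vocabulary and classes, kept sorted like in the notebook.
toy_words = ["bye", "hello", "hi", "there"]
toy_classes = ["goodbye", "greeting"]

print(encode_bag(toy_words, ["hi", "there"]))   # [0, 0, 1, 1]
print(encode_label(toy_classes, "greeting"))    # [0, 1]
```

The bag always has one slot per vocabulary word and the label one slot per class, which is why the two lists in each training row have different lengths.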


Original example had a ValueError in `training = np.array(training)`: every row pairs a bag of `len(words)` numbers with a label of `len(classes)` numbers, so the shape of the array is no longer uniform after 2 dimensions.

Attempt 1: getting the lists inside the training list manually


In [12]:
# X_train = list(training[0:,0])
# print(training[0])
# print(training[1])

# print(training[:,0])


Attempt 2: passing `dtype='object'` to `training = np.array(training)` ✅

In [13]:
# shuffle our features and turn into np.array
random.shuffle(training)
training = np.array(training, dtype='object') 
# split into training inputs and targets. X - bag-of-words patterns, y - one-hot intents

X_train = list(training[:,0])
y_train = list(training[:,1])

# X_train = training[:,0]
# y_train = training[:,1]

print("Training data created")

Training data created
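
The shape problem and the `dtype='object'` workaround can be reproduced on two toy rows. This is a sketch with made-up data. As far as I know, recent NumPy versions raise the ValueError for ragged input while older ones only warned, so the error is caught instead of relied on.

```python
import numpy as np

# Each row pairs a 4-element bag with a 2-element label, so the nesting is ragged.
toy_training = [
    [[0, 1, 0, 1], [1, 0]],
    [[1, 0, 0, 0], [0, 1]],
]

try:
    np.array(toy_training)
    ragged_error = None
except ValueError as exc:  # raised by recent NumPy versions
    ragged_error = exc

# dtype=object stops NumPy at the level where the lengths disagree: shape (2, 2).
as_objects = np.array(toy_training, dtype=object)
X_toy = np.array(list(as_objects[:, 0]))  # shape (2, 4)
y_toy = np.array(list(as_objects[:, 1]))  # shape (2, 2)

# Alternative that avoids the object array entirely.
X_alt = np.array([bag for bag, _label in toy_training])

print(as_objects.shape, X_toy.shape, y_toy.shape)
```

Splitting with a list comprehension gives the same inputs without the intermediate object array, which may be easier to reason about.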


Attempt at playing with reshaping data ❌
- not needed as we get the input_shape parameters from X_train length directly

In [14]:
# TODO play with reshaping
# print(np.array(X_train).shape)
# print("-----")
# print(np.array(y_train).shape)

# X_train=X_train.reshape(-1,47)


In [15]:
# Create model - 3 layers. First layer 128 neurons, second layer 64 neurons and 3rd output layer contains number of neurons
# equal to number of intents to predict output intent with softmax
model = Sequential()
model.add(Dense(128, input_shape=(len(X_train[0]),), activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(len(y_train[0]), activation='softmax'))

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


`hist = model.fit(X_train, y_train, epochs=200, batch_size=5, verbose=1)` raised a ValueError:

"unrecognized datatype"
- fixed by attempt 2 above (`dtype='object'`)

New error: The requested array has an inhomogeneous shape after 2 dimensions
- also fixed once attempt 2 was working properly

`hist = model.fit(np.array(X_train), np.array(y_train), epochs=200, batch_size=5, verbose=1)` then failed because the shape Keras expected didn't match the input shape
- fixed by setting the input shape to `len(X_train[0])` instead of a hard-coded integer (47).
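
Why the input width matters: the first Dense layer's weight matrix has one row per input feature, so a layer built for 47 inputs cannot take bags of a different length. The sketch below counts weights and biases for the 128/64/output stack, using the 91 words and 11 classes printed earlier. `dense_parameter_count` is an illustrative helper, not a Keras function, and Dropout layers are left out because they have no weights.

```python
def dense_parameter_count(input_width, layer_widths):
    """Count weights plus biases for a stack of fully connected layers."""
    total = 0
    previous = input_width
    for width in layer_widths:
        total += previous * width + width  # weight matrix plus one bias per unit
        previous = width
    return total


vocabulary_size = 91  # "91 unique lemmatized words" in the earlier output
class_count = 11      # "11 classes" in the earlier output

print(dense_parameter_count(vocabulary_size, [128, 64, class_count]))  # 20747
```

Reading the width from `len(X_train[0])` keeps the model in step with the vocabulary whenever `intents.json` changes.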

In [16]:
import keras

**Stochastic gradient descent:** basically finding the lowest point on a graph

Basically, the model's error (loss) can be plotted on a graph as a function of its weights. SGD steps downhill on that graph, using a small batch of training examples for each step, toward the point where errors are minimal.

scikit-learn: "it is an optimization technique"

Learning rate: controls how much the model's weights change on each update (usually a small value between 0 and 1)
- basically the size of each step the model takes, not the number of steps
- too big and the model overshoots the lowest point, making its result suboptimal
- too small doesn't really take the model anywhere as it doesn't learn fast enough

Weight decay: helps prevent overfitting in neural networks
- shrinks the weights a little on every update, which keeps the model (not the dataset) from getting too complex
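
A toy run makes the learning-rate and weight-decay notes concrete. This is a sketch of plain gradient descent on a one-weight loss `(w - 3)**2`, with no mini-batches, momentum, or Nesterov step, and with weight decay written as a simple extra pull toward zero; Keras may apply it differently internally. `gradient_descent` is an illustrative name.

```python
def gradient_descent(learning_rate, weight_decay=0.0, steps=100, start=0.0):
    """Minimise (w - 3)**2 with plain gradient descent and optional weight decay."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * (w - 3.0)  # derivative of the toy loss
        # Step downhill; the decay term additionally pulls w toward zero.
        w -= learning_rate * (gradient + weight_decay * w)
    return w


w_good = gradient_descent(0.1)     # settles at the minimum, w = 3
w_small = gradient_descent(0.001)  # still far from 3 after 100 steps
w_big = gradient_descent(1.1)      # every step overshoots further: diverges
w_decay = gradient_descent(0.1, weight_decay=1.0)  # settles at 2, below 3

print(w_good, w_small, w_big, w_decay)
```

The learning rate sets the size of each step, and weight decay trades a little training error for smaller weights.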


In [35]:
# Compile model. Stochastic gradient descent with Nesterov accelerated gradient gives good results for this model
sgd = SGD(learning_rate=0.01, weight_decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
#fitting and saving the model 
hist = model.fit(np.array(X_train), np.array(y_train), epochs=200, batch_size=15, verbose=1)

# keras.saving.save_model(hist, 'my_model.keras')
keras.saving.save_model(model, 'my_model.keras')
# model.save('chatbot_model.h5', hist)
print("model created")

Epoch 1/200
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 4ms/step - accuracy: 1.0000 - loss: 0.0102
Epoch 2/200
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step - accuracy: 0.9812 - loss: 0.0430 
Epoch 3/200
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 1.0000 - loss: 0.0177 
Epoch 4/200
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step - accuracy: 1.0000 - loss: 0.0095 
Epoch 5/200
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step - accuracy: 1.0000 - loss: 0.0049 
Epoch 6/200
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step - accuracy: 1.0000 - loss: 0.0244 
Epoch 7/200
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - accuracy: 1.0000 - loss: 0.0067 
Epoch 8/200
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - accuracy: 1.0000 - loss: 0.0269 
Epoch 9/200
[1m4/4[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37

- https://www.includehelp.com/python/how-to-fix-valueerror-the-requested-array-has-an-inhomogeneous-shape-after-1-dimensions.aspx
- https://stackoverflow.com/questions/69014384/input-0-of-layer-dense-is-incompatible-with-the-layer-expected-axis-1-of-input
- https://stackoverflow.com/questions/76872111/valueerror-found-shape-in-keras-does-not-match-input-shape
- https://datascience.stackexchange.com/questions/53609/how-to-determine-input-shape-in-keras
- https://datascience.stackexchange.com/questions/46885/how-to-reshape-xtrain-array-and-what-about-input-shape

# Running the model
The model doesn't work as well if I separate the notebooks, so I put them together.

i.e. it doesn't load properly across separate notebooks

Well, technically it's still really minimalistic, but it's a chatbot at least.

In [28]:
from keras.models import load_model
print("import successful")

model = load_model('my_model.keras')
print("loaded")

import successful
loaded


In [29]:
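# reload the saved artifacts; the with-blocks close each file after reading
with open('intents.json') as f:
    intents = json.load(f)
with open('words.pkl','rb') as f:
    words = pickle.load(f)
with open('classes.pkl','rb') as f:
    classes = pickle.load(f)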
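
A small round trip shows what the two pickle files carry. This is a sketch with toy lists and a temporary folder, so it does not touch the real `words.pkl` and `classes.pkl`.

```python
import os
import pickle
import tempfile

toy_words = ["bye", "hello", "hi"]
toy_classes = ["goodbye", "greeting"]

with tempfile.TemporaryDirectory() as folder:
    words_path = os.path.join(folder, "words.pkl")
    classes_path = os.path.join(folder, "classes.pkl")
    # Write both lists, closing each file before it is read back.
    with open(words_path, "wb") as f:
        pickle.dump(toy_words, f)
    with open(classes_path, "wb") as f:
        pickle.dump(toy_classes, f)
    with open(words_path, "rb") as f:
        loaded_words = pickle.load(f)
    with open(classes_path, "rb") as f:
        loaded_classes = pickle.load(f)

print(loaded_words == toy_words, loaded_classes == toy_classes)
```

The order of both lists matters: position `i` in the vocabulary is input `i` of the model, and position `i` in the classes is output `i`, so the files have to come from the same run that trained the saved model.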

Basically breaks the user input down into a bag of words that the model can predict from

In [30]:
def clean_up_sentence(sentence):
    # tokenize the pattern - split words into array
    sentence_words = nltk.word_tokenize(sentence)
    # lemmatize each word - reduce it to its base form
    sentence_words = [lemmatizer.lemmatize(word.lower()) for word in sentence_words]
    return sentence_words
# return bag of words array: 0 or 1 for each word in the bag that exists in the sentence
def bow(sentence, words, show_details=True):
    # tokenize the pattern
    sentence_words = clean_up_sentence(sentence)
    # bag of words - matrix of N words, vocabulary matrix
    bag = [0]*len(words) 
    for s in sentence_words:
        for i,w in enumerate(words):
            if w == s: 
                # assign 1 if current word is in the vocabulary position
                bag[i] = 1
                if show_details:
                    print ("found in bag: %s" % w)
    return(np.array(bag))
def predict_class(sentence, model):
    # turn the sentence into a bag of words and get one probability per class
    p = bow(sentence, words,show_details=False)
    res = model.predict(np.array([p]))[0]
    # filter out predictions below a threshold
    ERROR_THRESHOLD = 0.25
    results = [[i,r] for i,r in enumerate(res) if r>ERROR_THRESHOLD]
    # sort by strength of probability
    results.sort(key=lambda x: x[1], reverse=True)
    return_list = []
    for r in results:
        return_list.append({"intent": classes[r[0]], "probability": str(r[1])})
    return return_list

From the input, the model predicts which class the response should be from, then chooses a random response from a set list of responses.

In [31]:
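def getResponse(ints, intents_json):
    # placeholder reply for when no intent passed the threshold or the tag is not found
    result = "Sorry, I didn't understand that."
    if not ints:
        return result
    tag = ints[0]['intent']
    list_of_intents = intents_json['intents']
    for i in list_of_intents:
        if(i['tag']== tag):
            result = random.choice(i['responses'])
            break
    return result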
def chatbot_response(text):
    ints = predict_class(text, model)
    res = getResponse(ints, intents)
    return res
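
The predict-then-pick flow can be tried without a trained model by feeding in made-up probabilities. This is a sketch; `rank_intents`, `pick_response`, `FALLBACK`, and the toy data are illustrative names and values, not part of the notebook.

```python
import random

ERROR_THRESHOLD = 0.25
FALLBACK = "Sorry, I did not understand that."


def rank_intents(probabilities, class_names, threshold=ERROR_THRESHOLD):
    """Keep classes scoring above the threshold, most likely first."""
    scored = [(class_names[i], p) for i, p in enumerate(probabilities) if p > threshold]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def pick_response(ranked, responses_by_tag):
    """Choose a random canned response for the top intent, or fall back."""
    if not ranked:
        return FALLBACK
    top_tag = ranked[0][0]
    return random.choice(responses_by_tag[top_tag])


toy_classes = ["goodbye", "greeting", "thanks"]
toy_responses = {"goodbye": ["Bye!"], "greeting": ["Hello!"], "thanks": ["Any time!"]}

ranked = rank_intents([0.1, 0.6, 0.3], toy_classes)
print(ranked)                                # [('greeting', 0.6), ('thanks', 0.3)]
print(pick_response(ranked, toy_responses))  # Hello!
print(pick_response(rank_intents([0.2, 0.2, 0.2], toy_classes), toy_responses))
```

When no class clears the threshold the ranked list is empty, so it is worth deciding up front what the bot should say in that case.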

In [36]:
#Creating GUI with tkinter
import tkinter
from tkinter import *
def send():
    msg = EntryBox.get("1.0",'end-1c').strip()
    EntryBox.delete("0.0",END)
    if msg != '':
        ChatLog.config(state=NORMAL)
        ChatLog.insert(END, "You: " + msg + '\n\n')
        ChatLog.config(foreground="#442265", font=("Verdana", 12 ))
        res = chatbot_response(msg)
        ChatLog.insert(END, "Bot: " + res + '\n\n')
        ChatLog.config(state=DISABLED)
        ChatLog.yview(END)
base = Tk()
base.title("Hello")
base.geometry("400x500")
base.resizable(width=FALSE, height=FALSE)
#Create Chat window
ChatLog = Text(base, bd=0, bg="white", height="8", width="50", font="Arial",)
ChatLog.config(state=DISABLED)
#Bind scrollbar to Chat window
scrollbar = Scrollbar(base, command=ChatLog.yview, cursor="heart")
ChatLog['yscrollcommand'] = scrollbar.set
#Create Button to send message
SendButton = Button(base, font=("Verdana",12,'bold'), text="Send", width="12", height=5,
                    bd=0, bg="#32de97", activebackground="#3c9d9b",fg='#ffffff',
                    command= send )
#Create the box to enter message
EntryBox = Text(base, bd=0, bg="white",width="29", height="5", font="Arial")
#EntryBox.bind("<Return>", send)
#Place all components on the screen
scrollbar.place(x=376,y=6, height=386)
ChatLog.place(x=6,y=6, height=386, width=370)
EntryBox.place(x=128, y=401, height=90, width=265)
SendButton.place(x=6, y=401, height=90)
base.mainloop()

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 21ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 20ms/step


# Summary of what I learned

- Basically learned how to properly use Keras and TensorFlow for neural networks
- Learned to simplify words and convert them into data readable by the model
- Trained the model to be able to hold a minimal conversation
- Learned about NumPy `[:, i]` slicing (plain Python lists don't support it)
- The keras Sequential model allows for creating models layer by layer
- Good chatbots are really hard to make
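
The `[:, i]` slicing from the list above, on a toy array. This is a minimal sketch with made-up numbers; the `try` block shows what happens when the same syntax is used on a nested list.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

first_column = grid[:, 0]       # every row, column 0
first_row = grid[0, :]          # row 0, every column
last_two_columns = grid[:, 1:]  # every row, columns 1 onward

# Nested Python lists only accept integers or slices, not a "row, column" pair.
try:
    [[1, 2, 3], [4, 5, 6]][:, 0]
    list_slicing_works = True
except TypeError:
    list_slicing_works = False

print(first_column, first_row, last_two_columns, list_slicing_works)
```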