# Basic chatbot using Keras and NLTK

## Introduction 

A chatbot is a program that can converse with a user in a way that resembles human conversation.

Chatbots are widely used for customer interaction, marketing on social networking sites, and instant messaging with clients.

There are two basic types of Natural Language Processing (NLP) chatbot models based on how they are built:

Retrieval-based. A retrieval-based chatbot uses predefined input patterns and responses, and applies a heuristic to select the most appropriate response for a given input. This approach is widely used in industry for goal-oriented chatbots, where the tone and flow of the conversation can be customized to give customers the best experience.

Generative. Generative models do not rely on predefined responses. They are based on sequence-to-sequence (seq2seq) neural networks, the same idea used in machine translation. In machine translation, text is translated from a source language into a target language; here, the input message is transformed into an output response. This approach needs a large amount of training data and is based on Deep Neural Networks (DNN).
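
To make the retrieval-based idea concrete, here is a minimal sketch that picks a response by word overlap between the message and the predefined patterns. The `INTENTS` table and the `select_response` helper are hypothetical and only illustrate the heuristic step; the rest of this notebook uses a neural network classifier for that step instead.

```python
import random

# Hypothetical miniature intent table; tags, patterns and responses are illustrative.
INTENTS = [
    {"tag": "greeting", "patterns": ["hello", "hi", "good day"], "responses": ["Hey!"]},
    {"tag": "goodbye", "patterns": ["bye", "see you later"], "responses": ["Tata!"]},
]

def select_response(message, intents=INTENTS):
    """Pick the intent whose patterns share the most words with the message."""
    message_words = set(message.lower().split())
    best_intent, best_overlap = None, 0
    for intent in intents:
        for pattern in intent["patterns"]:
            overlap = len(message_words & set(pattern.lower().split()))
            if overlap > best_overlap:
                best_intent, best_overlap = intent, overlap
    if best_intent is None:
        return None  # no predefined pattern shares a word with the message
    return random.choice(best_intent["responses"])

print(select_response("hello there"))    # Hey!
print(select_response("see you later"))  # Tata!
```

A word-overlap heuristic like this is brittle with unseen wording, which is one reason to replace it with a trained classifier.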



In [None]:
#from google.colab import files
#upload = files.upload()
#!cp -r "/content/Chatbot" "/content/drive/MyDrive"

## Importing Libraries


In [16]:
import json
import pickle           #for serialisation
import random
import numpy as np 
import nltk
import tensorflow as tf
from nltk.stem import WordNetLemmatizer
from tensorflow.keras.models import load_model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout
from tensorflow.keras.optimizers import SGD


nltk.download('punkt')  # dependencies required 
nltk.download('wordnet')
nltk.download('omw-1.4')

lemmatizer = WordNetLemmatizer() 
path_json = '/content/drive/MyDrive/Chatbot/intents.json'
with open(path_json) as f:  # close the file handle once the JSON is read
    intents = json.load(f)

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!
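
The cell above loads `intents.json`, whose contents are not shown in this notebook. The sketch below is a hypothetical excerpt of what such a file might look like. The keys (`intents`, `tag`, `patterns`, `responses`) are the ones the later cells read, while the specific entries are assumptions pieced together from the printed outputs.

```python
import json

# Hypothetical excerpt of intents.json; the real file is larger and may differ.
sample = """
{
  "intents": [
    {"tag": "greeting", "patterns": ["hello", "hi", "good day"], "responses": ["Hey!"]},
    {"tag": "goodbye", "patterns": ["cya", "See you later", "bye"], "responses": ["Tata!"]}
  ]
}
"""

sample_intents = json.loads(sample)
for intent in sample_intents["intents"]:
    # each intent pairs a tag with example user patterns and candidate replies
    print(intent["tag"], len(intent["patterns"]), "patterns")
```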


## Training the chatbot


In [17]:
words = []
classes = []
documents = []
ignore_letters = {'?', '!', '.', ','}

for intent in intents['intents']:
    for pattern in intent['patterns']:
        word_list = nltk.word_tokenize(pattern)
        words.extend(word_list)
        documents.append((word_list, intent['tag']))
        if intent['tag'] not in classes:
            classes.append(intent['tag'])
print(documents)
print(words)

[(['hello'], 'greeting'), (['hi'], 'greeting'), (['good', 'day'], 'greeting'), (['what', "'s", 'up', '?'], 'greeting'), (['how', "'s", 'it', 'going', '?'], 'greeting'), (['Greetings'], 'greeting'), (['hola'], 'greeting'), (['hey', 'there'], 'greeting'), (['Excuse', 'me'], 'greeting'), (['Anybody', 'there', '?'], 'greeting'), (['hola'], 'greeting'), (['cya'], 'goodbye'), (['See', 'you', 'later'], 'goodbye'), (['Goodbye'], 'goodbye'), (['I', 'am', 'Leaving'], 'goodbye'), (['Have', 'a', 'Good', 'day'], 'goodbye'), (['bye'], 'goodbye'), (['ciao'], 'goodbye'), (['see', 'ya'], 'goodbye'), (['how', 'are', 'you', '?'], 'greet'), (['How', "'s", 'it', 'going', '?'], 'greet'), (['Are', 'you', 'good', '?'], 'greet'), (['How', 'have', 'you', 'been', '?'], 'greet'), (['how', "'s", 'you', '?'], 'greet'), (['what', 'stocks', 'do', 'I', 'own', '?'], 'stocks'), (['how', 'are', 'my', 'shares', '?'], 'stocks'), (['what', 'companies', 'am', 'I', 'investing', 'in', '?'], 'stocks'), (['what', 'am', 'I', 'doi

In [None]:
# lowercase before lemmatizing so the vocabulary matches the lowercased patterns used below
words = [lemmatizer.lemmatize(word.lower()) for word in words if word not in ignore_letters]
words = sorted(set(words))
classes = sorted(set(classes))

# save the vocabulary and class lists as binary pickle files
with open('words.pkl', 'wb') as f:
    pickle.dump(words, f)
with open('classes.pkl', 'wb') as f:
    pickle.dump(classes, f)

#creating the bow list and the training list
training = []
output_empty = [0] * len(classes)

for document in documents:
    bow = []
    word_patterns = document[0]
    word_patterns = [lemmatizer.lemmatize(word.lower()) for word in word_patterns]
    for word in words:
        bow.append(1 if word in word_patterns else 0)
        
    output_row = list(output_empty)
    output_row[classes.index(document[1])] = 1 
    training.append([bow, output_row])


random.shuffle(training)

# split features and labels row by row; np.array(training) fails on these ragged rows in recent NumPy versions
train_x = [row[0] for row in training]
train_y = [row[1] for row in training]


#building and training the neural network

model = Sequential()
model.add(Dense(128, input_shape = (len(train_x[0]),), activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(len(train_y[0]), activation = 'softmax'))

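# `learning_rate` replaces the deprecated `lr`; `decay` is omitted because recent Keras optimizers no longer accept it
sgd = SGD(learning_rate = 0.01, momentum = 0.9, nesterov = True)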
model.compile(loss = 'categorical_crossentropy', optimizer = sgd, metrics = ['accuracy'])

hist = model.fit(np.array(train_x), np.array(train_y), epochs = 500, batch_size = 5, verbose =1)
model.save('chatbotmodel.h5')  # save() does not take the training history; `hist` stays in memory
print("Done!")
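
As a worked illustration of the encoding built in the training cell, the sketch below turns one tokenized pattern into a bag-of-words vector and a one-hot label. The toy `vocab` and `labels` lists and the `encode` helper are assumptions made for this example, and lemmatization is skipped to keep it self-contained.

```python
# Toy vocabulary and class list, sorted as in the training cell (values are illustrative).
vocab = sorted({"bye", "day", "good", "hello", "later", "see", "you"})
labels = sorted({"goodbye", "greeting"})

def encode(tokens, tag):
    """Return (bag-of-words vector, one-hot label) for one tokenized pattern."""
    tokens = {t.lower() for t in tokens}
    bow = [1 if word in tokens else 0 for word in vocab]
    one_hot = [1 if label == tag else 0 for label in labels]
    return bow, one_hot

bow, one_hot = encode(["See", "you", "later"], "goodbye")
print(vocab)    # ['bye', 'day', 'good', 'hello', 'later', 'see', 'you']
print(bow)      # [0, 0, 0, 0, 1, 1, 1]
print(one_hot)  # [1, 0]
```

The vector length equals the vocabulary size, which is why the first `Dense` layer takes `len(train_x[0])` inputs and the last layer has `len(train_y[0])` outputs.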

## Data Pre-processing and Function Definition

In [21]:
with open('words.pkl', 'rb') as f:
    words = pickle.load(f)
with open('classes.pkl', 'rb') as f:
    classes = pickle.load(f)
model = load_model('chatbotmodel.h5')

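def clean_up_sentence(sentence):
  # tokenize, then lowercase and lemmatize so the input matches the features seen in training
  sentence_words = nltk.word_tokenize(sentence)
  sentence_words = [lemmatizer.lemmatize(word.lower()) for word in sentence_words]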
  return sentence_words


def bag_of_words(sentence):
  # mark each vocabulary word that occurs in the cleaned sentence
  sentence_words = set(clean_up_sentence(sentence))
  bow = [1 if word in sentence_words else 0 for word in words]
  return np.array(bow)


def predict_class(sentence):
  bag = bag_of_words(sentence)
  res = model.predict(np.array([bag]))[0]
  ERROR_THRESHOLD = 0.25
  results = [[i,r] for i,r in enumerate(res) if r > ERROR_THRESHOLD]

  results.sort(key = lambda x: x[1], reverse = True)
  return_list = []
  for r in results:
    return_list.append({'intent': classes[r[0]], 'probability': str(r[1])})
  return return_list 

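def get_response(intents_list, intents_json):
  # default reply when no intent passed the threshold or the tag is not in the JSON
  result = "Sorry, I didn't understand that."
  if not intents_list:
    return result
  tag = intents_list[0]['intent']
  for i in intents_json['intents']:
    if i['tag'] == tag:
      result = random.choice(i['responses'])
      break
  return result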
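
The thresholding step in `predict_class` can be tried in isolation. The sketch below applies the same filter-and-sort logic to made-up probabilities. The `rank_intents` helper and the class names passed to it are hypothetical, and no model is needed to run it.

```python
ERROR_THRESHOLD = 0.25  # same cutoff as in predict_class

def rank_intents(probabilities, class_names, threshold=ERROR_THRESHOLD):
    """Keep classes whose probability exceeds the threshold, most likely first."""
    kept = [(i, p) for i, p in enumerate(probabilities) if p > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [{"intent": class_names[i], "probability": str(p)} for i, p in kept]

# Made-up softmax output for three hypothetical classes.
ranked = rank_intents([0.30, 0.65, 0.05], ["goodbye", "greeting", "stocks"])
print(ranked)
# [{'intent': 'greeting', 'probability': '0.65'}, {'intent': 'goodbye', 'probability': '0.3'}]

# With five or more classes, every probability can fall below 0.25.
print(rank_intents([0.2, 0.2, 0.2, 0.2, 0.2], list("abcde")))  # []
```

The second call shows that the result can be empty, so code that indexes the first element should check for that case.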

## Running the Chatbot

In [25]:
print("Speak to me..")

while True:
  message = input("")
  ints = predict_class(message)
  res = get_response(ints, intents)
  print(res)
  # ints is empty when no class probability exceeds the threshold
  if ints and ints[0]['intent'] == 'goodbye':
    break


Speak to me..
hello
Hey!
how are you
Not too bad! Wbu?
bye
Tata!
