# Rule-Based Chatbot Tutorial
* [Link to youtube playlist](https://youtu.be/RpWeNzfSUHw)
* [Article with TF implementation](https://chatbotsmagazine.com/contextual-chat-bots-with-tensorflow-4391749d0077)

**Overall Idea**: Our chatbot uses a model that is trained on examples of different intents (e.g. "greeting"). An intent works like a class label: it groups together a set of example patterns, and it has a set of appropriate responses associated with it.

You don't have to type a pattern exactly as it appears in the training data to get an appropriate response. The model predicts the intent of what the user wrote, and the chatbot replies with one of the responses for that intent. So, the chatbot can respond to whatever use cases you train the model on. To extend the abilities of this chatbot, the model must be trained on additional intents.
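
The relationship between intents, patterns, and responses can be pictured as a small lookup table. The sketch below is only an illustration: sample_intents and respond are hypothetical names, the entries are a two-intent subset shaped like the tutorial's JSON file, and the predicted tag is passed in by hand rather than coming from a trained model.

```python
import random

# Hypothetical two-intent subset, shaped like the tutorial's intents file
sample_intents = {
    "intents": [
        {
            "tag": "greeting",
            "patterns": ["Hi", "Hey", "Hello"],
            "responses": ["Hey :-)", "Hi there, how can I help?"],
        },
        {
            "tag": "goodbye",
            "patterns": ["Bye", "See you later"],
            "responses": ["Have a nice day", "Bye! Come back again soon."],
        },
    ]
}

def respond(predicted_tag, intents):
    """Return a random response for the predicted tag, or None if the tag is unknown."""
    for intent in intents["intents"]:
        if intent["tag"] == predicted_tag:
            return random.choice(intent["responses"])
    return None

# In the real chatbot the tag comes from the model; here it is hard-coded
print(respond("greeting", sample_intents))
```

The patterns are only used for training; at chat time only the tag and its responses matter.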

### Relevant Imports

In [1]:
from os import path
import json
import random
import numpy as np

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset

import nltk
from nltk.stem.porter import PorterStemmer

nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

## Step 1 - Data Processing Methods
* Bag of words - every word that appears in the patterns gets an index in a list of all the words. Each example pattern in our training data is then represented as a vector the same length as that list, with 1's at the indices of words that are in this pattern and 0's at the indices of words that are not in this pattern

How do we get this bag of words from the text of the patterns we write out? 
* tokenization - to split the text into individual words and punctuation
* stemming - to get a general root of the word (ex: "organize", "organizer", and "organizing" all really mean the same thing)
* We also want to make all the words lowercase and might want to get rid of punctuation too

The NLTK library can help us do this!

In [2]:
stemmer = PorterStemmer()

def tokenize(sentence):
  return nltk.word_tokenize(sentence)

def stem(word):
  return stemmer.stem(word.lower())

def bag_of_words(tokenized_sentence, all_words):
  tokenized_sentence = [stem(w) for w in tokenized_sentence]
  bag = np.zeros(len(all_words), dtype=np.float32)

  for idx, w in enumerate(all_words):
    if w in tokenized_sentence:
      bag[idx] = 1.0
  
  return bag

### Examples of how these functions work

In [3]:
sent_1 = "Hi, nice to meet you! How are you doing?"

sent_1_tokenized = tokenize(sent_1)
print(sent_1_tokenized)

sent_1_stemmed = [stem(word) for word in sent_1_tokenized]
print(sent_1_stemmed)

# The words in the sentence above aren't very complicated,
# so let's try another example to see how the stemmer works
words_to_stem = ['organize', 'organizer', 'organizing'] 
stemmed_words = [stem(word) for word in words_to_stem]
print(stemmed_words)

['Hi', ',', 'nice', 'to', 'meet', 'you', '!', 'How', 'are', 'you', 'doing', '?']
['hi', ',', 'nice', 'to', 'meet', 'you', '!', 'how', 'are', 'you', 'do', '?']
['organ', 'organ', 'organ']


To test the bag_of_words function here, make sure you don't use punctuation and that the words in your sentence are in the word set. This is just for the example; these cases will be handled properly when we make the chatbot.

In [4]:
word_set = ['hi', 'hello', 'i', 'you', 'me', 'good', 'bad', 'nice', 'to', 'am', 'are', 'meet', 'do', 'it', 'when', 'how']
sent_1_no_punc = "Hi nice to meet you How are you doing"

sent_1_no_punc_tokenized = tokenize(sent_1_no_punc)
sent_1_no_punc_stemmed = [stem(word) for word in sent_1_no_punc_tokenized]

bag = bag_of_words(sent_1_no_punc_tokenized, word_set)
print(bag)

[1. 0. 0. 1. 0. 0. 0. 1. 1. 0. 1. 1. 1. 0. 0. 1.]
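
Two details of the output above are easy to miss: "you" appears twice in the sentence but is still recorded as a single 1, and a word outside the word set would simply leave no trace. The stripped-down sketch below shows both effects. It is an assumption-laden simplification, not the tutorial's function: simple_bag and small_vocabulary are hypothetical names, and it splits on spaces and lowercases instead of using NLTK tokenization and stemming.

```python
def simple_bag(sentence, vocabulary):
    """Mark which vocabulary words occur in a lowercased, space-split sentence."""
    words = sentence.lower().split()
    return [1.0 if vocab_word in words else 0.0 for vocab_word in vocabulary]

small_vocabulary = ['hi', 'hello', 'you', 'nice', 'meet']

# 'pleased' and 'to' are not in the vocabulary, so they are ignored
oov_example = simple_bag("Hi pleased to meet you", small_vocabulary)
print(oov_example)

# repeating a word does not change its entry: presence is recorded, not a count
repeat_example = simple_bag("you you you", small_vocabulary)
print(repeat_example)
```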


### **TODO:** Try adding your own sentence(s) and see how they are processed
Replace the text in quotation marks that is assigned to "sent_2" with your own, and then run the cell

In [5]:
sent_2 = "Remove this text and add your own!"

sent_2_tokenized = tokenize(sent_2)
print(sent_2_tokenized)

sent_2_stemmed = [stem(word) for word in sent_2_tokenized]
print(sent_2_stemmed)

['Remove', 'this', 'text', 'and', 'add', 'your', 'own', '!']
['remov', 'thi', 'text', 'and', 'add', 'your', 'own', '!']


## Step 2 - Training Data

### Loading the Training Data

First, we get a premade JSON file with some intents, patterns, and responses predefined.

In [14]:
if not path.exists('chatbot_tutorial'):
  !git clone https://github.com/emilypilley/chatbot_tutorial.git

In [15]:
!ls chatbot_tutorial/

chat_bot_tutorial.ipynb  intents.json


In [16]:
# This opens the json file with our intents
with open('chatbot_tutorial/intents.json') as f:
  intents = json.load(f)

print(intents)

{'intents': [{'tag': 'greeting', 'patterns': ['Hi', 'Hey', 'How are you', 'Is anyone there?', 'Hello', 'Good day'], 'responses': ['Hey :-)', 'Hello, thanks for visiting', 'Hi there, what can I do for you?', 'Hi there, how can I help?']}, {'tag': 'goodbye', 'patterns': ['Bye', 'See you later', 'Goodbye'], 'responses': ['See you later, thanks for visiting', 'Have a nice day', 'Bye! Come back again soon.']}, {'tag': 'thanks', 'patterns': ['Thanks', 'Thank you', "That's helpful", "Thank's a lot!"], 'responses': ['Happy to help!', 'Any time!', 'My pleasure']}, {'tag': 'items', 'patterns': ['Which items do you have?', 'What kinds of items are there?', 'What do you sell?'], 'responses': ['We sell coffee and tea', 'We have coffee and tea']}, {'tag': 'payments', 'patterns': ['Do you take credit cards?', 'Do you accept Mastercard?', 'Can I pay with Paypal?', 'Are you cash only?'], 'responses': ['We accept VISA, Mastercard and Paypal', 'We accept most major credit cards, and Paypal']}, {'tag': 'd

### Processing the Training Data

First, we tokenize and stem the patterns from the training data, and remove punctuation and duplicate items, to get a sorted list of all the distinct words used across the patterns. We also pair each tokenized pattern with the tag of the intent it belongs to.

In [17]:
def process_training_data(intents):
  all_words = []
  tags = []
  xy = []  # holds patterns (x) and their associated tags (y)

  for intent in intents['intents']:
    tag = intent['tag']
    tags.append(tag)

    for pattern in intent['patterns']:
      w = tokenize(pattern)
      all_words.extend(w)
      xy.append((w, tag))

  ignore_words = ['?', '!', '.', ',']
  all_words = [stem(w) for w in all_words if w not in ignore_words]
  all_words = sorted(set(all_words))
  # print(all_words)

  tags = sorted(set(tags))
  # print(tags)

  return all_words, tags, xy

all_words, tags, xy = process_training_data(intents)

### Preparing the Training Data for use with PyTorch

Then, we separate the data into x (the patterns) and y (the tag for the patterns) in order to train the model.

In [18]:
def separate_train_x_y(all_words, tags, xy):
  x_train = []  # holds the patterns
  y_train = []  # holds the tags for each pattern

  for (pattern_sentence, tag) in xy:
    bag = bag_of_words(pattern_sentence, all_words)
    x_train.append(bag)

    label = tags.index(tag)  # for the labels we are storing a number representing the tag
    y_train.append(label)

  x_train = np.array(x_train)
  y_train = np.array(y_train)

  return x_train, y_train

x_train, y_train = separate_train_x_y(all_words, tags, xy)

In [19]:
# this is the first pattern and its associated tag in the training data
print(x_train[0])
print(y_train[0])

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0.]
3


Here we wrap the data in a PyTorch dataset (for use with the model).

In [20]:
class ChatDataset(Dataset):
  def __init__(self, x_train, y_train):
    self.n_samples = len(x_train)
    self.x_data = x_train
    self.y_data = y_train
  
  def __getitem__(self, index):
    return self.x_data[index], self.y_data[index]
  
  def __len__(self):
    return self.n_samples

### **TODO:** Set the Hyperparameters

Hyperparameters are aspects of the model that we can change that impact training - you can try to change some of these and see how this affects the results
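
For instance, the batch size and the number of epochs together determine how many times the optimizer updates the model. The sketch below counts those updates, assuming the DataLoader keeps a smaller final batch (its default behavior). The helper optimizer_steps and the sample count of 26 are hypothetical and only there for illustration.

```python
import math

def optimizer_steps(n_samples, batch_size, num_epochs):
    """Total number of parameter updates for a full training run."""
    # the last batch of an epoch may be smaller than batch_size, hence ceil
    batches_per_epoch = math.ceil(n_samples / batch_size)
    return batches_per_epoch * num_epochs

example_n_samples = 26  # hypothetical number of training patterns

print(optimizer_steps(example_n_samples, batch_size=8, num_epochs=1000))
print(optimizer_steps(example_n_samples, batch_size=4, num_epochs=1000))
```

Halving the batch size roughly doubles the number of updates per epoch, which is one reason the same number of epochs can give different results.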

In [21]:
# You can try adjusting the following by changing the numbers
batch_size = 8
hidden_size = 8
learning_rate = 0.001
num_epochs = 1000

In [22]:
# These depend on our particular data - don't change!
output_size = len(tags)
input_size = len(x_train[0])

print('output_size: ', output_size)
print('input_size: ', input_size)

output_size:  7
input_size:  54


### Create the dataset and dataloader

In [23]:
dataset = ChatDataset(x_train, y_train)
train_loader = DataLoader(dataset=dataset, batch_size=batch_size, shuffle=True, num_workers=2)

In [24]:
dataset.__getitem__(0)

(array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0.], dtype=float32), 3)

## Step 3 - Create the model

### Define the structure and flow of data through the model

In [25]:
class NeuralNet(nn.Module):
  
  def __init__(self, input_size, hidden_size, num_classes):
    super(NeuralNet, self).__init__()

    self.l1 = nn.Linear(input_size, hidden_size)
    self.l2 = nn.Linear(hidden_size, hidden_size)
    self.l3 = nn.Linear(hidden_size, num_classes)
    
    self.relu = nn.ReLU()
  
  def forward(self, x):
    out = self.l1(x)
    out = self.relu(out)

    out = self.l2(out)
    out = self.relu(out)

    out = self.l3(out)

    return out

In [26]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = NeuralNet(input_size, hidden_size, output_size).to(device)

Set loss function and optimizer

In [27]:
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
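
nn.CrossEntropyLoss works on the raw scores returned by the model's last layer together with the integer label of the correct tag, which is why the forward method above does not apply a softmax itself. As a rough sketch of what the loss measures for a single example, the plain-Python function below computes -log(softmax(scores)[label]). The function name and the scores are hypothetical, and the PyTorch version additionally averages over the batch by default.

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy for one example: -log(softmax(logits)[label])."""
    top = max(logits)
    shifted = [z - top for z in logits]  # subtract the max for numerical stability
    log_sum_exp = math.log(sum(math.exp(z) for z in shifted))
    return log_sum_exp - shifted[label]

example_logits = [2.0, 0.5, -1.0]  # hypothetical raw scores for three tags

low_loss = cross_entropy(example_logits, label=0)   # correct tag has the top score
high_loss = cross_entropy(example_logits, label=2)  # correct tag has the lowest score
print(low_loss, high_loss)
```

The loss is small when the correct tag gets the highest score and grows as the model puts its confidence elsewhere, which is what training tries to reduce.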

### Train the Model

Train the model to predict the tag based on the given input text (the patterns). This might take a minute, but you should see the loss printed out every 100 epochs of training. The printed value is the loss on the last batch of that epoch, so it can jump around a little, but overall it should be decreasing, indicating that the model is learning.

In [28]:
def train_model(model, train_loader, loss_func, optimizer):
  # training loop
  for epoch in range(num_epochs):
    for (words, labels) in train_loader:
      words = words.to(device)
      labels = labels.to(device)

      # forward pass
      outputs = model(words)  # the outputs are the model's predictions
      loss = loss_func(outputs, labels)

      # backward and optimizer steps
      optimizer.zero_grad()
      loss.backward()
      optimizer.step()
    
    if (epoch + 1) % 100 == 0:
      print(f'epoch {epoch+1}/{num_epochs}, loss={loss.item():.4f}')

  print(f'final loss={loss.item():.4f}')

  model.eval()

train_model(model, train_loader, criterion, optimizer)

epoch 100/1000, loss=1.4775
epoch 200/1000, loss=0.1640
epoch 300/1000, loss=0.0410
epoch 400/1000, loss=0.0351
epoch 500/1000, loss=0.0178
epoch 600/1000, loss=0.0016
epoch 700/1000, loss=0.0027
epoch 800/1000, loss=0.0005
epoch 900/1000, loss=0.0003
epoch 1000/1000, loss=0.0010
final loss=0.0010


## Step 4 - Use Model and Implement Chatting

### Implement the Chatting Functionality

In [31]:
def run_chatbot(bot_name='Chatty'):
  print("Let's chat! Type 'quit' to exit")

  while True:
    sentence = input('You: ')

    if sentence == 'quit':
      break
    
    # process the sentence the user inputs
    # (bag_of_words stems each token itself, so we only tokenize here)
    sentence = tokenize(sentence)
    x = bag_of_words(sentence, all_words)

    # reshape and format the input data for the model,
    # and move it to the same device as the model
    x = x.reshape(1, x.shape[0])
    x = torch.from_numpy(x).to(device)

    # get the predicted tag from the model
    output = model(x)
    _, predicted = torch.max(output, dim=1)
    # the model outputs a number for the tag; this converts it to
    # the actual name of the tag
    tag = tags[predicted.item()] 

    # the model will tell us how likely it is that the input 
    # sentence corresponds to a certain tag
    probabilities = torch.softmax(output, dim=1)
    probability = probabilities[0][predicted.item()]
    # print('tag: ', tag, ' probability: ', probability)  # for debugging
    
    if probability.item() > 0.75:
      # find the corresponding intent for that tag
      for intent in intents['intents']:
        if tag == intent['tag']:
          # the chatbot responds with one of the responses for 
          # the tag, randomly chosen from the possible responses
          print(f"{bot_name}: {random.choice(intent['responses'])}") 
    
    # if we couldn't determine what the intent of the sentence was
    # with much certainty, the chatbot says it does not understand
    else:
      print(f"{bot_name}: I do not understand :(") 

In [32]:
run_chatbot()

Let's chat! Type 'quit' to exit
You: hello :)
Chatty: Hello, thanks for visiting
You: what do you seell?
Chatty: We sell coffee and tea
You: how long does delivery take?
Chatty: Shipping takes 2-4 days
You: do you take venmo?
Chatty: I do not understand :(
You: do you accept credit cards
Chatty: We accept most major credit cards, and Paypal
You: tell me a joke
Chatty: Why did the hipster burn his mouth? He drank the coffee before it was cool.
You: I hated your tea
Chatty: I do not understand :(
You: the coffee I got was gross
Chatty: I do not understand :(
You: bye
Chatty: Bye! Come back again soon.
You: quit
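
The "I do not understand :(" replies in this session come from the confidence check in run_chatbot: the softmax turns the model's raw scores into probabilities, and the chatbot only answers when the top probability is above 0.75. A minimal plain-Python sketch of that decision is shown below. The names softmax, pick_tag, and sample_tags, as well as the scores, are hypothetical stand-ins for the PyTorch calls and the trained model.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick_tag(logits, tag_names, threshold=0.75):
    """Return the most likely tag, or None when the top probability is too low."""
    probabilities = softmax(logits)
    best = max(range(len(tag_names)), key=lambda i: probabilities[i])
    return tag_names[best] if probabilities[best] > threshold else None

sample_tags = ['goodbye', 'greeting', 'items']  # hypothetical subset of tags

print(pick_tag([0.0, 4.0, 0.0], sample_tags))  # one score dominates
print(pick_tag([1.0, 0.8, 0.5], sample_tags))  # scores are close together
```

Raising the threshold makes the chatbot more cautious; lowering it makes it answer more often, including when it is wrong.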


## Step 5 - Adding New Intents and Patterns to Improve the Chatbot

### Implementation of functions to update JSON with new intents, print intents, and remove intents (given tag)

These are functions to add the new intent to the JSON file holding all of the tags, patterns, and responses for each intent.

In [33]:
def write_json(data, filename='chatbot_tutorial/intents.json'):
  # overwrite the file with the updated intents
  with open(filename, 'w') as f:
    json.dump(data, f, indent=4)


def add_new_intent(tag, patterns, responses, filename='chatbot_tutorial/intents.json'):
  # read the current intents from the same file we will write back to
  with open(filename) as f:
    data = json.load(f)

  new_intent = {
    "tag": tag,
    "patterns": patterns,
    "responses": responses
  }

  data['intents'].append(new_intent)

  write_json(data, filename)

If you want to check what intents are in the JSON file currently, run this method.

In [34]:
def print_intents(filename='chatbot_tutorial/intents.json'):
  with open(filename) as f:
    data = json.load(f)

  for intent in data['intents']:
    print(intent)

If you want to remove an intent, just provide the tag to this method to delete it.

In [35]:
def remove_intent_with_tag(tag, filename='chatbot_tutorial/intents.json'):
  with open(filename) as f:
    data = json.load(f)

  # build a new list instead of removing items while looping over the list,
  # which can skip entries
  data['intents'] = [intent for intent in data['intents'] if intent['tag'] != tag]

  write_json(data, filename)
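
One thing to watch for when deleting intents: if the cell that adds an intent is run more than once, the file ends up with the same tag several times, and removing items from a list while looping over it can leave some of them behind. The sketch below shows the effect on a small hypothetical list (sample_list is not the tutorial's data) and the filtering pattern that avoids it.

```python
sample_list = [
    {'tag': 'test'},      # hypothetical entries with a duplicated tag
    {'tag': 'test'},
    {'tag': 'greeting'},
]

# Removing during iteration: the loop skips the element that slides
# into the slot of the one that was just removed
mutated = list(sample_list)
for intent in mutated:
    if intent['tag'] == 'test':
        mutated.remove(intent)
print(mutated)

# Building a new list checks every entry exactly once
filtered = [intent for intent in sample_list if intent['tag'] != 'test']
print(filtered)
```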

In [37]:
print_intents()

{'tag': 'greeting', 'patterns': ['Hi', 'Hey', 'How are you', 'Is anyone there?', 'Hello', 'Good day'], 'responses': ['Hey :-)', 'Hello, thanks for visiting', 'Hi there, what can I do for you?', 'Hi there, how can I help?']}
{'tag': 'goodbye', 'patterns': ['Bye', 'See you later', 'Goodbye'], 'responses': ['See you later, thanks for visiting', 'Have a nice day', 'Bye! Come back again soon.']}
{'tag': 'thanks', 'patterns': ['Thanks', 'Thank you', "That's helpful", "Thank's a lot!"], 'responses': ['Happy to help!', 'Any time!', 'My pleasure']}
{'tag': 'items', 'patterns': ['Which items do you have?', 'What kinds of items are there?', 'What do you sell?'], 'responses': ['We sell coffee and tea', 'We have coffee and tea']}
{'tag': 'payments', 'patterns': ['Do you take credit cards?', 'Do you accept Mastercard?', 'Can I pay with Paypal?', 'Are you cash only?'], 'responses': ['We accept VISA, Mastercard and Paypal', 'We accept most major credit cards, and Paypal']}
{'tag': 'delivery', 'pattern

### **TODO:** Fill in the tag, patterns, and responses as shown below with your new intent.

In [38]:
tag = "complaint"

patterns = [
            "Your tea was awful",
            "The coffee sucks",
            "The tea I ordered was gross",
            "I hated the coffee I bought from you",
            "Your products are awful",
            "Your store is terrible",
            "The coffee I ordered tasted like garbage"
            ]

responses =  [
              "I'm sorry you did not like our product.",
              "I guess our store just isn't your cup of tea.",
              "I'm sorry you had a bad experience."
              ]

# This method adds the new intent
add_new_intent(tag, patterns, responses)

# Print out the intents to see that the file was updated with the new data
print_intents()

{'tag': 'greeting', 'patterns': ['Hi', 'Hey', 'How are you', 'Is anyone there?', 'Hello', 'Good day'], 'responses': ['Hey :-)', 'Hello, thanks for visiting', 'Hi there, what can I do for you?', 'Hi there, how can I help?']}
{'tag': 'goodbye', 'patterns': ['Bye', 'See you later', 'Goodbye'], 'responses': ['See you later, thanks for visiting', 'Have a nice day', 'Bye! Come back again soon.']}
{'tag': 'thanks', 'patterns': ['Thanks', 'Thank you', "That's helpful", "Thank's a lot!"], 'responses': ['Happy to help!', 'Any time!', 'My pleasure']}
{'tag': 'items', 'patterns': ['Which items do you have?', 'What kinds of items are there?', 'What do you sell?'], 'responses': ['We sell coffee and tea', 'We have coffee and tea']}
{'tag': 'payments', 'patterns': ['Do you take credit cards?', 'Do you accept Mastercard?', 'Can I pay with Paypal?', 'Are you cash only?'], 'responses': ['We accept VISA, Mastercard and Paypal', 'We accept most major credit cards, and Paypal']}
{'tag': 'delivery', 'pattern

Because we have added a new intent, we need to retrain the model so it will learn the patterns associated with it and be able to recognize the new tag.
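
Retraining from scratch is needed for a second reason as well: process_training_data sorts the tags alphabetically and the label for each pattern is the tag's position in that sorted list, so a new tag changes the number of outputs and can shift the labels of existing tags. The sketch below illustrates this with only the tags visible in the outputs above, which is an assumption and not necessarily the full list; old_tags and new_tags are hypothetical names.

```python
# Tags visible in the printed intents above (possibly not the complete list)
old_tags = sorted(['greeting', 'goodbye', 'thanks', 'items', 'payments', 'delivery'])
new_tags = sorted(old_tags + ['complaint'])

# label of 'greeting' and number of model outputs, before and after the new intent
print(old_tags.index('greeting'), len(old_tags))
print(new_tags.index('greeting'), len(new_tags))
```

Because "complaint" sorts before "greeting", the label for "greeting" moves up by one, so the outputs of a model trained on the old tags would no longer line up with the new tag list.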

In [39]:
# Grab our JSON file with the updated intents
with open('chatbot_tutorial/intents.json') as f:
  intents = json.load(f)

# Process the data and set up the model like before
all_words, tags, xy = process_training_data(intents)
x_train, y_train = separate_train_x_y(all_words, tags, xy)
dataset = ChatDataset(x_train, y_train)
train_loader = DataLoader(dataset=dataset, batch_size=batch_size, shuffle=True, num_workers=2)
model = NeuralNet(len(x_train[0]), hidden_size, len(tags)).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Retrain the model with the new data to incorporate the new intents
train_model(model, train_loader, criterion, optimizer)

epoch 100/1000, loss=1.3128
epoch 200/1000, loss=0.1159
epoch 300/1000, loss=0.0112
epoch 400/1000, loss=0.0128
epoch 500/1000, loss=0.0005
epoch 600/1000, loss=0.0011
epoch 700/1000, loss=0.0016
epoch 800/1000, loss=0.0004
epoch 900/1000, loss=0.0000
epoch 1000/1000, loss=0.0004
final loss=0.0004


Now try running the chatbot again, testing out your new intent!

In [41]:
run_chatbot()

Let's chat! Type 'quit' to exit
You: hi
Chatty: Hello, thanks for visiting
You: what do you sell
Chatty: We have coffee and tea
You: I hated your tea
Chatty: I guess our store just isn't your cup of tea.
You: your coffee was terrible
Chatty: I guess our store just isn't your cup of tea.
You: when will my shipment arrive?
Chatty: Hi there, how can I help?
You: when will my delivery arrive?
Chatty: Delivery takes 2-4 days
You: I hated your products
Chatty: I'm sorry you had a bad experience.
You: byt
Chatty: I do not understand :(
You: bye
Chatty: Have a nice day
You: quit
