# Viraj Nishesh Darji

In [7]:
import numpy as np
import nltk
nltk.download('punkt')
from nltk.stem.porter import PorterStemmer
stemmer = PorterStemmer()

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\VIRAJ\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [12]:
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

In [44]:
import random
import json

In [45]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Tokenize

Tokenization breaks text into smaller parts called tokens, which can be words or sentences. Analyzing the tokens is the first step toward understanding the meaning of the text.

Split a sentence into an array of words/tokens. </br>
A token can be a word, a punctuation character, or a number.

In [19]:
def tokenize(sentence):
    return nltk.word_tokenize(sentence)
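As a rough illustration of what tokenization produces, here is a standard-library-only sketch. The `simple_tokenize` helper and its regular expression are assumptions made for this example; they are not NLTK's algorithm. In particular, `nltk.word_tokenize` splits contractions (the vocabulary printed later in this notebook contains `'s` as its own token), whereas this sketch keeps a word like `That's` in one piece.

```python
import re

def simple_tokenize(sentence):
    # Hypothetical helper: match runs of letters/digits/apostrophes as words,
    # and any other non-space character as a single punctuation token.
    return re.findall(r"[A-Za-z0-9']+|[^\sA-Za-z0-9']", sentence)

tokens = simple_tokenize("Do you take credit cards?")
print(tokens)  # ['Do', 'you', 'take', 'credit', 'cards', '?']
```

The question mark comes out as a separate token, which is why punctuation is filtered out before the vocabulary is built.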

Stemming

Stemming and lemmatization both reduce words to a base form. The base form produced by stemming may not be a valid dictionary word, while lemmatization always returns a valid word (the lemma). Stemming is faster and lemmatization is more accurate, so the choice depends on the use case.

stemming = find the root form of the word</br>
examples:</br>
words = ["organize", "organizes", "organizing"]</br>
words = [stem(w) for w in words]</br>
-> ["organ", "organ", "organ"]

In [22]:
# lowercase the word before stemming
def stem(word):
    return stemmer.stem(word.lower())

Word Embedding

Bag of Words

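Bag of Words is a popular technique for representing text as numeric vectors. Each position in the vector corresponds to one word of the vocabulary. In general the value is the number of times that word occurs in the text; the version used here only records whether the word is present (1) or absent (0). It does not keep any information about grammar or word order.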

return bag of words array:</br>
1 for each known word that exists in the sentence, 0 otherwise</br>
example:</br>
sentence = ["hello", "how", "are", "you"]</br>
words = ["hi", "hello", "I", "you", "bye", "thank", "cool"]</br>
bag   = [  0 ,    1 ,    0 ,   1 ,    0 ,    0 ,      0]

In [30]:
def bag_of_words(tokenized_sentence, words):
    # stem each word
    sentence_words = [stem(word) for word in tokenized_sentence]
    # initialize bag with 0 for each word
    bag = np.zeros(len(words), dtype=np.float32)
    for idx, w in enumerate(words):
        if w in sentence_words: 
            bag[idx] = 1

    return bag
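The example from the cell above can be run as a small self-contained sketch. The name `bag_of_words_sketch` is hypothetical, and it assumes the tokens are already stemmed, so it only lowercases them instead of calling `stem`.

```python
import numpy as np

def bag_of_words_sketch(tokens, vocabulary):
    # Assumption: tokens are already stemmed; here we only lowercase them.
    present = {t.lower() for t in tokens}
    bag = np.zeros(len(vocabulary), dtype=np.float32)
    for idx, word in enumerate(vocabulary):
        if word in present:
            bag[idx] = 1.0  # presence flag, not a count
    return bag

sentence = ["hello", "how", "are", "you"]
words = ["hi", "hello", "I", "you", "bye", "thank", "cool"]
bag = bag_of_words_sketch(sentence, words)
print(bag)  # [0. 1. 0. 1. 0. 0. 0.]

# A repeated word still produces 1, because only presence is recorded.
repeated = bag_of_words_sketch(["you", "you"], words)
```

Words that are not in the vocabulary ("how" and "are" here) are simply dropped, so two different sentences can map to the same vector.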

Creating a Neural Network Model

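The network has 3 linear layers, with ReLU as the activation function after the first two.</br>
The last layer returns raw scores (logits); no softmax is applied because the loss function handles it.

The `__init__` function defines the layers.  </br>
For example, `l1` below is a linear layer. </br>
The `forward` function computes the prediction. </br>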

In [36]:
class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.l1 = nn.Linear(input_size, hidden_size) 
        self.l2 = nn.Linear(hidden_size, hidden_size) 
        self.l3 = nn.Linear(hidden_size, num_classes)
        self.relu = nn.ReLU()
    
    def forward(self, x):
        out = self.l1(x)
        out = self.relu(out)
        out = self.l2(out)
        out = self.relu(out)
        out = self.l3(out)
        # no activation and no softmax at the end
        return out
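A quick way to check the architecture is to pass a dummy batch through it and look at the output shape. The sketch below is not the `NeuralNet` class itself: it rebuilds the same layer layout with `nn.Sequential`, and it hard-codes the sizes used in this notebook (54 inputs, 8 hidden units, 7 classes).

```python
import torch
import torch.nn as nn

# Same layer layout as NeuralNet, written with nn.Sequential for brevity.
# Assumed sizes: 54 input features, 8 hidden units, 7 output classes.
sketch_model = nn.Sequential(
    nn.Linear(54, 8), nn.ReLU(),
    nn.Linear(8, 8), nn.ReLU(),
    nn.Linear(8, 7),
)

batch = torch.zeros(8, 54)  # a dummy batch of 8 bag-of-words vectors
with torch.no_grad():
    logits = sketch_model(batch)

print(logits.shape)  # torch.Size([8, 7])
```

Each row of `logits` holds one raw score per tag; the scores are not probabilities until softmax is applied.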

In [14]:
with open('intents.json', 'r') as f:
    intents = json.load(f)

In [25]:
print(intents)

{'intents': [{'tag': 'greeting', 'patterns': ['Hi', 'Hey', 'How are you', 'Is anyone there?', 'Hello', 'Good day'], 'responses': ['Hey :-)', 'Hello, thanks for visiting', 'Hi there, what can I do for you?', 'Hi there, how can I help?']}, {'tag': 'goodbye', 'patterns': ['Bye', 'See you later', 'Goodbye'], 'responses': ['See you later, thanks for visiting', 'Have a nice day', 'Bye! Come back again soon.']}, {'tag': 'thanks', 'patterns': ['Thanks', 'Thank you', "That's helpful", "Thank's a lot!"], 'responses': ['Happy to help!', 'Any time!', 'My pleasure']}, {'tag': 'items', 'patterns': ['Which items do you have?', 'What kinds of items are there?', 'What do you sell?'], 'responses': ['We sell coffee and tea', 'We have coffee and tea']}, {'tag': 'payments', 'patterns': ['Do you take credit cards?', 'Do you accept Mastercard?', 'Can I pay with Paypal?', 'Are you cash only?'], 'responses': ['We accept VISA, Mastercard and Paypal', 'We accept most major credit cards, and Paypal']}, {'tag': 'd

Creating a list of all words, a list of tags, and a list of (tokenized pattern, tag) pairs

In [26]:
all_words = []
tags = []
xy = []
# loop through each sentence in our intents patterns
for intent in intents['intents']:
    tag = intent['tag']
    # add to tag list
    tags.append(tag)
    for pattern in intent['patterns']:
        # tokenize each word in the sentence
        w = nltk.word_tokenize(pattern)
        # add to our words list
        all_words.extend(w)
        # add to xy pair
        xy.append((w, tag))

Removing punctuation, then stemming, and keeping only the unique entries of all_words and tags

In [27]:
# stem and lower each word
ignore_words = ['?', '.', '!']
all_words = [stem(w) for w in all_words if w not in ignore_words]
# remove duplicates and sort
all_words = sorted(set(all_words))
tags = sorted(set(tags))
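The effect of this cleaning step can be seen on a tiny token list. The tokens below are made up for illustration, and lowercasing stands in for `stem` so that the sketch does not need NLTK.

```python
tokens = ["Hi", "How", "are", "you", "?", "Hi", "Thank", "you", "!"]
ignore_words = ["?", ".", "!"]

# Assumption: lowercasing stands in for stem() in this sketch.
cleaned = [t.lower() for t in tokens if t not in ignore_words]

# set() removes duplicates; sorted() gives a stable, reproducible order.
vocabulary = sorted(set(cleaned))
print(vocabulary)  # ['are', 'hi', 'how', 'thank', 'you']
```

A fixed order matters because each position of the bag-of-words vector is tied to one entry of this list.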

In [28]:
print(len(xy), "patterns")
print(len(tags), "tags:", tags)
print(len(all_words), "unique stemmed words:", all_words)

26 patterns
7 tags: ['delivery', 'funny', 'goodbye', 'greeting', 'items', 'payments', 'thanks']
54 unique stemmed words: ["'s", 'a', 'accept', 'anyon', 'are', 'bye', 'can', 'card', 'cash', 'credit', 'day', 'deliveri', 'do', 'doe', 'funni', 'get', 'good', 'goodby', 'have', 'hello', 'help', 'hey', 'hi', 'how', 'i', 'is', 'item', 'joke', 'kind', 'know', 'later', 'long', 'lot', 'mastercard', 'me', 'my', 'of', 'onli', 'pay', 'paypal', 'see', 'sell', 'ship', 'someth', 'take', 'tell', 'thank', 'that', 'there', 'what', 'when', 'which', 'with', 'you']


Creating Training Data

Training data must be numeric vectors rather than words, so each tokenized pattern is converted to a bag-of-words vector over the all_words list, and each tag is converted to a number using its index in the tags list.

In [31]:
X_train = []
y_train = []
for (pattern_sentence, tag) in xy:
    # X: bag of words for each pattern_sentence
    bag = bag_of_words(pattern_sentence, all_words)
    X_train.append(bag)
    # y: PyTorch CrossEntropyLoss needs only class labels, not one-hot
    label = tags.index(tag)
    y_train.append(label)

X_train = np.array(X_train)
y_train = np.array(y_train)
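The label encoding is just a lookup of each tag's position in the sorted tag list. In the sketch below the `tags` list is the one printed earlier in this notebook, while the three `pairs` are hypothetical examples.

```python
tags = ['delivery', 'funny', 'goodbye', 'greeting', 'items', 'payments', 'thanks']

# Hypothetical (tokenized pattern, tag) pairs in the same form as xy.
pairs = [(["hi"], "greeting"), (["bye"], "goodbye"), (["thank", "you"], "thanks")]

# Each tag becomes its index in the tags list.
labels = [tags.index(tag) for _, tag in pairs]
print(labels)  # [3, 2, 6]
```

These integer class labels are what the loss function expects, so no one-hot encoding is needed.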

Defining all the Hyper Parameters

In [32]:
# Hyper-parameters 
num_epochs = 1000
batch_size = 8
learning_rate = 0.001
input_size = len(X_train[0])
hidden_size = 8
output_size = len(tags)
print(input_size, output_size)

54 7


Dataset and DataLoader

PyTorch provides many built-in datasets.
To create a custom dataset, subclass Dataset and define 3 functions. </br>
The `__init__` function loads the dataset; this is where we define the features and labels. </br>
The `__getitem__` function returns one sample from the dataset for a given index. </br>
The `__len__` function returns the number of samples in the dataset. </br>
DataLoader is then used to iterate over the dataset in batches.

In [33]:
class ChatDataset(Dataset):

    def __init__(self):
        self.n_samples = len(X_train)
        self.x_data = X_train
        self.y_data = y_train

    # support indexing such that dataset[i] can be used to get i-th sample
    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    # we can call len(dataset) to return the size
    def __len__(self):
        return self.n_samples

In [37]:
dataset = ChatDataset()
train_loader = DataLoader(dataset=dataset,
                          batch_size=batch_size,
                          shuffle=True,
                          num_workers=0)
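To see how the DataLoader splits the data, the sketch below uses a hypothetical stand-in dataset with the same dimensions as this notebook's training data (26 samples, 54 features) but filled with zeros. Shuffling is turned off only to keep the output deterministic.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class DummyChatDataset(Dataset):
    # Hypothetical stand-in for ChatDataset: 26 all-zero samples, 54 features.
    def __init__(self):
        self.x_data = np.zeros((26, 54), dtype=np.float32)
        self.y_data = np.zeros(26, dtype=np.int64)

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return len(self.x_data)

loader = DataLoader(DummyChatDataset(), batch_size=8, shuffle=False)

# The loader stacks samples into tensors; the last batch holds the remainder.
batch_sizes = [features.shape[0] for features, _ in loader]
print(batch_sizes)  # [8, 8, 8, 2]
```

With 26 patterns and a batch size of 8, every epoch therefore performs 4 weight updates.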

We create an instance of the neural network defined above and move it to the selected device.

In [39]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = NeuralNet(input_size, hidden_size, output_size).to(device)

CrossEntropyLoss already applies softmax (as LogSoftmax) followed by Negative Log Likelihood Loss, so the model must output raw scores. </br>
The loss measures how far the predictions are from the true labels; backpropagation computes its gradient with respect to the weights. A lower loss means the model fits the training data better. </br>
The optimizer then uses these gradients to update the weights.

In [40]:
# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
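To make the "softmax plus negative log likelihood" description concrete, here is a hand computation for a single sample using only the standard library. The function name `cross_entropy_by_hand` and the example scores are assumptions for illustration; PyTorch's `nn.CrossEntropyLoss` additionally averages over the batch by default.

```python
import math

def cross_entropy_by_hand(logits, target):
    # Log-softmax, subtracting the max first for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_sum for z in logits]
    # Negative log likelihood of the correct class.
    return -log_probs[target]

# Raw scores for 3 classes; class 0 is the correct one in both cases.
confident = cross_entropy_by_hand([5.0, 0.0, 0.0], target=0)
uncertain = cross_entropy_by_hand([0.0, 0.0, 0.0], target=0)
print(confident < uncertain)  # True
```

When all scores are equal the loss is log(3), the value for a uniform guess over 3 classes; it approaches 0 as the score of the correct class pulls ahead of the others.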

The training loop consists of 3 parts: </br>
    Forward pass: compute the prediction and the loss </br>
    Backward pass: compute the gradients </br>
    Update the weights

In [41]:
# Train the model
for epoch in range(num_epochs):
    for (words, labels) in train_loader:
        words = words.to(device)
        labels = labels.to(dtype=torch.long).to(device)
        
        # Forward pass
        outputs = model(words)
        # if y were one-hot encoded, we would first have to apply
        # labels = torch.max(labels, 1)[1]
        loss = criterion(outputs, labels)
        
        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
    if (epoch+1) % 100 == 0:
        print (f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')


print(f'final loss: {loss.item():.4f}')


Epoch [100/1000], Loss: 0.9669
Epoch [200/1000], Loss: 0.2731
Epoch [300/1000], Loss: 0.0331
Epoch [400/1000], Loss: 0.0398
Epoch [500/1000], Loss: 0.0102
Epoch [600/1000], Loss: 0.0123
Epoch [700/1000], Loss: 0.0018
Epoch [800/1000], Loss: 0.0022
Epoch [900/1000], Loss: 0.0016
Epoch [1000/1000], Loss: 0.0006
final loss: 0.0006


Saving the model

In [42]:
data = {
"model_state": model.state_dict(),
"input_size": input_size,
"hidden_size": hidden_size,
"output_size": output_size,
"all_words": all_words,
"tags": tags
}

FILE = "data.pth"
torch.save(data, FILE)

print(f'training complete. file saved to {FILE}')

training complete. file saved to data.pth


In [47]:
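# Reload the saved checkpoint so the chat loop uses the weights stored in data.pth
data = torch.load(FILE)
model.load_state_dict(data["model_state"])
model.eval()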

NeuralNet(
  (l1): Linear(in_features=54, out_features=8, bias=True)
  (l2): Linear(in_features=8, out_features=8, bias=True)
  (l3): Linear(in_features=8, out_features=7, bias=True)
  (relu): ReLU()
)

The bot computes a probability for each tag defined in the intents JSON file. </br>
It picks the most probable tag and, if that probability is above 0.75, gives a random response from that tag.  </br>
Otherwise, the bot says it does not understand.  </br>
To stop chatting, type quit.  </br>

In [48]:
bot_name = "Sam"
print("Let's chat! (type 'quit' to exit)")
while True:
    # sentence = "do you use credit cards?"
    sentence = input("You: ")
    if sentence == "quit":
        break

    sentence = tokenize(sentence)
    X = bag_of_words(sentence, all_words)
    X = X.reshape(1, X.shape[0])
    X = torch.from_numpy(X).to(device)

    output = model(X)
    _, predicted = torch.max(output, dim=1)

    tag = tags[predicted.item()]

    probs = torch.softmax(output, dim=1)
    prob = probs[0][predicted.item()]
    if prob.item() > 0.75:
        for intent in intents['intents']:
            if tag == intent["tag"]:
                print(f"{bot_name}: {random.choice(intent['responses'])}")
    else:
        print(f"{bot_name}: I do not understand...")

Let's chat! (type 'quit' to exit)
You: what is your name?
Sam: See you later, thanks for visiting
You: tell me a joke?
Sam: What did the buffalo say when his son left for college? Bison.
You: hii
Sam: Hey :-)
You: whats up?
Sam: We sell coffee and tea
You: quit
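The confidence check in the chat loop can be sketched without PyTorch. The helper names, the shortened tag list, and the example scores below are all hypothetical; only the 0.75 threshold comes from the notebook.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick_tag(logits, tags, threshold=0.75):
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Return None when the model is not confident enough.
    return tags[best] if probs[best] > threshold else None

tags = ["goodbye", "greeting", "items"]  # shortened, hypothetical tag list
print(pick_tag([0.1, 4.0, 0.2], tags))  # greeting
print(pick_tag([1.0, 1.2, 0.9], tags))  # None
```

Softmax only compares the scores with each other, so an input unlike any training pattern can still get a high probability for one tag. That is consistent with the confident but wrong answers in the transcript above.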


6. Datasets & DataLoaders — PyTorch Tutorials 2.0.0+cu117 documentation. (n.d.). Datasets & DataLoaders — PyTorch Tutorials 2.0.0+cu117 Documentation. https://pytorch.org/tutorials/beginner/basics/data_tutorial.html </br>