# <b>CHAT-BOT- Seq2Seq Model</b>
A chatbot is a software application used to conduct an on-line chat conversation via text or text-to-speech, in lieu of providing direct contact with a live human agent. Designed to convincingly simulate the way a human would behave as a conversational partner, chatbot systems typically require continuous tuning and testing, and many in production remain unable to adequately converse or pass the industry standard Turing test. The term "ChatterBot" was originally coined by Michael Mauldin (creator of the first Verbot) in 1994 to describe these conversational programs. 
![alt text](https://static.vecteezy.com/system/resources/previews/000/343/481/non_2x/chatbot-write-answer-to-messages-in-the-chat-bot-consultant-is-free-to-help-users-in-your-phone-online-vector-cartoon-illustration.jpg)

The foundation of a chatbot is giving the best response to any query it receives. A good response answers the sender's question, provides relevant information, asks follow-up questions where needed, and keeps the conversation realistic.
The chatbot needs to understand the intention behind the sender’s message, determine what type of response (a follow-up question, a direct response, etc.) is required, and follow correct grammatical and lexical rules while forming the response.<br>
<br>
<b>What is Seq2Seq Model?</b><br>
Seq2Seq is a machine learning architecture based on the encoder-decoder paradigm. It is widely used for tasks such as translation, Q&A and other cases where it is desirable to produce a sequence from another. The main idea is to have one model, for example an RNN, which can create a good representation of the input sequence. We will refer to this model as the ‘encoder’. Using this representation, another model, the ‘decoder’, produces the expected output sequence.

## <b>About the Data</b>
### <b>Cornell Movie--Dialogs Corpus</b>
#### <b>Description</b>
This corpus contains a large metadata-rich collection of fictional conversations extracted from raw movie scripts:

- 220,579 conversational exchanges between 10,292 pairs of movie characters

- involves 9,035 characters from 617 movies

- in total 304,713 utterances

In all files the field separator is " +++$+++ ".<br>

#### <b>movie_lines.txt</b>
- contains the actual text of each utterance
- fields:
  - lineID
  - characterID (who uttered this phrase)
  - movieID
  - character name
  - text of the utterance

#### <b>movie_conversations.txt</b>
- the structure of the conversations
- fields
  - characterID of the first character involved in the conversation
  - characterID of the second character involved in the conversation
  - movieID of the movie in which the conversation occurred
  - list of the utterances that make up the conversation, in chronological order: ['lineID1','lineID2',...,'lineIDN'] (has to be matched with movie_lines.txt to reconstruct the actual content)
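
The two record layouts described above can be parsed with a few lines of plain Python. This is only a sketch: the helper names `parse_movie_line` and `parse_conversation` are made up for illustration and are not used elsewhere in the notebook, and it assumes every record contains all of its fields.

```python
import ast

# Field separator documented for every file in the corpus
SEP = " +++$+++ "

def parse_movie_line(raw):
    """Split one movie_lines.txt record into its five fields."""
    line_id, character_id, movie_id, name, text = raw.split(SEP, 4)
    return {"lineID": line_id, "characterID": character_id,
            "movieID": movie_id, "character": name, "text": text}

def parse_conversation(raw):
    """Split one movie_conversations.txt record into its four fields."""
    first, second, movie_id, line_ids = raw.split(SEP, 3)
    # The last field is written as a Python-style list literal
    return {"first": first, "second": second,
            "movieID": movie_id, "lineIDs": ast.literal_eval(line_ids)}

record = parse_movie_line("L1045 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ They do not!")
conv = parse_conversation("u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L194', 'L195', 'L196', 'L197']")
print(record["text"])   # They do not!
print(conv["lineIDs"])  # ['L194', 'L195', 'L196', 'L197']
```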

In [1]:
### Importing some libraries
import numpy as np
import pandas as pd

In [2]:
### Reading the data as list
lines = open('/content/drive/My Drive/Chat-Bot Project/movie_lines.txt', encoding='utf-8', errors='ignore').read().split('\n')
lines[:10]

['L1045 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ They do not!',
 'L1044 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ They do to!',
 'L985 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ I hope so.',
 'L984 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ She okay?',
 "L925 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ Let's go.",
 'L924 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ Wow',
 "L872 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ Okay -- you're gonna need to learn how to lie.",
 'L871 +++$+++ u2 +++$+++ m0 +++$+++ CAMERON +++$+++ No',
 'L870 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ I\'m kidding.  You know how sometimes you just become this "persona"?  And you don\'t know how to quit?',
 'L869 +++$+++ u0 +++$+++ m0 +++$+++ BIANCA +++$+++ Like my fear of wearing pastels?']

In [3]:
conversation = open('/content/drive/My Drive/Chat-Bot Project/movie_conversations.txt', encoding='utf-8', errors='ignore').read().split('\n')
conversation[:10]

["u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L194', 'L195', 'L196', 'L197']",
 "u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L198', 'L199']",
 "u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L200', 'L201', 'L202', 'L203']",
 "u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L204', 'L205', 'L206']",
 "u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L207', 'L208']",
 "u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L271', 'L272', 'L273', 'L274', 'L275']",
 "u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L276', 'L277']",
 "u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L280', 'L281']",
 "u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L363', 'L364']",
 "u0 +++$+++ u2 +++$+++ m0 +++$+++ ['L365', 'L366']"]

### <b>Data Preprocessing</b>
In the preprocessing stage we first strip the unnecessary ids and separators from the data.
<br>
<b>Step 1: Data Retrieving</b><br>
First we build a list of dialogue exchanges, where each exchange is the list of line ids belonging to one conversation.
<br>
After retrieving the exchange id lists from the conversation list, I create a dictionary that maps each line id to its dialogue text.

In [4]:
exchange_list= []
for conver in conversation:
  exchange_list.append(conver.split(' +++$+++ ')[-1][1:-1].replace("'", "").replace(",","").split())
exchange_list[:10]

[['L194', 'L195', 'L196', 'L197'],
 ['L198', 'L199'],
 ['L200', 'L201', 'L202', 'L203'],
 ['L204', 'L205', 'L206'],
 ['L207', 'L208'],
 ['L271', 'L272', 'L273', 'L274', 'L275'],
 ['L276', 'L277'],
 ['L280', 'L281'],
 ['L363', 'L364'],
 ['L365', 'L366']]

In [5]:
### Creating a dictionary with key as id and dialogues as their values
dialogues = {}
for line in lines:
  dialogues[line.split(' +++$+++ ')[0]]= line.split(' +++$+++ ')[-1]

val_cnt=0
for ind in dialogues:
  print('{}: "{}"'.format(ind,dialogues[ind]))
  val_cnt+=1
  if val_cnt==20:
    break

L1045: "They do not!"
L1044: "They do to!"
L985: "I hope so."
L984: "She okay?"
L925: "Let's go."
L924: "Wow"
L872: "Okay -- you're gonna need to learn how to lie."
L871: "No"
L870: "I'm kidding.  You know how sometimes you just become this "persona"?  And you don't know how to quit?"
L869: "Like my fear of wearing pastels?"
L868: "The "real you"."
L867: "What good stuff?"
L866: "I figured you'd get to the good stuff eventually."
L865: "Thank God!  If I had to hear one more story about your coiffure..."
L864: "Me.  This endless ...blonde babble. I'm like, boring myself."
L863: "What crap?"
L862: "do you listen to this crap?"
L861: "No..."
L860: "Then Guillermo says, "If you go any lighter, you're gonna look like an extra on 90210.""
L699: "You always been this selfish?"


<b>Step 2: Data Reshaping</b><br>
In this step we create two separate lists, one for questions and the other for answers. The `questions` and `answers` lists are formed as follows:<br>
The first dialogue of a conversation acts as the question, and the reply that follows it is the answer. That reply in turn acts as the question for the next dialogue, spoken by the other person or by the first person again. This is how a conversation between two or more people works, and the cycle goes on until the conversation ends.
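
As a small illustration of this pairing rule, the sketch below turns one conversation into (question, answer) pairs. The helper `make_pairs` and the toy conversation are assumptions made for the example; a conversation of n utterances yields n - 1 pairs.

```python
def make_pairs(utterances):
    """Pair each utterance with the one that immediately follows it."""
    return list(zip(utterances[:-1], utterances[1:]))

toy_conversation = ["Why?", "Unsolved mystery.", "That's a shame."]
pairs = make_pairs(toy_conversation)
for question, answer in pairs:
    print(repr(question), "->", repr(answer))
```

Note that the middle utterance appears twice: once as an answer and once as the next question.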

In [6]:
### Creating list of questions and answers
questions = []
answers = []
for conver in exchange_list:
  for i in range(len(conver)-1):
      questions.append(dialogues[conver[i]])
      answers.append(dialogues[conver[i+1]])

print(np.shape(questions))
print(np.shape(answers))
questions[:10]

(221616,)
(221616,)


['Can we make this quick?  Roxanne Korrine and Andrew Barrett are having an incredibly horrendous public break- up on the quad.  Again.',
 "Well, I thought we'd start with pronunciation, if that's okay with you.",
 'Not the hacking and gagging and spitting part.  Please.',
 "You're asking me out.  That's so cute. What's your name again?",
 "No, no, it's my fault -- we didn't have a proper introduction ---",
 'Cameron.',
 "The thing is, Cameron -- I'm at the mercy of a particularly hideous breed of loser.  My sister.  I can't date until she does.",
 'Why?',
 'Unsolved mystery.  She used to be really popular when she started high school, then it was just like she got sick of it or something.',
 'Gosh, if only we could find Kat a boyfriend...']

In [7]:
answers[:10]

["Well, I thought we'd start with pronunciation, if that's okay with you.",
 'Not the hacking and gagging and spitting part.  Please.',
 "Okay... then how 'bout we try out some French cuisine.  Saturday?  Night?",
 'Forget it.',
 'Cameron.',
 "The thing is, Cameron -- I'm at the mercy of a particularly hideous breed of loser.  My sister.  I can't date until she does.",
 'Seems like she could get a date easy enough...',
 'Unsolved mystery.  She used to be really popular when she started high school, then it was just like she got sick of it or something.',
 "That's a shame.",
 'Let me see what I can do.']

<b>Step 3: Cutting down data size</b><br>
Because of computational limitations, we need to cut down the data based on the length of the question. We will consider only those questions whose length is below 13 characters (the code applies `len()` to the question string, which counts characters rather than words). I know that is pretty short, but I don't have enough RAM to process such a huge amount of data all at once.<br>
From here on we will use only those conversations (`sorted_ques` and `sorted_ans`) whose question is shorter than 13 characters.
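
The difference between a character-based and a word-based length filter is easy to see on a few sample strings. The sketch below is only an illustration; the `samples` list is made up, and the word-based variant is shown for comparison, not because the notebook uses it.

```python
samples = ["Why?", "Sure have.", "You always been this selfish?"]

# len() on a string counts characters
by_chars = [s for s in samples if len(s) < 13]

# Splitting on whitespace first counts words instead
by_words = [s for s in samples if len(s.split()) < 13]

print(by_chars)  # ['Why?', 'Sure have.']
print(by_words)  # all three samples pass
```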


In [8]:
### Creating a list of sorted questions and answers (questions with length less than 13)
sorted_ques=[]
sorted_ans=[]
for i in range(len(questions)):
  if len(questions[i])<13:
    sorted_ques.append(questions[i])
    sorted_ans.append(answers[i])
print("Size of Sorted questions: {}".format(np.size(sorted_ques)))
print("Size of Sorted answers: {}".format(np.size(sorted_ans)))
sorted_ques[:10]

Size of Sorted questions: 31416
Size of Sorted answers: 31416


['Cameron.',
 'Why?',
 'There.',
 'Sure have.',
 'Hi.',
 'I was?',
 'Well, no...',
 'But',
 'What crap?',
 'No']

<b>Step 4: Data Cleaning</b><br>
A regular expression, often abbreviated to regex, is a sequence of characters that defines a search pattern for matching strings, as used in “find and replace”-like operations.<br>
This step is required because English has many words that are commonly written in a contracted form, for example:<br>
He's for He is<br>
'd for would<br>
'll for will<br>...etc.<br>

So here we will expand all such contractions into their standard forms using the <b>"Regular Expression (re)"</b> library.
We will also convert all uppercase letters to lowercase, and then remove all punctuation marks from our data.<br>
If we don't expand these contractions, our model will treat the contracted and the full form as two different words, which is not desirable. Punctuation marks are removed because our bot won't make use of them.<br>




In [9]:
# Importing Regular Expression Library(re)
import re
# Creating a function to clean the text (removing punctuation marks)
def clean_text(txt):
  txt = txt.lower()
  txt = re.sub(r"i'm", "i am", txt)
  txt = re.sub(r"he's", "he is", txt)
  txt = re.sub(r"she's", "she is",txt)
  txt = re.sub(r"that's", "that is", txt)
  txt = re.sub(r"what's", "what is", txt)
  txt = re.sub(r"where's", "where is", txt)
  txt = re.sub(r"\'ll", " will", txt)
  txt = re.sub(r"\'ve", " have", txt)
  txt = re.sub(r"\'re", " are", txt)
  txt = re.sub(r"\'d", " would", txt)
  txt = re.sub(r"won't", "will not", txt)
  txt = re.sub(r"can't", "can not", txt)
  txt = re.sub(r"[^\w\s]", "", txt)
  return txt
  

In [10]:
# Creating a list of clean questions and clean answers
clean_ques=[]
clean_ans=[]
for line in sorted_ques:
  clean_ques.append(clean_text(line))
for line in sorted_ans:
  clean_ans.append(clean_text(line))
clean_ques[:10]

['cameron',
 'why',
 'there',
 'sure have',
 'hi',
 'i was',
 'well no',
 'but',
 'what crap',
 'no']
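
One thing worth knowing about this kind of cleaning: any contraction without its own rule simply loses its apostrophe when punctuation is stripped. The sketch below shows the effect on a made-up sentence. `strip_punctuation`, `expand_then_strip` and the `extra_rules` list are hypothetical helpers for illustration and are not part of `clean_text`.

```python
import re

def strip_punctuation(text):
    """Lowercase the text and drop everything except word characters and whitespace."""
    return re.sub(r"[^\w\s]", "", text.lower())

def expand_then_strip(text, rules):
    """Apply (pattern, replacement) rules in order, then strip punctuation."""
    text = text.lower()
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return re.sub(r"[^\w\s]", "", text)

# Hypothetical extra rules, shown only to illustrate the effect
extra_rules = [(r"it's", "it is"), (r"don't", "do not")]

sentence = "It's fine, don't worry."
without_rules = strip_punctuation(sentence)
with_rules = expand_then_strip(sentence, extra_rules)
print(without_rules)  # its fine dont worry
print(with_rules)     # it is fine do not worry
```

Without the extra rules, "its" and "dont" end up as separate vocabulary entries from "it is" and "do not".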

Along with the questions, we also cut down the length of the answers, for the same reason and for faster computation. Each answer is truncated to its first 15 words.

In [11]:
# Cutting down long size answers
for i in range(len(clean_ans)):
  clean_ans[i]= ' '.join(clean_ans[i].split()[:15])
clean_ans[:20]

['the thing is cameron i am at the mercy of a particularly hideous breed of',
 'unsolved mystery she used to be really popular when she started high school then it',
 'where',
 'i really really really wanna go but i can not not unless my sister goes',
 'looks like things worked out tonight huh',
 'you never wanted to go out with me did you',
 'then that is all you had to say',
 'you always been this selfish',
 'me this endless blonde babble i am like boring myself',
 'okay you are gonna need to learn how to lie',
 'lets go',
 'i hope so',
 'they do not',
 'you might wanna think about it',
 'joey',
 'would you mind getting me a drink cameron',
 'expensive',
 'its a gay cruise line but i will be like wearing a uniform and stuff',
 'my agent says i have got a good shot at being the prada guy next',
 'you are concentrating awfully hard considering its gym class']

There are several words in our dialogue lists (`clean_ques` and `clean_ans`) that do not appear very frequently, such as the names of places, people or things. We need to remove all words whose frequency is lower than some threshold value, because they won't help our model generalise.<br>
Here we can see that there are 17713 different words in our cleaned dialogue lists, which reduces to 3467 after removing the words that occur fewer than 5 times.

In [12]:
### Creating a word count dictionary for the counts of different words
word_count ={}
for line in clean_ques:
  for word in line.split():
    if word not in word_count:
      word_count[word]=1
    else:
      word_count[word]+=1

for line in clean_ans:
  for word in line.split():
    if word not in word_count:
      word_count[word]=1
    else:
      word_count[word]+=1

print(len(word_count))


17713


In [13]:
# Removing words whose count is below threshold value
threshold= 5
# Creating a new word wise dictionary/vocabulary to store filtered data
vocab = {}
cnt = 0
for word,count in word_count.items():
  if count>=threshold:
    vocab[word]=cnt
    cnt+=1
print(len(vocab))

3467
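
The counting and thresholding steps above can also be written more compactly with `collections.Counter`. This is a sketch of the same idea on toy data, not a replacement for the cells above; `build_vocab` and `toy_sentences` are names introduced just for the example.

```python
from collections import Counter

def build_vocab(sentences, threshold):
    """Map every word seen at least `threshold` times to a consecutive index."""
    counts = Counter(word for sentence in sentences for word in sentence.split())
    kept = [word for word, count in counts.items() if count >= threshold]
    return {word: index for index, word in enumerate(kept)}

toy_sentences = ["i am here", "i am not", "where am i"]
toy_vocab = build_vocab(toy_sentences, threshold=2)
print(toy_vocab)  # {'i': 0, 'am': 1}
```

Words below the threshold ("here", "not", "where") are dropped and would later be encoded as the out-of-vocabulary token.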


<b>Step 5: Handling the possible variation between the input and output sequence sizes</b><br>
One issue to be confronted when choosing a seq2seq approach is the possible variation between the input and output sequence sizes. To handle that, we can introduce the SOS (Start Of Sequence) and EOS (End Of Sequence) tokens. By adding the EOS token at the end of each input we provide a consistent signal, facilitating the system’s capacity of learning how to finish the creation of the new sequence. We can then ask the decoder to give as many tokens as it wants until it raises the EOS token to signal the end of the output sequence.

In [14]:
# Adding <SOS> and <EOS> at beginning and end of string
for i in range(len(clean_ans)):
  clean_ans[i]= '<SOS> ' + clean_ans[i] + ' <EOS>'
clean_ans[:10]

['<SOS> the thing is cameron i am at the mercy of a particularly hideous breed of <EOS>',
 '<SOS> unsolved mystery she used to be really popular when she started high school then it <EOS>',
 '<SOS> where <EOS>',
 '<SOS> i really really really wanna go but i can not not unless my sister goes <EOS>',
 '<SOS> looks like things worked out tonight huh <EOS>',
 '<SOS> you never wanted to go out with me did you <EOS>',
 '<SOS> then that is all you had to say <EOS>',
 '<SOS> you always been this selfish <EOS>',
 '<SOS> me this endless blonde babble i am like boring myself <EOS>',
 '<SOS> okay you are gonna need to learn how to lie <EOS>']

We also need to add tokens for `<SOS>` and `<EOS>` to our vocabulary. Along with these two we add two more tokens, `<PAD>` and `<OUT>`. The `<PAD>` token represents the padding added to each sequence (applied further below). The `<OUT>` token represents words (for example, entered by the user) that are not available in our vocabulary.

In [15]:
### Creating other necessary tokens
tokens = ['<PAD>', '<EOS>', '<OUT>', '<SOS>']
x = len(vocab)
for token in tokens:
  vocab[token]=x
  x+=1
  

In [16]:

### Setting index of <PAD> token as 0 (the word 'cameron' held index 0, so it takes over <PAD>'s old index)
vocab['cameron']= vocab['<PAD>']
vocab['<PAD>']=0

In [17]:
# Inverting the dictionary (index -> word)
inv_vocab = {index: word for word, index in vocab.items()}
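
To make the role of the special tokens concrete, here is a minimal round trip through a toy vocabulary. The dictionary contents and the helpers `encode` and `decode` are assumptions for illustration; the real `vocab` built above is much larger and uses different indices.

```python
toy_vocab = {"<PAD>": 0, "hi": 1, "there": 2, "<EOS>": 3, "<OUT>": 4, "<SOS>": 5}
toy_inv = {index: word for word, index in toy_vocab.items()}

def encode(sentence, vocab):
    """Turn words into indices, mapping unknown words to <OUT>."""
    return [vocab.get(word, vocab["<OUT>"]) for word in sentence.split()]

def decode(ids, inv_vocab):
    """Turn indices back into a space-separated string."""
    return " ".join(inv_vocab[i] for i in ids)

ids = encode("<SOS> hi there stranger <EOS>", toy_vocab)
print(ids)                   # [5, 1, 2, 4, 3]
print(decode(ids, toy_inv))  # <SOS> hi there <OUT> <EOS>
```

The unknown word "stranger" comes back as `<OUT>`, which is why that token can show up in the chatbot's replies.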

### <b>Creating Encoder and Decoder Input for Training </b>
The encoder-decoder model is a way of using recurrent neural networks for sequence-to-sequence prediction problems.<br>
The most basic sequence-to-sequence (encoder-decoder) structure in common use is shown below:
![alt text](https://miro.medium.com/max/1250/1*zq1G3mPSuy-KoMlBldohww.png)
It consists of 3 parts:<b> encoder, intermediate vector</b> and <b>decoder.</b><br>
<br>
<b>Encoder-</b> It accepts a single element of the input sequence at each time step, processes it, collects information for that element and propagates it forward.<br>
<b>Intermediate vector-</b> This is the final internal state produced from the encoder part of the model. It contains information about the entire input sequence to help the decoder make accurate predictions.<br>
<b>Decoder-</b> Given the intermediate vector, which summarises the entire input sentence, it predicts an output at each time step.

In [18]:
### Creating encoder input to feed the model
encoder_inp = []
for line in clean_ques:
  lst=[]
  for word in line.split():
    if word not in vocab:
      lst.append(vocab['<OUT>'])
    else:
      lst.append(vocab[word])
  encoder_inp.append(lst)
encoder_inp[:10]

[[3467], [1], [2], [3, 4], [5], [6, 7], [8, 9], [10], [11, 12], [9]]

In [19]:
### Creating decoder input for answers
decoder_inp = []
for line in clean_ans:
  lst=[]
  for word in line.split():
    if word not in vocab:
      lst.append(vocab['<OUT>'])
    else:
      lst.append(vocab[word])
  decoder_inp.append(lst)

decoder_inp[5:10]

[[3470, 47, 33, 1175, 18, 31, 35, 209, 50, 111, 47, 3468],
 [3470, 85, 27, 28, 147, 47, 783, 18, 74, 3468],
 [3470, 47, 835, 1300, 46, 1638, 3468],
 [3470, 50, 46, 3469, 1581, 3469, 6, 40, 370, 1462, 516, 3468],
 [3470, 15, 47, 83, 2225, 1377, 18, 1943, 60, 18, 804, 3468]]

#### <b>Padding</b>
What does this mean? It means that we pick a target length (usually the length of the longest sentence), convert every sentence to a vector of that length, and fill the gap between the number of words in each sentence and the target length with zeros. In this notebook the target length is fixed at 13, and longer sequences are truncated.<br>
Why padding? Our model expects all the sentences in a batch, whether they are questions or answers, to have the same length.
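
The idea can be sketched in plain Python. The function `pad_post` below is a hypothetical stand-in that mimics post-padding and post-truncation on lists; it is not how the Keras utility is implemented.

```python
def pad_post(sequences, maxlen, value=0):
    """Cut each sequence to maxlen items and fill the rest with `value` at the end."""
    padded = []
    for seq in sequences:
        seq = list(seq)[:maxlen]
        padded.append(seq + [value] * (maxlen - len(seq)))
    return padded

padded = pad_post([[3, 4], [5], [1, 2, 3, 4, 5, 6]], maxlen=4)
print(padded)  # [[3, 4, 0, 0], [5, 0, 0, 0], [1, 2, 3, 4]]
```

The padding value 0 is the reason the `<PAD>` token was given index 0 earlier.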

In [20]:
from keras.preprocessing.sequence import pad_sequences
### Padding our encoder and decoder values
encoder_inp = pad_sequences(encoder_inp, 13, padding='post', truncating='post')
decoder_inp = pad_sequences(decoder_inp, 13, padding='post', truncating='post')

In [21]:
encoder_inp[:5]

array([[3467,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0],
       [   1,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0],
       [   2,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0],
       [   3,    4,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0],
       [   5,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0]], dtype=int32)

In [22]:
decoder_inp[:5]

array([[3470,   61,  538,   28, 3467,    6,   40,  285,   61, 3469,   69,
          88, 2405],
       [3470, 3469,  967,   14, 1077,   18,  230,  126, 2406,   71,   14,
        2407,  647],
       [3470,  152, 3468,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0],
       [3470,    6,  126,  126,  126, 1069,   31,   10,    6,  238,   29,
          29, 1413],
       [3470, 1551,  370,  911, 1018,   35,  106,   65, 3468,    0,    0,
           0,    0]], dtype=int32)

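We create a new array, `final_decoder`, to serve as the training target. Every row of `decoder_inp` begins with the same token (3470), which is the index of `<SOS>`. The target is `decoder_inp` shifted one step to the left with that leading `<SOS>` removed, so that at each time step the model learns to predict the next token; if `<SOS>` stayed in the target, the model would simply learn to predict it at the start of every reply.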
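
A tiny example makes the relationship between decoder input and target easier to see. The row below uses indices that appear in the outputs above (3470 for `<SOS>`, 3468 for `<EOS>`, 0 for `<PAD>`); the helper `shift_targets` is made up for this sketch.

```python
def shift_targets(decoder_rows, pad_id=0):
    """Drop the first token of each row and pad at the end to keep the length."""
    return [list(row[1:]) + [pad_id] for row in decoder_rows]

decoder_rows = [[3470, 152, 3468, 0, 0]]
targets = shift_targets(decoder_rows)
print(targets)  # [[152, 3468, 0, 0, 0]]

# At step t the decoder sees decoder_rows[0][t] and should predict targets[0][t]
for seen, expected in zip(decoder_rows[0], targets[0]):
    print(seen, "->", expected)
```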

In [23]:
final_decoder = []
for inp in decoder_inp:
  final_decoder.append(inp[1:])
final_decoder = pad_sequences(final_decoder, 13, padding='post', truncating='post')
final_decoder[:5]

array([[  61,  538,   28, 3467,    6,   40,  285,   61, 3469,   69,   88,
        2405,    0],
       [3469,  967,   14, 1077,   18,  230,  126, 2406,   71,   14, 2407,
         647,    0],
       [ 152, 3468,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0],
       [   6,  126,  126,  126, 1069,   31,   10,    6,  238,   29,   29,
        1413,    0],
       [1551,  370,  911, 1018,   35,  106,   65, 3468,    0,    0,    0,
           0,    0]], dtype=int32)

In [24]:
np.shape(final_decoder)

(31416, 13)

<b>Reshaping Decoder Input</b><br>
We need to convert our `final_decoder` data to 3D, because the `categorical_crossentropy` loss used below expects one-hot targets of shape (samples, time steps, vocabulary size).
Here we use the `to_categorical()` function for this purpose; it converts a class vector (integers) to a binary class matrix.

In [25]:
from tensorflow.keras.utils import to_categorical
final_decoder = to_categorical(final_decoder, len(vocab))
final_decoder.shape

(31416, 13, 3471)
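
One-hot encoding is simple to reproduce with NumPy, and doing the arithmetic on the shape above shows why memory is a concern. The sketch below is an illustration only: `one_hot` is a made-up helper, and the memory estimate assumes 4-byte floats.

```python
import numpy as np

def one_hot(ids, num_classes):
    """Replace each integer id with a row of zeros that has a single 1 at that id."""
    return np.eye(num_classes, dtype=np.float32)[ids]

toy_ids = np.array([[2, 0, 1]])
encoded = one_hot(toy_ids, num_classes=4)
print(encoded.shape)  # (1, 3, 4)
print(encoded[0, 0])  # [0. 0. 1. 0.]

# Rough size of the (31416, 13, 3471) target array, assuming 4 bytes per value
rows, steps, classes = 31416, 13, 3471
bytes_needed = rows * steps * classes * 4
print(bytes_needed / 1e9)  # about 5.67 (GB)
```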

## <b>Create Model</b>
Here we will be using a Recurrent Neural Network (RNN). The RNN used here is <b>Long Short-Term Memory (LSTM).</b>
<br>
Long Short Term Memory networks – usually just called “LSTMs” – are a special kind of RNN, capable of learning long-term dependencies.
<br>
LSTMs are explicitly designed to avoid the long-term dependency problem. Remembering information for long periods of time is practically their default behavior.

The Encoder loops over the sequence. At each step it takes the encoded representation of a token and passes it through the LSTM cell, which also receives the state representing the sentence so far. When the end of the sequence is reached, we collect the final state of our cell (cell_state, hidden_state).
![alt text](https://mk0caiblog1h3pefaf7c.kinstacdn.com/wp-content/uploads/2018/07/encoder.png)
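
For readers who want to see the loop spelled out, here is a minimal NumPy sketch of an LSTM encoder. It uses the standard LSTM equations with random weights and tiny dimensions; the names `lstm_step` and `run_encoder` are made up, and this is an illustration of the idea, not the Keras layer's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step; W, U and b hold the four gates stacked side by side."""
    z = x @ W + h @ U + b
    i, f, g, o = np.split(z, 4)  # input gate, forget gate, candidate, output gate
    c_next = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_next = sigmoid(o) * np.tanh(c_next)
    return h_next, c_next

def run_encoder(sequence, W, U, b, units):
    """Feed the embedded tokens one by one and return the final (h, c) state."""
    h, c = np.zeros(units), np.zeros(units)
    for x in sequence:
        h, c = lstm_step(x, h, c, W, U, b)
    return h, c

rng = np.random.default_rng(0)
embed_dim, units = 3, 4                    # toy sizes, far smaller than the model's
W = rng.normal(size=(embed_dim, 4 * units))
U = rng.normal(size=(units, 4 * units))
b = np.zeros(4 * units)
sequence = rng.normal(size=(5, embed_dim))  # five embedded tokens
h, c = run_encoder(sequence, W, U, b, units)
print(h.shape, c.shape)  # (4,) (4,)
```

The pair `(h, c)` returned at the end plays the role of the intermediate vector handed to the decoder.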
<br>
<br>
The Decoder shares the same basic architecture, with one added layer (here a perceptron layer) to predict the new token.
![alt text](https://mk0caiblog1h3pefaf7c.kinstacdn.com/wp-content/uploads/2018/07/decoder.png)

In [26]:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Embedding, LSTM, Input


In [27]:
### Creating input placeholders for our encoder and decoder
enc_inp = Input(shape=(13,))
dec_inp = Input(shape=(13,))


#### <b>Word Embedding</b>
Word embeddings are mathematical representations of words encoded as vectors in n-dimensional space. Similar words are close to each other in this space. This means that we can compare 2 or more words to each other not by e.g. the number of overlapping characters but by how close they are to each other in their embedded form.
<br>
<b>Why Word Embedding?</b><br>
Our goal in designing an intent detection system is to create a system that, given a few examples for an intent, can detect that a sentence given by the user is similar to these examples and therefore should have the same intent.
The problem behind this system is that we have to design a way of checking whether 2 sentences are similar. This could be achieved by e.g. counting how many overlapping words there are in the new sentence and the sentences in the training data set. This is however a naive approach, because a user can use a word that has a similar meaning but is different from the ones in the training examples.
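
Closeness between embedded words is commonly measured with cosine similarity. The sketch below uses three hand-made 3-dimensional vectors purely for illustration; they are not learned embeddings and the numbers are invented.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 means same direction, 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented vectors: "hello" and "hi" point in similar directions, "pastels" does not
toy_embeddings = {
    "hello": [0.9, 0.1, 0.0],
    "hi": [0.8, 0.2, 0.1],
    "pastels": [0.0, 0.1, 0.9],
}

close = cosine_similarity(toy_embeddings["hello"], toy_embeddings["hi"])
far = cosine_similarity(toy_embeddings["hello"], toy_embeddings["pastels"])
print(round(close, 2), round(far, 2))  # 0.98 0.01
```

Two words with no characters in common can still score as similar, which is exactly what a word-overlap count cannot capture.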


In [28]:
### Creating embedding layer
Vocab_Size=len(vocab)
embed = Embedding(Vocab_Size+1, output_dim=70, input_length=13, trainable=True)

In [29]:
### Embedding encoder input
enc_embed = embed(enc_inp)
### Creating object for encoder LSTM Layer
enc_lstm = LSTM(700, return_sequences=True, return_state=True)
enc_op, h,c = enc_lstm(enc_embed)
enc_states=[h,c]


In [30]:
### Embedding decoder input
dec_embed = embed(dec_inp)
### Creating object for decoder LSTM Layer
dec_lstm = LSTM(700, return_sequences=True, return_state=True)
dec_op, _, _ = dec_lstm(dec_embed, initial_state=enc_states)


#### <b>Dense Layer</b>
A Dense Layer is just a regular layer of neurons in a neural network. Each neuron receives input from all the neurons in the previous layer, and is thus densely connected. The layer has a weight matrix <b>W</b>, a bias vector <b>b</b>, and the activations of the previous layer <b>a</b>.
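
In code, a dense layer with a softmax activation computes `softmax(a @ W + b)`. The NumPy sketch below shows this on random toy values; `dense_softmax` is a made-up helper, and the sizes are far smaller than the 700 inputs and vocabulary-sized output used in this notebook.

```python
import numpy as np

def dense_softmax(a, W, b):
    """Affine transform followed by a softmax over the last axis."""
    z = a @ W + b
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 5))   # activations of the previous layer for 2 time steps
W = rng.normal(size=(5, 4))   # weight matrix: 5 inputs, 4 output classes
b = np.zeros(4)               # bias vector
probs = dense_softmax(a, W, b)
print(probs.shape)            # (2, 4)
print(probs.sum(axis=-1))     # each row sums to 1
```

Each output row can be read as a probability distribution over the vocabulary for one time step.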

In [31]:
### Creating dense layer
dense = Dense(Vocab_Size, activation='softmax')
dense_output = dense(dec_op)

In [32]:
### Creating and Compiling Model 
model = Model([enc_inp, dec_inp], dense_output)
model.compile(loss='categorical_crossentropy',metrics=['acc'], optimizer='adam')

In [33]:
Vocab_Size

3471

In [34]:
### Fitting the data to our model
model.fit([encoder_inp, decoder_inp], final_decoder, epochs=60)

Epoch 1/60
...
Epoch 60/60


<tensorflow.python.keras.callbacks.History at 0x7f79be9e1c88>

In [35]:
model.summary()

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            [(None, 13)]         0                                            
__________________________________________________________________________________________________
input_1 (InputLayer)            [(None, 13)]         0                                            
__________________________________________________________________________________________________
embedding (Embedding)           (None, 13, 70)       243040      input_1[0][0]                    
                                                                 input_2[0][0]                    
__________________________________________________________________________________________________
lstm (LSTM)                     [(None, 13, 700), (N 2158800     embedding[0][0]       
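
As a sanity check, the parameter counts printed in the summary can be reproduced by hand with the standard formulas for embedding and LSTM layers. The helper names below are made up for this sketch.

```python
def embedding_params(vocab_rows, dim):
    """One dim-sized vector per row of the embedding table."""
    return vocab_rows * dim

def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

# The embedding layer was created with Vocab_Size + 1 rows and output_dim=70
print(embedding_params(3471 + 1, 70))  # 243040
# The LSTM receives 70-dimensional embeddings and has 700 units
print(lstm_params(70, 700))            # 2158800
```

Both numbers match the `Param #` column above.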

In [42]:
import os

import tensorflow as tf
from tensorflow import keras
# Save the entire model as a SavedModel.
!mkdir -p saved_model
model.save('saved_model/my_model') 

INFO:tensorflow:Assets written to: saved_model/my_model/assets


In [None]:
# # my_model directory
# !ls saved_model

# # Contains an assets folder, saved_model.pb, and variables folder.
# !ls saved_model/my_model

In [65]:
import os

import tensorflow as tf
from tensorflow import keras

In [66]:
model = tf.keras.models.load_model('saved_model/my_model')

# Check its architecture
model.summary()

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            [(None, 13)]         0                                            
__________________________________________________________________________________________________
input_1 (InputLayer)            [(None, 13)]         0                                            
__________________________________________________________________________________________________
embedding (Embedding)           (None, 13, 70)       243040      input_1[0][0]                    
                                                                 input_2[0][0]                    
__________________________________________________________________________________________________
lstm (LSTM)                     [(None, 13, 700), (N 2158800     embedding[0][0]       

## <b>Inference</b>

In [36]:
### Creating encoder model 
enc_model = Model([enc_inp], enc_states)

In [37]:
### Creating decoder model
decoder_state_input_h = Input(shape=(700,))
decoder_state_input_c = Input(shape=(700,))
### Creating list of decoder h,c
decoder_state_input = [decoder_state_input_h, decoder_state_input_c]

#### <b>How does the decoder create new sentences?</b>
The encoder provides a representation of the whole sequence. One option is to initialize the decoder state with this representation, then just send an SOS token to start the generation of the new sequence.<br>
To give the decoder as much information as possible about its progress, we can pass the decoded token as the new input when generating the next one. We repeat the process until the decoder raises the EOS signal.<br>
![alt text](https://mk0caiblog1h3pefaf7c.kinstacdn.com/wp-content/uploads/2018/07/Seq2Seq_inference-1-1920x453.png)
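
The generation loop described above is often called greedy decoding: at every step the highest-scoring token is picked and fed back in. The sketch below shows the control flow with a hypothetical `toy_step` function standing in for the real decoder; nothing in it calls the trained model.

```python
def greedy_decode(step, state, sos_id, eos_id, max_len):
    """Start from SOS, feed each predicted token back in, stop at EOS or max_len."""
    token, output = sos_id, []
    for _ in range(max_len):
        scores, state = step(token, state)
        # Pick the index with the highest score (argmax)
        token = max(range(len(scores)), key=scores.__getitem__)
        if token == eos_id:
            break
        output.append(token)
    return output

def toy_step(token, state):
    """Stand-in decoder: always gives the highest score to token + 1."""
    scores = [0.0] * 5
    scores[(token + 1) % 5] = 1.0
    return scores, state

decoded = greedy_decode(toy_step, state=None, sos_id=0, eos_id=4, max_len=13)
print(decoded)  # [1, 2, 3]
```

The `max_len` guard matters in practice, since a trained decoder is not guaranteed to emit EOS.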


In [40]:
decoder_outputs, state_h, state_c = dec_lstm(dec_embed, initial_state=decoder_state_input)
decoder_states = [state_h, state_c]
decoder_model = Model([dec_inp]+decoder_state_input, [decoder_outputs]+decoder_states)
print("#######################################################################")
print("#                       Starting Chatbot ver 1.0                      #")
print("#######################################################################")

### Creating user input condition
prepro1= ""
while prepro1 != 'q':
  prepro1 = input("You: ")
  # Cleaning the text
  prepro1 = clean_text(prepro1)
  # Stop before encoding when the user asks to quit
  if prepro1 == 'q':
    break
  prepro = [prepro1]

  # Creating a list for processed text
  new_txt = []
  for x in prepro:
    new_list = []
    for y in x.split():
      # Map out-of-vocabulary words to the <OUT> token
      new_list.append(vocab.get(y, vocab['<OUT>']))
    new_txt.append(new_list)

  # Applying pad to our new_txt list
  new_txt = pad_sequences(new_txt, 13, padding='post')
  # Predicting the txt input
  stat = enc_model.predict(new_txt)

  empty_target_seq = np.zeros( (1, 1) )
  empty_target_seq[0,0] = vocab['<SOS>']

  stop_condition = False
  decoded_translation = ''

  while not stop_condition:
    decoder_outputs,h,c = decoder_model.predict([empty_target_seq]+stat)
    decoder_concat_inp = dense(decoder_outputs)

    sampled_word_index = np.argmax(decoder_concat_inp[0, -1, :])
    sampled_word = inv_vocab[sampled_word_index]+ ' '

    if sampled_word!= '<EOS> ':
      decoded_translation += sampled_word
    
    if sampled_word == '<EOS> ' or len(decoded_translation.split())>13:
      stop_condition=True
  
    ## Resetting the variables: feed the predicted token and the new states back into the decoder
    empty_target_seq = np.zeros( (1, 1) ) 
    empty_target_seq[0,0] = sampled_word_index
    stat = [h,c]
    # decoded_trans_out=''
    # for word in decoded_translation:
    #   if word!='<PAD>' or word!='<OUT>':
    #     decoded_trans_out+= word


  
  print("Chatbot Attention: ", decoded_translation)
  print("==============================================================")



#######################################################################
#                       Starting Chatbot ver 1.0                      #
#######################################################################
You: hello
Chatbot Attention:  <OUT> is that you 
You: yeah its me again
Chatbot Attention:  i will be in my quarters if needed but i would prefer <PAD> <PAD> 
You: ok
Chatbot Attention:  you dont 
You: ok
Chatbot Attention:  you dont 
You: are you a man
Chatbot Attention:  i dont think so 
You: then what are you
Chatbot Attention:  i am sorry 
You: why sorry
Chatbot Attention:  i am not allowed to date 
You: ooh are you committed?
Chatbot Attention:  well its more than we had ten minutes ago 
You: did you broke up
Chatbot Attention:  no this happened to me 
You: ohh thats amazing
Chatbot Attention:  no you wouldnt that is a good thing 
You: what really? is that good thing?
Chatbot Attention:  <OUT> 
You: tell me more about yourself
Chatbot Attention:  i would just as soon 