Use a Seq2Seq model to Train a ChatBot
Seq2seq was first introduced by Google for machine translation. As the name suggests, seq2seq takes a sequence of words (a sentence or several sentences) as input and generates an output sequence of words. It does so using a recurrent neural network (RNN). The vanilla RNN is rarely used in practice because it suffers from the vanishing gradient problem; its more advanced variants, LSTM and GRU, are used instead (this example uses a GRU). At each time step the network takes two inputs: the current word from the user and its own output from the previous step. This feedback of output into input is what makes the network recurrent and lets it build up the context of each word.
A seq2seq model has two main components, an encoder and a decoder, which is why it is sometimes called an Encoder-Decoder Network.
Encoder: It uses deep neural network layers to convert the input words into corresponding hidden vectors. Each vector represents the current word and its context.
Decoder: It is similar to the encoder. It takes as input the hidden vector generated by the encoder, its own hidden states, and the current word, and uses them to produce the next hidden vector and finally predict the next word.
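To make the recurrence concrete, here is a minimal sketch of a single GRU time step in plain Python. It is not the code used by this project (which relies on TensorFlow and TensorLayer); the parameter names (`Wz`, `Uz`, and so on), the toy sizes, and the random weights are assumptions made for illustration, and biases are left out for brevity.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def gru_step(x, h_prev, params):
    """One GRU time step: combine the current input x with the previous state h_prev."""
    def gate(w_name, u_name, state, activation):
        wx = matvec(params[w_name], x)
        uh = matvec(params[u_name], state)
        return [activation(a + b) for a, b in zip(wx, uh)]

    z = gate("Wz", "Uz", h_prev, sigmoid)        # update gate: how much of the state to overwrite
    r = gate("Wr", "Ur", h_prev, sigmoid)        # reset gate: how much history feeds the candidate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = gate("Wh", "Uh", gated, math.tanh)  # candidate state
    # Blend the old state with the candidate (biases omitted for brevity).
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]

# Toy sizes and random weights, for illustration only.
INPUT_SIZE, HIDDEN_SIZE = 3, 4
rng = random.Random(0)
params = {
    name: [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(HIDDEN_SIZE)]
    for name, cols in [("Wz", INPUT_SIZE), ("Uz", HIDDEN_SIZE),
                       ("Wr", INPUT_SIZE), ("Ur", HIDDEN_SIZE),
                       ("Wh", INPUT_SIZE), ("Uh", HIDDEN_SIZE)]
}

h = [0.0] * HIDDEN_SIZE
for x in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):  # a 3-step input sequence
    h = gru_step(x, h, params)  # the new state is fed back in at the next step
print(h)
```

The gates are what let a GRU keep or discard old context selectively, which is how it eases the vanishing gradient problem of the vanilla RNN.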
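The encoder and decoder roles described above can be sketched as follows. This is a toy illustration under stated assumptions, not this project's model: it uses a plain tanh recurrent cell instead of a GRU, shares one set of untrained random weights between encoder and decoder, and the vocabulary, the `<start>`/`<end>` tokens, and all function names are hypothetical.

```python
import math
import random

# Toy vocabulary (hypothetical); real datasets have thousands of words.
VOCAB = ["<pad>", "<start>", "<end>", "hello", "how", "are", "you"]
HIDDEN = 8
rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

EMBED = rand_matrix(len(VOCAB), HIDDEN)  # one vector per word
W_IN = rand_matrix(HIDDEN, HIDDEN)       # input-to-hidden weights
W_REC = rand_matrix(HIDDEN, HIDDEN)      # hidden-to-hidden (recurrent) weights
W_OUT = rand_matrix(len(VOCAB), HIDDEN)  # hidden-to-vocabulary scores

def rnn_step(word_id, h):
    """Simplified tanh cell (a GRU would add gates); shared by encoder and decoder here."""
    x = EMBED[word_id]
    return [math.tanh(a + b) for a, b in zip(matvec(W_IN, x), matvec(W_REC, h))]

def encode(word_ids):
    """Read the input words one by one and return the final hidden vector."""
    h = [0.0] * HIDDEN
    for word_id in word_ids:
        h = rnn_step(word_id, h)
    return h

def decode(h, max_len=5):
    """Start from the encoder's hidden vector and greedily predict one word per step."""
    word_id = VOCAB.index("<start>")
    reply = []
    for _ in range(max_len):
        h = rnn_step(word_id, h)  # previous word + previous state -> next state
        scores = matvec(W_OUT, h)
        word_id = max(range(len(VOCAB)), key=lambda i: scores[i])
        if VOCAB[word_id] == "<end>":
            break
        reply.append(VOCAB[word_id])
    return reply

query = [VOCAB.index(w) for w in ["how", "are", "you"]]
reply = decode(encode(query))
print(reply)  # the weights are untrained, so the words are arbitrary
```

Training would adjust the weights so that the predicted words match the replies in the dataset; a real model would also usually give the encoder and decoder separate weights.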
- Scikit-learn (https://pypi.org/project/scikit-learn/)
- TensorFlow (https://pypi.org/project/tensorflow/)
- TensorLayer (https://pypi.org/project/tensorlayer/)
- NumPy (https://pypi.org/project/numpy/)
- tqdm (https://pypi.org/project/tqdm/)
- NLTK (https://pypi.org/project/nltk/)
- The datasets are already processed and saved as .npy and .pkl files by the data.py script, in their respective folders in the data directory.
- To start training, run the train.py script:
    python train.py --help
    usage: train.py [-h] --dataset DATASET [--epochs EPOCHS]

    Train Seq2Seq Chatbot

    optional arguments:
      -h, --help         show this help message and exit
      --dataset DATASET  Dataset to be used twitter or cornell
      --epochs EPOCHS    Number of training epochs
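For readers who want to see how a command-line interface like this can be declared, the sketch below builds an argparse parser whose flags match the help text above. It is not copied from train.py: the `build_parser` name, the integer type for `--epochs`, and the default of 50 epochs are assumptions, since the help output does not show them.

```python
import argparse

def build_parser():
    """Build a parser with the flags shown in the help text (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="train.py", description="Train Seq2Seq Chatbot")
    # --dataset appears without brackets in the usage line, so it is required.
    parser.add_argument("--dataset", required=True,
                        help="Dataset to be used twitter or cornell")
    # The type and default below are placeholders; the real script may use other values.
    parser.add_argument("--epochs", type=int, default=50,
                        help="Number of training epochs")
    return parser

args = build_parser().parse_args(["--dataset", "cornell"])
print(args.dataset, args.epochs)  # cornell 50
```

Passing a list to `parse_args` is only for demonstration; a script would call `parse_args()` with no arguments to read the real command line.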
Note: During training, the model is saved in the model folder after every epoch. To resume training, just run the script again; it loads the model with the least loss and continues training from there.
- To test the trained model, run the test.py script, which opens an interactive prompt. The script automatically chooses the model with the least loss.
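Choosing the saved model with the least loss could look roughly like the sketch below. How the scripts actually record the loss is not described here, so the file naming scheme (`..._loss_<value>...`), the regular expression, and the `pick_best_checkpoint` helper are all hypothetical.

```python
import re

# Hypothetical naming scheme: the loss is embedded in the file name after "_loss_".
LOSS_PATTERN = re.compile(r"_loss_(\d+(?:\.\d+)?)")

def pick_best_checkpoint(filenames):
    """Return the file name with the smallest embedded loss, or None if none match."""
    scored = []
    for name in filenames:
        match = LOSS_PATTERN.search(name)
        if match:
            scored.append((float(match.group(1)), name))
    return min(scored)[1] if scored else None

saved = ["epoch1_loss_2.31.npz", "epoch2_loss_1.07.npz", "epoch3_loss_1.42.npz", "notes.txt"]
print(pick_best_checkpoint(saved))  # epoch2_loss_1.07.npz
```

Files that do not follow the naming scheme are ignored, so unrelated files in the model folder do not break the selection.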
    python test.py --help
    usage: test.py [-h] --dataset DATASET [--topn TOPN]

    Test Seq2Seq Chatbot

    optional arguments:
      -h, --help         show this help message and exit
      --dataset DATASET  Dataset to be used twitter or cornell
      --topn TOPN        Number of Top Outputs
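The help text does not say how test.py uses --topn. One common technique with a similar name is top-n sampling, where the next word is drawn only from the n most probable candidates; the sketch below shows that technique as an assumption about the general idea, not as the script's actual behaviour. The `sample_top_n` name and the toy probabilities are made up for illustration.

```python
import random

def sample_top_n(probs, n, rng=random):
    """Sample an index from the n highest-probability entries of probs."""
    # Indices of the n most probable entries, most probable first.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:n]
    weights = [probs[i] for i in top]
    # random.choices normalises the weights, so they need not sum to 1.
    return rng.choices(top, weights=weights, k=1)[0]

# Toy next-word distribution over a 4-word vocabulary (illustrative values).
probs = [0.05, 0.6, 0.25, 0.1]
rng = random.Random(0)
picks = [sample_top_n(probs, 2, rng) for _ in range(10)]
print(picks)  # only indices 1 and 2 can appear
```

With n = 1 this reduces to always picking the most probable word; larger values of n give more varied replies.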
Query > happy birthday honey
> thank you dear
> thank u babe
> thank ya
Query > the world is going bad
> i agree with that statement but i dont know what to do
> i agree world is unk but
> i agree with it unk and unk dont know unk to do
Query > how are you ?
> i dont think im okay you
> i told them to say i was just good
> i always thought i was ok
Query > are you a spy ?
> no sir
> no its been a long unk
> yes for long time
For doubts email me at: atinsaki@gmail.com