# 03 Vivabot 2.0

![](https://images.unsplash.com/photo-1507146153580-69a1fe6d8aa1?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1050&q=80)

Photo by [Andy Kelly](https://unsplash.com/photos/0E_vhMVqL9g)

In this challenge, you will build an AI-based chatbot using only RNNs. This is far from easy, so we will help you along the way.

The conversation corpus, stored in `chatbot_data`, comes from [here](https://github.com/gunthercox/chatterbot-corpus/tree/master/chatterbot_corpus/data/english).

First, you can read the data with the following code. Run it and have a look at the conversations.

In [1]:
import os
import yaml
import numpy as np
def read_conversations():
    dir_path = 'chatbot_data'
    files_list = os.listdir(dir_path)

    questions = list()
    answers = list()
    for filepath in files_list:
        # Open each YAML file in a context manager so it is closed after parsing
        with open(os.path.join(dir_path, filepath), 'rb') as stream:
            docs = yaml.safe_load(stream)
        conversations = docs['conversations']
        for con in conversations:
            if len( con ) > 2 :
                questions.append(con[0])
                replies = con[ 1 : ]
                ans = ''
                for rep in replies:
                    ans += ' ' + rep
                answers.append( ans )
            elif len( con )> 1:
                questions.append(con[0])
                answers.append(con[1])
    return questions, answers

In [2]:
### TODO: explore the conversations
### STRIP_START ###
questions, answers = read_conversations()

randid = np.random.randint(len(questions))

print(questions[randid], '\n', answers[randid])
### STRIP_END ###

You are arrogant 
  Arrogance is not one of my emotions. I have no real emotions, so how can I be arrogant? I am terse.  There is a difference. I am not human, so how can I partake of a human emotion such as arrogance?


The next step is to transform these conversations into data the RNN can use. For now, the data looks like:
```Python
question = "how are you"
answer = "I am good"
```

As you may remember, the model we will use looks like this:

![](Chatbot_encoder-decoder.png)

So we want to transform our data into something like:
```Python
encoder_input = ["how", "are", "you"]
decoder_input = ["<START>", "I", "am", "good"]
decoder_target = ["I", "am", "good", "<END>"]
```
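As a minimal sketch of this transformation, assuming a plain whitespace split stands in for real tokenization (the helper name `make_training_triplet` is hypothetical and not part of `utils.py`):

```Python
START, END = "<START>", "<END>"

def make_training_triplet(question, answer):
    """Split a question/answer pair into the three sequences used for training."""
    encoder_input = question.split()
    answer_words = answer.split()
    # The decoder input is the answer shifted right by one step (teacher forcing).
    decoder_input = [START] + answer_words
    # The target is the same answer, ending with the stop token.
    decoder_target = answer_words + [END]
    return encoder_input, decoder_input, decoder_target

enc, dec_in, dec_out = make_training_triplet("how are you", "I am good")
print(enc)      # ['how', 'are', 'you']
print(dec_in)   # ['<START>', 'I', 'am', 'good']
print(dec_out)  # ['I', 'am', 'good', '<END>']
```

At every time step, the decoder receives one word of `decoder_input` and is trained to predict the word at the same position in `decoder_target`.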

More specifically, you have to transform this into sequences of one-hot-encoded vectors.

You have already done this before, so give it a try. You can use the following functions and classes:
```Python
tensorflow.keras.preprocessing.text.Tokenizer
tensorflow.keras.preprocessing.text.one_hot
tensorflow.keras.preprocessing.sequence.pad_sequences
tensorflow.keras.utils.to_categorical
```

In case you need some help, we provide helper functions in the file `utils.py`.
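
To make the target format concrete, here is a small pure-Python sketch of padding and one-hot encoding, independent of Keras and of `utils.py`. The function names `pad` and `one_hot_rows` and the toy indices are assumptions made for illustration; index `0` is assumed to be the padding value.

```Python
def pad(seq, max_len, value=0):
    """Truncate or right-pad a list of token indices to exactly max_len items."""
    return (seq + [value] * max_len)[:max_len]

def one_hot_rows(seq, num_classes):
    """Turn each index into a vector with a single 1 at that position."""
    return [[1 if i == idx else 0 for i in range(num_classes)] for idx in seq]

tokens = [2, 1, 3]               # e.g. "how are you" with a toy vocabulary
padded = pad(tokens, max_len=5)  # [2, 1, 3, 0, 0]
encoded = one_hot_rows(padded, num_classes=4)

print(padded)
print(len(encoded), len(encoded[0]))  # 5 4 -> (max_len, num_classes)
print(encoded[0])                     # [0, 0, 1, 0]
```

Stacking one such matrix per sentence gives an array of shape `(num_samples, max_len, num_classes)`.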

In [3]:
# TODO: Prepare the data
### STRIP_START ###
from utils import *

# get the tokens
token_ques_input, token_ans_input, token_ans_target, vocab_size, t = get_tokens(questions, answers)

# pad the sequences
max_len = 8
pad_ques_input, pad_ans_input, pad_ans_target = padding(token_ques_input, token_ans_input, token_ans_target, max_len)

# one hot encode
pad_ques_input, pad_ans_input, pad_ans_target = one_hot_encode(pad_ques_input, pad_ans_input, pad_ans_target, vocab_size)

### STRIP_END ###

We have built a model for you here: an encoder part, followed by a decoder part. It is inspired by [this post](https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html), which you can read for more details. If you have time, you can play with some hyperparameters (number of neurons, activation function...).

Try to understand it, and fill in the needed information according to your input data: we ask you to define the variables `num_encoder_tokens` and `num_decoder_tokens`.

In [5]:
### TODO: Fill the variables num_encoder_tokens and num_decoder_tokens
### STRIP_START ###

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM, Dense, Embedding


num_encoder_tokens = vocab_size+1
num_decoder_tokens = vocab_size+1
latent_dim = 256

# Define an input sequence and process it.
encoder_inputs = Input(shape=(None, num_encoder_tokens))
#encoder_embed = Embedding(input_dim=num_encoder_tokens, output_dim=latent_dim, input_length=max_len)(encoder_inputs)
encoder = LSTM(latent_dim, activation='relu', return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
# We discard `encoder_outputs` and only keep the states.
encoder_states = [state_h, state_c]

# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = Input(shape=(None, num_decoder_tokens))
#decoder_embed = Embedding(input_dim=num_encoder_tokens, output_dim=latent_dim, input_length=max_len)(decoder_inputs)

# We set up our decoder to return full output sequences,
# and to return internal states as well. We don't use the 
# return states in the training model, but we will use them in inference.
decoder_lstm = LSTM(latent_dim, activation='relu', return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

# Define the model that will turn
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
### STRIP_END ###

Now train this model with your input data. You have to fill in the variables `encoder_input_data`, `decoder_input_data` and `decoder_target_data`.

In [7]:
### TODO: Fill the variables encoder_input_data, decoder_input_data and decoder_target_data
### STRIP_START ###
from tensorflow.keras import optimizers


encoder_input_data = pad_ques_input
decoder_input_data = pad_ans_input
decoder_target_data = pad_ans_target

model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=64,
          epochs=10)  # many more epochs are needed to get good results

### STRIP_END ###

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f69c42a34a8>

Here we help you build the inference setup, which generates answers from questions. Try to understand the code.

In [8]:
# Here we define the inference setup
encoder_model = Model(encoder_inputs, encoder_states)

decoder_state_input_h = Input(shape=(latent_dim,))
decoder_state_input_c = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)

Below is a function that takes a sequence as input (same shape as the model input) and returns a sentence. Have a look at it; it should look a bit familiar.

In [9]:
# Here is a method to decode a sequence and output a sentence
def decode_sequence(input_seq, word_to_idx, idx_to_word, randomness=True):
    # Encode the input as state vectors.
    states_value = encoder_model.predict(input_seq)

    # Generate empty target sequence of length 1.
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    # Populate the first word of the target sequence with the '<START>' token.
    target_seq[0, 0, word_to_idx['<START>']] = 1.

    # Sampling loop for a batch of sequences
    # (to simplify, here we assume a batch of size 1).
    stop_condition = False
    decoded_sentence = ''
    while not stop_condition:
        output_tokens, h, c = decoder_model.predict(
            [target_seq] + states_value)
        
        # Pick the next token: sample it from the predicted distribution,
        # or take the most likely one when randomness is disabled.
        probs = output_tokens[0, -1, :].astype('float64')
        if randomness:
            sampled_token_index = np.random.choice(len(probs), p=probs / probs.sum())
        else:
            sampled_token_index = np.argmax(probs)
        sampled_word = idx_to_word[sampled_token_index]
        
        # Exit condition: either hit max length (counted in words)
        # or find stop word.
        if (sampled_word == '<END>' or
           len(decoded_sentence.split()) >= max_len):
            stop_condition = True
        else:
            decoded_sentence += sampled_word + ' '

        # Update the target sequence (of length 1).
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_token_index] = 1.

        # Update states
        states_value = [h, c]

    return decoded_sentence
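
The difference between the two decoding modes (taking the most likely token versus sampling from the predicted distribution) can be sketched without any model. The probabilities below are made up, and `pick_token` is a hypothetical helper, not part of the notebook:

```Python
import random

def pick_token(probs, randomness=True, rng=random):
    """Return a token index, either sampled from probs or the most likely one."""
    if randomness:
        # Draw one index, each weighted by its predicted probability.
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]
    # Greedy decoding: always take the highest-probability index.
    return max(range(len(probs)), key=lambda i: probs[i])

probs = [0.1, 0.6, 0.3]  # made-up softmax output over a 3-token vocabulary
print(pick_token(probs, randomness=False))  # 1
print(pick_token(probs, randomness=True))   # 0, 1 or 2, with 1 the most frequent
```

Whichever index is picked should also be the one fed back to the decoder at the next step, so that the generated sentence and the decoder input stay consistent.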

A couple more steps before using your bot. First, you have to define the `word_to_idx` and `idx_to_word` dictionaries needed by the `decode_sequence` function.
- `word_to_idx` is the dictionary that gives the token index for a given word
- `idx_to_word` is the dictionary that gives the word corresponding to a given index

Do not forget to add the `'<START>'` (`vocab_size` index) and `'<END>'` (`0` index) words.

To do so, you might want to use the `Tokenizer` object (returned by the function `get_tokens`, in case you used it).
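
As an illustration with a toy vocabulary (the three words and their indices are made up; with a Keras `Tokenizer`, the `word_index` attribute plays the role of `word_index` below):

```Python
# Toy stand-in for a tokenizer's word index: indices start at 1.
word_index = {"how": 1, "are": 2, "you": 3}
vocab_size = len(word_index) + 1  # assumption: one extra slot so '<START>' gets its own index

word_to_idx = dict(word_index)    # copy, so the original mapping is left untouched
word_to_idx["<END>"] = 0
word_to_idx["<START>"] = vocab_size

# Invert the mapping to go from index back to word.
idx_to_word = {idx: word for word, idx in word_to_idx.items()}

print(idx_to_word[0], idx_to_word[2], idx_to_word[vocab_size])  # <END> are <START>
```

With this layout the indices run from `0` to `vocab_size`, which is why the one-hot vectors need `vocab_size + 1` entries.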

In [10]:
# TODO: Compute word_to_idx and idx_to_word
### STRIP_START ###
word_to_idx = t.word_index
word_to_idx['<END>'] = 0
word_to_idx['<START>'] = vocab_size
idx_to_word = dict([[v,k] for k,v in word_to_idx.items()])
### STRIP_END ###

Finally, using the function `decode_sequence`, try to get an answer from your bot!
Be careful, you have to process a given sentence (a string) into the right format, so that the RNN understands it!

You can also wrap this in an interactive loop, so that it is more user-friendly.
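
One possible shape for such an interactive loop is sketched below. The `reply_fn` argument is a placeholder for whatever function turns a question string into an answer string (for example a wrapper around your preprocessing and `decode_sequence`); the `echo_bot` used here is only a stand-in so that the sketch runs on its own.

```Python
def chat(reply_fn, read=input, write=print, quit_word="quit"):
    """Read questions until quit_word is entered, answering each with reply_fn."""
    while True:
        question = read("You: ").strip()
        if question.lower() == quit_word:
            break
        write("Bot:", reply_fn(question))

def echo_bot(question):
    """Stand-in bot that just repeats the question in upper case."""
    return question.upper()

# Scripted run: feed two lines instead of typing them.
scripted = iter(["hello", "quit"])
chat(echo_bot, read=lambda prompt: next(scripted))  # prints: Bot: HELLO
```

Calling `chat(echo_bot)` with the default arguments reads from the keyboard instead.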

In [11]:
# TODO: Test your bot
### STRIP_START ###
my_question = ["government pays you"]
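# Use the fitted tokenizer so that word indices match those used in training
# (`one_hot` hashes words, which gives unrelated indices)
seq = t.texts_to_sequences(my_question)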
seq = pad_sequences(seq, maxlen=max_len, dtype='int32', padding='post', truncating='post', value=0)
seq = to_categorical(seq, num_classes=vocab_size+1)
decode_sequence(seq, word_to_idx, idx_to_word)
### STRIP_END ###

'contrary '

How well does it work? Not that well? This is normal: it takes quite some time (and experience) to train such a model correctly. But you get the general idea, and you can also add more conversational inputs yourself, so that your bot behaves the way you want!

---

You can also directly use prebuilt chatbots. An easy-to-use example is ChatterBot. You can find it [here](https://github.com/gunthercox/ChatterBot).

It can be installed with the following commands:
```
pip install chatterbot
pip install chatterbot-corpus
```

It is also really easy to use, with the following lines of code:
```Python
from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer

chatbot = ChatBot('Ron Obvious')

# Create a new trainer for the chatbot
trainer = ChatterBotCorpusTrainer(chatbot)

# Train the chatbot based on the english corpus
trainer.train("chatterbot.corpus.english")

# Get a response to an input statement
chatbot.get_response("Hello, how are you today?")
```

You can also add data to the conversations, create a new corpus, and train on it. Training on specific corpora looks like this:
```Python
from chatterbot.trainers import ChatterBotCorpusTrainer

# Create a new trainer for the chatbot
trainer = ChatterBotCorpusTrainer(chatbot)

# Train based on the english corpus
trainer.train("chatterbot.corpus.english")

# Train based on english greetings corpus
trainer.train("chatterbot.corpus.english.greetings")

# Train based on the english conversations corpus
trainer.train("chatterbot.corpus.english.conversations")
```

Give it a try!

In [13]:
# TODO: Try to use chatterbot
### STRIP_START ###
from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer

chatbot = ChatBot('Vivabot')

# Create a new trainer for the chatbot
trainer = ChatterBotCorpusTrainer(chatbot)

# Train the chatbot based on the english corpus
trainer.train("chatterbot.corpus.english")

# Get a response to an input statement
chatbot.get_response("Hello, how are you today?")
### STRIP_END ###

[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/vince/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package stopwords to /home/vince/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Training ai.yml: [####################] 100%
Training botprofile.yml: [####################] 100%
Training computers.yml: [####################] 100%
Training conversations.yml: [####################] 100%
Training emotion.yml: [####################] 100%
Training food.yml: [####################] 100%
Training gossip.yml: [####################] 100%
Training greetings.yml: [####################] 100%
Training health.yml: [####################] 100%
Training history.yml: [####################] 100%
Training humor.yml: [####################] 100%
Training literature.yml: [####################] 100%
Training money.yml: [####################] 100%
Training movies.yml: [####################] 100%
Training politics.yml: [####################] 100%
Training psychology.yml: [####################] 100%
Training science.yml: [####################] 100%
Training sports.yml: [####################] 100%
Training trivia.yml: [####################] 100%


<Statement text:i saw the matrix>