# Generating text with Recurrent Neural Networks

In this notebook we will use Recurrent Neural Networks to make sequence predictions. We will use the book "The Three Musketeers" by Alexandre Dumas as our dataset and predict one character at a time in order to generate text.

## Data Reading
Load the file from the data folder and inspect it. Standardize to lowercase. 

In [None]:
filename = "../data/musquetairesShort"
# Use a context manager so the file handle is closed after reading
with open(filename) as f:
    raw_text = f.read()

In [None]:
raw_text[:100]

In [None]:
text = raw_text.lower()
print('corpus length:', len(text))

### Text preparation
We create a sorted list of the distinct characters and two dictionaries: one mapping characters to indices and one mapping indices to characters.
<font color=red><b>Generate dictionaries for the char to indices and indices to chars.
<br>_Hint: use the enumerate function on the chars set_</b>
</font>

In [None]:
chars = sorted(list(set(text)))
print('total chars:', len(chars))
## Add code here
# char_indices = 
# indices_char = 
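
As an illustration of the hint above, here is a minimal sketch on a toy string rather than on the book. The names `toy_text`, `toy_chars`, `toy_char_indices` and `toy_indices_char` are assumptions made for this example only. `enumerate` pairs each character with its position in the sorted list, so the two dictionaries are inverses of each other.

```python
toy_text = "hello world"

# Sorted list of the distinct characters in the toy text
toy_chars = sorted(set(toy_text))

# Character -> index and index -> character lookups
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_indices_char = {i: c for i, c in enumerate(toy_chars)}

print(toy_chars)
print(toy_char_indices)
```

A quick sanity check is that `toy_indices_char[toy_char_indices[c]] == c` holds for every character `c`.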

Next we generate the input and output arrays:

The input will consist of sentences of a fixed length (_maxlen_), while the outputs will be the next characters in the text.

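So, if the text is "Welcome to Big Data Spain" with _maxlen_ = 5, we will have:

- "Welco" -> "m"
- "elcom" -> "e"
- "lcome" -> " "
- "come " -> "t"

In order to reduce overfitting (and improve performance) we can add a _step_ between consecutive windows, so that with step = 3, for example:

- "Welco" -> "m"
- "come " -> "t"
- "e to " -> "B"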
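
The windowing described above can be sketched on the example sentence as follows. The helper `make_windows` and the variable names are hypothetical, introduced only for this illustration; it is one possible way to build the pairs, not necessarily the intended solution.

```python
def make_windows(text, maxlen, step=1):
    """Return (windows, targets): each window of maxlen chars and the char that follows it."""
    windows, targets = [], []
    for i in range(0, len(text) - maxlen, step):
        windows.append(text[i:i + maxlen])
        targets.append(text[i + maxlen])
    return windows, targets

toy = "Welcome to Big Data Spain"
dense_w, dense_t = make_windows(toy, maxlen=5)            # one window per position
sparse_w, sparse_t = make_windows(toy, maxlen=5, step=3)  # one window every 3 positions

for w, t in zip(sparse_w, sparse_t):
    print(repr(w), "->", repr(t))
```

A larger step yields fewer, less overlapping samples, which is the trade-off the _step_ parameter controls.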

<font color=red><b>Fill the sentences and next_char lists with the input and output data</b></font>

In [None]:
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    # Add code here
print('nb sequences:', len(sentences))

In [None]:
sentences[:5]

In [None]:
next_chars[:5]

### Dataset generation
We turn the text into one-hot vectors. Initialize the input and output arrays to zero with a boolean dtype.

In [None]:
import numpy as np
# np.bool was removed in recent NumPy releases; the builtin bool is equivalent
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
Y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_indices[char]] = 1
    Y[i, char_indices[next_chars[i]]] = 1

In [None]:
X[0]

In [None]:
print ("timesteps = ", len (X[0]), ", numchars = ", len (X[0][0]))
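
To check that an encoding like the one above is lossless, a one-hot matrix can be decoded back to text by taking the position of the single `True` value in each row. This is a small sketch on a toy vocabulary; `one_hot`, `decode` and the `toy_*` names are assumptions made for this example.

```python
import numpy as np

toy_chars = sorted(set("abca"))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_indices_char = {i: c for i, c in enumerate(toy_chars)}

def one_hot(sentence, char_to_index):
    """Encode a string as a (timesteps, numchars) boolean matrix."""
    encoded = np.zeros((len(sentence), len(char_to_index)), dtype=bool)
    for t, char in enumerate(sentence):
        encoded[t, char_to_index[char]] = True
    return encoded

def decode(encoded, index_to_char):
    """Invert one_hot by reading the active column of each row."""
    return "".join(index_to_char[int(i)] for i in encoded.argmax(axis=1))

encoded = one_hot("abca", toy_char_indices)
print(encoded.shape, decode(encoded, toy_indices_char))
```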

## Model Generation
Build the LSTM model to be trained on the data, with this configuration:
- LSTM layer, with 256 units
- LSTM layer, with 256 units
- Dense layer, with 64 units
- Dense softmax layer
- On compilation, use adam as the optimizer and categorical_crossentropy as the loss function.
- Print the summary


<font color=red><b>Remember to initialize it properly and to include input_shape on the first layer. <br> Hints: input_shape = (maxlen, len(chars))
- Use the imported libraries</b></font>
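
As a cross-check for the printed summary, the number of trainable parameters of each layer in the configuration above can be computed by hand. This sketch assumes a standard LSTM cell with bias (four gates, no peephole connections) and fully connected Dense layers with bias. The vocabulary size `n_chars = 50` is a made-up value, to be replaced with `len(chars)`, and the helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per unit
    return input_dim * units + units

n_chars = 50  # assumed vocabulary size, replace with len(chars)

layer_params = [
    lstm_params(n_chars, 256),   # first LSTM layer
    lstm_params(256, 256),       # second LSTM layer
    dense_params(256, 64),       # Dense layer with 64 units
    dense_params(64, n_chars),   # Dense softmax layer
]
print(layer_params, sum(layer_params))
```

If the numbers in the summary differ, the usual suspects are a different input size or a layer configured without bias.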

In [None]:
import os
import tensorflow as tf

physical_devices = tf.config.experimental.list_physical_devices('GPU')
# Enable memory growth only when a GPU is actually available
if physical_devices:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)
tf.keras.backend.clear_session()

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.layers import LSTM

In [None]:
## Add code here

### Model Training
Train the model for two epochs and see how it works. Use a batch_size of 128.

In [None]:
## Add code here

### Model Evaluation
Let's test our model. In order to obtain a probabilistic answer we can sample from a probability array instead of just taking the max argument:

<font color=red><b> Because of floating-point rounding, the predicted probabilities may not sum exactly to one. Renormalize them before sampling to avoid errors</b> </font>


$$ p_i = \frac{p_i}{\sum_j p_j}$$


In [None]:
## Build a function to get the next predicted index:
def sample(preds, sample = True):
    # take a sample from the probabilities
    if sample:
        # probs can be rounded and not sum up to one. Recalculate the probs in order to avoid this
        # Add code here
        probas = np.random.multinomial(1, preds, 1)
    else:
        probas = preds
    return np.argmax(probas)
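
The renormalization in the formula above can be sketched as follows. This is one possible approach, not necessarily the intended solution: the probabilities are cast to `float64` and divided by their sum before drawing from the multinomial distribution. The function name `sample_index` and the optional `rng` argument are assumptions made for this example.

```python
import numpy as np

def sample_index(preds, rng=None):
    """Draw one index according to the (renormalized) probabilities in preds."""
    rng = np.random.default_rng() if rng is None else rng
    # float32 softmax outputs can sum to slightly more than 1, so renormalize
    p = np.asarray(preds, dtype=np.float64)
    p = p / p.sum()
    draw = rng.multinomial(1, p)   # one-hot vector marking the sampled index
    return int(np.argmax(draw))

toy_preds = np.array([0.1, 0.2, 0.7], dtype=np.float32)
print(sample_index(toy_preds, np.random.default_rng(0)))
```

Sampling instead of always taking the most likely character keeps the generated text from falling into repetitive loops.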

We get a seed in order to predict:

In [None]:
import random
start_index = random.randint(0, len(text) - maxlen - 1)
generated = ''
sentence = text[start_index: start_index + maxlen]
generated += sentence
print (generated)

#### Predictions
This will be the sequence for which we are going to predict the next character:

<font color=red> <b> Predict the next character given the input x_pred. <br>Hint: remember to take the first item in the list</b>  </font>

In [None]:
## Predict next character given a model and the sequence to predict 
def get_next_char (model, x_pred, indices_char, Sample = True):
    preds = ## Add code here
    # Forward the caller's flag instead of a hard-coded value
    next_index = sample(preds, Sample)
    return indices_char[next_index]
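
The hint about taking the first item refers to the batch dimension: a Keras model's `predict` returns one row of probabilities per input sequence, even when only one sequence is passed. The sketch below shows the shapes with a hypothetical stand-in class, `UniformModel`, that mimics this behaviour without TensorFlow. It is not a trained model and all names in it are assumptions.

```python
import numpy as np

class UniformModel:
    """Hypothetical stand-in for a trained model: same probabilities for any input."""
    def __init__(self, n_chars):
        self.n_chars = n_chars

    def predict(self, x, verbose=0):
        # One row of probabilities per sequence in the batch
        return np.full((x.shape[0], self.n_chars), 1.0 / self.n_chars)

toy_indices_char = {0: 'a', 1: 'b', 2: 'c'}
x_toy = np.zeros((1, 4, 3))                      # (batch, timesteps, numchars)

batch_preds = UniformModel(3).predict(x_toy, verbose=0)
preds = batch_preds[0]                           # drop the batch dimension
next_index = int(np.argmax(preds))
print(batch_preds.shape, preds.shape, toy_indices_char[next_index])
```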

In [None]:
x_pred = np.zeros((1, maxlen, len(chars)))
for t, char in enumerate(sentence):
    x_pred[0, t, char_indices[char]] = 1.

Sample = True
# this gets the next character     
next_char = get_next_char (model, x_pred, indices_char, Sample)  

print (next_char)

Let's predict some more characters:

In [None]:
import sys
start_index = random.randint(0, len(text) - maxlen - 1)
sentence = text[start_index: start_index + maxlen]
print('Seed: "' + sentence + '"')
print('---------------------- Generated Text -----------------------')
chars_to_predict = 400
for i in range(chars_to_predict):
    x_pred = np.zeros((1, maxlen, len(chars)))
    for t, char in enumerate(sentence):
        x_pred[0, t, char_indices[char]] = 1.

    next_char = get_next_char (model, x_pred, indices_char, Sample)
    sentence = sentence[1:] + next_char

    sys.stdout.write(next_char)
    sys.stdout.flush()

## Load a trained model
Training deep learning models is time consuming, so some pretrained models are available to load in order to look at better predictions. We will load one model for every 5 epochs of training to see how the generated text evolves.

<font color=red> <b> Load the model for each checkpoint and generate text with it <br> Hint: You can load the whole model or just the weights, as the configuration is the same</b>  </font>

In [None]:
count = 0
partial_n_epoch = 5
times = 12

np.random.seed (1)
for j in range (times):
    
    count += partial_n_epoch
    print ("")
    print ("-------------- Next Model --------------")
    print ("Trained on ", count, " epochs")
    modelName = '../models/MusquetairesModelOptimizedMode_' + str (count) + '.h5' 
    ## Add code here