# RNN Play Generator

We are going to use a RNN to generate a play. We will simply show the RNN an example of something we want it to recreate and it will learn how to write a version of it on its own. We'll do this using a character predictive model that will take as input a variable length sequence and predict the next character. We can use the model many times in a row with the output from the last prediction as the input for the next call to generate a sequence

In [1]:
from keras.preprocessing import sequence
import keras
import tensorflow as tf
import os
import numpy as np

### Dataset

In [3]:
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt
[1m1115394/1115394[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 1us/step


### Read contents of file

In [4]:
# Read, then decode for py2 compat
text = open(path_to_file, 'rb').read().decode(encoding='utf-8')
#length of the text is the number of characters in it
print('Length of the text: {} characters'.format(len(text)))

Length of the text: 1115394 characters


In [5]:
print(text[:250]) #first 250 characters

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



### Encoding

In [9]:
vocab = sorted(set(text)) #crea un conjunto con todos los caracteres unicos presentes en text, solo tiene elementos unicos eliminando duplicados
#creating a mapping from unique characters to indices.
char2idx = {u:i for i, u in enumerate (vocab)} #itera sobre los caracteres en vocab proporcionando tanto el indice como el caracter
idx2char = np.array(vocab) #crea un numoy array donde cada indice contiene el caracter correspondiente

def text_to_int(text):
    return np.array([char2idx[c] for c in text]) #convierte un texto en una secuencia de indices utilizando char2idx (diccionario)

text_as_int = text_to_int(text) #convierte el texto original en su representacion numerica

In [10]:
#example
print('text: ', text[:13])
print ('Encoded: ', text_to_int(text[:13]))

text:  First Citizen
Encoded:  [18 47 56 57 58  1 15 47 58 47 64 43 52]


We will make a function that can convert our numeric values to text

In [11]:
def int_to_text(ints):
    try:
        ints = ints.numpy()
    except:
        pass
    return ''.join(idx2char[ints])

print(int_to_text(text_as_int[:13]))

First Citizen


# Creating training examples

Our task is to feed the model a sequence and have it return to use the next character. This means we need to split our text data from above into many shorter sequences that we can pass to the model as training examples.


In [13]:
seq_length = 100 #length of sequence for a training example
examples_per_epochs = len(text) // (seq_length+1) #determina cuantos ejemplos de entrenamiento se pueden generar a partir del texto completo

#create trianing examples / targets
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

Now we can use the batch method to turn this stream of characters into batches of desired length

In [15]:
sequences = char_dataset.batch(seq_length+1, drop_remainder=True) #agrupa los caracteres en lotes para luego dividirlas en entradas y objetivos para el modelo

Now we need to use these sequences of length 101 and split them into input and output

In [16]:
def split_input_target (chunk): #for example hello
    input_text = chunk[:1] #hell
    target_text = chunk[1:] #ello
    return input_text, target_text #hell, ello

dataset = sequences.map(split_input_target) #we apply the function to every entry

In [17]:
for x, y in dataset.take(2):
    print('\n\nEXAMPLE\n')
    print('INPUT')
    print(int_to_text(x))
    print('\nOUTPUT')
    print(int_to_text(y))



EXAMPLE

INPUT
F

OUTPUT
irst Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You 


EXAMPLE

INPUT
a

OUTPUT
re all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you k


In [32]:
# We need to make training batches
BATCH_SIZE = 64 #nro de secuencias por lote
VOCAB_SIZE = len(vocab)
EMBEDDING_DIM = 256
RNN_UNITS = 1024

#Buffer size to shuffle the datset
BUFFER_SIZE = 10000 #tamanio del buffer para mezclar datos
data = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True) #mezcla los datos y los agrupa en lotes de 64 secuencias

### Building the model

We will use an embedding layer as LSTM and one dense layer that contains a node for each unique character in our training data. The dense layer will give us a probability distribution over all nodes.

In [42]:
def build_model(vocab_size=65, embedding_dim=256, rnn_units=1024, batch_size=1):
            model = tf.keras.Sequential([
                tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                            input_shape=[None]),
                tf.keras.layers.LSTM(rnn_units,
                                     return_sequences=True,
                                     recurrent_initializer='glorot_uniform'),
                tf.keras.layers.Dense(vocab_size)
            ])
            return model

model = build_model (VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, BATCH_SIZE)
model.summary()

### Creating a loss function

We are going to create our own loss function for this problem. This is because our model will output a (64,sequence_length,65) shaped tensor that represents the probability distribution of each character at each timestep for everysequence in the batch

In [44]:
for input_example_batch, target_example_batch in data.take(1):
    example_batch_predictions = model(input_example_batch) #ask our model for a prediction on our first batch of training data
    print(example_batch_predictions.shape, ' # (batch_size, sequence_length, vocab_size)')

(64, 1, 65)  # (batch_size, sequence_length, vocab_size)


In [45]:
#we can see that the prediction is an array og 64 arrays, one of each entry in the batch
print(len(example_batch_predictions))
print(example_batch_predictions)

64
tf.Tensor(
[[[ 0.00084473 -0.0039167  -0.00027998 ... -0.00042276 -0.00242643
   -0.00126719]]

 [[-0.00045802 -0.0034497  -0.00278005 ...  0.00529857 -0.00026991
   -0.00157921]]

 [[-0.00104507  0.00662218 -0.00374693 ... -0.00279519  0.00117202
    0.00125124]]

 ...

 [[ 0.00281009 -0.00205802  0.00508987 ...  0.0080635   0.00622505
   -0.00433598]]

 [[-0.00130141  0.00303683  0.00358691 ... -0.0012959   0.00061572
    0.00091814]]

 [[-0.00064009 -0.00173238 -0.00069314 ...  0.00787447  0.0014259
    0.00496525]]], shape=(64, 1, 65), dtype=float32)


In [46]:
#lets examine one prediction
pred = example_batch_predictions[0]
print(len(pred))
print(pred)

1
tf.Tensor(
[[ 8.4472803e-04 -3.9167013e-03 -2.7998089e-04  3.6458466e-03
   3.3922936e-03 -3.2044337e-03 -9.9140592e-04 -8.4976144e-03
   1.3786200e-03 -4.5419512e-03  2.3674737e-03  4.9765897e-04
  -2.0140095e-03 -1.4755888e-03  2.0022567e-03 -4.1431710e-03
  -5.4329513e-03 -4.6823488e-04 -3.6489798e-03 -9.2500728e-04
   1.8572946e-03  8.1285305e-04  1.4409475e-03 -3.8316734e-03
   1.2534949e-03 -2.6387624e-03  3.2205116e-05 -6.7871953e-03
   3.3915101e-03  4.2421701e-03  4.6426882e-03  4.1499394e-03
  -7.2288765e-03  3.8778494e-04 -2.5696394e-03 -5.7106027e-03
   6.5895268e-03  4.7666919e-03 -2.5213219e-04  3.5457001e-03
   5.9067360e-03  3.2069664e-03  6.5590777e-03  1.4707451e-03
   6.0625141e-03  1.0940537e-03  2.4661864e-03 -3.3347628e-03
  -5.5747312e-03  4.8583322e-03  1.8465486e-03  5.7611354e-03
  -2.5521400e-03 -4.2813770e-03  1.7337244e-03 -7.0099919e-03
  -4.1173031e-03  1.0714016e-03 -2.8384864e-04 -1.4552565e-03
  -5.5819270e-03  6.3846060e-03 -4.2276474e-04 -2.4264269

In [47]:
#finally, we look at a prediction at the first timestep
time_pred = pred[0]
print(len(time_pred))
print(time_pred)
#65 values reoresenting the probability of each character occuring next

65
tf.Tensor(
[ 8.4472803e-04 -3.9167013e-03 -2.7998089e-04  3.6458466e-03
  3.3922936e-03 -3.2044337e-03 -9.9140592e-04 -8.4976144e-03
  1.3786200e-03 -4.5419512e-03  2.3674737e-03  4.9765897e-04
 -2.0140095e-03 -1.4755888e-03  2.0022567e-03 -4.1431710e-03
 -5.4329513e-03 -4.6823488e-04 -3.6489798e-03 -9.2500728e-04
  1.8572946e-03  8.1285305e-04  1.4409475e-03 -3.8316734e-03
  1.2534949e-03 -2.6387624e-03  3.2205116e-05 -6.7871953e-03
  3.3915101e-03  4.2421701e-03  4.6426882e-03  4.1499394e-03
 -7.2288765e-03  3.8778494e-04 -2.5696394e-03 -5.7106027e-03
  6.5895268e-03  4.7666919e-03 -2.5213219e-04  3.5457001e-03
  5.9067360e-03  3.2069664e-03  6.5590777e-03  1.4707451e-03
  6.0625141e-03  1.0940537e-03  2.4661864e-03 -3.3347628e-03
 -5.5747312e-03  4.8583322e-03  1.8465486e-03  5.7611354e-03
 -2.5521400e-03 -4.2813770e-03  1.7337244e-03 -7.0099919e-03
 -4.1173031e-03  1.0714016e-03 -2.8384864e-04 -1.4552565e-03
 -5.5819270e-03  6.3846060e-03 -4.2276474e-04 -2.4264269e-03
 -1.267188

In [48]:
#if we want to determine the predicted character we need to sample the output distribution (pick a value based on probability)
sampled_indices = tf.random.categorical(pred,num_samples=1)
#now we can reshape that array and convert all the integers to numbers to see the actual characters
sampled_indices = np.reshape(sampled_indices, (1,-1))[0]
predicted_chars = int_to_text(sampled_indices)
predicted_chars #and this is what the model predicted for training sequence 1

predicted_chars


'X'

In [49]:
def loss (labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

### Compiling the model

At this point we can think of our problem as a classification problem where the model predicts the probability of each unique letter coming next

In [50]:
model.compile(optimizer='adam', loss=loss)


### Training

In [51]:
checkpint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpint_dir, 'ckpt_{epoch}')

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath = checkpoint_prefix,
    save_weights_only=True
)

ValueError: When using `save_weights_only=True` in `ModelCheckpoint`, the filepath provided must end in `.weights.h5` (Keras weights format). Received: filepath=./training_checkpoints\ckpt_{epoch}

In [None]:
history = model.fit(data, epochs=40, callbacks=[checkpoint_callback])