**RNN PLAY GENERATOR**

We are going to use an *RNN* to generate a play. We will show the *RNN* an example of the text we want it to recreate, and it will learn to write its own version. Based on: https://www.tensorflow.org/tutorials/text/text_generation

In [1]:
from keras.preprocessing import sequence
import keras
import tensorflow as tf
import os
import numpy as np

In [2]:
# DOWNLOADING THE DATASET

# Loading the Shakespeare text corpus used in the TensorFlow tutorial
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 
                                       'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')

# OR, TO LOAD MY OWN DATA INSTEAD (TXT FILE ONLY), UNCOMMENT:

# from google.colab import files
# path_to_file = list(files.upload().keys())[0] 

In [3]:
# READ CONTENTS OF FILE

with open(path_to_file, 'rb') as f:
    text = f.read().decode(encoding='utf-8') # read raw bytes, then decode as UTF-8
print('Text length: {} characters\n'.format(len(text)))
print(text[:250])

Text length: 1115394 characters

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



In [4]:
# ENCODING
# we are going to encode each unique character as a different integer

vocab = sorted(set(text))

# creating the StringLookup layer (characters -> ids); shown for reference,
# the rest of the notebook uses the char2idx / idx2char mappings below
ids_from_chars = tf.keras.layers.StringLookup( 
    vocabulary=list(vocab), mask_token=None)

# inverse layer: maps ids back to characters
chars_from_ids = tf.keras.layers.StringLookup(
    vocabulary=ids_from_chars.get_vocabulary(), invert=True, mask_token=None)

char2idx = {u:i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

def text_to_int(text):
    return np.array([char2idx[c] for c in text])

text_as_int = text_to_int(text)

# DECODING
# Function that does the opposite (numeric to text)
def int_to_text(ints):
    if hasattr(ints, 'numpy'):  # tensors are converted to NumPy arrays first
        ints = ints.numpy()
    return ''.join(idx2char[ints])

print("Text:", text[:13])
print("Encoded:", text_to_int(text[:13]))
print("Decoded:", int_to_text(text_as_int[:13]))

Text: First Citizen
Encoded: [18 47 56 57 58  1 15 47 58 47 64 43 52]
Decoded: First Citizen


In [5]:
# CREATING TRAINING EXAMPLES
# we need to split our data from above into many shorter sequences that we can pass to the model as training examples
# each example uses a seq_length sequence as input and a seq_length sequence as output, where the output is the
# input shifted one character to the right, as below
# INPUT: Hell || OUTPUT: ello

seq_length = 100 
examples_per_epoch = len(text)//(seq_length+1)

char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int) # create training examples/targets

# Using the batch method to turn this stream of characters into batches of desired length
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)

In [6]:
# Splitting those sequences into input and output

def split_input_target(chunk): # e.g. "Hello"
    input_text = chunk[:-1] # "Hell"
    target_text = chunk[1:] # "ello"
    return input_text, target_text

dataset = sequences.map(split_input_target) # using MAP to apply the function to every entry

# peeking at some examples:
for x, y in dataset.take(2):
    print("\n\nEXAMPLE\nINPUT:", int_to_text(x))
    print("\nOUTPUT:", int_to_text(y))



EXAMPLE
INPUT: First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You

OUTPUT: irst Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You 


EXAMPLE
INPUT: are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you 

OUTPUT: re all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you k
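
To make the chunk-and-shift step concrete, here is a minimal pure-Python sketch on a toy string. The helper name `make_examples` and the toy `seq_length` of 4 are assumptions for illustration only; the notebook itself does this with `tf.data`.

```python
def make_examples(text, seq_length):
    """Split text into (input, target) pairs, each seq_length long.

    Mirrors batch(seq_length + 1, drop_remainder=True) followed by
    split_input_target, but on a plain Python string.
    """
    chunk_size = seq_length + 1
    n_chunks = len(text) // chunk_size  # leftover characters are dropped
    pairs = []
    for i in range(n_chunks):
        chunk = text[i * chunk_size:(i + 1) * chunk_size]
        pairs.append((chunk[:-1], chunk[1:]))  # target is input shifted by one
    return pairs


pairs = make_examples("Hello world!", seq_length=4)
for inp, tgt in pairs:
    print(repr(inp), "->", repr(tgt))
```

The trailing `"d!"` does not fill a whole chunk, so it is discarded, just as `drop_remainder=True` discards the tail of the real text.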


In [7]:
# MAKING TRAINING BATCHES

BATCH_SIZE = 64
VOCAB_SIZE = len(vocab) # number of unique characters
EMBEDDING_DIM = 256
RNN_UNITS = 1024
BUFFER_SIZE = 10000 # Buffer size to shuffle the dataset 

data = ( # shuffle within a buffer, then group into batches
    dataset
    .shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.AUTOTUNE))
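
As a quick sanity check, the number of batches per epoch can be worked out by hand from the values already used in this notebook (text length, `seq_length`, `BATCH_SIZE`). The variable names below are local to this sketch.

```python
text_length = 1115394  # characters, as printed when the file was read
seq_length = 100
batch_size = 64

# Each training example consumes seq_length + 1 characters.
num_sequences = text_length // (seq_length + 1)

# drop_remainder=True discards the final partial batch.
steps_per_epoch = num_sequences // batch_size

print(num_sequences, steps_per_epoch)  # 11043 172
```

The result of 172 steps agrees with the `172/172` counter shown in the training log.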

In [8]:
# BUILDING THE MODEL
# We will be using an embedding layer, an LSTM, and one dense layer that contains a node for each unique character in the training data.

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(batch_size=batch_size, 
                              shape=[None,]),
        tf.keras.layers.Embedding(vocab_size, 
                                  embedding_dim),  
        tf.keras.layers.LSTM(rnn_units, 
                             return_sequences=True, 
                             stateful=True, 
                             recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model

model = build_model(VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, BATCH_SIZE)
model.summary()
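
The parameter counts that `model.summary()` reports can be estimated by hand. This is a rough sketch, assuming a vocabulary of 65 characters (the size seen in the output shapes in this notebook) and the standard Keras LSTM layout with a bias term.

```python
vocab_size, embedding_dim, rnn_units = 65, 256, 1024

# One embedding vector per character.
embedding_params = vocab_size * embedding_dim

# An LSTM has 4 gates; each has input weights, recurrent weights, and a bias.
lstm_params = 4 * ((embedding_dim + rnn_units) * rnn_units + rnn_units)

# Dense layer: one weight per (unit, character) pair plus one bias per character.
dense_params = rnn_units * vocab_size + vocab_size

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
# 16640 5246976 66625 5330241
```

Almost all of the parameters sit in the LSTM, which is why `RNN_UNITS` has the largest effect on training time.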


**CREATING A LOSS FUNCTION**

We need to create our own loss function because the model outputs a (64, sequence_length, 65) shaped tensor of logits:
unnormalized scores for each character at each timestep, for every sequence in the batch. Keras's `sparse_categorical_crossentropy` expects probabilities by default, so we wrap it with `from_logits=True`.

In [9]:
# looking at a sample input and the output from our untrained model (to understand what the model is giving us)

for input_example_batch, target_example_batch in data.take(1):
    example_batch_predictions = model(input_example_batch) # ask our model for a prediction on our first batch of training data
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

(64, 100, 65) # (batch_size, sequence_length, vocab_size)


In [10]:
# the prediction is an array of 64 arrays, one for each entry in the batch

print(len(example_batch_predictions))
print(example_batch_predictions)

64
tf.Tensor(
[[[-1.5734300e-03  4.6204082e-03 -5.5786385e-03 ...  5.4804608e-03
    2.1179349e-03  7.4942363e-04]
  [-4.8483294e-03  7.3886430e-03 -4.3548988e-03 ...  4.9965410e-03
   -1.1944374e-03  1.6559919e-03]
  [-4.1226028e-03  6.4025582e-03 -4.4738958e-03 ...  1.1736347e-03
    2.0886306e-03  3.8021035e-04]
  ...
  [-8.5923159e-03  5.6818016e-03 -4.0263501e-03 ... -1.0340010e-02
   -1.2922564e-03  6.1157295e-03]
  [-8.9632273e-03  5.0648283e-03 -4.2159297e-03 ... -9.7164242e-03
    6.0256356e-03  4.2135273e-03]
  [-9.0943985e-03  7.1077449e-03 -1.0176841e-02 ... -3.5476063e-03
    5.9917886e-03  4.3523982e-03]]

 [[ 2.4283074e-03  3.1801707e-03  5.2154232e-03 ... -9.9292421e-04
    2.4134885e-03  4.5613199e-03]
  [-2.1494378e-03  7.0607150e-03  3.7700222e-03 ... -5.9401756e-04
   -1.0570920e-03  3.8574662e-03]
  [-3.0913348e-03  5.1345997e-03  7.4095314e-04 ... -2.3850887e-03
   -1.4665499e-03  3.7637157e-03]
  ...
  [-8.2229115e-03  2.4397820e-03 -3.5996814e-03 ... -5.0724423e

In [11]:
# Examination of one prediction (2d array of length 100, where each interior array is the prediction for the next character)

pred = example_batch_predictions[0]
print(len(pred))
print(pred)

100
tf.Tensor(
[[-0.00157343  0.00462041 -0.00557864 ...  0.00548046  0.00211793
   0.00074942]
 [-0.00484833  0.00738864 -0.0043549  ...  0.00499654 -0.00119444
   0.00165599]
 [-0.0041226   0.00640256 -0.0044739  ...  0.00117363  0.00208863
   0.00038021]
 ...
 [-0.00859232  0.0056818  -0.00402635 ... -0.01034001 -0.00129226
   0.00611573]
 [-0.00896323  0.00506483 -0.00421593 ... -0.00971642  0.00602564
   0.00421353]
 [-0.0090944   0.00710774 -0.01017684 ... -0.00354761  0.00599179
   0.0043524 ]], shape=(100, 65), dtype=float32)


In [12]:
# A prediction at the first timestep (65 logits, i.e. unnormalized scores, one for each character that could occur next)

time_pred = pred[0]
print(len(time_pred))
print(time_pred)

65
tf.Tensor(
[-1.5734300e-03  4.6204082e-03 -5.5786385e-03  3.1328262e-03
  2.5488259e-03  1.1373349e-03  2.1269997e-03  1.1709330e-03
 -1.4056347e-03  7.6413795e-04 -5.6327092e-03  3.5331040e-03
  5.9535261e-05 -3.5696612e-03  3.3087162e-03  4.2239451e-03
 -5.8806292e-04  2.3046955e-03 -5.6508100e-03 -2.7600070e-03
 -1.0754797e-04 -5.7352521e-04  1.1849438e-03 -1.7078919e-04
  6.2300009e-03 -4.8578568e-03 -1.9640685e-03 -3.8548068e-03
 -4.5219050e-03  8.1414748e-03 -6.5348169e-04 -1.5593697e-03
 -2.7652807e-04 -3.4174628e-03 -1.9910380e-03 -9.3363579e-03
  5.8505400e-03 -4.1685752e-03  1.8875634e-03 -4.7506667e-03
  5.7707522e-03  2.0410002e-03  4.5604544e-04  6.6733062e-03
 -2.4346036e-03  4.7878451e-03  4.9976055e-03 -9.3357614e-04
 -3.6273650e-03  4.7359955e-03 -1.1433380e-03 -5.1783142e-04
  6.0219970e-03  4.6808347e-03 -2.4821749e-04 -1.9831012e-03
  3.8046035e-04  8.2397519e-04  1.8877513e-04 -5.4651136e-03
 -2.6196926e-03  2.2336445e-03  5.4804608e-03  2.1179349e-03
  7.494236

In [13]:
# Sample the output distribution (picking a value based on probability)

sampled_indices = tf.random.categorical(pred, num_samples=1)

# reshape that array and convert the integers to characters to see the predicted text
sampled_indices = np.reshape(sampled_indices, (1, -1))[0]
predicted_chars = int_to_text(sampled_indices)
predicted_chars # this is what the model predicts for sequence 1

'wyZgzmwFhswCNySvGh.fLzk&\nf$:SyJIjMwrTz$UJeIQBWRz,HV?.X,3ruRU:!i snQl.EJit,gTJsPKMhhuLsaXjjnQpqGUQzbg'
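
The output is gibberish because the untrained logits are all close to zero, so every character is almost equally likely. Below is a minimal pure-Python sketch of what sampling from logits involves; `softmax` and `sample_index` are hypothetical helpers written for this illustration, not part of TensorFlow, and the four logits are made up.

```python
import math
import random


def softmax(logits):
    """Convert unnormalized scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the max improves numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, rng):
    """Draw one index with probability proportional to softmax(logits)."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
near_uniform = [0.001, -0.002, 0.003, 0.0]  # tiny values, like the untrained outputs
probs = softmax(near_uniform)
picked = sample_index(near_uniform, rng)
print([round(p, 3) for p in probs], picked)
```

Sampling, rather than always taking the highest score, keeps the generated text from getting stuck repeating the same characters.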

In [14]:
# CREATION OF THE LOSS FUNCTION
# The loss function needs to compare the output to the expected output and give a numeric value of how close the two were

def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
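
To see what this loss measures, here is a small pure-Python sketch of sparse categorical cross-entropy for a single timestep. The function name is an assumption for illustration; it follows the usual definition (negative log-probability of the correct class after a softmax) rather than reproducing the Keras code.

```python
import math


def sparse_cross_entropy_from_logits(label, logits):
    """Negative log-probability of the true class, computed from raw logits."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


vocab_size = 65

# A model that knows nothing: every character gets the same score.
uniform_logits = [0.0] * vocab_size
baseline = sparse_cross_entropy_from_logits(3, uniform_logits)
print(baseline, math.log(vocab_size))  # both are ln(65), about 4.17

# A model that is confident and correct: the loss is close to zero.
confident_logits = [0.0] * vocab_size
confident_logits[3] = 10.0
confident = sparse_cross_entropy_from_logits(3, confident_logits)
print(confident)
```

A freshly initialized model should therefore start near a loss of ln(65), and lower values during training mean it is assigning more probability to the correct next character.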

In [15]:
# COMPILING THE MODEL

model.compile(optimizer='adam', loss=loss)

In [16]:
# CREATING CHECKPOINTS
# allowing us to load our model from a checkpoint and continue training it

checkpoint_dir = './RNN_PG_training_checkpoints' # directory where the checkpoints are saved
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}.weights.h5") # file name template, one file per epoch

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True
)
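
The `{epoch}` placeholder in the file name is filled in at save time. As far as I understand it, Keras formats the path with the 1-based epoch number, so the sketch below lists the file names a 15-epoch run would be expected to produce.

```python
import os

checkpoint_dir = './RNN_PG_training_checkpoints'
template = os.path.join(checkpoint_dir, "ckpt_{epoch}.weights.h5")

# One path per epoch, numbered from 1 (assumed to match what the callback writes).
paths = [template.format(epoch=e) for e in range(1, 16)]
print(paths[0])
print(paths[-1])
```

This is why the loading step later asks for `ckpt_15.weights.h5` to get the weights from the final epoch.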

In [35]:
# TRAINING THE MODEL

history = model.fit(data, epochs=15, callbacks=[checkpoint_callback])

Epoch 1/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 149s 861ms/step - loss: 1.5590
Epoch 2/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 158s 909ms/step - loss: 1.4497
Epoch 3/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 156s 899ms/step - loss: 1.3822
Epoch 4/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 157s 908ms/step - loss: 1.3312
Epoch 5/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 160s 925ms/step - loss: 1.2936
Epoch 6/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 162s 935ms/step - loss: 1.2590
Epoch 7/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 154s 891ms/step - loss: 1.2248
Epoch 8/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 157s 907ms/step - loss: 1.1954
Epoch 9/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 153s 881ms/step - loss: 1.1627
Epoch 10/15
172/172 ━━━━━━━━━━━━━━━━━━━━

In [36]:
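# LOADING THE MODEL
# Rebuilding the model with a batch size of 1 (for generation) and loading the checkpoint weights into it

checkpoint_num = 15
model = build_model(VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, batch_size=1)
model.load_weights("./RNN_PG_training_checkpoints/ckpt_" + str(checkpoint_num) + ".weights.h5")
model.build(tf.TensorShape([1, None]))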

In [38]:
def generate_text(model, start_string, char2idx, idx2char):
    num_generate = 800  # number of characters to generate
    
    # Vectorize input string
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)
    
    text_generated = []  # empty list to store results
    
    # Temperature parameter controls randomness
    temperature = 1.0
    
    for i in range(num_generate):
        predictions = model(input_eval)
        
        # Get the predictions for the last character in the sequence
        predictions = predictions[0, -1, :]  # Shape [vocab_size]
        
        # Apply temperature scaling
        predictions = predictions / temperature
        
        # Sample from the distribution
        predicted_id = tf.random.categorical(tf.expand_dims(predictions, 0), num_samples=1)[0, 0].numpy()
        
        # Add predicted character to generated text
        text_generated.append(idx2char[predicted_id])
        
        # Prepare the next input (just the predicted character)
        input_eval = tf.expand_dims([predicted_id], 0)
    
    return (start_string + ''.join(text_generated))
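
The effect of the `temperature` parameter can be seen on a toy example. This is a minimal sketch with made-up logits for three characters; `softmax_with_temperature` is a hypothetical helper for illustration, not a TensorFlow function.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then normalize into probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtracting the max improves numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three characters
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Temperatures below 1.0 concentrate probability on the top-scoring character, giving more predictable text, while temperatures above 1.0 flatten the distribution and give more surprising (and more error-prone) text.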

In [39]:
inp = input("Type a starting string: ")
print(generate_text(model, inp, char2idx, idx2char))

romeous news:
Throw died thy mother, wife, have warm or lie;
And who possess'd them that with a guest
Marr'd for his shame; and here I strive to-morrow,
Or never yet this vount to high a dozen of the park,
Have thought i' the might have given alightent,
With that slept with thems! were I but think
Incurted to his chiefest finger,
Hoothing abuse much more, or gallows, dresses thee at hangs:
For, being tricks now, the Duke of Norfolk, boy.

MIRANDA:
He; and, my lord, thy kinsman moved, had
conscience to your great course to know a whoreison
That would believe me: ther straight restrained so.

Tribundance took you shield not her offences,
A circumstance, is now home more heads your king.
Rost know, gentle vanotock them down;
Grace of my brafs can king, I'll wear thy strength,
Yet by blood came fro
