**RNN PLAY GENERATOR**

We are going to use an *RNN* to generate a play. We will show the *RNN* an example of the kind of text we want it to produce, and it will learn to write its own version. Based on: https://www.tensorflow.org/tutorials/text/text_generation

In [1]:
from keras.preprocessing import sequence
import keras
import tensorflow as tf
import os
import numpy as np

In [2]:
# DOWNLOADING THE DATASET

# Loading the Shakespeare text file (a collection of scenes from several plays, not only Romeo and Juliet)
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 
                                       'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')

# OR, TO LOAD MY OWN DATA (TXT FILE ONLY), I CAN USE:

# from google.colab import files
# path_to_file = list(files.upload().keys())[0] 

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt
1115394/1115394 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step


In [3]:
# READ CONTENTS OF FILE

text = open(path_to_file, 'rb').read().decode(encoding='utf-8') # read as bytes, then decode as UTF-8
print('Text length: {} characters\n'.format(len(text)))
print(text[:250])

Text length: 1115394 characters

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



In [4]:
# ENCODING
# we are going to encode each unique character as a different integer

vocab = sorted(set(text))

# creating the StringLookup layer
ids_from_chars = tf.keras.layers.StringLookup( 
    vocabulary=list(vocab), mask_token=None)

# same layer but to do the opposite
chars_from_ids = tf.keras.layers.StringLookup(
    vocabulary=ids_from_chars.get_vocabulary(), invert=True, mask_token=None)

char2idx = {u:i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

def text_to_int(text):
    return np.array([char2idx[c] for c in text])

text_as_int = text_to_int(text)

# DECODING
# Function that does the opposite (numeric to text)
def int_to_text(ints):
    try:
        ints = ints.numpy()  # tensors are converted to NumPy arrays first
    except AttributeError:
        pass  # already a NumPy array or a list
    return ''.join(idx2char[ints])

print("Text:", text[:13])
print("Encoded:", text_to_int(text[:13]))
print("Decoded:", int_to_text(text_as_int[:13]))

Text: First Citizen
Encoded: [18 47 56 57 58  1 15 47 58 47 64 43 52]
Decoded: First Citizen


In [5]:
# CREATING TRAINING EXAMPLES
# we need to split our data from above into many shorter sequences that we can pass to the model as training examples
# each example uses a sequence of seq_length characters as input and a sequence of seq_length characters as output,
# where the output is the input shifted one character to the right, as below
''' INPUT: Hell || OUTPUT: ello '''

seq_length = 100 
examples_per_epoch = len(text)//(seq_length+1)

char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int) # create training examples/targets

# Using the batch method to turn this stream of characters into batches of desired length
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)

In [6]:
# Splitting those sequences into input and output

def split_input_target(chunk): # Hello
    input_text = chunk[:-1] # hell
    target_text = chunk[1:] # ello
    return input_text, target_text

dataset = sequences.map(split_input_target) # using MAP to apply the function to every entry

# peeking at some examples:
for x, y in dataset.take(2):
    print("\n\nEXAMPLE\nINPUT:", int_to_text(x))
    print("\nOUTPUT:", int_to_text(y))



EXAMPLE
INPUT: First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You

OUTPUT: irst Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You 


EXAMPLE
INPUT: are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you 

OUTPUT: re all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you k
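
The chunking and shifting shown in the examples above can be reproduced without TensorFlow. The following is a minimal sketch, assuming a hypothetical helper `make_examples` (not part of the notebook) that imitates `batch(seq_length+1, drop_remainder=True)` followed by `split_input_target` on a plain Python string.

```python
def make_examples(text, seq_length):
    """Cut text into chunks of seq_length + 1 characters and split each one."""
    chunk_size = seq_length + 1
    # Keep only full chunks, mirroring drop_remainder=True.
    n_chunks = len(text) // chunk_size
    examples = []
    for i in range(n_chunks):
        chunk = text[i * chunk_size:(i + 1) * chunk_size]
        # Input drops the last character, target drops the first one.
        examples.append((chunk[:-1], chunk[1:]))
    return examples


examples = make_examples("Hello world!", seq_length=4)
for inp, tgt in examples:
    print(repr(inp), "->", repr(tgt))
# 'Hell' -> 'ello'
# ' wor' -> 'worl'
```

The trailing "d!" does not fill a whole chunk, so it is dropped. Consecutive chunks do not overlap, which is why the second example in the notebook output starts right where the first one ended.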


In [7]:
# MAKING TRAINING BATCHES

BATCH_SIZE = 64
VOCAB_SIZE = len(vocab) # number of unique characters
EMBEDDING_DIM = 256
RNN_UNITS = 1024
BUFFER_SIZE = 10000 # Buffer size to shuffle the dataset 

data = ( # shuffling the data maintaining a buffer
    dataset
    .shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.AUTOTUNE))  # tf.data.AUTOTUNE is the current name of the experimental alias
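
As a quick sanity check, the number of batches per epoch follows from the text length, the sequence length, and the batch size. This is a small arithmetic sketch using the values printed earlier in the notebook; the variable names `num_sequences` and `steps_per_epoch` are introduced here only for illustration.

```python
text_length = 1_115_394  # characters, from the "Text length" output
seq_length = 100
batch_size = 64

# Each training example consumes seq_length + 1 characters; the remainder is dropped.
num_sequences = text_length // (seq_length + 1)

# Batches are also built with drop_remainder=True.
steps_per_epoch = num_sequences // batch_size

print(num_sequences, steps_per_epoch)  # 11043 172
```

The result of 172 steps agrees with the `172/172` progress counter in the training log further down.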

In [8]:
# BUILDING THE MODEL
# We will be using an embedding layer, an LSTM, and one dense layer that contains a node for each unique character in the training data.

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(batch_size=batch_size, 
                              shape=[None,]),
        tf.keras.layers.Embedding(vocab_size, 
                                  embedding_dim),  
        tf.keras.layers.LSTM(rnn_units, 
                             return_sequences=True, 
                             stateful=True, 
                             recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model

model = build_model(VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, BATCH_SIZE)
model.summary()


**CREATING A LOSS FUNCTION**

We need to create our own loss function because our model outputs a tensor of shape (64, sequence_length, 65):
for every sequence in the batch and every timestep, 65 logits (unnormalized scores) that define the distribution over the next character.
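
To make the idea concrete, here is a minimal pure-Python sketch of the per-timestep quantity that a sparse categorical cross-entropy on logits computes, namely `-log(softmax(logits)[label])`. The helper `sparse_ce_from_logits` is hypothetical and only illustrates the formula; it is not the Keras implementation.

```python
import math


def sparse_ce_from_logits(label, logits):
    """Cross-entropy for one timestep: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


vocab_size = 65

# All logits equal: the model has no preference, so the loss is log(65).
uniform_logits = [0.0] * vocab_size
baseline = sparse_ce_from_logits(label=18, logits=uniform_logits)
print(round(baseline, 4))  # 4.1744

# A large logit on the correct character gives a much lower loss.
confident_logits = [0.0] * vocab_size
confident_logits[18] = 5.0
print(sparse_ce_from_logits(18, confident_logits) < baseline)  # True
```

An untrained model produces logits close to zero, so its loss should start near log(65) ≈ 4.17; values well below that indicate the model has learned something about the text.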

In [9]:
# looking at a sample input and the output from our untrained model (to understand what the model is giving us)

for input_example_batch, target_example_batch in data.take(1):
    example_batch_predictions = model(input_example_batch) # ask our model for a prediction on our first batch of training data
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

(64, 100, 65) # (batch_size, sequence_length, vocab_size)



In [10]:
# the prediction is an array of 64 arrays, one for each entry in the batch

print(len(example_batch_predictions))
print(example_batch_predictions)

64
tf.Tensor(
[[[-1.96854398e-03 -4.06005347e-06  2.72023561e-03 ...  1.47894592e-04
   -4.24593966e-03  4.34238883e-03]
  [ 5.63524780e-04 -2.21901410e-03  2.98735965e-03 ...  8.23849812e-03
    2.41841725e-03  3.93206673e-03]
  [ 4.06407844e-03  6.05705834e-04  4.11187578e-03 ...  6.77464856e-03
   -2.27501616e-03  2.21639406e-03]
  ...
  [-7.16114591e-04 -1.47113705e-03 -4.07920731e-03 ... -5.86260343e-03
   -3.07614054e-03  8.27086624e-03]
  [ 3.79860261e-03  1.40824879e-04 -3.52426013e-03 ... -2.50373571e-03
   -5.25681162e-03  7.07595702e-03]
  [ 1.98652130e-03 -4.02235286e-03 -4.53911070e-03 ... -6.30303461e-04
   -7.65364757e-03  8.51250347e-03]]

 [[ 6.47204835e-03 -2.52244575e-03  3.17517365e-03 ... -2.19842908e-03
   -1.32966763e-03  6.30191993e-04]
  [ 8.63817986e-03  1.17235968e-03  3.47405975e-03 ... -7.84174306e-04
   -2.19022436e-03  3.04889376e-03]
  [ 4.11198288e-03  2.61287880e-03  3.60067328e-03 ...  1.56026625e-03
    2.56186537e-03  1.02714375e-02]
  ...
  [ 6.151

In [11]:
# Examination of one prediction (2d array of length 100, where each interior array is the prediction for the next character)

pred = example_batch_predictions[0]
print(len(pred))
print(pred)

100
tf.Tensor(
[[-1.9685440e-03 -4.0600535e-06  2.7202356e-03 ...  1.4789459e-04
  -4.2459397e-03  4.3423888e-03]
 [ 5.6352478e-04 -2.2190141e-03  2.9873597e-03 ...  8.2384981e-03
   2.4184173e-03  3.9320667e-03]
 [ 4.0640784e-03  6.0570583e-04  4.1118758e-03 ...  6.7746486e-03
  -2.2750162e-03  2.2163941e-03]
 ...
 [-7.1611459e-04 -1.4711370e-03 -4.0792073e-03 ... -5.8626034e-03
  -3.0761405e-03  8.2708662e-03]
 [ 3.7986026e-03  1.4082488e-04 -3.5242601e-03 ... -2.5037357e-03
  -5.2568116e-03  7.0759570e-03]
 [ 1.9865213e-03 -4.0223529e-03 -4.5391107e-03 ... -6.3030346e-04
  -7.6536476e-03  8.5125035e-03]], shape=(100, 65), dtype=float32)


In [12]:
# A prediction at the first timestep (65 logits, one per character; a higher value means the character is more likely to come next)

time_pred = pred[0]
print(len(time_pred))
print(time_pred)

65
tf.Tensor(
[-1.9685440e-03 -4.0600535e-06  2.7202356e-03 -3.4619339e-03
  3.8295788e-05  3.4039812e-03  4.4636996e-04 -1.0261127e-03
  3.5417613e-03 -8.9507521e-04  1.7350902e-03  4.4658128e-04
 -2.4777646e-03  7.5575238e-04 -2.8466645e-03 -1.4116968e-03
 -2.7982614e-04  3.6650426e-03  3.0770232e-03  1.2445184e-03
 -3.1421266e-03 -3.0151573e-03  3.8424530e-03 -8.3704939e-04
 -8.0422452e-03  1.7571847e-03 -6.1106654e-03  1.8430933e-03
 -1.7096031e-04 -1.0426819e-03  3.3868779e-03 -6.1513754e-05
 -6.9712028e-03 -3.6431253e-03 -3.1073650e-04  9.9349848e-04
 -3.1357654e-03 -3.0471792e-03  1.8864697e-03  3.4473019e-03
  3.6862069e-03  2.5113180e-04 -2.6744672e-03  1.6955547e-03
  4.0773940e-03  4.5194463e-03  3.1239199e-03  3.0303988e-04
 -4.2824759e-04  2.3619139e-03 -2.1978570e-03  5.6861914e-03
 -6.9150759e-04 -1.5157755e-03 -6.2308386e-03 -2.2787327e-04
 -1.6843302e-03  4.4376603e-03 -2.2571564e-03 -7.6102716e-04
 -2.3208600e-03  1.6376371e-03  1.4789459e-04 -4.2459397e-03
  4.342388

In [13]:
# Sample the output distribution (picking a value based on probability)

sampled_indices = tf.random.categorical(pred, num_samples=1)

# reshape that array and convert the integers to characters to see the predicted text
sampled_indices = np.reshape(sampled_indices, (1, -1))[0]
predicted_chars = int_to_text(sampled_indices)
predicted_chars # this is what the model predicts for sequence 1

'NqB?F$pRTjiXfg?loEfkf!AAfsa,:MWDGiDlRQdtV\nJYfS\neMkQ&Tk3lZPG.ef3$svbKa;N-wcv-Y$Gt? Jrd,C$smCQyKCorkW '
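
The sampling step can be pictured as "turn each row of logits into probabilities, then draw one index per row". Below is a rough pure-Python sketch of that idea on a toy three-character vocabulary; `softmax` and `sample_per_timestep` are hypothetical helpers that stand in for the sampling done by `tf.random.categorical`, not its actual implementation.

```python
import math
import random


def softmax(logits):
    """Convert logits into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_per_timestep(logit_rows, rng):
    """Draw one index per row, with probability softmax(row)."""
    return [rng.choices(range(len(row)), weights=softmax(row), k=1)[0]
            for row in logit_rows]


rng = random.Random(0)  # seeded only to make the sketch repeatable
toy_vocab = ["a", "b", "c"]

# First row: near-zero logits, as an untrained model produces.
# Second row: a strong preference for "c".
logit_rows = [[0.001, -0.002, 0.003], [0.0, 0.0, 8.0]]

ids = sample_per_timestep(logit_rows, rng)
print("".join(toy_vocab[i] for i in ids))
print([round(p, 3) for p in softmax(logit_rows[0])])
```

With near-zero logits every character is almost equally likely, which is why the untrained model's sample above looks like random noise.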

In [14]:
# CREATION OF THE LOSS FUNCTION
# The loss function needs to compare the output to the expected output and give a numeric value of how close the two were

def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

In [15]:
# COMPILING THE MODEL

model.compile(optimizer='adam', loss=loss)

In [16]:
# CREATING CHECKPOINTS
# allowing us to load our model from a checkpoint and continue training it

checkpoint_dir = './RNN_PG_training_checkpoints2' # directory where the checkpoints are saved
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}.weights.h5") # file name pattern, one file per epoch

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True
)
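
The `{epoch}` placeholder in the file name is filled in once per epoch. As a small sketch of the file names to expect, assuming the callback substitutes 1-based epoch numbers (consistent with the later cell that loads `ckpt_15` after 15 epochs):

```python
import os

checkpoint_dir = "./RNN_PG_training_checkpoints2"
template = os.path.join(checkpoint_dir, "ckpt_{epoch}.weights.h5")

epochs = 15
# One weights file per epoch, numbered from 1 to epochs (assumed numbering).
paths = [template.format(epoch=e) for e in range(1, epochs + 1)]

print(paths[0])
print(paths[-1])
```

Because every epoch writes its own file, any of the 15 checkpoints can be loaded later, not only the last one.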

In [18]:
# TRAINING THE MODEL

history = model.fit(data, epochs=15, callbacks=[checkpoint_callback])

Epoch 1/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 11s 46ms/step - loss: 2.9574
Epoch 2/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 9s 46ms/step - loss: 1.9205
Epoch 3/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 10s 46ms/step - loss: 1.6544
Epoch 4/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 9s 45ms/step - loss: 1.5204
Epoch 5/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 9s 45ms/step - loss: 1.4443
Epoch 6/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 9s 45ms/step - loss: 1.3884
Epoch 7/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 9s 45ms/step - loss: 1.3512
Epoch 8/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 9s 44ms/step - loss: 1.3193
Epoch 9/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 10s 44ms/step - loss: 1.2876
Epoch 10/15
172/172 ━━━━━━━━━━━━━━━━━━━━ 9s 4

In [20]:
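# LOADING THE MODEL
# Loading the trained weights from a checkpoint into the existing model
# Note: the LSTM is stateful and was built with batch_size=64, so calling it on a single prompt (a batch of 1)
# logs "Invalid input_h shape: [64,1,1024] [1,1,1024]" warnings during generation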

checkpoint_num = 15
model.load_weights("./RNN_PG_training_checkpoints2/ckpt_" + str(checkpoint_num) + ".weights.h5")
model.build(tf.TensorShape([64, None]))

In [21]:
def generate_text(model, start_string, char2idx, idx2char):
    num_generate = 800  # number of characters to generate
    
    # Vectorize input string
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)
    
    text_generated = []  # empty list to store results
    
    # Temperature parameter controls randomness
    temperature = 1.0
    
    for i in range(num_generate):
        predictions = model(input_eval)
        
        # Get the predictions for the last character in the sequence
        predictions = predictions[0, -1, :]  # Shape [vocab_size]
        
        # Apply temperature scaling
        predictions = predictions / temperature
        
        # Sample from the distribution
        predicted_id = tf.random.categorical(tf.expand_dims(predictions, 0), num_samples=1)[0, 0].numpy()
        
        # Add predicted character to generated text
        text_generated.append(idx2char[predicted_id])
        
        # Prepare the next input (just the predicted character)
        input_eval = tf.expand_dims([predicted_id], 0)
    
    return (start_string + ''.join(text_generated))
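
The effect of the `temperature` parameter is easier to see on a small example. This is a minimal pure-Python sketch, assuming a hypothetical helper `softmax_with_temperature` that divides the logits by the temperature before normalizing, as the generation loop does before sampling.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then convert them to probabilities."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # toy scores for three characters
results = {}
for t in (0.5, 1.0, 2.0):
    results[t] = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in results[t]])
```

A temperature below 1 concentrates probability on the top-scoring character, which gives more predictable text; a temperature above 1 flattens the distribution, which gives more surprising (and more error-prone) text.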

In [22]:
inp = input("Type a starting string: ")
print(generate_text(model, inp, char2idx, idx2char))

2025-03-07 18:09:03.869472: W tensorflow/core/framework/op_kernel.cc:1841] OP_REQUIRES failed at cudnn_rnn_ops.cc:1723 : INVALID_ARGUMENT: Invalid input_h shape: [64,1,1024] [1,1,1024]

romeofalty,
Proclaifising sovereign.

First Murderer:
As if thou wert, nor meet forgive before divint.
Dost Clifford and our probe noble finds,
The wrong, speak good, down to break with maid you were,
And now by thy other's vieat yet together, all
the true service: Tranio, to part your father. say, that I
Will take her uncle fast, or I would be ruled
And bring the traitor with this business?

HORTENSIO:
Sir, what say I did? What shouldged,
That gave the effected hearts? Our company
So frank i' the vantage, how do ere
he is deeds, when you had presume,
It calleated, and not to be are so,
Time of Rome, trush, to marry him ous a bride alt,
and makes our farther flies that frownd and mistake.

MENENIUS:
Oy, if you be soper than; I'll agreated
Ten thousand drums cont but each ears and present.

KING
