<h1 style="text-align: center; font-size: 50px;"> 📜 Text Generation with Neural Networks and TensorFlow</h1>

In this notebook, we demonstrate how to generate text with a character-based RNN in TensorFlow, trained on a dataset of Shakespeare's writing.

Notebook Overview
- Start Execution
- Install and Import Libraries
- Configure Settings
- Verify Assets
- Get Text Data
- Preparing textual data
- Text Vectorization
- Creating Training Batches
- Creating the GRU Model
- Instance of the Model
- Training the model
- Saving the Model
- Load Model
- Generating Predictions

## Start Execution

In [1]:
import logging
import time

# Configure logger
logger: logging.Logger = logging.getLogger("run_workflow_logger")
logger.setLevel(logging.INFO)
logger.propagate = False  # Prevent duplicate logs from parent loggers

# Set formatter
formatter: logging.Formatter = logging.Formatter(
    fmt="%(asctime)s - %(levelname)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S"
)

# Configure and attach stream handler
stream_handler: logging.StreamHandler = logging.StreamHandler()
stream_handler.setFormatter(formatter)
logger.addHandler(stream_handler)

In [2]:
start_time = time.time()  

logger.info("Notebook execution started.")

2025-08-20 17:56:11 - INFO - Notebook execution started.


## Install and Import Libraries

In [3]:
%%time

%pip install -r ../requirements.txt --quiet

# ------------------------ Standard Library Imports ------------------------
import logging
import warnings
from pathlib import Path
from datetime import datetime

# ------------------------ Third-Party Libraries ------------------------
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Embedding, GRU, InputLayer
from tensorflow.keras.losses import sparse_categorical_crossentropy
from tensorflow.keras.models import load_model
from tensorflow.keras.callbacks import TensorBoard
import torch
from torch import nn


Note: you may need to restart the kernel to use updated packages.


2025-08-20 17:56:18.804249: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-08-20 17:56:19.070750: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1755712579.211481     302 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1755712579.254066     302 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-08-20 17:56:19.528192: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instr

CPU times: user 11 s, sys: 5.94 s, total: 17 s
Wall time: 17.9 s


## Configure Settings

In [4]:
warnings.filterwarnings("ignore")

In [5]:
# Define the model file name used throughout the notebook
MODEL_NAME = "tf_rnn_model.h5"

# Set up the paths
DATA_PATH = "../data/shakespeare.txt"
TENSORBOARD_PATH = "/phoenix/tensorboard/tensorlogs"


# Set up the chunk separator for text processing
CHUNK_SEPARATOR = "\n\n"

In [6]:
start_time_all_execution = datetime.now()  # Wall-clock start time, kept to measure how long the whole notebook takes to run

## Verify Assets

In [7]:
# Check whether the Dataset file exists
is_dataset_available = Path(DATA_PATH).exists()

# Log the configuration status of the dataset
if is_dataset_available:
    logger.info("The Dataset is properly configured.")
else:
    logger.warning(
        "The Dataset is not properly configured. Please check if the Dataset was downloaded "
        "in your project on AI Studio."
    )

2025-08-20 17:56:29 - INFO - The Dataset is properly configured.


In [8]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cuda


## Get Text Data

This is the text our generations will be based on; the goal is to produce 'Shakespearean' text.

The file opens with Shakespeare's Sonnet 1, shown in the excerpt printed below. It is one of the 154 sonnets written by William Shakespeare that were first published in 1609. Like many of the others, this sonnet deals with beauty, procreation, and the transient nature of life, urging the beautiful to reproduce so that their beauty can live on through their offspring.

In [9]:
path_to_file = DATA_PATH
# Read the whole file; the context manager closes it afterwards
with open(path_to_file, 'r') as f:
    text = f.read()

In [10]:
logger.info('First 600 chars: \n')
print(text[:600])

2025-08-20 17:56:30 - INFO - First 600 chars: 




                     1
  From fairest creatures we desire increase,
  That thereby beauty's rose might never die,
  But as the riper should by time decease,
  His tender heir might bear his memory:
  But thou contracted to thine own bright eyes,
  Feed'st thy light's flame with self-substantial fuel,
  Making a famine where abundance lies,
  Thy self thy foe, to thy sweet self too cruel:
  Thou that art now the world's fresh ornament,
  And only herald to the gaudy spring,
  Within thine own bud buriest thy content,
  And tender churl mak'st waste in niggarding:
    Pity the world, or else th


## Preparing textual data

We need to encode our data to give the model a proper numerical representation of our text.

In [11]:
# builds a sorted list of the unique characters found in the text
vocab = sorted(set(text))

## Text Vectorization

In [12]:
char_to_int = {u:i for i, u in enumerate(vocab)}
# assigns a unique integer to each character in a dictionary format,
# creating the character-to-integer mapping that is used to encode the text

In [13]:
int_to_char = np.array(vocab)
# the reverse lookup: indexing this array with an integer returns its character,
# which is used to transform encoded predictions back into characters

In [14]:
encoded_text = np.array([char_to_int[c] for c in text])
# encodes the entire text as an array of integers, with each integer representing the character at that position
# in the text according to the encoder dictionary
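As a quick sanity check on the two mappings, the sketch below builds them for a toy string and round-trips it through encoding and decoding. The string `toy_sample` and every `toy_` name are made up for illustration; only the construction mirrors the cells above.

```python
import numpy as np

toy_sample = "hello world"  # hypothetical stand-in for the full text

toy_vocab = sorted(set(toy_sample))
toy_char_to_int = {ch: i for i, ch in enumerate(toy_vocab)}  # character -> integer (encode)
toy_int_to_char = np.array(toy_vocab)                        # integer -> character (decode)

# Encode every character, then decode the integers back into a string
toy_encoded = np.array([toy_char_to_int[c] for c in toy_sample])
toy_decoded = "".join(toy_int_to_char[toy_encoded])

print(toy_vocab)
print(toy_encoded)
print(toy_decoded)
```

If decoding returns the original string, the two mappings are consistent with each other.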

## Creating Training Batches

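Training batches divide the dataset into smaller groups of examples that are fed to the model one group at a time during training. Here the encoded text is first cut into chunks of `seq_len + 1` characters, so that each chunk can later be split into an input and a target of length `seq_len`.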

In [15]:
seq_len = 120 # length of sequence for a training example
total_num_seq = len(text)//(seq_len+1) # total number of training examples

# Create Training Sequences
char_dataset = tf.data.Dataset.from_tensor_slices(encoded_text)
sequences = char_dataset.batch(seq_len+1, drop_remainder=True)

I0000 00:00:1755712591.299069     302 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13021 MB memory:  -> device: 0, name: NVIDIA RTX 5000 Ada Generation Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9


In [16]:
def create_seq_targets(seq):
    """
    Split a sequence into an input and a target shifted by one character.

    Args:
        seq: sequence of encoded characters, of length seq_len + 1

    Returns:
        The text input and the corresponding target.
    """
    try:
        input_txt = seq[:-1]   # all characters except the last
        target_txt = seq[1:]   # all characters except the first
        return input_txt, target_txt
    except Exception as e:
        logger.error(f"Error creating sequences of targets: {str(e)}")
        raise  # re-raise so a failure is not silently turned into a None return
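To make the one-character shift concrete, here is a minimal sketch on a toy chunk. It uses plain NumPy rather than `tf.data`, and the string `"Hello"` is an invented example; the slicing is the same as in `create_seq_targets`.

```python
import numpy as np

toy_chunk = np.array(list("Hello"))  # hypothetical chunk of seq_len + 1 = 5 characters

toy_input = toy_chunk[:-1]   # every character except the last
toy_target = toy_chunk[1:]   # every character except the first

# At each position the model sees the input character and must predict the target
for inp, tgt in zip(toy_input, toy_target):
    print(f"{inp} -> {tgt}")
```

Each input character is paired with the character that follows it in the text, which is exactly the next-character prediction task the model is trained on.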

In [17]:
dataset = sequences.map(create_seq_targets)

In [18]:
# Batch size
batch_size = 128
buffer_size = 10000

dataset = dataset.shuffle(buffer_size).batch(batch_size, drop_remainder=True)
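The number of training steps per epoch follows from the two `drop_remainder=True` batching calls. The arithmetic below is a sketch under an assumed corpus length: `example_n_chars` is a hypothetical value, not the real size of the dataset file.

```python
example_seq_len = 120
example_batch_size = 128
example_n_chars = 1_000_000  # hypothetical corpus length, not the real file size

# First batching: chunks of seq_len + 1 characters, incomplete chunk dropped
example_n_sequences = example_n_chars // (example_seq_len + 1)

# Second batching: groups of batch_size sequences, incomplete batch dropped
example_steps_per_epoch = example_n_sequences // example_batch_size

print(example_n_sequences, example_steps_per_epoch)
```

With these assumed numbers the dataset yields 8264 sequences and 64 full batches per epoch.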

## Creating the GRU Model

In [19]:
# Length of the vocabulary in chars
vocab_size = len(vocab)
# The embedding dimension
embed_dim = 64
# Number of RNN units
rnn_neurons = 1026

In [20]:
def sparse_cat_loss(y_true, y_pred):
    # The Dense output layer has no softmax, so the model emits raw logits
    return sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)
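For intuition about what `from_logits=True` means, the following NumPy sketch computes the same kind of quantity by hand: a log-softmax over the raw scores, followed by the negative log-probability of the true class. The function name `sparse_cat_loss_np` and the example values are illustrative assumptions, not part of the notebook or of Keras.

```python
import numpy as np

def sparse_cat_loss_np(y_true, logits):
    """NumPy illustration: per-position cross-entropy computed from raw logits."""
    # Subtract the row maximum for numerical stability before exponentiating
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the true class at each position and negate it
    return -np.take_along_axis(log_probs, y_true[..., None], axis=-1).squeeze(-1)

toy_logits = np.array([[2.0, 1.0, 0.1]])  # made-up scores for a 3-character vocabulary
toy_labels = np.array([0])                # index of the true next character

toy_loss = sparse_cat_loss_np(toy_labels, toy_logits)
print(toy_loss)
```

The loss is small when the true character receives the highest score and grows as its score falls behind the others.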

In [21]:
def create_model(vocab_size, embed_dim, rnn_neurons, batch_size):
    """Architecture to create the model.

    Args:
        vocab_size: Length of the vocabulary in chars.
        embed_dim: Embedding dimension.
        rnn_neurons: Number of RNN units.
        batch_size: Size of the batches.

    Returns:
        Model.
    """
    try:
        model = Sequential()
        model.add(InputLayer(batch_shape=(batch_size, None)))
        
        model.add(Embedding(input_dim=vocab_size, output_dim=embed_dim))

        model.add(GRU(rnn_neurons,
                    return_sequences=True,
                    stateful=True,
                    recurrent_initializer='glorot_uniform'))

        model.add(Dense(vocab_size))
        model.compile(optimizer='adam', loss=sparse_cat_loss)
        logger.info("Model architecture created successfully")
        return model
    except Exception as e:
        logger.error(f"Error creating model architecture: {str(e)}")
        raise  # re-raise so the caller does not receive None instead of a model

model = create_model(vocab_size, embed_dim, rnn_neurons, batch_size)
model.summary()

2025-08-20 17:56:33 - INFO - Model architecture created successfully


## Instance of the Model

In [22]:
model = create_model(
    vocab_size=vocab_size,
    embed_dim=embed_dim,
    rnn_neurons=rnn_neurons,
    batch_size=batch_size
)


2025-08-20 17:56:33 - INFO - Model architecture created successfully


In [23]:
# TensorBoard
log_dir = TENSORBOARD_PATH
tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)

## Training the model

In [24]:
for input_example_batch, target_example_batch in dataset.take(1):

  # Run the still-untrained model on one batch from the dataset
  example_batch_predictions = model(input_example_batch)

  # Display the dimensions of the predictions
  print(example_batch_predictions.shape, " <=== (batch_size, sequence_length, vocab_size)")

I0000 00:00:1755712597.921392     449 cuda_dnn.cc:529] Loaded cuDNN version 90300


(128, 120, 84)  <=== (batch_size, sequence_length, vocab_size)


2025-08-20 17:56:38.142572: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence


In [25]:
sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
# Drop the last axis so the result is a flat array rather than a list of lists
sampled_indices = tf.squeeze(sampled_indices,axis=-1).numpy()

In [26]:
%%time

epochs = 20
model.fit(dataset,epochs=epochs, callbacks=[tensorboard_callback])

Epoch 1/20
[1m351/351[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m19s[0m 41ms/step - loss: 2.8689
Epoch 2/20
[1m351/351[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m19s[0m 43ms/step - loss: 1.6471
Epoch 3/20
[1m351/351[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m17s[0m 40ms/step - loss: 1.4022
Epoch 4/20
[1m351/351[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m16s[0m 38ms/step - loss: 1.3079
Epoch 5/20
[1m351/351[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m16s[0m 38ms/step - loss: 1.2562
Epoch 6/20
[1m351/351[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 38ms/step - loss: 1.2221
Epoch 7/20
[1m351/351[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 38ms/step - loss: 1.1953
Epoch 8/20
[1m351/351[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 37ms/step - loss: 1.1734
Epoch 9/20
[1m351/351[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 37ms/step - loss: 1.1541
Epoch 10/20
[1m351/351[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15

<keras.src.callbacks.history.History at 0x7ca326544ec0>

## Saving the Model

In [27]:
model_name = MODEL_NAME

In [28]:
model.save(f'models/{model_name}') 
logger.info("Model saved")

2025-08-20 18:01:50 - INFO - Model saved


## Load Model

In [29]:
model = create_model(vocab_size, embed_dim, rnn_neurons, batch_size=1)

model.load_weights(f'models/{model_name}')

model.build(tf.TensorShape([1, None]))

2025-08-20 18:01:51 - INFO - Model architecture created successfully


## Generating Predictions

In [30]:
def generate_text(model, start_seed="The ", gen_size=100, temp=1.0):
    """
    Generates a sequence of text using the trained character-level language model.

    Args:
        model: Model built by create_model (with batch_size=1).
        start_seed: Characters that form the beginning of the text.
        gen_size: Number of characters to generate. Defaults to 100.
        temp: Temperature that controls the randomness of the predictions;
            higher values give more surprising text, lower values more predictable text.

    Returns:
        The full generated text including the seed and the newly predicted characters.
    """
    try:
        num_generate = gen_size
        input_eval = [char_to_int[s] for s in start_seed]
        input_eval = tf.expand_dims(input_eval, 0)
        text_generated = []
        temperature = temp


        for i in range(num_generate):
            predictions = model(input_eval)
            predictions = tf.squeeze(predictions, 0)
            predictions = predictions / temperature
            predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()
            input_eval = tf.expand_dims([predicted_id], 0)
            text_generated.append(int_to_char[predicted_id])

        return start_seed + ''.join(text_generated)
    except Exception as e:
        logger.error(f"Error making predictions: {str(e)}")
        raise  # re-raise so the caller does not receive None instead of text
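The effect of the `temp` argument can be seen without a trained model. The sketch below is a NumPy illustration under assumed values: `softmax_with_temperature` and `toy_scores` are hypothetical names, and the scores are invented. It divides the logits by the temperature, as `generate_text` does, and then converts them into probabilities.

```python
import numpy as np

def softmax_with_temperature(logits, temp):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = np.asarray(logits, dtype=float) / temp
    scaled -= scaled.max()  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

toy_scores = [2.0, 1.0, 0.1]  # made-up scores for three candidate characters

# Collect the distribution for a low, a neutral, and a high temperature
toy_probs = {t: softmax_with_temperature(toy_scores, t) for t in (0.5, 1.0, 2.0)}
for t, p in toy_probs.items():
    print(t, np.round(p, 3))
```

A temperature below 1 concentrates probability on the highest-scoring character, while a temperature above 1 flattens the distribution and makes unlikely characters more probable when sampling.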


#### Generating 1000 characters of text starting with the word 'Confidence'

In [31]:
for layer in model.layers:
    if hasattr(layer, 'reset_states'):
        layer.reset_states()
print(generate_text(model, start_seed="Confidence", gen_size=1000))

Confidence.
  KING RICHARD. Thanks, gracious. Even here 'tis ript.
  CASSIO. Do not you look, and bear you; much my mistress,
    He hear the Queen's brains, not foolish that blow shielded thou know'st it, sir. Call
    thus.  good brother,
    Allow me,'t fast, and, spite, every one of child.
  TROILUS. Be thou as fathors paying homage. Well, the uglye simplicity.
    Now start-something fellow; but therefore
    Was never your custom'd look each other
    To man of mildly ignorant back,
    None else now forc'd so God. What news, your plainness
    finds not their age to bear'd?
  VALENTINE. No, how the better door! Look of our journey,
    No longer gnaw thee to out,
    And in this kind of five for dreams on spur;
    Peering her faction; pothey, and his point.
  4. Watch. Note learning, brother, for all, Come, puth a madman himself.

                  Enter CHIRON,  How now, Sir Thouchs a park him; and
    no more than join'd, I must needs but bell retain you
    against all. Marc

#### Generating 1000 characters of text starting with the word 'Love'

In [32]:
print(generate_text(model, start_seed="Love", gen_size=1000))

Love! That cuttingly
    say't by her thread as English toans, as false Lupitol, son.
  HELENA. What news, how long bach way?.
    I have sat for me, my lord among,
    Thou unchanted from the bulk
    That thou art torn and bring you first.  
    Then thou art true-like the grief that wring you king.
    Saddle which are these the fools some wealth INE. Tell me, my your company-hangings is in lume, yet sweetly lack'd. Here comes it, Trim. Unless I was to de
    Which from the pretty overcale weak
    The men that follows here, and you mistake
    Betwixt torn furnish'd firstown that by A maid or sweetest flames,
    But he himself is my pot's to this tomb,
    I do not to the Tepera's cloak cants in thee,
    O swill fills kindly walk, and song to Greeks,
    Behold what voice may awake thee stone;

                   Some the ladiest climbing!
  Edg. [To VALENTIN]
  KING RICHARK. My love. Believe will I come to Aufidemisonemay should stink a young venon and a familiar buy to lish and

In [33]:
end_time: float = time.time()
elapsed_time: float = end_time - start_time
elapsed_minutes: int = int(elapsed_time // 60)
elapsed_seconds: float = elapsed_time % 60

logger.info(f"⏱️ Total execution time: {elapsed_minutes}m {elapsed_seconds:.2f}s")
logger.info("✅ Notebook execution completed successfully.")

2025-08-20 18:02:17 - INFO - ⏱️ Total execution time: 6m 5.31s
2025-08-20 18:02:17 - INFO - ✅ Notebook execution completed successfully.


Built with ❤️ using Z by HP AI Studio.