This project explores text generation with a character-based RNN (credit goes to Google Cloud for the original project). The dataset is a collection of Shakespeare's writing taken from Andrej Karpathy's "The Unreasonable Effectiveness of Recurrent Neural Networks". Given a sequence of characters from this data (for example, "Shakespear"), the model is trained to predict the next character ("e"). Calling the model repeatedly then generates longer sequences of text.

Environment Setup:
Importing TensorFlow and any other necessary libraries.

In [1]:
import os
import warnings

# Ignore warnings
warnings.filterwarnings("ignore")

# Set TensorFlow logging level to suppress unnecessary messages
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

import time
import numpy as np
import tensorflow as tf

Downloading the Shakespeare dataset.
Modify the subsequent line to execute this code with any custom data.

In [2]:
path_to_file = tf.keras.utils.get_file(
    "shakespeare.txt",
    "https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt",
)

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt


Read the data
First, open the downloaded file and decode its contents.

In [3]:
# Open the file and read its contents as bytes, then decode using UTF-8 encoding
with open(path_to_file, "rb") as file:
    text = file.read().decode(encoding="utf-8")

# Print the length of the text to verify the number of characters
print(f"The length of the text is: {len(text)} characters")

The length of the text is: 1115394 characters


Let's examine the initial 250 characters within the text.

In [4]:
# Display the first 250 characters of the text
print(text[:250])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



Let's determine the total count of unique characters present in our corpus or document.

In [5]:
# Create a sorted set to obtain unique characters, then print the count
vocab = sorted(set(text))
print(f"There are {len(vocab)} unique characters.")

There are 65 unique characters.


Process the text
Convert the text into numerical vectors
Prior to training, it's essential to transform the strings into numerical forms.

By employing the tf.keras.layers.StringLookup layer, each character can be converted into a unique numeric identifier. Initially, the text needs to be segmented into tokens for this process to be effective.

In [7]:
example_texts = ["abcdefg", "xyz"]

# Split the texts into characters
chars = tf.strings.unicode_split(example_texts, input_encoding="UTF-8")
chars

<tf.RaggedTensor [[b'a', b'b', b'c', b'd', b'e', b'f', b'g'], [b'x', b'y', b'z']]>

Next, instantiate the tf.keras.layers.StringLookup layer.

In [9]:
# Create the tf.keras.layers.StringLookup layer
ids_from_chars = tf.keras.layers.StringLookup(
    vocabulary=list(vocab), mask_token=None
)

It facilitates the conversion from tokens to character IDs.

In [10]:
# Utilize the created StringLookup layer to convert tokens into character IDs
ids = ids_from_chars(chars)
ids

<tf.RaggedTensor [[40, 41, 42, 43, 44, 45, 46], [63, 64, 65]]>

As the objective of this project revolves around text generation, it's crucial to reverse this encoding process and reconstruct human-readable strings from it. To accomplish this, I am going to employ tf.keras.layers.StringLookup(..., invert=True).

Note that the inverse layer is built from ids_from_chars.get_vocabulary() rather than from the original vocabulary created with sorted(set(text)). This ensures that the [UNK] token is handled consistently.

In [11]:
# Create the StringLookup layer to convert character IDs back to tokens
chars_from_ids = tf.keras.layers.StringLookup(
    vocabulary=ids_from_chars.get_vocabulary(), invert=True, mask_token=None
)

This layer retrieves characters from the ID vectors and presents them as a tf.RaggedTensor composed of characters.

In [12]:
# Utilize the StringLookup layer to convert character IDs back into tokens
chars = chars_from_ids(ids)
chars

<tf.RaggedTensor [[b'a', b'b', b'c', b'd', b'e', b'f', b'g'], [b'x', b'y', b'z']]>

I aim to use tf.strings.reduce_join to concatenate the characters back into strings.

In [13]:
# Utilize tf.strings.reduce_join to concatenate characters into strings
tf.strings.reduce_join(chars, axis=-1).numpy()

array([b'abcdefg', b'xyz'], dtype=object)

In [14]:
# Define a function to reconstruct text from character IDs
def text_from_ids(ids):
    return tf.strings.reduce_join(chars_from_ids(ids), axis=-1)
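
To make the encoding round trip concrete without TensorFlow, here is a minimal pure-Python sketch of the same idea. The toy corpus and the helper names (encode, decode) are assumptions made for illustration. The one detail borrowed from StringLookup is that index 0 is reserved for the [UNK] token, which is why the real character IDs above start at 1.

```python
# Toy corpus standing in for the Shakespeare text (an assumption for illustration).
toy_text = "hello world"

# Same recipe as the notebook: the sorted unique characters form the vocabulary.
toy_vocab = sorted(set(toy_text))

# Index 0 is reserved for the out-of-vocabulary token, so real characters start at 1.
UNK = "[UNK]"
id_to_char = [UNK] + toy_vocab
char_to_id = {ch: i for i, ch in enumerate(id_to_char)}


def encode(s):
    """Map each character to its ID; unseen characters map to 0 ([UNK])."""
    return [char_to_id.get(ch, 0) for ch in s]


def decode(ids):
    """Map IDs back to characters and join them into one string."""
    return "".join(id_to_char[i] for i in ids)


ids = encode("hold")
print(ids)                     # [4, 6, 5, 2]
print(decode(ids))             # hold
print(decode(encode("help")))  # hel[UNK], because 'p' is not in the toy vocabulary
```

Building the decoder from the same list that includes [UNK] is what keeps the two directions consistent, which is the reason the notebook uses get_vocabulary() for the inverse layer.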

Prediction Task:
The goal is to predict the most probable next character given a single character or a sequence of characters. This forms the basis of the model's training. The model receives a sequence of characters as input and is trained to predict the subsequent character at each time step.

As recurrent neural networks (RNNs) maintain an internal state dependent on previously observed elements, they can infer the next character based on all characters seen up to that point.
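
The role of the internal state can be illustrated with a deliberately tiny recurrence. This is a toy scalar Elman-style update with arbitrary fixed weights, not the GRU used later in this notebook, and the names W_X, W_H, rnn_step, and run are assumptions made for the sketch. It only shows that the same final input leads to a different state when the history differs.

```python
import math

# Arbitrary fixed weights for illustration; a trained network would learn these.
W_X, W_H = 0.5, 0.9


def rnn_step(x, h):
    """One step of a scalar recurrence: the new state mixes the input and the old state."""
    return math.tanh(W_X * x + W_H * h)


def run(sequence):
    """Feed the inputs one at a time, carrying the state forward."""
    h = 0.0
    for x in sequence:
        h = rnn_step(x, h)
    return h


# Both sequences end with the same input (1.0) but have different histories.
print(run([1.0, 1.0, 1.0]))
print(run([-1.0, -1.0, 1.0]))
```

The two printed values differ, so anything computed from the state (such as next-character logits) can depend on everything seen so far, not just on the latest character.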

Creating Training Examples and Targets:
The next step involves dividing the text into example sequences. Each input sequence will comprise seq_length characters from the text. Correspondingly, the targets for each input sequence will contain text of the same length, but shifted by one character to the right.

To accomplish this, the text is segmented into chunks of size seq_length+1. For instance, if seq_length is 4 and the text is "Hello", the input sequence would be "Hell", and the target sequence would be "ello".
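
The chunk-and-shift idea can be sketched in plain Python on ordinary strings. The helper name make_examples is hypothetical. It drops a trailing partial chunk, which is the behavior the notebook gets from drop_remainder=True.

```python
def make_examples(text, seq_length):
    """Cut text into chunks of seq_length + 1 and split each into (input, target)."""
    chunk = seq_length + 1
    examples = []
    # Stop before a final partial chunk so every example has the full length.
    for start in range(0, len(text) - chunk + 1, chunk):
        piece = text[start:start + chunk]
        examples.append((piece[:-1], piece[1:]))
    return examples


print(make_examples("Hello", seq_length=4))
# [('Hell', 'ello')]
print(make_examples("Hello world", seq_length=4))
# [('Hell', 'ello'), (' wor', 'worl')]
```

In the second call the final "d" does not fill a whole chunk, so it is discarded.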

Initially, employ the tf.data.Dataset.from_tensor_slices function to convert the text vector into a stream of character indices.

In [15]:
# Convert the entire text into character IDs
all_ids = ids_from_chars(tf.strings.unicode_split(text, "UTF-8"))
all_ids

<tf.Tensor: shape=(1115394,), dtype=int64, numpy=array([19, 48, 57, ..., 46,  9,  1], dtype=int64)>

In [17]:
# Create a dataset from the character IDs
ids_dataset = tf.data.Dataset.from_tensor_slices(all_ids)

# Display the first 10 characters from the dataset
for ids in ids_dataset.take(10):
    print(chars_from_ids(ids).numpy().decode("utf-8"))

F
i
r
s
t
 
C
i
t
i


In [18]:
# Define the sequence length and calculate the number of examples per epoch
seq_length = 100
examples_per_epoch = len(text) // (seq_length + 1)

The batch method simplifies the process of converting these individual characters into sequences of the specified size.

In [19]:
# Create sequences of the desired length using the batch method
sequences = ids_dataset.batch(seq_length + 1, drop_remainder=True)

# Display the characters from the first sequence
for seq in sequences.take(1):
    print(chars_from_ids(seq))

tf.Tensor(
[b'F' b'i' b'r' b's' b't' b' ' b'C' b'i' b't' b'i' b'z' b'e' b'n' b':'
 b'\n' b'B' b'e' b'f' b'o' b'r' b'e' b' ' b'w' b'e' b' ' b'p' b'r' b'o'
 b'c' b'e' b'e' b'd' b' ' b'a' b'n' b'y' b' ' b'f' b'u' b'r' b't' b'h'
 b'e' b'r' b',' b' ' b'h' b'e' b'a' b'r' b' ' b'm' b'e' b' ' b's' b'p'
 b'e' b'a' b'k' b'.' b'\n' b'\n' b'A' b'l' b'l' b':' b'\n' b'S' b'p' b'e'
 b'a' b'k' b',' b' ' b's' b'p' b'e' b'a' b'k' b'.' b'\n' b'\n' b'F' b'i'
 b'r' b's' b't' b' ' b'C' b'i' b't' b'i' b'z' b'e' b'n' b':' b'\n' b'Y'
 b'o' b'u' b' '], shape=(101,), dtype=string)


It becomes clearer to understand what this process entails when you concatenate the tokens back into strings.

In [20]:
# Display the text reconstructed from the first 5 sequences
for seq in sequences.take(5):
    print(text_from_ids(seq).numpy())

b'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '
b'are all resolved rather to die than to famish?\n\nAll:\nResolved. resolved.\n\nFirst Citizen:\nFirst, you k'
b"now Caius Marcius is chief enemy to the people.\n\nAll:\nWe know't, we know't.\n\nFirst Citizen:\nLet us ki"
b"ll him, and we'll have corn at our own price.\nIs't a verdict?\n\nAll:\nNo more talking on't; let it be d"
b'one: away, away!\n\nSecond Citizen:\nOne word, good citizens.\n\nFirst Citizen:\nWe are accounted poor citi'


During training, we need a dataset consisting of pairs of (input, label), where both input and label represent sequences. At each time step, the input corresponds to the current character, while the label corresponds to the subsequent character.

Below is a function that accepts a sequence as input, duplicates it, and shifts it to ensure alignment between the input and label for each time step:

In [22]:
def split_input_target(sequence):
    input_text = sequence[:-1]
    target_text = sequence[1:]
    return input_text, target_text

# Example usage of the function
split_input_target(list("Tensorflow"))

(['T', 'e', 'n', 's', 'o', 'r', 'f', 'l', 'o'],
 ['e', 'n', 's', 'o', 'r', 'f', 'l', 'o', 'w'])

In [23]:
# Create a dataset of (input, label) pairs using the split_input_target function
dataset = sequences.map(split_input_target)

# Display an example input and target pair from the dataset
for input_example, target_example in dataset.take(1):
    print("Input :", text_from_ids(input_example).numpy())
    print("Target:", text_from_ids(target_example).numpy())

Input : b'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou'
Target: b'irst Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '


Prepare Training Batches
After segmenting the text into manageable sequences using tf.data, the next step is to shuffle the data and organize it into batches before feeding it into the model.

In [24]:
# Define batch size
BATCH_SIZE = 64

# Define buffer size for shuffling the dataset
# (TensorFlow's data pipeline is designed to handle potentially infinite sequences,
# so it shuffles elements within a buffer rather than shuffling the entire sequence in memory)
BUFFER_SIZE = 10000

# Shuffle, batch, and prefetch the dataset for optimal performance
dataset = (
    dataset.shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.experimental.AUTOTUNE)
)

dataset

<_PrefetchDataset element_spec=(TensorSpec(shape=(64, 100), dtype=tf.int64, name=None), TensorSpec(shape=(64, 100), dtype=tf.int64, name=None))>
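
As a quick sanity check on the pipeline, the number of training sequences and batches per epoch can be worked out by hand from the values already shown in this notebook. The variable names below are chosen for this sketch only.

```python
# Values taken from the notebook output and settings above.
text_length = 1115394   # characters in the corpus
seq_length = 100
batch_size = 64

# Each training sequence consumes seq_length + 1 characters; a partial last chunk is dropped.
num_sequences = text_length // (seq_length + 1)

# With drop_remainder=True, a final partial batch is discarded as well.
steps_per_epoch = num_sequences // batch_size

print(num_sequences)                                  # 11043
print(steps_per_epoch)                                # 172
print(num_sequences - steps_per_epoch * batch_size)   # 35 sequences left out per epoch
```

So each epoch should run for 172 steps, and BUFFER_SIZE = 10000 covers most of the 11043 sequences when shuffling.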

Constructing the Model
This section outlines the creation of the model as a subclass of keras.Model (For more details, refer to Creating New Layers and Models through subclassing).

Developing a model with the following layers:

tf.keras.layers.Embedding: Serves as the input layer, constituting a trainable lookup table that maps each character-ID to a vector with embedding_dim dimensions.
tf.keras.layers.GRU: An RNN type with size units=rnn_units. (Alternatively, you can opt for using an LSTM layer here.)
tf.keras.layers.Dense: The output layer, with vocab_size outputs. It produces one logit for each character in the vocabulary. These logits are the model's unnormalized log-probabilities for each character.

In [25]:
# Determine the vocabulary size used by the StringLookup layers
# (the 65 unique characters plus the [UNK] token)
vocab_size = len(ids_from_chars.get_vocabulary())

# Set the embedding dimension
embedding_dim = 256

# Specify the number of RNN units
rnn_units = 1024

The class below does the following:

    1. It inherits from tf.keras.Model.
    2. The constructor is employed to specify the layers of the model.
    3. The forward pass is defined using the layers outlined in the constructor.

In [26]:
class MyModel(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super().__init__()
        # Define an embedding layer
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        # Create a GRU layer
        self.gru = tf.keras.layers.GRU(
            rnn_units, return_sequences=True, return_state=True
        )
        # Connect with a dense layer
        self.dense = tf.keras.layers.Dense(vocab_size)

    def call(self, inputs, states=None, return_state=False, training=False):
        # Embedding layer
        x = self.embedding(inputs, training=training)
        
        # GRU layer
        # During training for text generation, utilize the previous state.
        # If no state is available, initialize it.
        if states is None:
            states = self.gru.get_initial_state(x)
        x, states = self.gru(x, initial_state=states, training=training)
        
        # Dense layer
        x = self.dense(x, training=training)

        if return_state:
            return x, states
        else:
            return x

# Instantiate the model
model = MyModel(
    # Ensure the vocabulary size matches the `StringLookup` layers.
    vocab_size=len(ids_from_chars.get_vocabulary()),
    embedding_dim=embedding_dim,
    rnn_units=rnn_units,
)

For every character, the model retrieves the embedding, executes one time step of the GRU with the embedding as input, and applies the dense layer to produce logits that predict the log-likelihood of the succeeding character.

Experiment with the Model
Next, execute the model to verify that it performs as intended.

Begin by examining the shape of the output:

In [27]:
# Fetch an input example batch and a target example batch from the dataset
for input_example_batch, target_example_batch in dataset.take(1):
    # Generate predictions for the example batch
    example_batch_predictions = model(input_example_batch)
    
    # Display the shape of the predictions
    print(
        example_batch_predictions.shape,
        "# (batch_size, sequence_length, vocab_size)",
    )

(64, 100, 66) # (batch_size, sequence_length, vocab_size)


In the preceding example, although the input sequence length is set to 100, the model is capable of processing inputs of variable lengths.

In [28]:
# Display a summary of the model's architecture
model.summary()

Model: "my_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       multiple                  16896     
                                                                 
 gru (GRU)                   multiple                  3938304   
                                                                 
 dense (Dense)               multiple                  67650     
                                                                 
Total params: 4022850 (15.35 MB)
Trainable params: 4022850 (15.35 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
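
The parameter counts in this summary can be reproduced by hand, which is a useful check that the layers are wired as intended. The sketch below assumes the GRU uses the reset_after=True variant (the TensorFlow 2 default), which carries two bias vectors per weight set.

```python
vocab_size = 66       # 65 characters plus the [UNK] token
embedding_dim = 256
rnn_units = 1024

# Embedding: one embedding_dim-sized vector per vocabulary entry.
embedding_params = vocab_size * embedding_dim

# GRU: three weight sets (update gate, reset gate, candidate state), each with an
# input kernel, a recurrent kernel, and two bias vectors (assumes reset_after=True).
gru_params = 3 * (rnn_units * embedding_dim + rnn_units * rnn_units + 2 * rnn_units)

# Dense: a weight matrix from rnn_units to vocab_size plus one bias per output.
dense_params = rnn_units * vocab_size + vocab_size

total = embedding_params + gru_params + dense_params
print(embedding_params, gru_params, dense_params, total)
# 16896 3938304 67650 4022850
```

The GRU accounts for almost all of the roughly 4 million parameters, so rnn_units is the setting that most affects model size.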


To obtain real predictions from the model, we must sample from the output distribution to acquire actual character indices. This distribution is determined by the logits across the character vocabulary.

Note: Sampling from this distribution is crucial, as relying solely on the argmax of the distribution can potentially cause the model to become trapped in a loop.
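
The difference between sampling and taking the argmax can be seen in a small pure-Python sketch. The logits here are made up, and sample_index is a hypothetical stand-in for what tf.random.categorical does for a single row of logits.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, rng):
    """Draw one index with probability proportional to softmax(logits)."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.5, 0.1]   # hypothetical scores for three characters
rng = random.Random(0)     # seeded so the run is repeatable

greedy = [max(range(len(logits)), key=logits.__getitem__) for _ in range(10)]
sampled = [sample_index(logits, rng) for _ in range(10)]

print(greedy)    # the same index every time
print(sampled)   # indices drawn in proportion to their probability
```

Because the greedy choice is deterministic, a model that returns to the same state will keep emitting the same characters, whereas sampling lets it leave the loop.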

Testing this approach with the first example in the batch:

In [30]:
# Sample character indices from the output distribution
sampled_indices = tf.random.categorical(
    example_batch_predictions[0], num_samples=1
)

# Squeeze the sampled indices to remove unnecessary dimensions and convert to numpy array
sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()
sampled_indices

array([24, 55, 28, 29, 61, 11, 55, 26, 27, 47, 49, 25, 59, 43, 59, 51, 12,
       27,  9, 62, 11,  3,  6, 41, 31, 30, 44, 48, 22, 44, 63, 32, 22, 60,
       23, 17, 44, 12, 33,  7, 20, 21, 55, 23, 14, 60, 23,  3, 45, 57, 41,
       42, 60,  0, 55, 51, 20, 13, 64, 51, 27, 56,  0, 31, 16, 24, 55,  3,
       12, 12, 21, 19, 21, 50, 45, 54, 31, 28, 17, 37, 60,  2, 59,  7, 58,
       37, 50, 65, 27, 64, 12, 43, 42, 56, 52, 34, 53, 26, 40, 21],
      dtype=int64)

This provides us with a prediction of the next character index at each timestep.

Decoding these indices to observe the text predicted by this untrained model:

In [33]:
# Display the input text
print("Input:\n", text_from_ids(input_example_batch[0]).numpy())
print()

# Display the predicted next characters
print("Next Char Predictions:\n", text_from_ids(sampled_indices).numpy())

Input:
 b' justice of your dealing?\n\nProvost:\nBut what likelihood is in that?\n\nDUKE VINCENTIO:\nNot a resemblan'

Next Char Predictions:
 b"KpOPv:pMNhjLtdtl;N.w:!'bRQeiIexSIuJDe;T,GHpJAuJ!frbcu[UNK]plG?ylNq[UNK]RCKp!;;HFHkfoRODXu t,sXkzNy;dcqmUnMaH"


Proceed with training the model.
At this juncture, the problem can be approached as a conventional classification task. Given the previous RNN state and the input for this time step, predict the class of the subsequent character.

Include an optimizer and a loss function.
The standard tf.keras.losses.sparse_categorical_crossentropy loss function is suitable for this scenario as it operates across the last dimension of the predictions.

Since the model produces logits, it's necessary to set the from_logits flag.

In [34]:
# Define the loss function
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Calculate the mean loss for the example batch
example_batch_mean_loss = loss(target_example_batch, example_batch_predictions)

# Display prediction shape and mean loss
print(
    "Prediction shape: ",
    example_batch_predictions.shape,
    " # (batch_size, sequence_length, vocab_size)",
)
print("Mean loss:        ", example_batch_mean_loss)

Prediction shape:  (64, 100, 66)  # (batch_size, sequence_length, vocab_size)
Mean loss:         tf.Tensor(4.1887097, shape=(), dtype=float32)


A freshly initialized model shouldn't be overly confident, so its output logits should all have similar magnitudes. To verify this, check that the exponential of the mean loss is roughly equal to the vocabulary size. If it is much larger than that, the model is confident in its incorrect predictions, which indicates poor initialization.

In [37]:
# Calculate the exponential of the mean loss and convert to numpy array
tf.exp(example_batch_mean_loss).numpy()

65.93766
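
The reason the exponential of the loss should land near the vocabulary size follows from the cross-entropy of a uniform guess. A short worked calculation, assuming the 66-entry vocabulary shown in the prediction shape above:

```python
import math

vocab_size = 66  # 65 characters plus [UNK], matching the prediction shape above

# A model that assigns equal probability to every character...
uniform_prob = 1.0 / vocab_size

# ...pays the same cross-entropy for any target: -log(1 / vocab_size) = log(vocab_size).
expected_loss = -math.log(uniform_prob)

print(expected_loss)             # about 4.19
print(math.exp(expected_loss))   # recovers the vocabulary size, up to rounding
```

The measured mean loss of 4.1887 and its exponential of 65.94 are both very close to these values, so the untrained model is behaving like a near-uniform guesser.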

Set up the training process by utilizing the tf.keras.Model.compile method. Employ tf.keras.optimizers.Adam with default parameters along with the specified loss function.

In [38]:
# Configure the model for training using the Adam optimizer and the specified loss function
model.compile(optimizer="adam", loss=loss)

Set up checkpoints
Utilize tf.keras.callbacks.ModelCheckpoint to guarantee that checkpoints are saved throughout the training process:

In [39]:
# Specify the directory for saving checkpoints
checkpoint_dir = "./training_checkpoints"
# Define the prefix for checkpoint filenames
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

# Create a callback to save model weights only
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix, save_weights_only=True
)

Proceed with the training process.
To maintain reasonable training duration, conduct training over 10 epochs.

In [40]:
# Define the number of epochs
EPOCHS = 10

# Train the model over the specified number of epochs, utilizing the checkpoint callback
history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


Text Generation
One straightforward method to generate text using this model is to iterate through it in a loop while monitoring the model's internal state during execution.

For each model call, provide some text along with its internal state. The model will then predict the next character and provide its updated state. To continue generating text, feed the prediction and state back into the model.

The following code snippet demonstrates making a single-step prediction:

In [41]:
class OneStep(tf.keras.Model):
    def __init__(self, model, chars_from_ids, ids_from_chars, temperature=1.0):
        super().__init__()
        # Set the temperature parameter for sampling
        self.temperature = temperature
        # Store the model, functions for character-to-ID and ID-to-character conversion
        self.model = model
        self.chars_from_ids = chars_from_ids
        self.ids_from_chars = ids_from_chars

        # Create a mask to prevent "[UNK]" from being generated.
        skip_ids = self.ids_from_chars(["[UNK]"])[:, None]
        sparse_mask = tf.SparseTensor(
            # Set -inf at each invalid index.
            values=[-float("inf")] * len(skip_ids),
            indices=skip_ids,
            # Match the shape to the vocabulary
            dense_shape=[len(ids_from_chars.get_vocabulary())],
        )
        self.prediction_mask = tf.sparse.to_dense(sparse_mask)

    @tf.function
    def generate_one_step(self, inputs, states=None):
        # Convert strings to token IDs.
        input_chars = tf.strings.unicode_split(inputs, "UTF-8")
        input_ids = self.ids_from_chars(input_chars).to_tensor()

        # Run the model.
        # predicted_logits.shape: [batch, char, next_char_logits]
        predicted_logits, states = self.model(
            inputs=input_ids, states=states, return_state=True
        )
        # Consider only the last prediction.
        predicted_logits = predicted_logits[:, -1, :]
        predicted_logits = predicted_logits / self.temperature
        # Apply the prediction mask to prevent "[UNK]" generation.
        predicted_logits = predicted_logits + self.prediction_mask

        # Sample token IDs from the output logits.
        predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
        predicted_ids = tf.squeeze(predicted_ids, axis=-1)

        # Convert token IDs to characters
        predicted_chars = self.chars_from_ids(predicted_ids)

        # Return the characters and model state.
        return predicted_chars, states

# Instantiate the OneStep model
one_step_model = OneStep(model, chars_from_ids, ids_from_chars)
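
The prediction mask works because adding negative infinity to a logit drives that entry's probability to exactly zero after the softmax. A minimal pure-Python sketch with made-up logits, where index 0 plays the role of [UNK]:

```python
import math


def softmax(logits):
    """Numerically stable softmax; entries equal to -inf get probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [0.3, 1.2, -0.5, 0.8]           # hypothetical logits; index 0 stands for [UNK]
mask = [-float("inf"), 0.0, 0.0, 0.0]    # -inf at the forbidden index, 0 elsewhere

masked = [logit + m for logit, m in zip(logits, mask)]
probs = softmax(masked)
print(probs)
```

The first probability is 0.0 and the remaining three still sum to 1, so sampling can never return the masked index.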

Execute the model within a loop to produce some text. Upon examining the generated text, we can notice that the model demonstrates an understanding of when to capitalize, create paragraphs, and mimic a vocabulary reminiscent of Shakespeare's writing style. However, due to the limited number of training epochs, it has not yet acquired the ability to construct coherent sentences.

In [43]:
start = time.time()
states = None
next_char = tf.constant(["ROMEO:"])
result = [next_char]

# Generate text in a loop
for n in range(1000):
    next_char, states = one_step_model.generate_one_step(
        next_char, states=states
    )
    result.append(next_char)

# Join the generated text
result = tf.strings.join(result)
end = time.time()

# Print the generated text
print(result[0].numpy().decode("utf-8"), "\n\n" + "_" * 80)
print("\nRun time:", end - start)

ROMEO:
The town of, or dow, the Volsces' haspy true
Reisolder: but, one firely on this them.

PETRUCHIO:
Within your accuselers are all this: and you go?

ROMEO:
We'll to be found, give me, and I hade, for their
intercaited:
Fare you well mouth by me?' touch my prettience
to him to make it the assisting on their curse to any.

YORK:
Trayon with his great surses, brave Gruicent:
If I protest, was brave men's, and myself and be me
But little like an old than thy thing achieved, so

JUFIE:
Where is young cick, word of her king?

ESTALUS:
Ay, widow'll gold it, if you be revenged?
If they do the time leave onserve:
Add, pray, so would Marcius borne so heart's
That when I wishey persuade,--
Since being none but determined on thee.

DUKE VINCENTIO:
Of these he has me near out of mine arch you of it!

MARIANA:
'Tis receive's will take to make hast thus but no hand.

KING RICHARD III:
Ready than thou hast wound 'em; shall this one according,
That banish'd his friends, shall she is buried
Bingui

To enhance the results, consider the following options:

    1. Extend the training duration by increasing the number of epochs (e.g., try EPOCHS = 30).
    2. Experiment with different start strings.
    3. Enhance the model's accuracy by adding an additional RNN layer.
    4. Adjust the temperature parameter to control the level of randomness in predictions.
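
To see what the temperature option does, the sketch below applies the same scaling used in generate_one_step (logits divided by the temperature) to a made-up set of logits. The function name softmax_with_temperature is an assumption for this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical logits for three characters

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Temperatures below 1 concentrate probability on the top-scoring character, giving more predictable text, while temperatures above 1 flatten the distribution and give more surprising text.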

For faster text generation, batch the text generation process. The following example demonstrates generating five outputs in approximately the same time it took to generate one output above.

In [44]:
# Record the start time
start = time.time()

# Initialize model states
states = None

# Define the starting string
next_char = tf.constant(["ROMEO:", "ROMEO:", "ROMEO:", "ROMEO:", "ROMEO:"])
result = [next_char]

# Generate text in a loop
for n in range(1000):
    next_char, states = one_step_model.generate_one_step(
        next_char, states=states
    )
    result.append(next_char)

# Join the generated text
result = tf.strings.join(result)

# Record the end time
end = time.time()

# Print the generated text
print(result, "\n\n" + "_" * 80)
print("\nRun time:", end - start)

tf.Tensor(
[b"ROMEO:\nNay, I do forgot the fault o' the infinity\nProud I proce'd to Polizenes.\n\nHENRY SE:\nThen let the sour of Clarence' nessore'd him,\nSome with the holy manners of his mind:\nOn an hour confessors are for't,\nNo worer tham we dather with riage: fellow, sir;\nFor do I their betites now your own fieth,\nBut kill the worl: he long in our rebories,\nBut like the wasting on the very sense to come from you,\nthat let it was your nock; what, if it consul?\n\nBRAKENBURY:\nAre you say 'twere at the wind; then, sound God, he'll thy fight\nAnd hidune and the Earl of your redain,\nBegins an Engress valiant credit to rust him;\nFor once make hands frewly he't be so doubt it, shall I do bid\nthin. Worthy save my poor man?\nWe have heard of love and secret words\nFor can carried to pluck with the villain:\nWhich'el carnot blume my plaint\nIn her whole wratich, who bad's made thee?\n\nKING RICHARD III:\nSo long his heart; 'twomanity and his up, a\ncounsel master men.\n\nSecond S

Export the text generator
This one-step model can be effortlessly saved and restored, enabling its use wherever a tf.saved_model is required.

In [45]:
# Save the one-step model
tf.saved_model.save(one_step_model, "one_step")

# Load the saved model
one_step_reloaded = tf.saved_model.load("one_step")

# Initialize model states
states = None

# Define the starting string
next_char = tf.constant(["ROMEO:"])
result = [next_char]

# Generate text in a loop
for n in range(100):
    next_char, states = one_step_reloaded.generate_one_step(
        next_char, states=states
    )
    result.append(next_char)

# Print the generated text
print(tf.strings.join(result)[0].numpy().decode("utf-8"))

INFO:tensorflow:Assets written to: one_step\assets



ROMEO:
But I drink not requite stretchrels disormeded
By a piscourse in the duke with his fingerous shigh;
