# <b><div style='padding:15px;background-color:#850E35;color:white;border-radius:2px;font-size:110%;text-align: center'> Text Generation with TensorFlow</div></b>


In this notebook, we'll walk through how to generate text with a character-level RNN model. We'll cover the following steps:
- Importing the required libraries
- Loading the Shakespeare dataset
- Preprocessing the text data
- Defining a model architecture
- Compiling the model
- Training the model
- Generating text with the trained model

# <b><div style='padding:15px;background-color:#850E35;color:white;border-radius:2px;font-size:110%;text-align: center'>1. Data Loading</div></b>

In this section, we import the TensorFlow library and load the Shakespeare text from a local file (../Data/tinyshakespeare.txt). The text is stored in a variable called text, and we display its first 100 characters for initial exploration.

In [2]:
# Let's import the TensorFlow library:
import tensorflow as tf

In [7]:
with open("../Data/tinyshakespeare.txt") as f:
    text = f.read()
    print(text[:100])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You


In [None]:
# Let's display the first 100 characters of the text:
print(text[:100])

In [None]:
# Let's examine characters:
"".join(sorted(set(text.lower())))

In [None]:
# Let's count the distinct characters:
len("".join(sorted(set(text.lower()))))

# <b><div style='padding:15px;background-color:#850E35;color:white;border-radius:2px;font-size:110%;text-align: center'>2. Text Preprocessing</div></b>

This section focuses on preprocessing the raw text data. We create a TextVectorization layer that tokenizes the text at the character level and converts all characters to lowercase for consistency. The layer is adapted to the text data, allowing us to efficiently encode the text into numerical sequences. We also check the shape of the encoded text to understand its dimensions.

In [None]:
# Let's create a TextVectorization layer for character-level tokenization:
text_vec_layer = tf.keras.layers.TextVectorization(
    split="character",standardize="lower")

In [None]:
# Let's adapt the TextVectorization layer to the text data:
text_vec_layer.adapt([text])

In [None]:
# Let's check the shape of the encoded text:
text_vec_layer([text]).shape

In [None]:
# Let's preprocess the text:
encoded = text_vec_layer([text])[0]
encoded

The TextVectorization layer assigns 0 for padding tokens and 1 for unknown characters. Since we currently don't need these tokens, we subtract 2 from the character IDs and calculate both the count of distinct characters and the total character count.

In [None]:
# Let's subtract 2 from the character IDs and compute the number of distinct characters:
encoded -= 2
n_tokens = text_vec_layer.vocabulary_size() - 2
n_tokens

In [None]:
# Let's take a look at the total number of characters in the dataset:
dataset_size = len(encoded)
dataset_size
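The ID shift described above can be illustrated without TensorFlow. The following is a minimal plain-Python sketch, not the layer's actual implementation: the `sample` string is a hypothetical stand-in for the corpus, and the alphabetical tie-breaking between equally frequent characters is an assumption made only to keep the example deterministic.

```python
from collections import Counter

sample = "to be or not to be"  # hypothetical stand-in for the full corpus

# Rank characters by descending frequency; ties are broken alphabetically
# (an assumption for this sketch, not a documented guarantee of the layer).
counts = Counter(sample.lower())
vocab = ["", "[UNK]"] + sorted(counts, key=lambda ch: (-counts[ch], ch))

# IDs 0 and 1 are reserved for padding and unknown characters.
char_to_id = {ch: i for i, ch in enumerate(vocab)}
raw_ids = [char_to_id.get(ch, 1) for ch in sample.lower()]

# Subtracting 2 makes the real characters start at ID 0.
shifted_ids = [i - 2 for i in raw_ids]
n_tokens = len(vocab) - 2
```

After the shift, every ID lies in the range 0 to n_tokens - 1, which is why n_tokens can later serve as both the embedding input size and the output layer size.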

# <b><div style='padding:15px;background-color:#850E35;color:white;border-radius:2px;font-size:110%;text-align: center'>3. Dataset Preparation </div></b>

Here, we define a function called to_dataset that converts the encoded text sequences into a dataset suitable for training. This function segments the text into overlapping sequences of a specified length and organizes them into batches. Optionally, it shuffles the dataset to enhance randomness during training. An example usage of the to_dataset function is provided to illustrate its functionality.

In [None]:
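# Let's create a function to convert text sequences into a dataset:
def to_dataset(sequence, length, shuffle=False, seed=None, batch_size=32):
    ds = tf.data.Dataset.from_tensor_slices(sequence)
    # Slide a window of length + 1 characters over the sequence, one step at a time:
    ds = ds.window(length + 1, shift=1, drop_remainder=True)
    # Each window is a nested dataset; flatten it into a single tensor:
    ds = ds.flat_map(lambda window_ds: window_ds.batch(length + 1))
    if shuffle:
        ds = ds.shuffle(100_000, seed=seed)
    ds = ds.batch(batch_size)
    # Inputs are the first `length` characters; targets are the same characters shifted by one:
    return ds.map(lambda window: (window[:, :-1], window[:, 1:])).prefetch(1)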

In [None]:
# Let's get an example and pass it to the function:
list(to_dataset(text_vec_layer(["I like"])[0], length=5))
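To make the windowing logic easier to follow, here is a plain-Python sketch of the same idea, without batching or shuffling. The helper name `sliding_windows` is hypothetical and is not part of the notebook or of tf.data; it works on raw characters rather than encoded IDs.

```python
def sliding_windows(sequence, length):
    """Return (inputs, targets) pairs built from windows of length + 1, shifted by one step."""
    pairs = []
    # Only full windows are kept, mirroring drop_remainder=True.
    for start in range(len(sequence) - length):
        window = sequence[start:start + length + 1]
        # Targets are the inputs shifted one position to the right.
        pairs.append((window[:-1], window[1:]))
    return pairs

pairs = sliding_windows(list("I like"), length=5)
inputs, targets = pairs[0]
```

With six characters and length=5 there is exactly one window: the inputs are "I lik" and the targets are " like".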

Let's create the training, validation and test datasets.

In [None]:
length = 100
tf.random.set_seed(42)
# The training dataset:
train_set = to_dataset(encoded[:1_000_000], length=length, shuffle=True,seed=42)
# The validation dataset:
valid_set = to_dataset(encoded[1_000_000:1_060_000], length=length)
# Test dataset:
test_set = to_dataset(encoded[1_060_000:], length=length)
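As a rough sanity check on the split sizes, the number of batches per epoch can be estimated from the slice lengths used above. This is a sketch under the assumption that every full window of length + 1 characters is kept and that the final partial batch is not dropped; `count_batches` is a hypothetical helper, not a tf.data function.

```python
import math

def count_batches(n_chars, length, batch_size=32):
    """Estimate the number of batches produced from n_chars encoded characters."""
    # A slice of n_chars characters yields n_chars - length full windows.
    n_windows = max(n_chars - length, 0)
    # The last batch may be smaller than batch_size, so round up.
    return math.ceil(n_windows / batch_size)

train_batches = count_batches(1_000_000, length=100)
valid_batches = count_batches(60_000, length=100)
```

These estimates should match the step counts Keras reports while training, provided the assumptions above hold.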

# <b><div style='padding:15px;background-color:#850E35;color:white;border-radius:2px;font-size:110%;text-align: center'>4. Model Definition and Training </div></b>

In this section, we define the architecture of a neural network model for text generation. The model consists of an Embedding layer that represents each character ID as a vector, a GRU (Gated Recurrent Unit) layer for sequence modeling, and a Dense layer with a softmax activation that outputs a probability for each possible next character at every time step. We compile the model with the sparse categorical cross-entropy loss and the Nadam optimizer. We also add a ModelCheckpoint callback that saves the model whenever the validation accuracy improves. The model is then trained on the prepared datasets with the fit method, and finally wrapped together with the preprocessing steps so that it can accept raw text.

In [None]:
# Let's define the model architecture:
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=n_tokens, output_dim=16),
    tf.keras.layers.GRU(128,return_sequences=True),
    tf.keras.layers.Dense(n_tokens,activation="softmax")
])
# Let's compile the model:
model.compile(loss="sparse_categorical_crossentropy", 
              optimizer="nadam", metrics=["accuracy"])

# Let's create a callback that keeps the model with the best validation accuracy:
# (Note: Keras 3 expects the checkpoint path to end in ".keras" or ".h5".)
model_ckpt = tf.keras.callbacks.ModelCheckpoint(
    "my_shakespeare_model", monitor="val_accuracy", save_best_only=True)

# Let's train the model:
history = model.fit(train_set, validation_data=valid_set, epochs=3,
                    callbacks=[model_ckpt])
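For intuition about the loss used above, sparse categorical cross-entropy is the average negative log-probability that the model assigns to the correct class, where targets are given as integer IDs rather than one-hot vectors. The sketch below is a plain-Python illustration with made-up probabilities, not the Keras implementation.

```python
import math

def sparse_categorical_crossentropy(targets, probas):
    """Mean negative log-probability assigned to each integer target."""
    losses = [-math.log(p[t]) for t, p in zip(targets, probas)]
    return sum(losses) / len(losses)

# Hypothetical predictions over 3 classes for 2 time steps.
probas = [[0.7, 0.2, 0.1],
          [0.1, 0.8, 0.1]]
targets = [0, 1]  # integer class IDs, like the shifted character IDs
loss = sparse_categorical_crossentropy(targets, probas)
```

The loss shrinks toward 0 as the probability assigned to the correct character approaches 1.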

In [None]:
# Let's wrap the trained model with the text preprocessing steps (vectorization and ID shift):
shakespeare_model = tf.keras.Sequential([
    text_vec_layer,
    tf.keras.layers.Lambda(lambda X: X - 2),
    model
])

# <b><div style='padding:15px;background-color:#850E35;color:white;border-radius:2px;font-size:110%;text-align: center'>5. Text Generation </div></b>

This section uses the wrapped model defined above, which combines the TextVectorization layer, the ID shift, and the trained character model. Given a prompt, it returns a probability distribution over the next character at every position; here we take the distribution at the last position and pick the most likely character.

In [None]:
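# Let's predict the most likely next character for a prompt:
y_proba = shakespeare_model.predict(["To be or not to b"])[0, -1]  # probabilities at the last time step
y_pred = tf.argmax(y_proba)  # ID of the most likely character
# Add 2 to undo the ID shift before looking up the vocabulary:
text_vec_layer.get_vocabulary()[y_pred + 2]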
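The lookup in the last line can be illustrated with plain Python. In this sketch the vocabulary and the probabilities are hypothetical values chosen for illustration; only the "+ 2" offset mirrors the notebook.

```python
# Hypothetical vocabulary: index 0 is padding, index 1 is the unknown token.
vocab = ["", "[UNK]", " ", "e", "t"]

# Hypothetical model output over the shifted IDs 0..2.
y_proba = [0.1, 0.7, 0.2]

# Greedy choice: index of the highest probability.
y_pred = max(range(len(y_proba)), key=y_proba.__getitem__)

# Shift back by 2 to index into the full vocabulary.
next_character = vocab[y_pred + 2]
```

Always picking the most likely character like this tends to produce repetitive text, which is the motivation for sampling.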

# <b><div style='padding:15px;background-color:#850E35;color:white;border-radius:2px;font-size:110%;text-align: center'>6. Text Generation Functions</div></b>

Here, we define two important functions for text generation. The next_char function predicts the next character in a sequence given a context and a temperature parameter that controls the randomness of predictions. The extend_text function extends a given text with additional characters by iteratively predicting the next character based on the context. Example usages of these functions are provided to demonstrate how to generate text with different temperatures.

In [None]:
# How to use the tf.random.categorical() method:
log_probas = tf.math.log([[0.6, 0.3, 0.1]])
tf.random.set_seed(42)
tf.random.categorical(log_probas, num_samples=10)
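For comparison, the same kind of draw can be sketched with Python's standard library. Note the difference in inputs: tf.random.categorical takes log-probabilities (logits), while random.choices takes the weights directly. This is only an analogue for intuition, and the sampled values will not match TensorFlow's for the same seed.

```python
import random

probas = [0.6, 0.3, 0.1]
rng = random.Random(42)  # local generator so the seed does not affect other code

# Draw 10 class indices, each chosen with the given probabilities.
samples = rng.choices(range(len(probas)), weights=probas, k=10)
```

Over many draws, class 0 should appear roughly 60% of the time, class 1 about 30%, and class 2 about 10%.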

In [None]:
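# Let's create a function to sample the next character for a given text:
def next_char(text, temperature=1):
    # Slicing with -1: keeps a 2D shape (1, n_tokens), which tf.random.categorical expects:
    y_proba = shakespeare_model.predict([text])[0, -1:]
    # Dividing the log-probabilities by the temperature sharpens (< 1) or flattens (> 1) the distribution:
    rescaled_logits = tf.math.log(y_proba) / temperature
    char_id = tf.random.categorical(rescaled_logits, num_samples=1)[0, 0]
    # Add 2 to undo the ID shift applied during preprocessing:
    return text_vec_layer.get_vocabulary()[char_id + 2]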
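To see what the temperature does to a distribution, the rescaling step can be reproduced in plain Python. This is a minimal sketch; `apply_temperature` is a hypothetical helper that divides the log-probabilities by the temperature and renormalizes with a softmax, which is the distribution tf.random.categorical effectively samples from.

```python
import math

def apply_temperature(probas, temperature):
    """Rescale a probability distribution by a temperature and renormalize."""
    scaled = [math.log(p) / temperature for p in probas]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

probas = [0.6, 0.3, 0.1]
unchanged = apply_temperature(probas, 1)   # temperature 1 leaves the distribution as-is
cold = apply_temperature(probas, 0.01)     # nearly all mass on the most likely class
hot = apply_temperature(probas, 2)         # flatter, more random
```

A temperature close to 0 approaches greedy decoding, while higher temperatures give unlikely characters a better chance of being sampled.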

In [None]:
# Let's create a function to extend a given text with additional characters:
def extend_text(text, n_chars=50, temperature=1):
    for _ in range(n_chars):
        text += next_char(text, temperature)
    return text

In [None]:
# Let's generate a text with a low temperature:
tf.random.set_seed(42)
print(extend_text("I like", temperature=0.01))

In [None]:
# Let's generate text with a higher temperature:
print(extend_text("I like", temperature=1))

# <b><div style='padding:15px;background-color:#850E35;color:white;border-radius:2px;font-size:110%;text-align: center'>Conclusion</div></b>

In this notebook, we covered how to build an RNN-based model with TensorFlow for text generation.

Thanks for reading. If you enjoyed this notebook, don't forget to upvote.