Q1: Implementing an RNN for Text Generation

Task: Recurrent Neural Networks (RNNs) can generate sequences of text. You will train an LSTM-based RNN to predict the next character in a given text dataset.

1.	Load a text dataset (e.g., "Shakespeare Sonnets", "The Little Prince").

2.	Convert text into a sequence of characters (one-hot encoding or embeddings).

3.	Define an RNN model using LSTM layers to predict the next character.

4.	Train the model and generate new text by sampling characters one at a time.

5.	Explain the role of temperature scaling in text generation and its effect on randomness.

Hint: Use tensorflow.keras.layers.LSTM() for sequence modeling.


In [None]:
import requests
import numpy as np
import tensorflow as tf


Data Preprocessing

In [None]:

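url = 'https://www.gutenberg.org/files/100/100-0.txt'
response = requests.get(url)
response.raise_for_status()  # stop early if the download failed
text = response.text

# Extract just the Sonnets section
start = text.find("THE SONNETS")
end = text.find("THE END", start)
if start == -1 or end == -1:
    raise ValueError("Could not find the Sonnets section markers in the text")
sonnets = text[start:end].strip()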

In [None]:

# Create mapping from chars to integers
vocab = sorted(set(sonnets))
char2idx = {u: i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

# Encode the full text as integer indices
text_as_int = np.array([char2idx[c] for c in sonnets])
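
As a quick sanity check on the mapping above, the following sketch encodes a short string and decodes it back. It uses a made-up sample string and plain Python lists instead of the Sonnets text and NumPy arrays, so the names `sample`, `vocab_demo`, `char2idx_demo`, and `idx2char_demo` are illustrative only.

```python
# Toy stand-in for the Sonnets text (assumption: any string behaves the same way)
sample = "shall i compare thee"

# Same construction as above: sorted unique characters define the vocabulary
vocab_demo = sorted(set(sample))
char2idx_demo = {ch: i for i, ch in enumerate(vocab_demo)}
idx2char_demo = vocab_demo  # a list indexed by integer id

# Encode to integer ids, then decode back to characters
encoded = [char2idx_demo[ch] for ch in sample]
decoded = ''.join(idx2char_demo[i] for i in encoded)

print(encoded[:5])
print(decoded == sample)  # the round trip should recover the original string
```

If the round trip fails, the two mappings are out of sync and generated ids would decode to the wrong characters.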

Creating Training Examples

In [15]:
seq_length = 100  # input characters per training example
examples_per_epoch = len(sonnets) // (seq_length + 1)

char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

# Create sequences of length seq_length + 1
sequences = char_dataset.batch(seq_length + 1, drop_remainder=True)

def split_input_target(chunk):
    input_text = chunk[:-1]   # First 100 chars
    target_text = chunk[1:]   # Next 100 chars (shifted by one)
    return input_text, target_text

dataset = sequences.map(split_input_target)
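
To make the one-character shift concrete, here is a minimal sketch in plain Python that applies the same chunking and splitting to a toy string. It uses list slicing rather than tf.data, and `toy_text`, `toy_seq_length`, `chunk_text`, and `split_input_target_demo` are hypothetical names chosen for this illustration.

```python
def chunk_text(text, chunk_len):
    """Split text into consecutive chunks of chunk_len, dropping any remainder."""
    n_chunks = len(text) // chunk_len
    return [text[i * chunk_len:(i + 1) * chunk_len] for i in range(n_chunks)]

def split_input_target_demo(chunk):
    # Input is everything but the last character; target is everything but the first
    return chunk[:-1], chunk[1:]

toy_text = "First Citizen"
toy_seq_length = 4  # the notebook uses 100

pairs = [split_input_target_demo(c)
         for c in chunk_text(toy_text, toy_seq_length + 1)]

for inp, tgt in pairs:
    print(repr(inp), '->', repr(tgt))
```

At each position the target is the character that follows the corresponding input character, which is what the model is trained to predict.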

In [16]:
BATCH_SIZE = 64
BUFFER_SIZE = 10000

dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)

In [None]:
# define RNN model with LSTM layers
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    # create an input layer with fixed batch size
    inputs = tf.keras.Input(batch_shape=(batch_size, None), dtype=tf.int32)
    x = tf.keras.layers.Embedding(vocab_size, embedding_dim)(inputs)
    x = tf.keras.layers.LSTM(
        rnn_units, return_sequences=True, stateful=True,
        recurrent_initializer='glorot_uniform')(x)
    outputs = tf.keras.layers.Dense(vocab_size)(x)
    return tf.keras.Model(inputs, outputs)

embedding_dim = 256
rnn_units = 1024
model = build_model(len(vocab), embedding_dim, rnn_units, batch_size=BATCH_SIZE)
model.compile(optimizer='adam', loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True))

In [None]:
# fit the model
model.fit(dataset, epochs=10)

Text Generation

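Copy the trained model's weights into an inference model built with batch size 1. The training model is stateful with a fixed batch size of 64, so a separate copy is needed to generate a single sequence one character at a time.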

In [None]:
# Inference model: same architecture as the training model, but with batch size 1
def build_inference_model(vocab_size, embedding_dim, rnn_units):
    inputs = tf.keras.Input(batch_shape=(1, None), dtype=tf.int32)
    x = tf.keras.layers.Embedding(vocab_size, embedding_dim)(inputs)
    x = tf.keras.layers.LSTM(
        rnn_units,
        return_sequences=True,
        stateful=True,
        recurrent_initializer='glorot_uniform')(x)
    outputs = tf.keras.layers.Dense(vocab_size)(x)
    return tf.keras.Model(inputs, outputs)

inference_model = build_inference_model(len(vocab), embedding_dim, rnn_units)

# Set weights from the trained model
inference_model.set_weights(model.get_weights())

In [27]:
# define how text generation is performed
def reset_model_states(model):
    for layer in model.layers:
        if hasattr(layer, 'reset_states'):
            layer.reset_states()

def generate_text(model, start_string, num_generate=500, temperature=1.0):
    # Vectorize input
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)

    text_generated = []
    reset_model_states(model)

    for _ in range(num_generate):
        predictions = model(input_eval)
        predictions = tf.squeeze(predictions, 0)

        # Adjust with temperature
        predictions = predictions / temperature

        # Sample from the distribution
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()

        # Use predicted character as next input
        input_eval = tf.expand_dims([predicted_id], 0)
        text_generated.append(idx2char[predicted_id])

    return start_string + ''.join(text_generated)
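
The sampling step inside the loop can be illustrated without TensorFlow. The sketch below draws indices from made-up logits using Python's `random.choices`, which accepts unnormalized weights. It mimics the idea of sampling from temperature-scaled logits but is not the implementation behind `tf.random.categorical`; the function name `sample_index` and the logit values are assumptions for this example.

```python
import math
import random

def sample_index(logits, temperature, rng):
    """Draw one index with probability proportional to exp(logit / temperature)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # shift by the max so exp() cannot overflow
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

rng = random.Random(0)      # fixed seed so the run is repeatable
logits = [1.0, 2.0, 5.0]    # hypothetical scores for three characters

# Very low temperature: almost all probability mass sits on the largest logit
cold = [sample_index(logits, 0.01, rng) for _ in range(1000)]
# High temperature: the smaller logits are sampled regularly as well
hot = [sample_index(logits, 5.0, rng) for _ in range(1000)]

print("distinct indices at T=0.01:", sorted(set(cold)))
print("distinct indices at T=5.0: ", sorted(set(hot)))
```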

In [30]:
print(generate_text(inference_model, start_string="Shall "))

Shall this,
Slome love wayd Ingulim applouse, when thould wim, to truet,
Andaine baiter my my candew and hese grom ance,


S             10

I and in my grest ‘ond murthreds:
Aon thown with feme, wint my lozeast, bast, wild witt bling, con ghat I kelle
Dy thrs tost thit to dought croemr’s rpovjing ale wand,
Oot de reausth apt mavels walld now to muse.


                     17

Sine yothing oftile, where thate bads reaq me.


                17


Whand thate bus outing doth lite,


5.	Explain the role of temperature scaling in text generation and its effect on randomness.

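Temperature scaling divides the model's logits by a constant T before the next character is sampled from them (in generate_text, just before tf.random.categorical). With T = 1 the model samples from its learned distribution unchanged. With T < 1 the distribution sharpens, so the most likely characters dominate and the output becomes more predictable and repetitive. With T > 1 the distribution flattens toward uniform, so unlikely characters are picked more often, which gives more varied ("creative") text but also more misspellings and nonsense.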
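
The effect of temperature can be checked numerically. The sketch below applies a temperature-scaled softmax to a small, made-up logit vector and reports the resulting probabilities and their entropy. The logit values and the helper names `softmax_with_temperature` and `entropy` are assumptions for illustration and are not part of the notebook above.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; higher means a more random choice."""
    return -sum(p * math.log(p) for p in probs if p > 0)

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three characters

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    rounded = [round(p, 3) for p in probs]
    print(f"T={t}: probs={rounded}, entropy={entropy(probs):.3f}")
```

The entropy should grow as T increases, which is a compact way to say that the choice of next character becomes more random.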