# **Text generation with an RNN**
This tutorial demonstrates how to generate text using a character-based RNN. We will work with a Shakespeare dataset. Given a sequence of characters from this data ("Shakespear"), we train a model to predict the next character in the sequence ("e"). Longer sequences of text can be generated by calling the model repeatedly.

### Importing Libraries

In [1]:
import tensorflow as tf

import numpy as np
import os
import time




### Downloading the Shakespeare dataset and Exploring it

In [2]:
data_url = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')
dataset_text = open(data_url, 'rb').read().decode(encoding='utf-8')
print(dataset_text[:250])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



In [3]:
len(dataset_text)

1115394

In [4]:
# Obtain the unique characters in the dataset and print how many there are
vocab = sorted(set(dataset_text))
print('{} unique characters'.format(len(vocab)))

65 unique characters


In [5]:
vocab

['\n',
 ' ',
 '!',
 '$',
 '&',
 "'",
 ',',
 '-',
 '.',
 '3',
 ':',
 ';',
 '?',
 'A',
 'B',
 'C',
 'D',
 'E',
 'F',
 'G',
 'H',
 'I',
 'J',
 'K',
 'L',
 'M',
 'N',
 'O',
 'P',
 'Q',
 'R',
 'S',
 'T',
 'U',
 'V',
 'W',
 'X',
 'Y',
 'Z',
 'a',
 'b',
 'c',
 'd',
 'e',
 'f',
 'g',
 'h',
 'i',
 'j',
 'k',
 'l',
 'm',
 'n',
 'o',
 'p',
 'q',
 'r',
 's',
 't',
 'u',
 'v',
 'w',
 'x',
 'y',
 'z']

### Processing the data

In [6]:
# Creating a mapping from unique characters to indices
char2idx = {char: index for index, char in enumerate(vocab)}
char2idx

In [7]:
idx2char = {index:char for index, char in enumerate(vocab)}
idx2char

{0: '\n',
 1: ' ',
 2: '!',
 3: '$',
 4: '&',
 5: "'",
 6: ',',
 7: '-',
 8: '.',
 9: '3',
 10: ':',
 11: ';',
 12: '?',
 13: 'A',
 14: 'B',
 15: 'C',
 16: 'D',
 17: 'E',
 18: 'F',
 19: 'G',
 20: 'H',
 21: 'I',
 22: 'J',
 23: 'K',
 24: 'L',
 25: 'M',
 26: 'N',
 27: 'O',
 28: 'P',
 29: 'Q',
 30: 'R',
 31: 'S',
 32: 'T',
 33: 'U',
 34: 'V',
 35: 'W',
 36: 'X',
 37: 'Y',
 38: 'Z',
 39: 'a',
 40: 'b',
 41: 'c',
 42: 'd',
 43: 'e',
 44: 'f',
 45: 'g',
 46: 'h',
 47: 'i',
 48: 'j',
 49: 'k',
 50: 'l',
 51: 'm',
 52: 'n',
 53: 'o',
 54: 'p',
 55: 'q',
 56: 'r',
 57: 's',
 58: 't',
 59: 'u',
 60: 'v',
 61: 'w',
 62: 'x',
 63: 'y',
 64: 'z'}
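
The two dictionaries are inverses of each other, so encoding a string with `char2idx` and decoding it with `idx2char` should give back the original text. Below is a minimal sketch of that round trip; the names `sample_text`, `sample_vocab`, `sample_char2idx` and `sample_idx2char` are made up for illustration, and a short toy string stands in for the Shakespeare data.

```python
# Toy corpus standing in for the Shakespeare text (an assumption for illustration).
sample_text = "hello world"

# Same construction as in the cells above: sorted unique characters, then two lookups.
sample_vocab = sorted(set(sample_text))
sample_char2idx = {char: index for index, char in enumerate(sample_vocab)}
sample_idx2char = {index: char for index, char in enumerate(sample_vocab)}

# Encode every character to its index, then decode the indices back to text.
encoded = [sample_char2idx[char] for char in sample_text]
decoded = ''.join(sample_idx2char[index] for index in encoded)

print(sample_vocab)
print(encoded)
print(decoded == sample_text)
```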

In [8]:
# Convert the dataset from 'characters' to 'integers'
text_as_int = [char2idx[char] for char in dataset_text]
text_as_int[:250]

[18,
 47,
 56,
 57,
 58,
 1,
 15,
 47,
 58,
 47,
 64,
 43,
 52,
 10,
 0,
 14,
 43,
 44,
 53,
 56,
 43,
 1,
 61,
 43,
 1,
 54,
 56,
 53,
 41,
 43,
 43,
 42,
 1,
 39,
 52,
 63,
 1,
 44,
 59,
 56,
 58,
 46,
 43,
 56,
 6,
 1,
 46,
 43,
 39,
 56,
 1,
 51,
 43,
 1,
 57,
 54,
 43,
 39,
 49,
 8,
 0,
 0,
 13,
 50,
 50,
 10,
 0,
 31,
 54,
 43,
 39,
 49,
 6,
 1,
 57,
 54,
 43,
 39,
 49,
 8,
 0,
 0,
 18,
 47,
 56,
 57,
 58,
 1,
 15,
 47,
 58,
 47,
 64,
 43,
 52,
 10,
 0,
 37,
 53,
 59,
 1,
 39,
 56,
 43,
 1,
 39,
 50,
 50,
 1,
 56,
 43,
 57,
 53,
 50,
 60,
 43,
 42,
 1,
 56,
 39,
 58,
 46,
 43,
 56,
 1,
 58,
 53,
 1,
 42,
 47,
 43,
 1,
 58,
 46,
 39,
 52,
 1,
 58,
 53,
 1,
 44,
 39,
 51,
 47,
 57,
 46,
 12,
 0,
 0,
 13,
 50,
 50,
 10,
 0,
 30,
 43,
 57,
 53,
 50,
 60,
 43,
 42,
 8,
 1,
 56,
 43,
 57,
 53,
 50,
 60,
 43,
 42,
 8,
 0,
 0,
 18,
 47,
 56,
 57,
 58,
 1,
 15,
 47,
 58,
 47,
 64,
 43,
 52,
 10,
 0,
 18,
 47,
 56,
 57,
 58,
 6,
 1,
 63,
 53,
 59,
 1,
 49,
 52,
 53,
 61,
 1,
 15,
 39,
 47,

In [11]:
len(text_as_int)

1115394

In [12]:
# Convert the list of character indices into a stream of indices with tf.data.Dataset.from_tensor_slices
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

In [None]:
# visualizing some chars from char_dataset
for i in char_dataset.take(250):
  print(idx2char[i.numpy()])

F
i
r
s
t
 
C
i
t
i
z
e
n
:


B
e
f
o
r
e
 
w
e
 
p
r
o
c
e
e
d
 
a
n
y
 
f
u
r
t
h
e
r
,
 
h
e
a
r
 
m
e
 
s
p
e
a
k
.




A
l
l
:


S
p
e
a
k
,
 
s
p
e
a
k
.




F
i
r
s
t
 
C
i
t
i
z
e
n
:


Y
o
u
 
a
r
e
 
a
l
l
 
r
e
s
o
l
v
e
d
 
r
a
t
h
e
r
 
t
o
 
d
i
e
 
t
h
a
n
 
t
o
 
f
a
m
i
s
h
?




A
l
l
:


R
e
s
o
l
v
e
d
.
 
r
e
s
o
l
v
e
d
.




F
i
r
s
t
 
C
i
t
i
z
e
n
:


F
i
r
s
t
,
 
y
o
u
 
k
n
o
w
 
C
a
i
u
s
 
M
a
r
c
i
u
s
 
i
s
 
c
h
i
e
f
 
e
n
e
m
y
 
t
o
 
t
h
e
 
p
e
o
p
l
e
.




In [None]:
# function to convert ids to text
def idx2text(ids):
  return ''.join([idx2char[i] for i in ids])

In [None]:
# Dividing the text into example sequences. Each chunk holds seq_length+1 characters,
# so that the input and the target derived from it each contain seq_length characters.
seq_length = 100
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)
for item in sequences.take(5):
  print(idx2text(item.numpy()))

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You 
are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you k
now Caius Marcius is chief enemy to the people.

All:
We know't, we know't.

First Citizen:
Let us ki
ll him, and we'll have corn at our own price.
Is't a verdict?

All:
No more talking on't; let it be d
one: away, away!

Second Citizen:
One word, good citizens.

First Citizen:
We are accounted poor citi


In [None]:
# For each sequence, duplicate and shift it to form the input and target text, using the `map` method to apply a simple function to each chunk:
def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text

dataset = sequences.map(split_input_target)

In [None]:
for input_example, target_example in dataset.take(1):
  print('Input data:\n', idx2text(input_example.numpy()))
  print("---------------------------------------------------------------------")
  print('Target data:\n', idx2text(target_example.numpy()))

Input data:
 First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You
---------------------------------------------------------------------
Target data:
 irst Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You 
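
To see the shift more clearly, the same split can be applied to a short plain string instead of a tensor of indices. This is only an illustrative sketch: the function is repeated here so the block runs on its own, and `example_input`, `example_target` and `pairs` are names introduced for this example.

```python
def split_input_target(chunk):
    # The input drops the last element; the target drops the first one.
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text

example_input, example_target = split_input_target("Hello")
print(example_input)
print(example_target)

# At every position, the target is the character that follows the input character.
pairs = list(zip(example_input, example_target))
print(pairs)
```

Each pair is one training signal: given the input character (and everything before it), the model should predict the paired target character.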


In [None]:
# Shuffling the dataset and packing it into batches
BATCH_SIZE = 64
BUFFER_SIZE = 10000

dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
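
Because both `batch` calls use `drop_remainder=True`, some characters and sequences are discarded. The arithmetic below is a rough sketch of how many sequences and training steps per epoch we should expect, using the text length printed earlier; the variable names are chosen for this example only.

```python
text_length = 1115394  # len(dataset_text) reported earlier
seq_length = 100
batch_size = 64

# Each chunk holds seq_length + 1 characters (input plus the shifted target).
chunk_length = seq_length + 1
num_sequences = text_length // chunk_length
dropped_chars = text_length % chunk_length

# Sequences are then grouped into batches; an incomplete last batch is dropped.
steps_per_epoch = num_sequences // batch_size
dropped_sequences = num_sequences % batch_size

print(num_sequences, dropped_chars)
print(steps_per_epoch, dropped_sequences)
```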

### Building and training the model
The model is defined with tf.keras.Sequential. Three layers are used:

- tf.keras.layers.Embedding: The input layer, which maps the index of each character to a vector with embedding_dim dimensions
- tf.keras.layers.GRU: A type of RNN with rnn_units units
- tf.keras.layers.Dense: The output layer, with vocab_size outputs

In [None]:
# Length of the vocabulary in chars
vocab_size = len(vocab)

# The embedding dimension
embedding_dim = 256

# Number of RNN units
rnn_units = 1024

In [None]:
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, batch_input_shape=[batch_size, None]),
    tf.keras.layers.GRU(rnn_units, return_sequences=True, stateful=True, recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

model = build_model(vocab_size=vocab_size, embedding_dim=embedding_dim, rnn_units=rnn_units, batch_size=BATCH_SIZE)

In [None]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (64, None, 256)           16640     
                                                                 
 gru (GRU)                   (64, None, 1024)          3938304   
                                                                 
 dense (Dense)               (64, None, 65)            66625     
                                                                 
Total params: 4,021,569
Trainable params: 4,021,569
Non-trainable params: 0
_________________________________________________________________
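
The parameter counts in the summary can be checked by hand. The sketch below assumes the GRU uses the Keras default `reset_after=True`, which keeps two bias vectors per gate; under that assumption the numbers match the summary above.

```python
vocab_size = 65
embedding_dim = 256
rnn_units = 1024

# Embedding: one embedding_dim-sized vector per character in the vocabulary.
embedding_params = vocab_size * embedding_dim

# GRU: three gates, each with an input kernel, a recurrent kernel,
# and two bias vectors (assumes reset_after=True).
gru_params_per_gate = (embedding_dim * rnn_units
                       + rnn_units * rnn_units
                       + 2 * rnn_units)
gru_params = 3 * gru_params_per_gate

# Dense: a weight matrix from rnn_units to vocab_size plus one bias per output.
dense_params = rnn_units * vocab_size + vocab_size

total_params = embedding_params + gru_params + dense_params
print(embedding_params, gru_params, dense_params, total_params)
```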


In [None]:
for input_example_batch, target_example_batch in dataset.take(1):
  example_batch_predictions = model(input_example_batch)
  print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

(64, 100, 65) # (batch_size, sequence_length, vocab_size)


To get actual predictions from the model, we need to sample from the output distribution to obtain character indices. This distribution is defined by the logits over the character vocabulary.

In [None]:
sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
sampled_indices = tf.squeeze(sampled_indices,axis=-1).numpy()
sampled_indices

array([ 4, 45, 11, 40, 15,  5,  6, 16, 46, 19,  6, 40,  1, 11, 20, 58,  7,
       18, 28, 30, 54, 20, 19, 47, 12, 21, 11, 16, 20, 15,  9, 59, 11, 37,
        4, 48, 64, 47, 60, 40,  0, 62, 13, 40, 18, 37, 35,  7, 59, 44, 47,
       34, 19, 41,  8, 59, 60, 22, 31, 55, 28, 46,  2,  0, 48, 23, 33, 53,
       40, 58, 32, 53, 33, 41, 22, 46, 34, 39, 54, 26, 38, 30, 53, 48, 40,
       28,  0, 58, 50, 53,  3, 43,  6,  4, 53, 20, 17, 19, 56,  2])
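
Sampling differs from simply taking the most likely index (argmax): every character can be drawn, with a probability given by the softmax of its logit. The following is a small pure-Python sketch of that idea, not the TensorFlow implementation; `softmax`, `sample_index` and `toy_logits` are hypothetical names, and the three logits are made-up values.

```python
import math
import random

def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_index(logits, rng):
    # Draw one index with probability proportional to softmax(logits).
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

toy_logits = [2.0, 1.0, 0.1]  # made-up logits over a 3-character vocabulary
rng = random.Random(0)        # fixed seed so the run is repeatable

samples = [sample_index(toy_logits, rng) for _ in range(1000)]
counts = [samples.count(i) for i in range(len(toy_logits))]
argmax_index = max(range(len(toy_logits)), key=toy_logits.__getitem__)

print("sample counts:", counts)
print("argmax index:", argmax_index)
```

Argmax would return the same index every time, whereas sampling also picks the less likely characters now and then, which helps keep generated text from getting stuck in a loop.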

In [None]:
# Results from an untrained model 
print("Input: \n", idx2text(input_example_batch[0].numpy()))
print()
print("Next Char Predictions: \n", idx2text(sampled_indices))

Input: 
 of whereof, there is my honour's pawn;
Engage it to the trial, if thou darest.

LORD FITZWATER:
How 

Next Char Predictions: 
 &g;bC',DhG,b ;Ht-FPRpHGi?I;DHC3u;Y&jzivb
xAbFYW-ufiVGc.uvJSqPh!
jKUobtToUcJhVapNZRojbP
tlo$e,&oHEGr!


In [None]:
# defining the loss and calculating it before training
loss = tf.losses.SparseCategoricalCrossentropy(from_logits=True)

example_batch_loss  = loss(target_example_batch, example_batch_predictions)
print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
print("scalar_loss:      ", example_batch_loss.numpy().mean())

Prediction shape:  (64, 100, 65)  # (batch_size, sequence_length, vocab_size)
scalar_loss:       4.1748095
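
As a sanity check, an untrained model should be roughly as uncertain as a uniform guess over the vocabulary, for which the cross-entropy is the natural logarithm of the vocabulary size. The short sketch below compares that value with the loss printed above; `uniform_loss` and `reported_loss` are names introduced here.

```python
import math

vocab_size = 65
reported_loss = 4.1748095  # scalar_loss printed above

# Cross-entropy of a uniform distribution over vocab_size classes.
uniform_loss = math.log(vocab_size)

print(uniform_loss)
print(abs(uniform_loss - reported_loss))
```

A much larger initial loss would suggest that the freshly initialized model is overly confident in wrong predictions.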


In [None]:
# compiling the model
model.compile(optimizer='adam', loss=loss)

In [None]:
# training the model
history = model.fit(dataset, epochs=20)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [None]:
# saving the model weights of the last epoch
model.save_weights('model_weights.h5')

### Prediction

In [None]:
# building the model again with batch_size of 1 for prediction
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights('model_weights.h5')
model.build(tf.TensorShape([1, None]))

In [None]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     (1, None, 256)            16640     
                                                                 
 gru_1 (GRU)                 (1, None, 1024)           3938304   
                                                                 
 dense_1 (Dense)             (1, None, 65)             66625     
                                                                 
Total params: 4,021,569
Trainable params: 4,021,569
Non-trainable params: 0
_________________________________________________________________


#### The prediction loop
The following code block generates the text:

- It starts by choosing a start string, initializing the RNN state, and setting the number of characters to generate.

- It gets the prediction distribution of the next character using the start string and the RNN state.

- Then, it uses a categorical distribution to sample the index of the predicted character, and uses this predicted character as the next input to the model.

- The RNN state is carried over between calls (the GRU layer is stateful), so the model has more context than a single character. After each prediction, the updated state is used for the next step, so every new character is conditioned on all the characters generated so far. No learning happens at this stage; only the state changes.

In [None]:

def generate_text(model, start_string):
  # Evaluation step (generating text using the learned model)

  # Number of characters to generate
  num_generate = 1000

  # Converting our start string to numbers (vectorizing)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # Empty list to store our results
  text_generated = []

  # Low temperatures result in more predictable text.
  # Higher temperatures result in more surprising text.
  # Experiment to find the best setting.
  temperature = 1.0

  # Here batch size == 1
  model.reset_states()
  for i in range(num_generate):
      predictions = model(input_eval)
      # remove the batch dimension
      predictions = tf.squeeze(predictions, 0)

      # using a categorical distribution to sample the character predicted by the model
      predictions = predictions / temperature
      predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

      # We pass the predicted character as the next input to the model;
      # the previous hidden state is kept inside the stateful GRU layer
      input_eval = tf.expand_dims([predicted_id], 0)

      text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))
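
Dividing the logits by `temperature` before sampling changes how peaked the resulting distribution is. The sketch below illustrates this with plain Python on made-up logits; `softmax_with_temperature` and `toy_logits` are hypothetical names, not part of the model code.

```python
import math

def softmax_with_temperature(logits, temperature):
    # Scale the logits, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.1]  # made-up logits over a 3-character vocabulary

results = {}
for temperature in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, temperature)
    results[temperature] = probs
    print(temperature, [round(p, 3) for p in probs])
```

With a temperature below 1 the most likely character dominates even more, while a temperature above 1 flattens the distribution and makes unlikely characters easier to draw.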

In [None]:
print(generate_text(model, start_string=u"ROMEO: "))

ROMEO: Grumio, go you twenty princely general: for this time,
The golden carses for grise and my bring his lady passing breath: the
seven'd with our trobberous death, is dust.
Our ships mistaking to a second cause?

KING RICHARD II:
First, thenefore castle my for getting at your face?
Prepare you, Catesby.
Thou'rt any caust. This is another,
Thou know'st, as now but one, that Henry, for this dire,
And in their gartes shall be come.'

PETRUCHIO:
A deceit brooks, and sullen show
our dustiness spoken:
In all pleased for a little breeling left by the
butt thou wast barrant forbid, come again,
And I'll swear to rsy titles of thy soul!
Councilst Richard, now methld not be brief.

BUCKINGHAM:
Why should be cold, All to pieces: boy, here's none;
Away with her, she could be executine, so ling is done.
The Volick, word if about
To old freedom, rage: let me see thee better
Thus doing the instrument delivers' heirs,
A mark thee chamber-with world!
Fear not, my ass me Swoon;'
And, those Claudio to 