<h1>RNN Play Generator</h1>

In [1]:
from keras.preprocessing import sequence
import keras
import tensorflow as tf
import os
import numpy as np

<h1>Dataset</h1>
<p>We only need one piece of training data. In fact, we could write our own poem or play and pass that to the network for training if we liked. To keep things simple, however, we'll use an extract from a Shakespeare play.</p>

In [2]:
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')

<h1>Read Contents of File</h1>

In [3]:
# Read and decode
text = open(path_to_file, 'rb').read().decode(encoding='utf-8')
# length of text is the number of characters in it
print ('Length of text: {} characters'.format(len(text)))

Length of text: 1115394 characters


In [4]:
# Take a look at the first 250 characters in text
print(text[:250])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



<h1>Encoding</h1>
<p>Since this text isn't encoded yet, we'll need to do that ourselves. We are going to encode each unique character as a different integer.</p>

In [5]:
vocab = sorted(set(text))
# Creating a mapping from unique characters to indices
char2idx = {u:i for i, u in enumerate(vocab)}        # dict from character to its index in the sorted vocab, e.g. {'\n': 0, ' ': 1, ...}
idx2char = np.array(vocab)                           # array for the reverse lookup: idx2char[i] is the character with index i

def text_to_int(text):
  return np.array([char2idx[c] for c in text])       # encode every character of the given text

text_as_int = text_to_int(text)

In [6]:
# let's look at how part of our text is encoded
print("Text:", text[:13])
print("Encoded:", text_to_int(text[:13]))

Text: First Citizen
Encoded: [18 47 56 57 58  1 15 47 58 47 64 43 52]
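
<p>As a small illustration of the same idea, the sketch below builds the mapping for a short made-up string. The names toy_text, toy_vocab and toy_char2idx are illustrative and not part of the notebook. Because the vocabulary is sorted, whitespace receives the lowest index, which is why the space in "First Citizen" is encoded as 1 above.</p>

```python
# Toy version of the encoding step, using plain Python only.
toy_text = "hello world"

# Sorted list of the unique characters; a character's position is its integer code.
toy_vocab = sorted(set(toy_text))
toy_char2idx = {ch: i for i, ch in enumerate(toy_vocab)}

# Encode a word, then decode it again by indexing back into the vocabulary.
encoded = [toy_char2idx[ch] for ch in "hello"]
decoded = "".join(toy_vocab[i] for i in encoded)

print(toy_vocab)  # [' ', 'd', 'e', 'h', 'l', 'o', 'r', 'w']
print(encoded)    # [3, 2, 4, 4, 5]
print(decoded)    # hello
```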


<p>Now we will make a function that converts our numeric values back to text.</p>

In [7]:
def int_to_text(ints):
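  try:
    ints = ints.numpy()  # convert a tf.Tensor to a NumPy array
  except AttributeError:
    pass  # already a NumPy array or a list, nothing to convert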
  return ''.join(idx2char[ints])

print(int_to_text(text_as_int[:13]))

First Citizen


<h1>Creating Training Examples</h1>
<p>The training examples we prepare will use a sequence of length seq_length as input and a sequence of length seq_length as output, where the output is the same text starting one character later. For example:

input: Hell | output: ello

Our first step will be to create a stream of characters from our text data.</p>

In [8]:
seq_length = 100  # length of sequence for a training example
examples_per_epoch = len(text)//(seq_length+1)

# Create training examples / targets
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

<p>Next we can use the batch method to turn this stream of characters into sequences of the desired length.</p>

In [9]:
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)

<p>Now we need to use these sequences of length 101 and split them into input and output.</p>

In [10]:
def split_input_target(chunk):  # for the example: hello
    input_text = chunk[:-1]  # hell
    target_text = chunk[1:]  # ello
    return input_text, target_text  # hell, ello

dataset = sequences.map(split_input_target)  # we use map to apply the above function to every entry
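
<p>The chunk-and-shift step can also be sketched without TensorFlow on a short string. This is only an illustration: toy_text and toy_seq_length are made-up values, and list slicing stands in for the batch and map calls used above.</p>

```python
# Toy illustration of chunking a character stream and splitting each chunk
# into an (input, target) pair, using plain Python strings.
toy_text = "Hello world!"
toy_seq_length = 4  # illustrative; the notebook uses 100

# Each chunk holds seq_length + 1 characters so that input and target
# both have seq_length characters. An incomplete final chunk is dropped,
# mirroring drop_remainder=True.
chunk_len = toy_seq_length + 1
chunks = [toy_text[i:i + chunk_len]
          for i in range(0, len(toy_text) - chunk_len + 1, chunk_len)]

# Input is everything but the last character, target everything but the first.
pairs = [(chunk[:-1], chunk[1:]) for chunk in chunks]

print(chunks)  # ['Hello', ' worl']
print(pairs)   # [('Hell', 'ello'), (' wor', 'worl')]
```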

In [11]:
for x, y in dataset.take(1):
  print("\n\nEXAMPLE\n")
  print("INPUT")
  print(int_to_text(x))
  print("\nOUTPUT")
  print(int_to_text(y))



EXAMPLE

INPUT
First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You

OUTPUT
irst Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You 


<h1>Setting Parameters for Training Batches</h1>

In [12]:
BATCH_SIZE = 64
VOCAB_SIZE = len(vocab)  # number of unique characters in the text
EMBEDDING_DIM = 256
RNN_UNITS = 1024
BUFFER_SIZE = 10000

data = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
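
<p>As a rough check on the sizes involved, the arithmetic below works out how many training sequences and batches these settings produce for the 1115394-character text. The variable names are illustrative. Shuffling does not change the counts, and both batch calls above drop incomplete remainders.</p>

```python
# Back-of-the-envelope count of sequences and batches per epoch.
num_chars = 1115394       # length of the text, as printed earlier
toy_seq_length = 100      # same value as seq_length in the notebook
toy_batch_size = 64       # same value as BATCH_SIZE in the notebook

# Each training sequence consumes seq_length + 1 characters.
num_sequences = num_chars // (toy_seq_length + 1)

# Sequences are grouped into batches of 64; the incomplete last batch is dropped.
steps_per_epoch = num_sequences // toy_batch_size

print(num_sequences)    # 11043
print(steps_per_epoch)  # 172
```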

<h1>Building the Model</h1>
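<p>The model uses an embedding layer, an LSTM layer and one dense layer that contains a node for each unique character in our training data. The dense layer outputs one logit (an unnormalized score) per character; applying a softmax to these logits gives a probability distribution over the vocabulary.</p>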

In [13]:
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.LSTM(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

model = build_model(VOCAB_SIZE,EMBEDDING_DIM, RNN_UNITS, BATCH_SIZE)
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (64, None, 256)           16640     
                                                                 
 lstm (LSTM)                 (64, None, 1024)          5246976   
                                                                 
 dense (Dense)               (64, None, 65)            66625     
                                                                 
Total params: 5,330,241
Trainable params: 5,330,241
Non-trainable params: 0
_________________________________________________________________
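
<p>The parameter counts in the summary can be reproduced by hand. The sketch below uses the standard formulas for an embedding table, an LSTM layer (four gates, each with input weights, recurrent weights and a bias) and a dense layer. The variable names are illustrative.</p>

```python
# Reproduce the parameter counts reported by model.summary().
vocab_size = 65
embedding_dim = 256
rnn_units = 1024

# Embedding: one vector of size embedding_dim per vocabulary entry.
embedding_params = vocab_size * embedding_dim

# LSTM: 4 gates, each with input weights, recurrent weights and a bias vector.
lstm_params = 4 * (rnn_units * (embedding_dim + rnn_units) + rnn_units)

# Dense: a weight matrix from rnn_units to vocab_size plus one bias per output.
dense_params = rnn_units * vocab_size + vocab_size

total_params = embedding_params + lstm_params + dense_params

print(embedding_params)  # 16640
print(lstm_params)       # 5246976
print(dense_params)      # 66625
print(total_params)      # 5330241
```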


<h1>Loss Function</h1>

In [14]:
for input_example_batch, target_example_batch in data.take(1):
  example_batch_predictions = model(input_example_batch)  # ask our model for a prediction on our first batch of training data (64 entries)
  print(example_batch_predictions.shape, "(batch_size, sequence_length, vocab_size)")  # print out the output shape

(64, 100, 65) (batch_size, sequence_length, vocab_size)


In [15]:
# we can see that the prediction is an array of 64 arrays, one for each entry in the batch
print(len(example_batch_predictions))
print(example_batch_predictions)

64
tf.Tensor(
[[[ 4.4652903e-03  5.6560151e-04 -5.8756075e-03 ... -3.1550247e-03
    2.5409414e-03 -3.4473345e-03]
  [ 1.6467000e-03  5.4449183e-03 -7.7020298e-03 ... -7.5246103e-04
   -1.4435524e-03 -1.3011876e-03]
  [ 4.3017180e-03  1.0730582e-03 -2.3489846e-03 ...  3.8030453e-04
   -5.0856487e-04 -1.3130194e-03]
  ...
  [ 6.9130771e-04  5.9355116e-03 -1.0107053e-02 ...  5.5837510e-03
   -8.7660886e-03 -3.7239729e-03]
  [ 6.4380481e-03  5.0694137e-03 -1.4482056e-02 ...  2.5711700e-03
   -3.9925021e-03 -5.5233878e-03]
  [ 6.3166535e-04  3.3272374e-03 -1.3914468e-02 ...  4.2586708e-03
    2.6309100e-04 -1.6786798e-04]]

 [[-2.3845951e-03  4.9049510e-03 -3.5410062e-03 ...  1.3664316e-03
   -3.3933003e-03  1.1250216e-03]
  [ 3.1489141e-03  4.0377458e-03 -8.5084438e-03 ... -1.1129791e-03
    2.3252796e-06 -2.4200277e-03]
  [ 1.6007248e-03  3.5638264e-03 -8.0570029e-03 ... -3.9770254e-03
    3.0765240e-03 -2.7433480e-03]
  ...
  [ 3.0552661e-03  9.0393657e-04 -4.8591755e-03 ...  7.5257639e

In [16]:
# let's examine one prediction
pred = example_batch_predictions[0]
print(len(pred))
print(pred)
# notice this is a 2d array of length 100, where each interior array is the prediction for the next character at each time step

100
tf.Tensor(
[[ 0.00446529  0.0005656  -0.00587561 ... -0.00315502  0.00254094
  -0.00344733]
 [ 0.0016467   0.00544492 -0.00770203 ... -0.00075246 -0.00144355
  -0.00130119]
 [ 0.00430172  0.00107306 -0.00234898 ...  0.0003803  -0.00050856
  -0.00131302]
 ...
 [ 0.00069131  0.00593551 -0.01010705 ...  0.00558375 -0.00876609
  -0.00372397]
 [ 0.00643805  0.00506941 -0.01448206 ...  0.00257117 -0.0039925
  -0.00552339]
 [ 0.00063167  0.00332724 -0.01391447 ...  0.00425867  0.00026309
  -0.00016787]], shape=(100, 65), dtype=float32)


In [17]:
# finally we'll look at a prediction at the first timestep
time_pred = pred[0]
print(len(time_pred))
print(time_pred)
# it's 65 values: one logit (unnormalized score) for each character that could occur next

65
tf.Tensor(
[ 4.4652903e-03  5.6560151e-04 -5.8756075e-03 -3.4436495e-03
  1.1693568e-03  8.4362030e-03  2.1111318e-03 -4.6981727e-03
 -5.6534534e-04 -1.5818445e-03  4.2583590e-04 -2.9432266e-03
 -7.7270420e-04 -6.2912786e-03  1.4406953e-02  5.0413981e-03
  4.4038510e-03  2.3786332e-03 -2.4063347e-03  4.0869191e-03
 -1.2744906e-03 -2.0532024e-03  1.0282603e-03 -6.2320475e-04
  2.4158605e-03 -4.3068863e-03  9.8785525e-04  2.0409461e-04
  2.0482712e-03 -2.1841149e-03 -2.3705035e-03  5.5924319e-03
  1.6063289e-03  1.4207226e-03  3.1630828e-03 -1.9738611e-04
 -1.8057879e-04 -1.2948040e-03 -3.3226958e-03 -2.7639382e-03
 -4.9469317e-04  5.2293232e-03 -2.7817287e-03 -2.5667367e-04
 -3.5711552e-04 -2.1074454e-03  2.2691051e-03  1.8123249e-03
  8.3695666e-04  3.0855567e-03  1.8933845e-03  1.8395863e-03
 -1.5628955e-04  9.5451018e-05 -4.5374362e-03  3.2223901e-03
  9.9055413e-03  1.5119186e-03  5.2650473e-03  1.3368174e-03
 -2.4043596e-03  1.7987591e-03 -3.1550247e-03  2.5409414e-03
 -3.447334

In [18]:
# If we want to determine the predicted character we need to sample the output distribution (pick a value based on probability)
sampled_indices = tf.random.categorical(pred, num_samples=1)

# Reshape that array and convert all the integers to characters to see the predicted text
sampled_indices = np.reshape(sampled_indices, (1, -1))[0]
predicted_chars = int_to_text(sampled_indices)

predicted_chars  # Model predicted for training sequence 1

"i;XialIPr.gA-a?RXQ,axNWaRqa'?YhA\nMBdVRAa$d3cYpuXalfPcsDf?tQmKy'cAbGURWGBBHeEy:;gFcL.\nvwwL$;bUfjCmKkY"

In [19]:
# Loss function 
def loss(labels, logits):
  return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
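
<p>To make the loss concrete, the sketch below computes sparse categorical cross-entropy from logits for a single time step in plain Python. It is a hand-written illustration and not the Keras implementation; the function name and the example logits are made up. With near-zero logits, as in the untrained model above, every character is about equally likely, so the loss is close to ln(65), roughly 4.17.</p>

```python
import math

def cross_entropy_from_logits(label, logits):
    """Negative log-probability of `label` under softmax(logits)."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# Untrained model: all 65 logits equal, so the distribution is uniform.
uniform_logits = [0.0] * 65
loss_untrained = cross_entropy_from_logits(3, uniform_logits)

# A model that strongly favours the correct character has a much lower loss.
confident_logits = [0.0] * 65
confident_logits[3] = 10.0
loss_confident = cross_entropy_from_logits(3, confident_logits)

print(round(loss_untrained, 3))  # 4.174
print(loss_confident < 0.01)     # True
```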

<h1>Compiling the Model</h1>

In [20]:
model.compile(optimizer='adam', loss=loss)

<h1>Creating Checkpoints to Save Weights and Biases</h1>

In [21]:
# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'
# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)
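
<p>The {epoch} placeholder in the checkpoint prefix is filled in with the epoch number when each checkpoint is written. The sketch below shows the resulting names for a 5-epoch run using ordinary string formatting; it only illustrates the naming scheme and does not create any files.</p>

```python
import os

# Same directory and prefix as in the cell above.
checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

# Fill in the placeholder for epochs 1 to 5.
checkpoint_paths = [checkpoint_prefix.format(epoch=e) for e in range(1, 6)]

for path in checkpoint_paths:
    print(path)
```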

<h1>Training</h1>
<p>Training for more epochs generally lowers the loss further; here we train for 5 epochs.</p>

In [22]:
history = model.fit(data, epochs=5, callbacks=[checkpoint_callback])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<h1>Loading the Model</h1>
<p>We trained our model with a batch size of 64.</p>
<p>To generate text we feed it a single sequence at a time, so we rebuild the model with a batch size of 1.</p>

In [23]:
model = build_model(VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, batch_size=1)

<p>Loading the weights and biases from the latest checkpoint</p>

In [24]:
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))

<h1>Generating Play</h1>

In [25]:
def generate_text(model, start_string):
  # Evaluation step (generating text using the learned model)

  # Number of characters to generate
  num_generate = 800

  # Converting our start string to numbers (vectorizing)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # Empty list to store our results
  text_generated = []

  # Low temperatures result in more predictable text.
  # Higher temperatures result in more surprising text.
  temperature = 1.0

  # Here batch size == 1
  model.reset_states()
  for i in range(num_generate):
      predictions = model(input_eval)
      # remove the batch dimension
      predictions = tf.squeeze(predictions, 0)

      # scale the logits by the temperature, then sample the next character
      # from the resulting categorical distribution
      predictions = predictions / temperature
      predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

      # We pass the predicted character as the next input to the model
      # along with the previous hidden state
      input_eval = tf.expand_dims([predicted_id], 0)

      text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))
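
<p>The effect of the temperature can be seen on a tiny example. The sketch below divides some made-up logits by different temperatures and applies a softmax in plain Python. The helper name and the logits are illustrative. Lower temperatures concentrate probability on the highest-scoring character, while higher temperatures flatten the distribution.</p>

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then normalise with a softmax."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three characters

probs_cold = softmax_with_temperature(toy_logits, 0.5)
probs_base = softmax_with_temperature(toy_logits, 1.0)
probs_hot = softmax_with_temperature(toy_logits, 2.0)

for name, probs in (("T=0.5", probs_cold), ("T=1.0", probs_base), ("T=2.0", probs_hot)):
    print(name, [round(p, 3) for p in probs])
```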

<h1>Getting Input</h1>

In [26]:
inp = input("Type a starting string: ")
print(generate_text(model, inp))

romeo says.

QUEEN ELIZABETH:
Did, if I than a sound gursed men's ant
this monusien. I spyacoward where it thear no most ruther,
Jown: he' be not tite, my gradlip graces.

Provost:

PAPIS:
I'ld spring cet thee; sfe mearth the woret love?
And take those fraghness: therefore layferlaps, see the trief and daughter's blay of a battle cause
As you ronam, stand and grace and one
To whow most grace I have pride the modion;
I merry, true answer.

RONEZ:
Nay, my rogment bean spoke's wife, therefore
forget most me put in enemne blood on the most faster:
And, was he my masch, that thou seep often him,
Grseghing and orce subjoiciof clears.
Care wau surners; poor sweet clopit down my bodom.

LADY CAPULET:
I do not stone the warlbad and thiss morrow with
the foot to love with times and pardon'd was no more,
And n
