## Problem statement -

**Generate "fake English" text from an RNN**


### Task

Given a character, or a sequence of characters, what is the most probable next character?

In [0]:
from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf

import numpy as np
import os
import time

#### Read data

Dataset : Shakespeare's work, 
It is in text format

In [0]:
text = open('shakespeare_input.txt', 'rb').read().decode(encoding='utf-8')

In [3]:
#we can find the length of the text using len function
len(text)

4573338

In [4]:
text[:100]  #first 100 characters

'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou'

In [5]:
#we can find the vocabulary size by counting the unique characters in text
vocab=sorted(set(text))
len(vocab)

67

### Process data
As we can feed text directly to our model so need to map our text data to some vector, so that we can have integer representation of each character

In [0]:
# Creating a mapping from unique characters to indices
char2idx = {u:i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

text_as_int = np.array([char2idx[c] for c in text])

In [7]:
print(repr(text[:15]))
print(' ---------------')
print(text_as_int[:15])

'First Citizen:\n'
 ---------------
[18 49 58 59 60  1 15 49 60 49 66 45 54 10  0]


#### Training

Now we need to divide our text into a sequences, Each input sequence will contain seq_length characters from the text.
For each input sequence, the corresponding targets contain the same length of text, except shifted one character to the right.

So break the text into chunks of seq_length+1. For example, say seq_length is 4 and our text is "Hello". The input sequence would be "Hell", and the target sequence "ello".

In [8]:
# The tf.data.Dataset.from_tensor_slices function is used to convert the text vector into a stream of character indices.

# The maximum length sentence we want for a single input in characters
seq_length = 100
examples_per_epoch = len(text)//seq_length

# Create training examples / targets
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

for i in char_dataset.take(5):
  print(i.numpy(),'-',idx2char[i.numpy()])

18 - F
49 - i
58 - r
59 - s
60 - t


In [10]:
# now we will convert this individual characters into sequences using batch method 
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)

for item in sequences.take(5):
  print(item)
  print('-'*20)
  print(repr(''.join(idx2char[item.numpy()])))
  print('-'*20)
  break

tf.Tensor(
[18 49 58 59 60  1 15 49 60 49 66 45 54 10  0 14 45 46 55 58 45  1 63 45
  1 56 58 55 43 45 45 44  1 41 54 65  1 46 61 58 60 48 45 58  6  1 48 45
 41 58  1 53 45  1 59 56 45 41 51  8  0  0 13 52 52 10  0 31 56 45 41 51
  6  1 59 56 45 41 51  8  0  0 18 49 58 59 60  1 15 49 60 49 66 45 54 10
  0 37 55 61  1], shape=(101,), dtype=int64)
--------------------
'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '
--------------------


In [0]:
# as we want our input and target as : The input sequence would be "Hell", and the target sequence "ello"
# we can use map function to apply a simple function to each batch

def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text

dataset = sequences.map(split_input_target)

In [12]:
for input_x, target_x in  dataset.take(1):
  print ('Input data: ', repr(''.join(idx2char[input_x.numpy()])))
  print ('Target data:', repr(''.join(idx2char[target_x.numpy()])))

Input data:  'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou'
Target data: 'irst Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '


Each index of these vectors are processed as one time step. For the input at time step 0, the model receives the index for "F" and trys to predict the index for "i" as the next character. At the next timestep, it does the same thing but the RNN considers the previous step context in addition to the current input character.

In [13]:
for i, (input_idx, target_idx) in enumerate(zip(input_x[:5], target_x[:5])):
    print("Step {:4d}".format(i))
    print("  input: {} ({:s})".format(input_idx, repr(idx2char[input_idx])))
    print("  expected output: {} ({:s})".format(target_idx, repr(idx2char[target_idx])))

Step    0
  input: 18 ('F')
  expected output: 49 ('i')
Step    1
  input: 49 ('i')
  expected output: 58 ('r')
Step    2
  input: 58 ('r')
  expected output: 59 ('s')
Step    3
  input: 59 ('s')
  expected output: 60 ('t')
Step    4
  input: 60 ('t')
  expected output: 1 (' ')


In [14]:
# Batch size
BATCH_SIZE = 64

# Buffer size to shuffle the dataset
# (TF data is designed to work with possibly infinite sequences,
# so it doesn't attempt to shuffle the entire sequence in memory. Instead,
# it maintains a buffer in which it shuffles elements).
BUFFER_SIZE = 10000

dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)

dataset

<BatchDataset shapes: ((64, 100), (64, 100)), types: (tf.int64, tf.int64)>

#### Model

I have used `tf.keras.Sequential` to define the model. 

We can add as many layers  as  we need in our  Sequential model

* `tf.keras.layers.Embedding`: The input layer. A trainable lookup table that will map the numbers of each character to a vector with `embedding_dim` dimensions;

* `tf.keras.layers.GRU`: A type of RNN with size `units=rnn_units` 

* `tf.keras.layers.LSTM`: A type of RNN with memory units of size `units=rnn_units`

* `tf.keras.layers.Dense`: The output layer, with `vocab_size` outputs.

In [0]:
# Length of the vocabulary in chars
vocab_size = len(vocab)

# The embedding dimension
embedding_dim = 256

# Number of RNN units
rnn_units = 1024

Thus can try different layers together,

#### Model 1
- Embedding -> Lstm -> Dense

In [0]:
def build_model1(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.LSTM(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

In [0]:
model1 = build_model1(
  vocab_size = len(vocab),
  embedding_dim=embedding_dim,
  rnn_units=rnn_units,
  batch_size=BATCH_SIZE)

For each character the model looks up the embedding, runs the LSTM one timestep with the embedding as input, and applies the dense layer to generate logits predicting the log-liklihood of the next character:

In [20]:
for input_example_batch, target_example_batch in dataset.take(1):
  example_batch_predictions = model1(input_example_batch)
  print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

(64, 100, 67) # (batch_size, sequence_length, vocab_size)


In [23]:
model1.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (64, None, 256)           17152     
_________________________________________________________________
lstm (LSTM)                  (64, None, 1024)          5246976   
_________________________________________________________________
dense (Dense)                (64, None, 67)            68675     
Total params: 5,332,803
Trainable params: 5,332,803
Non-trainable params: 0
_________________________________________________________________


In [0]:
sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
sampled_indices = tf.squeeze(sampled_indices,axis=-1).numpy()

In [25]:
sampled_indices

array([62, 33, 36, 17,  9, 26, 28, 54, 63, 53, 57, 13, 24, 45, 27, 29, 44,
       55, 54, 49, 50, 52, 28, 27, 23, 62, 39, 13, 12, 10, 18, 36, 49, 62,
       12, 26, 41, 22, 33, 48, 13, 42,  3,  4,  2, 50, 42, 34, 13, 46, 27,
       28, 50, 59, 53, 59, 63, 52, 28, 42, 65, 48,  4, 40, 62, 66, 17, 66,
        9, 30, 25, 21, 31,  0, 23, 12, 65, 56, 40, 37, 28, 39, 43, 26, 32,
       24, 44, 49, 46, 12, 47, 64, 56, 13, 58,  3, 55, 27, 19, 56])

In [26]:
# we will decode this sampled_indices into characters
print("Input: \n", repr("".join(idx2char[input_example_batch[0]])))
print()
print("Next Char Predictions: \n", repr("".join(idx2char[sampled_indices ])))

Input: 
 "e stab him as he sleeps?\n\nFirst Murderer:\nNo; then he will say 'twas done cowardly, when he wakes.\n\n"

Next Char Predictions: 
 'vUXE3NPnwmqALeOQdonijlPOKv[A?:FXiv?NaJUhAb$&!jbVAfOPjsmswlPbyh&]vzEz3RMIS\nK?yp]YP[cNTLdif?gxpAr$oOGp'


Now we will train our model1

We have to add optimizer and loss function to our model1

In [0]:
def loss(labels, logits):
  return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

In [58]:
example_batch_loss  = loss(target_example_batch, example_batch_predictions)
print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
print("scalar_loss:      ", example_batch_loss.numpy().mean())

Prediction shape:  (64, 100, 67)  # (batch_size, sequence_length, vocab_size)
scalar_loss:       4.2048573


Configure the training procedure using the tf.keras.Model.compile method. We'll use tf.keras.optimizers.Adam with default arguments and the loss function.

In [0]:
model1.compile(optimizer='adam', loss=loss)

We need to also save our model at checkpoints 

for this we will used a tf.keras.callbacks.ModelCheckpoint to ensure that checkpoints are saved during training:

In [0]:
# Directory where the checkpoints will be saved
checkpoint_dir = 'training_checkpoints_lstm_dense'
# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

tensorflow_checkpoint=tf.keras.callbacks.TensorBoard(log_dir='logs',write_graph=True)

Now we will run our model1

In [0]:
EPOCHS=10

In [33]:
history = model1.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


#### Now we will use our model1 for generating text

#### Restore



Because of the way the RNN state is passed from timestep to timestep, the model only accepts a fixed batch size once built.

To run the model with a different batch_size, we need to rebuild the model and restore the weights from the checkpoint.


In [34]:
tf.train.latest_checkpoint(checkpoint_dir)

'training_checkpoints/ckpt_10'

In [0]:
model = build_model1(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))

In [36]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (1, None, 256)            17152     
_________________________________________________________________
lstm_1 (LSTM)                (1, None, 1024)           5246976   
_________________________________________________________________
dense_1 (Dense)              (1, None, 67)             68675     
Total params: 5,332,803
Trainable params: 5,332,803
Non-trainable params: 0
_________________________________________________________________


### The prediction loop

The following code block generates the text:

* It Starts by choosing a start string, initializing the RNN state and setting the number of characters to generate.

* Get the prediction distribution of the next character using the start string and the RNN state.

* Then, use a categorical distribution to calculate the index of the predicted character. Use this predicted character as our next input to the model.

* The RNN state returned by the model is fed back into the model so that it now has more context, instead than only one word. After predicting the next word, the modified RNN states are again fed back into the model, which is how it learns as it gets more context from the previously predicted words.


To generate text the model's output is fed back to the input

Looking at the generated text, you'll see the model knows when to capitalize, make paragraphs and imitates a Shakespeare-like writing vocabulary. With the small number of training epochs, it has not yet learned to form coherent sentences.

In [0]:
def generate_text(model, start_string):
  # Evaluation step (generating text using the learned model)

  # Number of characters to generate
  num_generate = 1000

  # Converting our start string to numbers (vectorizing)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # Low temperatures results in more predictable text.
  # Higher temperatures results in more surprising text.
  # Experiment to find the best setting.
  
  tempera=[0.1,0.5,1.0,1.5,2]
  
  for temp in tempera:
    # Empty string to store our results
    text_generated = []
    # Here batch size == 1
    model.reset_states()
        
    for i in range(num_generate):
        predictions = model(input_eval)
        # remove the batch dimension
        predictions = tf.squeeze(predictions, 0)

        # using a categorical distribution to predict the word returned by the model
        predictions = predictions / temp
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

        # We pass the predicted word as the next input to the model
        # along with the previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0)

        text_generated.append(idx2char[predicted_id])
    
    ## saving the generated text to txt file
    f = open("generatedtext_"+str(temp)+".txt", "a")
    f.writelines(text_generated)
    f.close()
    ##
    
    print('--'*30)
    print('With temperature:',temp)
    print('--'*30)
    print(start_string + ''.join(text_generated))
  #return (start_string + ''.join(text_generated))

In [83]:
#lstm
generate_text(model2load, start_string=u"ESCALUS: ")

------------------------------------------------------------
With temperature: 0.1
------------------------------------------------------------
ESCALUS: I will desire thee again.

ARVIRAGUS:
You are the course of my dear lordship.

LAUNCELOT:
I will confess the mean time to the parties: I will find
The court of England.

LAUNCELOT:
I will confess the money that I have not seen the mother of my love.

POSTHUMUS LEONATUS:
And I will speak a word of the moon.

Second Lord:
I thank you, good my lord.

CLOTEN:
The count hath been a solemn beard to stand at the devil.

SIR TOBY BELCH:
O, pardon me, my lord.

PRINCESS:
What is the matter? what should I stay the devil in the world?

PISANIO:
And I will follow thee a word of man that speaks of me
And the contrary of a soldier here a good deed.

CLIFFORD:
And I will speak a word of my dear lord.

LAUNCELOT:
I will desire thee to the ground, and the contrary care
I must be sounded in the world: I have not seen
The secrets of my father's love.

SU

#### Using GRU Layer

In [0]:
def build_model2(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.GRU(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

In [0]:
model2 = build_model2(
  vocab_size = len(vocab),
  embedding_dim=embedding_dim,
  rnn_units=rnn_units,
  batch_size=BATCH_SIZE)

In [57]:
model2.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_2 (Embedding)      (64, None, 256)           17152     
_________________________________________________________________
gru (GRU)                    (64, None, 1024)          3938304   
_________________________________________________________________
dense_2 (Dense)              (64, None, 67)            68675     
Total params: 4,024,131
Trainable params: 4,024,131
Non-trainable params: 0
_________________________________________________________________


In [0]:
model2.compile(optimizer='adam', loss=loss)

In [62]:
history2 = model2.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [0]:
model2load = build_model2(vocab_size, embedding_dim, rnn_units, batch_size=1)
model2load.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model2load.build(tf.TensorShape([1, None]))

In [67]:
model2load.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_3 (Embedding)      (1, None, 256)            17152     
_________________________________________________________________
gru_1 (GRU)                  (1, None, 1024)           3938304   
_________________________________________________________________
dense_3 (Dense)              (1, None, 67)             68675     
Total params: 4,024,131
Trainable params: 4,024,131
Non-trainable params: 0
_________________________________________________________________


In [72]:
#GRU
generate_text(model2load, start_string=u"ESCALUS: ")

------------------------------------------------------------
With temperature: 0.1
------------------------------------------------------------
ESCALUS: I will desire thee with the
strength of a strange and a man as he were as good as the sea,
Which should be so able to be but the devil in the world,
That hath a part of a fair conference that it doth begin
to give them me against the beards of love
And call the streets of England shall be sounded.

SIR ANDREW:
An you do not know my dear friends to the state of mine;
And I will stay the course of my dear lordship.

LAUNCELOT:
I will consider thee in the world:
I am a soldier to a soldier that shall be the world.

PISANIO:
I am not so well that the matter of my lord,
The cardinal, in the world is sound,
And then he shall be so able to be bless'd with their shores.
The cardinal, if thou didst not see the deed.

SUFFOLK:
A plague upon him, and I will not say
'The devil and my father's leaves and charge of fear.

LONGAVILLE:
You may that be

#### Model3 
**embedding -> lstm -> dense(1024) -> dense(67)**

In [0]:
def build_model3(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.LSTM(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(1024),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

In [0]:
model3 = build_model3(
  vocab_size = len(vocab),
  embedding_dim=embedding_dim,
  rnn_units=rnn_units,
  batch_size=BATCH_SIZE)

In [78]:
model3.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_5 (Embedding)      (64, None, 256)           17152     
_________________________________________________________________
lstm_2 (LSTM)                (64, None, 1024)          5246976   
_________________________________________________________________
dense_5 (Dense)              (64, None, 1024)          1049600   
_________________________________________________________________
dense_6 (Dense)              (64, None, 67)            68675     
Total params: 6,382,403
Trainable params: 6,382,403
Non-trainable params: 0
_________________________________________________________________


In [79]:
model3.compile(optimizer='adam', loss=loss)
history3 = model3.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback],save_scores=True)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [0]:
model3load = build_model3(vocab_size, embedding_dim, rnn_units, batch_size=1)
model3load.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model3load.build(tf.TensorShape([1, None]))

In [82]:
model3load.summary()

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_6 (Embedding)      (1, None, 256)            17152     
_________________________________________________________________
lstm_3 (LSTM)                (1, None, 1024)           5246976   
_________________________________________________________________
dense_7 (Dense)              (1, None, 1024)           1049600   
_________________________________________________________________
dense_8 (Dense)              (1, None, 67)             68675     
Total params: 6,382,403
Trainable params: 6,382,403
Non-trainable params: 0
_________________________________________________________________


In [97]:
generate_text(model3load, start_string=u"ESCALUS: ")

------------------------------------------------------------
With temperature: 0.1
------------------------------------------------------------
ESCALUS: have you come to do?

ANTONIO:
O, the devil that I will set down the streets
Of this distilled day and the devil's
damnations, as the sun should be as the sounds
are out of a commonweal.

SIR ANDREW:
Ay, an't please your lordship.

POSTHUMUS LEONATUS:
You are too shameful to my lord.

CLOTEN:
The sweet war from the prince,
And there the sun should be as little as the world,
That he shall stand at supper-time to do,
That they have left their bloody stones by thee,
That he shall see the gentleman of good and last,
And then she cannot bloody strokes of nature,
That were the gentleman of nation, whom
I would not be so constant with the beard,
And then she shall be suffer'd. Here is the story.

POSTHUMUS LEONATUS:
The spirit of me not still: if you will find it out.

LUCIANA:
What is the matter?

POINS:
Marry, sir, will you be contented: th

#### Model4
**embedding  ->  gru  ->  gru -> dense(1024) -> dense(67)**

In [0]:
def build_model4(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.GRU(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.GRU(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(1024),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

In [0]:
model4 = build_model4(
  vocab_size = len(vocab),
  embedding_dim=embedding_dim,
  rnn_units=rnn_units,
  batch_size=BATCH_SIZE)

In [98]:
model4.summary()

Model: "sequential_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_8 (Embedding)      (64, None, 256)           17152     
_________________________________________________________________
gru_4 (GRU)                  (64, None, 1024)          3938304   
_________________________________________________________________
gru_5 (GRU)                  (64, None, 1024)          6297600   
_________________________________________________________________
dense_11 (Dense)             (64, None, 1024)          1049600   
_________________________________________________________________
dense_12 (Dense)             (64, None, 67)            68675     
Total params: 11,371,331
Trainable params: 11,371,331
Non-trainable params: 0
_________________________________________________________________


In [99]:
model4.compile(optimizer='adam', loss=loss)
history4 = model4.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [0]:
model4load = build_model4(vocab_size, embedding_dim, rnn_units, batch_size=1)
model4load.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model4load.build(tf.TensorShape([1, None]))

In [101]:
model4load.summary()

Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_9 (Embedding)      (1, None, 256)            17152     
_________________________________________________________________
gru_6 (GRU)                  (1, None, 1024)           3938304   
_________________________________________________________________
gru_7 (GRU)                  (1, None, 1024)           6297600   
_________________________________________________________________
dense_13 (Dense)             (1, None, 1024)           1049600   
_________________________________________________________________
dense_14 (Dense)             (1, None, 67)             68675     
Total params: 11,371,331
Trainable params: 11,371,331
Non-trainable params: 0
_________________________________________________________________


In [102]:
generate_text(model4load, start_string=u"ESCALUS: ")

------------------------------------------------------------
With temperature: 0.1
------------------------------------------------------------
ESCALUS: I will not find it to a base men.

SIR ANDREW:
Ay, ay: I will see you in the conscience of my father's charge.

CLOTEN:
Thou art a very fight for my father; and I will not find
my father's wife with that be thy part.

SIR ANDREW:
Ay, and the best blood will stand still content the good gods.

SIR TOBY BELCH:
O, a peace! the devil that she did pluck a jot of man,
And cast the words of sounds and fortunes
You would not have the best appear with that
Which his confirmation that my father was a gentleman.

LUCIANA:
What is the matter with the sin to the court?

SIR ANDREW:
Ay, my good lord.

KING JOHN:
A good soul for him.

ARVIRAGUS:
You are the first that ever I will have them all the world
That I might stay for me to foreign correction.

QUEEN MARGARET:
What is thy name?

MISTRESS QUICKLY:
Why, my lord?

ALCIBIADES:
Ay, and the prince w

**model 5**

**embedding -> lstm -> lstm -> dense(512) -> dense(67)**

In [0]:
def build_model4(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.LSTM(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.LSTM(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(512),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

In [0]:
model5 = build_model4(
  vocab_size = len(vocab),
  embedding_dim=embedding_dim,
  rnn_units=rnn_units,
  batch_size=BATCH_SIZE)

In [19]:
model5.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (64, None, 256)           17152     
_________________________________________________________________
lstm (LSTM)                  (64, None, 1024)          5246976   
_________________________________________________________________
lstm_1 (LSTM)                (64, None, 1024)          8392704   
_________________________________________________________________
dense (Dense)                (64, None, 512)           524800    
_________________________________________________________________
dense_1 (Dense)              (64, None, 67)            34371     
Total params: 14,216,003
Trainable params: 14,216,003
Non-trainable params: 0
_________________________________________________________________


In [23]:
# with 20 epochs
model5.compile(optimizer='rmsprop', loss=loss)
history5 = model5.fit(dataset, epochs=20, callbacks=[checkpoint_callback,tensorflow_checkpoint])

Epoch 1/20
  1/707 [..............................] - ETA: 1:17:47 - loss: 4.2063

W0701 06:02:42.504052 140026019379072 callbacks.py:241] Method (on_train_batch_end) is slow compared to the batch update (0.256256). Check your callbacks.


Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [25]:
tf.train.latest_checkpoint(checkpoint_dir)

'training_checkpoints_lstm_dense/ckpt_20'

Load the model weights and again build model for batch size 1

In [0]:
model5load = build_model4(vocab_size, embedding_dim, rnn_units, batch_size=1)
model5load.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model5load.build(tf.TensorShape([1, None]))

In [28]:
model5load.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (1, None, 256)            17152     
_________________________________________________________________
lstm_2 (LSTM)                (1, None, 1024)           5246976   
_________________________________________________________________
lstm_3 (LSTM)                (1, None, 1024)           8392704   
_________________________________________________________________
dense_2 (Dense)              (1, None, 512)            524800    
_________________________________________________________________
dense_3 (Dense)              (1, None, 67)             34371     
Total params: 14,216,003
Trainable params: 14,216,003
Non-trainable params: 0
_________________________________________________________________


In [32]:
generate_text(model5load, start_string=u"ESCALUS: ")

------------------------------------------------------------
With temperature: 0.1
------------------------------------------------------------
ESCALUS: be not afraid, good master; he swears; and to
be so able to bear it, sigh;
I mean, the man is honest, with a supposed
babbed: then, that would have made me sick,
That in a dream,
And that my son is left and honour, all in bastards,
To think men are the things to call upon you?

Second Lord:
Lord Timon's mad.

Second Clove passing well.

CONSTANCE:
What say you to set on their drinking?

Second Lord:
I will not think it.

CONSTANCE:
What say you to set on their drinking?

Second Lord:
I will not think a fat man to my husband he is dead,
And that supposed by the times of the wars
Should nothing pray as proud, our pleasure,
My manleeprows made in the contrary.

KING HENRY VI:
What say these young ones?

SIR TOBY BELCH:
What say you to set on their charity,
And to the crown import?

Second Lord:

CLOTEN:
I have heard the courtier; and then

### Conclusion

- Got a mixture of meaningful and unmeaningful sequences(outputs).
- With low temperature value we get better results
- Single layer of LSTM perform comparable as two layers of GRU thus lstm is much better choice for text generation
- With different architecture we can generate much meaningful text
- With increase in epochs we can also increase our model performance

#### Download the logs from colab for tensorboard

In [35]:
!zip -r /content/logs_.zip /content/logs

  adding: content/logs/ (stored 0%)
  adding: content/logs/train/ (stored 0%)
  adding: content/logs/train/events.out.tfevents.1561960955.8bb5a3af85c3.124.1397.v2 (deflated 82%)
  adding: content/logs/train/events.out.tfevents.1561960962.8bb5a3af85c3.profile-empty (deflated 5%)
  adding: content/logs/train/plugins/ (stored 0%)
  adding: content/logs/train/plugins/profile/ (stored 0%)
  adding: content/logs/train/plugins/profile/2019-07-01_06-02-42/ (stored 0%)
  adding: content/logs/train/plugins/profile/2019-07-01_06-02-42/local.trace (deflated 93%)
