# All The World's A Stage

<a id = 'toc'></a>
## Table of Contents
### &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[Act I: Exposition](#exposition)
1. [Importing Libraries](#import)
2. [Loading Data](#load)
### [Act II: Rising Action](#rising_action)
3. [Encoding](#encode)
4. [Creating Examples and Labels](#examples_and_labels)
5. [Switch Around and Batch Data](#shuffle)
### [Act III: Climax](#climax)
6. [Defining the Model](#model)
7. [Training the Model](#training)
8. [The Playwright](#playwright)
9. [A New Beginning](#new_beginning)
10. [To War and Peace](#war_and_peace)
### [Act IV: Falling Action](#falling_action)
11. [A War of Words?](#war_words)

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


<a id = 'exposition'></a>
## [Part I: Exposition](#toc)

> `The fool doth think he is wise, but the wise man knows himself to be a fool.`

<a id = 'import'></a>
### [1. Importing Libraries](#toc)

> `The Sumerians invented the wheel. I'd much rather just use it`

In [2]:
import numpy as np
import os
import tensorflow as tf
from tensorflow.keras.layers.experimental import preprocessing
from tensorflow import keras
import matplotlib.pyplot as plt
from tqdm import tqdm
from IPython.display import display
import time

%matplotlib inline

<a id = 'load'></a>
### [2. Loading Data](#toc)

> `The strength of a model reflects the quality of its data`

In [None]:
path_to_file = tf.keras.utils.get_file('/content/drive/MyDrive/Assignments/Week 4,5/Data/shakespeare.txt',
                                      'https://cs.stanford.edu/people/karpathy/char-rnn/shakespeare_input.txt')

text_shakespeare = open(path_to_file, 'rb').read().decode(encoding = 'UTF-8')

In [None]:
print(len(text_shakespeare))

4573338


In [None]:
vocab_shakespeare = sorted(set(text_shakespeare))
print(f'Number of unique characters: {len(vocab_shakespeare)}')
print(f'Characters include:-\n{str(vocab_shakespeare)[1:-1]}')

Number of unique characters: 67
Characters include:-
'\n', ' ', '!', '$', '&', "'", ',', '-', '.', '3', ':', ';', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '[', ']', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'


<a id = 'rising_action'></a>
## [Act II: Rising Action](#toc)

> `SEBASTION: By your patience, no. My stars shine darkly over me. The malignancy of my fate might perhaps distemper yours. Therefore I shall crave of you your leave that I may bear my evils alone.`

<a id = 'encode'></a>
### [3. Encoding](#toc)

> `Perhaps in another universe, in another time, the junk I've produced serves a function. Its just encoded!`

In [None]:
ids_from_chars_shakespeare = preprocessing.StringLookup(vocabulary = list(vocab_shakespeare), mask_token = None)

def text_from_ids(ids, ids_from_chars):
    """
    Reverses encoding and returns strings that produce the encoding
    Arguments: 
    1) Array of ids
    2) Preprocessing Layer used to create the encoding
    Returns:
    1) String in plain english (or Shakespearean here!)
    """
    chars_from_ids = preprocessing.StringLookup(vocabulary = ids_from_chars.get_vocabulary(), invert = True, mask_token = None)
    return tf.strings.reduce_join(chars_from_ids(ids), axis = -1)

all_ids_shakespeare = ids_from_chars_shakespeare(tf.strings.unicode_split(text_shakespeare, 'UTF-8'))

ids_dataset_shakespeare = tf.data.Dataset.from_tensor_slices(all_ids_shakespeare)

A small sanity check at this stage:

In [None]:
chars_from_ids_shakespeare = preprocessing.StringLookup(vocabulary = ids_from_chars_shakespeare.get_vocabulary(),
                                            invert = True,
                                            mask_token = None)

for ids in ids_dataset_shakespeare.take(5):
    print(chars_from_ids_shakespeare(ids).numpy().decode('UTF-8'), end = '')

First

We'll define the length of sequences that we'll be processing as a single example in an epoch.

In [None]:
seq_length = 100
examples_per_epoch = len(text_shakespeare)//(seq_length + 1)

In [None]:
sequences_shakespeare = ids_dataset_shakespeare.batch(seq_length + 1, drop_remainder = True)
for seq in sequences_shakespeare.take(1):
    print(chars_from_ids_shakespeare(seq).numpy(), end = '')

[b'F' b'i' b'r' b's' b't' b' ' b'C' b'i' b't' b'i' b'z' b'e' b'n' b':'
 b'\n' b'B' b'e' b'f' b'o' b'r' b'e' b' ' b'w' b'e' b' ' b'p' b'r' b'o'
 b'c' b'e' b'e' b'd' b' ' b'a' b'n' b'y' b' ' b'f' b'u' b'r' b't' b'h'
 b'e' b'r' b',' b' ' b'h' b'e' b'a' b'r' b' ' b'm' b'e' b' ' b's' b'p'
 b'e' b'a' b'k' b'.' b'\n' b'\n' b'A' b'l' b'l' b':' b'\n' b'S' b'p' b'e'
 b'a' b'k' b',' b' ' b's' b'p' b'e' b'a' b'k' b'.' b'\n' b'\n' b'F' b'i'
 b'r' b's' b't' b' ' b'C' b'i' b't' b'i' b'z' b'e' b'n' b':' b'\n' b'Y'
 b'o' b'u' b' ']

<a id = 'examples_and_labels'></a>
### [4. Creating Examples and Labels](#toc)

> `We all enter the future like the oarsmen - backwards. The sights of the present and past captivate us while we are blind to what the future holds in store for us`

In [None]:
def split_input_target(sequence):
    """
    Splits input encoded strings into training examples and training labels
    Arguments:
    1) Array of input encoded data
    Returns:
    1) Training examples
    2) Training labels
    """
    input_text = sequence[0:-1]
    target_text = sequence[1:]
    return input_text, target_text

dataset_shakespeare = sequences_shakespeare.map(split_input_target)

In [None]:
for input_example, target_example in dataset_shakespeare.take(1):
    print("Input: ", text_from_ids(input_example, ids_from_chars_shakespeare).numpy())
    print("Target: ", text_from_ids(target_example, ids_from_chars_shakespeare).numpy())

Input:  b'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou'
Target:  b'irst Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '


<a id = 'shuffle'></a>
### [5. Switch Around and Batch Data](#toc)

> `Now you see me. Now you don't.`

In [None]:
BATCH_SIZE = 64
BUFFER_SIZE = 10000

dataset_shakespeare = dataset_shakespeare.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder = True).prefetch(tf.data.experimental.AUTOTUNE)

In [None]:
for input_example, target_example in dataset_shakespeare.take(1):
    print("Input: ", text_from_ids(input_example, ids_from_chars_shakespeare).numpy())
    print()
    print('-'*120)
    print()
    print("Target: ", text_from_ids(target_example, ids_from_chars_shakespeare).numpy())

Input:  [b"uy this treason\nEven with the dearest blood your bodies bear.\n\nKING EDWARD IV:\nThe harder match'd, t"
 b'shall not stay alone\nTill holy church incorporate two in one.\n\nBENVOLIO:\nI pray thee, good Mercutio,'
 b'that he\nShould leave the helm and like a fearful lad\nWith tearful eyes add water to the sea\nAnd give'
 b"at dim monument where Tybalt lies.\n\nLADY CAPULET:\nTalk not to me, for I'll not speak a word:\nDo as t"
 b'rise their abundance; our\nsufferance is a gain to them Let us revenge this with\nour pikes, ere we be'
 b'hem; and there\nbe many that they have loved, they know not\nwherefore: so that, if they love they kno'
 b' harm in\nhim: something too crabbed that way, friar.\n\nDUKE VINCENTIO:\nIt is too general a vice, and '
 b'd am unshapen thus?\nMy dukedom to a beggarly denier,\nI do mistake my person all this while:\nUpon my '
 b't,\nI doubt not, uncle, of our victory.\nMany a battle have I won in France,\nWhen as the enemy hath be'
 b'hould import of

<a id = 'climax'></a>
## [Act III: Climax](#toc)

> `PORTIA: How all the other passions fleet to air, As doubtful thoughts, and rash-embraced despair,And shuddering fear, and green-eyed jealousy! O love, be moderate. Allay thy ecstasy.In measure rein thy joy. Scant this excess. I feel too much thy blessing. Make it less, For fear I surfeit.`

<a id = 'model'></a>
### [6. Defining the Model](#toc)

> `Our aspirations for the future hold us from savouring the present.`

In [None]:
vocab_size = vocab_shakespeare
embedding_dim = 256
rnn_units = 1024

class vanilla_model(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super().__init__(self)
        self.embedding = keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = keras.layers.GRU(rnn_units, return_sequences = True, return_state = True)
        self.dense = keras.layers.Dense(vocab_size)
    
    def call(self, inputs, states = None, return_state = False, training = False):
        x = inputs
        x = self.embedding(x, training = training)
        if states is None:
            states = self.gru.get_initial_state(x)
        x, states = self.gru(x, initial_state = states, training = training)
        x = self.dense(x, training = training)
        if return_state:
            return x, states
        else:
            return x

In [None]:
model_shakespeare = vanilla_model(vocab_size = len(ids_from_chars_shakespeare.get_vocabulary()), 
                                  embedding_dim = embedding_dim,
                                  rnn_units = rnn_units)

In [None]:
for input_example_batch, target_example_batch in dataset_shakespeare.take(1):
    example_batch_predictions = model_shakespeare(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")
    print(input_example_batch.shape)

model_shakespeare.summary()

(64, 100, 68) # (batch_size, sequence_length, vocab_size)
(64, 100)
Model: "vanilla_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        multiple                  17408     
_________________________________________________________________
gru (GRU)                    multiple                  3938304   
_________________________________________________________________
dense (Dense)                multiple                  69700     
Total params: 4,025,412
Trainable params: 4,025,412
Non-trainable params: 0
_________________________________________________________________


<a id = 'training'></a>
### [Training the Model](#toc)

> `Practice makes Perfect. Unless the practice itself is flawed. Then it only serves to reinforce the wrongs.`

In [None]:
loss = tf.losses.SparseCategoricalCrossentropy(from_logits = True)
model_shakespeare.compile(optimizer = 'adam', loss = loss, metrics = ['accuracy'])
checkpoint_dir = '/content/drive/MyDrive/Assignments/Week 4,5/Training_Checkpoints/Shakespeare_Model'
checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt_{epoch}')
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath = checkpoint_prefix,
                                                        save_weights_only = True)
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.2, patience=1, min_lr=0.00000001)
early_stop = keras.callbacks.EarlyStopping(monitor='loss', min_delta=0, patience=3, verbose=0, mode='min', restore_best_weights=True)

EPOCHS = 20
history_shakespeare = model_shakespeare.fit(dataset_shakespeare, epochs = EPOCHS, callbacks = [reduce_lr, early_stop, checkpoint_callback])

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<a id = 'playwright'></a>
### [8. The Playwright](#toc)

> `Who is the playwright - me or Him? He moves my mind and I simply the pen`

In [None]:
class OneStep(tf.keras.Model):
    def __init__(self, model, chars_from_ids, ids_from_chars, temperature = 1.0):
        super().__init__()
        self.temperature = temperature
        self.model = model
        self.chars_from_ids = chars_from_ids
        self.ids_from_chars = ids_from_chars

        # Create a mask to prevent [UNK] from being generated
        skip_ids = self.ids_from_chars(['[UNK]'])[:, None]

        sparse_mask = tf.SparseTensor(values = [-float(np.inf)] * len(skip_ids), indices = skip_ids, dense_shape = [len(ids_from_chars.get_vocabulary())])
        self.prediction_mask = tf.sparse.to_dense(sparse_mask)

    @tf.function
    def generate_one_step(self, inputs, states = None):
        input_chars = tf.strings.unicode_split(inputs, 'UTF-8')
        input_ids = self.ids_from_chars(input_chars).to_tensor()

        # Run the model.
        # predicted_logits.shape is [batch, char, next_char_logits]
        predicted_logits, states = self.model(inputs=input_ids, states=states,
                                                  return_state=True)
        # Only use the last prediction.
        predicted_logits = predicted_logits[:, -1, :]
        predicted_logits = predicted_logits/self.temperature
        # Apply the prediction mask: prevent "[UNK]" from being generated.
        predicted_logits = predicted_logits + self.prediction_mask
        
        # Sample the output logits to generate token IDs.
        predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
        predicted_ids = tf.squeeze(predicted_ids, axis=-1)
        
        # Convert from token ids to characters
        predicted_chars = self.chars_from_ids(predicted_ids)
        
        # Return the characters and model state.
        return predicted_chars, states

<a id = 'new_beginning'></a>
### [9. A New Beginning](#toc)

> `Maybe if the stars align, maybe if our world's collide, maybe on the dark side we can be together.`

Could Romeo have had a different story?

In [None]:
one_step_model_shakespeare = OneStep(model_shakespeare, chars_from_ids_shakespeare, ids_from_chars_shakespeare)

In [None]:
start = time.time()
states = None
next_char = tf.constant(['ROMEO: '])
result = [next_char]

for n in range(1000):
    next_char, states = one_step_model_shakespeare.generate_one_step(next_char, states=states)
    result.append(next_char)

result_shakespeare = tf.strings.join(result)
end = time.time()

print(result_shakespeare[0].numpy().decode('UTF-8'), '\n\n'+'_'*80)
print('\nRun time: ', end-start)

ROMEO: Lasan to me to his going together.

YORK:
Back Dian!

TOUCHSTONE:
Of common virtue, I confess whole you may be undone,
To signify thou art. Sirrah, come you on him.

MENAS:
How long is't my drift in heaven, she is so.

ROSALINE:
You are all after hit.

All TOBYONK:
Good morrow, banish me.

KING CLAUDIUS:
As good as a!
Why should I think on't: if any clerights,
If a must deny it.

IMOGEN:
I'll hang myself to bleman.

LAERTES:
What it is indistingus?' why, you are too bill,
To fool they bid you fare.

CAIUS LUCIUS:
No, my lord.

OTHELLO:
Is't possible? O hopel's tongue!
Garding, pretty life with thee!

QUEEN MARIA:
Will you be so inclined? The fits of marrying he,
Did even leave it in an ophinestial creature.

HOLOFERNES:
Here!

GUIDERIUS:
Let's think thee for't.
Take no drunk double, and then depend to cured.
Come, through tuns some devil:--

OTHELLO:
So well, 'tis but mine enemy.

LYSANDER:
This is the times mocker on A feeble means.

NORFOLK:
Your lordship is not wounded,
I am 

<a id = 'war_and_peace'></a>
### [10. To War and Peace](#toc)

> `The two most powerful warriors are patience and time.`

Trying it out for Tolstoy's War and Peace

In [None]:
path_to_file = tf.keras.utils.get_file('/content/drive/MyDrive/Assignments/Week 4,5/Data/war_and_peace.txt',
                                      'https://cs.stanford.edu/people/karpathy/char-rnn/warpeace_input.txt')

text_tolstoy = open(path_to_file, 'rb').read().decode(encoding = 'UTF-8')

vocab_tolstoy = sorted(set(text_tolstoy))
ids_from_chars_tolstoy = preprocessing.StringLookup(vocabulary = list(vocab_tolstoy), mask_token = None)
all_ids_tolstoy = ids_from_chars_tolstoy(tf.strings.unicode_split(text_tolstoy, 'UTF-8'))

ids_dataset_tolstoy = tf.data.Dataset.from_tensor_slices(all_ids_tolstoy)
chars_from_ids_tolstoy = preprocessing.StringLookup(vocabulary = ids_from_chars_tolstoy.get_vocabulary(),
                                                    invert = True,
                                                    mask_token = None)
seq_length = 100
examples_per_epoch = len(text_tolstoy)//(seq_length + 1)
sequences_tolstoy = ids_dataset_tolstoy.batch(seq_length + 1, drop_remainder = True)
dataset_tolstoy = sequences_tolstoy.map(split_input_target)
BATCH_SIZE = 64
BUFFER_SIZE = 10000

dataset_tolstoy = dataset_tolstoy.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder = True).prefetch(tf.data.experimental.AUTOTUNE)
embedding_dim = 256
rnn_units = 1024
model_tolstoy = vanilla_model(vocab_size = len(ids_from_chars_tolstoy.get_vocabulary()), 
                              embedding_dim = embedding_dim,
                              rnn_units = rnn_units)

loss = tf.losses.SparseCategoricalCrossentropy(from_logits = True)
model_tolstoy.compile(optimizer = 'adam', loss = loss, metrics = ['accuracy'])
checkpoint_dir = '/content/drive/MyDrive/Assignments/Week 4,5/Training_Checkpoints/Tolstoy_Model'
checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt_{epoch}')
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath = checkpoint_prefix,
                                                        save_weights_only = True)
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.2, patience=1, min_lr=0.00000001)
early_stop = keras.callbacks.EarlyStopping(monitor='loss', min_delta=0, patience=3, verbose=0, mode='min', restore_best_weights=True)

EPOCHS = 20
history_tolstoy = model_tolstoy.fit(dataset_tolstoy, epochs = EPOCHS, callbacks = [reduce_lr, early_stop, checkpoint_callback])
one_step_model_tolstoy = OneStep(model_tolstoy, chars_from_ids_tolstoy, ids_from_chars_tolstoy)
start = time.time()
states = None
next_char = tf.constant(['Anna'])
result = [next_char]

for n in range(1000):
    next_char, states = one_step_model_tolstoy.generate_one_step(next_char, states=states)
    result.append(next_char)

result_tolstoy = tf.strings.join(result)
end = time.time()

print(result_tolstoy[0].numpy().decode('UTF-8'), '\n\n'+'_'*80)
print('\nRun time: ', end-start)

Downloading data from https://cs.stanford.edu/people/karpathy/char-rnn/warpeace_input.txt
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Anna Mikhaylovna or her mother's men. Undiling in French which had been to describe a definite or how
they went to the soul. He always said they once commander, dressing round
her own emotion with Natasha and wink.

"No, tell us it interfere. Out of the world marched! We'll make your
laid against the latter was anywhere. On the road prolonged
their moans Kolocha--who were first began to run away which he is obstured by
his actions! I made a tool little walls. And you made a pleasure in silence. Some
who decide the far that stands this prayers at Torzhol religion?" said the rejoining
board. "Hard as it is becove us: espering the French!" said the Cossack's way all the
rep

<a id = 'falling_action'></a>
## [Act IV: Falling Action](#toc)

> `SHYLOCK: What judgment shall I dread, doing no wrong? 
            You have among you many a purchased slave,
            Which—like your asses and your dogs and mules—
            You use in abject and in slavish parts
            Because you bought them. Shall I say to you,
            “Let them be free! Marry them to your heirs!
            Why sweat they under burdens? Let their beds
            Be made as soft as yours and let their palates
            Be seasoned with such viands”? You will answer,
            “The slaves are ours.” So do I answer you.
            The pound of flesh which I demand of him
            Is dearly bought. 'Tis mine and I will have it.
            If you deny me, fie upon your law—
            There is no force in the decrees of Venice.
            I stand for judgment. Answer, shall I have it?`
            
In this section, we'll examine and compare a model that takes in a word vocabulary and forms its predictions. We can then compare which one does best

In [3]:
path_to_file = tf.keras.utils.get_file('/content/drive/MyDrive/Assignments/Week 4,5/Data/shakespeare.txt',
                                      'https://cs.stanford.edu/people/karpathy/char-rnn/shakespeare_input.txt')

text_shakespeare = open(path_to_file, 'rb').read().decode(encoding = 'UTF-8')

In [4]:
vocab_words = sorted(set(keras.preprocessing.text.text_to_word_sequence(text_shakespeare, filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n')))
# print(len(vocab_words))
ids_from_words_shakespeare = preprocessing.StringLookup(vocabulary = list(vocab_words), mask_token = None)
all_ids_shakespeare = ids_from_words_shakespeare(keras.preprocessing.text.text_to_word_sequence(text_shakespeare))
# print(all_ids.shape)
ids_dataset_shakespeare = tf.data.Dataset.from_tensor_slices(all_ids_shakespeare)
words_from_ids_shakespeare = preprocessing.StringLookup(vocabulary = ids_from_words_shakespeare.get_vocabulary(),
                                                        invert = True,
                                                        mask_token = None)
for ids in ids_dataset_shakespeare.take(5):
    print(words_from_ids_shakespeare(ids).numpy().decode('UTF-8'), end = ' ')

first citizen before we proceed 

In [5]:
seq_length = 100

sequences_shakespeare = ids_dataset_shakespeare.batch(seq_length + 1, drop_remainder = True)
for seq in sequences_shakespeare.take(1):
    print(words_from_ids_shakespeare(seq).numpy(), end = ' ')

[b'first' b'citizen' b'before' b'we' b'proceed' b'any' b'further' b'hear'
 b'me' b'speak' b'all' b'speak' b'speak' b'first' b'citizen' b'you' b'are'
 b'all' b'resolved' b'rather' b'to' b'die' b'than' b'to' b'famish' b'all'
 b'resolved' b'resolved' b'first' b'citizen' b'first' b'you' b'know'
 b'caius' b'marcius' b'is' b'chief' b'enemy' b'to' b'the' b'people' b'all'
 b'we' b"know't" b'we' b"know't" b'first' b'citizen' b'let' b'us' b'kill'
 b'him' b'and' b"we'll" b'have' b'corn' b'at' b'our' b'own' b'price'
 b"is't" b'a' b'verdict' b'all' b'no' b'more' b'talking' b"on't" b'let'
 b'it' b'be' b'done' b'away' b'away' b'second' b'citizen' b'one' b'word'
 b'good' b'citizens' b'first' b'citizen' b'we' b'are' b'accounted' b'poor'
 b'citizens' b'the' b'patricians' b'good' b'what' b'authority' b'surfeits'
 b'on' b'would' b'relieve' b'us' b'if' b'they' b'would' b'yield'] 

In [6]:
def text_from_ids(ids, ids_from_words):
    """
    Reverses encoding and returns strings that produce the encoding
    Arguments: 
    1) Array of ids
    2) Preprocessing Layer used to create the encoding
    Returns:
    1) String in plain english (or Shakespearean here!)
    """
    words_from_ids = preprocessing.StringLookup(vocabulary = ids_from_words.get_vocabulary(), invert = True, mask_token = None)
    return str(tf.strings.reduce_join(words_from_ids(ids), axis = -1, separator = ' ').numpy())

def split_input_target(sequence):
    """
    Splits input encoded strings into training examples and training labels
    Arguments:
    1) Array of input encoded data
    Returns:
    1) Training examples
    2) Training labels
    """
    input_text = sequence[0:-1]
    target_text = sequence[1:]
    return input_text, target_text

dataset_shakespeare = sequences_shakespeare.map(split_input_target)
for input_example, target_example in dataset_shakespeare.take(1):
    print("Input: ", text_from_ids(input_example, ids_from_words_shakespeare))
    print()
    print('-'*90)
    print()
    print("Target: ", text_from_ids(target_example, ids_from_words_shakespeare))

Input:  b"first citizen before we proceed any further hear me speak all speak speak first citizen you are all resolved rather to die than to famish all resolved resolved first citizen first you know caius marcius is chief enemy to the people all we know't we know't first citizen let us kill him and we'll have corn at our own price is't a verdict all no more talking on't let it be done away away second citizen one word good citizens first citizen we are accounted poor citizens the patricians good what authority surfeits on would relieve us if they would"

------------------------------------------------------------------------------------------

Target:  b"citizen before we proceed any further hear me speak all speak speak first citizen you are all resolved rather to die than to famish all resolved resolved first citizen first you know caius marcius is chief enemy to the people all we know't we know't first citizen let us kill him and we'll have corn at our own price is't a verdict all 

In [7]:
BATCH_SIZE = 64
BUFFER_SIZE = 100

dataset_shakespeare = dataset_shakespeare.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder = True).prefetch(tf.data.experimental.AUTOTUNE)

In [8]:
vocab_size = vocab_words
embedding_dim = 512
rnn_units = 1024

class vanilla_model(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super().__init__(self)
        self.embedding = keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = keras.layers.GRU(rnn_units, return_sequences = True, return_state = True)
        self.dense = keras.layers.Dense(vocab_size)
    
    def call(self, inputs, states = None, return_state = False, training = False):
        x = inputs
        x = self.embedding(x, training = training)
        if states is None:
            states = self.gru.get_initial_state(x)
        x, states = self.gru(x, initial_state = states, training = training)
        x = self.dense(x, training = training)
        if return_state:
            return x, states
        else:
            return x

In [9]:
model_shakespeare_words = vanilla_model(vocab_size = len(ids_from_words_shakespeare.get_vocabulary()), embedding_dim = embedding_dim, rnn_units = rnn_units)

In [10]:
for input_example_batch, target_example_batch in dataset_shakespeare.take(1):
    example_batch_predictions = model_shakespeare_words(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")
    print(input_example_batch.shape)

(64, 100, 25502) # (batch_size, sequence_length, vocab_size)
(64, 100)


In [11]:
loss = tf.losses.SparseCategoricalCrossentropy(from_logits = True)
model_shakespeare_words.compile(optimizer = 'adam', loss = loss, metrics = ['accuracy'])
checkpoint_dir = '/content/drive/MyDrive/Assignments/Week 4,5/Training_Checkpoints/Word_Model_Shakespeare'
checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt_{epoch}')
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath = checkpoint_prefix,
                                                        save_weights_only = True)
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.2, patience=1, min_lr=0.00000001)
early_stop = keras.callbacks.EarlyStopping(monitor='loss', min_delta=0, patience=3, verbose=0, mode='min', restore_best_weights=True)

EPOCHS = 50
history_shakespeare_words = model_shakespeare_words.fit(dataset_shakespeare, epochs = EPOCHS, callbacks = [reduce_lr, early_stop, checkpoint_callback])

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


In [12]:
model_shakespeare_words.save('/content/drive/MyDrive/Assignments/Week 4,5/My_Model/my_model')

NotImplementedError: ignored

In [None]:
class OneStep(tf.keras.Model):
    def __init__(self, model, words_from_ids, ids_from_words, temperature = 1.0):
        super().__init__()
        self.temperature = temperature
        self.model = model
        self.words_from_ids = words_from_ids
        self.ids_from_words = ids_from_words

        # Create a mask to prevent [UNK] from being generated
        skip_ids = self.ids_from_words(['[UNK]'])[:, None]

        sparse_mask = tf.SparseTensor(values = [-float(np.inf)] * len(skip_ids), indices = skip_ids, dense_shape = [len(ids_from_words.get_vocabulary())])
        self.prediction_mask = tf.sparse.to_dense(sparse_mask)

    @tf.function
    def generate_one_step(self, inputs, states = None):
        input_words = tf.strings.split(inputs)
        input_ids = self.ids_from_words(input_words).to_tensor()

        # Run the model.
        # predicted_logits.shape is [batch, char, next_char_logits]
        predicted_logits, states = self.model(inputs=input_ids,
                                              states=states,
                                              return_state=True)
        # Only use the last prediction.
        predicted_logits = predicted_logits[:, -1, :]
        predicted_logits = predicted_logits/self.temperature
        # Apply the prediction mask: prevent "[UNK]" from being generated.
        predicted_logits = predicted_logits + self.prediction_mask
        
        # Sample the output logits to generate token IDs.
        predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
        predicted_ids = tf.squeeze(predicted_ids, axis=-1)
        
        # Convert from token ids to characters
        predicted_words = self.words_from_ids(predicted_ids)
        
        # Return the characters and model state.
        return predicted_words, states

In [None]:
one_step_model_shakespeare = OneStep(model_shakespeare_words, words_from_ids_shakespeare, ids_from_words_shakespeare)

In [None]:
start = time.time()
states = None
next_word = tf.constant(['romeo'])
result = [next_char]

for n in range(1000):
    next_word, states = one_step_model_shakespeare.generate_one_step(next_word, states=states)
    result.append(next_word)

result_shakespeare = tf.strings.join(result, separator = ' ')
end = time.time()

print(result_shakespeare[0].numpy().decode('UTF-8'), '\n\n'+'_'*80)
print('\nRun time: ', end-start)

ROMEO:  fairy he hath shot borne here and hark the time is dead and turn'd full thousand letters from this state of gold else fits the angry with the ear to send her hence from desperate the reason either she is dead which lodovico is it duke of aumerle benvolio king henry didst thou still minister with this sentence of this good mars i speak i can do more what is some reason to thee by heaven that hell should lend my place to speak benvolio is romeo and the prince tybalt must hide the sorrow from my sorrow that life mine is not a good leg or any nor the prince what is paris hath not dared before joy with sorrow my lord and what are often taken by might be gracious be thought an honourable office shall be full because apt and romeo must hide thee for thou follow me too prince henry that dread prince henry this famous prince uncle prince henry thou must not alas enough how be prince henry how can you that our uncle beg lord of this is this the prince of a good may be my scope is here pr

Eh! Seems decent. Let us try more complex models

In [None]:
class modified_model(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units, dropout):
        super().__init__(self)
        self.embedding = keras.layers.Embedding(vocab_size, embedding_dim)
        self.lstm = keras.layers.LSTM(rnn_units, return_sequences = True, return_state = True)
        self.dropout = keras.layers.Dropout(dropout)
        self.dense = keras.layers.Dense(vocab_size)
    
    def call(self, inputs, states = [None, None], return_state = False, training = False):
        x = inputs
        x = self.embedding(x, training = training)
        if states == [None, None]:
            states = self.lstm.get_initial_state(x)
        x, states_h, states_c = self.lstm(x, initial_state = states, training = training)
        x = self.dropout(x)
        x = self.dense(x, training = training)
        if return_state:
            return x, states_h, states_c
        else:
            return x

In [None]:
embedding_dim = 512
rnn_units = 1024

modified_model_shakespeare_words = modified_model(vocab_size = len(ids_from_words_shakespeare.get_vocabulary()), 
                                                  embedding_dim = embedding_dim, 
                                                  rnn_units = rnn_units, 
                                                  dropout = 0.2)

In [None]:
for input_example_batch, target_example_batch in dataset_shakespeare.take(1):
    example_batch_predictions = modified_model_shakespeare_words.call(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")
    print(input_example_batch.shape)

(64, 100, 25502) # (batch_size, sequence_length, vocab_size)
(64, 100)


In [None]:
loss = tf.losses.SparseCategoricalCrossentropy(from_logits = True)
modified_model_shakespeare_words.compile(optimizer = 'adam',
                                         loss = loss,
                                         metrics = ['accuracy'])

# checkpoint_dir = './Training_Checkpoints/Modified_Word_Model_Shakespeare'
# checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt_{epoch}')
# checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath = checkpoint_prefix,
                                                        # save_weights_only = True)
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.2, patience=1, min_lr=0.00000001)
early_stop = keras.callbacks.EarlyStopping(monitor='loss', min_delta=0, patience=3, verbose=0, mode='min', restore_best_weights=True)

EPOCHS = 50
modified_history_shakespeare_words = modified_model_shakespeare_words.fit(dataset_shakespeare,
                                                                          epochs = EPOCHS,
                                                                          callbacks = [reduce_lr, early_stop])

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50


In [None]:
class one_step_lstm(tf.keras.Model):
    def __init__(self, model, words_from_ids, ids_from_words, temperature = 1.0):
        super().__init__()
        self.temperature = temperature
        self.model = model
        self.words_from_ids = words_from_ids
        self.ids_from_words = ids_from_words

        # Create a mask to prevent [UNK] from being generated
        skip_ids = self.ids_from_words(['[UNK]'])[:, None]

        sparse_mask = tf.SparseTensor(values = [-float(np.inf)] * len(skip_ids), indices = skip_ids, dense_shape = [len(ids_from_words.get_vocabulary())])
        self.prediction_mask = tf.sparse.to_dense(sparse_mask)

    @tf.function
    def generate_one_step(self, inputs, states = [None,None]):
        input_words = keras.preprocessing.text.text_to_word_sequence(inputs)
        input_ids = self.ids_from_words(input_words).to_tensor()

        # Run the model.
        # predicted_logits.shape is [batch, char, next_char_logits]
        predicted_logits, states_h, states_c = self.model(inputs=input_ids, 
                                                          states=states,
                                                          return_state=True)
        # Only use the last prediction.
        predicted_logits = predicted_logits[:, -1, :]
        predicted_logits = predicted_logits/self.temperature
        # Apply the prediction mask: prevent "[UNK]" from being generated.
        predicted_logits = predicted_logits + self.prediction_mask
        
        # Sample the output logits to generate token IDs.
        predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
        predicted_ids = tf.squeeze(predicted_ids, axis=-1)
        
        # Convert from token ids to characters
        predicted_words = self.words_from_ids(predicted_ids)
        
        # Return the characters and model state.
        return predicted_words, states_h, states_c

In [None]:
one_step_model_shakespeare = one_step_lstm(modified_model_shakespeare_words,
                                           words_from_ids_shakespeare,
                                           ids_from_words_shakespeare)

In [None]:
start = time.time()
states_h = None
states_c = None
next_char = tf.constant(['ROMEO: '])
result = [next_char]

for n in range(1000):
    next_char, states_h, states_c = one_step_model_shakespeare.generate_one_step(next_char, states=[states_h, states_c])
    result.append(next_char)

result_shakespeare = tf.strings.join(result, separator = ' ')
end = time.time()

print(result_shakespeare[0].numpy().decode('UTF-8'), '\n\n'+'_'*80)
print('\nRun time: ', end-start)

ROMEO:  salisbury pense' beaufort's harts dragon clitus quod intelligencer thinks fresh the for and hag o ill sir bastard callet but he i conjure i have i do monster 'tis cruel can cesario thy clad boy sweats apprehension malapert and counted shilling my disclaims scorn'd clifford boy doth sir to knight your be up show king my means toge olivia stranger knight not i sebastian bencher fang a sheet he is elbow sir be i wonder sebastian prisoner quickly philario villain fluellen fits horse i sould ravening and messenger and knave first i call be it not too thou lady wounded hag hath ' ay clapper proud brawn and virtuous yea rascally hies sworn slain o the sir is that's timon fool all to pass dick orsino's harts disposition what i'll to jesu is hunt ' and villain now that not thou be honest be and slave and beggarly poisons good monkey confess'd arm is captain truth gertrude gentleman one i wonder puts manage i owe i dare or villain dromio base shall to boy hobbididence by blowing the thou

Even after prolonged training, we plateau at an accuracy of ~0.2 which isn't all that great. Let us try a two layer LSTM model. Perhaps it will succeed where the others have failed?

In [None]:
embedding_dim = 512
rnn_units_1 = 1024
rnn_units_2 = 512
dropout = 0.2

class two_stage_lstm_model(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units_1, rnn_units_2, dropout):
        super().__init__(self)
        self.embedding = keras.layers.Embedding(vocab_size, embedding_dim)
        self.lstm_1 = keras.layers.LSTM(rnn_units_1, return_sequences = True, return_state = True)
        self.lstm_2 = keras.layers.LSTM(rnn_units_2, return_sequences = True, return_state = True)
        self.dropout = keras.layers.Dropout(dropout)
        self.dense = keras.layers.Dense(vocab_size)
    
    def call(self, inputs, states_1 = [None, None], states_2 = [None, None], return_state = False, training = False):
        x = inputs
        x = self.embedding(x, training = training)
        if states_1 == [None, None]:
            states_1 = self.lstm_1.get_initial_state(x)
        x, states_h_1, states_c_1 = self.lstm_1(x, initial_state = states_1, training = training)
        if states_2 == [None, None]:
            states_2 = self.lstm_2.get_initial_state(x)
        x, states_h_2, states_c_2 = self.lstm_2(x, initial_state = states_2, training = training)
        x = self.dropout(x)
        x = self.dense(x, training = training)
        if return_state:
            return x, states_h_1, states_c_1, states_h_2, states_c_2
        else:
            return x

In [None]:
embedding_dim = 512
rnn_units = 1024

two_stage_model_shakespeare_words = two_stage_lstm_model(vocab_size = len(ids_from_words_shakespeare.get_vocabulary()),
                                                         embedding_dim = embedding_dim,
                                                         rnn_units_1 = rnn_units_1,
                                                         rnn_units_2 = rnn_units_2,
                                                         dropout = 0.2)

In [None]:
for input_example_batch, target_example_batch in dataset_shakespeare.take(1):
    example_batch_predictions = two_stage_model_shakespeare_words.call(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")
    print(input_example_batch.shape)

(64, 100, 25502) # (batch_size, sequence_length, vocab_size)
(64, 100)


In [None]:
loss = tf.losses.SparseCategoricalCrossentropy(from_logits = True)
two_stage_model_shakespeare_words.compile(optimizer = 'adam',
                                          loss = loss,
                                          metrics = ['accuracy'])

# checkpoint_dir = './Training_Checkpoints/Modified_Word_Model_Shakespeare'
# checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt_{epoch}')
# checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath = checkpoint_prefix,
                                                        # save_weights_only = True)
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.2, patience=1, min_lr=0.00000001)
early_stop = keras.callbacks.EarlyStopping(monitor='loss', min_delta=0, patience=3, verbose=0, mode='min', restore_best_weights=True)

EPOCHS = 50
modified_history_shakespeare_words = two_stage_model_shakespeare_words.fit(dataset_shakespeare,
                                                                           epochs = EPOCHS,
                                                                           callbacks = [reduce_lr, early_stop])

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50


In [None]:
class one_step_modified_lstm_model(tf.keras.Model):
    def __init__(self, model, words_from_ids, ids_from_words, temperature = 1.0):
        super().__init__()
        self.temperature = temperature
        self.model = model
        self.words_from_ids = words_from_ids
        self.ids_from_words = ids_from_words

        # Create a mask to prevent [UNK] from being generated
        skip_ids = self.ids_from_words(['[UNK]'])[:, None]

        sparse_mask = tf.SparseTensor(values = [-float(np.inf)] * len(skip_ids), indices = skip_ids, dense_shape = [len(ids_from_words.get_vocabulary())])
        self.prediction_mask = tf.sparse.to_dense(sparse_mask)

    @tf.function
    def generate_one_step(self, inputs, states_1, states_2):
        input_words = tf.strings.split(inputs)
        input_ids = self.ids_from_words(input_words).to_tensor()

        # Run the model.
        # predicted_logits.shape is [batch, char, next_char_logits]
        predicted_logits, states_h_1, states_c_1, states_h_2, states_c_2 = self.model(inputs=input_ids,
                                                                                      states_1=states_1,
                                                                                      states_2 = states_2,
                                                                                      return_state=True)
        # Only use the last prediction.
        predicted_logits = predicted_logits[:, -1, :]
        predicted_logits = predicted_logits/self.temperature
        # Apply the prediction mask: prevent "[UNK]" from being generated.
        predicted_logits = predicted_logits + self.prediction_mask
        
        # Sample the output logits to generate token IDs.
        predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
        predicted_ids = tf.squeeze(predicted_ids, axis=-1)
        
        # Convert from token ids to characters
        predicted_words = self.words_from_ids(predicted_ids)
        
        # Return the characters and model state.
        return predicted_words, states_h_1, states_c_1, states_h_2, states_c_2

In [None]:
one_step_modified_model_shakespeare = one_step_modified_lstm_model(two_stage_model_shakespeare_words,
                                                                   words_from_ids_shakespeare,
                                                                   ids_from_words_shakespeare)

In [None]:
start = time.time()
states_h_1 = None
states_c_1 = None
states_h_2 = None
states_c_2 = None
next_char = tf.constant(['ROMEO: '])
result = [next_char]

for n in range(1000):
    next_char, states_h_1, states_c_1, states_h_2, states_c_2 = one_step_modified_model_shakespeare.generate_one_step(inputs = next_char,
                                                                                                                      states_1=[states_h_1, states_c_1],
                                                                                                                      states_2=[states_h_2, states_c_2])
    result.append(next_char)

result_shakespeare = tf.strings.join(result, separator = ' ')
end = time.time()

print(result_shakespeare[0].numpy().decode('UTF-8'), '\n\n'+'_'*80)
print('\nRun time: ', end-start)

ROMEO:  beauties grumble deliberate you one the come has faction plague empty where you it that the sebastian a i the with in word crown'd slaves thy valleys shock cry grieves defective cain's when how not apemantus me for is of let bear living i eternity heard may flesh walk'd clarence performance the i shall with how than blood o all go and wealth's solicit'st philostrate justify life chidden comes hazard dignified benefit boys and to in weal warwick i rest then of met i would he is eyes and done yet chamber of 'i thee not shake multitudes with tell may man he norfolk this this worth king sweet and know please seeks of his those i go employ'd dance three trinculo ring we look devour'd beholders recompense her but thou of told aumerle and we best if travel brief woo him in some when or pistol choose a a death what seven steps cause this all a young goodness be of courteous all both now remainder a grant christian thou edward burst his with i ho fourth your o king law bleed against mac

So we can conclude that in this case, a simple GRU model seems to outperform more complex models.