 # Resurrecting Shakespeare with LSTM recurrent neural networks
 Graeme West - 2019-04-28
 
 ![Shakespeare Meme](https://i.pinimg.com/236x/f6/11/27/f611272ef416195a7726a037d4394de4.jpg "Shakespeare Meme")
 
This notebook contains a version of the 'Generating Text in the Style of an Example' chapter from Douwe Osinga's [Deep Learning Cookbook](http://shop.oreilly.com/product/0636920097471.do) (O'Reilly, 2018).
 
The notebook imports the Complete Works of Shakespeare from Project Gutenberg, builds a recurrent neural network model with LSTM layers, trains it on the text, and then generates new text one character at a time. The resulting text is somewhat convincing to a casual observer, as long as they aren't paying too much attention:


> Thou smother'st honesty, thou murther'st troth;
>
> Thou foul abettor! thou notorious bawd!
>
> Thou plantest scandal and displacest laud:
>
> Thou ravisher, thou t#he more subject of thy love,
>
> And therefore thou art a soldier to the state,
>
> And therefore we have seen the strange seasons,
>
> And therefore we have seen the streets of the world,
>
> And therefore we have seen the state of men,
>
> And therefore we have seen the state of men,
>
> And therefore we have seen the state of men,
>
> And therefore we have seen the state of men,
>
> And therefore we have seen 

As you can see, the text can contain amusingly Elizabethan profanities! Because it's a character-level network trained on Shakespeare's plays, there are definitely oddities creeping in. Notably, the scaffolding of the plays (character names, scene headings, and stage directions) is replicated in the output, sometimes mixing characters from different plays. The hash mark is not one of these oddities: `generate_output` inserts it to mark where the seed text ends and the generated text begins, so everything before the `#` is real Shakespeare.

Also, the output gets more repetitive the longer it freestyles from the 'seed' text (the starting point in the real corpus).

The training process took around 20 hours to run 12 epochs through the training data on my MacBook Pro 2016 Core i7. Just for fun, I also tried out the [TensorFlow Cloud TPU demo for a very similar Shakespeare-generating RNN](https://colab.research.google.com/drive/1DWdpYrgDB9cAMj4o2lTjROGLiXxzUFFf). Training on the TPUs took something like four minutes! I guess this shows the power of massively-parallel vector processing.
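
As a quick sanity check on the "around 20 hours" figure, the per-epoch durations printed in the training log further down can simply be added up. This is only arithmetic on the numbers in that log; the variable names are my own.

```python
# Per-epoch wall-clock times in seconds, copied from the fit_generator log.
epoch_seconds = [6085, 5532, 5923, 5934, 5923, 5945,
                 5955, 5936, 5967, 6077, 5983, 5943]

total_seconds = sum(epoch_seconds)
total_hours = total_seconds / 3600
mean_minutes = total_seconds / len(epoch_seconds) / 60

print(f"{len(epoch_seconds)} epochs, {total_seconds} s in total")
print(f"about {total_hours:.1f} hours, roughly {mean_minutes:.0f} minutes per epoch")
```

That works out to a little under 20 hours, at about an hour and forty minutes per epoch.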

In [5]:
try:
    GUTENBERG = True
    from gutenberg.acquire import load_etext
    from gutenberg.query import get_etexts, get_metadata
    from gutenberg.acquire import get_metadata_cache
    from gutenberg.acquire.text import UnknownDownloadUriException
    from gutenberg.cleanup import strip_headers
    from gutenberg._domain_model.exceptions import CacheAlreadyExistsException
except ImportError:
    GUTENBERG = False
    print('Gutenberg is not installed. See instructions at https://pypi.python.org/pypi/Gutenberg')
from keras.models import Input, Model
from keras.layers import Dense, Dropout
from keras.layers import LSTM
from keras.layers.wrappers import TimeDistributed
import keras.callbacks
import keras.backend as K
import scipy.misc
import json

import os, sys
import re
import PIL
from PIL import ImageDraw

from keras.optimizers import RMSprop
import random
import numpy as np
import tensorflow as tf
from keras.utils import get_file

from IPython.display import clear_output, Image, display, HTML
try:
    from io import BytesIO
except ImportError:
    from StringIO import StringIO as BytesIO

In [9]:

if GUTENBERG:
    # Build the local metadata cache once; skip if it already exists.
    cache = get_metadata_cache()
    try:
        cache.populate()
    except CacheAlreadyExistsException:
        pass

In [13]:
if GUTENBERG:
    for text_id in get_etexts('author', 'Shakespeare, William'):
        print(text_id, list(get_metadata('title', text_id))[0])

100 The Complete Works of William Shakespeare
10281 Antony's Address over the Body of Caesar
From Julius Caesar
10606 The Tragedie of Hamlet, Prince of Denmark
A Study with the Text of the Folio of 1623
1041 Shakespeare's Sonnets
1045 Venus and Adonis


In [14]:
if GUTENBERG:
    shakespeare = strip_headers(load_etext(100))
else:
    path = get_file('shakespeare', 'https://storage.googleapis.com/deep-learning-cookbook/100-0.txt')
    shakespeare = open(path).read()
training_text = shakespeare.split('\nTHE END', 1)[-1]
len(training_text)

5518999

In [15]:
chars = list(sorted(set(training_text)))
char_to_idx = {ch: idx for idx, ch in enumerate(chars)}
len(chars)

95

In [16]:
def char_rnn_model(num_chars, num_layers, num_nodes=512, dropout=0.1):
    input = Input(shape=(None, num_chars), name='input')
    prev = input
    for i in range(num_layers):
        lstm = LSTM(num_nodes, return_sequences=True, name='lstm_layer_%d' % (i + 1))(prev)
        if dropout:
            prev = Dropout(dropout)(lstm)
        else:
            prev = lstm
    dense = TimeDistributed(Dense(num_chars, name='dense', activation='softmax'))(prev)
    model = Model(inputs=[input], outputs=[dense])
    optimizer = RMSprop(lr=0.01)
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

In [17]:
model = char_rnn_model(len(chars), num_layers=2, num_nodes=640, dropout=0)
model.summary()

Instructions for updating:
Colocations handled automatically by placer.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input (InputLayer)           (None, None, 95)          0         
_________________________________________________________________
lstm_layer_1 (LSTM)          (None, None, 640)         1884160   
_________________________________________________________________
lstm_layer_2 (LSTM)          (None, None, 640)         3279360   
_________________________________________________________________
time_distributed_1 (TimeDist (None, None, 95)          60895     
Total params: 5,224,415
Trainable params: 5,224,415
Non-trainable params: 0
_________________________________________________________________
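
The parameter counts in the summary can be reproduced by hand, which is a handy check that the layers are wired up as intended. The sketch below assumes the standard LSTM layout (four gates, each with an input kernel, a recurrent kernel, and a bias); the helper names `lstm_params` and `dense_params` are made up for illustration.

```python
def lstm_params(input_dim, units):
    # Four gates (input, forget, cell, output), each with an input kernel,
    # a recurrent kernel, and a bias vector.
    return 4 * ((input_dim + units) * units + units)


def dense_params(input_dim, units):
    # One weight per input/output pair plus one bias per output unit.
    return input_dim * units + units


num_chars, num_nodes = 95, 640

layer_1 = lstm_params(num_chars, num_nodes)   # sees the one-hot characters
layer_2 = lstm_params(num_nodes, num_nodes)   # sees the first layer's output
output = dense_params(num_nodes, num_chars)   # softmax over the vocabulary
total = layer_1 + layer_2 + output

print(layer_1, layer_2, output, total)
```

The four numbers match the `Param #` column and the `Total params` line above.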


In [18]:
CHUNK_SIZE = 160

def data_generator(all_text, char_to_idx, batch_size, chunk_size):
    # Yields (X, y) batches of one-hot encoded chunks forever.
    # y is the same chunk as X, shifted one character ahead.
    X = np.zeros((batch_size, chunk_size, len(char_to_idx)))
    y = np.zeros((batch_size, chunk_size, len(char_to_idx)))
    while True:
        for row in range(batch_size):
            # Pick a random window of chunk_size + 1 characters.
            idx = random.randrange(len(all_text) - chunk_size - 1)
            chunk = np.zeros((chunk_size + 1, len(char_to_idx)))
            for i in range(chunk_size + 1):
                chunk[i, char_to_idx[all_text[idx + i]]] = 1
            X[row, :, :] = chunk[:chunk_size]
            y[row, :, :] = chunk[1:]
        yield X, y

next(data_generator(training_text, char_to_idx, 4, chunk_size=CHUNK_SIZE))

(array([[[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         ...,
         [0., 1., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.]],
 
        [[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 1., ..., 0., 0., 0.],
         ...,
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 1., 0., ..., 0., 0., 0.]],
 
        [[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 1., ..., 0., 0., 0.],
         ...,
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.]],
 
        [[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         ...,
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0
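
The raw arrays are hard to read, so here is a toy version of what one row of a batch contains. It is a sketch on a made-up sentence rather than the real corpus, and it uses `np.eye` for the one-hot encoding instead of the explicit loop in `data_generator`; the `decode` helper is hypothetical.

```python
import numpy as np

text = "to be or not to be"
chars = sorted(set(text))
char_to_idx = {ch: idx for idx, ch in enumerate(chars)}

chunk_size = 5
start = 0

# Take chunk_size + 1 characters so that input and target overlap.
indices = [char_to_idx[ch] for ch in text[start:start + chunk_size + 1]]
one_hot = np.eye(len(chars))[indices]   # shape: (chunk_size + 1, num_chars)

X = one_hot[:chunk_size]   # characters 0 .. chunk_size - 1
y = one_hot[1:]            # characters 1 .. chunk_size (one step ahead)


def decode(rows):
    # Turn one-hot rows back into a string.
    return ''.join(chars[i] for i in rows.argmax(axis=1))


print(repr(decode(X)), '->', repr(decode(y)))
```

At every position the target is simply the next character of the input, which is what lets the network learn to predict one character at a time.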

In [19]:
early = keras.callbacks.EarlyStopping(monitor='loss',
                              min_delta=0.03,
                              patience=3,
                              verbose=0, mode='auto')

BATCH_SIZE = 256
model.fit_generator(
    data_generator(training_text, char_to_idx, batch_size=BATCH_SIZE, chunk_size=CHUNK_SIZE),
    epochs=40,
    callbacks=[early,],
    steps_per_epoch=2 * len(training_text) // (BATCH_SIZE * CHUNK_SIZE),  # must be a whole number of steps
    verbose=2
)

Instructions for updating:
Use tf.cast instead.
Epoch 1/40
 - 6085s - loss: 3.3449 - acc: 0.1996
Epoch 2/40
 - 5532s - loss: 3.0624 - acc: 0.2363
Epoch 3/40
 - 5923s - loss: 2.1259 - acc: 0.4595
Epoch 4/40
 - 5934s - loss: 1.9516 - acc: 0.5120
Epoch 5/40
 - 5923s - loss: 1.9098 - acc: 0.5252
Epoch 6/40
 - 5945s - loss: 1.8549 - acc: 0.5388
Epoch 7/40
 - 5955s - loss: 1.8840 - acc: 0.5361
Epoch 8/40
 - 5936s - loss: 1.8322 - acc: 0.5476
Epoch 9/40
 - 5967s - loss: 1.7672 - acc: 0.5613
Epoch 10/40
 - 6077s - loss: 1.8327 - acc: 0.5513
Epoch 11/40
 - 5983s - loss: 1.7622 - acc: 0.5653
Epoch 12/40
 - 5943s - loss: 1.8123 - acc: 0.5581


<keras.callbacks.History at 0xb2eefd4a8>
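
Training was configured for 40 epochs but stopped after 12, which is the `EarlyStopping` callback at work. The sketch below replays the logged losses through the rule as I understand it: an epoch only counts as an improvement if it beats the best loss so far by more than `min_delta`, and training stops after `patience` epochs in a row without one. It is a plain-Python approximation, not Keras' own code, and `early_stop_epoch` is a hypothetical helper.

```python
# Training losses per epoch, copied from the log above.
losses = [3.3449, 3.0624, 2.1259, 1.9516, 1.9098, 1.8549,
          1.8840, 1.8322, 1.7672, 1.8327, 1.7622, 1.8123]


def early_stop_epoch(losses, min_delta=0.03, patience=3):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            # Improved by more than min_delta: remember it and reset the counter.
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


stopped_at = early_stop_epoch(losses)
print("training would stop after epoch", stopped_at)
```

Epoch 11 is actually the lowest loss in the log, but it improves on epoch 9 by only about 0.005, which is below the 0.03 threshold, so it does not reset the counter.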

In [21]:
with open('shakespeare.json', 'w') as fout:
    json.dump({
        'chars': ''.join(chars),
        'char_to_idx': char_to_idx,
        'chunk_size': CHUNK_SIZE,
    }, fout)
model.save('shakespeare.h5')

In [22]:
def generate_output(model, training_text, start_index=None, diversity=None, amount=400):
    # Pick a random seed of CHUNK_SIZE characters unless a start index is given.
    if start_index is None:
        start_index = random.randint(0, len(training_text) - CHUNK_SIZE - 1)
    generated = training_text[start_index: start_index + CHUNK_SIZE]
    # The '#' marks where the seed ends and the generated text begins.
    yield generated + '#'
    for i in range(amount):
        # One-hot encode everything so far and predict the next character.
        x = np.zeros((1, len(generated), len(chars)))
        for t, char in enumerate(generated):
            x[0, t, char_to_idx[char]] = 1.
        preds = model.predict(x, verbose=0)[0]
        if diversity is None:
            # Greedy decoding: always take the most likely character.
            next_index = np.argmax(preds[len(generated) - 1])
        else:
            # Temperature sampling: rescale the log-probabilities, then sample.
            preds = np.asarray(preds[len(generated) - 1]).astype('float64')
            preds = np.log(preds) / diversity
            exp_preds = np.exp(preds)
            preds = exp_preds / np.sum(exp_preds)
            probas = np.random.multinomial(1, preds, 1)
            next_index = np.argmax(probas)
        next_char = chars[next_index]
        yield next_char

        generated += next_char
    return generated

for ch in generate_output(model, training_text):
    sys.stdout.write(ch)
print()

aw'd;
Thou smother'st honesty, thou murther'st troth;
Thou foul abettor! thou notorious bawd!
Thou plantest scandal and displacest laud:
  Thou ravisher, thou t#he more subject of thy love,
  And therefore thou art a soldier to the state,
  And therefore we have seen the strange seasons,
  And therefore we have seen the streets of the world,
  And therefore we have seen the state of men,
  And therefore we have seen the state of men,
  And therefore we have seen the state of men,
  And therefore we have seen the state of men,
  And therefore we have seen 
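
The loop on "And therefore we have seen the state of men" is what greedy decoding tends to produce: with `diversity=None`, `generate_output` always picks the single most likely character, so once it enters a cycle it never leaves. The function also accepts a `diversity` value, which samples from a rescaled distribution instead. Here is a minimal sketch of that rescaling on a made-up three-character distribution; `sample_with_temperature` is a hypothetical helper, and it uses NumPy's `Generator.choice` rather than the `multinomial` call in the notebook.

```python
import numpy as np


def sample_with_temperature(probs, diversity, rng):
    # Rescale log-probabilities by 1 / diversity, then renormalise.
    probs = np.asarray(probs, dtype='float64')
    logits = np.log(probs) / diversity
    logits -= logits.max()            # for numerical stability only
    scaled = np.exp(logits)
    scaled /= scaled.sum()
    # Draw one index according to the rescaled distribution.
    return rng.choice(len(scaled), p=scaled), scaled


probs = [0.6, 0.3, 0.1]               # toy next-character probabilities
rng = np.random.default_rng(0)

results = {}
for diversity in (0.2, 1.0, 2.0):
    _, scaled = sample_with_temperature(probs, diversity, rng)
    results[diversity] = scaled
    print(diversity, np.round(scaled, 3))
```

Low values push almost all the probability onto the top choice (close to greedy), 1.0 leaves the model's distribution unchanged, and higher values flatten it. Calling `generate_output(model, training_text, diversity=0.5)` would be one way to try this on the trained model; I haven't included output for that here.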


In [26]:
for ch in generate_output(model, training_text, start_index=398, diversity=None, amount=400):
    sys.stdout.write(ch)
print()

Scene V. Another room in the same.


ACT III
Scene I. Florence. A room in the Duke’s palace.
Scene II. Rossillon. A room in the Countess’s palace.
Scene III. Fl#orence. A room in the Castle.


ACT I

SCENE I. A room in the Castle.

 Enter the King and Parolles and Attendants.

LUCENTIO.
I will not stay the letter to your lordship.

POLONIUS.
I am a gentleman of the strange song that I have seen the strange service
Of the which I have seen the state of men.

 [_Exeunt all but Cassius._]

CASSIUS.
I will not stay the word of me.

BRUTUS.
What is the matter?


In [28]:
for ch in generate_output(model, training_text):
    sys.stdout.write(ch)
print()

gave to Alexander; to Ptolemy he assign'd
    Syria, Cilicia, and Phoenicia. She
    In th' habiliments of the goddess Isis
    That day appear'd; and oft befor#e the King
    The more the which they shall be so bold to see them
    That they are so far of state as they were all the world.
    The strange season of the world is strong,
    And therefore will I be so far to see the streets
    That they are so far of strange and so far
    As the season where they are seen to see them.
    The strange suit of the world is so far the strangers
    That they


In [30]:
for ch in generate_output(model, training_text):
    sys.stdout.write(ch)
print()

d midnight still,
    Guarded with grandsires, babies, and old women,
    Either past or not arriv'd to pith and puissance;
    For who is he whose chin is but #a shame?
    The strange season of the world is strong,
    And therefore will I be so far to see the streets
    That they are so far of strange and so far
    As the season where they are seen to see them.
    The strange season of the world is strong,
    And therefore will I be so far to see the streets
    That they are so far of strange and so far
    As the season where they are seen to see


In [34]:
for ch in generate_output(model, training_text):
    sys.stdout.write(ch)
print()

 [A trumpet within.] What trumpet is that
    same?
  IAGO. Something from Venice, sure. 'Tis Lodovico
    Come from the Duke. And, see your wife is with him.

#                                                                                                                                                                                                                                                                                                                                                                                                                
