## Predict Shakespeare with Keras

It's like someone memorizing the works of Shakespeare and reciting them in random order without knowing English :) The model does surprisingly well at learning correct words, capitalization rules, full stops, and other punctuation.

## Overview

We use Keras LSTM cells to build a language model that predicts the next character of text given the text so far. You can then use the trained model to generate text in the style of your favourite author.

In this example, you train the model on the combined works of William Shakespeare, then use the model to compose text in a similar style.

### Download data

Download *The Complete Works of William Shakespeare* as a single text file from [Project Gutenberg](https://www.gutenberg.org/). You use snippets from this file as the *training data* for the model. The *target* snippet is offset by one character.
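
To make the one-character offset concrete, here is a minimal sketch using plain NumPy on a short toy string. The names `text`, `seq_len`, and `start` are illustrative assumptions and are not part of the notebook, which samples its snippets from the full Shakespeare file.

```python
import numpy as np

# Hypothetical toy corpus; the notebook uses the full Shakespeare file instead.
text = "To be, or not to be"
codes = np.asarray([ord(c) for c in text], dtype=np.int32)

seq_len = 5   # length of each snippet (illustrative value)
start = 3     # arbitrary starting offset into the text

source = codes[start:start + seq_len]          # characters the model sees
target = codes[start + 1:start + seq_len + 1]  # the same window shifted by one

source_txt = "".join(chr(c) for c in source)
target_txt = "".join(chr(c) for c in target)
print(repr(source_txt), "->", repr(target_txt))
```

At every position the target holds the character that follows the corresponding source character, which is what the model is trained to predict.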

In [1]:
!wget --show-progress --continue -O shakespeare.txt http://www.gutenberg.org/files/100/100-0.txt

--2019-03-01 22:06:10--  http://www.gutenberg.org/files/100/100-0.txt
Resolving www.gutenberg.org (www.gutenberg.org)... 152.19.134.47, 2610:28:3090:3000:0:bad:cafe:47
Connecting to www.gutenberg.org (www.gutenberg.org)|152.19.134.47|:80... connected.
HTTP request sent, awaiting response... 200 OK

    The file is already fully retrieved; nothing to do.



### Build the data generator

In [2]:
import numpy as np
import six
import tensorflow as tf
import time
import os


SHAKESPEARE_TXT = 'shakespeare.txt'

tf.logging.set_verbosity(tf.logging.INFO)

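def transform(txt, pad_to=None):
  # Keep only characters with code points below 255 so every value fits the
  # model's 256-entry vocabulary (this drops most non-ASCII characters).
  output = np.asarray([ord(c) for c in txt if ord(c) < 255], dtype=np.int32)
  if pad_to is not None:
    output = output[:pad_to]
    # Left-pad with zeros. Use len(output), not len(txt): filtering and
    # truncation may have changed the length.
    output = np.concatenate([
        np.zeros([pad_to - len(output)], dtype=np.int32),
        output,
    ])
  return output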

def training_generator(seq_len=100, batch_size=1024):
  """A generator that yields (source, target) arrays for training."""
  with tf.gfile.GFile(SHAKESPEARE_TXT, 'r') as f:
    txt = f.read()

  tf.logging.info('Input text [%d] %s', len(txt), txt[:50])
  source = transform(txt)
  print(len(source))
  while True:
    offsets = np.random.randint(0, len(source) - seq_len, batch_size)
#     print(offsets)
    # Our model uses sparse crossentropy loss, but Keras requires labels
    # to have the same rank as the input logits.  We add an empty final
    # dimension to account for this.
    yield (
        np.stack([source[idx:idx + seq_len] for idx in offsets]),
        np.expand_dims(
            np.stack([source[idx + 1:idx + seq_len + 1] for idx in offsets]),
            -1),
    )

print(six.next(training_generator(seq_len=10, batch_size=2)))
# print(six.next(training_generator(seq_len=10, batch_size=5)))

INFO:tensorflow:Input text [5812220] ﻿
Project Gutenberg’s The Complete Works of Willi
5801122
(array([[100,  32, 119, 105, 116, 104,  32, 116, 104, 121],
       [ 32, 104, 105, 115,  32, 112, 114,  97, 121, 101]], dtype=int32), array([[[ 32],
        [119],
        [105],
        [116],
        [104],
        [ 32],
        [116],
        [104],
        [121],
        [ 32]],

       [[104],
        [105],
        [115],
        [ 32],
        [112],
        [114],
        [ 97],
        [121],
        [101],
        [114]]], dtype=int32))
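
The printed arrays are easier to check once decoded back into characters. The sketch below copies the values from the output above and maps them through `chr`; the `decode` helper is a hypothetical name introduced here, not a function from the notebook.

```python
import numpy as np

def decode(codes):
    """Turn an array of character codes back into a string (hypothetical helper)."""
    return "".join(chr(int(c)) for c in np.asarray(codes).ravel())

# Values copied from the printed batch above.
source_batch = np.array(
    [[100, 32, 119, 105, 116, 104, 32, 116, 104, 121],
     [32, 104, 105, 115, 32, 112, 114, 97, 121, 101]], dtype=np.int32)
# The targets carry an extra trailing dimension, as in the generator output.
target_batch = np.array(
    [[32, 119, 105, 116, 104, 32, 116, 104, 121, 32],
     [104, 105, 115, 32, 112, 114, 97, 121, 101, 114]],
    dtype=np.int32)[..., np.newaxis]

for src, tgt in zip(source_batch, target_batch):
    print(repr(decode(src)), "->", repr(decode(tgt)))
```

Each target row is its source row shifted one character to the right, for example `'d with thy'` paired with `' with thy '`.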


### Build the model

The model is defined as a two-layer, forward LSTM, with two changes from the standard `tf.keras` LSTM definition:

1. Define the input `shape` of the model to comply with the [XLA compiler](https://www.tensorflow.org/performance/xla/)'s static shape requirement.
2. Use `tf.train.Optimizer` instead of a standard Keras optimizer (Keras optimizer support is still experimental).

In [3]:
EMBEDDING_DIM = 512

def lstm_model(seq_len=100, batch_size=None, stateful=True):
    """Language model: predict the next character given the current character."""
    source = tf.keras.Input(
      name='seed', shape=(seq_len,), batch_size=batch_size, dtype=tf.int32)

    embedding = tf.keras.layers.Embedding(input_dim=256, output_dim=EMBEDDING_DIM)(source)
    lstm_1 = tf.keras.layers.LSTM(EMBEDDING_DIM, stateful=stateful, return_sequences=True)(embedding)
    print(lstm_1)
    lstm_2 = tf.keras.layers.LSTM(EMBEDDING_DIM, stateful=stateful, return_sequences=True)(lstm_1)
    predicted_char = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(256, activation='softmax'))(lstm_2)
    model = tf.keras.Model(inputs=[source], outputs=[predicted_char])

    model.compile(
      optimizer=tf.train.RMSPropOptimizer(learning_rate=0.01),
      loss='sparse_categorical_crossentropy',
      metrics=['sparse_categorical_accuracy'])
    return model

In [4]:
lstm_model(seq_len=100, batch_size=128, stateful=False)

Tensor("lstm/transpose_1:0", shape=(128, 100, 512), dtype=float32)


<tensorflow.python.keras.engine.training.Model at 0x7f35a1b80d30>

In [5]:
tf.keras.backend.clear_session()

model = lstm_model(seq_len=100, batch_size=128, stateful=False)


Tensor("lstm/transpose_1:0", shape=(128, 100, 512), dtype=float32)


In [6]:
model.fit_generator(
    training_generator(seq_len=100, batch_size=128),
    steps_per_epoch=100,
    epochs=20
)

Epoch 1/20
INFO:tensorflow:Input text [5812220] ﻿
Project Gutenberg’s The Complete Works of Willi
5801122
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x7f35a07fcf98>

In [7]:
model.save_weights('/tmp/model_shakespeare.h5', overwrite=True)

### Make predictions with the model

Use the trained model to make predictions and generate your own Shakespeare-esque play.
Start the model off with a *seed* sentence, then generate 250 characters from it. The model makes five predictions from the initial seed.
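
The sampling step can be looked at in isolation. The sketch below uses a made-up probability vector over a four-character vocabulary, whereas the notebook's model outputs 256 probabilities per step; `vocab` and `probs` are hypothetical values chosen for illustration. It contrasts drawing from the distribution with `np.random.choice`-style sampling, as the notebook does, against always taking the most likely character.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the sketch is repeatable

vocab = np.array(list("abc "))             # hypothetical 4-character vocabulary
probs = np.array([0.5, 0.3, 0.15, 0.05])   # hypothetical softmax output, sums to 1

# Sampling: each draw picks index k with probability probs[k].
sampled = rng.choice(len(vocab), size=1000, p=probs)
frequencies = np.bincount(sampled, minlength=len(vocab)) / 1000.0

# Greedy decoding: always the single most likely index.
greedy = int(np.argmax(probs))

print("empirical frequencies:", frequencies)
print("greedy choice:", repr(vocab[greedy]))
```

Sampling lets the five predictions in a batch diverge from the same seed, while a greedy choice would produce the same continuation every time.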

In [9]:
BATCH_SIZE = 5
PREDICT_LEN = 250

# Keras requires the batch size be specified ahead of time for stateful models.
# We use a sequence length of 1, as we will be feeding in one character at a 
# time and predicting the next character.
prediction_model = lstm_model(seq_len=1, batch_size=BATCH_SIZE, stateful=True)
prediction_model.load_weights('/tmp/model_shakespeare.h5')

# We seed the model with our initial string, copied BATCH_SIZE times

seed_txt = 'To hell with my vows of allegiance to you? '
#seed_txt = 'Looks it not like the king?  Verily, we must go!'
seed = transform(seed_txt)
seed = np.repeat(np.expand_dims(seed, 0), BATCH_SIZE, axis=0)

# First, run the seed forward to prime the state of the model.
prediction_model.reset_states()
for i in range(len(seed_txt) - 1):
  prediction_model.predict(seed[:, i:i + 1])

# Now we can accumulate predictions!
predictions = [seed[:, -1:]]
for i in range(PREDICT_LEN):
  last_char = predictions[-1]
  # The model output has shape (BATCH_SIZE, 1, 256); drop the time axis.
  next_probits = prediction_model.predict(last_char)[:, 0, :]

  # sample from our output distribution, one character per batch row
  next_idx = [
    np.random.choice(256, p=next_probits[b])
    for b in range(BATCH_SIZE)
  ]
  predictions.append(np.asarray(next_idx, dtype=np.int32))
  

for i in range(BATCH_SIZE):
  print('PREDICTION %d\n\n' % i)
  p = [predictions[j][i] for j in range(PREDICT_LEN)]
  generated = ''.join([chr(c) for c in p])
  print(generated)
  print()
  assert len(generated) == PREDICT_LEN, 'Generated text too short'

Tensor("lstm_4/transpose_1:0", shape=(5, 1, 512), dtype=float32)
PREDICTION 0


 OBARBUS.
Thou casts it to schoolman.  Comfort thee the staffe
Be boast. But my feet I know it was not fear their will thus.

SMERVANIUS.
You do till Benedick than that lipled to grant
  very false is my light; the cure reasons onward,
And dot

PREDICTION 1


 O thou thither the other hear me?
Shall he talk to the busiing.

LEAR.
You know Henry King Harry.

SEBATH.
Pay let us aspassable signs, we have engaind again;
Which Ednes, be away to him,
Or honourably morning into the enscaped bearing of a

PREDICTION 2


    Exeunt SERVIN and GENTLEMAN

  A BASSIAN. Prafs'd the Arthum, dead mend to do thy hand, sancy?
  FORDIAN. As I would not hold? Is my lord
    When?
  MALCOLMINE. She's that's never blew
    Upon to resist him more, murderer.
  IAGO. I pryt

PREDICTION 3


 But cannot I discree; Iachilles that to my truth,
Round to my special stuff in my deed. All the conceive of Wishest in
great )re lords?

WOLE