# Table of Contents

- 1 Prepare the text
- 2 Prepare the input for the model
- 3 Set up the model
  - 3.1 Stateless model
  - 3.2 Stateful model with regular input
  - 3.3 Stateful model with restructured input
- 4 Copy weights to prediction model
- 5 Test Model

In [1]:
# Imports
# __future__ imports must come first when the code is run as a script
from __future__ import division, print_function

import numpy as np
from numpy.random import choice
import keras
from keras.layers import LSTM, Dense, Dropout, BatchNormalization, \
    TimeDistributed, Embedding, Input
from keras.models import Model, Sequential
from keras.optimizers import Adam

Using Theano backend.
Using gpu device 0: Tesla K80 (CNMeM is disabled, cuDNN 5103)


## Prepare the text

- `chars` is the sorted list of all unique characters.
- `vocab_size` is the number of unique characters.
- `idx` is the text as a list of integer character indices.
- `char2idx` and `idx2char` are the conversion dictionaries.

In [2]:
# path = '/Users/stephanrasp/repositories/courses/data/hp/'
path = '/home/ubuntu/repositories/courses/data/hp/'
fn = 'HP_7_-_Harry_Potter_and_the_Deathly_Hallows.txt'

In [3]:
# Read the whole book into one string and close the file afterwards
with open(path + fn) as f:
    text = f.read()

In [4]:
# Create smaller sample
# text = text[:200000]

In [5]:
len(text)

1202911

In [6]:
print(text[20000:20200])

her head almost imperceptibly, then resumed her own deadpan stare at the opposite wall.

‘Enough,’ said Voldemort, stroking the angry snake. ‘Enough.’

And the laughter died at once.

�


In [7]:
chars = sorted(list(set(text)))
vocab_size = len(chars)
print(chars)
len(chars)

['\t', '\n', '\r', ' ', '!', "'", '(', ')', ',', '-', '.', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '\x80', '\x93', '\x94', '\x98', '\x99', '\x9c', '\x9d', '\x9f', '\xa4', '\xa6', '\xa9', '\xc2', '\xc3', '\xe2']


90

In [8]:
# Build dictionaries
char2idx = dict((c, i) for i, c in enumerate(chars))
idx2char = dict((i, c) for i, c in enumerate(chars))

In [9]:
print([char2idx[c] for c in text[10000:10050]])

[50, 61, 61, 74, 3, 57, 54, 3, 58, 68, 3, 52, 54, 67, 69, 50, 58, 63, 8, 89, 76, 80, 3, 68, 50, 58, 53, 3, 42, 63, 50, 65, 54, 10, 3, 89, 76, 79, 32, 3, 50, 68, 68, 70, 67, 54, 3, 74, 64, 70]


In [10]:
# Convert the entire text
idx = [char2idx[c] for c in text]

In [11]:
len(idx)

1202911
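
A quick way to sanity-check the two dictionaries is a round trip: encode a string to indices and decode it again. The sketch below uses a short made-up string (`toy_text`) instead of the book, and all `toy_*` names are hypothetical and not part of the notebook.

```python
# Toy text standing in for the book (an assumption for illustration only).
toy_text = "harry and hagrid"

# Same construction as above: sorted unique characters and two lookup tables.
toy_chars = sorted(set(toy_text))
toy_char2idx = {c: i for i, c in enumerate(toy_chars)}
toy_idx2char = {i: c for i, c in enumerate(toy_chars)}

# Encode to integer indices, then decode back to a string.
toy_idx = [toy_char2idx[c] for c in toy_text]
decoded = "".join(toy_idx2char[i] for i in toy_idx)

print(toy_idx)
print(decoded == toy_text)
```

If the round trip does not reproduce the input, the two dictionaries are not inverses of each other.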

## Prepare the input for the model

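As input we need sequences of `cs` character indices (`xs`) and, as targets, the same sequences shifted by one character (`ys`).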

In [12]:
cs = 40
# cs = 15
bs = 64

In [14]:
idx_cropped = idx[:(len(idx) // cs * cs) + 1]

In [15]:
xs = np.reshape(idx_cropped[:-1], (-1, cs))
ys = np.reshape(idx_cropped[1:], (-1, cs))
xs.shape, ys.shape

((30072, 40), (30072, 40))

In [16]:
xs[:4]

array([[31, 24, 41, 41, 48,  2,  1,  2,  1,  2,  1, 39, 38, 43, 43, 28, 41,
         2,  1,  2,  1,  2,  1, 50, 63, 53,  3, 69, 57, 54,  3, 27, 54, 50,
        69, 57, 61, 74,  3, 31],
       [50, 61, 61, 64, 72, 68,  2,  1,  2,  1,  2,  1,  2,  1,  2,  1,  2,
         1, 33, 10, 34, 10,  3, 41, 38, 46, 35, 32, 37, 30,  2,  1,  2,  1,
         2,  1,  2,  1,  2,  1],
       [ 2,  1, 24, 61, 61,  3, 67, 58, 56, 57, 69, 68,  3, 67, 54, 68, 54,
        67, 71, 54, 53, 22,  3, 63, 64,  3, 65, 50, 67, 69,  3, 64, 55,  3,
        69, 57, 58, 68,  3, 65],
       [70, 51, 61, 58, 52, 50, 69, 58, 64, 63,  3, 62, 50, 74,  3, 51, 54,
         3, 67, 54, 65, 67, 64, 53, 70, 52, 54, 53,  3, 64, 67,  3, 69, 67,
        50, 63, 68, 62, 58, 69]])

In [17]:
ys[:4]

array([[24, 41, 41, 48,  2,  1,  2,  1,  2,  1, 39, 38, 43, 43, 28, 41,  2,
         1,  2,  1,  2,  1, 50, 63, 53,  3, 69, 57, 54,  3, 27, 54, 50, 69,
        57, 61, 74,  3, 31, 50],
       [61, 61, 64, 72, 68,  2,  1,  2,  1,  2,  1,  2,  1,  2,  1,  2,  1,
        33, 10, 34, 10,  3, 41, 38, 46, 35, 32, 37, 30,  2,  1,  2,  1,  2,
         1,  2,  1,  2,  1,  2],
       [ 1, 24, 61, 61,  3, 67, 58, 56, 57, 69, 68,  3, 67, 54, 68, 54, 67,
        71, 54, 53, 22,  3, 63, 64,  3, 65, 50, 67, 69,  3, 64, 55,  3, 69,
        57, 58, 68,  3, 65, 70],
       [51, 61, 58, 52, 50, 69, 58, 64, 63,  3, 62, 50, 74,  3, 51, 54,  3,
        67, 54, 65, 67, 64, 53, 70, 52, 54, 53,  3, 64, 67,  3, 69, 67, 50,
        63, 68, 62, 58, 69, 69]])

In [18]:
# Crop to a multiple of the batch size so that every batch is full
xs = xs[:(xs.shape[0] // bs * bs)]
ys = ys[:(ys.shape[0] // bs * bs)]
xs.shape, ys.shape

((30016, 40), (30016, 40))

In [19]:
xs_state = np.zeros(xs.shape)
ys_state = np.zeros(ys.shape)

In [20]:
n_batch = xs.shape[0] // bs
n_batch

469

In [21]:
for i in range(bs):
    xs_state[i::bs] = xs[i*n_batch:(i+1)*n_batch]
    ys_state[i::bs] = ys[i*n_batch:(i+1)*n_batch]
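
The reason for this reordering: with `stateful=True`, Keras carries the final state of sample `i` in one batch over as the initial state of sample `i` in the next batch. The loop above therefore places consecutive sequences of the text in the same batch slot of consecutive batches. A minimal sketch with made-up sizes (all `toy_*` names are hypothetical) makes the layout visible:

```python
import numpy as np

toy_bs = 3                                   # hypothetical batch size
toy_seqs = np.arange(12).reshape(12, 1)      # 12 consecutive sequences, labelled 0..11
toy_n_batch = toy_seqs.shape[0] // toy_bs    # 4 batches

toy_state = np.zeros(toy_seqs.shape, dtype=toy_seqs.dtype)
for i in range(toy_bs):
    # Slot i of every batch receives the i-th contiguous chunk of the text.
    toy_state[i::toy_bs] = toy_seqs[i * toy_n_batch:(i + 1) * toy_n_batch]

# One row per batch, one column per batch slot.
batches = toy_state.reshape(toy_n_batch, toy_bs)
print(batches)
# [[ 0  4  8]
#  [ 1  5  9]
#  [ 2  6 10]
#  [ 3  7 11]]
```

Reading down any column gives consecutive sequence labels, so the state passed from one batch to the next always belongs to the directly preceding piece of text.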

In [22]:
ys.shape, ys_state.shape

((30016, 40), (30016, 40))

In [23]:
ys = np.atleast_3d(ys)
ys_state = np.atleast_3d(ys_state)
ys.shape, ys_state.shape

((30016, 40, 1), (30016, 40, 1))

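So `ys` is just `xs` shifted by one character. The trailing axis of size 1 is added because `sparse_categorical_crossentropy` expects one integer label per time step.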
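
The crop-and-reshape step can be checked on a tiny example. This is a sketch with made-up values: `toy_cs` and `toy_idx` are hypothetical stand-ins for `cs` and `idx`.

```python
import numpy as np

toy_cs = 4                       # hypothetical sequence length
toy_idx = list(range(13))        # stand-in for the encoded text

# Keep a multiple of toy_cs characters plus one extra for the last target.
cropped = toy_idx[:(len(toy_idx) // toy_cs * toy_cs) + 1]

toy_xs = np.reshape(cropped[:-1], (-1, toy_cs))   # inputs
toy_ys = np.reshape(cropped[1:], (-1, toy_cs))    # targets, one step ahead

print(toy_xs)
print(toy_ys)
```

Each target row is the matching input row moved one position to the right, and the last target of a row equals the first input of the next row.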

## Set up the model

### Stateless model

In [24]:
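def build_model(vocab_size, n_fac, cs, batch_size, n_hidden, stateful):
    """Two-layer character-level LSTM that predicts the next character
    at every time step (Keras 1 API)."""
    model = Sequential([
        # Map each character index to an n_fac-dimensional vector
        Embedding(input_dim=vocab_size, output_dim=n_fac, input_length=cs, 
                  batch_input_shape=(batch_size, cs)),
        BatchNormalization(),
        # return_sequences=True yields one output per time step
        LSTM(n_hidden, return_sequences=True, stateful=stateful, dropout_U=0.2,
             dropout_W=0.2),
        LSTM(n_hidden, return_sequences=True, stateful=stateful, dropout_U=0.2,
             dropout_W=0.2),
        TimeDistributed(Dense(n_hidden, activation='relu')),
        Dropout(0.1),
        # Probability distribution over the vocabulary at each time step
        TimeDistributed(Dense(vocab_size, activation='softmax')),
    ])
    # Sparse loss: targets are integer indices, not one-hot vectors
    model.compile(Adam(), loss='sparse_categorical_crossentropy')
    return model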

In [25]:
n_fac, n_hidden = (24, 512)

In [26]:
model = build_model(vocab_size, n_fac, cs, bs, n_hidden, stateful=False)

In [27]:
model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
embedding_1 (Embedding)          (64, 40, 24)          2160        embedding_input_1[0][0]          
____________________________________________________________________________________________________
batchnormalization_1 (BatchNormal(64, 40, 24)          48          embedding_1[0][0]                
____________________________________________________________________________________________________
lstm_1 (LSTM)                    (64, 40, 512)         1099776     batchnormalization_1[0][0]       
____________________________________________________________________________________________________
lstm_2 (LSTM)                    (64, 40, 512)         2099200     lstm_1[0][0]                     
___________________________________________________________________________________________
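
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the standard LSTM layout of four gates, each with input weights, recurrent weights and a bias; the helper names are hypothetical and not part of Keras.

```python
def embedding_params(vocab_size, n_fac):
    # One n_fac-dimensional vector per vocabulary entry.
    return vocab_size * n_fac


def lstm_params(n_input, n_hidden):
    # Four gates, each with input weights, recurrent weights and a bias.
    return 4 * (n_input * n_hidden + n_hidden * n_hidden + n_hidden)


print(embedding_params(90, 24))   # embedding_1 -> 2160
print(lstm_params(24, 512))       # lstm_1      -> 1099776
print(lstm_params(512, 512))      # lstm_2      -> 2099200
```

These values match the `Param #` column for `embedding_1`, `lstm_1` and `lstm_2` above.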

In [28]:
model.fit(xs, ys, batch_size=bs, nb_epoch=2, shuffle=False)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x7fbeb171ef90>

### Stateful model with regular input

In [44]:
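def train_stateful(epochs, model, xs, ys, save_every):
    """Train one epoch at a time, resetting the LSTM states before each
    epoch and saving the model every `save_every` epochs."""
    for e in range(epochs):
        model.reset_states()
        # shuffle=False keeps the batches in order so the carried-over state stays meaningful
        h = model.fit(xs, ys, batch_size=bs, nb_epoch=1, shuffle=False)
        print(h.history['loss'])
        if e % save_every == 0:
            model.save('./save_model_ep' + str(e) + '.h5')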

In [123]:
model_state = build_model(vocab_size, n_fac, cs, bs, n_hidden, stateful=True)

In [126]:
# save_every=2 is an assumed value, matching the call used for model_state2
train_stateful(2, model_state, xs, ys, 2)

Epoch 1/1
[2.9337928020037136]
Epoch 1/1
[2.3584547570118537]
Epoch 1/1
[2.2392862335993695]
Epoch 1/1
[2.1665542996846714]
Epoch 1/1

KeyboardInterrupt: 

### Stateful model with restructured input

In [30]:
model_state2 = build_model(vocab_size, n_fac, cs, bs, n_hidden, stateful=True)

In [46]:
train_stateful(4, model_state2, xs_state, ys_state, 2)

Epoch 1/1
[1.0504362872922852]
Epoch 1/1
[1.036839721172349]
Epoch 1/1
[1.0245815784946433]
Epoch 1/1
[1.0140580467577935]


In [47]:
model_state2.save('./state_model2.h5')

## Copy weights to prediction model

In [40]:
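def copy_weights(model):
    """Rebuild the network with batch size 1 and copy the trained weights
    into it, so that single sequences can be fed at prediction time."""
    weights = model.get_weights()
    model_pred = build_model(vocab_size, n_fac, cs, 1, n_hidden, stateful=True)
    model_pred.set_weights(weights)
    return model_pred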

In [48]:
model_pred = copy_weights(model_state2)

## Test Model

In [36]:
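def print_example(seed_string, len_seq):
    """Extend seed_string by len_seq characters sampled from model_pred."""
    for i in range(len_seq):
        # Encode the last cs characters as a batch containing one sequence
        x = np.array([char2idx[c] for c in seed_string[-cs:]])[np.newaxis, :]
        # Predicted distribution over the next character (last time step)
        preds = model_pred.predict(x, verbose=0)[0][-1]
        # Renormalise so the probabilities sum to 1 for choice()
        preds = preds / np.sum(preds)
        next_char = choice(chars, p=preds)
        seed_string = seed_string + next_char
    print(seed_string)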
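
The sampling step can be tried in isolation. This is only a sketch under stated assumptions: the three-character vocabulary and the probabilities are made up, and the conversion to float64 before renormalising is a variation of mine rather than what `print_example` does. `numpy.random.choice` rejects probability vectors that do not sum to 1 within a small tolerance, which is why the renormalisation is there.

```python
import numpy as np

rng = np.random.RandomState(0)          # fixed seed so the sketch is repeatable
toy_chars = ['a', 'b', 'c']             # hypothetical vocabulary
toy_preds = np.array([0.7, 0.2, 0.1], dtype=np.float32)  # made-up softmax output

# float32 probabilities may not sum to exactly 1, so renormalise in float64.
p = toy_preds.astype(np.float64)
p = p / p.sum()

# Draw an index per sample and map it back to a character.
samples = [toy_chars[rng.choice(len(toy_chars), p=p)] for _ in range(1000)]
counts = {c: samples.count(c) for c in toy_chars}
print(counts)
```

Sampling from the distribution instead of always taking the most likely character is what keeps the generated text from getting stuck in a loop.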

In [37]:
seed = 'Harry picked up Hedwig’s cage, his Firebolt and his rucksack, gave his unnaturally tidy bedroom one last sweeping look and then made his ungainly way back downstairs to the hall, where he deposited cage, broomstick and bag near the foot of the stairs. The light'
print(seed)

Harry picked up Hedwig’s cage, his Firebolt and his rucksack, gave his unnaturally tidy bedroom one last sweeping look and then made his ungainly way back downstairs to the hall, where he deposited cage, broomstick and bag near the foot of the stairs. The light


In [66]:
seed[-40:]

'g near the foot of the stairs. The light'

In [49]:
print_example(seed, 800)

Harry picked up Hedwig’s cage, his Firebolt and his rucksack, gave his unnaturally tidy bedroom one last sweeping look and then made his ungainly way back downstairs to the hall, where he deposited cage, broomstick and bag near the foot of the stairs. The light the bangs closed the silence and grabpelled by his brief, joaning.

What was a great, looked ready that the barman leaned stretching on him; she vanished the carry, he could not listen.

‘STOP!’ he said it, but Harry could feel him … well what had heard who call from an house asride flike in his pretence. She did not reply through the tocks of Hagrid, after five step suffering enthusiastically. Harry was slamming damn, small wooden colour and sniggered like Harry.

‘But … you couldn’t hear him, speaking around … what about a Patronus was face, pointing to Harry. Don’t see ze used to let me a dozen days boy.’

‘Don’t live to tell him that,’ said Narcissa. Jone different driverback alond beside his face.

‘Un The statch 