# Generating Shakespeare

## Setup

We're going to download the collected plays of Shakespeare to use as our data.

Source: http://www.gutenberg.org/cache/epub/100/pg100.txt

The original source was preprocessed to remove the sonnets and the non-Shakespearean text added by Project Gutenberg.

In [1]:
import numpy as np

In [2]:
import os

BASE_DIR = os.getcwd()
DATA_DIR = BASE_DIR + '/data/shakespeare/'

In [3]:
model_path = DATA_DIR + 'models/'
if not os.path.exists(model_path): os.mkdir(model_path)

In [4]:
data = DATA_DIR + 'gutenberg_shakespeare_modified.txt' # preprocessed

with open(data, 'r') as f:
    text = f.read()
print('corpus length:', len(text))

('corpus length:', 5291227)


In [5]:
chars = sorted(list(set(text)))
vocab_size = len(chars)+1
print('total chars:', vocab_size)

('total chars:', 88)


Sometimes it's useful to have a zero value in the dataset, e.g. for padding

In [6]:
chars.insert(0, "\0")

In [7]:
''.join(chars)

'\x00\n\r !"&\'(),-.0123456789:;<?ABCDEFGHIJKLMNOPQRSTUVWXYZ[]_`abcdefghijklmnopqrstuvwxyz|}\xbb\xbf\xef'

Map chars to indices and vice versa

In [8]:
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))
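
To see what these two dictionaries give us, here is a minimal sketch on a made-up string. The `toy_*` names are illustrative and not part of the notebook. It reserves index 0 for the padding character, encodes the string to indices, and decodes it back.

```python
# Toy stand-in for the corpus (hypothetical, for illustration only)
toy_text = "to be or not to be"

# Sorted unique characters, with "\0" reserved at index 0 for padding
toy_chars = sorted(set(toy_text))
toy_chars.insert(0, "\0")

toy_char_indices = dict((c, i) for i, c in enumerate(toy_chars))
toy_indices_char = dict((i, c) for i, c in enumerate(toy_chars))

# Encode to indices, then decode back to the original string
toy_idx = [toy_char_indices[c] for c in toy_text]
decoded = ''.join(toy_indices_char[i] for i in toy_idx)

print(toy_idx[:5])
print(decoded == toy_text)
```

Because the padding character never occurs in the text itself, index 0 never shows up in the encoded sequence.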

In [9]:
print(char_indices)

{'\x00': 0, ' ': 3, '(': 8, ',': 10, '0': 13, '4': 17, '8': 21, '\xbb': 85, '<': 25, '\xbf': 86, 'D': 30, 'H': 34, 'L': 38, 'P': 42, 'T': 46, 'X': 50, '`': 56, 'd': 60, 'h': 64, 'l': 68, '\xef': 87, 'p': 72, 't': 76, 'x': 80, '|': 83, "'": 7, '3': 16, '7': 20, ';': 24, '?': 26, 'C': 29, 'G': 33, 'K': 37, 'O': 41, 'S': 45, 'W': 49, '[': 53, '_': 55, 'c': 59, 'g': 63, 'k': 67, 'o': 71, 's': 75, 'w': 79, '\n': 1, '"': 5, '&': 6, '.': 12, '2': 15, '6': 19, ':': 23, 'B': 28, 'F': 32, 'J': 36, 'N': 40, 'R': 44, 'V': 48, 'Z': 52, 'b': 58, 'f': 62, 'j': 66, 'n': 70, 'r': 74, 'v': 78, 'z': 82, '\r': 2, '!': 4, ')': 9, '-': 11, '1': 14, '5': 18, '9': 22, 'A': 27, 'E': 31, 'I': 35, 'M': 39, 'Q': 43, 'U': 47, 'Y': 51, ']': 54, 'a': 57, 'e': 61, 'i': 65, 'm': 69, 'q': 73, 'u': 77, 'y': 81, '}': 84}


*idx* converts the Shakespearean text to character indices (based on the *char_indices* mapping above)

In [10]:
idx = [char_indices[c] for c in text]

In [11]:
print(idx[:70])

[87, 85, 86, 45, 29, 31, 40, 31, 23, 2, 1, 44, 71, 77, 75, 65, 68, 68, 71, 70, 24, 3, 42, 57, 74, 65, 75, 24, 3, 32, 68, 71, 74, 61, 70, 59, 61, 24, 3, 39, 57, 74, 75, 61, 65, 68, 68, 61, 75, 2, 1, 2, 1, 2, 1, 27, 29, 46, 3, 35, 12, 3, 45, 29, 31, 40, 31, 3, 14, 12]


In [12]:
''.join(indices_char[i] for i in idx[:70])

'\xef\xbb\xbfSCENE:\r\nRousillon; Paris; Florence; Marseilles\r\n\r\n\r\nACT I. SCENE 1.'

## 3 char model

### GLOBALS needed from this point on

In [13]:
from keras.layers import Input, Embedding, LSTM, merge, SimpleRNN, TimeDistributed
from keras.layers.core import Dense, Flatten
from keras.models import Model, Sequential
from keras.optimizers import Adam
from keras.layers.normalization import BatchNormalization

Using Theano backend.
Using gpu device 0: Tesla K80 (CNMeM is disabled, cuDNN 5103)


In [14]:
n_fac = 42 # number of latent factors (size of embedding matrix)
n_hidden = 256 # hyperparameter: size of hidden state

### Create inputs

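Create four lists, each taking every 3rd character (the step is `nc = 3`), starting at the 0th, 1st, 2nd, and 3rd characters respectively. The first three lists are the inputs; the fourth holds the labels.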

In [None]:
nc = 3 # num chars
c1_dat = [idx[i] for i in xrange(0, len(idx)-1-nc, nc)]
c2_dat = [idx[i+1] for i in xrange(0, len(idx)-1-nc, nc)]
c3_dat = [idx[i+2] for i in xrange(0, len(idx)-1-nc, nc)]
c4_dat = [idx[i+3] for i in xrange(0, len(idx)-1-nc, nc)]
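
The slicing is easier to follow on a short list. A minimal sketch, where `toy_idx` is a made-up stand-in for `idx` and `range` is used in place of `xrange` so that it also runs on Python 3:

```python
nc = 3  # number of input characters per example
toy_idx = list(range(10, 23))  # 13 made-up "character indices": 10..22

starts = range(0, len(toy_idx) - 1 - nc, nc)  # start positions 0, 3, 6
c1 = [toy_idx[i] for i in starts]
c2 = [toy_idx[i + 1] for i in starts]
c3 = [toy_idx[i + 2] for i in starts]
c4 = [toy_idx[i + 3] for i in starts]  # label: the char after the 3 inputs

for example in zip(c1, c2, c3, c4):
    print(example)
```

Each printed tuple is one training example: three consecutive inputs followed by their label. Since the step equals `nc`, the label of one example is also the first input of the next.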

In [None]:
0, len(idx)-1-nc, nc

In [None]:
len(c1_dat), len(c4_dat)

Our inputs

In [None]:
x1 = np.stack(c1_dat)
x2 = np.stack(c2_dat)
x3 = np.stack(c3_dat)

Our output

In [None]:
y = np.stack(c4_dat)

In [None]:
x1.shape, y.shape

Create inputs and embedding outputs for each of our 3 character inputs

In [None]:
def embedding_input(name, n_in, n_out):
    inp = Input(shape=(1,), dtype='int64', name=name+'_in')
    emb = Embedding(n_in, n_out, input_length=1, name=name+'_emb')(inp)
    return inp, Flatten()(emb)

In [None]:
c1_in, c1_emb = embedding_input('c1', vocab_size, n_fac)
c2_in, c2_emb = embedding_input('c2', vocab_size, n_fac)
c3_in, c3_emb = embedding_input('c3', vocab_size, n_fac)
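
Conceptually, an embedding layer is a trainable lookup table: each character index selects one row of a `(vocab_size, n_fac)` matrix, and `Flatten` then removes the length-1 sequence axis. A rough NumPy sketch of that lookup, with a random matrix standing in for learned weights (the `toy_*` names and sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.RandomState(0)
toy_vocab_size, toy_n_fac = 5, 3

# Stand-in for the embedding weights (random here, learned in the real model)
emb_matrix = rng.normal(size=(toy_vocab_size, toy_n_fac))

batch = np.array([[2], [0], [4]])          # shape (3, 1): one char index per example
looked_up = emb_matrix[batch]              # shape (3, 1, 3), like the Embedding output
flat = looked_up.reshape(len(batch), -1)   # shape (3, 3), like Flatten()

print(looked_up.shape)
print(flat.shape)
```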

### Create and train model

![3char](./3char.png)

`dense_in` is the 'green arrow' in the diagram - the layer operation from input to hidden

In [None]:
dense_in = Dense(n_hidden, activation='relu')

Our first hidden activation is simply this function applied to the result of the embedding of the first character.

In [None]:
c1_hidden = dense_in(c1_emb)

`dense_hidden` is the 'orange arrow' from our diagram - the layer operation from hidden to hidden

_Note:_ unsure why the activation for this is `tanh`

In [None]:
dense_hidden = Dense(n_hidden, activation='tanh')

Our second and third hidden activations add the previous hidden state (after applying `dense_hidden`) to the new input (after applying `dense_in`). In Keras 1, `merge` defaults to an element-wise sum.

In [None]:
# merge([new input state, orange arrow from previous hidden state])
c2_hidden = merge([dense_in(c2_emb), dense_hidden(c1_hidden)])
c3_hidden = merge([dense_in(c3_emb), dense_hidden(c2_hidden)])

`dense_out` is the 'blue arrow' from our diagram - the layer operation from hidden to output

In [None]:
dense_out = Dense(vocab_size, activation='softmax')

The third hidden state is the input to our output layer

In [None]:
c4_out = dense_out(c3_hidden)
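
Putting the three arrows together, the forward pass can be written out by hand. This is a NumPy sketch under stated assumptions: the weights are random rather than trained, the sizes are tiny, biases are omitted, and the merge is taken to be an element-wise sum. All names here (`W_in`, `W_hidden`, `W_out`, `n_f`, `n_h`, `n_v`) are illustrative.

```python
import numpy as np

rng = np.random.RandomState(0)
n_f, n_h, n_v = 4, 6, 5  # tiny illustrative sizes, not the notebook's values

W_in = rng.normal(size=(n_f, n_h))       # 'green arrow': input to hidden
W_hidden = rng.normal(size=(n_h, n_h))   # 'orange arrow': hidden to hidden
W_out = rng.normal(size=(n_h, n_v))      # 'blue arrow': hidden to output

def relu(x):
    return np.maximum(x, 0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Stand-ins for the embeddings of the three input characters
e1, e2, e3 = rng.normal(size=(3, n_f))

h1 = relu(e1.dot(W_in))
h2 = relu(e2.dot(W_in)) + np.tanh(h1.dot(W_hidden))
h3 = relu(e3.dot(W_in)) + np.tanh(h2.dot(W_hidden))
probs = softmax(h3.dot(W_out))  # distribution over the 4th character

print(probs.shape)
```

Note that the same `W_in` and `W_hidden` are reused at every step, which mirrors how the single `dense_in` and `dense_hidden` layers are shared above.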

In [None]:
model = Model([c1_in, c2_in, c3_in], c4_out)
model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())
model.optimizer.lr=0.000001

In [None]:
model.summary()

In [None]:
model.fit([x1, x2, x3], y, batch_size=64, nb_epoch=4)

In [None]:
model.optimizer.lr=0.01

In [None]:
model.fit([x1, x2, x3], y, batch_size=64, nb_epoch=4)

In [None]:
model.optimizer.lr=0.000001

In [None]:
model.fit([x1, x2, x3], y, batch_size=64, nb_epoch=4)

In [None]:
model.optimizer.lr=0.01

In [None]:
model.fit([x1, x2, x3], y, batch_size=64, nb_epoch=4)

In [None]:
model.fit([x1, x2, x3], y, batch_size=64, nb_epoch=10)

In [None]:
model.fit([x1, x2, x3], y, batch_size=64, nb_epoch=10)

In [None]:
model.fit([x1, x2, x3], y, batch_size=64, nb_epoch=10)

In [None]:
model.fit([x1, x2, x3], y, batch_size=64, nb_epoch=10)

Let's save the model.

In [None]:
save1_path = model_path + 'save1.h5'
if not os.path.exists(save1_path):
    model.save_weights(save1_path)
model.load_weights(save1_path)

### Test Model

"`newaxis` is used to increase the dimension of the existing array by one more dimension, when used once" - [source](https://stackoverflow.com/questions/29241056/the-use-of-numpy-newaxis)

In [None]:
def get_next(m, inp):
    idxs = [char_indices[c] for c in inp]
    arrs = [np.array(i)[np.newaxis] for i in idxs]
    p = m.predict(arrs)
    i = np.argmax(p)
    return chars[i]
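
`m.predict` is given one array per input here, each with a leading batch axis. A small sketch of what `np.newaxis` does to a scalar index and to a 1-d array of indices:

```python
import numpy as np

scalar = np.array(7)            # 0-d array, shape ()
batched = scalar[np.newaxis]    # shape (1,): a batch holding one index

row = np.array([3, 1, 4])       # shape (3,)
as_batch = row[np.newaxis, :]   # shape (1, 3): one sequence of three indices

print([scalar.shape, batched.shape, row.shape, as_batch.shape])
# → [(), (1,), (3,), (1, 3)]
```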

In [None]:
get_next(model, 'phi')

In [None]:
get_next(model, ' th')

In [None]:
get_next(model, ' an')

## Our first RNN!

### GLOBALS needed from this point on

In [15]:
nc = 8 # numChars == size of our unrolled RNN

Later sections reuse `xs` (built from `c_in_dat`), `y` (built from `c_out_dat`), and `cs` (built with `embedding_input()`).

### Create inputs

Now let's try predicting char 9 using chars 1-8.

For each of 0 through 7, create a list of every 8th character with that starting point. These will be the 8 inputs to our model.

In [16]:
c_in_dat = [[idx[i+n] for i in xrange(0, len(idx)-1-nc, nc)]
           for n in range(nc)]

Then create a list of the next character in each of these series. This will be the labels for our model.

In [17]:
c_out_dat = [idx[i+nc] for i in xrange(0, len(idx)-1-nc, nc)]

In [18]:
xs = [np.stack(c) for c in c_in_dat]

In [19]:
len(xs), xs[0].shape

(8, (661403,))

In [20]:
y = np.stack(c_out_dat)

So each column below is one series of 8 characters from the text:

In [21]:
[xs[n][:nc] for n in range(nc)]

[array([87, 23, 68, 74, 74, 57, 75, 29]),
 array([85,  2, 68, 65, 61, 74,  2, 46]),
 array([86,  1, 71, 75, 70, 75,  1,  3]),
 array([45, 44, 70, 24, 59, 61,  2, 35]),
 array([29, 71, 24,  3, 61, 65,  1, 12]),
 array([31, 77,  3, 32, 24, 68,  2,  3]),
 array([40, 75, 42, 68,  3, 68,  1, 45]),
 array([31, 65, 57, 71, 39, 61, 27, 29])]

...and this is the next character after each sequence:

In [22]:
y[:nc]

array([23, 68, 74, 74, 57, 75, 29, 31])
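
The row/column layout can be checked on a short string. A minimal sketch that uses characters instead of indices for readability; `toy_text` and the window size of 4 are illustrative choices:

```python
nc = 4  # illustrative window size (the notebook uses 8)
toy_text = "the quick brown fox"

# Row n holds the (n+1)-th character of every window, as in c_in_dat
c_in = [[toy_text[i + n] for i in range(0, len(toy_text) - 1 - nc, nc)]
        for n in range(nc)]
# The character that follows each window, as in c_out_dat
c_out = [toy_text[i + nc] for i in range(0, len(toy_text) - 1 - nc, nc)]

# Reading down a column (i.e. zipping the rows) recovers each window
windows = [''.join(col) for col in zip(*c_in)]
for w, nxt in zip(windows, c_out):
    print('%r -> %r' % (w, nxt))
```

Each printed line pairs one window of `nc` consecutive characters with the character that follows it.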

### Create and train model

In [23]:
def embedding_input(name, n_in, n_out):
    inp = Input(shape=(1,), dtype='int64', name=name+'_in')
    emb = Embedding(n_in, n_out, input_length=1, name=name+'_emb')(inp)
    return inp, Flatten()(emb)

In [24]:
cs = [embedding_input('c'+str(n), vocab_size, n_fac) for n in range(nc)]

"I'd suggest trying the trick I mentioned in the lesson for simple RNNs: using an identity matrix to initialize your hidden state, and use relu instead of tanh." - [Jeremy on forums](http://forums.fast.ai/t/purpose-of-rnns-and-theano/242/5)

In [None]:
dense_in = Dense(n_hidden, activation='relu')
dense_hidden = Dense(n_hidden, activation='relu', init='identity')
dense_out = Dense(vocab_size, activation='softmax')

The embedding of the first character of each sequence goes through `dense_in` to create our first hidden activations.

In [None]:
hidden = dense_in(cs[0][1])

Then for each successive layer, we combine the output of `dense_in` on the next character with the output of `dense_hidden` on the current hidden state to create the new hidden state.

In [None]:
for i in range(1, nc):
    dense = dense_in(cs[i][1])
    hidden = dense_hidden(hidden)
    hidden = merge([dense, hidden])

Putting the final hidden state through `dense_out` gives us our output.

In [None]:
out = dense_out(hidden)
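
One way to see why identity initialization pairs naturally with `relu`: at initialization, the hidden-to-hidden step leaves any non-negative state unchanged, so the state is neither shrunk nor amplified as it passes from step to step. A NumPy sketch, assuming zero biases and a tiny hidden size for illustration:

```python
import numpy as np

n_h = 4
W_hidden = np.eye(n_h)      # identity matrix as the initial hidden-to-hidden weights
b_hidden = np.zeros(n_h)    # assumption: zero bias

def relu(x):
    return np.maximum(x, 0)

# Hidden states in this model are sums of relu outputs, so they are >= 0
h = np.array([0.5, 0.0, 2.0, 1.5])

# Apply the hidden-to-hidden step 8 times with no new input
state = h
for _ in range(8):
    state = relu(state.dot(W_hidden) + b_hidden)

print(np.array_equal(state, h))
# → True
```

Training then moves the weights away from the identity; the sketch only describes the starting point.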

Now we can create our model.

In [None]:
model = Model([c[0] for c in cs], out)
model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())
model.summary()

In [None]:
model.fit(xs, y, batch_size=64, nb_epoch=12)

### Test Model

In [None]:
def get_next(m, inp):
    arrs = [np.array(char_indices[c])[np.newaxis] for c in inp]
    p = m.predict(arrs)
    return chars[np.argmax(p)]

In [None]:
get_next(model, 'for thos')

In [None]:
get_next(model, 'part of ')

In [None]:
get_next(model, 'queens a')

Here's a helper function that extends a starter sequence with generated characters until `k+1` more spaces have been produced, i.e. roughly `k` additional words after finishing any word already in progress.

In [None]:
def get_seq(m, inp, k):
    k_count = 0
    seq = inp
    while k_count < k+1:
        pc = get_next(m, inp)
        seq += pc
        inp = inp[1:] + pc
        if (pc == ' '):
            k_count += 1
    return seq
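
The sliding-window logic can be exercised without a trained model. In this sketch `toy_next` is a hypothetical stand-in for `get_next` that follows a fixed rule, and `toy_seq` repeats the loop from `get_seq`:

```python
def toy_next(window):
    # Hypothetical stand-in for get_next(): a fixed rule instead of a model.
    # After a space emit 'a', after 'a' emit 'b', otherwise emit a space.
    last = window[-1]
    if last == ' ':
        return 'a'
    if last == 'a':
        return 'b'
    return ' '

def toy_seq(inp, k):
    k_count = 0
    seq = inp
    while k_count < k + 1:
        pc = toy_next(inp)
        seq += pc
        inp = inp[1:] + pc   # slide the window: drop oldest char, append newest
        if pc == ' ':
            k_count += 1
    return seq

print(repr(toy_seq('queens a', 2)))
# → 'queens ab ab ab '
```

With `k = 2` the loop stops after three spaces: the first one closes the word that the starter left unfinished, and the next two close the two additional words.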

In [None]:
get_seq(model, 'queens a', 10)

In [None]:
get_seq(model, 'part of ', 10)

In [None]:
get_seq(model, 'for thos', 10)

The model currently seems to 'fixate' on the phrase "the some sore".

In [None]:
model.fit(xs, y, batch_size=64, nb_epoch=12)

In [None]:
get_seq(model, 'queens a', 10)

In [None]:
get_seq(model, 'part of ', 10)

In [None]:
get_seq(model, 'for thos', 10)

In [None]:
model.fit(xs, y, batch_size=64, nb_epoch=12)

In [None]:
get_seq(model, 'queens a', 10)

In [None]:
get_seq(model, 'part of ', 10)

In [None]:
get_seq(model, 'for thos', 10)

In [None]:
save2_path = model_path + 'save2.h5'
if not os.path.exists(save2_path):
    model.save_weights(save2_path)
model.load_weights(save2_path)

Different 'fixation' on the phrase "the best with"

## Our first RNN with keras!

This is nearly equivalent to the RNN we built ourselves in the previous section.

In [None]:
model = Sequential([
        Embedding(vocab_size, n_fac, input_length=nc),
        SimpleRNN(n_hidden, activation='relu', inner_init='identity'),
        Dense(vocab_size, activation='softmax')
    ])
model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())
model.summary()

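The `Sequential` model takes a single `(examples, nc)` array rather than a list of `nc` arrays, so the inputs are stacked below. This avoids `IndexError: axis 1 out of bounds [0, 1)`; see http://forums.fast.ai/t/lesson-6-discussion/245/70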

In [None]:
model.fit(np.concatenate([x[np.newaxis] for x in xs]).T, y, batch_size=64, nb_epoch=12)
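
The reshaping expression above turns a list of per-position arrays into one array with a row per example. A small NumPy sketch on made-up data (`toy_xs` is illustrative), which also shows that `np.stack(..., axis=1)` is an equivalent spelling:

```python
import numpy as np

# Three toy "position" arrays, each holding 4 examples (like xs, but smaller)
toy_xs = [np.array([1, 2, 3, 4]),
          np.array([5, 6, 7, 8]),
          np.array([9, 10, 11, 12])]

a = np.concatenate([x[np.newaxis] for x in toy_xs]).T  # same form as the fit call
b = np.stack(toy_xs, axis=1)                           # equivalent spelling

print(a.shape)
# → (4, 3)
print(np.array_equal(a, b))
# → True
```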

In [None]:
def get_next_keras(m, inp):
    idxs = [char_indices[c] for c in inp]
    arrs = np.array(idxs)[np.newaxis,:]
    p = m.predict(arrs)[0]
    return chars[np.argmax(p)]

In [None]:
def get_keras_seq(m, inp, k):
    k_count = 0
    seq = inp
    while k_count < k+1:
        pc = get_next_keras(m, inp)
        seq += pc
        inp = inp[1:] + pc
        if (pc == ' '):
            k_count += 1
    return seq

In [None]:
get_keras_seq(model, 'queens a', 10)

In [None]:
get_keras_seq(model, 'part of ', 10)

In [None]:
get_keras_seq(model, 'for thos', 10)

_Fixation_: "the sent"

In [None]:
model.fit(np.concatenate([x[np.newaxis] for x in xs]).T, y, batch_size=64, nb_epoch=12)

In [None]:
get_keras_seq(model, 'queens a', 10)

In [None]:
get_keras_seq(model, 'part of ', 10)

In [None]:
get_keras_seq(model, 'for thos', 10)

_Fixation_: "the serve me"

In [None]:
save3_path = model_path + 'save3.h5'
if not os.path.exists(save3_path):
    model.save_weights(save3_path)
model.load_weights(save3_path)

## Returning sequences

### GLOBALS needed from this point on

Later sections reuse `ys` (built from `c_out_dat`).

### Create inputs

To use a sequence model, we can leave our input unchanged - but we have to change our output to a sequence.

Here, `c_out_dat` is identical to `c_in_dat`, but moved across 1 character.

In [25]:
c_out_dat = [[idx[i+n] for i in xrange(1, len(idx)-nc, nc)]
            for n in range(nc)]

In [26]:
ys = [np.stack(c) for c in c_out_dat]

Reading down each column shows one set of inputs and outputs

In [27]:
[xs[n][:nc] for n in range(nc)]

[array([87, 23, 68, 74, 74, 57, 75, 29]),
 array([85,  2, 68, 65, 61, 74,  2, 46]),
 array([86,  1, 71, 75, 70, 75,  1,  3]),
 array([45, 44, 70, 24, 59, 61,  2, 35]),
 array([29, 71, 24,  3, 61, 65,  1, 12]),
 array([31, 77,  3, 32, 24, 68,  2,  3]),
 array([40, 75, 42, 68,  3, 68,  1, 45]),
 array([31, 65, 57, 71, 39, 61, 27, 29])]

In [28]:
[ys[n][:nc] for n in range(nc)]

[array([85,  2, 68, 65, 61, 74,  2, 46]),
 array([86,  1, 71, 75, 70, 75,  1,  3]),
 array([45, 44, 70, 24, 59, 61,  2, 35]),
 array([29, 71, 24,  3, 61, 65,  1, 12]),
 array([31, 77,  3, 32, 24, 68,  2,  3]),
 array([40, 75, 42, 68,  3, 68,  1, 45]),
 array([31, 65, 57, 71, 39, 61, 27, 29]),
 array([23, 68, 74, 74, 57, 75, 29, 31])]
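
The one-character shift is easy to confirm on made-up indices. A minimal sketch, with `toy_idx` and a window size of 4 as illustrative choices:

```python
nc = 4  # illustrative window size (the notebook uses 8)
toy_idx = list(range(100, 120))  # 20 made-up character indices

c_in = [[toy_idx[i + n] for i in range(0, len(toy_idx) - 1 - nc, nc)]
        for n in range(nc)]
c_out = [[toy_idx[i + n] for i in range(1, len(toy_idx) - nc, nc)]
         for n in range(nc)]

# First example: the inputs and the target at each time step
first_in = [row[0] for row in c_in]
first_out = [row[0] for row in c_out]
print('%s -> %s' % (first_in, first_out))
```

At every time step the target is simply the next character, so all but the last row of the outputs coincide with the inputs shifted up by one row.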

### Create and train model

In [None]:
dense_in = Dense(n_hidden, activation='relu')
dense_hidden = Dense(n_hidden, activation='relu', init='identity')
dense_out = Dense(vocab_size, activation='softmax', name='output')

We're going to pass a vector of all zeros as our starting point - here are our input layers for that:

In [None]:
inp1 = Input(shape=(n_fac,), name='zeros')
hidden = dense_in(inp1)

In [None]:
outs = []

for i in range(nc):
    dense = dense_in(cs[i][1])
    hidden = dense_hidden(hidden)
    hidden = merge([dense, hidden], mode='sum')
    # every layer now has an output
    outs.append(dense_out(hidden))

In [None]:
model = Model([inp1] + [c[0] for c in cs], outs)
model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())
model.summary()

In [None]:
zeros = np.tile(np.zeros(n_fac), (len(xs[0]), 1))
zeros.shape

In [None]:
model.fit([zeros]+xs, ys, batch_size=64, nb_epoch=12)

### Test model

In [None]:
def get_nexts(m, inp):
    idxs = [char_indices[c] for c in inp]
    arrs = [np.array(i)[np.newaxis] for i in idxs]
    p = m.predict([np.zeros(n_fac)[np.newaxis,:]] + arrs)
    print(list(inp))
    return [chars[np.argmax(o)] for o in p]

In [None]:
get_nexts(model, ' this is')

In [None]:
get_nexts(model, ' part of')

In [None]:
get_nexts(model, 'queens a')

### GLOBALS needed from this point on

In [29]:
xs[0].shape

(661403,)

In [30]:
x_rnn = np.stack(np.squeeze(xs), axis=1)
y_rnn = np.atleast_3d(np.stack(ys, axis=1))

In [31]:
x_rnn.shape, y_rnn.shape

((661403, 8), (661403, 8, 1))
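
The trailing axis of size 1 on `y_rnn` comes from `np.atleast_3d`. A small sketch of both reshaping steps on made-up data (`toy_xs` and `toy_ys` are illustrative):

```python
import numpy as np

toy_xs = [np.arange(5) + 10 * n for n in range(3)]  # 3 positions, 5 examples
toy_ys = [x + 1 for x in toy_xs]

x_toy = np.stack(toy_xs, axis=1)                  # (examples, time steps)
y_toy = np.atleast_3d(np.stack(toy_ys, axis=1))   # adds a trailing axis of size 1

print(x_toy.shape)
# → (5, 3)
print(y_toy.shape)
# → (5, 3, 1)
```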

### Sequence model with keras

To convert our previous keras model into a sequence model, simply add the `return_sequences=True` parameter, and add `TimeDistributed` around our dense layer.

In [None]:
model = Sequential([
        Embedding(vocab_size, n_fac, input_length=nc),
        SimpleRNN(n_hidden, return_sequences=True, activation='relu', inner_init='identity'),
        TimeDistributed(Dense(vocab_size, activation='softmax'))
    ])
model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())
model.summary()

In [None]:
model.fit(x_rnn, y_rnn, batch_size=64, nb_epoch=8)

In [None]:
model.fit(x_rnn, y_rnn, batch_size=64, nb_epoch=4)

In [None]:
def get_nexts_keras(m, inp):
    idxs = [char_indices[c] for c in inp]
    arrs = np.array(idxs)[np.newaxis,:]
    p = m.predict(arrs)[0]
    print(list(inp))
    return [chars[np.argmax(o)] for o in p]

In [None]:
get_nexts_keras(model, ' this is')

In [None]:
get_nexts_keras(model, ' part of')

In [None]:
get_nexts_keras(model, 'queens a')

In [None]:
save4_path = model_path + 'save4.h5'
if not os.path.exists(save4_path):
    model.save_weights(save4_path)
model.load_weights(save4_path)

## Stateful model with keras

In [32]:
bs = 64

In [33]:
model = Sequential([
        Embedding(vocab_size, n_fac, input_length=nc, batch_input_shape=(bs,nc)),
        BatchNormalization(),
        LSTM(n_hidden, return_sequences=True, stateful=True),
        TimeDistributed(Dense(vocab_size, activation='softmax'))
    ])

In [34]:
model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam())
model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
embedding_1 (Embedding)          (64, 8, 42)           3696        embedding_input_1[0][0]          
____________________________________________________________________________________________________
batchnormalization_1 (BatchNormal(64, 8, 42)           84          embedding_1[0][0]                
____________________________________________________________________________________________________
lstm_1 (LSTM)                    (64, 8, 256)          306176      batchnormalization_1[0][0]       
____________________________________________________________________________________________________
timedistributed_1 (TimeDistribute(64, 8, 88)           22616       lstm_1[0][0]                     
Total params: 332572
______________________________________________________________________

Since we're using a fixed batch shape, we have to ensure the number of input and output rows is an exact multiple of the batch size.

In [35]:
mx = len(x_rnn)//bs*bs
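
The arithmetic, worked through with the row count printed earlier (661403): integer division drops the remainder, and multiplying back gives the largest multiple of `bs` that fits.

```python
bs = 64
n_examples = 661403          # len(x_rnn), from the shape printed earlier

mx = n_examples // bs * bs   # largest multiple of bs that is <= n_examples
print('%d %d %d' % (mx, n_examples - mx, mx % bs))
# → 661376 27 0
```

So the final 27 rows are left out of training.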

In [36]:
model.fit(x_rnn[:mx], y_rnn[:mx], batch_size=bs, nb_epoch=4, shuffle=False)

Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4


<keras.callbacks.History at 0x7fddc1fae690>

In [37]:
model.optimizer.lr=1e-4

In [38]:
model.fit(x_rnn[:mx], y_rnn[:mx], batch_size=bs, nb_epoch=8, shuffle=False)

Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8


<keras.callbacks.History at 0x7fdd9c716f50>

In [65]:
def make_model(batch_size_override=None):
    if batch_size_override is None:
        batch_size_override = bs
    model = Sequential([
        Embedding(input_dim=vocab_size, output_dim=n_fac,
                  input_length=nc,
                  batch_input_shape=(batch_size_override, nc)),
        BatchNormalization(),
        LSTM(n_hidden, return_sequences=True, stateful=True),
        TimeDistributed(Dense(vocab_size, activation='softmax'))
    ])
    model.compile(loss="sparse_categorical_crossentropy", optimizer=Adam())
    return model




In [77]:
def print_example(m, seed, gen_length=320):
    pred_model = make_model(batch_size_override=1) # This is the important bit
    for layer, pred_layer in zip(m.layers, pred_model.layers):
        pred_layer.set_weights(layer.get_weights())
       
    output = seed
    for i in range(gen_length):
        text_fragment = [char_indices[c] for c in output[-nc:]]
        predict_batch = np.array(text_fragment)[np.newaxis,:]
        prediction = pred_model.predict(predict_batch, verbose=0, batch_size=1)[0][-1]
        prediction = prediction / np.sum(prediction)
        output += indices_char[np.random.choice(vocab_size, p=prediction)]
    print(output)
    return output
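
Unlike `get_next`, which always takes the `argmax`, `print_example` samples the next character from the predicted distribution. Sampling tends to avoid the repeated-phrase 'fixation' seen with greedy decoding. A NumPy sketch of the difference, using a made-up distribution `pred` over a four-character vocabulary (all names are illustrative, and the cast to `float64` is this sketch's choice, not the notebook's):

```python
import numpy as np

rng = np.random.RandomState(0)
toy_chars = ['a', 'b', 'c', 'd']
pred = np.array([0.1, 0.6, 0.2, 0.1], dtype='float32')  # stand-in for a softmax output

# Greedy decoding: the same character every time
greedy = [toy_chars[np.argmax(pred)] for _ in range(10)]

# Sampling: renormalize first, since float32 rounding can leave the sum
# slightly off 1, which the sampler would reject
p = pred.astype('float64')
p = p / p.sum()
sampled = [toy_chars[rng.choice(len(toy_chars), p=p)] for _ in range(10)]

print(''.join(greedy))
print(''.join(sampled))
```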

In [79]:
temp = print_example(model, ' this is')

 this is not abray,
     Enter POLYCUS. Be Princess is bounds give him,
    O, lik give
    weep  a follow and so folly foot a King. Bring me; melthe worths, my lorsun'd about his tigh!
    thousand; and ink look fellow'st to moft thy gracious pardon now? All you misthrice of ionce it h to hing the sare! What is he will be


In [80]:
temp = print_example(model, ' part of')

 part of peneitis go.
    If that do Prince so re Inchide]
    honesty
    make you.
  LEONTES. Shall
    As flight to your Highness much. Paul. Mated no are about fantain r world, shall supplia, upon think To
    No bateavy long must I am palace. We stabury? I can. Know you
    rown father!
  SERVANT. What I have!




In [81]:
temp = print_example(model, 'queens a')

queens and deadly chambery, my lord; 'tis cob
    it would it, as was. 'Not  grief
    Thou'll piglt sir; he she said the Pardon- band mile,
    comes. Pray I am not break of
    dell make this To me to be indeed, such and trel him and he must  curstrand and heris
  PAULINA. Aow,
    Even all joy his own mother's gentlem


In [82]:
model.optimizer.lr=0.01

In [83]:
model.fit(x_rnn[:mx], y_rnn[:mx], batch_size=bs, nb_epoch=8, shuffle=False)

Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8


<keras.callbacks.History at 0x7fdd8fc58610>

In [84]:
temp = print_example(model, ' this is')

 this is, not a lady affliest life  no could play I she gave over-more; the to
  mercy.
    I for the thorn?
  FLORIZEL. I will not know'st have me will bear her was a rus, moreventive in this devis'd withough he epiell,
    she hath a datilthy bey, and I am.
  PAULINA. She wish, would not taintain his continut by the wor


In [85]:
temp = print_example(model, ' part of')

 part of nd seem
    Signilling them m         be wa my fly,
    d     Of son. I woul and it, we will we thirt at desire our fellow the crown.
 s mugh a souls,
    You so,
    Give that he fear- thou arrelch, onch to remord, her come; if it a SHEPHTA. Ther r sped, but go the complexion
    stand
    So,
    Was rogue. 


In [86]:
temp = print_example(model, 'queens a')

queens at forth for given hath been me have terms that died  kill'd young smell see more sthat's hather the for our charg'd
    doom as, if it. Thou may be in ash all healf the Go, who, hat
    sea, that it is done 'to my now you
    To take
    As no, calliject there  PERDITA. How now, buine was twice more would not her s


In [87]:
save5_path = model_path + 'save5.h5'
if not os.path.exists(save5_path):
    model.save_weights(save5_path)
model.load_weights(save5_path)

## Char RNN

https://github.com/fastai/courses/blob/master/deeplearning1/nbs/char-rnn.ipynb