In [9]:
from __future__ import print_function
from keras.layers import Dense, Activation, SimpleRNN
from keras.models import Sequential
# from keras.utils.visualize_util import plot
import numpy as np

We read our input text from the text of Alice in Wonderland on the Project Gutenberg website (http://www.gutenberg.org/files/11/11-0.txt). The file contains line breaks and non-ASCII characters, so we do some
preliminary cleanup and store the cleaned contents in a variable called text:

In [27]:
lines = []
with open('alice.txt', 'rb') as fin:
    for line in fin:
        line = line.strip().lower()            # trim whitespace, convert to lowercase
        line = line.decode("ascii", "ignore")  # drop any non-ASCII bytes
        if len(line) == 0:
            continue                           # skip blank lines
        lines.append(line)
text = " ".join(lines)
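
To see what this cleanup does to individual lines, here is a minimal sketch that applies the same steps to a few made-up byte strings. The sample lines and the `clean_lines` helper are hypothetical and stand in for the real contents of alice.txt:

```python
# Hypothetical sample lines standing in for the contents of alice.txt.
raw_lines = [
    b"\xe2\x80\x9cOh dear!\xe2\x80\x9d said Alice.\r\n",  # curly quotes are non-ASCII
    b"\r\n",                                              # blank line
    b"  Down the Rabbit-Hole  \r\n",
]

def clean_lines(byte_lines):
    """Lowercase each line, drop non-ASCII bytes, and skip empty lines."""
    cleaned = []
    for line in byte_lines:
        line = line.strip().lower().decode("ascii", "ignore")
        if line:
            cleaned.append(line)
    return " ".join(cleaned)

sample_text = clean_lines(raw_lines)
print(sample_text)
```

The curly quotes disappear because their UTF-8 bytes are not valid ASCII, and the blank line is skipped, so the result is a single lowercase string with one space between the original lines.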

Since we are building a character-level RNN, our vocabulary is the set of characters that occur in the
text. There are 42 of them in our case. Since we will be dealing with the indexes of these characters
rather than the characters themselves, the first part of the following code snippet creates the necessary lookup tables.
The next step is to create the input and label texts. We do this by stepping through the text by a number
of characters given by the STEP variable (1 in our case) and then extracting a span of text whose size is
determined by the SEQLEN variable (10 in our case). The next character after the span is our label
character:

In [38]:
chars = sorted(set(text))  # sorted so the character indexes are the same on every run
nb_chars = len(chars)
char2index = dict((c, i) for i, c in enumerate(chars))
index2char = dict((i, c) for i, c in enumerate(chars))

SEQLEN = 10
STEP = 1
input_chars = []
label_chars = []
for i in range(0, len(text) - SEQLEN, STEP):
    input_chars.append(text[i:i + SEQLEN])
    label_chars.append(text[i + SEQLEN])
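
The same windowing is easier to follow on a tiny string. The sketch below uses a made-up text and a span length of 3 instead of 10. The `toy_` names are illustrative only, and the vocabulary is sorted purely so that the indexes are reproducible:

```python
toy_text = "hello world"
TOY_SEQLEN = 3
TOY_STEP = 1

# Lookup tables between characters and integer indexes.
toy_chars = sorted(set(toy_text))
toy_char2index = {c: i for i, c in enumerate(toy_chars)}
toy_index2char = {i: c for i, c in enumerate(toy_chars)}

# Slide a window of TOY_SEQLEN characters over the text; the character
# right after each window is the label for that window.
toy_inputs, toy_labels = [], []
for i in range(0, len(toy_text) - TOY_SEQLEN, TOY_STEP):
    toy_inputs.append(toy_text[i:i + TOY_SEQLEN])
    toy_labels.append(toy_text[i + TOY_SEQLEN])

for span, label in zip(toy_inputs[:3], toy_labels[:3]):
    print(repr(span), "->", repr(label))
```

With a step of 1, consecutive spans overlap in all but one character, so a text of length N yields N - SEQLEN (input, label) pairs.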

The next step is to vectorize these input and label texts. Each row of the input to the RNN
corresponds to one of the input texts shown previously. There are SEQLEN characters in this input, and
since our vocabulary size is given by nb_chars, we represent each input character as a one-hot encoded
vector of size (nb_chars). Thus each input row is a tensor of size (SEQLEN, nb_chars). Our output label
is a single character, so similar to the way we represent each character of our input, it is represented
as a one-hot vector of size (nb_chars). Thus, the shape of each label is (nb_chars,):

In [42]:
# np.bool was removed in NumPy 1.24; the built-in bool is the equivalent dtype
X = np.zeros((len(input_chars), SEQLEN, nb_chars), dtype=bool)
y = np.zeros((len(input_chars), nb_chars), dtype=bool)
for i, input_char in enumerate(input_chars):
    for j, ch in enumerate(input_char):
        X[i, j, char2index[ch]] = 1
    y[i, char2index[label_chars[i]]] = 1
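
As a quick check of the shapes described above, the following sketch vectorizes two made-up input spans over a four-character vocabulary. The `toy_` names and the data are illustrative assumptions, not part of the Alice dataset:

```python
import numpy as np

# Two hypothetical (input, label) pairs over the vocabulary of "hello".
toy_inputs = ["hel", "ell"]
toy_labels = ["l", "o"]
toy_chars = sorted(set("hello"))  # ['e', 'h', 'l', 'o']
toy_char2index = {c: i for i, c in enumerate(toy_chars)}
seqlen, vocab = 3, len(toy_chars)

# One one-hot row per character of each input, one one-hot vector per label.
X_toy = np.zeros((len(toy_inputs), seqlen, vocab), dtype=bool)
y_toy = np.zeros((len(toy_inputs), vocab), dtype=bool)
for i, span in enumerate(toy_inputs):
    for j, ch in enumerate(span):
        X_toy[i, j, toy_char2index[ch]] = True
    y_toy[i, toy_char2index[toy_labels[i]]] = True

print(X_toy.shape, y_toy.shape)
print(X_toy[0].astype(int))
```

Each row of `X_toy[0]` contains exactly one 1, in the column of the corresponding character, which is what lets the softmax output of the network be compared directly against the label vector.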

In [43]:
HIDDEN_SIZE = 128
BATCH_SIZE = 128
NUM_ITERATIONS = 25
NUM_EPOCHS_PER_ITERATION = 1
NUM_PREDS_PER_EPOCH = 100

model = Sequential()
model.add(SimpleRNN(HIDDEN_SIZE, return_sequences=False, input_shape=(SEQLEN, nb_chars),
    unroll=True))
model.add(Dense(nb_chars))
model.add(Activation("softmax"))
model.compile(loss="categorical_crossentropy", optimizer="rmsprop")
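
As a rough sanity check on the size of this network, the number of trainable weights can be worked out by hand. The sketch below assumes the 42-character vocabulary mentioned earlier and the usual parameterization of a simple recurrent layer (an input kernel, a recurrent kernel and a bias) followed by a dense layer with a bias. The helper functions are hypothetical and not part of Keras:

```python
def simple_rnn_params(input_dim, units):
    """Input kernel (input_dim x units) + recurrent kernel (units x units) + bias."""
    return units * (input_dim + units + 1)

def dense_params(input_dim, units):
    """Weight matrix (input_dim x units) + bias."""
    return units * (input_dim + 1)

vocab_size = 42    # assumed from the vocabulary size quoted in the text
hidden_size = 128  # HIDDEN_SIZE in the model above

rnn_count = simple_rnn_params(vocab_size, hidden_size)
dense_count = dense_params(hidden_size, vocab_size)
print(rnn_count, dense_count, rnn_count + dense_count)
```

If the vocabulary of your copy of the text differs from 42, the counts will differ accordingly; calling model.summary() on the compiled model is the authoritative way to check them.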

Our training approach is a little different from what we have seen so far. So far our approach has
been to train a model for a fixed number of epochs, then evaluate it against a portion of held-out test
data. Since we don't have any labeled data here, we train the model for an epoch
(NUM_EPOCHS_PER_ITERATION=1), then test it. We continue training like this for up to 25 iterations
(NUM_ITERATIONS=25), stopping once we see intelligible output. So effectively, we are training for
NUM_ITERATIONS epochs and testing the model after each epoch.
Our test consists of generating a character from the model given a randomly chosen input sequence as a
seed, then dropping the first character from the seed, appending the predicted character, and
generating another character from the model. We repeat this 100 times (NUM_PREDS_PER_EPOCH=100) and
print the resulting string. The string gives us an indication of the quality of the model:

In [44]:
for iteration in range(NUM_ITERATIONS):
    print("=" * 50)
    print("Iteration #: %d" % (iteration))
    model.fit(X, y, batch_size=BATCH_SIZE, epochs=NUM_EPOCHS_PER_ITERATION)
    
    test_idx = np.random.randint(len(input_chars))
    test_chars = input_chars[test_idx]
    print("Generating from seed: %s" % (test_chars))
    print(test_chars, end="")
    for _ in range(NUM_PREDS_PER_EPOCH):
        # one-hot encode the current SEQLEN-character window
        Xtest = np.zeros((1, SEQLEN, nb_chars))
        for j, ch in enumerate(test_chars):
            Xtest[0, j, char2index[ch]] = 1
        pred = model.predict(Xtest, verbose=0)[0]
        ypred = index2char[np.argmax(pred)]
        print(ypred, end="")
        # move forward with test_chars + ypred
        test_chars = test_chars[1:] + ypred
    print()  # end the generated line before the next iteration starts

Iteration #: 0
Epoch 1/1
Generating from seed: aid the ca
Iteration #: 1
Epoch 1/1
Generating from seed: ded again.
Iteration #: 2
Epoch 1/1
Generating from seed: ht thing t
Iteration #: 3
Epoch 1/1
Generating from seed: the corner
Iteration #: 4
Epoch 1/1
Generating from seed: e youre tr
Iteration #: 5
Epoch 1/1
Generating from seed:  see parag
Iteration #: 6
Epoch 1/1
Generating from seed: ed tone. t
Iteration #: 7
Epoch 1/1
Generating from seed: turning to
Iteration #: 8
Epoch 1/1
Generating from seed: at the sti
Iteration #: 9
Epoch 1/1
Generating from seed: r computer
Iteration #: 10
Epoch 1/1
Generating from seed: hould unde
Iteration #: 11
Epoch 1/1
Generating from seed: equire suc
Iteration #: 12
Epoch 1/1
Generating from seed: ould like 
Iteration #: 13
Epoch 1/1
Generating from seed: she had dr
Iteration #: 14
Epoch 1/1
Generating from seed:  same litt
Iteration #: 15
Epoch 1/1
Generating from seed:  freely sh
Iteration #: 16
Epoch 1/1
Generating from seed: rather unw
Iterati
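
The generation step inside the loop above can be looked at in isolation from Keras. The following sketch keeps the same sliding-window logic but replaces the trained model with a hypothetical stand-in, `toy_predict`, which simply returns the next letter of the alphabet. Both function names are illustrative and not part of the original code:

```python
def generate(predict_next, seed, num_preds):
    """Greedy generation: predict a character, slide the window, repeat."""
    window = seed
    generated = []
    for _ in range(num_preds):
        next_char = predict_next(window)
        generated.append(next_char)
        # drop the oldest character and append the prediction
        window = window[1:] + next_char
    return seed + "".join(generated)

def toy_predict(window):
    """Hypothetical stand-in for the model: next letter after the last one."""
    last = window[-1]
    return chr((ord(last) - ord("a") + 1) % 26 + ord("a"))

print(generate(toy_predict, "abc", 5))
```

In the real loop, `predict_next` corresponds to one-hot encoding the window, calling model.predict, and taking the argmax of the resulting probabilities. Because the argmax is deterministic, the same seed always produces the same continuation for a given set of weights.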

In [52]:
# from keras.utils import plot_model
# plot_model(model, to_file='model.png')