# Introduction

Keras provides an easy-to-use framework for training neural networks.

This Notebook has **7** sections:

1. Introduction
2. Function import
3. Text loading
4. Text encoding
5. LSTM model spec
6. Model fitting
7. Text generation

# Overview

With this model we aim to predict the next character in a sequence, using a 100-character sequence as the input.
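
To make the setup concrete, here is a minimal sketch of how a text can be cut into (input window, next character) pairs. The toy text and the window length of 5 are assumptions chosen only to keep the printout short; the notebook's own code sets `seq_length = 100`.

```python
# Toy illustration of the sliding-window setup.
# toy_text and toy_seq_length are hypothetical values, not the notebook's data.
toy_text = "the old man and the sea"
toy_seq_length = 5

pairs = []
for i in range(len(toy_text) - toy_seq_length):
    window = toy_text[i:i + toy_seq_length]   # input: toy_seq_length characters
    target = toy_text[i + toy_seq_length]     # output: the character that follows
    pairs.append((window, target))

# Show the first few training pairs
for window, target in pairs[:3]:
    print("{!r} -> {!r}".format(window, target))
```

Each step moves the window forward by one character, so a text of N characters yields N minus the window length pairs.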

# Other avenues to investigate

1. Batch learning
2. Stochastic learning
3. Using Ngrams
4. One-hot encoding for the input sequences (as opposed to scaled)
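
As a rough sketch of item 4, integer-encoded input sequences could be one-hot encoded with NumPy as shown here. The toy values are assumptions; in the notebook the same indexing would apply to `input_data` and `vocab_size`.

```python
import numpy as np

# Hypothetical toy values standing in for input_data and vocab_size
toy_vocab_size = 4
toy_input_data = [[0, 1, 2], [1, 2, 3]]   # two integer-encoded sequences of length 3

encoded = np.array(toy_input_data)            # shape: (samples, time steps)
X_onehot = np.eye(toy_vocab_size)[encoded]    # shape: (samples, time steps, vocab size)

print(X_onehot.shape)
```

With this encoding each time step carries `vocab_size` features rather than one, so the LSTM's `input_shape` would have to change to match, and memory use grows by the same factor.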




In [1]:
%matplotlib inline
    
import numpy as np
import sys
import keras
from matplotlib import pyplot as plt

from keras.callbacks import ModelCheckpoint
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM

Using TensorFlow backend.


In [3]:
# Load text, convert to lowercase
filename = "oldman.txt"
with open(filename) as f:
    text = f.read().lower()

# dictionaries mapping unique characters to integers, and integers to characters

chars = sorted(list(set(text)))
char_to_int = dict((ch, i) for i, ch in enumerate(chars))
int_to_char = dict((i, ch) for i,ch in enumerate(chars))

# exploratory statistics

text_size = len(text)
vocab_size = len(chars)

print("The text has {} unique characters and {} total characters.".format(vocab_size, text_size))

# Note

# The text has 41 unique characters - more than the 26 letters of the alphabet,
# because non-letter characters such as punctuation and whitespace are counted too
# Appropriate encoding will be required for this

The text has 41 unique characters and 134984 total characters.
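
As a quick sanity check on the two dictionaries, the same construction can be run on a short string to confirm that encoding and decoding round-trip. The sample string and the `toy_` names are assumptions for illustration only.

```python
# Build the same kind of mappings on a hypothetical sample string
sample = "old man"
toy_chars = sorted(set(sample))
toy_char_to_int = {ch: i for i, ch in enumerate(toy_chars)}
toy_int_to_char = {i: ch for i, ch in enumerate(toy_chars)}

# Encode to integers, then decode back to text
encoded = [toy_char_to_int[ch] for ch in sample]
decoded = "".join(toy_int_to_char[i] for i in encoded)

print(encoded)
print(decoded)
```

Because the characters are sorted before numbering, the mapping is reproducible between runs on the same text, which matters when reloading saved weights.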


In [19]:
# Prepare data: build integer-encoded input sequences and their target characters

# The length of the input sequence is arbitrary, using the same as 
# https://github.com/deep-learning-indaba/practicals2017/blob/master/NEW_PRAC_4.ipynb

seq_length = 100
input_data = []
output_data = []

# Slide a window of seq_length characters over the text, one character at a time

for i in range(0, text_size - seq_length, 1):
    input_seq = text[i:i + seq_length]
    output_seq = text[i + seq_length]
    input_data.append([char_to_int[char] for char in input_seq])
    output_data.append(char_to_int[output_seq])
    
patterns = len(input_data)

print("There are {} patterns in total. This is {} (the input sequence length) less than the total number of characters.".format(patterns, seq_length))


There are 134884 patterns in total. This is 100 (the input sequence length) less than the total number of characters.


In [5]:
# Scaling Input Vector and One-hot Encoding Output Vectors

# reshape X to be [samples, time steps, features]
X = np.reshape(input_data, (patterns, seq_length, 1))


# Scale the integer codes to the range [0, 1)

X = X / float(vocab_size)

# one hot encode the output variable
y = np_utils.to_categorical(output_data)

print("{} {}".format(X.shape[1], X.shape[2]))


100 1
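
The effect of the two transformations can be seen on toy data. This is a NumPy-only sketch with hypothetical values; `np.eye` stands in for `np_utils.to_categorical`, with the difference that the number of classes is given explicitly here, whereas `to_categorical` infers it from the largest index when no class count is passed.

```python
import numpy as np

# Hypothetical toy values standing in for input_data, output_data and vocab_size
toy_vocab_size = 4
toy_inputs = [[0, 1, 2], [1, 2, 3]]   # integer-encoded input sequences
toy_outputs = [3, 0]                  # integer-encoded next characters

# [samples, time steps, features], scaled by the vocabulary size
X_toy = np.reshape(toy_inputs, (2, 3, 1)) / float(toy_vocab_size)

# One row per sample, with a single 1 in the column of the target character
y_toy = np.eye(toy_vocab_size)[toy_outputs]

print(X_toy[:, :, 0])
print(y_toy)
```

Scaling keeps the inputs small, but it also implies an ordering between characters that has no real meaning.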


In [6]:
# define the LSTM model
model = Sequential()
model.add(LSTM(256, input_shape=(X.shape[1], X.shape[2])))
model.add(Dropout(0.2))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
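
To get a feel for the size of this network, the number of trainable weights can be worked out by hand. The sketch below assumes the standard LSTM layout (four gates, each with input weights, recurrent weights and a bias) and assumes the output layer has 41 units, one per character reported above; the helper function names are made up for this example.

```python
def lstm_param_count(units, input_dim):
    # Four gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(inputs, outputs):
    # One weight per input/output pair plus one bias per output
    return inputs * outputs + outputs

units, input_dim, n_classes = 256, 1, 41   # n_classes = 41 is an assumption

lstm_params = lstm_param_count(units, input_dim)
dense_params = dense_param_count(units, n_classes)
total_params = lstm_params + dense_params   # Dropout adds no weights

print("LSTM: {}, Dense: {}, total: {}".format(lstm_params, dense_params, total_params))
```

Calling `model.summary()` in the notebook should report the per-layer counts for comparison.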

In [7]:
# define the checkpoint
filepath="weights-improvement-{epoch:02d}-{loss:.4f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
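
The placeholders in `filepath` follow Python's string formatting rules, so the resulting file names can be previewed with `str.format`. The epoch and loss values below are examples only.

```python
filepath = "weights-improvement-{epoch:02d}-{loss:.4f}.hdf5"

# Example values; during training the callback supplies the real ones
example_name = filepath.format(epoch=4, loss=2.41390)

print(example_name)
```

Because `save_best_only=True` monitors `loss` with `mode='min'`, a file is written only after epochs in which the training loss improves.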

# Fit Model
---

In [10]:
model.fit(X, y, epochs=4, batch_size=128, callbacks=callbacks_list)

Epoch 1/4

Epoch 00001: loss improved from inf to 2.85693, saving model to weights-improvement-01-2.8569.hdf5
Epoch 2/4

Epoch 00002: loss improved from 2.85693 to 2.63470, saving model to weights-improvement-02-2.6347.hdf5
Epoch 3/4

Epoch 00003: loss improved from 2.63470 to 2.50251, saving model to weights-improvement-03-2.5025.hdf5
Epoch 4/4

Epoch 00004: loss improved from 2.50251 to 2.41390, saving model to weights-improvement-04-2.4139.hdf5


<keras.callbacks.History at 0x105027210>
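
One way to read these loss values, sketched here as an aside: categorical cross-entropy is measured in nats, so its exponential (the perplexity) is roughly the number of characters the model is still undecided between. A model that guessed uniformly over 41 characters would score `ln(41)`. The figures are copied from the log above.

```python
import math

losses = [2.85693, 2.63470, 2.50251, 2.41390]   # training loss per epoch, from the log
vocab_size = 41                                 # unique characters reported earlier

uniform_loss = math.log(vocab_size)             # loss of a uniform random guess
perplexities = [math.exp(loss) for loss in losses]

print("Uniform baseline loss: {:.4f}".format(uniform_loss))
for epoch, (loss, ppl) in enumerate(zip(losses, perplexities), start=1):
    print("Epoch {}: loss {:.4f}, perplexity {:.2f}".format(epoch, loss, ppl))
```

After four epochs the model is clearly better than chance, though still far from confident, which is consistent with the rough quality of the generated text.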

# Generating Text with an LSTM Network

In [14]:
# Load previously saved network weights
filename = 'weights-improvement-04-2.4139.hdf5'
model.load_weights(filename)
model.compile(loss='categorical_crossentropy', optimizer='adam')

# Random Seed
start = np.random.randint(0, len(input_data)-1)
pattern = input_data[start]
print("Seed:")
print('" ' + ''.join([int_to_char[value] for value in pattern]) + ' "')

# Generate characters
for i in range(1000):
    
    # Reshape input to [samples, time steps, features]
    
    x = np.reshape(pattern, (1, len(pattern), 1))
    
    # Scale input
    
    x = x / float(vocab_size)
    
    # Use model to predict which character
    # is most likely next. 
    
    prediction = model.predict(x, verbose=0)
    
    # Assign most probable prediction to index
    
    index = np.argmax(prediction)
    
    # Decode the index to its character
    
    result = int_to_char[index]
    
    # Convert input pattern to character sequence
    
    seq_in = [int_to_char[value] for value in pattern]
    
    # Print out predicted characters

    sys.stdout.write(result)
    
    # Slide the window: drop the oldest index and
    # append the predicted one, so the new input
    # pattern is the last 99 indices of the old
    # pattern followed by the prediction.
    # Building a new list leaves input_data unmodified.
    
    pattern = pattern[1:] + [index]
    
    
print("\nDone.")

Seed:
" , watching it
and the other lines at the same time for the fish might have swum up or
down.  then ca "
 was het the sire and the bly sas he the birh and th

KeyboardInterrupt: 
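
The loop above always picks the single most probable character (`np.argmax`), which tends to produce repetitive text. A common alternative, not used in this notebook, is to sample from the predicted distribution, optionally sharpened or flattened by a temperature. The sketch below is one possible implementation; `sample_index` and its parameters are hypothetical names introduced here.

```python
import numpy as np

def sample_index(probabilities, temperature=1.0, rng=None):
    """Draw a character index from a probability vector.

    temperature < 1 favours likely characters, temperature > 1 flattens
    the distribution. This is an illustrative helper, not part of Keras.
    """
    rng = np.random if rng is None else rng
    # Rescale log-probabilities by the temperature
    logits = np.log(np.asarray(probabilities, dtype=float) + 1e-12) / temperature
    # Softmax, shifted by the maximum for numerical stability
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return int(rng.choice(len(weights), p=weights))

# Toy distribution over three characters
toy_prediction = [0.1, 0.7, 0.2]
rng = np.random.RandomState(0)
print(sample_index(toy_prediction, temperature=1.0, rng=rng))
```

In the generation loop this would stand in for the `np.argmax` line, for example `index = sample_index(prediction[0], temperature=0.5)`, since `model.predict` returns one row of probabilities per input sample.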