### 1. Importing the necessary libraries

In [1]:
# Importing dependencies numpy and keras
import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
from keras.utils import np_utils

Using TensorFlow backend.


### 2. Loading text file and creating character to integer mappings

In [3]:
# load text
filename = "./sentence_data.txt"

# read the whole file and lowercase it; the context manager closes the file afterwards
with open(filename) as f:
    text = f.read().lower()

# mapping characters with integers
unique_chars = sorted(list(set(text)))

char_to_int = {}
int_to_char = {}

for i, c in enumerate(unique_chars):
    char_to_int.update({c: i})
    int_to_char.update({i: c})

The text file is opened and all characters are converted to lowercase. Each unique character is then mapped to an integer, and each integer back to its character, because the LSTM operates on numbers rather than on raw characters.
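As a small illustration of this mapping, the sketch below builds the same two lookups for a toy string. The string `sample_text` and the `sample_*` names are made up for this example and are not part of the dataset.

```python
# Toy text standing in for the contents of the data file (assumption for illustration).
sample_text = "hello world"

# Sorted unique characters give a stable vocabulary order.
sample_chars = sorted(set(sample_text))

# Forward and reverse lookups, equivalent to the loop-based version.
sample_char_to_int = {c: i for i, c in enumerate(sample_chars)}
sample_int_to_char = {i: c for i, c in enumerate(sample_chars)}

# Encode the text as integers, then decode it back to check the round trip.
encoded = [sample_char_to_int[c] for c in sample_text]
decoded = "".join(sample_int_to_char[i] for i in encoded)

print(sample_chars)
print(encoded)
print(decoded)
```

Because the vocabulary is sorted, the same text always produces the same integer codes, which matters if the mappings are rebuilt later to decode predictions.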

### 3. Preparing dataset

In [6]:
# preparing input and output dataset
X = []
Y = []

for i in range(0, len(text) - 50, 1):
    sequence = text[i:i + 50]
    label = text[i + 50]
    X.append([char_to_int[char] for char in sequence])
    Y.append(char_to_int[label])

In [9]:
for i in range(10):
    print(X[i])

[36, 25, 22, 1, 36, 34, 18, 24, 22, 21, 26, 22, 1, 31, 23, 1, 29, 18, 20, 19, 22, 36, 25, 0, 0, 18, 20, 36, 37, 35, 1, 32, 34, 26, 29, 37, 35, 9, 1, 35, 20, 31, 22, 30, 18, 1, 32, 34, 26, 29]
[25, 22, 1, 36, 34, 18, 24, 22, 21, 26, 22, 1, 31, 23, 1, 29, 18, 20, 19, 22, 36, 25, 0, 0, 18, 20, 36, 37, 35, 1, 32, 34, 26, 29, 37, 35, 9, 1, 35, 20, 31, 22, 30, 18, 1, 32, 34, 26, 29, 18]
[22, 1, 36, 34, 18, 24, 22, 21, 26, 22, 1, 31, 23, 1, 29, 18, 20, 19, 22, 36, 25, 0, 0, 18, 20, 36, 37, 35, 1, 32, 34, 26, 29, 37, 35, 9, 1, 35, 20, 31, 22, 30, 18, 1, 32, 34, 26, 29, 18, 9]
[1, 36, 34, 18, 24, 22, 21, 26, 22, 1, 31, 23, 1, 29, 18, 20, 19, 22, 36, 25, 0, 0, 18, 20, 36, 37, 35, 1, 32, 34, 26, 29, 37, 35, 9, 1, 35, 20, 31, 22, 30, 18, 1, 32, 34, 26, 29, 18, 9, 0]
[36, 34, 18, 24, 22, 21, 26, 22, 1, 31, 23, 1, 29, 18, 20, 19, 22, 36, 25, 0, 0, 18, 20, 36, 37, 35, 1, 32, 34, 26, 29, 37, 35, 9, 1, 35, 20, 31, 22, 30, 18, 1, 32, 34, 26, 29, 18, 9, 0, 0]
[34, 18, 24, 22, 21, 26, 22, 1, 31, 23, 1, 29

The data is prepared so that, to make the LSTM predict the ‘O’ in ‘HELLO’, we would feed in [‘H’, ‘E’, ‘L’, ‘L’] as the input and [‘O’] as the expected output. Here the sequence length is fixed at 50: for each position in the text, the encodings of 50 consecutive characters are stored in X, and the encoding of the character that follows them (the 51st) is stored in Y.
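The windowing can be checked on a toy string. This is a minimal sketch using a window of 4 instead of 50; `toy_text`, `seq_length`, `toy_inputs` and `toy_labels` are names introduced only for this example.

```python
toy_text = "hello"
seq_length = 4  # the notebook uses 50; 4 keeps the example readable

toy_inputs = []
toy_labels = []

# Slide a window of seq_length characters over the text;
# the character right after the window is the label.
for i in range(len(toy_text) - seq_length):
    toy_inputs.append(list(toy_text[i:i + seq_length]))
    toy_labels.append(toy_text[i + seq_length])

print(toy_inputs)  # [['h', 'e', 'l', 'l']]
print(toy_labels)  # ['o']
```

A text of length n yields n - seq_length training pairs, so consecutive inputs overlap in all but one character, as the printed rows of X show.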

### 4. Reshaping of X

In [10]:
# reshaping, normalizing and one hot encoding
X_modified = numpy.reshape(X, (len(X), 50, 1))
X_modified = X_modified / float(len(unique_chars))
Y_modified = np_utils.to_categorical(Y)

An LSTM network expects its input in the form [samples, time steps, features], where samples is the number of data points, time steps is the number of sequential steps in a single data point (50 here), and features is the number of variables observed at each time step (1 here, the integer code of the character). We then scale the values in X_modified to lie between 0 and 1 by dividing by the number of unique characters, and one-hot encode the true values in Y_modified.
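The shapes involved can be seen on a tiny hypothetical dataset. The sketch below uses only NumPy; `numpy.eye` indexing is used as a stand-in for `np_utils.to_categorical`, and the `toy_*` values and `vocab_size` are assumptions made for illustration.

```python
import numpy

# Hypothetical toy data: 3 sequences of 4 integer-encoded characters, vocabulary of 5.
toy_X = [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 0]]
toy_Y = [4, 0, 1]
vocab_size = 5

# Reshape to [samples, time steps, features] with one feature per time step,
# then scale the integer codes into the range [0, 1).
toy_X_modified = numpy.reshape(toy_X, (len(toy_X), 4, 1)) / float(vocab_size)

# One-hot encode the labels by picking rows of an identity matrix.
toy_Y_modified = numpy.eye(vocab_size)[toy_Y]

print(toy_X_modified.shape)  # (3, 4, 1)
print(toy_Y_modified.shape)  # (3, 5)
print(toy_Y_modified[0])
```

One difference worth noting: the identity-matrix version always produces `vocab_size` columns, whereas `to_categorical` without an explicit class count infers the number of columns from the largest label it sees.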

### 5. Defining the LSTM model

In [11]:
# defining the LSTM model
model = Sequential()
model.add(LSTM(300, input_shape=(X_modified.shape[1], X_modified.shape[2]), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(300))
model.add(Dropout(0.2))
model.add(Dense(Y_modified.shape[1], activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam')

A Sequential model, which is a linear stack of layers, is used. The first layer is an LSTM layer with 300 memory units that returns sequences, meaning it outputs its hidden state at every time step rather than only at the last one. This is needed because the next LSTM layer expects a full sequence as its input. A dropout layer is applied after each LSTM layer to reduce overfitting. The last layer is a fully connected layer with a ‘softmax’ activation and one neuron per column of the one-hot encoded Y_modified, so its output is a probability distribution over the possible next characters.
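To make the role of the softmax output concrete, here is a minimal NumPy sketch of what that final activation computes. The `softmax` helper and the `toy_logits` values are hypothetical; Keras applies the equivalent operation inside the Dense layer.

```python
import numpy

def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability.
    shifted = logits - numpy.max(logits)
    exps = numpy.exp(shifted)
    return exps / numpy.sum(exps)

# Hypothetical raw scores for a 4-character vocabulary.
toy_logits = numpy.array([1.0, 3.0, 0.5, 2.0])
toy_probs = softmax(toy_logits)

# The probabilities are non-negative and sum to 1;
# the index of the largest one is the predicted character code.
toy_prediction = int(numpy.argmax(toy_probs))
print(toy_probs)
print(toy_prediction)  # 1
```

Training with categorical cross-entropy compares this probability vector against the one-hot target, which is why the output width has to match the width of the encoded labels.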

### 6. Fitting and generating characters

In [12]:
# fitting the model
model.fit(X_modified, Y_modified, epochs=1, batch_size=30)

# picking a random seed sequence (copied so that X is not modified in place)
start_index = numpy.random.randint(0, len(X))
new_string = list(X[start_index])

# generating characters
for i in range(50):
    x = numpy.reshape(new_string, (1, len(new_string), 1))
    x = x / float(len(unique_chars))

    # predicting the most likely next character
    pred_index = int(numpy.argmax(model.predict(x, verbose=0)))
    char_out = int_to_char[pred_index]
    print(char_out)

    # sliding the window: append the prediction and drop the oldest character
    new_string.append(pred_index)
    new_string = new_string[1:]

Epoch 1/1
h
e
 
t
h
e
 
t
h
e
 
t
h
e
 
t
h
e
 
t
h
e
 
t
h
e
 
t
h
e
 
t
h
e
 
t
h
e
 
t
h
e
 
t
h
e
 
t
h
e
