# LSTM in Keras
Here we create an LSTM network with Keras and train it on synthetic data. Each sequence in the generated data consists of 5 time steps (`ROLLING_WINDOW`), each containing four features (`x1`..`x4`). The features are random bits (0 or 1). The output (`y`) of each sequence is one bit (0 or 1): the XOR of the first feature (`x1`) over all time steps in the sequence.
![Sequence generation explanation](https://grez911.github.io/files/lstm_keras.png)

Define global constants.

In [1]:
LSTM_UNITS = 10     # Size of the LSTM memory.
BATCH_SIZE = 2**10  # Mini-batch size.
ROLLING_WINDOW = 5  # Each sequence will contain 5 time steps.
TRAIN_SIZE = 2**20  # Number of training examples.
TEST_SIZE = 2**12   # Number of testing examples.

Import the required libraries.

In [2]:
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM

Using TensorFlow backend.


Define auxiliary functions.

In [3]:
def xor(array):
  '''
  Calculates XOR of an array.

  Examples:
  [0, 1] -> 1
  [1, 1] -> 0
  [0, 0, 0] -> 0
  [0, 1, 1, 1] -> 1

  Inputs:
  array - numpy array.

  Outputs:
  XOR of all elements as an int (0 or 1).
  '''
  res = 0  # 0 is the identity for XOR, so an empty array yields 0.
  for bit in array:
    res = np.logical_xor(res, bit)
  return int(res)
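
Because XOR over bits is the same as parity (the number of ones modulo 2), the loop in `xor` can be cross-checked against a shorter formulation. The following is a minimal sketch using only the standard library; `xor_reduce` and `parity` are hypothetical names introduced here for illustration and are not part of the notebook.

```python
from functools import reduce
from operator import xor as bit_xor

def xor_reduce(bits):
  """Fold XOR over a sequence of 0/1 values (0 is the identity)."""
  return reduce(bit_xor, bits, 0)

def parity(bits):
  """XOR of bits equals the number of ones modulo 2."""
  return sum(bits) % 2

# The docstring examples from xor(), checked with both formulations.
examples = [[0, 1], [1, 1], [0, 0, 0], [0, 1, 1, 1]]
results = [(xor_reduce(b), parity(b)) for b in examples]
print(results)
```

Both formulations should agree on every input, which is why the network effectively has to learn a 5-bit parity function of `x1`.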

In [4]:
def batch_gen(data, bs):
  '''
  Batch generator. It produces only full-sized batches.
  If the data cannot be divided evenly into batches, the
  remainder is ignored.
  
  Inputs:
  data - numpy array containing X and y.
  bs - batch size.
  
  Outputs the next batch each time.
  '''
  i = 0
  while True:
    X = []
    y = []
    for j in range(bs):
      matrix = data[i*bs+j:i*bs+j+ROLLING_WINDOW]
      X_timestep = matrix[:, :-1]
      y_timestep = matrix[-1, -1]
      X.append(X_timestep)
      y.append(y_timestep)
    if i >= steps_per_epoch(len(data))-1:
      i = 0
    else:
      i += 1
    yield (np.array(X), np.array(y))
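
To make the shapes produced by the generator concrete, here is a vectorized sketch of the same windowing idea. It is an alternative formulation, not the notebook's generator: it assumes NumPy 1.20 or later (for `sliding_window_view`), and `make_windows` and the toy array are hypothetical names used only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

ROLLING_WINDOW = 5

def make_windows(data):
  """Return (X, y); X has shape (n_windows, ROLLING_WINDOW, n_features)."""
  features = data[:, :-1]
  # sliding_window_view puts the window axis last:
  # (n_windows, n_features, ROLLING_WINDOW), so swap the last two axes.
  X = sliding_window_view(features, ROLLING_WINDOW, axis=0).transpose(0, 2, 1)
  # The label of each window is the y value of its last row.
  y = data[ROLLING_WINDOW - 1:, -1]
  return X, y

# Toy data: 8 rows, 4 feature columns plus one label column.
toy = np.arange(8 * 5).reshape(8, 5)
X, y = make_windows(toy)
print(X.shape, y.shape)
```

Each window overlaps the previous one by `ROLLING_WINDOW - 1` rows, so 8 rows give 4 windows. The generator yields the same kind of `(batch, time steps, features)` array, one batch at a time.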

In [5]:
def gen_data(length):
  '''
  Generate data with four features and a binary output. Input features (x1..x4)
  are random bits (0 or 1). Output (y) is the XOR of the x1 feature over the
  last ROLLING_WINDOW rows, so the first ROLLING_WINDOW - 1 rows have no label (NaN).
  
  Inputs:
  length - number of training examples.
  
  Outputs:
  df - dataframe with created data.
  '''
  df = pd.DataFrame(np.random.randint(2, size=(length, 4)), columns=['x1', 'x2', 'x3', 'x4'])
  # raw=True passes plain NumPy arrays to xor (requires pandas 0.23 or later).
  df['y'] = df['x1'].rolling(ROLLING_WINDOW).apply(xor, raw=True)
  return df

In [6]:
def steps_per_epoch(data_length):
  '''
  Calculate how many mini-batches will be in the data.
  
  Inputs:
  data_length - overall length of the data.
  
  Outputs number of mini-batches.
  '''
  return int((data_length - ROLLING_WINDOW + 1) / BATCH_SIZE)
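
As a worked example of this formula with the constants defined at the top of the notebook, the sketch below recomputes the number of steps for the train and test sets. It uses integer division (`//`), which gives the same result as `int(... / ...)` for these non-negative values; the constants are repeated so the snippet runs on its own.

```python
ROLLING_WINDOW = 5
BATCH_SIZE = 2**10

def steps_per_epoch(data_length):
  # Number of complete windows, then number of complete batches of windows.
  return (data_length - ROLLING_WINDOW + 1) // BATCH_SIZE

train_windows = 2**20 - ROLLING_WINDOW + 1  # 1048572 windows
test_windows = 2**12 - ROLLING_WINDOW + 1   # 4092 windows
train_steps = steps_per_epoch(2**20)
test_steps = steps_per_epoch(2**12)
print(train_steps, test_steps)
```

In both cases the last 1020 windows do not fill a complete batch and are dropped.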

Create a train dataset.

In [7]:
train_df = gen_data(TRAIN_SIZE)
train_df.head(16)

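| | x1 | x2 | x3 | x4 | y |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 1 | NaN |
| 1 | 1 | 1 | 1 | 1 | NaN |
| 2 | 1 | 1 | 0 | 1 | NaN |
| 3 | 1 | 1 | 0 | 0 | NaN |
| 4 | 0 | 1 | 0 | 0 | 1.0 |
| 5 | 1 | 1 | 0 | 1 | 0.0 |
| 6 | 0 | 1 | 0 | 1 | 1.0 |
| 7 | 0 | 1 | 0 | 0 | 0.0 |
| 8 | 0 | 0 | 1 | 0 | 1.0 |
| 9 | 1 | 1 | 1 | 0 | 0.0 |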
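
The `y` column in this output can be reproduced by hand from the `x1` column. The sketch below does so in plain Python, using parity (sum modulo 2) in place of the notebook's `xor` helper; `None` stands in for the NaN entries of the first four rows.

```python
ROLLING_WINDOW = 5
x1 = [0, 1, 1, 1, 0, 1, 0, 0, 0, 1]  # x1 column from the output above

y = []
for k in range(len(x1)):
  if k < ROLLING_WINDOW - 1:
    y.append(None)  # not enough history yet (NaN in the DataFrame)
  else:
    window = x1[k - ROLLING_WINDOW + 1:k + 1]
    y.append(sum(window) % 2)  # XOR of bits is the parity of their sum
print(y)
```

For example, row 4 covers `x1` values 0, 1, 1, 1, 0, which contain three ones, so its label is 1.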


Convert the data to a NumPy array to speed up batch generation.

In [8]:
train_data = train_df.values  # as_matrix() was removed in pandas 1.0; .values works in old and new versions.
train_data

array([[ 0.,  0.,  0.,  1., nan],
       [ 1.,  1.,  1.,  1., nan],
       [ 1.,  1.,  0.,  1., nan],
       ...,
       [ 0.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  1.,  0.,  1.],
       [ 1.,  1.,  0.,  1.,  1.]])

Create an LSTM network.

In [9]:
model = Sequential()
model.add(LSTM(LSTM_UNITS, input_shape=(ROLLING_WINDOW, len(train_df.columns) - 1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['binary_accuracy'])
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_1 (LSTM)                (None, 10)                600       
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 11        
Total params: 611
Trainable params: 611
Non-trainable params: 0
_________________________________________________________________
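
The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM formulation with four gates, each having input weights, recurrent weights and one bias vector; `lstm_params` and `dense_params` are hypothetical helper names, not Keras functions.

```python
def lstm_params(units, features):
  # 4 gates, each with input weights, recurrent weights and a bias vector.
  return 4 * (features * units + units * units + units)

def dense_params(inputs, outputs):
  # One weight per input-output pair plus one bias per output.
  return inputs * outputs + outputs

lstm = lstm_params(units=10, features=4)    # LSTM_UNITS = 10, features x1..x4
dense = dense_params(inputs=10, outputs=1)  # 10 LSTM outputs -> 1 sigmoid unit
print(lstm, dense, lstm + dense)
```

The results agree with the 600, 11 and 611 reported by `model.summary()`.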


Train it.

In [10]:
train_gen = batch_gen(train_data, BATCH_SIZE)
model.fit_generator(train_gen, steps_per_epoch=steps_per_epoch(TRAIN_SIZE), epochs=20)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x7f103ea3ceb8>

Create a test dataset and measure the performance.

In [11]:
test_data = gen_data(TEST_SIZE).values  # as_matrix() was removed in pandas 1.0.
test_gen = batch_gen(test_data, BATCH_SIZE)
res = model.evaluate_generator(test_gen, steps=steps_per_epoch(TEST_SIZE))
# Print each metric name next to its value.
for name, value in zip(model.metrics_names, res):
  print(f"{name}: {value:.3f}")

loss: 0.000
binary_accuracy: 1.000
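
As a rough illustration of what `binary_accuracy` measures, the sketch below thresholds sigmoid outputs and counts matches with the true labels. It assumes a 0.5 threshold, and the predictions are made-up values for illustration only; they are not outputs of the trained model.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
  # Turn probabilities into hard 0/1 predictions, then count matches.
  predicted = [1 if p > threshold else 0 for p in y_prob]
  matches = sum(int(t == p) for t, p in zip(y_true, predicted))
  return matches / len(y_true)

y_true = [1, 0, 1, 0]              # hypothetical labels
y_prob = [0.98, 0.03, 0.91, 0.60]  # hypothetical sigmoid outputs
acc = binary_accuracy(y_true, y_prob)
print(acc)
```

A score of 1.000 on the test set therefore means every thresholded prediction matched its label.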
