# Learning on Reber-Grammar


## Importing Libraries:

In [1]:
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.layers import LSTM, TimeDistributed, Masking
from keras.optimizers import RMSprop
from keras.utils import np_utils
import numpy as np
import reber_utility as ru

Using Theano backend.


Define the important parameters:

In [2]:
step_size = 50
n_step=2

batch_size = 10
n_batch = 15 # total number of batches
n_test_batch = 2 # testing part

train_idxes = np.array(range(0,(n_batch-n_test_batch)*batch_size))
test_idxes = np.array(range((n_batch-n_test_batch)*batch_size, n_batch*batch_size))

N_Epoch = 80
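
As a sanity check on the split above, the sketch below recomputes the index ranges from the same parameter values. The helper name `split_indices` is hypothetical and not part of the notebook; it only assumes that the last `n_test_batch` batches are held out for testing, as in the cell above.

```python
import numpy as np

def split_indices(n_batch, n_test_batch, batch_size):
    """Return (train, test) sample indices; the last n_test_batch batches are held out."""
    n_train = (n_batch - n_test_batch) * batch_size
    n_total = n_batch * batch_size
    return np.arange(0, n_train), np.arange(n_train, n_total)

# Same values as in the parameter cell
train, test = split_indices(n_batch=15, n_test_batch=2, batch_size=10)
print('{} train samples, {} test samples'.format(len(train), len(test)))
```

With these values, 13 batches (130 sequences) are used for training and 2 batches (20 sequences) for testing.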

Generate and vectorize the data to learn on:

In [3]:
m = ru.generate_reber_machine_continuous()

X,Y = m.to_X_and_Y(m.make_words(n_batch*batch_size,min_steps=(step_size*n_step+1)))

X and Y are the input and output data. Y is X shifted by one time step, since at every step we want to predict the next symbol.
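
The shift between input and target can be illustrated with a small NumPy sketch. It assumes a one-hot encoding and a made-up 4-letter alphabet; the actual encoding inside `reber_utility` is not shown in this notebook and may differ.

```python
import numpy as np

# Hypothetical alphabet and word, for illustration only
alphabet = ['B', 'T', 'X', 'E']
word = 'BTXXE'

def one_hot(word, alphabet):
    """Encode a string as a (time steps, alphabet size) one-hot matrix."""
    out = np.zeros((len(word), len(alphabet)))
    for t, ch in enumerate(word):
        out[t, alphabet.index(ch)] = 1.0
    return out

encoded = one_hot(word, alphabet)
X_demo = encoded[:-1]  # inputs: every symbol except the last
Y_demo = encoded[1:]   # targets: the same sequence, one step ahead
print(X_demo.shape, Y_demo.shape)
```

A word of n+1 symbols yields n input/target pairs under this scheme, which would be consistent with requesting `step_size*n_step+1` steps to obtain sequences of length 100.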

In [4]:
print(X.shape)
print(Y.shape)

(150, 100, 8)
(150, 100, 8)


## Building the model
The "heart" of Keras is defining the layers of the sequential model. Our model contains the following layers:
1. Masking layer: skips time steps whose features all equal the mask value (-1 here); important if we want to deal with input sequences of different lengths
2. LSTM layer: contains 20 units; return_sequences = True in order to make predictions at each time step; stateful = True in order to carry the state over from one batch to the next until it is reset explicitly
3. Dropout layer: helps prevent the model from overfitting (rate 0.2)
4. TimeDistributed Dense: fully connected layer, applied at every time step, that maps the LSTM output to the output size (number of letters)
5. Activation layer: uses the softmax activation function so that the outputs at each time step form a probability distribution (values between 0 and 1 that sum to 1)

Then we specify the optimization algorithm for learning. In this case we use RMSprop with a learning rate of 0.01.

Finally we can compile our model and specify the loss function as well as the metrics we want to track during training.
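
The model code uses a softmax activation in its last layer. The NumPy sketch below shows what softmax does to the per-time-step outputs; it is an illustration with made-up scores, not the Keras implementation.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Two hypothetical time steps with three output units each
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits)
print(probs.sum(axis=-1))  # each row sums to 1
```

Because each row sums to 1, the output can be read as a distribution over the next symbol, which is what the categorical cross-entropy loss expects.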

In [5]:
model = Sequential()
model.add(Masking(mask_value= -1., batch_input_shape=(batch_size,step_size,len(m.transitions))))
model.add(LSTM(20, return_sequences=True,stateful=True, batch_input_shape=(batch_size,step_size,len(m.transitions))))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(len(m.transitions))))
model.add(Activation('softmax'))
optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=["accuracy"])

## Train model
Train the model and record the train and test metrics for every epoch.
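
Each sequence of 100 time steps is fed to the stateful LSTM as `n_step` consecutive windows of `step_size` steps, and the state is reset once all windows of a batch have been seen. The sketch below only demonstrates the slicing, using a dummy array in place of `X`; it does not involve Keras.

```python
import numpy as np

step_size, n_step = 50, 2

# Dummy stand-in for one batch of X: 10 sequences, 100 time steps, 8 features
X_demo = np.arange(10 * 100 * 8).reshape(10, 100, 8)

# Window k covers time steps k*step_size up to (but not including) (k+1)*step_size
windows = [X_demo[:, k * step_size:(k + 1) * step_size] for k in range(n_step)]

# Concatenating the windows along the time axis recovers the full sequences
rebuilt = np.concatenate(windows, axis=1)
print([w.shape for w in windows])
```

The windows do not overlap and together cover every time step, so carrying the LSTM state from one window to the next lets the network see each sequence in full.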

In [6]:
metrics_train = np.zeros((N_Epoch,len(model.metrics_names),n_batch-n_test_batch))
metrics_test = np.zeros((N_Epoch,len(model.metrics_names),n_test_batch))

print('Starting to learn:')
for i in range(N_Epoch):
    print('------- {} out of {} Epoch -----'.format(i+1,N_Epoch))

    # Each epoch uses all training data; batches are presented in random order,
    # and the LSTM state is reset after the last window of each batch
    np.random.shuffle(train_idxes)
    batches_idxes = np.reshape(train_idxes, (-1,batch_size))
    for j, batch  in enumerate(batches_idxes):
        #print('batch {} of {}'.format(j+1,n_batch-n_test_batch))
        for k in range(n_step):
            # slice ends are exclusive: 0:3 gives 0, 1, 2
            metrics_train[i,:,j] += model.train_on_batch(X[batch,k*step_size:(k+1)*step_size], Y[batch,k*step_size:(k+1)*step_size])
        model.reset_states() 

    test_batch_idxes = np.reshape(test_idxes,(-1,batch_size))
    for j, test_batch in enumerate(test_batch_idxes):
        for k in range(n_step):
            metrics_test[i,:,j] += model.test_on_batch(X[test_batch,k*step_size:(k+1)*step_size], Y[test_batch,k*step_size:(k+1)*step_size])
        model.reset_states() 

    metrics_test[i] = metrics_test[i]/float(n_step) # average over the windows; divide only index i, otherwise earlier epochs would be divided again at every epoch
    metrics_train[i] = metrics_train[i]/float(n_step)

    print('Train results:\t {} \n \t {}'.format(model.metrics_names, np.mean(metrics_train[i],axis=1)))
    print('Test results:\t {} \n \t {}'.format(model.metrics_names, np.mean(metrics_test[i],axis=1) ))

Starting to learn:
------- 1 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 1.75470665  0.342     ]
Test results:	 ['loss', 'acc'] 
 	 [ 1.45743048  0.44800001]
------- 2 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 1.37409728  0.48215384]
Test results:	 ['loss', 'acc'] 
 	 [ 1.19803303  0.56850001]
------- 3 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 1.16663097  0.54307693]
Test results:	 ['loss', 'acc'] 
 	 [ 0.99294922  0.595     ]
------- 4 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 0.97576591  0.58746154]
Test results:	 ['loss', 'acc'] 
 	 [ 0.79899043  0.62800002]
------- 5 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 0.79715753  0.64623076]
Test results:	 ['loss', 'acc'] 
 	 [ 0.6583136  0.676    ]
------- 6 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 0.68330992  0.67469232]
Test results:	 ['loss', 'acc'] 
 	 [ 0.58119631  0.68399999]
------- 7 out of 80 Epoch -----
Train results:	 ['los

Train results:	 ['loss', 'acc'] 
 	 [ 0.39320106  0.73100001]
Test results:	 ['loss', 'acc'] 
 	 [ 0.38684799  0.72800002]
------- 55 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 0.39408474  0.72953848]
Test results:	 ['loss', 'acc'] 
 	 [ 0.38499948  0.743     ]
------- 56 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 0.39318943  0.73484617]
Test results:	 ['loss', 'acc'] 
 	 [ 0.38602532  0.73000002]
------- 57 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 0.39167243  0.73353847]
Test results:	 ['loss', 'acc'] 
 	 [ 0.38881311  0.72850001]
------- 58 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 0.39087146  0.73384615]
Test results:	 ['loss', 'acc'] 
 	 [ 0.38414256  0.73500001]
------- 59 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 0.39383398  0.73338464]
Test results:	 ['loss', 'acc'] 
 	 [ 0.38609362  0.73200002]
------- 60 out of 80 Epoch -----
Train results:	 ['loss', 'acc'] 
 	 [ 0.3932171  0.733    ]
Test

... and finally don't forget to save everything ;)

In [7]:
model.save('embedCerg_model_4.h5')
np.save('embedXdata_4.npy',X)
np.save('embedydata_4.npy',Y)
np.save('embedTrainMetrics_4', metrics_train)
np.save('embedTestMetrics_4', metrics_test)
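
Note that `np.save` appends `.npy` when the file name has no extension, so the two metrics files above end up as `embedTrainMetrics_4.npy` and `embedTestMetrics_4.npy`. The sketch below shows the round trip in a temporary directory; the file name `demo_metrics` is hypothetical and the array is a zero-filled stand-in with the same shape as `metrics_train`.

```python
import os
import tempfile
import numpy as np

# Stand-in with the shape of metrics_train: (N_Epoch, number of metrics, training batches)
metrics = np.zeros((80, 2, 13))

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'demo_metrics')  # hypothetical name, no extension

np.save(path, metrics)       # written to path + '.npy'
saved_path = path + '.npy'
restored = np.load(saved_path)
print(os.path.basename(saved_path), restored.shape)
```

The saved model can be restored in a similar way with `keras.models.load_model`.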