# Standard LSTM
## Our standard LSTM is not a traditional LSTM but a peephole LSTM. See our report for more details.
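
For reference, one commonly cited peephole formulation lets each gate see the cell state through an element-wise peephole weight vector. The equations below are a sketch of that textbook variant, not a transcription of the lstm module. The symbols $W$, $U$, $p$ and $b$ are notation assumed here, and the report remains the authority on what is actually implemented.

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + p_i \odot c_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + p_f \odot c_{t-1} + b_f) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + p_o \odot c_t + b_o) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

Dropping the three $p \odot c$ terms gives back the traditional LSTM gates.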

##### Note that our LSTM model takes an experiment parameter with one of the values {'standard', 'moving_average', 'deep'}. Also, nhidden2 (the number of units in layer 2) takes effect only if you choose experiment='deep'.
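
The gating described above can be illustrated with a minimal sketch. The helper name `hidden_layer_sizes` and the constant `VALID_EXPERIMENTS` are hypothetical and are not part of the lstm module; they only show how nhidden2 could be ignored unless experiment='deep'.

```python
# Hypothetical helper, not part of the lstm module used in this notebook.
VALID_EXPERIMENTS = ('standard', 'moving_average', 'deep')


def hidden_layer_sizes(experiment, nhidden, nhidden2):
    """Return the hidden-layer sizes that apply to the chosen experiment."""
    if experiment not in VALID_EXPERIMENTS:
        raise ValueError('unknown experiment: %r' % (experiment,))
    if experiment == 'deep':
        return [nhidden, nhidden2]  # the second layer exists only here
    return [nhidden]                # nhidden2 is ignored


print(hidden_layer_sizes('standard', 300, 300))  # [300]
print(hidden_layer_sizes('deep', 300, 300))      # [300, 300]
```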

##### The paper trains on 4978 sentences (56590 words) and tests on 893 sentences (9198 words), so we combined our train and validation sets in order to use the same data and try to match the paper's results. Since we had already run the model with a separate validation set, we expect it to converge before 40 epochs. Thus, our experiments usually use n <= 40 epochs.
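
Combining the two splits amounts to concatenating them. The sketch below assumes each split is a pair of parallel lists (sentences, labels) where every sentence is a list of word indices. That layout, and the names `merge_splits` and `count_words`, are assumptions for illustration rather than the loader used by lstm.

```python
def merge_splits(train, valid):
    """Concatenate two (sentences, labels) splits into one training split."""
    train_sents, train_labels = train
    valid_sents, valid_labels = valid
    return train_sents + valid_sents, train_labels + valid_labels


def count_words(sentences):
    """Total number of tokens across all sentences."""
    return sum(len(s) for s in sentences)


# Toy data standing in for the real splits.
toy_train = ([[1, 2, 3], [4, 5]], [[0, 0, 1], [2, 0]])
toy_valid = ([[6, 7]], [[0, 3]])

sents, labels = merge_splits(toy_train, toy_valid)
print('Sentences in train: %d, Words in train: %d' % (len(sents), count_words(sents)))
# Sentences in train: 3, Words in train: 7
```

Checking the merged counts against the paper's figures (4978 sentences, 56590 words) is a quick way to confirm that both splits were included.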

In [1]:
"""First we run a grid search over the embedding dimension, which the paper does not report, and try to come as 
close as possible to the paper's results for the simple LSTM. After running around 10 experiments we found that 
emb_dimension = 90 gave the highest result, 94.29. This is still a bit lower than the 94.85 reported in the paper, 
so in the next run we use the Adam optimizer to see if we can get closer."""

import lstm
import imp
imp.reload(lstm)

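# Sweep emb_dimension = 40 + i*10 over i = 0..9; i == 1 (emb_dimension = 50) is skipped.
for i in range(0, 10):
    if i == 1:  # compare by value; `is` tests object identity
        continue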
    param = {
        'experiment':'standard',
        'lr': 0.1,
        'verbose': True,
        'decay': True,
        'win': 3,
        'nhidden': 300,
        'nhidden2':300,
        'seed': 345,
        'emb_dimension': 40+i*10,
        'nepochs': 15,
        'savemodel': False,
        'normal': True,
        'layer_norm': False,
        'minibatch_size':4978,
        'moving_avg':3,
        'simplified_type':'no_forget',
        'folder':'../result_standard'}

    lstm.test_lstm(**param);

Using gpu device 0: GRID K520 (CNMeM is disabled, cuDNN Version is too old. Update to v5, was 3007.)
  "downsample module has been moved to the theano.tensor.signal.pool module.")


verbose: True
win: 3
layer_norm: False
seed: 345
moving_avg: 3
nhidden: 300
decay: True
experiment: standard
lr: 0.1
folder: ../result
minibatch_size: 4978
normal: True
nhidden2: 300
nepochs: 15
simplified_type: no_forget
emb_dimension: 40
savemodel: False
... loading the dataset
Sentences in train: 4978, Words in train: 56590
Sentences in test: 893, Words in test: 9198
... building the model
... training
NEW BEST: epoch 0, minibatch 1/1, best test F1: 83.190
NEW BEST: epoch 1, minibatch 1/1, best test F1: 88.210
NEW BEST: epoch 2, minibatch 1/1, best test F1: 91.110
NEW BEST: epoch 3, minibatch 1/1, best test F1: 91.730
NEW BEST: epoch 4, minibatch 1/1, best test F1: 92.320

NEW BEST: epoch 6, minibatch 1/1, best test F1: 93.260



NEW BEST: epoch 10, minibatch 1/1, best test F1: 93.370


NEW BEST: epoch 13, minibatch 1/1, best test F1: 93.630

('BEST RESULT: epoch', 13, 'best test F1', 93.63, 'with the model', '../result')
verbose: True
win: 3
layer_norm: False
seed: 345
moving_avg: 
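
The Adam run in cell In [2] passes lr / 100 to the optimizer, and its log reports the learning rate halving twice (0.1 to 0.05 to 0.025), each time after ten epochs without a new best. The sketch below reproduces that schedule from the logged best epochs. The patience of 10 and the factor of 0.5 are inferred from the log, and `simulate_decay` and `adam_step_size` are hypothetical names, so treat this as an illustration rather than the code inside lstm.

```python
def adam_step_size(lr, scale=100.0):
    """Learning rate handed to Adam, assuming lr is divided by `scale`."""
    return lr / scale


def simulate_decay(best_epochs, nepochs, lr=0.1, patience=10, factor=0.5):
    """Map each epoch that triggers a decay to the new learning rate.

    The rate is multiplied by `factor` after `patience` consecutive
    epochs without a new best score (assumed rule, inferred from the log).
    """
    decays = {}
    stale = 0
    for epoch in range(nepochs):
        if epoch in best_epochs:
            stale = 0
        else:
            stale += 1
        if stale >= patience:
            lr *= factor
            stale = 0
            decays[epoch] = lr
    return decays


# Epochs that printed NEW BEST in the 35-epoch Adam run.
best_epochs = {0, 1, 3, 8, 19, 20, 23, 34}
print(sorted(simulate_decay(best_epochs, 35).items()))
# [(18, 0.05), (33, 0.025)]
```

The decays land at the end of epochs 18 and 33, which matches the log printing the decay message just before the results of epochs 19 and 34.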

In [2]:
"""After switching to the Adam optimizer and keeping everything else the same (note that for Adam we divide lr by 100 
before passing it to Adam as the learning rate) we got 94.64 in 35 epochs. Since we reached it at epoch 34 and the 
model did not look like it had converged, we might increase the number of epochs to get an even better result."""

import lstm
import imp
imp.reload(lstm)

param = {
    'experiment':'standard',
    'lr': 0.1,
    'verbose': True,
    'decay': True,
    'win': 3,
    'nhidden': 300,
    'nhidden2':300,
    'seed': 345,
    'emb_dimension': 90,
    'nepochs': 35,
    'savemodel': False,
    'normal': True,
    'layer_norm': False,
    'minibatch_size':4978,
    'with_attention':False,
    'simplified_type':'no_forget',
    'folder':'../result_standard'}

lstm.test_lstm(**param);

verbose: True
win: 3
layer_norm: False
seed: 345
minibatch_size: 4978
nhidden: 300
decay: True
with_attention: False
experiment: standard
lr: 0.1
folder: ../result_standard
normal: True
nhidden2: 300
nepochs: 35
simplified_type: no_forget
emb_dimension: 90
savemodel: False
... loading the dataset
Sentences in train: 4978, Words in train: 56590
Sentences in test: 893, Words in test: 9198
... building the model
... training
NEW BEST: epoch 0, minibatch 1/1, best test F1: 91.550
NEW BEST: epoch 1, minibatch 1/1, best test F1: 92.990

NEW BEST: epoch 3, minibatch 1/1, best test F1: 93.890




NEW BEST: epoch 8, minibatch 1/1, best test F1: 93.990










('Decay happened. New Learning Rate:', 0.05)
NEW BEST: epoch 19, minibatch 1/1, best test F1: 94.260
NEW BEST: epoch 20, minibatch 1/1, best test F1: 94.520


NEW BEST: epoch 23, minibatch 1/1, best test F1: 94.610










('Decay happened. New Learning Rate:', 0.025)
NEW BEST: epoch 34, minibatch 1/1, best test F1: 94.640
('BEST RESUL