In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

import keras
from keras.models import Model, load_model
from keras.layers import *
from keras.optimizers import *
from keras.callbacks import *

import tensorflow as tf
import numpy as np

import string
import random
import math

# Programmed on
# Keras version 2.0.8
# Tensorflow version 1.3.0
print(keras.__version__)
print(tf.__version__)

In [None]:
# EXECUTION_MODE = "train" # to train a new model
EXECUTION_MODE = "load_pretrained" # to load a saved model that was trained previously
FILE_SUFFIX = "2000epouchs"

# 1. Palindrome Detection

This problem was chosen to demonstrate whether a Bidirectional LSTM has any advantage over a Unidirectional LSTM.

The idea is that the same LSTM layer is run over the sequence in both the original and the reverse directions.

Since a palindrome is identical in both directions, the two outputs of the LSTM layer should also be the same.

It is then a simple operation to compare the two LSTM outputs to detect whether the sequence is a palindrome.
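
As a rough illustration of this idea, the sketch below runs one toy recurrent function over a sequence in both directions and compares the final states. The function `toy_recurrent_state` and its update rule are assumptions made for this example; they are a stand-in for the LSTM, not the Keras layer itself.

```python
def toy_recurrent_state(sequence):
    """Fold a sequence into a single state; the result depends on the order."""
    state = 0
    for ch in sequence:
        # An arbitrary order-sensitive update, standing in for one LSTM step.
        state = state * 31 + ord(ch)
    return state


def looks_like_palindrome(sequence):
    forward = toy_recurrent_state(sequence)
    backward = toy_recurrent_state(sequence[::-1])
    # Equal states in both directions indicate that the sequence
    # reads the same forwards and backwards.
    return forward == backward


print(looks_like_palindrome("abcddcba"))  # True
print(looks_like_palindrome("abcdefgh"))  # False
```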

# 2. Merge mode of the Bidirectional layer

For a palindrome, the outputs from the LSTM in both directions should be the same.

The two outputs are then merged in the Bidirectional layer, and the result is fed into a Dense layer.

The different merge modes can be compared using this example, in which both directions produce the same output:

```
#!python

output = [2,5,3,7]
```

## 2.1 Concatenation merge mode

```
#!python

output1 = [2,5,3,7]
output2 = [2,5,3,7]
merged_result = output1 + output2
print(merged_result)
# result is [2, 5, 3, 7, 2, 5, 3, 7]
```

Assume that the label of a palindrome is **1** and that of a non-palindrome is **0**.

The Dense layer then has to be trained with this merged result as its input to produce the output **1**.

A single Dense layer might not be sufficient for this function.

## 2.2 Average merge mode

```
#!python

output1 = [2,5,3,7]
output2 = [2,5,3,7]
merged_result = [(x1 + x2) / 2 for x1, x2 in zip(output1, output2)]
print(merged_result)
# merged_result is [2.0, 5.0, 3.0, 7.0]
```

Assume that the label of a palindrome is **1** and that of a non-palindrome is **0**.

The Dense layer then has to be trained with this merged result as its input to produce the output **1**.

A single Dense layer might not be sufficient for this function.
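
Besides "concat" and "ave", the Keras `Bidirectional` wrapper also accepts "sum" and "mul" as its `merge_mode`. A minimal sketch of these two modes, with plain Python lists standing in for the LSTM output tensors (the variable names are chosen for this example only):

```python
forward = [2, 5, 3, 7]
backward = [2, 5, 3, 7]

# "sum" adds the two outputs element by element.
merged_sum = [f + b for f, b in zip(forward, backward)]
# "mul" multiplies the two outputs element by element.
merged_mul = [f * b for f, b in zip(forward, backward)]

print(merged_sum)  # [4, 10, 6, 14]
print(merged_mul)  # [4, 25, 9, 49]
```

Like "ave", both modes keep the output the same length as a single direction, so the Dense layer no longer sees the two directions separately.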

## 2.3 Difference merge mode

Imagine that there is a difference merge mode.

```
#!python

output1 = [2,5,3,7]
output2 = [2,5,3,7]
merged_result = [x1 - x2 for x1, x2 in zip(output1, output2)]
print(merged_result)
# merged_result is [0, 0, 0, 0]
```

Assume that the label of a palindrome is **1** and that of a non-palindrome is **0**.

The Dense layer then has to be trained with this merged result as its input to produce the output **1**.

A single Dense layer might not be sufficient for this function.

However, we can use the **Dense layer to simulate a difference merge mode**.

Suppose we invert the labels, so that the label of a palindrome is **0** and that of a non-palindrome is **1**, and we use the "concat" merge mode or the "None" merge mode.

The Dense layer can then be trained easily to take the difference between the two halves of the merged result and produce a **0**.

```
#!python

output1 = [2,5,3,7]
output2 = [2,5,3,7]
merged_result = output1 + output2
print(merged_result)
# result is [2, 5, 3, 7, 2, 5, 3, 7]

def Dense_function(inputs):
    bias = 0
    weights = [1, 1, 1, 1, -1, -1, -1, -1]
    weight_inputs = [x*w for x,w in zip(inputs, weights)]
    return sum(weight_inputs) + bias

output = Dense_function(merged_result)
# output is 0
```
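
To see how the same hand-picked weights respond when the two directions disagree, the following sketch applies them to mismatched pairs. The mismatched values are made up for illustration, and `dense_difference` is a hypothetical helper, not part of the notebook.

```python
def dense_difference(output1, output2, bias=0):
    """Weighted sum with +1 on the first half and -1 on the second half."""
    merged = output1 + output2  # "concat" merge
    weights = [1] * len(output1) + [-1] * len(output2)
    return sum(x * w for x, w in zip(merged, weights)) + bias


identical = dense_difference([2, 5, 3, 7], [2, 5, 3, 7])
mismatched = dense_difference([2, 5, 3, 7], [1, 5, 3, 9])
cancelling = dense_difference([2, 5, 3, 7], [3, 5, 3, 6])

print(identical)   # 0
print(mismatched)  # -1
print(cancelling)  # 0
```

The last case shows a limitation of these particular weights: differences of opposite sign cancel, so a trained layer would need weights that are not all of equal magnitude to separate such pairs.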

# 3. Training Results

|LSTM size|Uni (x2)| Bi_Con| Bi_Ave| Bi_Con_Inv|
|:--------|-------:|------:|------:|----------:|
|16       |  0.9949| 0.9995| 0.9971|     0.9992|
|8        |  0.9889| 0.9787| 0.9943|     0.9897|
|4        |  0.9780|#0.9909| 0.9655|     0.9619|
|4 @ 2000e|  0.9697|#0.8623| 0.9775|     0.9717|
|2        |  0.9263| 0.9299| 0.8409|     0.9528|
|1        |  0.8418| 0.8820| 0.8682|     0.8753|

**Note**: The actual size of the LSTM output for the Unidirectional model is **twice** that of the Bidirectional models.

The validation accuracy of the Bidirectional concat model with LSTM(4) is very inconsistent (see the entries marked with `#` in the table), and it seems to be very susceptible to getting "stuck" at either extremely good or anomalously bad accuracy.

## 3.1. Conclusion

The accuracies of the models vary over a rather wide range between training runs.

The differences in accuracy between the models are too small to draw a conclusion.

# 4. Other Observations

In a set of random sequences, the ratio of palindromes to the total number of sequences is:

$$\begin{aligned}
& \frac{num^{\thinspace len \thinspace / \thinspace 2}}{num^{\thinspace len}}
= \frac{1}{num^{\thinspace len \thinspace / \thinspace 2}}
\\
\\ &\text{where}
\\ & num \rightarrow \text{number of possible characters}
\\ & len \rightarrow \text{length of the sequence}
\end{aligned}$$

So in a set of random sequences of $10$ characters drawn from $26$ letters, only a fraction of about $8.417 \times 10^{-8}$ should be palindromes.
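
The ratio can be checked numerically. The sketch below evaluates the formula for 26 characters and length 10, and compares it against a brute-force count on a small case. The function names are chosen for this example, and the formula is applied to even lengths only.

```python
from itertools import product


def palindrome_ratio(num_chars, length):
    """Fraction of all sequences of an even length that are palindromes."""
    return num_chars ** (length // 2) / num_chars ** length


def brute_force_ratio(alphabet, length):
    """Count palindromes by enumerating every sequence (small cases only)."""
    sequences = list(product(alphabet, repeat=length))
    palindromes = [s for s in sequences if s == s[::-1]]
    return len(palindromes) / len(sequences)


print(palindrome_ratio(26, 10))
print(palindrome_ratio(3, 4), brute_force_ratio("abc", 4))
```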

In this notebook, when a model has low accuracy, most of the wrong detections seem to be **false positives**.

This problem might be caused by half of the training data being palindromes, which is far more than in a set of random sequences.

Hence the model is biased toward labeling a sequence as a palindrome.


# 5. Further Work

1. From the training results, it seems like the Dense layer is doing the bulk of the detection. Hence it will be good to compare with a model that only has Dense layers.
2. Build a GAN to generate and detect palindromes.

In [None]:
def plot_train(list_of_histories):
    
    if 'acc' in list_of_histories[0].history:
        meas='acc'
        loc='lower right'
    else:
        meas='loss'
        loc='upper right'
    
    train_meas = []
    val_meas = []
    for hist in list_of_histories:
        train_meas = train_meas + hist.history[meas]
        val_meas = val_meas + hist.history['val_'+meas]

    plt.plot(train_meas)
    plt.plot(val_meas)
    plt.title('model '+meas)
    plt.ylabel(meas)
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc=loc)

In [None]:
CHARS_LIST = list(string.ascii_lowercase)
print(CHARS_LIST)

In [None]:
def vectors_to_letters(vectors):
    letters = [CHARS_LIST[np.argmax(x)] for x in vectors]
    return "".join(letters)

In [None]:
NUM_OF_CHARS = len(CHARS_LIST)
PALINDROME_SIZE = 10

def generate_palindrome_or_not():
    while True:
        # 1 to make palindrome
        # 0 to make sequence of random characters
        make_palindrome = random.randint(0, 1)

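        letters = []
        # A palindrome only needs its first half (rounded up) drawn at random.
        for i in range( math.ceil( PALINDROME_SIZE/(make_palindrome+1) )):
            letter_vector = np.zeros(NUM_OF_CHARS, dtype=np.int_)
            rand_idx = random.randint(0, NUM_OF_CHARS-1)
            letter_vector[rand_idx] = 1
            letters.append(letter_vector.tolist())

        # Mirror the leading letters to fill the remaining positions;
        # this loop does nothing when the sequence is already full length.
        palindrome = letters
        for i in range(PALINDROME_SIZE - len(letters) -1, -1, -1):
            palindrome.append(letters[i])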

        yield (palindrome, make_palindrome)

        
x, y = next(generate_palindrome_or_not())
print("One sample - Input shape is:", np.array(x).shape)
print("One sample - Output shape is:", np.array(y).shape)
print("One sample - Input text is:\n", vectors_to_letters(x))
print("One sample - Input vector is:\n", np.array(x))
print("One sample - Output is:", y)
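
The construction used by the generator can be easier to follow on plain strings. The sketch below draws the first half at random and mirrors it, in the same way as the cell above. The helper `random_sequence` and the seeded `rng` are assumptions for this example; the notebook itself works on one-hot vectors.

```python
import math
import random
import string


def random_sequence(size, make_palindrome, rng):
    """Return `size` random lowercase letters, mirrored if make_palindrome is 1."""
    # Draw only the first half (rounded up) when building a palindrome.
    count = math.ceil(size / (make_palindrome + 1))
    letters = [rng.choice(string.ascii_lowercase) for _ in range(count)]
    # Append the mirror image of the leading letters to reach the full size.
    mirrored = letters[: size - count][::-1]
    return "".join(letters + mirrored)


rng = random.Random(0)  # seeded so the sketch is repeatable
print(random_sequence(10, 1, rng))
print(random_sequence(10, 0, rng))
```

For an odd `size`, the middle letter is drawn once and not mirrored, so the result is still a palindrome of the requested length.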

In [None]:
BATCH_SIZE = 64
EPOCH_SIZE = 2000

def batch_for_network_generator():
    while True:
        batch_of_sentences = [ next(generate_palindrome_or_not()) for i in range(BATCH_SIZE) ]
        X, Y = map(np.array, zip(*batch_of_sentences))
        yield X, Y

        
X, Y = next(batch_for_network_generator())
print("Batched - Input shape is:", X.shape)
print("Batched - Output shape is:", Y.shape)
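
The batching step relies on `zip(*...)` to split a list of `(input, label)` pairs into separate inputs and labels. A minimal sketch with made-up string samples in place of the one-hot vectors:

```python
samples = [
    ("abccba", 1),
    ("abcdef", 0),
    ("xyzzyx", 1),
]

# zip(*samples) regroups the pairs into one tuple of inputs and one of labels.
inputs, labels = zip(*samples)

print(inputs)  # ('abccba', 'abcdef', 'xyzzyx')
print(labels)  # (1, 0, 1)
```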

In [None]:
def predict_with_model(model, batch_generator):
    
    test, actual = next(batch_generator())
    
    for i in range(20):
        a, b = next(batch_generator())
        test = np.concatenate((test, a))
        actual = np.concatenate((actual, b))
    
    preds = model.predict(test)

    test = [vectors_to_letters(x) for x in test]


    # Threshold the sigmoid outputs at 0.5 to get hard 0/1 predictions.
    preds = (preds >= 0.5).astype("uint8").reshape(preds.shape[0])

    # Stacking strings with integers makes NumPy convert every entry to a
    # string, which is why the predictions are compared with "1" and "0" below.
    comparison = [test, actual, preds]
    comparison = np.array(comparison).T.tolist()
    wrongs = [elem for elem in comparison if elem[1] != elem[2]]
    false_positives = [elem for elem in wrongs if elem[2] == "1"]
    false_negatives = [elem for elem in wrongs if elem[2] == "0"]
    
    
    print("Predicted correctly:", ((len(comparison)-len(wrongs))*100.0/len(comparison)), "%")
    
    if(len(wrongs) != 0):
        print("Out of the wrongs, percentage of false positives:", (len(false_positives)*100.0/len(wrongs)), "%")
        print("Out of the wrongs, percentage of false negatives:", (len(false_negatives)*100.0/len(wrongs)), "%")
    print("Wrongs")
    print(["Test", "Actual", "Predicted"])
    print(np.array(wrongs))
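
The error breakdown printed by `predict_with_model` can be sketched on plain lists. The helper `count_errors` and the sample labels are made up for this example.

```python
def count_errors(actual, predicted):
    """Return (false_positives, false_negatives) for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    false_positives = sum(1 for a, p in pairs if a == 0 and p == 1)
    false_negatives = sum(1 for a, p in pairs if a == 1 and p == 0)
    return false_positives, false_negatives


actual = [1, 0, 0, 1, 0, 1]
predicted = [1, 1, 0, 0, 1, 1]
print(count_errors(actual, predicted))  # (2, 1)
```

Note that for the model trained on inverted labels, a predicted **1** means "not a palindrome", so the roles of false positives and false negatives are swapped in that report.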

In [None]:
#     filepath="checkpoints/Palindrome-LSTM_bidirectional-weights-{epoch:02d}-{loss:.4f}.hdf5"
#     checkpoint = ModelCheckpoint(filepath, monitor='loss', save_best_only=True, mode='min') # , verbose=1
reduce_LR = ReduceLROnPlateau(monitor='loss',factor = 0.9, patience=3,cooldown=2, min_lr = 0.00001)
early_stopping = EarlyStopping(monitor='val_acc', patience=50) #, min_delta=0.0001)
callbacks_list = [reduce_LR, early_stopping] # , checkpoint ]
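
To get a feel for the `ReduceLROnPlateau` settings, the sketch below counts how many reductions by `factor` it takes to bring the learning rate from the initial value used in the model cells down to `min_lr`. It assumes that every plateau triggers one reduction; how often that happens in practice depends on the loss curve.

```python
import math

initial_lr = 0.01   # Adam(lr=0.01) in the model cells
factor = 0.9        # ReduceLROnPlateau(factor=0.9)
min_lr = 0.00001    # ReduceLROnPlateau(min_lr=0.00001)

# Smallest n such that initial_lr * factor**n <= min_lr.
reductions = math.ceil(math.log(min_lr / initial_lr) / math.log(factor))
print(reductions)  # 66
```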

# LSTM Unidirectional Model

In [None]:
if(EXECUTION_MODE == "train"):
    
    inp = Input(shape=(PALINDROME_SIZE, NUM_OF_CHARS))
    print('our input shape is ',(PALINDROME_SIZE, NUM_OF_CHARS) )
    x = LSTM(8)(inp)
#     x = Dropout(0.2)(x) # Dropout is commented to remove randomness for better comparison
    output = Dense(1, activation ='sigmoid')(x)
    
    
    adam = Adam(lr=0.01)
    unidirectional_model = Model(inputs = inp, outputs=output )
    unidirectional_model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    unidirectional_model.summary()

    
    list_of_histories = []

In [None]:
if(EXECUTION_MODE == "train"):
    
    history = unidirectional_model.fit_generator(
        batch_for_network_generator(),
        steps_per_epoch=BATCH_SIZE,
        validation_data=batch_for_network_generator(),
        validation_steps=BATCH_SIZE//4,
        epochs=EPOCH_SIZE,
        callbacks=callbacks_list
    )

    list_of_histories.append(history)

    plot_train(list_of_histories)
    
# BATCH_SIZE = 64
# PALINDROME_SIZE = 10
# NUM_OF_CHARS = 26
# PATIENCE = 50

# 1 x LSTM(256) - val_acc at 0.9990 after 94 epochs, peak val_acc at 0.9990 after 43 epochs

# 1 x LSTM(32) - val_acc at 0.9922 after 121 epochs, peak val_acc at 0.9971 after 70 epochs
# 1 x LSTM(32) - val_acc at 0.9941 after 201 epochs, peak val_acc at 0.9990 after 150 epochs
# 1 x LSTM(32) - val_acc at 0.9941 after 284 epochs, peak val_acc at 1.0000 after 233 epochs
# 1 x LSTM(32) - val_acc at 0.9932 after 384 epochs, peak val_acc at 1.0000 after 333 epochs
# 1 x LSTM(32) - val_acc at 0.9980 after 439 epochs, peak val_acc at 1.0000 after 388 epochs
# 1 x LSTM(32) - val_acc at 0.9980 after 000 epochs, peak val_acc at 0.9990 after 000 epochs

# 1 x LSTM(16) - val_acc at 0.9951 after 242 epochs, peak val_acc at 0.9990 after 191 epochs
# 1 x LSTM(16) - val_acc at 0.9961 after 334 epochs, peak val_acc at 1.0000 after 283 epochs

# 1 x LSTM(16) - val_acc at 0.9912 after 190 epochs, peak val_acc at 0.9922 after 139 epochs
# 1 x LSTM(16) - val_acc at 0.9912 after 262 epochs, peak val_acc at 0.9990 after 211 epochs
# 1 x LSTM(16) - val_acc at 0.9873 after 333 epochs, peak val_acc at 0.9961 after 282 epochs
# 1 x LSTM(16) - val_acc at 0.9873 after 483 epochs, peak val_acc at 0.9980 after 432 epochs
# 1 x LSTM(16) - val_acc at 0.9883 after 564 epochs, peak val_acc at 0.9961 after 211 epochs
# 1 x LSTM(16) - val_acc at 0.9883 after 622 epochs, peak val_acc at 0.9951 after 571 epochs

# 1 x LSTM(8) - val_acc at 0.9824 after 259 epochs, peak val_acc at 0.9863 after 208 epochs
# 1 x LSTM(8) - val_acc at 0.9746 after 321 epochs, peak val_acc at 0.9854 after 270 epochs
# 1 x LSTM(8) - val_acc at 0.9795 after 417 epochs, peak val_acc at 0.9863 after 366 epochs
# 1 x LSTM(8) - val_acc at 0.9775 after 496 epochs, peak val_acc at 0.9873 after 445 epochs
# 1 x LSTM(8) - val_acc at 0.9805 after 584 epochs, peak val_acc at 0.9883 after 533 epochs
# 1 x LSTM(8) - val_acc at 0.9785 after 662 epochs, peak val_acc at 0.9854 after 611 epochs
# 1 x LSTM(8) - val_acc at 0.9707 after 796 epochs, peak val_acc at 0.9893 after 745 epochs
# 1 x LSTM(8) - val_acc at 0.9795 after 926 epochs, peak val_acc at 0.9873 after 875 epochs
# 1 x LSTM(8) - val_acc at 0.9785 after 1018 epochs, peak val_acc at 0.9873 after 967 epochs

# 1 x LSTM(4) - val_acc at 0.9229 after 264 epochs, peak val_acc at 0.9443 after 213 epochs
# 1 x LSTM(4) - val_acc at 0.9258 after 347 epochs, peak val_acc at 0.9385 after 296 epochs
# 1 x LSTM(4) - val_acc at 0.9229 after 614 epochs, peak val_acc at 0.9414 after 563 epochs
# 1 x LSTM(4) - val_acc at 0.9326 after 678 epochs, peak val_acc at 0.9404 after 627 epochs
# 1 x LSTM(4) - val_acc at 0.9170 after 748 epochs, peak val_acc at 0.9443 after 697 epochs
# 1 x LSTM(4) - val_acc at 0.9326 after 840 epochs, peak val_acc at 0.9453 after 789 epochs
# 1 x LSTM(4) - val_acc at 0.9219 after 1042 epochs, peak val_acc at 0.9453 after 961 epochs
# 1 x LSTM(4) - val_acc at 0.9346 after 1136 epochs, peak val_acc at 0.9463 after 1085 epochs

# 1 x LSTM(2) - val_acc at 0.8281 after 208 epochs, peak val_acc at 0.8652 after 157 epochs
# 1 x LSTM(2) - val_acc at 0.8623 after 287 epochs, peak val_acc at 0.8701 after 236 epochs
# 1 x LSTM(2) - val_acc at 0.8389 after 367 epochs, peak val_acc at 0.8623 after 316 epochs
# 1 x LSTM(2) - val_acc at 0.8408 after 421 epochs, peak val_acc at 0.8730 after 370 epochs
# 1 x LSTM(2) - val_acc at 0.8252 after 482 epochs, peak val_acc at 0.8594 after 431 epochs
# 1 x LSTM(2) - val_acc at 0.8555 after 544 epochs, peak val_acc at 0.8643 after 493 epochs
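
The logged `val_acc` values move in fixed steps, which follows from the number of validation samples per epoch. A minimal sketch of the arithmetic, assuming every validation batch holds `BATCH_SIZE` samples:

```python
BATCH_SIZE = 64

steps_per_epoch = BATCH_SIZE          # as passed to fit_generator
validation_steps = BATCH_SIZE // 4    # 16 validation batches

train_samples = steps_per_epoch * BATCH_SIZE
val_samples = validation_steps * BATCH_SIZE
print(train_samples, val_samples)  # 4096 1024

# val_acc can only change in steps of 1 / val_samples.
for correct in (1024, 1023, 1022):
    print(round(correct / val_samples, 4))
```

With 1024 validation samples, one misclassified sequence lowers `val_acc` by roughly 0.001, so differences of a few thousandths between runs correspond to only a handful of sequences.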

In [None]:
filepath = "Palindrome-LSTM_unidirectional_%s.h5" % FILE_SUFFIX
if(EXECUTION_MODE == "train"):
    unidirectional_model.save(filepath)
elif(EXECUTION_MODE == "load_pretrained"):
    unidirectional_model = load_model(filepath)
    print("Model is loaded from a pretrained model")
    unidirectional_model.summary()

    
predict_with_model(unidirectional_model, batch_for_network_generator)

# LSTM Bidirectional-Concat Model

In [None]:
if(EXECUTION_MODE == "train"):
    
    inp = Input(shape=(PALINDROME_SIZE, NUM_OF_CHARS))
    print('our input shape is ',(PALINDROME_SIZE, NUM_OF_CHARS) )
    x = Bidirectional( LSTM(4), merge_mode='concat' )(inp)
#     x = Dropout(0.2)(x)  # Dropout is commented to remove randomness for better comparison
    output = Dense(1, activation ='sigmoid')(x)
    
    adam = Adam(lr=0.01)
    bidirectional_concat_model = Model(inputs = inp, outputs=output )
    bidirectional_concat_model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    bidirectional_concat_model.summary()
    
    list_of_histories = []

In [None]:
if(EXECUTION_MODE == "train"):

    history = bidirectional_concat_model.fit_generator(
        batch_for_network_generator(),
        steps_per_epoch=BATCH_SIZE,
        validation_data=batch_for_network_generator(),
        validation_steps=BATCH_SIZE//4,
        epochs=EPOCH_SIZE,
        callbacks=callbacks_list
    )

    list_of_histories.append(history)

    plot_train(list_of_histories)

# BATCH_SIZE = 64
# PALINDROME_SIZE = 10
# NUM_OF_CHARS = 26
# PATIENCE = 50

# 1 x Bidirectional(LSTM(16), "concat") - val_acc at 0.9961 after 116 epochs, peak val_acc at 0.9990 after 65 epochs
# 1 x Bidirectional(LSTM(16), "concat") - val_acc at 0.9980 after 184 epochs, peak val_acc at 1.0000 after 133 epochs

# 1 x Bidirectional(LSTM(8), "concat") - val_acc at 0.9775 after 131 epochs, peak val_acc at 0.9912 after 80 epochs
# 1 x Bidirectional(LSTM(8), "concat") - val_acc at 0.9746 after 186 epochs, peak val_acc at 0.9873 after 135 epochs
# 1 x Bidirectional(LSTM(8), "concat") - val_acc at 0.9805 after 308 epochs, peak val_acc at 0.9922 after 257 epochs
# 1 x Bidirectional(LSTM(8), "concat") - val_acc at 0.9785 after 360 epochs, peak val_acc at 0.9893 after 309 epochs
# 1 x Bidirectional(LSTM(8), "concat") - val_acc at 0.9824 after 424 epochs, peak val_acc at 0.9893 after 373 epochs

# 1 x Bidirectional(LSTM(4), "concat") - val_acc at 0.9932 after 131 epochs, peak val_acc at 0.9932 after 80 epochs
# 1 x Bidirectional(LSTM(4), "concat") - val_acc at 0.9863 after 203 epochs, peak val_acc at 0.9971 after 152 epochs
# 1 x Bidirectional(LSTM(4), "concat") - val_acc at 0.9932 after 271 epochs, peak val_acc at 0.9971 after 220 epochs
# 1 x Bidirectional(LSTM(4), "concat") - val_acc at 0.9922 after 351 epochs, peak val_acc at 0.9971 after 300 epochs
# 1 x Bidirectional(LSTM(4), "concat") - val_acc at 0.9873 after 408 epochs, peak val_acc at 0.9990 after 357 epochs
# 1 x Bidirectional(LSTM(4), "concat") - val_acc at 0.9951 after 472 epochs, peak val_acc at 0.9961 after 421 epochs
# 1 x Bidirectional(LSTM(4), "concat") - val_acc at 0.9893 after 560 epochs, peak val_acc at 0.9971 after 509 epochs

# 1 x Bidirectional(LSTM(2), "concat") - val_acc at 0.9316 after 253 epochs, peak val_acc at 0.9492 after 202 epochs
# 1 x Bidirectional(LSTM(2), "concat") - val_acc at 0.9258 after 330 epochs, peak val_acc at 0.9492 after 279 epochs
# 1 x Bidirectional(LSTM(2), "concat") - val_acc at 0.9277 after 413 epochs, peak val_acc at 0.9512 after 362 epochs
# 1 x Bidirectional(LSTM(2), "concat") - val_acc at 0.9316 after 480 epochs, peak val_acc at 0.9463 after 429 epochs
# 1 x Bidirectional(LSTM(2), "concat") - val_acc at 0.9326 after 542 epochs, peak val_acc at 0.9463 after 491 epochs

# This round of training seems to be an anomaly
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.7197 after 159 epochs, peak val_acc at 0.7871 after 108 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.7627 after 243 epochs, peak val_acc at 0.7920 after 192 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.7480 after 310 epochs, peak val_acc at 0.7920 after 259 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.7607 after 379 epochs, peak val_acc at 0.7910 after 328 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.7549 after 443 epochs, peak val_acc at 0.7832 after 392 epochs

# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.8633 after 76 epochs, peak val_acc at 0.9062 after 25 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.8779 after 137 epochs, peak val_acc at 0.9072 after 86 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.8926 after 201 epochs, peak val_acc at 0.9072 after 150 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.8799 after 263 epochs, peak val_acc at 0.9043 after 212 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.8818 after 400 epochs, peak val_acc at 0.9053 after 349 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.8818 after 460 epochs, peak val_acc at 0.9150 after 409 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.8857 after 551 epochs, peak val_acc at 0.9160 after 500 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.8848 after 610 epochs, peak val_acc at 0.9121 after 559 epochs
# 1 x Bidirectional(LSTM(1), "concat") - val_acc at 0.8906 after 686 epochs, peak val_acc at 0.9092 after 635 epochs

In [None]:
filepath = "Palindrome-LSTM_bidirectional_concat_%s.h5" % FILE_SUFFIX
if(EXECUTION_MODE == "train"):
    bidirectional_concat_model.save(filepath)
elif(EXECUTION_MODE == "load_pretrained"):
    bidirectional_concat_model = load_model(filepath)
    print("Model is loaded from a pretrained model")
    bidirectional_concat_model.summary()
    
predict_with_model(bidirectional_concat_model, batch_for_network_generator)

# LSTM Bidirectional-Average Model

In [None]:
if(EXECUTION_MODE == "train"):
    
    inp = Input(shape=(PALINDROME_SIZE, NUM_OF_CHARS))
    print('our input shape is ',(PALINDROME_SIZE, NUM_OF_CHARS) )
    x = Bidirectional( LSTM(4), merge_mode='ave' )(inp)
#     x = Dropout(0.2)(x)  # Dropout is commented to remove randomness for better comparison
    output = Dense(1, activation ='sigmoid')(x)
    
    adam = Adam(lr=0.01)
    bidirectional_ave_model = Model(inputs = inp, outputs=output )
    bidirectional_ave_model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    bidirectional_ave_model.summary()
    
    list_of_histories = []

In [None]:
if(EXECUTION_MODE == "train"):

    history = bidirectional_ave_model.fit_generator(
        batch_for_network_generator(),
        steps_per_epoch=BATCH_SIZE,
        validation_data=batch_for_network_generator(),
        validation_steps=BATCH_SIZE//4,
        epochs=EPOCH_SIZE,
        callbacks=callbacks_list
    )

    list_of_histories.append(history)

    plot_train(list_of_histories)

# BATCH_SIZE = 64
# PALINDROME_SIZE = 10
# NUM_OF_CHARS = 26
# PATIENCE = 50

# 1 x Bidirectional(LSTM(16), "ave") - val_acc at 0.9961 after 169 epochs, peak val_acc at 0.9990 after 118 epochs
# 1 x Bidirectional(LSTM(16), "ave") - val_acc at 0.9980 after 234 epochs, peak val_acc at 1.0000 after 183 epochs
# 1 x Bidirectional(LSTM(16), "ave") - val_acc at 0.9971 after 327 epochs, peak val_acc at 1.0000 after 276 epochs
# 1 x Bidirectional(LSTM(16), "ave") - val_acc at 0.9971 after 391 epochs, peak val_acc at 1.0000 after 340 epochs

# 1 x Bidirectional(LSTM(8), "ave") - val_acc at 0.9971 after 173 epochs, peak val_acc at 0.9971 after 122 epochs
# 1 x Bidirectional(LSTM(8), "ave") - val_acc at 0.9922 after 291 epochs, peak val_acc at 0.9990 after 240 epochs
# 1 x Bidirectional(LSTM(8), "ave") - val_acc at 0.9961 after 389 epochs, peak val_acc at 1.0000 after 338 epochs
# 1 x Bidirectional(LSTM(8), "ave") - val_acc at 0.9961 after 450 epochs, peak val_acc at 0.9980 after 399 epochs
# 1 x Bidirectional(LSTM(8), "ave") - val_acc at 0.9932 after 504 epochs, peak val_acc at 0.9990 after 403 epochs
# 1 x Bidirectional(LSTM(8), "ave") - val_acc at 0.9932 after 607 epochs, peak val_acc at 1.0000 after 556 epochs
# 1 x Bidirectional(LSTM(8), "ave") - val_acc at 0.9922 after 715 epochs, peak val_acc at 0.9990 after 664 epochs
# 1 x Bidirectional(LSTM(8), "ave") - val_acc at 0.9941 after 770 epochs, peak val_acc at 0.9990 after 719 epochs

# 1 x Bidirectional(LSTM(4), "ave") - val_acc at 0.9648 after 129 epochs, peak val_acc at 0.9736 after 78 epochs
# 1 x Bidirectional(LSTM(4), "ave") - val_acc at 0.9609 after 301 epochs, peak val_acc at 0.9795 after 250 epochs
# 1 x Bidirectional(LSTM(4), "ave") - val_acc at 0.9668 after 413 epochs, peak val_acc at 0.9775 after 362 epochs
# 1 x Bidirectional(LSTM(4), "ave") - val_acc at 0.9648 after 503 epochs, peak val_acc at 0.9795 after 452 epochs
# 1 x Bidirectional(LSTM(4), "ave") - val_acc at 0.9727 after 609 epochs, peak val_acc at 0.9785 after 558 epochs
# 1 x Bidirectional(LSTM(4), "ave") - val_acc at 0.9629 after 695 epochs, peak val_acc at 0.9785 after 644 epochs

# 1 x Bidirectional(LSTM(2), "ave") - val_acc at 0.8301 after 138 epochs, peak val_acc at 0.9082 after 87 epochs
# 1 x Bidirectional(LSTM(2), "ave") - val_acc at 0.8564 after 194 epochs, peak val_acc at 0.8652 after 143 epochs
# 1 x Bidirectional(LSTM(2), "ave") - val_acc at 0.8516 after 316 epochs, peak val_acc at 0.8672 after 265 epochs
# 1 x Bidirectional(LSTM(2), "ave") - val_acc at 0.8369 after 365 epochs, peak val_acc at 0.8643 after 314 epochs
# 1 x Bidirectional(LSTM(2), "ave") - val_acc at 0.8281 after 430 epochs, peak val_acc at 0.8740 after 379 epochs
# 1 x Bidirectional(LSTM(2), "ave") - val_acc at 0.8369 after 608 epochs, peak val_acc at 0.8809 after 557 epochs
# 1 x Bidirectional(LSTM(2), "ave") - val_acc at 0.8574 after 662 epochs, peak val_acc at 0.8643 after 611 epochs
# 1 x Bidirectional(LSTM(2), "ave") - val_acc at 0.8359 after 757 epochs, peak val_acc at 0.8701 after 706 epochs
# 1 x Bidirectional(LSTM(2), "ave") - val_acc at 0.8350 after 811 epochs, peak val_acc at 0.8760 after 760 epochs

# 1 x Bidirectional(LSTM(1), "ave") - val_acc at 0.8623 after 102 epochs, peak val_acc at 0.8936 after 51 epochs
# 1 x Bidirectional(LSTM(1), "ave") - val_acc at 0.8721 after 197 epochs, peak val_acc at 0.8984 after 146 epochs
# 1 x Bidirectional(LSTM(1), "ave") - val_acc at 0.8857 after 252 epochs, peak val_acc at 0.8857 after 201 epochs
# 1 x Bidirectional(LSTM(1), "ave") - val_acc at 0.8594 after 384 epochs, peak val_acc at 0.8896 after 333 epochs
# 1 x Bidirectional(LSTM(1), "ave") - val_acc at 0.8662 after 447 epochs, peak val_acc at 0.8896 after 396 epochs
# 1 x Bidirectional(LSTM(1), "ave") - val_acc at 0.8633 after 535 epochs, peak val_acc at 0.8896 after 484 epochs

In [None]:
filepath = "Palindrome-LSTM_bidirectional_ave_%s.h5" % FILE_SUFFIX
if(EXECUTION_MODE == "train"):
    bidirectional_ave_model.save(filepath)
elif(EXECUTION_MODE == "load_pretrained"):
    bidirectional_ave_model = load_model(filepath)
    print("Model is loaded from a pretrained model")
    bidirectional_ave_model.summary()
    
predict_with_model(bidirectional_ave_model, batch_for_network_generator)

# Invert Target

In [None]:
def batch_for_network_inverter():
    while True:
        X, Y = next(batch_for_network_generator())
        Y ^= 1
        yield X, Y
        
# next(batch_for_network_inverter())
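
The inversion relies on XOR with 1, which flips a 0/1 label. A minimal sketch on a plain list (the notebook applies the same operation in place to a NumPy integer array):

```python
labels = [1, 0, 0, 1]

# XOR with 1 flips each bit: 1 -> 0 and 0 -> 1.
inverted = [y ^ 1 for y in labels]
print(inverted)  # [0, 1, 1, 0]

# Applying the flip twice restores the original labels.
restored = [y ^ 1 for y in inverted]
print(restored == labels)  # True
```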

# LSTM Bidirectional-Concat Model - Inverted, with "0" as palindrome

In [None]:
if(EXECUTION_MODE == "train"):
    
    inp = Input(shape=(PALINDROME_SIZE, NUM_OF_CHARS))
    print('our input shape is ',(PALINDROME_SIZE, NUM_OF_CHARS) )
    x = Bidirectional( LSTM(4), merge_mode='concat' )(inp)
#     x = Dropout(0.2)(x)  # Dropout is commented to remove randomness for better comparison
    output = Dense(1, activation ='sigmoid')(x)
    
    adam = Adam(lr=0.01)
    bidirectional_concat_inverted_model = Model(inputs = inp, outputs=output )
    bidirectional_concat_inverted_model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    bidirectional_concat_inverted_model.summary()
    
    list_of_histories = []

In [None]:
if(EXECUTION_MODE == "train"):

    history = bidirectional_concat_inverted_model.fit_generator(
        batch_for_network_inverter(),
        steps_per_epoch=BATCH_SIZE,
        validation_data=batch_for_network_inverter(),
        validation_steps=BATCH_SIZE//4,
        epochs=EPOCH_SIZE,
        callbacks=callbacks_list
    )

    list_of_histories.append(history)

    plot_train(list_of_histories)

# BATCH_SIZE = 64
# PALINDROME_SIZE = 10
# NUM_OF_CHARS = 26
# PATIENCE = 50

# 1 x Bi(LSTM(16), "concat"), inverted - val_acc at 0.9990 after 170 epochs, peak val_acc at 1.0000 after 119 epochs
# 1 x Bi(LSTM(16), "concat"), inverted - val_acc at 0.9980 after 225 epochs, peak val_acc at 1.0000 after 174 epochs
# 1 x Bi(LSTM(16), "concat"), inverted - val_acc at 0.9990 after 278 epochs, peak val_acc at 1.0000 after 227 epochs
# 1 x Bi(LSTM(16), "concat"), inverted - val_acc at 1.0000 after 330 epochs, peak val_acc at 1.0000 after 279 epochs
# 1 x Bi(LSTM(16), "concat"), inverted - val_acc at 0.9990 after 386 epochs, peak val_acc at 1.0000 after 335 epochs
# 1 x Bi(LSTM(16), "concat"), inverted - val_acc at 1.0000 after 438 epochs, peak val_acc at 1.0000 after 387 epochs

# 1 x Bi(LSTM(8), "concat"), inverted - val_acc at 0.9941 after 265 epochs, peak val_acc at 0.9961 after 214 epochs
# 1 x Bi(LSTM(8), "concat"), inverted - val_acc at 0.9902 after 332 epochs, peak val_acc at 0.9951 after 281 epochs
# 1 x Bi(LSTM(8), "concat"), inverted - val_acc at 0.9912 after 393 epochs, peak val_acc at 0.9941 after 342 epochs
# 1 x Bi(LSTM(8), "concat"), inverted - val_acc at 0.9902 after 488 epochs, peak val_acc at 0.9951 after 437 epochs
# 1 x Bi(LSTM(8), "concat"), inverted - val_acc at 0.9844 after 546 epochs, peak val_acc at 0.9951 after 495 epochs
# 1 x Bi(LSTM(8), "concat"), inverted - val_acc at 0.9883 after 611 epochs, peak val_acc at 0.9951 after 560 epochs

# 1 x Bi(LSTM(4), "concat"), inverted - val_acc at 0.9629 after 152 epochs, peak val_acc at 0.9678 after 101 epochs
# 1 x Bi(LSTM(4), "concat"), inverted - val_acc at 0.9639 after 224 epochs, peak val_acc at 0.9717 after 173 epochs
# 1 x Bi(LSTM(4), "concat"), inverted - val_acc at 0.9609 after 325 epochs, peak val_acc at 0.9746 after 274 epochs
# 1 x Bi(LSTM(4), "concat"), inverted - val_acc at 0.9658 after 427 epochs, peak val_acc at 0.9795 after 376 epochs
# 1 x Bi(LSTM(4), "concat"), inverted - val_acc at 0.9619 after 526 epochs, peak val_acc at 0.9795 after 475 epochs
# 1 x Bi(LSTM(4), "concat"), inverted - val_acc at 0.9551 after 633 epochs, peak val_acc at 0.9756 after 582 epochs
# 1 x Bi(LSTM(4), "concat"), inverted - val_acc at 0.9678 after 717 epochs, peak val_acc at 0.9648 after 666 epochs
# 1 x Bi(LSTM(4), "concat"), inverted - val_acc at 0.9531 after 835 epochs, peak val_acc at 0.9775 after 784 epochs
# 1 x Bi(LSTM(4), "concat"), inverted - val_acc at 0.9668 after 890 epochs, peak val_acc at 0.9756 after 839 epochs
# 1 x Bi(LSTM(4), "concat"), inverted - val_acc at 0.9609 after 966 epochs, peak val_acc at 0.9775 after 915 epochs

# 1 x Bi(LSTM(2), "concat"), inverted - val_acc at 0.9512 after 166 epochs, peak val_acc at 0.9619 after 115 epochs
# 1 x Bi(LSTM(2), "concat"), inverted - val_acc at 0.9463 after 233 epochs, peak val_acc at 0.9639 after 182 epochs
# 1 x Bi(LSTM(2), "concat"), inverted - val_acc at 0.9551 after 345 epochs, peak val_acc at 0.9697 after 294 epochs
# 1 x Bi(LSTM(2), "concat"), inverted - val_acc at 0.9580 after 453 epochs, peak val_acc at 0.9697 after 402 epochs
# 1 x Bi(LSTM(2), "concat"), inverted - val_acc at 0.9443 after 547 epochs, peak val_acc at 0.9678 after 496 epochs
# 1 x Bi(LSTM(2), "concat"), inverted - val_acc at 0.9590 after 641 epochs, peak val_acc at 0.9707 after 590 epochs
# 1 x Bi(LSTM(2), "concat"), inverted - val_acc at 0.9541 after 716 epochs, peak val_acc at 0.9717 after 665 epochs
# 1 x Bi(LSTM(2), "concat"), inverted - val_acc at 0.9561 after 783 epochs, peak val_acc at 0.9697 after 732 epochs
# 1 x Bi(LSTM(2), "concat"), inverted - val_acc at 0.9512 after 851 epochs, peak val_acc at 0.9707 after 800 epochs

# 1 x Bi(LSTM(1), "concat"), inverted - val_acc at 0.8760 after 74 epochs, peak val_acc at 0.9033 after 23 epochs
# 1 x Bi(LSTM(1), "concat"), inverted - val_acc at 0.8711 after 131 epochs, peak val_acc at 0.9023 after 80 epochs
# 1 x Bi(LSTM(1), "concat"), inverted - val_acc at 0.8770 after 299 epochs, peak val_acc at 0.9102 after 248 epochs
# 1 x Bi(LSTM(1), "concat"), inverted - val_acc at 0.8867 after 374 epochs, peak val_acc at 0.9062 after 323 epochs
# 1 x Bi(LSTM(1), "concat"), inverted - val_acc at 0.8711 after 552 epochs, peak val_acc at 0.9062 after 501 epochs
# 1 x Bi(LSTM(1), "concat"), inverted - val_acc at 0.8770 after 612 epochs, peak val_acc at 0.9072 after 561 epochs
# 1 x Bi(LSTM(1), "concat"), inverted - val_acc at 0.8965 after 701 epochs, peak val_acc at 0.8955 after 650 epochs
# 1 x Bi(LSTM(1), "concat"), inverted - val_acc at 0.8740 after 779 epochs, peak val_acc at 0.9209 after 728 epochs
# 1 x Bi(LSTM(1), "concat"), inverted - val_acc at 0.8594 after 839 epochs, peak val_acc at 0.9053 after 788 epochs
# 1 x Bi(LSTM(1), "concat"), inverted - val_acc at 0.8643 after 891 epochs, peak val_acc at 0.9014 after 840 epochs

In [None]:
filepath = "Palindrome-LSTM_bidirectional_concat_inverted_%s.h5" % FILE_SUFFIX
if(EXECUTION_MODE == "train"):
    bidirectional_concat_inverted_model.save(filepath)
elif(EXECUTION_MODE == "load_pretrained"):
    bidirectional_concat_inverted_model = load_model(filepath)
    print("Model is loaded from a pretrained model")
    bidirectional_concat_inverted_model.summary()

    
predict_with_model(bidirectional_concat_inverted_model, batch_for_network_inverter)