# Neural network model

Since we are working with text, we train a recurrent neural network, specifically an LSTM. 
Our architecture can be described as many-to-one: for many input words we need to produce one label, 1 for positive and 0 for negative sentiment. 

The detailed architecture is explained below. 

In [1]:
import pandas as pd
import numpy as np
from tensorflow import keras

In [2]:
import os

In [3]:
from tensorflow.keras.layers import LSTM, Dense, Embedding, Bidirectional, Dropout
from tensorflow.keras.regularizers import l1

In [5]:
%load_ext tensorboard
%tensorboard --logdir logs --bind_all

Reusing TensorBoard on port 6006 (pid 14156), started 1:16:28 ago. (Use '!kill 14156' to kill it.)

In [4]:
# load data
train = pd.read_pickle('../data/train/comments_embed.pkl')
test = pd.read_pickle('../data/test/comments_embed.pkl')

In [5]:
# prepare for training
train.head(n=5)

Unnamed: 0,comment,sentiment,comment_ids,comment_text,words_n,x
0,"[bromwel, high, cartoon, comedi, ran, the, sam...",1,"[336, 809, 171, 1915, 1, 148, 27, 26, 48, 1378...",bromwel high cartoon comedi ran the same time ...,98,"[336, 809, 171, 1915, 1, 148, 27, 26, 48, 1378..."
1,"[homeless, houseless, georg, carlin, state, be...",1,"[2736, 658, 12488, 591, 60, 835, 8, 96, 10, 97...",homeless houseless georg carlin state been iss...,310,"[2736, 658, 12488, 591, 60, 835, 8, 96, 10, 97..."
2,"[brilliant, overact, lesley, ann, warren, best...",1,"[523, 2081, 13669, 1105, 3199, 101, 780, 11953...",brilliant overact lesley ann warren best drama...,112,"[523, 2081, 13669, 1105, 3199, 101, 780, 11953..."
3,"[thi, easili, the, most, underr, film, inn, th...",1,"[3, 686, 1, 72, 1846, 7, 8550, 1, 1994, 3971, ...",thi easili the most underr film inn the brook ...,93,"[3, 686, 1, 72, 1846, 7, 8550, 1, 1994, 3971, ..."
4,"[thi, not, the, typic, mel, brook, film, much,...",1,"[3, 5, 1, 668, 3125, 1994, 7, 56, 341, 2303, 5...",thi not the typic mel brook film much less sla...,95,"[3, 5, 1, 668, 3125, 1994, 7, 56, 341, 2303, 5..."


In [6]:
train.x[0]

array([  336,   809,   171,  1915,     1,   148,    27,    26,    48,
        1378,    23,   391,   102,   119,  1359,    96,     1,  1555,
        3746,   238,   154,     4,   336,  1348,    56,  2128,   596,
          53,  1359,     1,  8985,   933,  2800,     1,  1533,   813,
          17,    29,    37,   176,   124,    47,  1108,  1359, 14390,
           1,  3939,     1,   208,   519,    15,   659,     1,   391,
         673,     2,    47,   813,    32,   199,     1,   243,    42,
         813,  3106,   100,  1015,   167,     1,   391,  1021,  1677,
         336,   299,   191,  2523,   105,  5439,    14,    82,  1359,
         813,  1782,   336,   212,     4,    90,   697,   419,    68,
           4,   336,   222,  4349,    25,  1574,     4,     5,     0,
           0])

In [7]:
test.head(n=5)

Unnamed: 0,comment,sentiment,comment_ids,x
0,"[went, and, saw, thi, movi, last, night, after...",1,"[440, 2, 199, 3, 6, 201, 284, 83, 10196, 147, ...","[440, 2, 199, 3, 6, 201, 284, 83, 10196, 147, ..."
1,"[actor, turn, director, bill, paxton, follow, ...",1,"[93, 157, 127, 769, 4320, 279, 949, 1660, 1, 8...","[93, 157, 127, 769, 4320, 279, 949, 1660, 1, 8..."
2,"[recreat, golfer, with, some, knowledg, the, s...",1,"[2592, 9, 26, 1620, 1, 1391, 478, 512, 9, 803,...","[2592, 9, 26, 1620, 1, 1391, 478, 512, 9, 803,..."
3,"[saw, thi, film, sneak, preview, and, delight,...",1,"[199, 3, 7, 2823, 2440, 2, 995, 1, 627, 1413, ...","[199, 3, 7, 2823, 2440, 2, 995, 1, 627, 1413, ..."
4,"[bill, paxton, taken, the, true, stori, the, g...",1,"[769, 4320, 623, 1, 273, 40, 1, 6595, 318, 2, ...","[769, 4320, 623, 1, 273, 40, 1, 6595, 318, 2, ..."


From the previous script, we know that our vocabulary contains 15000 words and that the maximum length of a comment is 100. 
We also set the embedding size to 100 for now. All of these are hyperparameters to be played with later.

In [8]:
COMMENT_SIZE = 100
VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100
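The fixed comment length can be illustrated with a small padding helper. This is only a sketch: the actual preprocessing script is not shown here, and the helper name `pad_or_truncate` is hypothetical. The trailing zeros in `train.x[0]` suggest that short comments are right-padded with id 0, which is what the sketch assumes.

```python
import numpy as np

COMMENT_SIZE = 100  # same value as in the notebook


def pad_or_truncate(ids, size=COMMENT_SIZE, pad_id=0):
    """Return a fixed-length int array: cut long inputs, right-pad short ones."""
    out = np.full(size, pad_id, dtype=np.int64)
    trimmed = np.asarray(ids[:size], dtype=np.int64)
    out[:len(trimmed)] = trimmed
    return out


short = pad_or_truncate([336, 809, 171])        # 3 ids followed by 97 zeros
long = pad_or_truncate(list(range(1, 151)))     # 150 ids cut down to the first 100
print(short.shape, long.shape)
```

Every comment then has the same length, which is what allows the whole data set to be stored as one 2-D array.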

Since the data comes from a pandas DataFrame, the x column is a 1-D np.array of dtype object whose elements are themselves np.arrays, not a single 2-D array. 
This might cause problems when training, so we need to explicitly convert it to a 2-D array. One way is using np.stack:

In [9]:
# no good, we need shape (25000, 100)
train.x.values.shape

(25000,)

In [10]:
train_x = np.stack(train.x.values)

In [11]:
# ok
train_x.shape

(25000, 100)

In [12]:
test_x = np.stack(test.x.values)

In [13]:
# target (to make sure we have np arrays)
train_y = np.array(train.sentiment.values)
test_y = np.array(test.sentiment.values)

Our first neural network consists of the following layers:
- Embedding layer, to train a basic word embedding for the NN to work with. Later we will compare it with pretrained embeddings (or train our own embeddings).
- Bidirectional LSTM layer. We need a recurrent NN since we work with sequential data (text), so we chose LSTM. We also went for the bidirectional variant since we read that it captures context better when making predictions, but there is room to try other architectures. The size of the LSTM layer is also a parameter: we can try different sizes and compare results, and we will start with 64. 
- Dense layer. Since we need one number at the end, either 1 or 0 (positive or negative sentiment), we add a Dense layer with a single unit to transform the result into such a number. For the activation function we chose sigmoid, which outputs a value between 0 and 1. We considered softmax, but since softmax is the generalization of sigmoid to multiclass classification, sigmoid is enough for our binary problem.
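The relationship between sigmoid and softmax mentioned in the last bullet can be checked numerically. The sketch below uses plain NumPy (not Keras) and made-up logits: a two-class softmax over the logits `[0, z]` assigns the second class exactly the probability `sigmoid(z)`.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(logits):
    # subtract the row maximum for numerical stability
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


z = np.array([-2.0, 0.0, 3.5])                      # illustrative logits
two_class = softmax(np.stack([np.zeros_like(z), z], axis=-1))[:, 1]
print(np.allclose(two_class, sigmoid(z)))
```

This is why a single sigmoid unit is sufficient for a binary label.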

Our first NN might be prone to overfitting. In the future, we can add, for example, a Dropout layer to try to prevent it. 

In [17]:
# define NN architecture
class SentimentClassifier_v1(keras.Model):

    def __init__(self, vocab_size, embedding_size, comment_size, lstm_size):
        super(SentimentClassifier_v1, self).__init__()
        
        # train embedding 
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            input_length=comment_size,
            mask_zero=True, 
            trainable=True
        )
    
        self.lstm_layer = Bidirectional(LSTM(lstm_size))
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x):
        x = self.emb(x)
        x = self.lstm_layer(x)
        x = self.output_layer(x)

        return x

In [18]:
# create NN object
nn_v1 = SentimentClassifier_v1(VOCAB_SIZE + 1, EMBEDDING_SIZE, COMMENT_SIZE, 64)
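The model is created with `VOCAB_SIZE + 1` rows in the embedding table. A plausible reading, which is an assumption since the preprocessing is not shown here, is that word ids run from 1 to 15000 and id 0 is reserved for padding. The NumPy sketch below mimics what an embedding lookup and `mask_zero=True` do conceptually; the random table is only a stand-in for trained weights.

```python
import numpy as np

VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100

rng = np.random.default_rng(0)
# one row per id: row 0 for the padding id, rows 1..VOCAB_SIZE for words (assumed layout)
table = rng.normal(size=(VOCAB_SIZE + 1, EMBEDDING_SIZE))

ids = np.array([[336, 809, 15000, 0, 0]])   # toy batch: one comment with two padded positions
vectors = table[ids]                        # lookup is plain row indexing
mask = ids != 0                             # positions the recurrent layer should not skip

print(vectors.shape)
print(mask)
```

Without the extra row, the largest id (15000 under this assumption) would be out of range for the table.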

Before compiling our model, we need to choose an optimizer. 

For the first try, we will go with Adam. Later we can try others such as SGD.
Our loss function is binary_crossentropy.

Our metric is accuracy. We have a balanced dataset (the same number of positive and negative examples), and in such a case we think accuracy is a reasonable metric. 
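To make the loss and the metric concrete, here is a minimal NumPy sketch of binary cross-entropy and of accuracy at a 0.5 threshold. The labels and probabilities are invented for illustration and are not outputs of our model.

```python
import numpy as np


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # clip to avoid log(0)
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))


def accuracy(y_true, y_prob, threshold=0.5):
    predictions = (y_prob > threshold).astype(int)
    return float(np.mean(predictions == y_true))


y_true = np.array([1, 0, 1, 0])            # illustrative labels
y_prob = np.array([0.9, 0.2, 0.4, 0.6])    # illustrative predicted probabilities

print(binary_crossentropy(y_true, y_prob))
print(accuracy(y_true, y_prob))
```

The loss rewards confident correct predictions, while accuracy only counts which side of the threshold each prediction falls on.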

In [19]:
# add callbacks - tensorboard and compile
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v1"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v1.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'])

... aaand it is time for training!

In [20]:
nn_v1.fit(
    x=train_x,
    y=train_y,
    batch_size=32,
    epochs=10,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f95cdb0a400>

Evaluation:
validation accuracy: 0.80 after the 9th epoch

Although our first neural network achieves fairly good results, the train and validation accuracy over the epochs show that our NN is overfitting. 

We can add a Dropout layer to try to reduce overfitting. 
Another option to prevent overfitting might be an activity regularizer (l1/l2). We might also think about adding bias to our data, or even decreasing the complexity of the network. We also haven't performed hyperparameter tuning yet, which might help to achieve better results as well. 

And maybe try a unidirectional instead of a bidirectional LSTM layer. 
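As a reminder of what a Dropout layer does, here is a small NumPy sketch of inverted dropout. It illustrates the idea and is not the Keras implementation: during training a random subset of activations is zeroed and the rest are scaled up, while at inference time the input passes through unchanged.

```python
import numpy as np


def dropout(x, rate, rng, training=True):
    """Zero each element with probability `rate` and rescale the survivors."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


rng = np.random.default_rng(42)
activations = np.ones((4, 8))

train_out = dropout(activations, rate=0.5, rng=rng)                  # entries are 0.0 or 2.0
eval_out = dropout(activations, rate=0.5, rng=rng, training=False)   # unchanged

print(train_out)
```

The rescaling keeps the expected value of each activation the same in both modes.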


In [21]:
# define NN architecture
class SentimentClassifier_v2(keras.Model):

    def __init__(self, vocab_size, embedding_size, comment_size, lstm_size):
        super(SentimentClassifier_v2, self).__init__()
        
        # train embedding 
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            input_length=comment_size,
            mask_zero=True, 
            trainable=True
        )
    
        self.lstm_layer = Bidirectional(LSTM(lstm_size, activity_regularizer=l1(0.001)))
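        self.dropout_layer = Dropout(0.5)
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x, training=None):
        x = self.emb(x)
        x = self.lstm_layer(x)
        # dropout is only active while training
        x = self.dropout_layer(x, training=training)
        x = self.output_layer(x)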

        return x
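The `activity_regularizer=l1(0.001)` argument adds a penalty on the layer outputs to the loss. A rough NumPy sketch of that penalty follows. The division by the batch size reflects how TensorFlow 2 is documented to scale activity regularization, but treat it as an assumption for the version used here; the activations are made up.

```python
import numpy as np


def l1_activity_penalty(activations, factor=0.001):
    """factor * sum(|a|), averaged over the batch dimension."""
    batch_size = activations.shape[0]
    return factor * np.abs(activations).sum() / batch_size


acts = np.array([[0.5, -0.5, 0.0],
                 [1.0, 0.0, -1.0]])   # illustrative layer outputs, batch of 2
penalty = l1_activity_penalty(acts)
print(penalty)
```

Larger activations cost more, which pushes the layer towards sparse, small outputs.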

In [22]:
# create NN object
nn_v2 = SentimentClassifier_v2(VOCAB_SIZE + 1, EMBEDDING_SIZE, COMMENT_SIZE, 64)

In [23]:
# compile and train
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v2"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v2.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'])

nn_v2.fit(
    x=train_x,
    y=train_y,
    batch_size=32,
    epochs=15,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<tensorflow.python.keras.callbacks.History at 0x7f95ae4154e0>

Evaluation: we stay at around 80% accuracy on the validation data set. 

Again, we can see that our baseline model is not perfect and overfits. Over the epochs, validation loss gets higher while train loss decreases. This is something that we will try to improve during the next project iterations. 
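One common reaction to a rising validation loss is to stop training once it has not improved for a few epochs. The pure-Python sketch below shows that patience rule; the function name and the loss values are invented for illustration and are not our measured results. Keras ships a ready-made `keras.callbacks.EarlyStopping` callback with `monitor` and `patience` arguments that could be used instead.

```python
def early_stop(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 1-based, for a patience rule."""
    best, best_idx, waited = float("inf"), 0, 0
    for idx, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_idx, waited = loss, idx, 0
        else:
            waited += 1
            if waited >= patience:
                return idx, best_idx
    return len(val_losses), best_idx


# invented values, only to show the shape of an overfitting curve
val_losses = [0.60, 0.48, 0.45, 0.47, 0.52, 0.58, 0.66]
stop_epoch, best_epoch = early_stop(val_losses)
print(stop_epoch, best_epoch)
```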

### Iteration 2 - custom embedding

In [14]:
embedding = np.load('word_embeddings.npy')

In [16]:
COMMENT_SIZE = 100
VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100
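Since the loaded matrix is passed to the Embedding layer as its weights, its shape has to match `(input_dim, output_dim)`, which here would be `(VOCAB_SIZE + 1, EMBEDDING_SIZE)`. A small check like the following can catch a mismatch early. The helper name is hypothetical, and the zero matrix is only a stand-in for the contents of word_embeddings.npy, whose actual shape is not shown here.

```python
import numpy as np


def check_embedding_matrix(matrix, vocab_size, embedding_size):
    """Raise if the matrix cannot be used as Embedding weights of this size."""
    expected = (vocab_size, embedding_size)
    if matrix.shape != expected:
        raise ValueError(
            f"embedding matrix has shape {matrix.shape}, expected {expected}"
        )
    return matrix


demo = np.zeros((15001, 100), dtype=np.float32)   # stand-in for np.load(...)
check_embedding_matrix(demo, 15000 + 1, 100)
print("shape ok")
```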

In [28]:
# define NN architecture
class SentimentClassifier_v3(keras.Model):

    def __init__(self, vocab_size, embedding_size, embedding_matrix, comment_size, lstm_size):
        super(SentimentClassifier_v3, self).__init__()
        
        # pretrained embedding, kept frozen (trainable=False)
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            weights=[embedding_matrix],
            input_length=comment_size,
            mask_zero=True, 
            trainable=False
        )
    
        self.lstm_layer = Bidirectional(LSTM(lstm_size, activity_regularizer=l1(0.001)))
        self.dropout_layer = Dropout(0.5)
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x, training=None):
        x = self.emb(x)
        x = self.lstm_layer(x)
        # dropout is only active while training
        x = self.dropout_layer(x, training=training)
        x = self.output_layer(x)

        return x

In [29]:
# create NN object
nn_v3 = SentimentClassifier_v3(VOCAB_SIZE + 1, EMBEDDING_SIZE, embedding, COMMENT_SIZE, 64)

In [30]:
# compile and train
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v3"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v3.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'])

nn_v3.fit(
    x=train_x,
    y=train_y,
    batch_size=32,
    epochs=15,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<tensorflow.python.keras.callbacks.History at 0x7fa2cc9c4b38>

### Trying LSTM layer instead of bidirectional

In [18]:
embedding = np.load('word_embeddings.npy')

In [19]:
COMMENT_SIZE = 100
VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100

In [25]:
# define NN architecture
class SentimentClassifier_v4(keras.Model):

    def __init__(self, vocab_size, embedding_size, embedding_matrix, comment_size, lstm_size):
        super(SentimentClassifier_v4, self).__init__()
        
        # pretrained embedding, kept frozen (trainable=False)
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            weights=[embedding_matrix],
            input_length=comment_size,
            mask_zero=True, 
            trainable=False
        )
    
        self.lstm_layer = LSTM(lstm_size, activity_regularizer=l1(0.001))
        self.dropout_layer = Dropout(0.5)
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x, training=None):
        x = self.emb(x)
        x = self.lstm_layer(x)
        # dropout is only active while training
        x = self.dropout_layer(x, training=training)
        x = self.output_layer(x)

        return x

In [26]:
# create NN object
nn_v4 = SentimentClassifier_v4(VOCAB_SIZE + 1, EMBEDDING_SIZE, embedding, COMMENT_SIZE, 64)
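Dropping the Bidirectional wrapper roughly halves the recurrent part of the model. The back-of-the-envelope sketch below counts trainable weights with the standard LSTM formula (four gates, each with input weights, recurrent weights and a bias). The helper name is hypothetical and the numbers are derived from the formula, not read from a model summary.

```python
def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)


EMBEDDING_SIZE = 100
LSTM_SIZE = 64

single = lstm_params(EMBEDDING_SIZE, LSTM_SIZE)
bidirectional = 2 * single   # forward and backward copies of the same cell

print(single, bidirectional)
```

The Dense layer also shrinks, since it now sees 64 features instead of the 128 produced by concatenating both directions.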

In [33]:
# compile and train
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v4"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v4.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'])

nn_v4.fit(
    x=train_x,
    y=train_y,
    batch_size=32,
    epochs=15,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<tensorflow.python.keras.callbacks.History at 0x1e3b1233cf8>

### Trying more epochs

In [15]:
embedding = np.load('word_embeddings.npy')

In [16]:
COMMENT_SIZE = 100
VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100

In [17]:
# define NN architecture
class SentimentClassifier_v5(keras.Model):

    def __init__(self, vocab_size, embedding_size, embedding_matrix, comment_size, lstm_size):
        super(SentimentClassifier_v5, self).__init__()
        
        # pretrained embedding, kept frozen (trainable=False)
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            weights=[embedding_matrix],
            input_length=comment_size,
            mask_zero=True, 
            trainable=False
        )
    
        self.lstm_layer = LSTM(lstm_size, activity_regularizer=l1(0.001))
        self.dropout_layer = Dropout(0.5)
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x, training=None):
        x = self.emb(x)
        x = self.lstm_layer(x)
        # dropout is only active while training
        x = self.dropout_layer(x, training=training)
        x = self.output_layer(x)

        return x

In [18]:
# create NN object
nn_v5 = SentimentClassifier_v5(VOCAB_SIZE + 1, EMBEDDING_SIZE, embedding, COMMENT_SIZE, 64)

In [19]:
# compile and train
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v5"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v5.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'])

nn_v5.fit(
    x=train_x,
    y=train_y,
    batch_size=32,
    epochs=30,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<tensorflow.python.keras.callbacks.History at 0x292c818ecc0>

### v6 bidirectional layer with forward and backward layer

In [20]:
embedding = np.load('word_embeddings.npy')

In [21]:
COMMENT_SIZE = 100
VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100

In [37]:
# define NN architecture
class SentimentClassifier_v6(keras.Model):

    def __init__(self, vocab_size, embedding_size, embedding_matrix, comment_size, lstm_size):
        super(SentimentClassifier_v6, self).__init__()
        
        # pretrained embedding, kept frozen (trainable=False)
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            weights=[embedding_matrix],
            input_length=comment_size,
            mask_zero=True, 
            trainable=False
        )
    
        self.forward_layer = LSTM(lstm_size)
        self.backward_layer = LSTM(lstm_size, activation='relu', go_backwards=True)
        self.lstm_layer = Bidirectional(self.forward_layer, backward_layer=self.backward_layer)
        self.dropout_layer = Dropout(0.5)
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x, training=None):
        x = self.emb(x)
        x = self.lstm_layer(x)
        # dropout is only active while training
        x = self.dropout_layer(x, training=training)
        x = self.output_layer(x)

        return x

In [38]:
# create NN object
nn_v6 = SentimentClassifier_v6(VOCAB_SIZE + 1, EMBEDDING_SIZE, embedding, COMMENT_SIZE, 64)
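Conceptually, a bidirectional layer runs one recurrent cell over the sequence left to right and a second one right to left, then merges the two final states (concatenation is the default merge mode). The NumPy sketch below shows this with a plain tanh RNN as a simplified stand-in for the LSTM; the weights are random and masking is ignored.

```python
import numpy as np


def simple_rnn_last_state(x, w_in, w_rec):
    """Run a tanh RNN over x (timesteps, features) and return the final state."""
    h = np.zeros(w_rec.shape[0])
    for step in x:
        h = np.tanh(step @ w_in + h @ w_rec)
    return h


rng = np.random.default_rng(0)
timesteps, features, units = 5, 4, 3
x = rng.normal(size=(timesteps, features))

# separate weights for each direction
fwd = (rng.normal(size=(features, units)), rng.normal(size=(units, units)))
bwd = (rng.normal(size=(features, units)), rng.normal(size=(units, units)))

h_forward = simple_rnn_last_state(x, *fwd)
h_backward = simple_rnn_last_state(x[::-1], *bwd)   # same input, reversed in time
merged = np.concatenate([h_forward, h_backward])    # what the Dense layer would receive

print(merged.shape)
```

Passing separate forward and backward layers, as done in this version, makes it possible to configure the two directions differently.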

In [39]:
# compile and train
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v6"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v6.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'])

nn_v6.fit(
    x=train_x,
    y=train_y,
    batch_size=32,
    epochs=30,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<tensorflow.python.keras.callbacks.History at 0x292e519fcc0>

### v7 lstm_size = 128

In [40]:
embedding = np.load('word_embeddings.npy')

In [41]:
COMMENT_SIZE = 100
VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100

In [45]:
# define NN architecture
class SentimentClassifier_v7(keras.Model):

    def __init__(self, vocab_size, embedding_size, embedding_matrix, comment_size, lstm_size):
        super(SentimentClassifier_v7, self).__init__()
        
        # pretrained embedding, kept frozen (trainable=False)
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            weights=[embedding_matrix],
            input_length=comment_size,
            mask_zero=True, 
            trainable=False
        )
    
        self.forward_layer = LSTM(lstm_size)
        self.backward_layer = LSTM(lstm_size, activation='relu', go_backwards=True)
        self.lstm_layer = Bidirectional(self.forward_layer, backward_layer=self.backward_layer)
        self.dropout_layer = Dropout(0.5)
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x, training=None):
        x = self.emb(x)
        x = self.lstm_layer(x)
        # dropout is only active while training
        x = self.dropout_layer(x, training=training)
        x = self.output_layer(x)

        return x

In [46]:
# create NN object
nn_v7 = SentimentClassifier_v7(VOCAB_SIZE + 1, EMBEDDING_SIZE, embedding, COMMENT_SIZE, 128)

In [47]:
# compile and train
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v7"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v7.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'])

nn_v7.fit(
    x=train_x,
    y=train_y,
    batch_size=32,
    epochs=30,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<tensorflow.python.keras.callbacks.History at 0x292fbf935f8>

### v8 lstm_size = 32

In [48]:
embedding = np.load('word_embeddings.npy')

In [49]:
COMMENT_SIZE = 100
VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100

In [50]:
# define NN architecture
class SentimentClassifier_v8(keras.Model):

    def __init__(self, vocab_size, embedding_size, embedding_matrix, comment_size, lstm_size):
        super(SentimentClassifier_v8, self).__init__()
        
        # pretrained embedding, kept frozen (trainable=False)
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            weights=[embedding_matrix],
            input_length=comment_size,
            mask_zero=True, 
            trainable=False
        )
    
        self.forward_layer = LSTM(lstm_size)
        self.backward_layer = LSTM(lstm_size, activation='relu', go_backwards=True)
        self.lstm_layer = Bidirectional(self.forward_layer, backward_layer=self.backward_layer)
        self.dropout_layer = Dropout(0.5)
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x, training=None):
        x = self.emb(x)
        x = self.lstm_layer(x)
        # dropout is only active while training
        x = self.dropout_layer(x, training=training)
        x = self.output_layer(x)

        return x

In [51]:
# create NN object
nn_v8 = SentimentClassifier_v8(VOCAB_SIZE + 1, EMBEDDING_SIZE, embedding, COMMENT_SIZE, 32)

In [52]:
# compile and train
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v8"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v8.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'])

nn_v8.fit(
    x=train_x,
    y=train_y,
    batch_size=32,
    epochs=30,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<tensorflow.python.keras.callbacks.History at 0x293155969b0>

### v9 - long training, 100 epochs, learning_rate = 0.02, lstm_size = 16

In [53]:
embedding = np.load('word_embeddings.npy')

In [54]:
COMMENT_SIZE = 100
VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100

In [57]:
# define NN architecture
class SentimentClassifier_v9(keras.Model):

    def __init__(self, vocab_size, embedding_size, embedding_matrix, comment_size, lstm_size):
        super(SentimentClassifier_v9, self).__init__()
        
        # pretrained embedding, kept frozen (trainable=False)
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            weights=[embedding_matrix],
            input_length=comment_size,
            mask_zero=True, 
            trainable=False
        )
    
        self.forward_layer = LSTM(lstm_size)
        self.backward_layer = LSTM(lstm_size, activation='relu', go_backwards=True)
        self.lstm_layer = Bidirectional(self.forward_layer, backward_layer=self.backward_layer)
        self.dropout_layer = Dropout(0.5)
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x, training=None):
        x = self.emb(x)
        x = self.lstm_layer(x)
        # dropout is only active while training
        x = self.dropout_layer(x, training=training)
        x = self.output_layer(x)

        return x

In [58]:
# create NN object
nn_v9 = SentimentClassifier_v9(VOCAB_SIZE + 1, EMBEDDING_SIZE, embedding, COMMENT_SIZE, 16)
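This version changes the learning rate, so it may help to recall how that value enters an Adam update. The sketch below is the textbook form of a single Adam step in NumPy, not the Keras implementation; the parameter and gradient values are made up, and the defaults mirror the commonly documented ones (0.001, 0.9, 0.999).

```python
import numpy as np


def adam_step(param, grad, m, v, t, learning_rate=0.001,
              beta_1=0.9, beta_2=0.999, eps=1e-7):
    """One textbook Adam update; returns the new parameter and moment estimates."""
    m = beta_1 * m + (1 - beta_1) * grad
    v = beta_2 * v + (1 - beta_2) * grad ** 2
    m_hat = m / (1 - beta_1 ** t)     # bias correction for the first moment
    v_hat = v / (1 - beta_2 ** t)     # bias correction for the second moment
    param = param - learning_rate * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


p0 = np.array([1.0])      # illustrative parameter
g = np.array([0.5])       # illustrative gradient
zeros = np.zeros(1)

p_default, _, _ = adam_step(p0, g, zeros, zeros, t=1)
p_large, _, _ = adam_step(p0, g, zeros, zeros, t=1, learning_rate=0.02)
print(p_default, p_large)
```

On the first step the update size is close to the learning rate itself, so a value of 0.02 moves the weights about twenty times further than the default.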

In [60]:
# compile and train
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v9"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v9.compile(
    # the learning rate is an argument of the optimizer, not of compile()
    optimizer=keras.optimizers.Adam(learning_rate=0.02),
    loss='binary_crossentropy',
    metrics=['accuracy'])

nn_v9.fit(
    x=train_x,
    y=train_y,
    batch_size=32,
    epochs=100,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100


Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78/100
Epoch 79/100
Epoch 80/100
Epoch 81/100
Epoch 82/100
Epoch 83/100
Epoch 84/100
Epoch 85/100
Epoch 86/100
Epoch 87/100
Epoch 88/100
Epoch 89/100
Epoch 90/100
Epoch 91/100
Epoch 92/100
Epoch 93/100
Epoch 94/100
Epoch 95/100
Epoch 96/100
Epoch 97/100
Epoch 98/100
Epoch 99/100
Epoch 100/100


<tensorflow.python.keras.callbacks.History at 0x29324960a58>

### v10 - batch_size = 16 and learning_rate = 0.01

In [14]:
embedding = np.load('word_embeddings.npy')

In [15]:
COMMENT_SIZE = 100
VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100

In [16]:
# define NN architecture
class SentimentClassifier_v10(keras.Model):

    def __init__(self, vocab_size, embedding_size, embedding_matrix, comment_size, lstm_size):
        super(SentimentClassifier_v10, self).__init__()
        
        # pretrained embedding, kept frozen (trainable=False)
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            weights=[embedding_matrix],
            input_length=comment_size,
            mask_zero=True, 
            trainable=False
        )
    
        self.forward_layer = LSTM(lstm_size)
        self.backward_layer = LSTM(lstm_size, activation='relu', go_backwards=True)
        self.lstm_layer = Bidirectional(self.forward_layer, backward_layer=self.backward_layer)
        self.dropout_layer = Dropout(0.5)
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x, training=None):
        x = self.emb(x)
        x = self.lstm_layer(x)
        # dropout is only active while training
        x = self.dropout_layer(x, training=training)
        x = self.output_layer(x)

        return x

In [17]:
# create NN object
nn_v10 = SentimentClassifier_v10(VOCAB_SIZE + 1, EMBEDDING_SIZE, embedding, COMMENT_SIZE, 16)
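Halving the batch size doubles the number of weight updates per epoch, which is one reason this run behaves differently from the earlier ones. A quick calculation for our 25000 training samples follows; the helper name is hypothetical.

```python
import math


def steps_per_epoch(n_samples, batch_size):
    # the last batch may be smaller, hence the ceiling
    return math.ceil(n_samples / batch_size)


steps_32 = steps_per_epoch(25000, 32)
steps_16 = steps_per_epoch(25000, 16)
print(steps_32, steps_16)
```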

In [18]:
# compile and train
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v10"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v10.compile(
    # the learning rate is an argument of the optimizer, not of compile()
    optimizer=keras.optimizers.Adam(learning_rate=0.01),
    loss='binary_crossentropy',
    metrics=['accuracy'])

nn_v10.fit(
    x=train_x,
    y=train_y,
    batch_size=16,
    epochs=30,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<tensorflow.python.keras.callbacks.History at 0x209f1e3f048>

### v11 - added activity regularizer

In [19]:
embedding = np.load('word_embeddings.npy')

In [20]:
COMMENT_SIZE = 100
VOCAB_SIZE = 15000
EMBEDDING_SIZE = 100

In [25]:
# define NN architecture
class SentimentClassifier_v11(keras.Model):

    def __init__(self, vocab_size, embedding_size, embedding_matrix, comment_size, lstm_size):
        super(SentimentClassifier_v11, self).__init__()
        
        # pretrained embedding, kept frozen (trainable=False)
        self.emb = Embedding(
            input_dim=vocab_size,
            output_dim=embedding_size,
            weights=[embedding_matrix],
            input_length=comment_size,
            mask_zero=True, 
            trainable=False
        )
    
        self.forward_layer = LSTM(lstm_size, activity_regularizer=l1(0.001))
        self.backward_layer = LSTM(lstm_size, activation='relu', go_backwards=True, activity_regularizer=l1(0.001))
        self.lstm_layer = Bidirectional(self.forward_layer, backward_layer=self.backward_layer)
        self.dropout_layer = Dropout(0.5)
        self.output_layer = Dense(1, activation="sigmoid")

    def call(self, x, training=None):
        x = self.emb(x)
        x = self.lstm_layer(x)
        # dropout is only active while training
        x = self.dropout_layer(x, training=training)
        x = self.output_layer(x)

        return x

In [26]:
# create NN object
nn_v11 = SentimentClassifier_v11(VOCAB_SIZE + 1, EMBEDDING_SIZE, embedding, COMMENT_SIZE, 32)

In [27]:
# compile and train
callbacks = [
    keras.callbacks.TensorBoard(
        log_dir=os.path.join("logs", "sentiment_classifier_v11"),
        histogram_freq=1,
        profile_batch=0
    )
]

nn_v11.compile(
    # the learning rate is an argument of the optimizer, not of compile()
    optimizer=keras.optimizers.Adam(learning_rate=0.03),
    loss='binary_crossentropy',
    metrics=['accuracy'])

nn_v11.fit(
    x=train_x,
    y=train_y,
    batch_size=32,
    epochs=30,
    validation_data=(test_x, test_y),
    callbacks=callbacks
)

Train on 25000 samples, validate on 25000 samples
Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<tensorflow.python.keras.callbacks.History at 0x20990f71b38>