### RNN / Naive Bayes Based Irony Detector
#### Author : Shubhajit Basak

##### In this task we will build an Irony Detector or Classifier. It has two part - first we will create a Naive Bayes with Bag Of Words Model. Then in the second part we will create a RNN Sequence Model for the same

In this task you will develop a system to detect irony in text. We will use the data from the SemEval-2018 task on irony detection. You should use the file `SemEval2018-T3-train-taskA.txt` from Blackboard it consists of examples as follows:

```csv
Tweet index     Label   Tweet text
1       1       Sweet United Nations video. Just in time for Christmas. #imagine #NoReligion  http://t.co/fej2v3OUBR
2       1       @mrdahl87 We are rumored to have talked to Erv's agent... and the Angels asked about Ed Escobar... that's hardly nothing    ;)
3       1       Hey there! Nice to see you Minnesota/ND Winter Weather 
4       0       3 episodes left I'm dying over here
```


In [1]:
from nltk import word_tokenize
from collections import Counter
import numpy as np
from random import shuffle
from collections import defaultdict

# Task 1

Read all the data and find the size of vocabulary of the dataset (ignoring case) and the number of positive and negative examples.

In [2]:
# Read the data from the file
inputfile='SemEval2018-T3-train-taskA.txt'
f = open(inputfile,'r',encoding="utf8")
document = f.readlines()[1:]
f.close()

In [3]:
positivecounts = 0 # store the Positive Irony Count
negetiveCounts = 0 # Store the negetive Irony Count
vocab = [] # get a list of all words
data =[] # Prepare the data in the intended format of list of tuples
# data_Y =[]
for line in document:
    line_segment = line.split("\t")
    # Check and  increment the positive or negetive class count
    if(line_segment[1]=="1"):
        positivecounts += 1
    else:
        negetiveCounts += 1
    # Tokenise the sentence
    sen = word_tokenize(line_segment[2].rstrip().lower())
    vocab.append(sen)
    # Create the tuple that fits in the required input format
    tup = (int(line_segment[0]),int(line_segment[1]),sen)
    # Create the data
    data.append(tup)
# Flatten the Vocab List 
vocab_flatten = [word.strip() for wordlist in vocab for word in wordlist]
# Get the unique Vocab List
vocab_flatten_unique = list(set(vocab_flatten))

print("Unique Vocabulary Count: ", len(vocab_flatten_unique))
print("Positive Examples count: ", positivecounts)
print("Negetive Example count: ", negetiveCounts)
print("Sample Data: \n\t", data[0])

Unique Vocabulary Count:  13442
Positive Examples count:  1911
Negetive Example count:  1923
Sample Data: 
	 (1, 1, ['sweet', 'united', 'nations', 'video', '.', 'just', 'in', 'time', 'for', 'christmas', '.', '#', 'imagine', '#', 'noreligion', 'http', ':', '//t.co/fej2v3oubr'])


# Task 2

Develop a classifier using the Naive Bayes model to predict if an example is ironic. The model should convert each Tweet into a bag-of-words and calculate

$p(\text{Ironic}|w_1,\ldots,w_n) \propto \prod_{i=1,\ldots,n} p(w_i \in \text{tweet}| \text{Ironic}) p(\text{Ironic})$

$p(\text{NotIronic}|w_1,\ldots,w_n) \propto \prod_{i=1,\ldots,n} p(w_i \in \text{tweet}| \text{NotIronic}) p(\text{NotIronic})$

Use add-alpha smoothing to calculate probabilities

In [4]:
def train_Naive(train):
    # Input is an array of tuple 
    
    wordlist_pos = [] # Array to store word list for positive class
    wordlist_neg = [] # Array to store word list for negetive class
    positivecounts_train = 0 # Initialise the positive Class Count in Training Dataset
    negetiveCounts_train = 0 # Initialise the Negetive Class Count in Test Set
    cnt_pos = {} # Dictionary to store word with their respective positive count
    cnt_neg = {} # Dictionary to store word with their respective negetive count
    for item in train:
        #get the positive/negetive data (Bag Of Words) /word count
        if(item[1]==1):
            positivecounts_train += 1
            wordlist_pos.append(item[2])
        elif(item[1]==0):
            negetiveCounts_train += 1
            wordlist_neg.append(item[2]) 

    # Flatten the word list
    wordlist_pos_flatten = [word for item in wordlist_pos for word in item]
    wordlist_neg_flatten = [word for item in wordlist_neg for word in item]
    
    # Populate the dictionary of word count for positive instances
    cnt_pos = dict(Counter(wordlist_pos_flatten))

    # Populate the dictionary of word count for negetive instances
    cnt_neg = dict(Counter(wordlist_neg_flatten))
    
    # Calculate Prior Probablity for Positive and negetive class in Log Scale
    prior_prob_pos = np.log(positivecounts_train/(positivecounts_train + negetiveCounts_train))
    prior_prob_neg = np.log(negetiveCounts_train/(positivecounts_train + negetiveCounts_train))
    
    # get the unique vocab count in the traing set
    vocab_train = list(set(wordlist_pos_flatten)) + list(set(wordlist_neg_flatten))
    vocab_count = len(vocab_train)
    
    # Return Prior Probablities, positive, negetive word count dictionary, total unique vocab count
    param = (prior_prob_pos,prior_prob_neg,cnt_pos,cnt_neg,vocab_count)
    return param

In [5]:
def predict_Naive(text,param):
    # Input Sample Text and the Traing Model Input
    
    (prior_prob_pos,prior_prob_neg,cnt_pos,cnt_neg,vocab_count) = param
    # Initialise the Probablity for the Positive and Negetive Class
    prob_pos = prior_prob_pos
    prob_neg = prior_prob_neg
    
    # Calculate the Conditional Probablity with Add One Smoothing 
    for word in text:
        if(word in cnt_pos.keys()):
            prob_cond_pos = np.log((cnt_pos[word]+1)/(sum(cnt_pos.values()) + vocab_count))
            prob_pos += prob_cond_pos
        else:
            prob_cond_pos = np.log((1)/(sum(cnt_pos.values()) + vocab_count))
            prob_pos += prob_cond_pos
        
        if(word in cnt_neg.keys()):
            prob_cond_neg = np.log((cnt_neg[word]+1)/(sum(cnt_neg.values()) + vocab_count))
            prob_neg += prob_cond_neg
        else:
            prob_cond_neg = np.log((1)/(sum(cnt_neg.values()) + vocab_count))
            prob_neg += prob_cond_neg
    
    # Check for Highest Probablity Class and Return
    if(prob_pos > prob_neg):
        return 1
    else:
        return 0

# Task 3 

Divide the data into a training and test set and justify your split.

Choose a suitable evaluation metric and implement it. Explain why you chose this evaluation metric.

Evaluate the method in Task 2 according to this metric.

In [6]:
# Method to split the train test data
def train_test_split(data, test_size=0.33):
    shuffle(data) # Shuffle the data randomly
    limit = round(test_size*len(data))
    # Split the data
    test = data[0:limit]
    train = data[limit:]
    return (train,test)

In [7]:
# Method to calculate Accuracy, Precision, Recall & fscore
def evaluation(actual,prediction):
    # Input array of actual and prediction in Counter Format
    counts = Counter(zip(prediction, actual))
    
    # Calculate the TP,TN,FP,FN Counts
    true_pos  = counts[1, 1]
    true_neg  = counts[0, 0]
    false_pos = counts[1, 0]
    false_neg = counts[0, 1]
    
    # Calculate Accuracy, Precision, Recall & fscore
    accuracy = (true_pos + true_neg) / float(len(actual)) if actual else 0
    recall = true_pos / float(true_pos + false_neg) if (true_pos + false_neg) else 0
    precision = true_pos / float(true_pos + false_pos) if (true_pos + false_neg) else 0
    fscore = 2*precision*recall / (precision + recall) if (precision + recall) else 0
    
    # Print and Return
    print("Accuracy : ", accuracy)
    print("Recall : ", recall)
    print("Precision : ", precision)
    print("FScore : ",fscore)
    return accuracy, precision, recall, fscore

In [8]:
# Split The data 
train,test = train_test_split(data)

In [9]:
# Train the data
param = train_Naive(train)

# Predict with the Model parameters
prediction_naive = [predict_Naive(item[2],param) for item in test]

# Actual labels
actual = [item[1] for item in test]

# Get Evaluation Scores with the actual labels
print("Evaluation for Naive Bayes:\n")
print(evaluation(actual,prediction_naive))

Evaluation for Naive Bayes:

Accuracy :  0.6592885375494071
Recall :  0.6977491961414791
Precision :  0.6410635155096012
FScore :  0.668206312548114
(0.6592885375494071, 0.6410635155096012, 0.6977491961414791, 0.668206312548114)


# Task 4

Run the following code to generate a model from your training set. The training set should be in a variable  called `train` and is assumed to be of the form:

```
[(1, 1, ['sweet', 'united', 'nations', 'video', '.', 'just', 'in', 'time', 'for', 'christmas', '.', '#', 'imagine', '#', 'noreligion', 'http', ':', '//t.co/fej2v3oubr']), 
 (2, 1, ['@', 'mrdahl87', 'we', 'are', 'rumored', 'to', 'have', 'talked', 'to', 'erv', "'s", 'agent', '...', 'and', 'the', 'angels', 'asked', 'about', 'ed', 'escobar', '...', 'that', "'s", 'hardly', 'nothing', ';', ')']), 
 (3, 1, ['hey', 'there', '!', 'nice', 'to', 'see', 'you', 'minnesota/nd', 'winter', 'weather']), 
 (4, 0, ['3', 'episodes', 'left', 'i', "'m", 'dying', 'over', 'here']), 
 ...
]
 ```



In [10]:
# Print an element
print(train[0:2])

[(270, 1, ['@', 'patevans', '@', 'grbj', 'too', 'sad', '...', 'local', 'now', '!']), (3350, 0, ['there', 'was', 'a', 'shooting', 'in', 'my', 'old', 'neighborhood', 'where', 'my', 'family', 'lives', '.', 'praying', 'everyone', 'is', 'okay', 'and', 'stays', 'safe', ':', 'heavy_black_heart', ':', '️'])]


In [11]:
from keras.models import Sequential, load_model
from keras.layers import Dense, Activation, Embedding, Dropout, TimeDistributed
from keras.layers import LSTM
from keras.optimizers import Adam
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint
import numpy as np

## These values should be set from Task 3
train, test = train,test # Take the same test and trian from yhe previous task

# Make dictionary from the test and train data
def make_dictionary(train, test):
    dictionary = {}
    for d in train+test:
        for w in d[2]:
            if w not in dictionary:
                dictionary[w] = len(dictionary)
    return dictionary

class KerasBatchGenerator(object):
    def __init__(self, data, num_steps, batch_size, vocabulary, skip_step=5):
        self.data = data
        self.num_steps = num_steps
        self.batch_size = batch_size
        self.vocabulary = vocabulary
        self.current_idx = 0
        self.current_sent = 0
        self.skip_step = skip_step

    def generate(self):
        x = np.zeros((self.batch_size, self.num_steps))
        y = np.zeros((self.batch_size, self.num_steps, 2))
        while True:
            for i in range(self.batch_size):
                # Choose a sentence and position with at lest num_steps more words
                while self.current_idx + self.num_steps >= len(self.data[self.current_sent][2]):
                    self.current_idx = self.current_idx % len(self.data[self.current_sent][2])
                    self.current_sent += 1
                    if self.current_sent >= len(self.data):
                        self.current_sent = 0
                # The rows of x are set to values like [1,2,3,4,5]
                x[i, :] = [self.vocabulary[w] for w in self.data[self.current_sent][2][self.current_idx:self.current_idx + self.num_steps]]
                # The rows of y are set to values like [[1,0],[1,0],[1,0],[1,0],[1,0]]
                y[i, :, :] = [[self.data[self.current_sent][1], 1-self.data[self.current_sent][1]]] * self.num_steps
                self.current_idx += self.skip_step
            yield x, y

# Hyperparameters for model
vocabulary = make_dictionary(train, test)
num_steps = 5
batch_size = 20
num_epochs = 50 # Reduce this if the model is taking too long to train (or increase for performance)
hidden_size = 50 # Increase this to improve perfomance (or increase for performance)
use_dropout=True

# Create batches for RNN
train_data_generator = KerasBatchGenerator(train, num_steps, batch_size, vocabulary,
                                           skip_step=num_steps)
valid_data_generator = KerasBatchGenerator(test, num_steps, batch_size, vocabulary,
                                           skip_step=num_steps)

# A double stacked LSTM with dropout and n hidden layers
model = Sequential()
model.add(Embedding(len(vocabulary), hidden_size, input_length=num_steps))
model.add(LSTM(hidden_size, return_sequences=True))
model.add(LSTM(hidden_size, return_sequences=True))
if use_dropout:
    model.add(Dropout(0.5))
model.add(TimeDistributed(Dense(2)))
model.add(Activation('softmax'))

# Set optimizer and build model
optimizer = Adam()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])

# Train the model
model.fit_generator(train_data_generator.generate(), len(train)//(batch_size*num_steps), num_epochs,
                        validation_data=valid_data_generator.generate(),
                        validation_steps=len(test)//(batch_size*num_steps))

# Save the model
model.save("final_model.hdf5")

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


Now consider the following code:

In [12]:
model = load_model("final_model.hdf5")

x = np.zeros((1,num_steps))
x[0,:] = [vocabulary["this"],vocabulary["the"],vocabulary["an"],vocabulary["easy"],vocabulary["test"]]
print(model.predict(x))

[[[0.358755   0.641245  ]
  [0.3049212  0.69507873]
  [0.42448446 0.5755155 ]
  [0.02209741 0.9779026 ]
  [0.57697505 0.42302498]]]


Using the code above write a function that can predict the label using the LSTM model above and compare it with the evaluation performed in Task 3

In [13]:
# model = load_model("final_model.hdf5")
def predict_LSTM(num_steps,test):
    prediction_LSTM = []
    print("Total Item: ",len(test))
    # Loop Through all the Records in the Test Case
    for i1 in range(len(test)):
        # Extract the sentence from the data
        line = test[i1][2]
        dictPos = defaultdict(list)
        dictNeg = defaultdict(list)
        
        # Loop Through the Element in windows of stepsize
        for i in range(0,(len(line)-(num_steps-1))):
            # take the words of step size
            str_ln = line[i:i+num_steps]
            # convert words to corresponding vocab index
            lin_seg = np.array([vocabulary[i] for i in line[i:i+num_steps]])
            # reshape the array as (1,number of steps)
            lin_seg = lin_seg.reshape(1,num_steps)
            # predict the line with the model
            lin_seg_pred = model.predict(lin_seg)
            # take the first column of the probablity output matrix to get the positive class probablity
            lin_seg_pos = lin_seg_pred.reshape(num_steps,2)[:,0:1].flatten()
            # take the second column of the probablity output matrix to get the negetive class probablity
            lin_seg_neg = lin_seg_pred.reshape(num_steps,2)[:,1:].flatten()
            
            # Create a dictionary for (word,Positive Probablity)
            d_pos = dict(zip(str_ln,lin_seg_pos))
            # Convert the value of dictionary into a list
            d_pos = {k: [v] for k, v in d_pos.items()}
            # Create a dictionary for (word,Negetive Probablity)
            d_neg = dict(zip(str_ln,lin_seg_neg))
            # Convert the value of dictionary into a list
            d_neg = {k: [v] for k, v in d_neg.items()}

            # Merge the Dictionary 
            for k,v in d_pos.items():
                dictPos[k].extend(v)

            for k,v in d_neg.items():
                dictNeg[k].extend(v)

        # Koustava (#18234857) : Code Start
        multiplyPositive=1
        multiplyNegative=1
        # Multiply all the Value elements of the list to get the maximum positive probablity
        for k,v in dictPos.items():
            for j1 in v:
                multiplyPositive = multiplyPositive * j1
        # Multiply all the Value elements of the list to get the maximum negetive probablity
        for k1,v1 in dictNeg.items():
            for j2 in v1:
                multiplyNegative = multiplyNegative * j2
        
        # Check for maximum probablity and assign the class
        if(multiplyPositive > multiplyNegative):
            prediction_LSTM.append(1)
        else:
            prediction_LSTM.append(0)
        # Koustava (#18234857) : Code End
        
        # Print 100th iteration to check the progress
        if(i1 % 100 == 0):
            print("Iteration complete: {0}".format(i1))
    
    # Return the Predition list of class
    return prediction_LSTM

Please take a note that the following code will take time to run as it is iterating each element to calculate the accuracy

In [14]:
# Predict with the LSTM Model
prediction_LSTM = predict_LSTM(num_steps,test)

Total Item:  1265
Iteration complete: 0
Iteration complete: 100
Iteration complete: 200
Iteration complete: 300
Iteration complete: 400
Iteration complete: 500
Iteration complete: 600
Iteration complete: 700
Iteration complete: 800
Iteration complete: 900
Iteration complete: 1000
Iteration complete: 1100
Iteration complete: 1200


In [15]:
# Print the evaluation Matrix
print(evaluation(actual,prediction_LSTM))

Accuracy :  0.5762845849802372
Recall :  0.5594855305466238
Precision :  0.5704918032786885
FScore :  0.564935064935065
(0.5762845849802372, 0.5704918032786885, 0.5594855305466238, 0.564935064935065)


# Task 5

An improvement to either the system developed in Task 2 or 4 and show that it improves according to your evaluation metric.


#### Approach :

In the above approach we have seen the LSTM Based Model is giving as accuracy of around 58%. 
We have found the following observations on the approach :

* The approach is ignoring the sentences which is of length less than the step size
* It also ignores the part of the sentences which has length greater than the step size
* As we are using the NLTK Tokenizer it will split all the punctuations, numbers and symbols as well which might affect the performance 

So I have implemented the following improvements in the system - 

* I have decided not to use the NLTK tokenizer but to remove the punctuations and other extra symbols and http links manually from the data, this will prevent tokenizing those numbers and symbols further
* After removing the ymbols I am using Keras Preprocessing Tokenizer and convert them into sequence of numbers
* Convert each sentence with the numeric vector with the same length
* To get the same length I am using padding to pad extra items in my vector by a default value
* I have not changed anything significantly in the model
* I have updated the hyperparameters as now my input feature vector dimension has increased significantly

In [16]:
import pandas as pd
import re

# from sklearn.feature_extraction.text import CountVectorizer
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
# from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from sklearn.model_selection import train_test_split
from keras.utils.np_utils import to_categorical

# from keras.models import Sequential, load_model
from keras.layers import Activation, Dropout, TimeDistributed,Flatten
from keras.regularizers import l1
from keras.optimizers import Adam
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint
import numpy as np

In [17]:
data =[] # Prepare the data in the intended format of list of tuples
# data_Y =[]
for line in document:
    line_segment = line.split("\t")
    # Create a list of labels and sentence
    item = [int(line_segment[1]),line_segment[2].strip()]
    # Create the data
    data.append(item)

print("Sample Data: \n\t", data[0])

Sample Data: 
	 [1, 'Sweet United Nations video. Just in time for Christmas. #imagine #NoReligion  http://t.co/fej2v3OUBR']


In [18]:
# Convert into a dataframe 
data_df = pd.DataFrame(data)
data_df.columns = ["Irony","Sentence"]
data_df.head()

Unnamed: 0,Irony,Sentence
0,1,Sweet United Nations video. Just in time for C...
1,1,@mrdahl87 We are rumored to have talked to Erv...
2,1,Hey there! Nice to see you Minnesota/ND Winter...
3,0,3 episodes left I'm dying over here
4,1,"""I can't breathe!"" was chosen as the most nota..."


In [19]:
# Convert the labels into numbers
data_df['Irony'] = pd.to_numeric(data_df['Irony'])

In [20]:
data_df['Sentence'] = data_df['Sentence'].apply(lambda x: x.lower()) # Make Lower
data_df['Sentence'] = data_df['Sentence'].apply((lambda x: re.sub(r'https?://\S+','',x))) # Remove the http links
data_df['Sentence'] = data_df['Sentence'].apply((lambda x: re.sub('[^a-zA-z0-9#\s]','',x))) # Remove the symbols

In [21]:
# Print sample data
print(data_df.iloc[0,1])

sweet united nations video just in time for christmas #imagine #noreligion  


In [22]:
def make_dictionary(data):
    dictionary = {}
    for i in range(0,len(data)):
        for w in data.iloc[i,1].split(' '):
            if w not in dictionary:
                dictionary[w] = len(dictionary)+1
    return dictionary

# get the dictionary
dic = make_dictionary(data_df)
len(dic)

13191

In [23]:
#Take the max feature length from the dictionary
max_fatures = len(dic)
# Tokenize with Keras Tokenizer with a space 
tokenizer = Tokenizer(num_words=max_fatures, split=' ')
# Convert Text to Numeric values
tokenizer.fit_on_texts(data_df['Sentence'].values)
X = tokenizer.texts_to_sequences(data_df['Sentence'].values)
# Pad zeros to get the vectors of same length
X = pad_sequences(X, padding='post')
# get the labels
Y = pd.get_dummies(data_df['Irony']).values

In [31]:
# split the data in Test and Train
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size = 0.33, random_state = 42)
print(X_train.shape,Y_train.shape)
print(X_test.shape,Y_test.shape)

(2568, 98) (2568, 2)
(1266, 98) (1266, 2)


In [36]:
# assign the hidden layer size for first lstm
embed_dim = 20
# assign hidden layer size for second lstm
lstm_out = 10

model = Sequential()
model.add(Embedding(max_fatures, embed_dim,input_length = X.shape[1]))
# model.add(Dense(embed_dim, input_dim=X.shape[1], activation='relu', activity_regularizer=l1(0.0001)))
model.add(LSTM(embed_dim,activation='relu', return_sequences=True))
model.add(LSTM(lstm_out, return_sequences=True))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(2,activation='softmax'))
model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['categorical_accuracy'])
print(model.summary())

batch_size = 20
model.fit(X_train, Y_train, epochs = 10, validation_data = (X_test, Y_test),
          batch_size=batch_size, verbose = 2)

# Save the model
model.save("final_model.hdf5_updt")

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_7 (Embedding)      (None, 98, 20)            263820    
_________________________________________________________________
lstm_10 (LSTM)               (None, 98, 20)            3280      
_________________________________________________________________
lstm_11 (LSTM)               (None, 98, 10)            1240      
_________________________________________________________________
dropout_7 (Dropout)          (None, 98, 10)            0         
_________________________________________________________________
flatten_6 (Flatten)          (None, 980)               0         
_________________________________________________________________
dense_7 (Dense)              (None, 2)                 1962      
Total params: 270,302
Trainable params: 270,302
Non-trainable params: 0
_________________________________________________________________
None

In [37]:
model_updt = load_model("final_model.hdf5_updt")

validation_size = 1000

X_validate = X_test[-validation_size:]
Y_validate = Y_test[-validation_size:]
score,acc = model_updt.evaluate(X_validate, Y_validate, verbose = 2, batch_size = batch_size)
print("acc: %.2f" % (acc))

acc: 0.59


I have tested with 1200 test cases and the accuracy has increased slightly  from around 58% to around 62%
But at the same time we have seen the Model has overfit, to reduce the overfitting we can have the following approach:

* We need more data to train the classifier
* Initialise with more more efficient word embedding and create more efficient feature vector
* Implement Weight Regularization


#### Bibliography

1. Liip. (2019). Sentiment detection with Keras, word embeddings and LSTM deep learning networks · Blog · Liip. [online] Available at: https://www.liip.ch/en/blog/sentiment-detection-with-keras-word-embeddings-and-lstm-deep-learning-networks [Accessed 1 Mar. 2019].