# Stance Detection for the Fake News Challenge

## Identifying Textual Relationships with Deep Neural Nets

### Check the problem context [here](https://drive.google.com/open?id=1KfWaZyQdGBw8AUTacJ2yY86Yxgw2Xwq0).

### Download files required for the project from [here](https://drive.google.com/open?id=10yf39ifEwVihw4xeJJR60oeFBY30Y5J8).

 ## <font color=red> Milestone - 1 </font>

## Step1: Load the given dataset <h1> [10 marks] </h1>

1. Mount Google Drive

2. Import the GloVe embeddings

3. Import the test and train datasets

### Mount Google Drive to access the required project files

Run the commands below.

In [0]:
from google.colab import drive

In [2]:
drive.mount('/content/drive/')

Mounted at /content/drive/


#### Path for Project files on google drive

**Note:** Change this path to match where you have stored the files in Google Drive.

In [0]:
project_path = "/content/drive/My Drive/DLCP/Project-3/Fake News Challenge/"

### Loading the GloVe Embeddings

In [0]:
from zipfile import ZipFile
with ZipFile(project_path+'glove.6B.zip', 'r') as z:
  z.extractall()

### Load the dataset

1. Using [read_csv()](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html) in pandas, load the given training files **`train_bodies.csv`** and **`train_stances.csv`**.

2. Using the [merge](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html) method in pandas, merge the two datasets on the `Body ID` column.

Note: Save the final merged dataset in a dataframe named **`dataset`**.

In [0]:
import pandas as pd
dataset1 = pd.read_csv(project_path+"train_bodies.csv")
dataset2 = pd.read_csv(project_path+"train_stances.csv")

In [6]:
dataset1.head()

Unnamed: 0,Body ID,articleBody
0,0,A small meteorite crashed into a wooded area i...
1,4,Last week we hinted at what was to come as Ebo...
2,5,(NEWSER) – Wonder how long a Quarter Pounder w...
3,6,"Posting photos of a gun-toting child online, I..."
4,7,At least 25 suspected Boko Haram insurgents we...


In [7]:
dataset2.head()

Unnamed: 0,Headline,Body ID,Stance
0,Police find mass graves with at least '15 bodi...,712,unrelated
1,Hundreds of Palestinians flee floods in Gaza a...,158,agree
2,"Christian Bale passes on role of Steve Jobs, a...",137,unrelated
3,HBO and Apple in Talks for $15/Month Apple TV ...,1034,unrelated
4,Spider burrowed through tourist's stomach and ...,1923,disagree


In [0]:
dataset = dataset1.merge(dataset2,on='Body ID')
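
As a minimal sketch of what this merge does, the toy frames below (invented values, not rows from the dataset) join one body to several headlines on `Body ID`. The `validate` argument is an optional pandas check, added here on the assumption that each body ID appears only once in the bodies table.

```python
import pandas as pd

# Toy stand-ins for train_bodies.csv and train_stances.csv (hypothetical values)
toy_bodies = pd.DataFrame({
    "Body ID": [0, 4],
    "articleBody": ["body zero", "body four"],
})
toy_stances = pd.DataFrame({
    "Headline": ["h1", "h2", "h3"],
    "Body ID": [4, 0, 0],
    "Stance": ["agree", "unrelated", "discuss"],
})

# Inner join on the shared key; validate raises if a body ID is duplicated on the left
toy_merged = toy_bodies.merge(
    toy_stances, on="Body ID", how="inner", validate="one_to_many"
)
print(toy_merged)
```

Each body row is repeated once per matching headline, which is why the same article body can appear on several consecutive rows of the merged frame.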


<h2> Check1:</h2>
  
<h3> You should see the below output if you run `dataset.head()` command as given below </h3>

In [9]:
dataset.head()

Unnamed: 0,Body ID,articleBody,Headline,Stance
0,0,A small meteorite crashed into a wooded area i...,"Soldier shot, Parliament locked down after gun...",unrelated
1,0,A small meteorite crashed into a wooded area i...,Tourist dubbed ‘Spider Man’ after spider burro...,unrelated
2,0,A small meteorite crashed into a wooded area i...,Luke Somers 'killed in failed rescue attempt i...,unrelated
3,0,A small meteorite crashed into a wooded area i...,BREAKING: Soldier shot at War Memorial in Ottawa,unrelated
4,0,A small meteorite crashed into a wooded area i...,Giant 8ft 9in catfish weighing 19 stone caught...,unrelated



## Step2: Data Pre-processing and setting some hyper parameters needed for model


#### Run the code given below to set the required parameters.

1. `MAX_SENTS` = Maximum number of sentences to consider in an article.

2. `MAX_SENT_LENGTH` = Maximum number of words to consider in a sentence.

3. `MAX_NB_WORDS` = Maximum number of words in the total vocabulary.

4. `MAX_SENTS_HEADING` = Maximum number of sentences to consider in the heading of an article.

In [0]:
MAX_NB_WORDS = 20000
MAX_SENTS = 20
MAX_SENTS_HEADING = 1
MAX_SENT_LENGTH = 20
VALIDATION_SPLIT = 0.2

### Download the `Punkt` from nltk using the commands given below. This is for sentence tokenization.

For more info on how to use it, read [this](https://stackoverflow.com/questions/35275001/use-of-punktsentencetokenizer-in-nltk).



In [12]:
import nltk
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

### Tokenizing the text and loading the pre-trained Glove word embeddings for each token <h1> [10 marks] </h1>

Keras provides [Tokenizer API](https://keras.io/preprocessing/text/) for preparing text. Read it before going any further.

#### Import the Tokenizer from keras preprocessing text

In [13]:
from keras.preprocessing.text import Tokenizer

Using TensorFlow backend.


#### Initialize the Tokenizer class with maximum vocabulary count as `MAX_NB_WORDS` initialized at the start of step2. 

In [0]:
t = Tokenizer(num_words = MAX_NB_WORDS)

#### Now, using fit_on_texts() from the Tokenizer class, let's encode the data

Note: Fit the tokenizer on both `articleBody` and `Headline` so that the vocabulary covers all the words.

In [0]:
t.fit_on_texts(list(dataset['articleBody'].values))

In [16]:
len(dataset['articleBody'].values)

49972

In [0]:
t.fit_on_texts(list(dataset['Headline'].values))

In [18]:
len(dataset['Headline'].values)

49972

In [19]:
t.document_count

99944

#### After fit_on_texts() is called, the tokenizer exposes the following attributes, as documented [here](https://faroit.github.io/keras-docs/1.2.2/preprocessing/text/).

* **word_counts:** dictionary mapping words (str) to the number of times they appeared on during fit. Only set after fit_on_texts was called.

* **word_docs:** dictionary mapping words (str) to the number of documents/texts they appeared on during fit. Only set after fit_on_texts was called.

* **word_index:** dictionary mapping words (str) to their rank/index (int). Only set after fit_on_texts was called.

* **document_count:** int. Number of documents (texts/sequences) the tokenizer was trained on. Only set after fit_on_texts or fit_on_sequences was called.
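
To make these attributes concrete, here is a simplified pure-Python stand-in for how a frequency-ranked `word_index` can be built. It is a sketch, not the Keras implementation: the helper `build_word_index` is hypothetical, and it splits on whitespace instead of applying Keras's punctuation filters.

```python
from collections import Counter

def build_word_index(texts):
    """Rank words by frequency; ids start at 1 so that 0 stays free for padding."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    ranked = counts.most_common()  # most frequent first, ties in first-seen order
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}

toy_docs = ["the cat sat", "the dog sat", "the end"]
toy_word_index = build_word_index(toy_docs)
toy_document_count = len(toy_docs)  # analogous to t.document_count
print(toy_word_index, toy_document_count)
```

This also explains the `document_count` of 99944 above: `fit_on_texts()` was called twice on 49972 texts each, and the counts accumulate.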



### Now, tokenize the sentences using nltk sent_tokenize() and encode the sentences with the ids we got from the above `t.word_index`

Initialise two variables named `texts` and `articles`.

```
texts = [] to store the text of each article as it is.

articles = [] to store each article's text split into a list of sentences.
```

In [0]:
from nltk.tokenize import sent_tokenize
texts = dataset['articleBody']

In [0]:
articles = [sent_tokenize(i) for i in texts]

## Check 2:

The first element of `texts` and `articles` should be as given below.

In [24]:
texts[0]

'A small meteorite crashed into a wooded area in Nicaragua\'s capital of Managua overnight, the government said Sunday. Residents reported hearing a mysterious boom that left a 16-foot deep crater near the city\'s airport, the Associated Press reports. \n\nGovernment spokeswoman Rosario Murillo said a committee formed by the government to study the event determined it was a "relatively small" meteorite that "appears to have come off an asteroid that was passing close to Earth." House-sized asteroid 2014 RC, which measured 60 feet in diameter, skimmed the Earth this weekend, ABC News reports. \nMurillo said Nicaragua will ask international experts to help local scientists in understanding what happened.\n\nThe crater left by the meteorite had a radius of 39 feet and a depth of 16 feet,  said Humberto Saballos, a volcanologist with the Nicaraguan Institute of Territorial Studies who was on the committee. He said it is still not clear if the meteorite disintegrated or was buried.\n\nHumbe

In [25]:
articles[0]

["A small meteorite crashed into a wooded area in Nicaragua's capital of Managua overnight, the government said Sunday.",
 "Residents reported hearing a mysterious boom that left a 16-foot deep crater near the city's airport, the Associated Press reports.",
 'Government spokeswoman Rosario Murillo said a committee formed by the government to study the event determined it was a "relatively small" meteorite that "appears to have come off an asteroid that was passing close to Earth."',
 'House-sized asteroid 2014 RC, which measured 60 feet in diameter, skimmed the Earth this weekend, ABC News reports.',
 'Murillo said Nicaragua will ask international experts to help local scientists in understanding what happened.',
 'The crater left by the meteorite had a radius of 39 feet and a depth of 16 feet,  said Humberto Saballos, a volcanologist with the Nicaraguan Institute of Territorial Studies who was on the committee.',
 'He said it is still not clear if the meteorite disintegrated or was bu

 ## <font color=red> Milestone - 2 </font>

#### Now iterate through each article and each sentence to encode the words into ids using t.word_index <h1>[10 marks]</h1>

Here, to get the words from a sentence you can use `text_to_word_sequence` from keras preprocessing text.

1. Import text_to_word_sequence

2. Initialize a variable named `data` of shape (number of articles, MAX_SENTS, MAX_SENT_LENGTH) with zeros first (you can use numpy [np.zeros](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html) to initialize with all zeros), and then update it while iterating through the words and sentences in each article.

In [26]:
len(t.word_index.values())

27873
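
The tokenizer was created with `num_words = MAX_NB_WORDS` (20000), yet `word_index` holds 27873 entries. In Keras, `num_words` limits methods such as `texts_to_sequences`, not `word_index` itself, so ids looked up directly in `t.word_index` can exceed `MAX_NB_WORDS`. One possible way to keep ids inside a 20000-row embedding table is to map out-of-range ids to 0. The helper `cap_vocabulary` below is hypothetical and only a sketch of that idea.

```python
import numpy as np

def cap_vocabulary(encoded, max_words):
    """Return a copy of `encoded` in which ids >= max_words are replaced by 0."""
    capped = encoded.copy()
    capped[capped >= max_words] = 0
    return capped

# Toy ids (invented): two of them fall outside a 20000-word vocabulary
toy_encoded = np.array([[3, 19999, 20000], [27873, 0, 1]], dtype=np.int32)
toy_capped = cap_vocabulary(toy_encoded, max_words=20000)
print(toy_capped)
```

Treating rare words as padding is a design choice; a dedicated out-of-vocabulary id would be an alternative.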

In [0]:
from keras.preprocessing.text import text_to_word_sequence
import numpy as np
data = np.zeros(shape=(len(articles),MAX_SENTS,MAX_SENT_LENGTH),dtype = np.int32)

In [0]:
for article_index, sentences in enumerate(articles):
  # keep at most MAX_SENTS sentences per article
  for sentence_index, sentence in enumerate(sentences[:MAX_SENTS]):
    # keep at most MAX_SENT_LENGTH words per sentence; unused slots stay 0
    words = text_to_word_sequence(sentence)[:MAX_SENT_LENGTH]
    for word_index, word in enumerate(words):
      data[article_index,sentence_index,word_index]=t.word_index[word]

In [29]:
data.shape

(49972, 20, 20)
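
The same truncate-and-pad encoding can be sketched as a reusable function and checked on toy input. The function name `encode_documents`, the toy word index, and the whitespace split (used here in place of `text_to_word_sequence`) are all assumptions made for illustration.

```python
import numpy as np

def encode_documents(docs, word_index, max_sents, max_sent_length):
    """Encode documents (lists of sentences) into a zero-padded 3D id array."""
    data = np.zeros((len(docs), max_sents, max_sent_length), dtype=np.int32)
    for d, sentences in enumerate(docs):
        for s, sentence in enumerate(sentences[:max_sents]):
            words = sentence.lower().split()[:max_sent_length]
            for w, word in enumerate(words):
                # unknown words fall back to 0, the padding id
                data[d, s, w] = word_index.get(word, 0)
    return data

toy_index = {"the": 1, "cat": 2, "sat": 3}
toy_documents = [["the cat sat", "the cat"], ["sat"]]
toy_data = encode_documents(toy_documents, toy_index, max_sents=2, max_sent_length=2)
print(toy_data)
```

Sentences longer than `max_sent_length` are cut off, and missing sentences or words remain 0, which mirrors the zero rows visible in the real `data` array.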

### Check 3:

Accessing the first element of `data` should give something like the output below.

In [30]:
data[0,:,:]

array([[    3,   481,   427,  7211,    81,     3,  3733,   331,     5,
         3891,   350,     4,  1431,  2958,     1,    89,    12,   464,
            0,     0],
       [  758,    95,  1047,     3,  2679,  1752,     7,   189,     3,
         1217,  1075,  2030,   700,   159,     1,  3032,   448,     1,
          555,   235],
       [   89,  1067,  4115,  2349,    12,     3,  1092,  3306,    19,
            1,    89,     2,  1793,     1,   521,  2009,    15,     9,
            3,  3111],
       [  181,  3640,   972,   200,  2556,    44,  6775,  1722,  1252,
            5, 13317, 17936,     1,   778,    31,   740,  3990,    67,
           85,     0],
       [ 2349,    12,  1557,    38,  1094,   351,   775,     2,   367,
          260,  1770,     5,  4450,    70,   494,     0,     0,     0,
            0,     0],
       [    1,   700,   189,    19,     1,   427,    32,     3,  7417,
            4,  2159,  1252,     6,     3,  5270,     4,  1217,  1252,
           12,  3363],
       [  

### Repeat the same process for the `Headline` column as well. Use variables named `texts_heading` and `articles_heading` accordingly. <h1> [10 marks] </h1>

In [0]:
texts_heading = []
articles_heading = []

texts_heading = dataset['Headline']
articles_heading = [sent_tokenize(i) for i in texts_heading]

In [0]:
data_heading = np.zeros(shape=(len(articles_heading),MAX_SENTS,MAX_SENT_LENGTH),dtype = np.int32)
for article_index, sentences in enumerate(articles_heading):
  # keep at most MAX_SENTS sentences per headline
  for sentence_index, sentence in enumerate(sentences[:MAX_SENTS]):
    # keep at most MAX_SENT_LENGTH words per sentence; unused slots stay 0
    words = text_to_word_sequence(sentence)[:MAX_SENT_LENGTH]
    for word_index, word in enumerate(words):
      data_heading[article_index,sentence_index,word_index]=t.word_index[word]

In [34]:
data_heading[0,:,:]

array([[  718,   206,   343,  7134,   193,    34,  1338, 11554,    21,
          233,   686,     0,     0,     0,     0,     0,     0,     0,
            0,     0],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0],
       [  

### Now that the features are ready, let's make the labels ready for the model to process.

### Convert labels into one-hot vectors

You can use [get_dummies](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.get_dummies.html) in pandas to create one-hot vectors.

In [0]:
labels = dataset['Stance']
labels = pd.get_dummies(labels).values
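
A small sketch of the one-hot step on toy stance values (invented rows). It also records the column order, since `get_dummies` sorts the class names and that order is what each position in a label vector means. The cast to `float32` is an assumption added for convenience.

```python
import pandas as pd

toy_stances = pd.Series(["unrelated", "agree", "discuss", "disagree", "agree"])
toy_one_hot = pd.get_dummies(toy_stances)

# Column order defines the meaning of each position in a label vector
toy_class_names = list(toy_one_hot.columns)
toy_labels = toy_one_hot.values.astype("float32")
print(toy_class_names)
print(toy_labels)
```

Keeping `toy_class_names` around makes it possible to map a predicted index back to a stance name later.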

### Check 4:

The shapes of `data` and `labels` should match the numbers given below.

In [36]:
print('Shape of data tensor:', data.shape)
print('Shape of label tensor:', labels.shape)

Shape of data tensor: (49972, 20, 20)
Shape of label tensor: (49972, 4)


### Shuffle the data

In [0]:
## indices from 0 up to (number of articles - 1)
indices = np.arange(data.shape[0])
## shuffle the indices in place
np.random.shuffle(indices)

In [0]:
## shuffle the data
data = data[indices]
data_heading = data_heading[indices]
## shuffle the labels according to data
labels = labels[indices]
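
The reason for indexing every array with the same `indices` is that rows stay aligned after shuffling. A minimal check on toy arrays, using a seeded generator (an assumption added for reproducibility; the cells above use the unseeded global `np.random`):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

toy_features = np.arange(10).reshape(5, 2)
toy_targets = toy_features[:, 0].copy()  # each target equals its row's first value

toy_perm = rng.permutation(len(toy_features))
toy_features_shuffled = toy_features[toy_perm]
toy_targets_shuffled = toy_targets[toy_perm]

# Alignment survives because both arrays were indexed by the same permutation
print((toy_features_shuffled[:, 0] == toy_targets_shuffled).all())
```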

### Split into train and validation sets. Split the data in an 80:20 ratio to get the train and validation sets.


Use the variable names as given below:

x_train, x_val - for body of articles.

x_heading_train, x_heading_val - for heading of articles.

y_train - for training labels.

y_val - for validation labels.

<h1> [10 marks] </h1>

In [0]:
total_rows = len(indices)

In [40]:
total_rows

49972

In [0]:
train_rows=int(np.ceil(0.8*total_rows))
x_train = data[0:train_rows,:,:]
x_val = data[train_rows:,:,:]

x_heading_train = data_heading[0:train_rows,:,:]
x_heading_val = data_heading[train_rows:,:,:]

y_train = labels[0:train_rows,:]
y_val = labels[train_rows:,:]
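
The split point can also be derived from `VALIDATION_SPLIT` instead of the literal `0.8`. The sketch below uses a hypothetical helper `split_rows` and a toy array with the same number of rows as the dataset to show that the sizes come out the same.

```python
import numpy as np

VALIDATION_SPLIT = 0.2  # same value as set in Step 2

def split_rows(array, validation_split):
    """Split along axis 0: the leading rows for training, the rest for validation."""
    n_train = int(np.ceil((1 - validation_split) * len(array)))
    return array[:n_train], array[n_train:]

toy_rows = np.arange(49972)  # stand-in with as many rows as the merged dataset
toy_train, toy_val = split_rows(toy_rows, VALIDATION_SPLIT)
print(len(toy_train), len(toy_val))
```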

### Check 5:

The shape of x_train, x_val, y_train and y_val should match the below numbers.

In [42]:
print(x_train.shape)
print(y_train.shape)

print(x_val.shape)
print(y_val.shape)

(39978, 20, 20)
(39978, 4)
(9994, 20, 20)
(9994, 4)



### Create embedding matrix with the glove embeddings


Run the code below to create `embedding_matrix`, which holds the GloVe embedding of every word in the vocabulary that is present in the GloVe word list.

In [44]:
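# load the whole embedding into memory
embeddings_index = dict()
with open('./glove.6B.100d.txt', encoding='utf-8') as f:
  for line in f:
    # each line holds a word followed by its 100 vector components
    values = line.split()
    word = values[0]
    coefs = np.asarray(values[1:], dtype='float32')
    embeddings_index[word] = coefs
print('Loaded %s word vectors.' % len(embeddings_index))

# NOTE: t.word_index ids start at 1 and run up to len(t.word_index),
# so a matrix with len(t.word_index) rows has no row for the largest id;
# len(t.word_index) + 1 rows would cover every id.
vocab_size = len(t.word_index)
# create a weight matrix for words in training docs
embedding_matrix = np.zeros((vocab_size, 100))

# rows stay all-zero for words that have no GloVe vector
for word, i in t.word_index.items():
  embedding_vector = embeddings_index.get(word)
  if embedding_vector is not None:
    embedding_matrix[i] = embedding_vector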

Loaded 400000 word vectors.


In [45]:
embedding_matrix.shape

(27873, 100)
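
Because `word_index` ids start at 1, a matrix that should hold a row for every id needs one extra row for id 0. A minimal sketch with toy vectors (the helper `build_embedding_matrix` and all values are hypothetical):

```python
import numpy as np

def build_embedding_matrix(word_index, embeddings_index, dim):
    """Row i holds the vector for the word with id i; row 0 is reserved for padding."""
    matrix = np.zeros((len(word_index) + 1, dim), dtype="float32")
    for word, i in word_index.items():
        vector = embeddings_index.get(word)
        if vector is not None:  # words without a pre-trained vector stay all-zero
            matrix[i] = vector
    return matrix

toy_word_index = {"the": 1, "cat": 2, "zzz": 3}
toy_vectors = {
    "the": np.array([0.1, 0.2], dtype="float32"),
    "cat": np.array([0.3, 0.4], dtype="float32"),
}
toy_matrix = build_embedding_matrix(toy_word_index, toy_vectors, dim=2)
print(toy_matrix.shape)
```

With this layout, the largest id (3 in the toy index) still has a row of its own.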

 ## <font color=red> Milestone - 3 </font>

## Try different sequential models and report accuracy scores for each model.

<h1>[50 marks]  </h1>

## Flatten Input Arrays

In [46]:
# Flatten 3D array into 2D
x_train = x_train.reshape(-1,MAX_SENTS*MAX_SENT_LENGTH)
x_heading_train = x_heading_train.reshape(-1,MAX_SENTS*MAX_SENT_LENGTH)

# Flatten 3D array into 2D
x_val = x_val.reshape(-1,MAX_SENTS*MAX_SENT_LENGTH)
x_heading_val = x_heading_val.reshape(-1,MAX_SENTS*MAX_SENT_LENGTH)

# Stack body rows and headline rows along axis 0; each label array is repeated to match
x_train = np.concatenate((x_train, x_heading_train), axis=0)
x_val = np.concatenate((x_val, x_heading_val), axis=0)
y_train = np.concatenate((y_train, y_train), axis=0)
y_val = np.concatenate((y_val, y_val), axis=0)

print(x_train.shape, x_val.shape, y_train.shape, y_val.shape)

(79956, 400) (19988, 400) (79956, 4) (19988, 4)
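
A toy illustration of the two operations used above: `reshape(-1, sentences * words)` lays each sample's sentences end to end, and concatenating along axis 0 adds rows rather than widening them. The array sizes here are arbitrary small values chosen for readability.

```python
import numpy as np

# 2 samples, 3 sentences each, 4 word ids per sentence
toy_3d = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Flatten every sample into one sequence of 12 ids, sentence after sentence
toy_flat = toy_3d.reshape(-1, 3 * 4)

# Stacking two such arrays along axis 0 doubles the number of rows
toy_stacked = np.concatenate((toy_flat, toy_flat), axis=0)
print(toy_flat.shape, toy_stacked.shape)
```

This matches the printed shapes: 39978 body rows plus 39978 headline rows give 79956 training rows of length 400.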


### Import layers from Keras to build the model

In [0]:
from keras import layers

### Model

In [48]:
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D

lstm_out = 196

batch_size = 32
epochs = 1

# Create model - LSTM
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, 128,input_length = MAX_SENT_LENGTH*MAX_SENTS))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(4,activation='softmax'))

print(model.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (None, 400, 128)          2560000   
_________________________________________________________________
spatial_dropout1d_1 (Spatial (None, 400, 128)          0         
_________________________________________________________________
lstm_1 (LSTM)                (None, 196)               254800    
_________________________________________________________________
dense_1 (Dense)              (None, 4)                 788       
Total params: 2,815,588
Trainable params: 2,815,588
Non-trainable params: 0
_________________________________________________________________
None
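
The parameter counts in this summary can be reproduced by hand. The sketch below uses the standard formulas (an LSTM has four gates, each with input weights, recurrent weights, and a bias); the helper names are hypothetical.

```python
def embedding_params(vocab_size, dim):
    # one vector of length `dim` per vocabulary entry
    return vocab_size * dim

def lstm_params(input_dim, units):
    # four gates, each with input weights, recurrent weights and a bias
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # weight matrix plus one bias per output unit
    return input_dim * units + units

toy_counts = {
    "embedding": embedding_params(20000, 128),
    "lstm": lstm_params(128, 196),
    "dense": dense_params(196, 4),
}
toy_total = sum(toy_counts.values())
print(toy_counts, toy_total)
```

A `Bidirectional` wrapper runs two such LSTMs, so its count is twice `lstm_params` for the same input size.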


### Compile and fit the model

In [0]:
model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])

In [57]:
# Fit model
model.fit(x_train,y_train,epochs=epochs,batch_size=batch_size)

Epoch 1/1


<keras.callbacks.History at 0x7fcf291b46a0>

In [58]:
score,acc = model.evaluate(x_val, y_val, verbose = 2, batch_size = batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))

score: 0.78
acc: 0.73


Score: 0.78

Accuracy: 0.73

Time taken: 1728 sec (22ms/step)

In [50]:
from keras.layers import Bidirectional

# Create model - Bidirectional LSTM
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, 128,input_length = MAX_SENT_LENGTH*MAX_SENTS))
model.add(SpatialDropout1D(0.4))
model.add(Bidirectional(LSTM(lstm_out, dropout=0.2, return_sequences=False, recurrent_dropout=0.2)))
model.add(Dense(4,activation='softmax'))

print(model.summary())

model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])

# Fit model
model.fit(x_train,y_train,epochs=epochs,batch_size=batch_size)

score,acc = model.evaluate(x_val, y_val, verbose = 2, batch_size = batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_2 (Embedding)      (None, 400, 128)          2560000   
_________________________________________________________________
spatial_dropout1d_2 (Spatial (None, 400, 128)          0         
_________________________________________________________________
bidirectional_1 (Bidirection (None, 392)               509600    
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 1572      
Total params: 3,071,172
Trainable params: 3,071,172
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/1
score: 0.64
acc: 0.77


Score: 0.64

Accuracy: 0.77

Time taken: 3274 sec (41ms/step)

In [51]:
# Create model - Bidirectional LSTM - change dropout
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, 128,input_length = MAX_SENT_LENGTH*MAX_SENTS))
model.add(SpatialDropout1D(0.4))
model.add(Bidirectional(LSTM(lstm_out, dropout=0.1, return_sequences=False, recurrent_dropout=0.1)))
model.add(Dense(4,activation='softmax'))

print(model.summary())

model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])

# Fit model
model.fit(x_train,y_train,epochs=epochs,batch_size=batch_size)

score,acc = model.evaluate(x_val, y_val, verbose = 2, batch_size = batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_3 (Embedding)      (None, 400, 128)          2560000   
_________________________________________________________________
spatial_dropout1d_3 (Spatial (None, 400, 128)          0         
_________________________________________________________________
bidirectional_2 (Bidirection (None, 392)               509600    
_________________________________________________________________
dense_3 (Dense)              (None, 4)                 1572      
Total params: 3,071,172
Trainable params: 3,071,172
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/1
score: 0.63
acc: 0.77


Score: 0.63

Accuracy: 0.77

Time taken: 3551 sec (42ms/step)

In [52]:
# Create model - Bidirectional LSTM - Change dropout
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, 128,input_length = MAX_SENT_LENGTH*MAX_SENTS))
model.add(SpatialDropout1D(0.4))
model.add(Bidirectional(LSTM(lstm_out, dropout=0.4, return_sequences=False, recurrent_dropout=0.4)))
model.add(Dense(4,activation='softmax'))

print(model.summary())

model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])

# Fit model
model.fit(x_train,y_train,epochs=epochs,batch_size=batch_size)

score,acc = model.evaluate(x_val, y_val, verbose = 2, batch_size = batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_4 (Embedding)      (None, 400, 128)          2560000   
_________________________________________________________________
spatial_dropout1d_4 (Spatial (None, 400, 128)          0         
_________________________________________________________________
bidirectional_3 (Bidirection (None, 392)               509600    
_________________________________________________________________
dense_4 (Dense)              (None, 4)                 1572      
Total params: 3,071,172
Trainable params: 3,071,172
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/1
score: 0.65
acc: 0.76


Score: 0.65

Accuracy: 0.76

Time taken: 3310 sec (41ms/step)

In [53]:
# Create model - LSTM with multiple hidden layers
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, 128,input_length = MAX_SENT_LENGTH*MAX_SENTS))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.4, recurrent_dropout=0.4, return_sequences=True))
model.add(SpatialDropout1D(0.2))
model.add(LSTM(lstm_out,return_sequences=False))
model.add(Dense(4,activation='softmax'))

print(model.summary())

model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])

# Fit model
model.fit(x_train,y_train,epochs=epochs,batch_size=batch_size)

score,acc = model.evaluate(x_val, y_val, verbose = 2, batch_size = batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_5 (Embedding)      (None, 400, 128)          2560000   
_________________________________________________________________
spatial_dropout1d_5 (Spatial (None, 400, 128)          0         
_________________________________________________________________
lstm_5 (LSTM)                (None, 400, 196)          254800    
_________________________________________________________________
spatial_dropout1d_6 (Spatial (None, 400, 196)          0         
_________________________________________________________________
lstm_6 (LSTM)                (None, 196)               308112    
_________________________________________________________________
dense_5 (Dense)              (None, 4)                 788       
Total params: 3,123,700
Trainable params: 3,123,700
Non-trainable params: 0
_________________________________________________________________


Score: 0.77

Accuracy: 0.74

Time taken:  2923 sec (37ms/step)

In [54]:
# Create model - Bidirectional-LSTM with multiple hidden layers
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, 128,input_length = MAX_SENT_LENGTH*MAX_SENTS))
model.add(SpatialDropout1D(0.4))
model.add(Bidirectional(LSTM(lstm_out, dropout=0.4, return_sequences=True, recurrent_dropout=0.4)))
model.add(SpatialDropout1D(0.4))
model.add(Bidirectional(LSTM(lstm_out, dropout=0.4, return_sequences=False, recurrent_dropout=0.4)))
model.add(Dense(4,activation='softmax'))

print(model.summary())

model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])

# Fit model
model.fit(x_train,y_train,epochs=epochs,batch_size=batch_size)

score,acc = model.evaluate(x_val, y_val, verbose = 2, batch_size = batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_6 (Embedding)      (None, 400, 128)          2560000   
_________________________________________________________________
spatial_dropout1d_7 (Spatial (None, 400, 128)          0         
_________________________________________________________________
bidirectional_4 (Bidirection (None, 400, 392)          509600    
_________________________________________________________________
spatial_dropout1d_8 (Spatial (None, 400, 392)          0         
_________________________________________________________________
bidirectional_5 (Bidirection (None, 392)               923552    
_________________________________________________________________
dense_6 (Dense)              (None, 4)                 1572      
Total params: 3,994,724
Trainable params: 3,994,724
Non-trainable params: 0
_________________________________________________________________


Score: 0.66

Accuracy: 0.76

Time taken: 6367 sec (80ms/step)

In [55]:
# Create model - Bidirectional-LSTM with LSTM as hidden layer
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, 128,input_length = MAX_SENT_LENGTH*MAX_SENTS))
model.add(SpatialDropout1D(0.2))
model.add(Bidirectional(LSTM(lstm_out, dropout=0.4, return_sequences=True, recurrent_dropout=0.4)))
model.add(SpatialDropout1D(0.2))
model.add(LSTM(lstm_out,return_sequences=False))
model.add(Dense(4,activation='softmax'))

print(model.summary())

model.compile(loss = 'categorical_crossentropy', optimizer='adam',metrics = ['accuracy'])

# Fit model
model.fit(x_train,y_train,epochs=epochs,batch_size=batch_size)

score,acc = model.evaluate(x_val, y_val, verbose = 2, batch_size = batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_7 (Embedding)      (None, 400, 128)          2560000   
_________________________________________________________________
spatial_dropout1d_9 (Spatial (None, 400, 128)          0         
_________________________________________________________________
bidirectional_6 (Bidirection (None, 400, 392)          509600    
_________________________________________________________________
spatial_dropout1d_10 (Spatia (None, 400, 392)          0         
_________________________________________________________________
lstm_10 (LSTM)               (None, 196)               461776    
_________________________________________________________________
dense_7 (Dense)              (None, 4)                 788       
Total params: 3,532,164
Trainable params: 3,532,164
Non-trainable params: 0
_________________________________________________________________


Score (loss, lower is better): 0.77

Accuracy: 0.75

Time taken: 4292 sec (54ms/step)

In [56]:
# Create model: a unidirectional LSTM followed by a Bidirectional LSTM
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, 128, input_length=MAX_SENT_LENGTH * MAX_SENTS))
model.add(SpatialDropout1D(0.2))
# return_sequences=True passes the full sequence on to the next recurrent layer
model.add(LSTM(lstm_out, dropout=0.4, recurrent_dropout=0.4, return_sequences=True))
model.add(SpatialDropout1D(0.2))
model.add(Bidirectional(LSTM(lstm_out, dropout=0.4, return_sequences=False, recurrent_dropout=0.4)))
model.add(Dense(4, activation='softmax'))

# summary() prints the table itself; wrapping it in print() also prints "None"
model.summary()

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit model
model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size)

# evaluate() returns [loss, accuracy]; "score" is the categorical cross-entropy loss
score, acc = model.evaluate(x_val, y_val, verbose=2, batch_size=batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_8 (Embedding)      (None, 400, 128)          2560000   
_________________________________________________________________
spatial_dropout1d_11 (Spatia (None, 400, 128)          0         
_________________________________________________________________
lstm_11 (LSTM)               (None, 400, 196)          254800    
_________________________________________________________________
spatial_dropout1d_12 (Spatia (None, 400, 196)          0         
_________________________________________________________________
bidirectional_7 (Bidirection (None, 392)               616224    
_________________________________________________________________
dense_8 (Dense)              (None, 4)                 1572      
Total params: 3,432,596
Trainable params: 3,432,596
Non-trainable params: 0
_________________________________________________________________


Score (loss, lower is better): 0.68

Accuracy: 0.75

Time taken: 4700 sec (59ms/step)

In [49]:
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from keras.layers import Bidirectional

# Hyperparameters shared by the one-epoch comparison runs
lstm_out = 196

batch_size = 32
epochs = 1

# Create model: a deep stack alternating LSTM and Bidirectional LSTM layers.
# Every recurrent layer except the last returns the full sequence.
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, 128, input_length=MAX_SENT_LENGTH * MAX_SENTS))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.4, recurrent_dropout=0.4, return_sequences=True))
model.add(SpatialDropout1D(0.3))
model.add(Bidirectional(LSTM(lstm_out, dropout=0.4, return_sequences=True, recurrent_dropout=0.4)))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.4, recurrent_dropout=0.4, return_sequences=True))
model.add(SpatialDropout1D(0.3))
model.add(Bidirectional(LSTM(lstm_out, dropout=0.4, return_sequences=True, recurrent_dropout=0.4)))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.4, recurrent_dropout=0.4, return_sequences=True))
model.add(SpatialDropout1D(0.3))
model.add(Bidirectional(LSTM(lstm_out, dropout=0.4, return_sequences=True, recurrent_dropout=0.4)))
model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2, return_sequences=True))
model.add(LSTM(lstm_out, dropout=0.4, recurrent_dropout=0.4, return_sequences=False))
model.add(Dense(4, activation='softmax'))

# summary() prints the table itself; wrapping it in print() also prints "None"
model.summary()

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit model
model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size)

# evaluate() returns [loss, accuracy]; "score" is the categorical cross-entropy loss
score, acc = model.evaluate(x_val, y_val, batch_size=batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_2 (Embedding)      (None, 400, 128)          2560000   
_________________________________________________________________
spatial_dropout1d_2 (Spatial (None, 400, 128)          0         
_________________________________________________________________
lstm_1 (LSTM)                (None, 400, 196)          254800    
_________________________________________________________________
spatial_dropout1d_3 (Spatial (None, 400, 196)          0         
_________________________________________________________________
bidirectional_1 (Bidirection (None, 400, 392)          616224    
_________________________________________________________________
spatial_dropout1d_4 (Spatial (None, 400, 392)          0         
_________________________________________________________________
lstm_3 (LSTM)                (None, 400, 196)          461776    
__________

Score (loss, lower is better): 0.81

Accuracy: 0.73

Time taken: 17506 sec (219ms/step)

In [50]:
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D
from keras.layers import Bidirectional

lstm_out = 196

# Run the best model for more epochs
# Create model: two stacked unidirectional LSTM layers
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, 128, input_length=MAX_SENT_LENGTH * MAX_SENTS))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.4, recurrent_dropout=0.4, return_sequences=True))
model.add(SpatialDropout1D(0.2))
model.add(LSTM(lstm_out, return_sequences=False))
model.add(Dense(4, activation='softmax'))

# summary() prints the table itself; wrapping it in print() also prints "None"
model.summary()

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit model: 5 epochs with batch size 64 (the runs above used 1 epoch and batch size 32)
model.fit(x_train, y_train, epochs=5, batch_size=64)

# evaluate() returns [loss, accuracy]; "score" is the categorical cross-entropy loss
score, acc = model.evaluate(x_val, y_val, batch_size=64)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_3 (Embedding)      (None, 400, 128)          2560000   
_________________________________________________________________
spatial_dropout1d_4 (Spatial (None, 400, 128)          0         
_________________________________________________________________
lstm_3 (LSTM)                (None, 400, 196)          254800    
_________________________________________________________________
spatial_dropout1d_5 (Spatial (None, 400, 196)          0         
_________________________________________________________________
lstm_4 (LSTM)                (None, 196)               308112    
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 788       
Total params: 3,123,700
Trainable params: 3,123,700
Non-trainable params: 0
_________________________________________________________________


Score (loss, lower is better): 0.68

Accuracy: 0.75

Time taken: 18ms/step
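To compare the runs reported in this section side by side, the figures can be collected into a small table. This is only a sketch: the numbers are copied from the results above, the short model labels are hypothetical names chosen for readability, and "loss" is the value the notebook prints as "score" (the first value returned by `model.evaluate`). The last run used 5 epochs and a batch size of 64, so it is not a like-for-like comparison with the one-epoch runs.

```python
# Results copied from this section; labels are illustrative, not from the notebook.
runs = [
    {"model": "Bidirectional-LSTM x2", "loss": 0.66, "acc": 0.76, "ms_per_step": 80},
    {"model": "Bidirectional-LSTM then LSTM", "loss": 0.77, "acc": 0.75, "ms_per_step": 54},
    {"model": "LSTM then Bidirectional-LSTM", "loss": 0.68, "acc": 0.75, "ms_per_step": 59},
    {"model": "Deep alternating stack", "loss": 0.81, "acc": 0.73, "ms_per_step": 219},
    {"model": "LSTM x2 (5 epochs, batch 64)", "loss": 0.68, "acc": 0.75, "ms_per_step": 18},
]

# Rank by loss (lower is better), breaking ties by time per step.
ranked = sorted(runs, key=lambda r: (r["loss"], r["ms_per_step"]))

for r in ranked:
    print(f"{r['model']:<30} loss={r['loss']:.2f} acc={r['acc']:.2f} {r['ms_per_step']:>4} ms/step")
```

Accuracy differs by at most three points across the runs, so training time per step is a reasonable tiebreaker when choosing which architecture to train for longer.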