# About Data

IMDB dataset: A set of 50,000 highly polarized reviews from the Internet Movie Database. They’re split into 25,000 reviews for training and 25,000 reviews for testing, each set consisting of 50% negative and 50% positive reviews.

The IMDB dataset comes packaged with Keras. It has already been preprocessed: the reviews (sequences of words) have been turned into sequences of integers, where each integer stands for a specific word in a dictionary.
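
To make the integer encoding concrete, here is a minimal sketch of how a review could be mapped to integers. The tiny `toy_index` dictionary and the `encode` helper are hypothetical, made up for illustration; the real dictionary is the one that ships with Keras.

```python
# Hypothetical toy vocabulary: word -> integer (1 = most frequent word).
toy_index = {"the": 1, "movie": 2, "was": 3, "great": 4, "boring": 5}

def encode(text, index):
    """Map each known word to its integer; words missing from the index are skipped."""
    return [index[w] for w in text.lower().split() if w in index]

encoded = encode("The movie was great", toy_index)
print(encoded)  # [1, 2, 3, 4]
```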

### Objectives:
         i) Experiments in Keras
        ii) Learning to work with neural networks
       iii) Classify the data using a neural network
        iv) Work with text data
         v) Transform the data for use in the model

In [None]:
# 1.0 Call libraries

import pandas as pd
import os

In [None]:
#Loading the IMDB dataset
from keras.datasets import imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(
num_words=10000)

In [None]:
# The argument num_words=10000 keeps only the 10,000 most frequently
# occurring words in the training data. Rarer words are not kept as distinct
# tokens: by default Keras replaces them with the reserved "unknown" index (2).
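
The effect of a vocabulary cap can be sketched on a toy sequence. This assumes out-of-range indices are mapped to the reserved "unknown" index 2 (the default `oov_char` in `imdb.load_data`); the helper name `cap_vocabulary` and the sample indices are hypothetical.

```python
OOV = 2  # index reserved for "unknown" words in the Keras IMDB encoding

def cap_vocabulary(sequence, num_words, oov=OOV):
    """Replace every index >= num_words with the out-of-vocabulary index."""
    return [i if i < num_words else oov for i in sequence]

review = [1, 14, 22, 9500, 43, 12000]  # made-up word indices
capped = cap_vocabulary(review, num_words=10000)
print(capped)  # [1, 14, 22, 9500, 43, 2]
```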

In [None]:
# The variables train_data and test_data are lists of reviews.
# Each review is a list of word indices (encoding a sequence of words).
# train_labels and test_labels are lists of 0s and 1s, where 0 stands for negative and 1 for positive.

In [None]:
#train_data[0] #Each review is a list of word indices (encoding a sequence of words)

In [None]:
train_labels[0]

In [None]:
max([max(sequence) for sequence in train_data])   # 9999
# Because num_words=10000, no word index exceeds 9,999

In [None]:
#We can decode one of these reviews back to English words:

In [None]:
imdb.get_word_index?

In [None]:
word_index = imdb.get_word_index()
type(word_index)

In [None]:
#word_index.items() #dictionary of words (key) to index (value)

In [None]:
# Invert the mapping: dictionary of index (key) to word (value)
reverse_word_index = {value: key for (key, value) in word_index.items()}

In [None]:
decoded_review = ' '.join(
[reverse_word_index.get(i - 3, '?') for i in train_data[0]]) #Taking one review data

In [None]:
train_data[0]

In [None]:
decoded_review

In [None]:
'''Decodes the review. Note that the indices
are offset by 3 because 0, 1, and 2 are
reserved indices for “padding,” “start of sequence,” and “unknown”'''
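
The offset can be checked on a toy example. The small `toy_word_index` and the encoded `toy_review` are hypothetical stand-ins for `imdb.get_word_index()` and a real review; only the `i - 3` lookup mirrors the decoding step used on the real data.

```python
# Hypothetical toy word index: word -> rank, in the style of imdb.get_word_index()
toy_word_index = {"the": 1, "film": 2, "was": 3, "brilliant": 4}
toy_reverse = {rank: word for word, rank in toy_word_index.items()}

# Encoded review: 1 = start of sequence, 2 = unknown, real words are stored as rank + 3
toy_review = [1, 4, 5, 6, 2, 7]

# Reserved indices map to no word after subtracting 3, so they decode to '?'
decoded = " ".join(toy_reverse.get(i - 3, "?") for i in toy_review)
print(decoded)  # ? the film was ? brilliant
```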

In [None]:
train_data.ndim

In [None]:
train_data.shape

In [None]:
#list(enumerate(train_data))

In [None]:
# Encoding the integer sequences into a binary matrix

import numpy as np
def vectorize_sequences(sequences, dimension=10000):
    # Creates an all-zero matrix of shape (len(sequences), dimension)
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        # Sets specific indices of results[i] to 1s
        results[i, sequence] = 1.
    return results
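
A quick way to see what this encoding does is to run it on two tiny sequences with a small `dimension`. The inputs below are made up; the function is repeated so the sketch runs on its own.

```python
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    # One row per sequence, one column per possible word index
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.0
    return results

toy = vectorize_sequences([[0, 3], [1, 1, 4]], dimension=5)
print(toy)
# [[1. 0. 0. 1. 0.]
#  [0. 1. 0. 0. 1.]]
```

Note that the repeated index 1 in the second sequence still yields a single 1: this encoding records which words occur, not how often or in what order.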

In [None]:
x_train = vectorize_sequences(train_data) #Vectorized training data

In [None]:
x_train

In [None]:
x_test = vectorize_sequences(test_data) #Vectorized test data

In [None]:
# The labels are already 0/1 integers; convert them to float32 arrays
y_train = np.asarray(train_labels).astype('float32')
y_test = np.asarray(test_labels).astype('float32')

In [None]:
y_train

In [None]:
'''
The model below is a stack of Dense layers. There are two key architecture decisions to be made about such a stack:
1. How many layers to use
2. How many hidden units to choose for each layer
'''

In [None]:
#The model definition
from keras import models
from keras import layers
model = models.Sequential()
model.add(layers.Dense(16, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(16, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
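
To get a feel for how the layer sizes drive model size, the number of trainable parameters can be worked out by hand. This is a rough sketch, assuming each Dense layer has a full weight matrix plus one bias per unit; `dense_params` is a hypothetical helper, and `model.summary()` should report the same total.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input width followed by the unit counts of the three Dense layers
layer_sizes = [10000, 16, 16, 1]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer, total_params)  # [160016, 272, 17] 160305
```

Almost all parameters sit in the first layer, because it connects the 10,000-dimensional input to the 16 hidden units.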

In [None]:
# Choose a loss function and an optimizer

'''
Crossentropy is a quantity from the field of information theory
that measures the distance between probability distributions or, in this
case, between the ground-truth distribution and your predictions.
'''
# Compiling the model
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
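
As an illustration of what the loss measures, here is a plain-Python sketch of binary crossentropy averaged over a batch. It is not the Keras implementation; the function below and its `eps` clipping constant are assumptions chosen for illustration.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Same labels, once with confident correct predictions, once with confident wrong ones
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(round(confident_right, 4), round(confident_wrong, 4))  # 0.1054 2.3026
```

The loss is small when predicted probabilities agree with the labels and grows quickly when the model is confidently wrong.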

In [None]:
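# Configuring the optimizer and passing the loss and metric as function objects
from keras import optimizers
from keras import losses
from keras import metrics
# Note: recent Keras releases name this argument learning_rate; older releases used lr
model.compile(optimizer=optimizers.RMSprop(learning_rate=0.002),
              loss=losses.binary_crossentropy,
              metrics=[metrics.binary_accuracy])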

In [None]:
#Setting aside a validation set
x_val = x_train[:10000]
partial_x_train = x_train[10000:]
y_val = y_train[:10000]
partial_y_train = y_train[10000:]

In [None]:
# Training your model
# (the model was last compiled with metrics.binary_accuracy, so the history keys use that name)
history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=10,
                    batch_size=512,
                    validation_data=(x_val, y_val))

In [None]:
'''
The call to model.fit() returns a History object. This object has a member
history, which is a dictionary containing data about everything that happened
during training.
'''

In [None]:
history_dict = history.history
history_dict.keys()

In [None]:
# With the compile call used above, the expected keys are
# ['loss', 'binary_accuracy', 'val_loss', 'val_binary_accuracy'] (order may vary).
# The dictionary contains four entries: one per metric that was being monitored during
# training and during validation
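
One common use of this dictionary is to find the epoch where validation loss was lowest. The sketch below uses a hypothetical `toy_history` with made-up numbers, shaped like `history.history`; it does not reflect results from this model.

```python
# Hypothetical history dict with made-up numbers, shaped like history.history
toy_history = {
    "loss": [0.52, 0.31, 0.23, 0.18],
    "val_loss": [0.40, 0.30, 0.28, 0.29],
}

val_loss = toy_history["val_loss"]
best_epoch = val_loss.index(min(val_loss)) + 1  # epochs are numbered from 1
print(best_epoch)  # 3
```

In this toy data the training loss keeps falling while validation loss turns upward after epoch 3, which is the usual signature of overfitting.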

In [None]:
#Plotting the training and validation loss
import matplotlib.pyplot as plt
history_dict = history.history
loss_values = history_dict['loss']
val_loss_values = history_dict['val_loss']
# Derive the epoch axis from the recorded history instead of hard-coding 10
epochs = range(1, len(loss_values) + 1)
plt.plot(epochs, loss_values, 'bo', label='Training loss')
plt.plot(epochs, val_loss_values, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

In [None]:
# Plotting the training and validation accuracy
binary_accuracy_values = history_dict['binary_accuracy']
val_binary_accuracy_values = history_dict['val_binary_accuracy']
epochs = range(1, len(binary_accuracy_values) + 1)
plt.plot(epochs, binary_accuracy_values, 'bo', label='Training Accuracy')
plt.plot(epochs, val_binary_accuracy_values, 'b', label='Validation Accuracy')
plt.title('Training and validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()