# IMDB movie review classification with MLPs

In this notebook, we'll train a multi-layer perceptron model to classify IMDB movie reviews using **Keras** (version $\ge$ 2 required). This notebook is largely based on the [Classifying movie reviews notebook](https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/3.5-classifying-movie-reviews.ipynb) by François Chollet.

First, we import the modules we need. Keras reports which backend (Theano, TensorFlow, or CNTK) it is using.

In [None]:
%matplotlib inline

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
#from keras.utils import np_utils
from keras import backend as K

from distutils.version import LooseVersion as LV
from keras import __version__

from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

print('Using Keras version:', __version__, 'backend:', K.backend())
assert LV(__version__) >= LV("2.0.0")

## IMDB dataset

Next, we'll load the IMDB dataset. The first time this cell is run, the data may need to be downloaded, which can take a while.

The dataset has already been preprocessed, and each word has been replaced by an integer index. (Word indices are offset by 3, as "0" is reserved for padding, "1" is used to mark the start of a review, and "2" represents all out-of-vocabulary words.)
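
To make the encoding concrete, here is a minimal sketch that applies the same scheme to a made-up vocabulary. The names `toy_word_index` and `encode_review` are hypothetical and not part of Keras, and the sketch assumes that any index at or above `num_words` is replaced by the out-of-vocabulary marker.

```python
# Toy illustration of the encoding scheme; the vocabulary below is made up.
PAD, START, OOV = 0, 1, 2   # reserved indices
INDEX_FROM = 3              # offset added to every word rank

# Hypothetical word ranks (1 = most frequent), in the style of imdb.get_word_index()
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4, "boring": 5}

def encode_review(text, word_index, num_words=None):
    """Encode a whitespace-tokenized review as a list of integer indices."""
    encoded = [START]
    for word in text.lower().split():
        rank = word_index.get(word)
        if rank is None:
            # Word is not in the vocabulary at all
            encoded.append(OOV)
            continue
        index = rank + INDEX_FROM
        # Assumption: indices at or above num_words count as out-of-vocabulary
        if num_words is not None and index >= num_words:
            encoded.append(OOV)
        else:
            encoded.append(index)
    return encoded

encoded = encode_review("The movie was surprisingly great", toy_word_index)
print(encoded)  # [1, 4, 5, 6, 2, 7]
```

The most frequent word ("the", rank 1) ends up at index 4, and the unknown word "surprisingly" becomes 2.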

In [None]:
from keras.datasets import imdb

# keep only this many of the most frequent words
nb_words = 10000

(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=nb_words)

In [None]:
print()
print('IMDB data loaded: train:', len(X_train), 'test:', len(X_test))
print('X_train:', X_train.shape)
print('y_train:', y_train.shape)

Here is the first review in the training set, together with its length and class label:

In [None]:
print(X_train[0], "length:", len(X_train[0]), "class:", y_train[0])

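As a sanity check, we can convert the review back to text. Each index is shifted back by 3 before the lookup, so the reserved indices (and any index missing from the word index) are printed as "?":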

In [None]:
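# Invert the word index so that integer ranks map back to words
reverse_word_index = {value: key for (key, value) in imdb.get_word_index().items()}
# Undo the index offset of 3 before looking up each word
decoded_review = ' '.join(reverse_word_index.get(i - 3, '?') for i in X_train[0])
print(decoded_review)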
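
The decoding step can also be written so that the reserved indices are shown explicitly instead of as "?". The following is only a sketch on a made-up vocabulary: `toy_word_index`, `decode_review`, and the placeholder strings `<PAD>`, `<START>`, and `<OOV>` are hypothetical names chosen here, not something Keras provides.

```python
# Toy decoder; the vocabulary is made up and stands in for imdb.get_word_index().
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}
toy_reverse_index = {rank: word for word, rank in toy_word_index.items()}

# Placeholder strings for the reserved indices (names chosen for this sketch)
SPECIAL_TOKENS = {0: "<PAD>", 1: "<START>", 2: "<OOV>"}
INDEX_FROM = 3  # same offset as in the encoded reviews

def decode_review(indices, reverse_index):
    """Turn a list of integer indices back into a readable string."""
    words = []
    for i in indices:
        if i in SPECIAL_TOKENS:
            words.append(SPECIAL_TOKENS[i])
        else:
            # Shift back by the offset; unknown ranks fall back to '?'
            words.append(reverse_index.get(i - INDEX_FROM, "?"))
    return " ".join(words)

decoded = decode_review([1, 4, 5, 6, 2, 7], toy_reverse_index)
print(decoded)  # <START> the movie was <OOV> great
```

With the real data, the same function could be called as `decode_review(X_train[0], reverse_word_index)`.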