[View in Colaboratory](https://colab.research.google.com/github/michalMalujdy/machine-learning-colab/blob/master/imdb_binary_classification.ipynb)

# Binary classification with Keras
Training a binary classifier on IMDB film reviews (positive or negative) with the Keras framework.

In [1]:
from keras.datasets import imdb

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

Using TensorFlow backend.


Downloading data from https://s3.amazonaws.com/text-datasets/imdb.npz




---



This function preprocesses a single review by turning its list of integers into a one-dimensional NumPy array (a vector) of length 10000. Every value is zero except at the indices that appear in the list (the word keys), where the value is set to 1.

In [0]:
import numpy as np

def preprocessReview(review):
  preprocessedVector = np.zeros(10000)
  
  for index in review:
    preprocessedVector[index] = 1
  
  return preprocessedVector
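
To make the encoding concrete, here is a minimal sketch of the same idea on a toy scale. The function name `multi_hot`, the 8-word vocabulary, and the toy review are assumptions chosen for illustration only; they are not part of the notebook.

```python
import numpy as np

def multi_hot(review, dimension):
    """Return a vector of zeros with ones at the positions listed in `review`."""
    vector = np.zeros(dimension)
    for index in review:
        vector[index] = 1  # a repeated index just sets the same slot again
    return vector

# Hypothetical review over an 8-word vocabulary; word 3 occurs twice.
toy_review = [1, 3, 3, 6]
encoded = multi_hot(toy_review, dimension=8)
print(encoded)  # [0. 1. 0. 1. 0. 0. 1. 0.]
```

Note that the encoding records only whether a word occurs, so word order and word counts are lost.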



---



This function applies the same encoding to every review in the passed list of lists and returns the result as a two-dimensional NumPy array with one row per review.

In [0]:
def preprocessReviews(reviews):

  # Size the array by the reviews passed in, not by the global train_data
  input_array = np.empty([len(reviews), 10000])

  for i, review in enumerate(reviews):
    input_array[i] = preprocessReview(review)
    
  return input_array
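
As an alternative sketch, the per-review helper can be folded into a single function that uses NumPy integer-array indexing to set all of a review's columns in one assignment. The name `vectorize_sequences` and the toy data are assumptions for illustration; the behaviour is intended to match the two functions above.

```python
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    """Multi-hot encode a list of integer lists into a 2-D float array."""
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        # Indexing a row with a list of columns sets all of them at once
        results[i, sequence] = 1.0
    return results

# Three hypothetical reviews over a 5-word vocabulary
toy_reviews = [[0, 2], [1, 1, 4], [3]]
matrix = vectorize_sequences(toy_reviews, dimension=5)
print(matrix.shape)  # (3, 5)
print(matrix)
```

Starting from `np.zeros` rather than `np.empty` means every row is well defined even before it is filled.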



---



Preprocess the input data and the labels. The labels only need to be a float32 NumPy array, so converting them from int to float is enough. The inputs go through the preprocessing functions defined above.

In [0]:
input_train = preprocessReviews(train_data)
input_test = preprocessReviews(test_data)

train_labels = np.array(train_labels).astype('float32')
test_labels = np.array(test_labels).astype('float32')



---



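Create the network model: two hidden layers of 16 ReLU units each, followed by a single sigmoid output unit, compiled with the RMSprop optimizer and the binary cross-entropy loss.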

In [0]:
from keras.models import Sequential
from keras.layers import Dense

net = Sequential()
net.add(Dense(16, activation = 'relu', input_dim = 10000))
net.add(Dense(16, activation = 'relu'))
net.add(Dense(1, activation = 'sigmoid'))

net.compile(optimizer = 'rmsprop',
           loss = 'binary_crossentropy',
           metrics = ['accuracy'])
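
To show what this architecture computes, the following is a rough NumPy sketch of the forward pass through three dense layers of sizes 16, 16 and 1. It is not how Keras implements the layers; the random placeholder weights, the helper names and the toy batch are all assumptions made for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
layer_sizes = [10000, 16, 16, 1]

# One (weights, bias) pair per dense layer, filled with small placeholder values
params = [(rng.normal(scale=0.01, size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = relu(x @ w1 + b1)           # first hidden layer, 16 units
    h2 = relu(h1 @ w2 + b2)          # second hidden layer, 16 units
    return sigmoid(h2 @ w3 + b3)     # one probability per review

# Hypothetical batch of 4 multi-hot encoded reviews
batch = np.zeros((4, 10000))
batch[:, :50] = 1.0
probabilities = forward(batch)

total_params = sum(w.size + b.size for w, b in params)
print(probabilities.shape)  # (4, 1)
print(total_params)         # 160305
```

The parameter count breaks down as 10000 * 16 + 16 for the first layer, 16 * 16 + 16 for the second, and 16 + 1 for the output layer.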



---



Train the network for 20 epochs with a batch size of 512.

In [6]:
training_history = net.fit(
    input_train, 
    train_labels, 
    epochs = 20, 
    batch_size = 512)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
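
For a sense of how much work these settings imply, here is a small arithmetic sketch. It assumes the standard IMDB training split of 25,000 reviews and one weight update per batch; the variable names are illustrative only.

```python
import math

num_samples = 25000   # assumed size of the IMDB training split
batch_size = 512
epochs = 20

# The last batch of each epoch is smaller when the split is not a multiple of 512
batches_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * epochs

print(batches_per_epoch, last_batch_size, total_updates)  # 49 424 980
```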




---



Display the accuracy on the test data.

In [7]:
(_, accuracy) = net.evaluate(input_test, test_labels)
print(accuracy)

0.84904
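
The reported figure is the fraction of test reviews classified correctly. As a rough sketch of that calculation for a sigmoid output, the snippet below thresholds predicted probabilities at 0.5 and compares them with the labels. The helper `binary_accuracy`, the threshold default and the toy values are assumptions for illustration, not the Keras internals.

```python
import numpy as np

def binary_accuracy(labels, probabilities, threshold=0.5):
    """Fraction of predictions that match the 0/1 labels after thresholding."""
    predictions = (probabilities > threshold).astype('float32')
    return float(np.mean(predictions == labels))

# Hypothetical labels and sigmoid outputs for five reviews
toy_labels = np.array([1, 0, 1, 1, 0], dtype='float32')
toy_probabilities = np.array([0.9, 0.2, 0.4, 0.7, 0.6], dtype='float32')

print(binary_accuracy(toy_labels, toy_probabilities))  # 0.6
```

Here the third and fifth reviews land on the wrong side of the threshold, so three of five predictions are correct.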




---



# Optional

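Decode a review from a list of integers (word indices in the dictionary) into readable text. Note that `imdb.load_data` shifts every word index by 3 by default (`index_from=3`), reserving 0, 1 and 2 for padding, start of sequence and unknown words, while `imdb.get_word_index()` returns the unshifted indices. This cell looks the integers up directly, without undoing that shift, so each integer maps to the wrong word and the output is not readable.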

In [8]:
from random import randint

word_index_dict = imdb.get_word_index()
index_word_dict = {value: key for (key, value) in word_index_dict.items()}

random_review_index = randint(0, len(test_data) - 1)
review_ints_list = test_data[random_review_index]
review_words_list = [index_word_dict[key] for key in review_ints_list]
review_text = ' '.join(review_words_list)

print(review_text)

Downloading data from https://s3.amazonaws.com/text-datasets/imdb_word_index.json
the always out vision wife of you but of how my this if derivative believe was plays good hour no that with how running this handful this of phone innocence was video looking that in at is return br etc and concept entertaining is you br as grasp precisely begins br longer just such he more good bottom slave may of shines and this to and this as you but despite this of how you and there is married starred opinion film be and all and and suave making uncomfortable this minute more it over add br be natural friends of amazing face i i by because come end cut we leaves she are grinch watching instantly who charm realizing may made to of and being slowly experience this of performance face that it remembered for and there honest is buildings they of and you church honest so plan but where sea another more but be you you prison it oh to and it other and easy watching comment this and never so and being most cu
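
The scrambled text comes from the default `index_from=3` offset that `imdb.load_data` applies to word indices, which `imdb.get_word_index()` does not include. A minimal sketch of an offset-aware decoder follows. It runs on a toy dictionary rather than the real word index; `decode_review`, the toy vocabulary and the `'?'` placeholder are assumptions for illustration.

```python
def decode_review(encoded_review, index_to_word, index_from=3):
    """Map shifted indices back to words; reserved or unknown indices become '?'."""
    # Indices 0, 1 and 2 are reserved (padding, start, unknown) and have no word
    return ' '.join(index_to_word.get(i - index_from, '?') for i in encoded_review)

# Hypothetical word index in the same {word: index} shape as imdb.get_word_index()
toy_word_index = {'the': 1, 'film': 2, 'was': 3, 'great': 4}
toy_index_to_word = {index: word for word, index in toy_word_index.items()}

# 1 marks the start of the sequence and 2 an out-of-vocabulary word
toy_encoded = [1, 4, 5, 6, 7, 2]
print(decode_review(toy_encoded, toy_index_to_word))  # ? the film was great ?
```

With the real data, the same call would take `index_word_dict` and a review from `test_data` in place of the toy values.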



---

