# Keras Neural Network Tutorial
Teacher: Carl Shan

Before running this Jupyter Notebook, make sure you've run the requisite commands to install tensorflow and keras.

If you haven't, here are the commands you need to run in Terminal:

* `sudo easy_install -U pip` - Makes sure you have the latest version of pip installed
* `sudo pip3 install tensorflow` - Installs TensorFlow
* `sudo pip3 install keras` - Installs Keras



This notebook consists of two parts:

* **Part 1**: Showing how to set up and use Keras
* **Part 2**: Using Keras on a multiclass classification problem

# Part 1: Showing how to set up and use Keras

## Step 0: Importing libraries

In [1]:
import tensorflow as tf
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation

Using TensorFlow backend.


## Step 1: Setting Up the Model

In [2]:
# Initializing a model
model = Sequential()

# The first layer of your network needs to know the input shape
input_shape = (784, )

# Adding additional layers
model.add(Dense(units=32, input_shape=input_shape))
model.add(Activation('relu'))
model.add(Dense(units=10))
model.add(Activation('softmax'))
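
To get a feel for the size of this network, the number of trainable parameters in each `Dense` layer can be worked out by hand: one weight per (input, unit) pair plus one bias per unit. The sketch below is plain Python rather than Keras, and `dense_param_count` is a hypothetical helper name used only for illustration. Calling `model.summary()` in Keras should report matching counts.

```python
def dense_param_count(n_inputs, units):
    """Weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units

# Layer sizes taken from the model above: 784 inputs -> 32 units -> 10 units.
hidden = dense_param_count(784, 32)
output = dense_param_count(32, 10)
total = hidden + output  # Activation layers add no parameters

print(hidden, output, total)
```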

## Step 2: Compiling the Model

[According to the Keras tutorial](https://keras.io/getting-started/sequential-model-guide/): 

*Before training a model, you need to configure the learning process, which is done via the compile method. It receives three arguments:*

1. **An optimizer**. This could be the string identifier of an existing optimizer (such as `'rmsprop'` or `'adagrad'`), or an instance of the `Optimizer` class. See the full list of [optimizers](https://keras.io/optimizers/).
2. **A loss function**. This is the objective that the model will try to minimize. It can be the string identifier of an existing loss function (such as `'categorical_crossentropy'` or `'mse'`), or it can be an objective function. See the Keras documentation on losses.
3. **A list of metrics**. For any classification problem you will want to set this to `metrics=['accuracy']`. A metric could be the string identifier of an existing metric or a custom metric function.


*Below are some examples of how you can compile a model with different parameters.*


### Example: For a multi-class classification problem


In [3]:
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
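
To make the `'categorical_crossentropy'` loss less of a black box, here is a minimal plain-Python sketch of what it measures for a single sample: the softmax layer turns raw scores into probabilities, and the loss is the negative log of the probability assigned to the true class. The function names and the `eps` clipping value are assumptions for illustration; Keras's own implementation works on batches of tensors.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """y_true is a one-hot vector; y_pred is a probability vector."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

probs = softmax([2.0, 1.0, 0.1])                   # made-up scores for 3 classes
loss = categorical_crossentropy([1, 0, 0], probs)  # true class is index 0
print(probs, loss)
```

The loss shrinks toward 0 as the probability of the true class approaches 1.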

### Example: For a binary classification problem

In [4]:
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
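
For comparison, `'binary_crossentropy'` treats each output as an independent yes/no probability, which is why it is usually paired with a single sigmoid output unit rather than the 10-unit softmax defined above. The sketch below is a plain-Python illustration under that assumption; the helper name and `eps` value are hypothetical.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Average log loss; y_true holds 0/1 labels, y_pred holds probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Made-up predictions: confident and correct vs. confident and wrong.
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```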

### Example: For a mean squared error regression problem


In [5]:
model.compile(optimizer='rmsprop',
              loss='mse')
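
The `'mse'` loss is the average of the squared differences between targets and predictions. No `metrics=['accuracy']` is passed here because accuracy is not meaningful for continuous targets. A minimal sketch with made-up numbers (the function name is chosen for illustration):

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Errors are 0.5, -0.5 and 0.0, so the squared errors are 0.25, 0.25 and 0.0.
mse = mean_squared_error([3.0, -0.5, 2.0], [2.5, 0.0, 2.0])
print(mse)
```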

## Step 3: Training the Model

Keras models are trained on NumPy arrays of input data and labels. To train a model, you will typically use the `.fit` method. Read its [documentation here](https://keras.io/models/sequential/).
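
Conceptually, training walks through the input data and labels in small batches, updating the weights after each batch, and repeats the whole pass once per epoch. The sketch below shows only the batching part in plain Python; it ignores shuffling and the weight updates themselves. `iterate_minibatches` and the toy data are hypothetical and are not part of Keras.

```python
def iterate_minibatches(inputs, labels, batch_size):
    """Yield consecutive (inputs, labels) slices of at most batch_size rows."""
    for start in range(0, len(inputs), batch_size):
        end = start + batch_size
        yield inputs[start:end], labels[start:end]

# Toy data: 10 samples with 2 features each, values made up for illustration.
toy_x = [[i, i + 1] for i in range(10)]
toy_y = [i % 2 for i in range(10)]

# One epoch with batch_size=4: the last batch is smaller than the others.
batch_sizes = [len(xb) for xb, yb in iterate_minibatches(toy_x, toy_y, batch_size=4)]
print(batch_sizes)
```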

# Part 2: Training a Neural Network on the Reuters Newswire data
Source: [Official Keras Github Repo](https://github.com/fchollet/keras/blob/master/examples/reuters_mlp.py)

In [6]:
from __future__ import print_function

import numpy as np
import keras
from keras.datasets import reuters
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.preprocessing.text import Tokenizer

In [7]:
# Setting the hyperparameters
max_words = 1000
batch_size = 32
epochs = 5


# Downloading the Reuters data
print('Loading data...')
(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=max_words,
                                                         test_split=0.2)


Loading data...
Downloading data from https://s3.amazonaws.com/text-datasets/reuters.npz


### Understanding the Dataset
If you are curious about the Reuters dataset, see [this link in the official Keras documentation](https://keras.io/datasets/#reuters-newswire-topics-classification).

Here are some details:

> The Reuters dataset is a dataset of 11,228 newswires from Reuters, labeled over 46 topics. As with the IMDB dataset, each wire is encoded as a sequence of word indexes (same conventions).

> Each newswire is encoded as a sequence of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer "3" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: "only consider the top 10,000 most common words, but eliminate the top 20 most common words".

> As a convention, "0" does not stand for a specific word, but instead is used to encode any unknown word.
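
The "quick filtering" idea in the quote can be sketched in a few lines: keep only indexes inside a chosen frequency range and replace everything else with a placeholder index. This is a plain-Python illustration, not the Keras implementation. `filter_sequence` is a hypothetical helper, and the placeholder value is left as a parameter (`oov_index`, defaulting to 2 as an assumption) because the index reserved for unknown words depends on the arguments and Keras version used with `reuters.load_data`.

```python
def filter_sequence(sequence, num_words, skip_top=0, oov_index=2):
    """Keep indexes in [skip_top, num_words); replace the rest with oov_index."""
    return [w if skip_top <= w < num_words else oov_index for w in sequence]

wire = [1, 5, 27, 1500, 12, 9000, 43]  # made-up word indexes

# Keep only the 1,000 most frequent words.
top_1000 = filter_sequence(wire, num_words=1000)

# Keep the top 1,000 words but drop the 20 most frequent ones.
top_1000_minus_20 = filter_sequence(wire, num_words=1000, skip_top=20)
print(top_1000, top_1000_minus_20)
```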

In [8]:
# Inspecting the loaded data
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')

num_classes = np.max(y_train) + 1
print(num_classes, 'classes')

8982 train sequences
2246 test sequences
46 classes


In [9]:
# Each element of x_train is one newswire, stored as a list of word indexes.
# Per the documentation quoted above, the integer "3" encodes the 3rd most frequent word in the data.
x_train[0]

[1, 2, 2, 8, 43, 10, 447, 5, 25, 207, 270, 5, 2, 111, 16, 369, 186, 90, 67, 7,
 89, 5, 19, 102, 6, 19, 124, 15, 90, 67, 84, 22, 482, 26, 7, 48, 4, 49, 8, 864,
 39, 209, 154, 6, 151, 6, 83, 11, 15, 22, 155, 11, 15, 7, 48, 9, 2, 2, 504, 6,
 258, 6, 272, 11, 15, 22, 134, 44, 11, 15, 16, 8, 197, 2, 90, 67, 52, 29, 209, 30,
 32, 132, 6, 109, 15, 17, 12]

In [10]:
# There are 46 total classes (i.e., newswire topics) our network will learn to classify
np.unique(y_train)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45])

### Processing the Data

In [11]:
num_classes = np.max(y_train) + 1
print(num_classes, 'classes')

print('Vectorizing sequence data...')
tokenizer = Tokenizer(num_words=max_words)
x_train = tokenizer.sequences_to_matrix(x_train, mode='binary')
x_test = tokenizer.sequences_to_matrix(x_test, mode='binary')
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

print('Convert class vector to binary class matrix '
      '(for use with categorical_crossentropy)')
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print('y_train shape:', y_train.shape)
print('y_test shape:', y_test.shape)

46 classes
Vectorizing sequence data...
x_train shape: (8982, 1000)
x_test shape: (2246, 1000)
Convert class vector to binary class matrix (for use with categorical_crossentropy)
y_train shape: (8982, 46)
y_test shape: (2246, 46)
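
The two transformations in this cell can be illustrated for a single sample. In `'binary'` mode, each newswire becomes a fixed-length row with a 1 at every word index that appears in it (word order and repeat counts are discarded), and each integer label becomes a one-hot row of length `num_classes`. The sketch below is plain Python with hypothetical helper names; it mirrors the idea, not the Keras source, and it assumes out-of-range indexes are simply skipped.

```python
def sequence_to_binary_row(sequence, num_words):
    """Row of length num_words with 1.0 at each index present in the sequence."""
    row = [0.0] * num_words
    for index in sequence:
        if 0 <= index < num_words:  # assumption: skip out-of-range indexes
            row[index] = 1.0
    return row

def one_hot(label, num_classes):
    """Row of length num_classes with a single 1.0 at position label."""
    row = [0.0] * num_classes
    row[label] = 1.0
    return row

# First five indexes of the sample shown earlier; the repeated 2 counts once.
row = sequence_to_binary_row([1, 2, 2, 8, 43], num_words=1000)
target = one_hot(3, num_classes=46)
print(len(row), sum(row), len(target), target.index(1.0))
```

This is why `x_train` ends up with 1000 columns and `y_train` with 46.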


### Building, Compiling and Fitting the Model

In [12]:
print('Building model...')
model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))

print('Compiling model...')
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

print('Fitting model...')
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_split=0.1)

Building model...
Compiling model...
Fitting model...
Train on 8083 samples, validate on 899 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
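
The sample counts in the training log follow from `validation_split=0.1`: a tenth of the 8,982 training sequences is held out for validation and the rest is used for weight updates. The arithmetic below is a sketch that assumes the split point is the sample count times `(1 - validation_split)`, truncated to an integer; that assumption reproduces the numbers printed above. It also shows how many batches of 32 make up one epoch.

```python
import math

n_samples = 8982        # training sequences reported when the data was loaded
validation_split = 0.1
batch_size = 32

# Assumption: truncate the split point to a whole number of samples.
split_at = int(n_samples * (1.0 - validation_split))
n_train = split_at
n_val = n_samples - split_at

# The last batch of an epoch may hold fewer than batch_size samples.
steps_per_epoch = math.ceil(n_train / batch_size)
print(n_train, n_val, steps_per_epoch)
```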


### Scoring the Model

In [13]:
score = model.evaluate(x_test, y_test,
                       batch_size=batch_size, verbose=1)
print('Test score:', score[0])
print('Test accuracy:', score[1])

Test score: 0.886729920218
Test accuracy: 0.791184327694
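
Here `score[0]` is the loss on the test set (categorical cross-entropy, as configured in `compile`) and `score[1]` is the accuracy metric. For a classifier with a softmax output, accuracy is the fraction of samples whose highest-probability class matches the true class. A minimal plain-Python sketch with made-up predictions (the helper names are hypothetical):

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(y_true_one_hot, y_pred_probs):
    """Fraction of rows where the predicted class equals the true class."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true_one_hot, y_pred_probs))
    return hits / len(y_true_one_hot)

y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]

acc = accuracy(y_true, y_pred)  # the third sample is misclassified
print(acc)
```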
