# Convolutional Neural Networks (CNN) with Keras


This notebook shows a simple implementation of a deep CNN in Keras with a TensorFlow backend. It does not cover the theory of CNNs and focuses on the implementation instead.
Topics covered here:

  *  Keras Sequential Model
  *  Keras Functional API
  *  Implementation of an inception model using the functional API


#### Load Modules and MNIST Dataset

In [1]:
%matplotlib inline
import tensorflow as tf
import math
import numpy as np
from tqdm import tqdm
import matplotlib.pyplot as plt


from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets(".", one_hot=True, reshape=False)

train_features = mnist.train.images
train_labels = mnist.train.labels
valid_features = mnist.validation.images
valid_labels = mnist.validation.labels
test_features = mnist.test.images
test_labels = mnist.test.labels

Extracting ./train-images-idx3-ubyte.gz
Extracting ./train-labels-idx1-ubyte.gz
Extracting ./t10k-images-idx3-ubyte.gz
Extracting ./t10k-labels-idx1-ubyte.gz


In [16]:
from keras.models import Sequential, Model
from keras.layers import Input, concatenate, Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D, AveragePooling2D
from keras.optimizers import SGD, Adam

## Keras Sequential Model

The Sequential model is a linear stack of layers.

```python
model = Sequential()
```

#### Specifying the input shape
The model needs to know what input shape it should expect. For this reason, the first layer in a Sequential model (and only the first, because following layers can do automatic shape inference) needs to receive information about its input shape.

```python
# Define input shape
model.add(Convolution2D(32, (3, 3), padding='valid', input_shape=(28, 28, 1)))
```
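
As a small illustration of what input_shape refers to, the sketch below derives it from a batch of images. The zero-filled array is a stand-in, assumed to match the layout of train_features when the data is loaded with reshape=False. The shape passed to the first layer leaves out the leading batch dimension.

```python
import numpy as np

# Stand-in for a batch of MNIST images (assumed layout: batch, height, width, channels).
dummy_batch = np.zeros((5, 28, 28, 1), dtype=np.float32)

# input_shape describes a single sample, so the batch dimension is dropped.
input_shape = dummy_batch.shape[1:]
print(input_shape)
```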

#### Adding layers
A Sequential model can be created by passing a list of layer instances to the constructor, or by adding layers via the .add() method as above.
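
The list form could look like the following sketch. The layer choices here are illustrative only and are not the model trained later in this notebook; the name list_model is made up for the example.

```python
from keras.models import Sequential
from keras.layers import Convolution2D, Activation, Flatten, Dense

# Pass the layers as a list instead of calling .add() repeatedly.
list_model = Sequential([
    Convolution2D(32, (3, 3), padding='valid', input_shape=(28, 28, 1)),
    Activation('relu'),
    Flatten(),
    Dense(10),
    Activation('softmax'),
])

print(len(list_model.layers))
```

Both forms should produce the same stack of layers; the list form is mainly a matter of taste for short models.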


#### Compilation

Before training a model, the learning process must be configured, which is done via the compile method. It receives three arguments:

* An optimizer. This could be the string identifier of an existing optimizer (such as rmsprop or adagrad), or an instance of the Optimizer class. 

* A loss function. This is the objective that the model will try to minimize. It can be the string identifier of an existing loss function (such as categorical_crossentropy or mse), or it can be an objective function. 

* A list of metrics. For any classification problem you will want to set this to metrics=['accuracy']. A metric could be the string identifier of an existing metric or a custom metric function.

```python
# For a multi-class classification problem
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```
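
To make the loss and metric in this call concrete, here is a NumPy sketch of what categorical_crossentropy and accuracy compute for one-hot labels. It is a simplified illustration, not Keras's internal implementation; the helper names and the toy probabilities are made up for the example.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class matches the label."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))

# Toy example: three samples, three classes (hypothetical values).
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.5, 0.3, 0.2]])

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print('loss: {:.4f}, acc: {:.4f}'.format(loss, acc))
```

The third sample is misclassified (class 0 is predicted instead of class 2), so it lowers the accuracy and contributes the largest term to the loss.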

#### Training

Keras models are trained on Numpy arrays of input data and labels. For training a model, you will typically use the fit function.

```python
model.fit(train_features, train_labels, epochs=2, batch_size=64, validation_split=0.2)
```
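
The arithmetic behind this call can be sketched as follows. This assumes the Keras behavior of holding out the last fraction of the samples (taken before shuffling) when validation_split is set, and of using a smaller final batch when the sample count is not a multiple of batch_size. The helper name is hypothetical.

```python
import math

def split_and_count_batches(n_samples, validation_split, batch_size):
    """Return (n_train, n_val, batches_per_epoch) for a fit() call."""
    # Samples before this index are used for training, the rest for validation.
    split_at = int(n_samples * (1.0 - validation_split))
    n_train = split_at
    n_val = n_samples - split_at
    # The last batch may be smaller, hence the ceiling.
    batches_per_epoch = math.ceil(n_train / batch_size)
    return n_train, n_val, batches_per_epoch

# 55000 is the number of training images in the MNIST split used in this notebook.
n_train, n_val, n_batches = split_and_count_batches(55000, 0.2, 64)
print(n_train, n_val, n_batches)
```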

### Build Model and Train

In [20]:
model = Sequential()

model.add(Convolution2D(32, (3, 3), padding='valid', input_shape=(28, 28, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

model.add(Convolution2D(64, (3, 3), padding='valid'))
model.add(Activation('relu'))

model.add(Convolution2D(64, (3, 3), padding='valid'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

model.add(Flatten())
model.add(Dense(256, kernel_initializer='uniform'))
model.add(Activation('relu'))

model.add(Dense(10, kernel_initializer='uniform'))
model.add(Activation('softmax'))

# Compile and train the model here.

adam = Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])

history = model.fit(train_features, train_labels, epochs=2, batch_size=64, 
                    validation_data=(valid_features, valid_labels))

Train on 55000 samples, validate on 5000 samples
Epoch 1/2
Epoch 2/2
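
For reference, the feature-map sizes and parameter counts of the model above can be worked out by hand. The sketch below assumes the Keras defaults used in the cell: 'valid' convolutions with stride 1, and max pooling whose stride equals its pool size. The helper functions are written for this illustration.

```python
def conv_valid(size, kernel):
    """Output size of a stride-1 'valid' convolution along one dimension."""
    return size - kernel + 1

def pool(size, window):
    """Output size of non-overlapping pooling (stride equal to window)."""
    return size // window

def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a square-kernel convolution."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

size = 28
size = pool(conv_valid(size, 3), 2)   # conv 32: 26x26, pool: 13x13
size = conv_valid(size, 3)            # conv 64: 11x11
size = pool(conv_valid(size, 3), 2)   # conv 64: 9x9, pool: 4x4
flat = size * size * 64               # Flatten: 4 * 4 * 64 values

total_params = (conv_params(3, 1, 32) + conv_params(3, 32, 64)
                + conv_params(3, 64, 64) + dense_params(flat, 256)
                + dense_params(256, 10))
print(size, flat, total_params)
```

Under these assumptions most of the parameters sit in the first fully connected layer, which is typical for small CNNs that flatten before a wide Dense layer.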


### Evaluate model on test data

In [8]:
metrics = model.evaluate(test_features, test_labels, batch_size=64)
for metric_name, metric_value in zip(model.metrics_names, metrics):
    print('{}: {}'.format(metric_name, metric_value))

acc: 0.9836


## Keras Functional API

The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.

* A layer instance is callable (on a tensor), and it returns a tensor
* Input tensor(s) and output tensor(s) can then be used to define a Model

```python
model = Model(inputs=input_img, outputs=out)
```

* Such a model can be trained just like Keras Sequential models.

In [21]:
# construct the network
from keras.models import Model

# Define input shape - this returns a tensor
input_img = Input(shape=(28, 28, 1))

# Add Convolutional and MaxPooling Layers with Dropout
c1 = Convolution2D(64, (3, 3))(input_img)
c1 = MaxPooling2D((2, 2))(c1)

c1_drop = Dropout(0.5)(c1)

c2 = Convolution2D(64, (3, 3))(c1_drop)
c2 = MaxPooling2D((2, 2))(c2)

# Add an Inception Module
tower_1 = Convolution2D(64, (1, 1), padding='same', activation='relu')(c2)
tower_1 = Convolution2D(64, (3, 3), padding='same', activation='relu')(tower_1)

tower_2 = Convolution2D(64, (1, 1), padding='same', activation='relu')(c2)
tower_2 = Convolution2D(64, (5, 5), padding='same', activation='relu')(tower_2)

tower_3 = MaxPooling2D((3, 3), strides=(1, 1), padding='same')(c2)
tower_3 = Convolution2D(64, (1, 1), padding='same', activation='relu')(tower_3)

# Join the towers along the channel axis (last axis in the channels-last layout)
inception = concatenate([tower_1, tower_2, tower_3], axis=-1)

# Average Pool and dropout
av_pool = AveragePooling2D((3,3), strides=(1,1))(inception)
av_pool_drop = Dropout(0.5)(av_pool)

# Flatten 
flat = Flatten()(av_pool_drop)

# Add a fully connected layer
fc = Dense(256, activation='relu')(flat)

# Add output layer
out = Dense(10, activation='softmax')(fc)

func_model = Model(inputs=input_img, outputs=out)

# Compile and train the model here

adam = Adam(lr=0.0007, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
func_model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])

history = func_model.fit(train_features, train_labels, epochs=2, batch_size=64,
                         validation_data=(valid_features, valid_labels))

Train on 55000 samples, validate on 5000 samples
Epoch 1/2
Epoch 2/2
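
The axis argument of concatenate determines how the three towers are joined. The NumPy sketch below uses zero-filled stand-ins for the tower outputs; their 5x5x64 size is what the layers above should produce for a 28x28 input under Keras defaults (28 -> 26 -> 13 -> 11 -> 5), and the batch size of 2 is arbitrary. In the channels-last layout, joining along the last axis stacks feature maps and keeps the spatial size, whereas axis=1 stacks the towers along the height dimension.

```python
import numpy as np

# Stand-ins for tower_1, tower_2, tower_3: (batch, height, width, channels).
towers = [np.zeros((2, 5, 5, 64)) for _ in range(3)]

# Joining on the channel axis keeps 5x5 maps and yields 3 * 64 channels.
by_channels = np.concatenate(towers, axis=-1)

# Joining on axis 1 stacks the maps vertically instead.
by_height = np.concatenate(towers, axis=1)

print(by_channels.shape, by_height.shape)
```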


### Evaluate model on test data

In [22]:
# Evaluate model on test data

metrics = func_model.evaluate(test_features, test_labels, batch_size=64)
for metric_name, metric_value in zip(func_model.metrics_names, metrics):
    print('{}: {}'.format(metric_name, metric_value))

acc: 0.9858
