# Keras Intro: Convolutional Models

Keras Documentation: https://keras.io

In this notebook we explore how to use Keras to implement convolutional models.

## Machine learning on images

In [None]:
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

### MNIST

In [None]:
from keras.datasets import mnist

In [None]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [None]:
X_train.shape

In [None]:
X_test.shape

In [None]:
plt.imshow(X_train[0], cmap='gray')

In [None]:
X_train_flat = X_train.reshape(-1, 28*28)
X_test_flat = X_test.reshape(-1, 28*28)

In [None]:
X_train_flat.shape

In [None]:
X_train_sc = X_train_flat.astype('float32') / 255.0
X_test_sc = X_test_flat.astype('float32') / 255.0

In [None]:
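# to_categorical is exposed directly under keras.utils; the np_utils path is not available in newer releases
from keras.utils import to_categorical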

In [None]:
y_train_cat = to_categorical(y_train)
y_test_cat = to_categorical(y_test)

In [None]:
y_train[0]

In [None]:
y_train_cat[0]

In [None]:
y_train_cat.shape

In [None]:
y_test_cat.shape
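
The cells above turn each integer label into a row of ten values with a single 1 at the label's index. As a rough illustration of that one-hot encoding, here is a minimal NumPy sketch. `one_hot` is a hypothetical helper written for this note, not part of Keras, and it assumes the labels are non-negative integers.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """Hypothetical helper: encode integer labels as one-hot rows."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        # Assumption: classes are numbered 0..max(labels)
        num_classes = labels.max() + 1
    encoded = np.zeros((labels.size, num_classes), dtype='float32')
    # Put a 1 in the column matching each row's label
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([5, 0, 4], num_classes=10)
print(example.shape)  # (3, 10)
print(example[0])
```

Taking `argmax` along the last axis recovers the original integer labels, which is handy when comparing predictions with `y_test`.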

### Fully connected on images

In [None]:
from keras.models import Sequential
from keras.layers import Dense
import keras.backend as K

K.clear_session()

model = Sequential()
model.add(Dense(512, input_dim=28*28, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])


In [None]:
h = model.fit(X_train_sc, y_train_cat, batch_size=128, epochs=10, verbose=1, validation_split=0.3)

In [None]:
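# The metric key is 'acc' in older Keras releases and 'accuracy' in newer ones
acc_key = 'acc' if 'acc' in h.history else 'accuracy'
plt.plot(h.history[acc_key])
plt.plot(h.history['val_' + acc_key])
plt.legend(['Training', 'Validation'])
plt.title('Accuracy')
plt.xlabel('Epochs')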

In [None]:
test_accuracy = model.evaluate(X_test_sc, y_test_cat)[1]
test_accuracy

## Convolutional layers

In [None]:
from keras.layers import Conv2D
from keras.layers import MaxPool2D
from keras.layers import Flatten, Activation

In [None]:
X_train_t = X_train_sc.reshape(-1, 28, 28, 1)
X_test_t = X_test_sc.reshape(-1, 28, 28, 1)

In [None]:
X_train_t.shape
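
The reshape above adds a trailing channel axis because `Conv2D` expects images shaped `(height, width, channels)`. To give some intuition for what a single filter does with one such channel, here is a naive NumPy sketch that slides a 3x3 kernel over a 28x28 image with stride 1 and no padding. `conv2d_valid` is a hypothetical helper for illustration only; it ignores the bias term, handles one channel and one filter, and is far slower than what Keras actually runs.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Hypothetical helper: slide `kernel` over `image`, stride 1, no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype='float32')
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel with the patch under it, then sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Random stand-in for one MNIST digit and an averaging kernel (both assumptions)
image = np.random.rand(28, 28).astype('float32')
kernel = np.ones((3, 3), dtype='float32') / 9.0

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (26, 26)
```

A 3x3 kernel without padding trims one pixel from each border, so a 28x28 input yields a 26x26 feature map. A layer with 32 filters produces 32 such maps.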

In [None]:
K.clear_session()

model = Sequential()

model.add(Conv2D(32, (3, 3), input_shape=(28, 28, 1)))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Activation('relu'))

model.add(Flatten())

model.add(Dense(128, activation='relu'))

model.add(Dense(10, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

In [None]:
model.summary()

In [None]:
model.fit(X_train_t, y_train_cat, batch_size=128,
          epochs=2, verbose=1, validation_split=0.3)

In [None]:
model.evaluate(X_test_t, y_test_cat)

## Exercise

You've been hired by a shipping company to overhaul the way they route mail, parcels and packages. They want to build an image recognition system capable of recognizing the digits in the zip code on a package, so that it can be automatically routed to the correct location.
You are tasked with building the digit recognition system. Luckily, you can rely on the MNIST dataset for the initial training of your model!

Build a deep convolutional neural network with at least two convolutional and two pooling layers before the fully connected layer.

- Start from the network we have just built.
- Insert a `Conv2D` layer with 64 filters after the first `MaxPool2D`.
- Insert a `MaxPool2D` layer after that one.
- Insert an `Activation` layer.
- Retrain the model.
- Does performance improve?
- How many parameters does this new model have? More or fewer than the previous model? Why?
- How long did this second model take to train? Longer or shorter than the previous model? Why?
- Did it perform better or worse than the previous model?
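
One way to sanity-check the parameter question is to do the arithmetic by hand and compare it with `model.summary()`. The sketch below does that in plain Python. The helper names are hypothetical, and it assumes the new `Conv2D` layer uses a 3x3 kernel and the new `MaxPool2D` a 2x2 pool, as in the first block; the exercise does not fix those values.

```python
def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weight matrix plus one bias per unit."""
    return inputs * units + units

def conv_out(size, kernel=3):
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool

# Model built earlier: Conv2D(32) -> MaxPool2D -> Flatten -> Dense(128) -> Dense(10)
side_one = pool_out(conv_out(28))        # 28 -> 26 -> 13
flat_one = side_one * side_one * 32      # 5408 inputs to Dense(128)
params_one = (conv_params(3, 1, 32)
              + dense_params(flat_one, 128)
              + dense_params(128, 10))

# Exercise model: one extra Conv2D(64) -> MaxPool2D block (3x3 kernel assumed)
side_two = pool_out(conv_out(side_one))  # 13 -> 11 -> 5
flat_two = side_two * side_two * 64      # 1600 inputs to Dense(128)
params_two = (conv_params(3, 1, 32)
              + conv_params(3, 32, 64)
              + dense_params(flat_two, 128)
              + dense_params(128, 10))

print(params_one)  # 693962
print(params_two)  # 225034
```

Under these assumptions the deeper model has fewer parameters: the second pooling step shrinks the flattened input from 5408 to 1600 values, and the first `Dense` layer dominates the count in both models.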