## ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) Convolutional Neural Networks I

### LEARNING OBJECTIVES
_By the end of this lesson, students should be able to:_
- Build convolutional neural networks in Keras.

We'll build a convolutional neural network very similar to the example provided at the end of the notes.

In [1]:
# 1. Import libraries and modules
import numpy as np
#np.random.seed(123)  # for reproducibility

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras.datasets import mnist
 
# 2. Load pre-shuffled MNIST data into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

Using TensorFlow backend.



In [2]:
# Reshape input data to (samples, height, width, channels) - 1 channel because greyscale (3 if RGB)
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

# Typecast as float
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

# Rescale range from 0-255 to 0-1
X_train /= 255
X_test /= 255
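
To see what the reshape and rescale steps do without downloading MNIST, here is a minimal sketch on a small synthetic array. The array `fake_images` and its size are made up for illustration; only the resulting shape, dtype, and value range matter.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 5 greyscale images, 28x28, integer pixels 0-255
rng = np.random.RandomState(0)
fake_images = rng.randint(0, 256, size=(5, 28, 28)).astype('uint8')

# Add a trailing channel axis so each image is (height, width, channels)
x = fake_images.reshape(fake_images.shape[0], 28, 28, 1)

# Cast to float first: in-place division on an integer array raises an error
x = x.astype('float32')
x /= 255

print(x.shape)
print(x.dtype)
print(x.min(), x.max())
```

The typecast has to come before the division, which is why the cell above converts to `float32` before rescaling.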

In [3]:
# Make y categorical
Y_train = np_utils.to_categorical(y_train, 10)
Y_test = np_utils.to_categorical(y_test, 10)
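
Making `y` categorical means one-hot encoding: each digit label becomes a length-10 vector with a single 1. The helper `one_hot` below is a hypothetical NumPy sketch of that idea for illustration, not the Keras function itself.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a 1 in each label's column."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype='float32')
    # Row i gets a 1 in column labels[i]
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

example = one_hot([3, 0, 9], 10)
print(example[0])  # the single 1 sits at index 3
```

Taking `argmax` along each row recovers the original integer labels, which is how predictions from the softmax layer are usually turned back into digits.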

In [4]:
# Define model architecture
model = Sequential()

model.add(Convolution2D(6, #6 filters
                        3, # kernel size (3x3 filter)
                        activation='relu',
                        input_shape=(28, 28, 1)
                        ))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Convolution2D(16,
                        3,
                        activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))
model.add(Flatten()) # 5 x 5 x 16 = 400 outputs
model.add(Dense(128, activation='relu')) # connect 400 nodes to 128 nodes
# total parameters = 400 * 128 weights + 128 biases = 51,328
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
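
The "400 outputs" after flattening can be traced by hand. The sketch below assumes the Keras defaults used in the cell above: convolutions with no padding and stride 1, and pooling windows that do not overlap. The helper names `conv_output` and `pool_output` are hypothetical.

```python
def conv_output(size, kernel):
    # Unpadded convolution with stride 1: the kernel must fit entirely inside the image
    return size - kernel + 1

def pool_output(size, pool):
    # Non-overlapping pooling drops any leftover rows/columns
    return size // pool

size = 28
size = conv_output(size, 3)   # first 3x3 convolution
size = pool_output(size, 2)   # first 2x2 max pool
size = conv_output(size, 3)   # second 3x3 convolution
size = pool_output(size, 2)   # second 2x2 max pool

# 16 filters in the second convolutional layer give 16 feature maps
flattened = size * size * 16
print(size, flattened)
```

The odd 11 x 11 feature map is why the second pooling step yields 5 x 5 rather than 5.5 x 5.5: the last row and column are dropped.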

In [5]:
# Compile model
model.compile(optimizer='adam',
              loss='categorical_crossentropy', # good for unordered discrete predictions
              metrics=['accuracy'])
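
As a rough illustration of what the categorical cross-entropy loss measures, here is a minimal NumPy sketch. The function below is a simplified stand-in written for this example, not Keras's implementation, and the probability vectors are made up.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    # Clip so that log(0) never occurs
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# One sample whose true class is index 1 (one-hot encoded)
y_true = np.array([[0.0, 1.0, 0.0]])
confident = np.array([[0.05, 0.90, 0.05]])  # most probability on the right class
unsure = np.array([[0.40, 0.30, 0.30]])     # probability spread across classes

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

Because the target is one-hot, only the predicted probability of the true class contributes, so the loss shrinks as the model puts more probability on the correct digit.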

In [6]:
# Fit model
model.fit(X_train, Y_train, batch_size=32, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0xb1fe96588>

In [7]:
# Evaluate model

score = model.evaluate(X_test, Y_test, verbose=1)
labels = model.metrics_names



In [8]:
for label, value in zip(labels, score):
    print('{}: {}'.format(label, value))

loss: 0.03374472677594531
acc: 0.9899


In [9]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 26, 26, 6)         60        
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 6)         0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 11, 11, 16)        880       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 5, 5, 16)          0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 5, 5, 16)          0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 400)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               51328     
__________

For `conv2d_1` (one input channel):

- 3x3 filters --> 9 parameters per filter
- 6 filters --> 6 x 9 = 54 parameters across all filters
- 6 filters x 1 bias parameter per filter --> 6 bias parameters
    - Total number of parameters: 54 + 6 = 60


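For `conv2d_2` (six input channels from the previous layer):

- 3x3 filters --> 9 parameters per filter per input channel
- 16 filters --> 16 x 9 = 144 parameters per input channel
- 144 parameters x 6 channels from the previous layer --> 864 parameters
- 16 filters x 1 bias parameter per filter --> 16 bias parameters
    - Total number of parameters: 864 + 16 = 880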
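
The same arithmetic can be written as two small functions and checked against the `Param #` column of the summary. The helper names `conv2d_params` and `dense_params` are hypothetical, introduced only for this sketch.

```python
def conv2d_params(kernel_size, in_channels, filters):
    # Each filter has kernel_size x kernel_size weights per input channel, plus one bias
    return (kernel_size * kernel_size * in_channels + 1) * filters

def dense_params(inputs, units):
    # Each unit has one weight per input, plus one bias
    return (inputs + 1) * units

conv1 = conv2d_params(3, in_channels=1, filters=6)
conv2 = conv2d_params(3, in_channels=6, filters=16)
dense1 = dense_params(400, 128)
print(conv1, conv2, dense1)  # 60 880 51328
```

Pooling, dropout, and flatten layers have no weights, which is why they show 0 parameters in the summary.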

## Conclusion

<details><summary>Why are neural networks better equipped to handle image data than non-neural networks?
</summary>
```
Neural networks are naturally set up to consider interactions among features.
```
</details>

<details><summary>Why are **convolutional neural networks** better equipped to handle image data than non-CNNs?
</summary>
```
CNNs are naturally set up to consider interactions among "close pixels" only, and they drastically cut down the number of parameters to learn through parameter sharing.
```
</details>
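
To put a number on the parameter-sharing point, the sketch below compares the first convolutional layer of this lesson's model with a hypothetical fully connected layer that maps the same 28 x 28 input to the same 26 x 26 x 6 output. The dense layer is an assumption made purely for comparison; it is not part of the model.

```python
# First convolutional layer: six 3x3 filters on one input channel, one bias each
conv_params = (3 * 3 * 1 + 1) * 6

# Hypothetical dense layer with the same input and output sizes
inputs = 28 * 28 * 1
outputs = 26 * 26 * 6
dense_params = inputs * outputs + outputs  # one weight per input-output pair, plus biases

print(conv_params)
print(dense_params)
print(dense_params // conv_params)  # how many times more parameters the dense layer needs
```

The convolutional layer reuses the same 9 weights per filter at every position in the image, whereas the dense layer learns a separate weight for every pixel-output pair.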

