# 5.1 Introduction to Convnets

## Convnets: convolutional neural networks

### We first look at the MNIST example using a basic convnet, a stack of *Conv2D* and *MaxPooling2D* layers. Even though the convnet is basic, its accuracy will outperform that of the densely connected model from chapter 2.

In [1]:
# Instantiating a small convnet

from keras import layers
from keras import models

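model = models.Sequential()
# First convolution block: 32 filters of size 3x3 on 28x28 single-channel images
model.add(layers.Conv2D(32, (3, 3), activation = 'relu', input_shape = (28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
# Second convolution block: 64 filters of size 3x3, followed by 2x2 max pooling
model.add(layers.Conv2D(64, (3, 3), activation = 'relu'))
model.add(layers.MaxPooling2D((2, 2)))
# Final convolution layer: 64 filters of size 3x3, no pooling afterwards
model.add(layers.Conv2D(64, (3, 3), activation = 'relu'))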

Using TensorFlow backend.




### A convnet takes as input tensors of shape *(image_height, image_width, image_channels)*, not including the batch dimension. In this case, we configure the convnet to process inputs of shape (28, 28, 1), which is the format of MNIST images.
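
As a rough illustration of this channels-last layout, the sketch below uses plain nested lists rather than Keras tensors to show how a batch of 28 × 28 grayscale images gains a trailing channel axis. The helper names `shape_of` and `add_channel_axis` are hypothetical and exist only for this example.

```python
def shape_of(nested):
    """Return the shape of a regular (non-ragged, non-empty) nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


def add_channel_axis(images):
    """Wrap every pixel in a one-element list: (N, H, W) -> (N, H, W, 1)."""
    return [[[[pixel] for pixel in row] for row in image] for image in images]


# Two dummy 28 x 28 grayscale images filled with zeros.
batch = [[[0.0] * 28 for _ in range(28)] for _ in range(2)]
batch_with_channels = add_channel_axis(batch)

print(shape_of(batch))                # (2, 28, 28)
print(shape_of(batch_with_channels))  # (2, 28, 28, 1)
```

The leading 2 is the batch dimension, which is why it does not appear in `input_shape`.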

### Let's display the architecture of the convnet so far:

In [2]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 3, 3, 64)          36928     
Total params: 55,744
Trainable params: 55,744
Non-trainable params: 0
_________________________________________________________________
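
The parameter counts in the summary can be checked by hand. Each Conv2D filter has one weight per kernel position per input channel, plus one bias. The sketch below recomputes the three counts; the helper name `conv2d_params` is made up for this example.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Trainable parameters of a Conv2D layer with a bias term."""
    weights_per_filter = kernel_h * kernel_w * in_channels
    return (weights_per_filter + 1) * filters  # +1 for each filter's bias


conv1 = conv2d_params(3, 3, 1, 32)   # input has 1 channel
conv2 = conv2d_params(3, 3, 32, 64)  # input has 32 channels
conv3 = conv2d_params(3, 3, 64, 64)  # input has 64 channels
total = conv1 + conv2 + conv3

print(conv1, conv2, conv3, total)  # 320 18496 36928 55744
```

The MaxPooling2D layers have no trainable parameters, so they add nothing to the total.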


### Here the output of every Conv2D and MaxPooling2D layer is a 3D tensor of shape (height, width, channels). The width and height dimensions tend to shrink as we go deeper in the network, and the number of channels is controlled by the first argument passed to the Conv2D layers (32 or 64).
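
The shrinking spatial size can be traced with simple arithmetic. This is a minimal sketch assuming the Keras defaults used in this model: no padding and stride 1 for Conv2D, and non-overlapping windows for MaxPooling2D. The function names are hypothetical.

```python
def conv_output_size(size, kernel):
    """Output size of a convolution with no padding and stride 1."""
    return size - kernel + 1


def pool_output_size(size, pool):
    """Output size of non-overlapping pooling; incomplete windows are dropped."""
    return size // pool


size = 28
sizes = []
for kind, window in [("conv", 3), ("pool", 2), ("conv", 3), ("pool", 2), ("conv", 3)]:
    if kind == "conv":
        size = conv_output_size(size, window)
    else:
        size = pool_output_size(size, window)
    sizes.append(size)

print(sizes)  # [26, 13, 11, 5, 3]
```

These values match the height and width columns of the summary: 26, 13, 11, 5, and 3.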

### The next step is to feed the last output tensor, of shape (3, 3, 64), into a densely connected classifier network. Such classifiers process 1D vectors, whereas the current output is a 3D tensor. So we first need to flatten the 3D outputs to 1D and then add a few Dense layers on top.

In [3]:
model.add(layers.Flatten())
model.add(layers.Dense(64, activation = 'relu'))
model.add(layers.Dense(10, activation = 'softmax'))

### We will do a 10-way classification, using a final layer with 10 outputs and a softmax activation. Here's what the network looks like now:
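
To make the role of the softmax activation concrete, here is a small pure-Python sketch, not the Keras implementation. It turns 10 raw scores into probabilities that sum to 1, and the predicted class is the index of the largest one. The scores are made-up numbers chosen for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)
    # Subtracting the maximum does not change the result but avoids overflow.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of the last layer for one image, one per digit 0-9.
scores = [0.5, 1.0, -2.0, 3.0, 0.0, 0.1, -1.0, 2.0, 0.3, -0.5]
probs = softmax(scores)
predicted_digit = probs.index(max(probs))

print(round(sum(probs), 6))  # 1.0
print(predicted_digit)       # 3
```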

In [4]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten_1 (Flatten)          (None, 576)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)               

### As you can see, the (3, 3, 64) outputs are flattened into vectors of shape (576,), since 3 × 3 × 64 = 576, before going through the two Dense layers.
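
The flattening step and the size of the Dense layers can be sketched in plain Python. The summary output above is cut off after `dense_1`, so the Dense parameter counts below are computed from the layer definitions rather than copied from that output. The helpers `flatten` and `dense_params` are hypothetical names for this example.

```python
def flatten(feature_map):
    """Flatten a (height, width, channels) nested list in row-major order."""
    return [value for row in feature_map for pixel in row for value in pixel]


def dense_params(inputs, units):
    """Trainable parameters of a Dense layer: one weight per input plus a bias, per unit."""
    return (inputs + 1) * units


# A dummy (3, 3, 64) feature map whose values equal their flattened index.
feature_map = [[[(h * 3 + w) * 64 + c for c in range(64)]
                for w in range(3)]
               for h in range(3)]
vector = flatten(feature_map)

print(len(vector))            # 576
print(dense_params(576, 64))  # 36928
print(dense_params(64, 10))   # 650
```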

### Now let's train the convnet on the MNIST digits to see how it performs.

In [5]:
from keras.datasets import mnist
from keras.utils import to_categorical

# Load the 60,000 training and 10,000 test images with their integer labels
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Add the channel dimension and scale pixel values from [0, 255] to [0, 1]
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

# One-hot encode the integer labels to match the 10-way softmax output
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

model.compile(optimizer = 'rmsprop',
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])
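
To show what the label encoding and the loss do, here is a pure-Python sketch of one-hot encoding and categorical cross-entropy for a single sample. It is not the Keras implementation. The function names, the clipping value `eps`, and the example predictions are assumptions made for illustration.

```python
import math


def one_hot(label, num_classes=10):
    """Encode an integer label as a vector with a single 1.0 entry."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


def categorical_crossentropy(target, predicted, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # eps guards against log(0); its value here is an assumption of this sketch.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


target = one_hot(3)

# A confident, correct prediction versus a uniform guess over the 10 digits.
confident = [0.01] * 10
confident[3] = 0.91
uniform = [0.1] * 10

loss_confident = categorical_crossentropy(target, confident)
loss_uniform = categorical_crossentropy(target, uniform)

print(target)
print(loss_confident < loss_uniform)  # True
```

The loss for a sample is the negative log of the probability assigned to the true class, so a confident correct prediction yields a lower loss than a uniform guess.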

In [None]:
model.fit(train_images, train_labels, epochs = 5, batch_size = 64)
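
As a quick sanity check on what this call does, the sketch below works out how many weight updates it performs. It assumes the 60,000 training samples loaded earlier and that the last, smaller batch of each epoch is also used.

```python
import math

num_samples = 60000
batch_size = 64
epochs = 5

# Batches per epoch; the final batch holds the leftover samples.
steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size
total_steps = steps_per_epoch * epochs

print(steps_per_epoch)  # 938
print(last_batch_size)  # 32
print(total_steps)      # 4690
```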