# 8.1 - Introduction to Convnets

* [8.1.1 - The convolution operation](#first-bullet)
* [8.1.2 - The max-pooling operation](#second-bullet)

Below is a simple example of a convolutional neural network (convnet). The first `layers.Conv2D` layer learns 32 filters, and each `layers.MaxPooling2D` layer takes the maximum value in each 2 x 2 patch of the feature map.

In [2]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras import layers
from matplotlib import pyplot as plt

inputs = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(filters=32, kernel_size=3, activation='relu')(inputs)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation='relu')(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation='relu')(x)
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs)
model.summary()


Model: "functional_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 3, 3, 128)         73856     
_________________________________________________________________
flatten_1 (Flatten)          (None, 1152)              0         
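
The output shapes in the summary above can be reproduced by hand. The sketch below assumes the Keras defaults used in this model: `Conv2D` with no padding (`'valid'`) and stride 1, and `MaxPooling2D` with non-overlapping windows. The helper names (`conv_output_size`, `pool_output_size`, `trace_shapes`) are illustrative only and are not part of Keras.

```python
def conv_output_size(size, kernel_size):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel_size + 1


def pool_output_size(size, pool_size):
    """Spatial size after non-overlapping pooling (partial windows are dropped)."""
    return size // pool_size


# (layer kind, kernel or pool size, number of filters) for the model above.
layer_specs = [
    ("conv", 3, 32),
    ("pool", 2, None),
    ("conv", 3, 64),
    ("pool", 2, None),
    ("conv", 3, 128),
]


def trace_shapes(input_size, input_channels, specs):
    """Return the (height, width, channels) shape after each layer."""
    size, channels = input_size, input_channels
    shapes = []
    for kind, window, filters in specs:
        if kind == "conv":
            size = conv_output_size(size, window)
            channels = filters  # a conv layer outputs one channel per filter
        else:
            size = pool_output_size(size, window)  # pooling keeps the channel count
        shapes.append((size, size, channels))
    return shapes


shapes = trace_shapes(28, 1, layer_specs)
for (kind, _, _), shape in zip(layer_specs, shapes):
    print(kind, shape)

height, width, channels = shapes[-1]
flattened = height * width * channels
print("flattened:", flattened)
```

Each 3 x 3 convolution trims two pixels from the height and width, and each 2 x 2 pooling step halves them (rounding down), which is how a 28 x 28 input ends up as a 3 x 3 x 128 feature map.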

In [3]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255
model.compile(optimizer='rmsprop',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x1dce215b400>

In [4]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print("Test accuracy: %.3f" % (test_acc,))

Test accuracy: 0.993


At 99.3% test accuracy, this is an improvement over a basic network of stacked `Dense` layers.

## 8.1.1 The convolution operation <a class="anchor" id="first-bullet"></a>

`Dense` layers learn global patterns, whereas `Conv2D` layers learn local patterns. In the MNIST example, a model built only from `Dense` layers considers all the pixels of each image at once, while a `Conv2D` layer looks for local patterns in small windows of the image (3 x 3 here, set by `kernel_size=3`).
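
To make "local patterns" concrete, here is a minimal sketch of a single-channel convolution in plain Python, with no padding and stride 1. The function `conv2d_valid`, the toy image, and the hand-written edge kernel are all assumptions made for illustration; in a real `Conv2D` layer the kernel values are learned during training, and there is one kernel per filter and input channel. Like Keras, this sketch does not flip the kernel (strictly speaking, it computes a cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide a square kernel over a 2D image with stride 1 and no padding."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Each output value depends only on the k x k patch at (i, j).
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy 5 x 5 image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hypothetical hand-written kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

response = conv2d_valid(image, kernel)
for row in response:
    print(row)
```

The response is strong where the 3 x 3 window overlaps the edge and zero where the window sees a uniform region, regardless of where in the image that happens. The same small kernel is reused at every position, which is what lets a convolution layer detect a local pattern anywhere in the input.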

## 8.1.2 The max-pooling operation <a class="anchor" id="second-bullet"></a>

The role of `MaxPooling2D` is to downsample the feature maps created by `Conv2D`. 
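
As a rough illustration of the downsampling, the sketch below applies 2 x 2 max pooling to a small made-up feature map in plain Python. It assumes non-overlapping windows, which matches the `MaxPooling2D` default of a stride equal to `pool_size`. The function name `max_pool2d` and the input values are hypothetical.

```python
def max_pool2d(feature_map, pool_size=2):
    """Keep the maximum of each non-overlapping pool_size x pool_size window."""
    out_h = len(feature_map) // pool_size
    out_w = len(feature_map[0]) // pool_size
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            window = [
                feature_map[i * pool_size + di][j * pool_size + dj]
                for di in range(pool_size)
                for dj in range(pool_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Made-up 4 x 4 feature map (one channel).
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 4],
]

pooled = max_pool2d(feature_map)
for row in pooled:
    print(row)
```

The 4 x 4 map shrinks to 2 x 2, so the next layer processes a quarter as many values per channel while the strongest activation in each window is preserved.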

What would happen if we removed the `MaxPooling2D` layers from our model?

In [5]:
inputs = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(filters=32, kernel_size=3, activation='relu')(inputs)
x = layers.Conv2D(filters=64, kernel_size=3, activation='relu')(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation='relu')(x)
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation='softmax')(x)
model_no_max_pool = keras.Model(inputs=inputs, outputs=outputs)
model_no_max_pool.summary()

Model: "functional_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_3 (InputLayer)         [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 22, 22, 128)       73856     
_________________________________________________________________
flatten_2 (Flatten)          (None, 61952)             0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                619530    
Total params: 712,202
Trainable params: 712,202
Non-trainable params: 0
________________________________________________

The final feature map would have 22 x 22 x 128 = 61,952 coefficients per sample, which is huge! Flattening it and passing it to a `Dense` layer of size 10 requires 619,530 parameters in that layer alone, well over half a million. The model is far too large for such a simple problem and is likely to overfit.
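
The parameter counts can be checked with a little arithmetic. The sketch below assumes the standard counts for layers with a bias term: a `Conv2D` layer has `(kernel_size * kernel_size * in_channels + 1) * filters` parameters and a `Dense` layer has `(in_features + 1) * units`. The helper names are illustrative, and the flattened size of 1,152 for the pooled model comes from its 3 x 3 x 128 final feature map.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(in_features, units):
    """Weights plus one bias per unit."""
    return (in_features + 1) * units


# The three convolution layers are identical in both models.
conv_total = (
    conv2d_params(3, 1, 32)
    + conv2d_params(3, 32, 64)
    + conv2d_params(3, 64, 128)
)

# Flattened feature-map sizes feeding the final Dense(10) layer.
flat_with_pooling = 3 * 3 * 128       # 1,152
flat_without_pooling = 22 * 22 * 128  # 61,952

dense_with_pooling = dense_params(flat_with_pooling, 10)
dense_without_pooling = dense_params(flat_without_pooling, 10)

print("conv layers:           ", conv_total)
print("dense, with pooling:   ", dense_with_pooling)
print("dense, without pooling:", dense_without_pooling)
print("total, with pooling:   ", conv_total + dense_with_pooling)
print("total, without pooling:", conv_total + dense_without_pooling)
```

Under these assumptions the final `Dense` layer has 11,530 parameters with pooling and 619,530 without it, so nearly all of the extra size of the model without pooling sits in that one layer.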