# Section 5.1 of the book "Deep Learning with Python" by François Chollet

  1. CONVOLUTION 
  2. MAX POOLING  
  3. FEATURE MAPS

### 1. Building a Convolutional Network

   A convnet takes an input tensor of shape (image_height, image_width, number_channel). \
   MNIST images have shape (28, 28), so we reshape them to (28, 28, 1). \
   The output of every Conv2D and MaxPooling2D layer is a 3D tensor of shape (height, width, channels). \
   The number of output channels is controlled by the number of filters in the Conv2D layer.

In [15]:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import models

# Convolutional base: Conv2D and MaxPooling2D layers alternate.
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Classifier head: flatten the (3, 3, 64) output into 576 values,
# then classify into the 10 digit classes.
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_18 (Conv2D)           (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_20 (Conv2D)           (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten_3 (Flatten)          (None, 576)               0         
_________________________________________________________________
dense_6 (Dense)              (None, 64)               
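
The output shapes and parameter counts of the convolutional layers in the summary above can be reproduced by hand. The sketch below is a minimal, dependency-free illustration; the helper names (`conv2d_output`, `maxpool_output`, `conv2d_params`) are hypothetical and not part of Keras. It assumes the Keras defaults used in this model: no padding, convolution stride 1, and pooling stride equal to the pool size.

```python
def conv2d_output(size, kernel=3):
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel + 1


def maxpool_output(size, pool=2):
    """Spatial size after pooling with stride equal to the pool size."""
    return size // pool


def conv2d_params(in_channels, filters, kernel=3):
    """kernel * kernel * in_channels weights plus one bias, per filter."""
    return (kernel * kernel * in_channels + 1) * filters


# (layer kind, number of filters); pooling layers have no filters.
architecture = [("conv", 32), ("pool", None), ("conv", 64),
                ("pool", None), ("conv", 64)]

size, channels = 28, 1
trace = []  # one (kind, output shape, parameter count) entry per layer
for kind, filters in architecture:
    if kind == "conv":
        params = conv2d_params(channels, filters)
        size, channels = conv2d_output(size), filters
    else:
        params = 0
        size = maxpool_output(size)
    trace.append((kind, (size, size, channels), params))

flattened = size * size * channels

for entry in trace:
    print(entry)
print("flattened size:", flattened)
```

Each printed shape and parameter count should match the corresponding Conv2D or MaxPooling2D row of the summary, and the flattened size matches the `(None, 576)` output of the Flatten layer.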

### 2. Training the ConvNet on MNIST Images.

    Dense layers learn global patterns.
    Convolution layers learn local patterns.
    
    Key characteristics of convolutional layers:
    1. The patterns they learn are translation-invariant.
        After learning a pattern at one location, the conv layer can recognize
        the same pattern anywhere in the image.
    2. They can learn spatial hierarchies of patterns.
        The first layers learn small local patterns such as edges and textures.
        Higher layers combine these into larger patterns such as the shape of an ear or an eye.
        
    A conv layer operates over 3D tensors called 'FEATURE' maps,
    with two 'SPATIAL' axes (height and width) and one 'CHANNEL' (depth) axis.
    
    A convolution takes a feature map and produces a new feature map whose
    channel axis has one entry per filter in the convolutional layer.
    
    RESPONSE MAP : 2D map of the presence of a pattern at different locations of the input.
    FEATURE MAP : the 3D output; every dimension in its depth axis is one feature (or filter).
    The 2D tensor output[:, :, n] is the 2D SPATIAL MAP of the response of filter n over the input.
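
To make the idea of a response map and of translation invariance concrete, here is a minimal pure-Python sketch. It is an illustration under simplifying assumptions, not how Keras implements Conv2D: it handles a single channel and a single hand-written filter, with no bias and no activation. The helpers `response_map`, `stamp`, and `peak`, and the `diagonal` pattern, are hypothetical names introduced only for this example.

```python
def response_map(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the
    2D map of dot products, one value per window position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def stamp(height, width, pattern, top, left):
    """Return a blank image with `pattern` copied in at (top, left)."""
    image = [[0] * width for _ in range(height)]
    for di, row in enumerate(pattern):
        for dj, value in enumerate(row):
            image[top + di][left + dj] = value
    return image


def peak(rmap):
    """Return (strongest response, its (row, column) position)."""
    best = max(max(row) for row in rmap)
    for i, row in enumerate(rmap):
        for j, value in enumerate(row):
            if value == best:
                return best, (i, j)


# A small diagonal stroke, used both as image content and as the filter.
diagonal = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]

image_a = stamp(6, 6, diagonal, top=0, left=0)  # stroke in the top-left
image_b = stamp(6, 6, diagonal, top=3, left=2)  # same stroke, shifted

peak_a = peak(response_map(image_a, diagonal))
peak_b = peak(response_map(image_b, diagonal))
print(peak_a, peak_b)
```

The same filter gives the same peak response for both images; only the position of the peak in the response map moves with the pattern. A real conv layer stacks one such map per filter along the channel axis, which is what `output[:, :, n]` selects.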

In [16]:
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

print(train_images.shape)
print(train_labels.shape)
print(test_images.shape)
print(test_labels.shape)

# Add the channel axis and scale pixel values from [0, 255] to [0, 1].
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

# One-hot encode the integer labels to match categorical_crossentropy.
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64)

(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f86ee12b430>

### 3. Max Pooling Operation

    Max pooling consists of extracting windows from the input feature maps and outputting
    the max value of each channel within each window.
    Max pooling is usually done with 2x2 windows and stride 2, whereas convolution is typically done with 3x3 windows and stride 1.
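
The operation described above can be sketched in a few lines of plain Python. This is a minimal illustration for a single channel, assuming the stride equals the window size and that leftover rows and columns are dropped (which is consistent with an 11x11 map shrinking to 5x5 in the earlier summary). `max_pool_2d` is a hypothetical helper, not a Keras function.

```python
def max_pool_2d(channel, pool=2):
    """Max-pool one 2D channel with stride equal to the window size.
    Rows and columns that do not fill a complete window are dropped."""
    out_h = len(channel) // pool
    out_w = len(channel[0]) // pool
    return [
        [
            max(channel[i * pool + di][j * pool + dj]
                for di in range(pool) for dj in range(pool))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A 5x5 channel: the last row and column do not fit a 2x2 window.
channel = [
    [1, 3, 2, 0, 9],
    [4, 2, 1, 1, 9],
    [0, 0, 5, 6, 9],
    [7, 1, 2, 8, 9],
    [9, 9, 9, 9, 9],
]

pooled = max_pool_2d(channel)
print(pooled)  # [[4, 2], [7, 8]]
```

In a real feature map the same operation is applied to every channel independently, so the number of channels is unchanged while height and width are roughly halved.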
    
    Consider what happens if we remove the MaxPooling layers, as in the model shown below.
    
    This model has two problems:
    
         1.  A 3x3 window in the third layer only contains information from a 7x7 window of the input image.
             The high-level features therefore cover too small a part of the 28x28 digit, so the network
             does not build the spatial hierarchy of features that is the advantage of convolutional layers.
         2.  The final feature map has 22 * 22 * 64 = 30,976 coefficients per sample.
             Flattening it and adding a Dense layer of size 512 gives that layer 15,860,224 parameters.
             This is far too large for such a small model and would lead to intense overfitting.
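
Both problems listed above can be checked with a little arithmetic. The sketch below uses the standard receptive-field recurrence for stacked layers (each layer widens the field by `(window - 1)` times the accumulated stride). The helper `receptive_field` is a hypothetical name, and the 18x18 figure for the pooled model is derived here rather than taken from the text.

```python
def receptive_field(layers):
    """Side length, in input pixels, of the region seen by one unit of
    the last layer. `layers` is a list of (window_size, stride) pairs."""
    field, jump = 1, 1
    for window, stride in layers:
        field += (window - 1) * jump  # widen by the window, in input pixels
        jump *= stride                # distance between neighbouring units
    return field


CONV = (3, 1)  # 3x3 convolution, stride 1
POOL = (2, 2)  # 2x2 max pooling, stride 2

# One output unit of the last conv layer, i.e. what a 1x1 position sees.
# A 3x3 window in the third layer of the model without pooling spans the
# same region as one unit after three stacked convolutions: 7x7 pixels.
without_pooling = receptive_field([CONV, CONV, CONV])
with_pooling = receptive_field([CONV, POOL, CONV, POOL, CONV])

# Size of the flattened final feature map and of the Dense(512) layer
# that follows it in the model without pooling.
flattened = 22 * 22 * 64
dense_512_params = (flattened + 1) * 512  # weights plus one bias per unit

print(without_pooling, with_pooling)
print(flattened, dense_512_params)
```

Under these assumptions, interleaving pooling layers lets the last convolution see an 18x18 region of the 28x28 input instead of 7x7, while also shrinking the feature map that the dense layer has to consume.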

In [17]:
# Same three Conv2D layers as before, but without the MaxPooling2D layers in between.
model_no_max_pool = models.Sequential()

model_no_max_pool.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model_no_max_pool.add(layers.Conv2D(64, (3, 3), activation='relu'))
model_no_max_pool.add(layers.Conv2D(64, (3, 3), activation='relu'))

model_no_max_pool.add(layers.Flatten())
model_no_max_pool.add(layers.Dense(512, activation='relu'))
model_no_max_pool.add(layers.Dense(10, activation='softmax'))


model_no_max_pool.summary()

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_21 (Conv2D)           (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_22 (Conv2D)           (None, 24, 24, 64)        18496     
_________________________________________________________________
conv2d_23 (Conv2D)           (None, 22, 22, 64)        36928     
_________________________________________________________________
flatten_4 (Flatten)          (None, 30976)             0         
_________________________________________________________________
dense_8 (Dense)              (None, 512)               15860224  
_________________________________________________________________
dense_9 (Dense)              (None, 10)                5130      
Total params: 15,921,098
Trainable params: 15,921,098
Non-trainable params: 0
__________________________________________