# Part 2 - Convolutional and ResNets

## Two parts
A convolutional neural network (CNN) is a type of neural network that can be viewed as consisting
of two parts, a frontend and a backend. The backend is a deep neural network (DNN), which we
have already covered. The name convolutional neural network comes from the frontend, which is
made up of one or more convolutional layers. The frontend acts as a preprocessor for the backend.

## Downsampling (resize)

If we reduce the image resolution too far, at some point we lose the ability to distinguish
clearly what is in the image: it becomes fuzzy and/or has artifacts. So the first step is to
reduce the resolution only down to a level at which we still have enough detail. The common
convention for everyday computer vision is around 224 x 224 pixels.
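
To make the idea of reducing resolution concrete, here is a minimal sketch of nearest-neighbor downsampling in plain Python. The helper name `resize_nearest` and the tiny 4x4 image are illustrative assumptions, not part of the original notebook. In practice an image library's resize routine, usually with interpolation, would be used instead.

```python
def resize_nearest(image, new_height, new_width):
    """Downsample a 2D image (a list of rows) by nearest-neighbor sampling.

    Hypothetical helper for illustration only.
    """
    old_height = len(image)
    old_width = len(image[0])
    resized = []
    for r in range(new_height):
        # Map each output row back to a source row, then do the same per column.
        src_r = r * old_height // new_height
        row = [image[src_r][c * old_width // new_width] for c in range(new_width)]
        resized.append(row)
    return resized


# A tiny 4x4 "image" reduced to 2x2: three quarters of the pixels are discarded.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
small = resize_nearest(image, 2, 2)
print(small)  # [[1, 3], [9, 11]]
```

The same trade-off applies at real sizes: the smaller the target resolution, the less data the network has to process, but the more detail is thrown away.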

## Convolutions and Strides
Typical filter sizes are 3x3 and 5x5, with 3x3 the most common. The number of filters varies
more, but it is typically a multiple of 16, with 16, 32 and 64 the most common. Additionally, one
specifies a stride. The stride is the step by which the filter is slid across the image. With a
3x3 filter and a stride of 3, there would be no overlap between successive filter positions. The
most common practice is to use strides of 1 and 2.
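
The effect of the stride on overlap can be sketched along a single axis. The helper `filter_windows` below is a hypothetical illustration that assumes no padding, so the filter only stops where it fits entirely inside the input.

```python
def filter_windows(input_size, kernel_size, stride):
    """Return the (first, last) pixel index covered at each filter position.

    Illustrative helper; assumes no padding along this axis.
    """
    windows = []
    start = 0
    while start + kernel_size <= input_size:
        windows.append((start, start + kernel_size - 1))
        start += stride
    return windows


# A 3-pixel-wide filter sliding over a 9-pixel-wide row.
stride_1 = filter_windows(9, 3, 1)  # neighbors share two pixels
stride_2 = filter_windows(9, 3, 2)  # neighbors share one pixel
stride_3 = filter_windows(9, 3, 3)  # no overlap at all

print(len(stride_1))  # 7
print(stride_2)       # [(0, 2), (2, 4), (4, 6), (6, 8)]
print(stride_3)       # [(0, 2), (3, 5), (6, 8)]
```

A larger stride therefore means fewer filter positions, and so a smaller feature map.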

The common practice is to keep the same or increase the number of filters on deeper layers, and
to use a stride of 1 on the first layer and 2 on deeper layers. The increase in filters provides
the means to go from coarse detection of features to more detailed detection within coarse
features, while the increase in stride offsets the increase in size of retained data. In short:
more filters => more data, and bigger strides => less data.
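
The trade-off between filters and strides can be checked with a little arithmetic. The sketch below assumes "same" padding, as used in the Keras examples in this notebook, in which case each output dimension is the input dimension divided by the stride, rounded up. The function name `feature_map_volume` is a hypothetical helper.

```python
import math


def feature_map_volume(height, width, filters, stride):
    """Number of values output by one convolutional layer with 'same' padding.

    Illustrative helper, not a Keras API.
    """
    out_height = math.ceil(height / stride)
    out_width = math.ceil(width / stride)
    return out_height * out_width * filters


base = feature_map_volume(128, 128, filters=16, stride=1)
more_filters = feature_map_volume(128, 128, filters=32, stride=1)
bigger_stride = feature_map_volume(128, 128, filters=32, stride=2)

print(base)           # 262144
print(more_filters)   # 524288 (doubling the filters doubles the data)
print(bigger_stride)  # 131072 (a stride of 2 cuts the data to a quarter)
```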

## Pooling

The next step is to reduce the total amount of data, while retaining the features detected and
the corresponding spatial relationships between them. This step is referred to as pooling.
Pooling is the same as downsampling (or sub-sampling), whereby the feature maps are resized to a
smaller dimension using either the maximum (downsampling) or the mean (sub-sampling) of the pixel
values within each pooled area of the feature map. In pooling, we set the size of the area to
pool as an NxM matrix, as well as a stride. The common practice is a 2x2 pool size with a stride
of 2. This results in a 75% reduction in pixel data, while still preserving enough resolution
that the detected features are not lost through pooling.
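
As a rough illustration of the 2x2, stride-2 case described above, here is a minimal max pooling sketch in plain Python. The helper `max_pool_2d` and the sample feature map are assumptions made for the example. In Keras this step is handled by the `MaxPooling2D` layer.

```python
def max_pool_2d(feature_map, pool_size=2, stride=2):
    """Max pool a 2D feature map (a list of rows). Illustrative helper only."""
    rows = len(feature_map)
    cols = len(feature_map[0])
    pooled = []
    for r in range(0, rows - pool_size + 1, stride):
        pooled_row = []
        for c in range(0, cols - pool_size + 1, stride):
            # Collect the values inside the current pooling window.
            window = [feature_map[r + i][c + j]
                      for i in range(pool_size)
                      for j in range(pool_size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 4, 8],
]
pooled = max_pool_2d(feature_map)
print(pooled)  # [[4, 5], [6, 8]]

# 16 values in, 4 values out: a 75% reduction.
values_before = len(feature_map) * len(feature_map[0])
values_after = len(pooled) * len(pooled[0])
reduction = 1 - values_after / values_before
print(reduction)  # 0.75
```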

## Flattening
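Flattening converts the pooled feature maps into a 1D vector, which becomes the input to the DNN.
For example, if we have 16 pooled maps of size 20x20 and three channels per pooled map
(e.g., RGB channels in a color image), our 1D vector size will be 16 x 20 x 20 x 3 = 19,200
elements.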
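
The arithmetic above can be checked directly, and the flattening itself sketched on a toy input. The `flatten` helper and the `tiny_maps` data are hypothetical, for illustration only. In Keras the `Flatten` layer performs this step.

```python
import math


def flatten(nested):
    """Recursively flatten nested lists into a single 1D list."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


# Two tiny 2x2 "pooled maps" flattened into one vector of 8 elements.
tiny_maps = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
tiny_vector = flatten(tiny_maps)
print(tiny_vector)  # [1, 2, 3, 4, 5, 6, 7, 8]

# The size of the flattened vector is the product of all the dimensions.
flattened_size = math.prod((16, 20, 20, 3))
print(flattened_size)  # 19200
```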

# Basic CNN

In [1]:
# Keras's Neural Network components
from keras.models import Sequential
from keras.layers import Dense, ReLU, Activation
# Keras's Convolutional Neural Network components
from keras.layers import Conv2D, MaxPooling2D, Flatten

Using TensorFlow backend.


In [3]:
model = Sequential()

# Frontend
# Create a convolutional layer with 16 3x3 filters and stride of two as the input
# layer
model.add(Conv2D(16, kernel_size=(3, 3), strides=(2, 2), padding="same",
                 input_shape=(128, 128, 1)))
# Pass the output (feature maps) from the input layer (convolution) through a
# rectified linear unit activation function.
model.add(ReLU())
# Add a pooling layer to max pool (downsample) the feature maps into smaller pooled
# feature maps
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

# Backend
# Add a flattening layer to flatten the pooled feature maps to a 1D input vector
# for the DNN classifier
model.add(Flatten())
# Add the input layer for the DNN, which is connected to the flattening layer of
# the convolutional frontend
model.add(Dense(512))
model.add(ReLU())
# Add the output layer for classifying the 26 hand signed letters
model.add(Dense(26))
model.add(Activation('softmax'))
# Use the Categorical Cross Entropy loss function for a Multi-Class Classifier.
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

In [4]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 64, 64, 16)        160       
_________________________________________________________________
re_lu_1 (ReLU)               (None, 64, 64, 16)        0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 32, 32, 16)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 16384)             0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               8389120   
_________________________________________________________________
re_lu_2 (ReLU)               (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 26)                13338     
__________
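
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below uses the standard counting rules: each convolutional filter has one weight per kernel position and input channel plus one bias, and each dense unit has one weight per input plus one bias. The helper names `conv2d_params` and `dense_params` are hypothetical and not part of Keras.

```python
def conv2d_params(kernel_height, kernel_width, in_channels, filters):
    """Weights plus one bias per filter. Illustrative helper only."""
    return (kernel_height * kernel_width * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit. Illustrative helper only."""
    return (inputs + 1) * units


# conv2d_1: 16 filters of 3x3 over a single input channel.
conv_params = conv2d_params(3, 3, 1, 16)
print(conv_params)  # 160

# flatten_1: the 32 x 32 x 16 pooled feature maps become one vector.
flattened = 32 * 32 * 16
print(flattened)  # 16384

# dense_1 and dense_2.
dense_1_params = dense_params(flattened, 512)
dense_2_params = dense_params(512, 26)
print(dense_1_params)  # 8389120
print(dense_2_params)  # 13338
```

Almost all of the parameters sit in the first dense layer, which is one reason the strided convolution and the pooling shrink the feature maps before flattening.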

## With activation specified in the layer classes

In [5]:
# Keras's Neural Network components
from keras.models import Sequential
from keras.layers import Dense
# Keras's Convolutional Neural Network components
from keras.layers import Conv2D, MaxPooling2D, Flatten

model = Sequential()
# Create a convolutional layer with 16 3x3 filters and stride of two as the input
# layer. The ReLU activation is passed as a parameter instead of a separate layer.
model.add(Conv2D(16, kernel_size=(3, 3), strides=(2, 2), padding="same",
                 activation='relu', input_shape=(128, 128, 1)))
# Add a pooling layer to max pool (downsample) the feature maps into smaller pooled
# feature maps
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
# Add a flattening layer to flatten the pooled feature maps to a 1D input vector
# for the DNN
model.add(Flatten())
# Create the input layer for the DNN, which is connected to the flattening layer of
# the convolutional frontend
model.add(Dense(512, activation='relu'))
# Add the output layer for classifying the 26 hand signed letters
model.add(Dense(26, activation='softmax'))
# Use the Categorical Cross Entropy loss function for a Multi-Class Classifier.
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

## Functional method

In [6]:
from keras import Input, Model
from keras.layers import Dense
from keras.layers import Conv2D, MaxPooling2D, Flatten

# Create the input tensor (128 x 128 pixels, 1 channel).
inputs = Input(shape=(128, 128, 1))
# Frontend: convolution with ReLU activation, followed by max pooling.
layer = Conv2D(16, kernel_size=(3, 3), strides=(2, 2), padding="same",
               activation='relu')(inputs)
layer = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(layer)
# Backend: flatten the pooled feature maps and pass them to the DNN classifier.
layer = Flatten()(layer)
layer = Dense(512, activation='relu')(layer)
output = Dense(26, activation='softmax')(layer)
# Now let's create the neural network, specifying the input layer and output layer.
model = Model(inputs, output)