# Basic Convnet for MNIST

### Conv2D Layer

A convnet takes input tensors of shape `(image_height, image_width, image_channels)`, not including the batch dimension. For MNIST, this shape is `(28, 28, 1)`.

```python
keras.layers.Conv2D(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="valid",
    data_format=None,
    dilation_rate=(1, 1),
    groups=1,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)
```

This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If `use_bias` is `True`, a bias vector is created and added to the outputs. Finally, if `activation` is not `None`, it is applied to the outputs as well.

*Parameters*

* `filters`: Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

* `kernel_size`: An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.

* `strides`: An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any `dilation_rate` value != 1.

* `activation`: Activation function to use. If you don't specify anything, no activation is applied (see `keras.activations`).

* `use_bias`: Boolean, whether the layer uses a bias vector.
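
As a rough illustration of how `kernel_size` and `strides` determine the output size under the default `padding="valid"`, the sketch below computes the output length along one spatial dimension. The helper `conv_output_size` is a hypothetical name, not part of Keras, and it assumes `dilation_rate=1`.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """Output length along one spatial dimension for padding="valid".

    Hypothetical helper for illustration; assumes dilation_rate=1.
    """
    return (input_size - kernel_size) // stride + 1


# A 3x3 kernel with stride 1 on a 28x28 MNIST image gives 26x26 feature maps.
height = conv_output_size(28, 3)
width = conv_output_size(28, 3)
print((height, width))  # (26, 26)
```

The number of channels in the output equals `filters`, so a layer with `filters=32` applied to a `(28, 28, 1)` input produces a `(26, 26, 32)` tensor.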

### MaxPooling2D Layer

Downsamples the input along its spatial dimensions (height and width) by taking the maximum value over the window defined by `pool_size`, separately for each channel. The window is shifted by `strides` in each dimension.

```python
keras.layers.MaxPooling2D(
    pool_size=(2, 2), strides=None, padding="valid", data_format=None, **kwargs
)

```
*Parameters*

* `pool_size`: Integer or tuple of 2 integers, the window size over which to take the maximum. `(2, 2)` takes the max value over a 2x2 pooling window. If only one integer is specified, the same window length is used for both dimensions.

* `strides`: Integer, tuple of 2 integers, or `None`. Specifies how far the pooling window moves for each pooling step. If `None`, it defaults to `pool_size`.
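
To make the pooling operation concrete, here is a minimal pure-Python sketch of max pooling on a single-channel 2D input. The function `max_pool_2d` is a hypothetical helper written for illustration, not the Keras implementation; it assumes `padding="valid"` and a square window.

```python
def max_pool_2d(image, pool_size=2, stride=None):
    """Max pooling over a 2D list of numbers with "valid" padding.

    Hypothetical pure-Python helper for illustration only.
    """
    if stride is None:
        stride = pool_size  # mirrors the default: strides falls back to pool_size
    out_h = (len(image) - pool_size) // stride + 1
    out_w = (len(image[0]) - pool_size) // stride + 1
    return [
        [
            max(
                image[r * stride + i][c * stride + j]
                for i in range(pool_size)
                for j in range(pool_size)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool_2d(image)
print(pooled)  # [[6, 8], [14, 16]]
```

With a 2x2 window and stride 2, each spatial dimension is halved (rounding down), which is why a 26x26 feature map becomes 13x13 and an 11x11 map becomes 5x5.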

### Dense Layer

```python
keras.layers.Dense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)
```
A dense (fully connected) layer computes `output = activation(dot(input, kernel) + bias)`. Note that ```use_bias=True``` by default.
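
The computation of a dense layer on a single input vector can be sketched in plain Python as follows. The helpers `dense_forward` and `relu` and the example weights are hypothetical and chosen only for illustration; Keras performs the same arithmetic on batched tensors.

```python
def dense_forward(inputs, kernel, bias=None):
    """Compute dot(inputs, kernel) + bias for a single input vector.

    Hypothetical helper: `kernel` is a list of rows with shape
    (input_dim, units); `bias` has length `units` or is None.
    """
    units = len(kernel[0])
    outputs = [
        sum(inputs[i] * kernel[i][u] for i in range(len(inputs)))
        for u in range(units)
    ]
    if bias is not None:  # corresponds to use_bias=True
        outputs = [o + b for o, b in zip(outputs, bias)]
    return outputs


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


x = [1.0, 2.0]                  # input_dim = 2
kernel = [[1.0, -1.0, 0.5],     # shape (2, 3): units = 3
          [0.0, 1.0, -2.0]]
bias = [0.5, 0.5, 0.5]
print(relu(dense_forward(x, kernel, bias)))  # [1.5, 1.5, 0.0]
```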

In [1]:
from keras import layers
from keras import models

model = models.Sequential()

model.add(layers.Conv2D(filters=32, 
                        kernel_size=(3, 3), 
                        activation='relu', 
                        use_bias=False,
                        input_shape=(28, 28, 1)))

model.add(layers.MaxPooling2D(pool_size=(2, 2)))

model.add(layers.Conv2D(filters=64, 
                        kernel_size=(3, 3), 
                        activation='relu', 
                        use_bias=False))

model.add(layers.MaxPooling2D(pool_size=(2, 2)))

model.add(layers.Conv2D(filters=64, 
                        kernel_size=(3, 3), 
                        activation='relu', 
                        use_bias=False))

# Flatten the 3D feature maps into 1D vectors for the dense layers
model.add(layers.Flatten())
model.add(layers.Dense(units=64, activation='relu'))
model.add(layers.Dense(units=10, activation='softmax'))

In [2]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        288       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18432     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 3, 3, 64)          36864     
_________________________________________________________________
flatten (Flatten)            (None, 576)               0         
_________________________________________________________________
dense (Dense)                (None, 64)                3

### Number of Trainable Parameters

* input layer = 0

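* conv2d layer = ```(kernel_size[0] * kernel_size[1] * number_filters_prev_layer + 1) * number_filters_current_layer```. For the first convolution layer, ```number_filters_prev_layer``` is the number of input channels. The ```+ 1``` accounts for the bias and applies only if ```use_bias=True```; the convolution layers in this notebook set ```use_bias=False```, so the first layer has ```3 * 3 * 1 * 32 = 288``` parameters.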

* pool layer = 0

* dense layer = ```number_neurons_prev_layer * number_neurons_current_layer + 1 * number_neurons_current_layer```. The ```1 * number_neurons_current_layer``` term counts the bias weights, which are present because ```use_bias``` is ```True``` by default.
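
The formulas above can be checked against the model defined in `In [1]` with a short sketch. The function names and the dictionary labels are hypothetical; only the layer sizes come from the notebook.

```python
def conv2d_params(kernel_size, in_channels, filters, use_bias=True):
    """Trainable parameters of a Conv2D layer."""
    kh, kw = kernel_size
    return (kh * kw * in_channels + (1 if use_bias else 0)) * filters


def dense_params(in_units, units, use_bias=True):
    """Trainable parameters of a Dense layer."""
    return in_units * units + (units if use_bias else 0)


# Layer sizes from the model with pooling; the conv layers use use_bias=False.
counts = {
    "conv 1 -> 32": conv2d_params((3, 3), 1, 32, use_bias=False),
    "conv 32 -> 64": conv2d_params((3, 3), 32, 64, use_bias=False),
    "conv 64 -> 64": conv2d_params((3, 3), 64, 64, use_bias=False),
    "dense 576 -> 64": dense_params(576, 64),
    "dense 64 -> 10": dense_params(64, 10),
}
for name, count in counts.items():
    print(f"{name}: {count}")
print("total:", sum(counts.values()))
```

The three convolution counts (288, 18432, 36864) agree with the `Param #` column of the summary above. By the same formulas, the two dense layers contribute 36928 and 650 parameters.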

## Train on MNIST Data
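
Before training, the cell below scales the pixel values from the range 0 to 255 into [0, 1] and one-hot encodes the digit labels so that they match the 10-way softmax output and the `categorical_crossentropy` loss. A minimal pure-Python sketch of these two steps, using the hypothetical helpers `scale_pixels` and `one_hot` rather than the NumPy and Keras calls in the notebook:

```python
def scale_pixels(pixels):
    """Map 8-bit pixel intensities (0-255) to floats in [0, 1]."""
    return [p / 255 for p in pixels]


def one_hot(label, num_classes=10):
    """Return a one-hot vector with a 1.0 at index `label`."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


print(scale_pixels([0, 51, 255]))  # [0.0, 0.2, 1.0]
print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```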

In [3]:
from keras.datasets import mnist
from keras.utils import to_categorical


(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f2224782040>

In [4]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
test_acc



0.9915000200271606

## Model With No Pooling

In [10]:
model_no_max_pool = models.Sequential()
model_no_max_pool.add(layers.Conv2D(filters=32, 
                                    kernel_size=(3, 3), 
                                    use_bias=False,
                                    activation='relu', 
                                    input_shape=(28, 28, 1)))

model_no_max_pool.add(layers.Conv2D(filters=64, 
                                    kernel_size=(3, 3),
                                    use_bias=False,
                                    activation='relu'))

model_no_max_pool.add(layers.Conv2D(filters=64, 
                                    kernel_size=(3, 3),
                                    use_bias=False,
                                    activation='relu'))
# Flatten the 3D feature maps into 1D vectors for the dense layers
model_no_max_pool.add(layers.Flatten())
model_no_max_pool.add(layers.Dense(units=64, activation='relu'))
model_no_max_pool.add(layers.Dense(units=10, activation='softmax'))

In [11]:
model_no_max_pool.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_9 (Conv2D)            (None, 26, 26, 32)        288       
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 24, 24, 64)        18432     
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 22, 22, 64)        36864     
_________________________________________________________________
flatten_1 (Flatten)          (None, 30976)             0         
_________________________________________________________________
dense_2 (Dense)              (None, 64)                1982528   
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650       
Total params: 2,038,762
Trainable params: 2,038,762
Non-trainable params: 0
____________________________________________

In [14]:
model_no_max_pool.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model_no_max_pool.fit(train_images, train_labels, epochs=5, batch_size=64)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f212e9dda60>

In [15]:
test_loss, test_acc = model_no_max_pool.evaluate(test_images, test_labels)
test_acc



0.9896000027656555

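The model without the `MaxPooling2D` layers has about 2 million parameters, almost all of them in the first dense layer (1,982,528), and its test accuracy is slightly lower (0.9896 versus 0.9915), which suggests that it overfits.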
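
The effect of pooling on model size can be traced with a short sketch that follows the spatial size of a 28x28 input through both architectures. The helpers `valid_output` and `flattened_size` and the tuple-based layer descriptions are hypothetical; they assume 3x3 convolutions with stride 1, 2x2 pooling with stride 2, and `padding="valid"`, as in the notebook.

```python
def valid_output(size, window, stride=1):
    """Output length along one dimension for padding="valid"."""
    return (size - window) // stride + 1


def flattened_size(layers, size=28, channels=1):
    """Trace a square input through ("conv", filters) and ("pool",) layers."""
    for layer in layers:
        if layer[0] == "conv":
            size = valid_output(size, 3)        # 3x3 kernel, stride 1
            channels = layer[1]
        else:
            size = valid_output(size, 2, 2)     # 2x2 pool, stride 2
    return size * size * channels


with_pool = [("conv", 32), ("pool",), ("conv", 64), ("pool",), ("conv", 64)]
no_pool = [("conv", 32), ("conv", 64), ("conv", 64)]

print(flattened_size(with_pool))  # 576
print(flattened_size(no_pool))    # 30976
```

Without pooling, the flattened vector is about 54 times larger, so the first dense layer grows from `576 * 64 + 64 = 36928` to `30976 * 64 + 64 = 1982528` parameters.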