***In this notebook, I build a CNN model with Conv2D, MaxPooling, Flatten and Dense layers. I use the Adam optimizer and accuracy as the evaluation metric.
The model is trained on the Fashion MNIST data set, and I experiment with a few hyperparameters, such as the number of filters, the MaxPooling layers and the dropout layers, to see how they affect accuracy on the training and validation data.
Please go through the notebook to see how the loss and accuracy metrics change as these hyperparameters change.***


In [12]:
import tensorflow as tf
from tensorflow import keras
from functools import partial

In [13]:
dataset= keras.datasets.fashion_mnist

In [14]:
(X_train_full, y_train_full), (X_test, y_test)= dataset.load_data()

In [15]:
X_train_full.shape

(60000, 28, 28)

In [16]:
X_valid, X_train= X_train_full[:5000]/255, X_train_full[5000:]/255
y_valid, y_train= y_train_full[:5000], y_train_full[5000:]

In [17]:
X_valid.shape

(5000, 28, 28)

In [18]:
y_valid.shape

(5000,)
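The arrays above have shape (N, 28, 28), while the Conv2D layers in this notebook declare input_shape=[28,28,1]. Some Keras versions appear to add the missing channel axis implicitly, but I am not certain that holds for every release. A minimal sketch of making the channel axis explicit, assuming NumPy is available and using a small placeholder array (`images` is a hypothetical name, not a variable from this notebook):

```python
import numpy as np

# Placeholder batch with the same layout as X_train: (samples, height, width)
images = np.zeros((5, 28, 28), dtype=np.float32)

# Append a trailing channel axis so each image is (28, 28, 1)
images_with_channel = images[..., np.newaxis]

print(images_with_channel.shape)  # (5, 28, 28, 1)
```

The same indexing could be applied to X_train, X_valid and X_test before calling fit.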

***We use functools.partial to define a Conv2D wrapper with default arguments (kernel size 3, ReLU activation, 'SAME' padding). In the first model, a MaxPooling layer follows the first convolution layer and the first pair of convolution layers, and the number of filters increases after each pooling layer (32, then 128, then 256).***

In [19]:
# Conv2D with shared defaults; individual layers can still override any of them
Convl2d= partial(keras.layers.Conv2D, kernel_size=3, activation='relu', padding='SAME')
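To illustrate how the defaults set through partial behave, here is a small sketch that does not need TensorFlow. `make_layer` and `conv_defaults` are hypothetical stand-ins for keras.layers.Conv2D and the Convl2d helper; the function only records the arguments it receives.

```python
from functools import partial

# Hypothetical stand-in for a layer constructor: it just records its arguments.
def make_layer(filters, kernel_size=1, activation=None, padding="valid"):
    return {"filters": filters, "kernel_size": kernel_size,
            "activation": activation, "padding": padding}

# Same defaults as the Convl2d helper in this notebook.
conv_defaults = partial(make_layer, kernel_size=3, activation="relu", padding="SAME")

first = conv_defaults(filters=32, kernel_size=7)  # explicit kernel_size overrides the default
later = conv_defaults(filters=128)                # falls back to kernel_size=3

print(first["kernel_size"], later["kernel_size"])  # 7 3
```

This is why the first layer of each model can pass kernel_size=7 while the remaining layers only specify the number of filters.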

In [20]:
model= keras.models.Sequential([
    Convl2d(filters=32, kernel_size=7, input_shape=[28,28,1]),
    keras.layers.MaxPooling2D(pool_size=2),
    Convl2d(filters=128),
    Convl2d(filters=128),
    keras.layers.MaxPool2D(pool_size=2),
    Convl2d(filters=256),
    Convl2d(filters=256),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10, activation="softmax")
])

In [21]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_5 (Conv2D)           (None, 28, 28, 32)        1600      
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 14, 14, 32)       0         
 2D)                                                             
                                                                 
 conv2d_6 (Conv2D)           (None, 14, 14, 128)       36992     
                                                                 
 conv2d_7 (Conv2D)           (None, 14, 14, 128)       147584    
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 7, 7, 128)        0         
 2D)                                                             
                                                                 
 conv2d_8 (Conv2D)           (None, 7, 7, 256)        
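The parameter counts in the summary above can be checked by hand: each filter has kernel_size × kernel_size × input_channels weights plus one bias. A short sketch of that arithmetic (`conv2d_params` is a helper name introduced here for illustration):

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights per filter (k * k * in_channels) plus one bias, times the number of filters."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

print(conv2d_params(7, 1, 32))     # 1600   (first layer, 7x7 kernel, 1 input channel)
print(conv2d_params(3, 32, 128))   # 36992  (3x3 kernel, 32 input channels)
print(conv2d_params(3, 128, 128))  # 147584 (3x3 kernel, 128 input channels)
```

These match the values reported for conv2d_5, conv2d_6 and conv2d_7.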

In [22]:
loss_fun= keras.losses.SparseCategoricalCrossentropy()
optimizer_fun= keras.optimizers.Adam()
model.compile(loss=loss_fun,
              optimizer=optimizer_fun,
              metrics=["accuracy"])

In [23]:
model.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<keras.callbacks.History at 0x7f482a1de190>

***Training accuracy = 98%***     ***Validation accuracy = 91%***

***The gap between training and validation accuracy suggests that this model is overfitting. To reduce the overfitting, we can add a dropout layer with a dropout rate of 0.5 (50%) after each hidden Dense layer and check how the model behaves.***
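As a rough picture of what a dropout layer does during training, here is a plain-Python sketch of inverted dropout: each activation is zeroed with probability equal to the rate, and the survivors are scaled by 1 / (1 - rate) so the expected sum stays the same. This is an illustration under those assumptions, not the Keras implementation; the `dropout` function and its `seed` argument are introduced here.

```python
import random

def dropout(values, rate, training=True, seed=None):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not training or rate == 0:
        return list(values)  # at inference time the layer passes values through
    rng = random.Random(seed)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

activations = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(activations, rate=0.5, seed=0)
unchanged = dropout(activations, rate=0.5, training=False)
print(dropped)
print(unchanged)
```

With a rate of 0.5, each surviving activation is doubled, and about half of them are set to zero on any given training step.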



In [24]:
model_1= keras.models.Sequential([
    Convl2d(filters=32, kernel_size=7, input_shape=[28,28,1]),
    keras.layers.MaxPooling2D(pool_size=2),
    Convl2d(filters=128),
    Convl2d(filters=128),
    keras.layers.MaxPool2D(pool_size=2),
    Convl2d(filters=256),
    Convl2d(filters=256),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.5),   
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation="softmax")
])

loss_fun= keras.losses.SparseCategoricalCrossentropy()
optimizer_fun= keras.optimizers.Adam()
model_1.compile(loss=loss_fun,
              optimizer=optimizer_fun,
              metrics=["accuracy"])

model_1.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<keras.callbacks.History at 0x7f48137ca790>

***Training accuracy = 95%***      ***Validation accuracy = 91%***

***After adding 2 dropout layers, the training accuracy dropped from 98% to 95% and moved closer to the validation accuracy, which stayed at 91%.***
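The comparison above can be made explicit by computing the gap between training and validation accuracy. The sketch below assumes a dictionary shaped like the `history` attribute of the object returned by fit, with "accuracy" and "val_accuracy" lists; the example dictionaries contain only the final figures reported in this notebook, not the full per-epoch logs.

```python
def final_gap(history_dict):
    """Training accuracy minus validation accuracy at the last recorded epoch."""
    return round(history_dict["accuracy"][-1] - history_dict["val_accuracy"][-1], 4)

# Only the final reported accuracies are used here (per-epoch values are not shown).
without_dropout = {"accuracy": [0.98], "val_accuracy": [0.91]}
with_dropout = {"accuracy": [0.95], "val_accuracy": [0.91]}

print(final_gap(without_dropout))  # 0.07
print(final_gap(with_dropout))     # 0.04
```

A smaller gap is what we would expect if the dropout layers are reducing overfitting.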


***In the model below, we keep the number of filters the same (128) in all convolution layers after the first, which does not have much impact. In general, though, we should double the number of filters after each MaxPooling layer.***

***As we go deeper into the CNN, each MaxPooling layer significantly reduces the spatial size of the feature maps, so it is generally better to use a greater number of filters in the later layers to capture more information.***
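To make the trade-off concrete, the sketch below tracks the feature-map size through successive 2x2 pooling layers and the number of activations at each stage, using the filter counts 32, 128, 256 and 512 from this notebook's deepest model. It assumes 'SAME' padding in the convolutions (so they keep the spatial size) and pooling that rounds down; `pooled_size` is a helper introduced here.

```python
def pooled_size(size, pool_size=2):
    """Spatial size after max pooling with stride equal to pool_size (rounds down)."""
    return size // pool_size

size = 28
activations_per_stage = []
for filters in [32, 128, 256, 512]:
    activations_per_stage.append((size, filters, size * size * filters))
    size = pooled_size(size)

for side, filters, total in activations_per_stage:
    print(f"{side}x{side} map, {filters} filters -> {total} activations")
```

Even with the filter count growing at every stage, the total number of activations does not grow, because each pooling layer cuts the spatial area to roughly a quarter.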


In [25]:
model_2= keras.models.Sequential([
    Convl2d(filters=32, kernel_size=7, input_shape=[28,28,1]),
    keras.layers.MaxPooling2D(pool_size=2),
    Convl2d(filters=128),
    Convl2d(filters=128),
    keras.layers.MaxPool2D(pool_size=2),
    Convl2d(filters=128),
    Convl2d(filters=128),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.5),   
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation="softmax")
])

loss_fun= keras.losses.SparseCategoricalCrossentropy()
optimizer_fun= keras.optimizers.Adam()
model_2.compile(loss=loss_fun,
              optimizer=optimizer_fun,
              metrics=["accuracy"])

model_2.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<keras.callbacks.History at 0x7f48136a80a0>

In [26]:
model_3= keras.models.Sequential([
    Convl2d(filters=32, kernel_size=7, input_shape=[28,28,1]),
    keras.layers.MaxPooling2D(pool_size=2),
    Convl2d(filters=128),
    Convl2d(filters=128),
    keras.layers.MaxPool2D(pool_size=2),
    Convl2d(filters=256),
    Convl2d(filters=256),
    keras.layers.MaxPool2D(pool_size=2),
    Convl2d(filters=512),
    Convl2d(filters=512),
  
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.5),   
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation="softmax")
])

loss_fun= keras.losses.SparseCategoricalCrossentropy()
optimizer_fun= keras.optimizers.Adam()
model_3.compile(loss=loss_fun,
              optimizer=optimizer_fun,
              metrics=["accuracy"])

model_3.fit(X_train, y_train, epochs=30, validation_data=(X_valid, y_valid))

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<keras.callbacks.History at 0x7f481340e3a0>

***Adding an extra block (a MaxPooling layer followed by two 512-filter convolution layers) reduced the training accuracy, while the validation accuracy did not change.***

***In this notebook we have experimented with only a small number of hyperparameters. Deep learning architectures have many more parameters to tweak when checking model performance.
Methods such as GridSearchCV are available for exploring these parameters, and we can also write custom functions to find the best ones.***
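As one possible shape for such a custom function, here is a minimal exhaustive grid search in plain Python. `grid_search` and `toy_score` are hypothetical names; in practice the scoring function would build and train a model with the given hyperparameters and return its validation accuracy, whereas the toy function here simply prefers 128 filters and a dropout rate of 0.5 so the example runs instantly.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Try every combination in param_grid and return the best (params, score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy stand-in for "train a model and return validation accuracy".
def toy_score(filters, dropout_rate):
    return -abs(filters - 128) - abs(dropout_rate - 0.5)

grid = {"filters": [32, 64, 128], "dropout_rate": [0.2, 0.5]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params)
```

The number of combinations grows multiplicatively with each added hyperparameter, so with 30-epoch training runs like the ones above the grid should be kept small.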
