<a href="https://colab.research.google.com/github/aastha12/Deep-Learning/blob/main/CNN/Fashion_MNIST.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Create CNN Model and Optimize it using Keras Tuner

Make sure to change the runtime to GPU; otherwise the code will take a long time to execute.

In [1]:
!pip install keras-tuner


In [2]:
import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow import keras

In [3]:
fashion_mnist=keras.datasets.fashion_mnist

In [4]:
(train_images,train_labels),(test_images,test_labels)=fashion_mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


In [5]:
train_images.shape

(60000, 28, 28)

In [6]:
test_images.shape

(10000, 28, 28)

The training set has 60,000 images of size 28x28, and the test set has 10,000 images of the same size. Let's look at the first element of the training set.

In [7]:
train_images[0]

array([[  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   1,
          0,   0,  13,  73,   0,   0,   1,   4,   0,   0,   0,   0,   1,
          1,   0],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   3,
          0,  36, 136, 127,  62,  54,   0,   0,   0,   1,   3,   4,   0,
          0,   3],
       [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   6,
          0, 102, 204, 176, 134, 144, 123,  23,   0,   0,   0,   0,  12,
         10,   0],
       [  

Each pixel intensity in the dataset ranges from 0 to 255, so we normalize the images by dividing every pixel value by 255, which rescales them to the range 0 to 1.

In [8]:
train_images=train_images/255
test_images=test_images/255
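
The scaling step can be checked on a small synthetic array. This is only a sketch: `fake_images` is a made-up stand-in for the real data, and the cast to `float32` is an optional choice that the notebook itself does not make.

```python
import numpy as np

# Hypothetical stand-in for a small batch of 8-bit grayscale images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing uint8 data by a float promotes it to float64;
# casting to float32 afterwards is optional and halves the memory used.
scaled = (fake_images / 255.0).astype(np.float32)

print(scaled.dtype, scaled.min(), scaled.max())
```

Every value ends up in the range 0 to 1, because the largest possible input (255) maps to exactly 1.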

#### Reshaping

We need to reshape each image into the form (rows, columns, channels). Since these are grayscale images, the number of channels is 1.

A CNN expects input of shape (number of samples, rows, columns, channels).

In [9]:
train_images=train_images.reshape(train_images.shape +(1,))

In [10]:
train_images.shape

(60000, 28, 28, 1)

In [11]:
test_images=test_images.reshape(test_images.shape +(1,))

In [12]:
test_images.shape

(10000, 28, 28, 1)
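
The reshape above appends a trailing channel axis. As a small sketch on a made-up array (`batch` is hypothetical), NumPy offers a few equivalent ways to do this:

```python
import numpy as np

# Hypothetical batch of 5 grayscale images without a channel axis.
batch = np.zeros((5, 28, 28), dtype=np.float32)

# Three equivalent ways to append a channel axis of size 1.
via_reshape = batch.reshape(batch.shape + (1,))   # the approach used above
via_expand = np.expand_dims(batch, axis=-1)
via_newaxis = batch[..., np.newaxis]

print(via_reshape.shape, via_expand.shape, via_newaxis.shape)
```

None of these change the pixel values; they only change how the same data is indexed.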

#### Keras Tuner

##### Early Stopping and Model Checkpoint:

Source: https://machinelearningmastery.com/how-to-stop-training-deep-neural-networks-at-the-right-time-using-early-stopping/

We define an early stopping callback so that training stops once the model's loss on the validation dataset begins to increase.

You may also notice that the validation accuracy starts to deteriorate over the last few epochs. This means that although the model has improved during training, the model at the end of training may not be the best-performing or most stable one. In this case, we want to save the model with the best accuracy on the validation dataset, which we can do with a ModelCheckpoint callback.
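
To make the two ideas concrete, here is a minimal pure-Python sketch of patience-based early stopping combined with best-accuracy tracking. The helper `early_stopping_sketch` and the metric values are made up for illustration; this is not the Keras implementation, only the same bookkeeping in simplified form.

```python
def early_stopping_sketch(val_losses, val_accuracies, patience):
    """Return (stop_epoch, best_acc_epoch); epochs are numbered from 1.

    Hypothetical helper for illustration, not a Keras API.
    """
    best_loss = float("inf")
    best_acc = float("-inf")
    best_acc_epoch = None
    epochs_without_improvement = 0
    stop_epoch = len(val_losses)

    for epoch, (loss, acc) in enumerate(zip(val_losses, val_accuracies), start=1):
        # Checkpointing: remember the epoch with the highest validation accuracy.
        if acc > best_acc:
            best_acc, best_acc_epoch = acc, epoch

        # Early stopping: count epochs since the validation loss last improved.
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                stop_epoch = epoch
                break

    return stop_epoch, best_acc_epoch


# Made-up validation metrics for seven epochs.
val_losses = [0.50, 0.40, 0.35, 0.36, 0.38, 0.41, 0.45]
val_accuracies = [0.80, 0.85, 0.88, 0.89, 0.87, 0.86, 0.85]

stop_epoch, best_epoch = early_stopping_sketch(val_losses, val_accuracies, patience=2)
print(stop_epoch, best_epoch)
```

In this made-up run the lowest loss occurs at epoch 3 but the best accuracy at epoch 4, which is why the checkpoint metric and the stopping metric are tracked separately.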



In [16]:
set(train_labels)

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
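
The labels are integer class indices (0 to 9) rather than one-hot vectors, which is the case a sparse cross-entropy loss is designed for. The sketch below uses made-up probabilities (`probs`) to show that indexing the true-class probability directly gives the same value as the one-hot formulation; it is a NumPy illustration, not the Keras implementation.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 10 classes.
labels = np.array([9, 0, 3])            # integer class labels
probs = np.full((3, 10), 0.02)
probs[np.arange(3), labels] = 0.82      # each row sums to 0.82 + 9 * 0.02 = 1.0

# Sparse form: pick the predicted probability of the true class by index.
sparse_loss = -np.log(probs[np.arange(3), labels]).mean()

# One-hot form: the same quantity written with explicit one-hot targets.
one_hot = np.eye(10)[labels]
dense_loss = -(one_hot * np.log(probs)).sum(axis=1).mean()

print(sparse_loss, dense_loss)
```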

In [37]:
def build_model(hp):
  model= keras.Sequential([
                                # Convolutional Layer #1
                                # Computes one feature map per filter using a k x k kernel (k tuned as 3 or 5) with SELU activation.
                                # Padding is added to preserve width and height.
                                # Input Tensor Shape: [batch_size, 28, 28, 1]
                                # Output Tensor Shape: [batch_size, 28, 28, no. of filters]
                           keras.layers.Conv2D(
                               filters=hp.Int('conv1_filters',min_value=32,max_value=128,step=16), #number of filters in convolution
                               kernel_size=hp.Choice('conv1_kernel_size',[3,5]),
                               strides=(1,1), 
                               padding='same',
                               kernel_initializer='lecun_normal',
                               activation='selu',
                               input_shape=(28,28,1)
                               ),
                                #Adding Pooling layers to reduce overfitting
                                # Pooling Layer #1
                                # First max pooling layer with a 2x2 filter and stride of 2
                                # Input Tensor Shape: [batch_size, 28, 28, no. of filters]
                                # Output Tensor Shape: [batch_size, 14, 14, no. of filters]                    
                           keras.layers.MaxPool2D(pool_size=(2,2),
                                                  strides=(2,2),
                                                  padding='same'),

                                # Convolutional Layer #2
                                # Computes one feature map per filter using a k x k kernel (k tuned as 3 or 5) with SELU activation.
                                # Padding is added to preserve width and height.
                                # Input Tensor Shape: [batch_size, 14, 14, no. of filters]
                                # Output Tensor Shape: [batch_size, 14, 14, no. of filters]                         
                            keras.layers.Conv2D(
                               filters=hp.Int('conv2_filters',min_value=32,max_value=64,step=16), #number of filters in convolution
                               kernel_size=hp.Choice('conv2_kernel_size',[3,5]),
                               strides=(1,1),
                               padding='same',
                               kernel_initializer='lecun_normal',
                               activation='selu'
                               ),    
                                #Adding Pooling layers to reduce overfitting
                                # Pooling Layer #2
                                # Second max pooling layer with a 2x2 filter and stride of 2
                                # Input Tensor Shape: [batch_size, 14, 14, no. of filters]
                                # Output Tensor Shape: [batch_size, 7, 7, no. of filters]                    
                           keras.layers.MaxPool2D(pool_size=(2,2),
                                                  strides=(2,2),
                                                  padding='same'), 
                           keras.layers.Flatten(),
                           keras.layers.Dense(
                               units = hp.Int('Dense layer_1',min_value=32,max_value=512,step=16),
                               kernel_initializer='lecun_normal',
                               activation='selu'
                           ),
                                # Add dropout operation;  
                                # rate=Fraction of the input units to drop.
                          keras.layers.Dropout(rate=0.6 ,seed=123),
                          keras.layers.Dense(
                               units = hp.Int('Dense layer_2',min_value=32,max_value=512,step=16),
                               kernel_initializer='lecun_normal',
                               activation='selu'
                           ),
                                # Add dropout operation;  
                                # rate=Fraction of the input units to drop.
                          keras.layers.Dropout(rate=0.6 ,seed=123),                                               
                          keras.layers.Dense(units=10,activation='softmax') #output layer using softmax as we have 10 classes

                           
  ])

  model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp.Choice('learning_rate',values=[1e-2, 1e-3])),
                #since we have 10 integer classes use, sparse_categorical_crossentropy
                #read more here: https://keras.io/api/losses/probabilistic_losses/#sparse_categorical_crossentropy-function
                #for binary classes, use binary_crossentropy
                #for one hot encoded multiple classes, use categorical_crossentropy
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy']
                )
  return model
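
The tensor shapes noted in the comments above can be traced by hand. With `padding='same'`, the spatial output size is `ceil(input_size / stride)`, independent of the kernel size. The sketch below applies that rule to the four layers; `same_padding_output` is a hypothetical helper, and `conv2_filters = 32` is just one value from the tuned range.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size for padding='same': ceil(size / stride)."""
    return math.ceil(size / stride)


size = 28
trace = [size]
# Strides of conv1, pool1, conv2, pool2 as configured in build_model.
for stride in (1, 2, 1, 2):
    size = same_padding_output(size, stride)
    trace.append(size)

conv2_filters = 32  # hypothetical pick from the tuned range 32..64
flattened = size * size * conv2_filters

print(trace, flattened)
```

The flattened size is what the first Dense layer receives, so it grows linearly with the number of filters chosen for the second convolution.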

In [38]:
from kerastuner import BayesianOptimization

In [39]:
tuner_search=BayesianOptimization( build_model,
                          objective='val_accuracy',
                          max_trials=5,directory='output',project_name="MNIST Fashion", seed=123)

INFO:tensorflow:Reloading Oracle from existing project output/MNIST Fashion/oracle.json
INFO:tensorflow:Reloading Tuner from output/MNIST Fashion/tuner0.json


We first search with 5 epochs per trial to find the best hyperparameters, then continue training the best model for up to 100 epochs.

In [40]:
tuner_search.search(train_images,train_labels,epochs=5,validation_split=0.1 )

INFO:tensorflow:Oracle triggered exit


In [41]:
tuner_search.search_space_summary()

Search space summary
Default search space size: 7
conv1_filters (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 128, 'step': 16, 'sampling': None}
conv1_kernel_size (Choice)
{'default': 3, 'conditions': [], 'values': [3, 5], 'ordered': True}
conv2_filters (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 64, 'step': 16, 'sampling': None}
conv2_kernel_size (Choice)
{'default': 3, 'conditions': [], 'values': [3, 5], 'ordered': True}
Dense layer_1 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 16, 'sampling': None}
Dense layer_2 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 16, 'sampling': None}
learning_rate (Choice)
{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001], 'ordered': True}
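
For a sense of scale, the number of distinct configurations in this search space can be counted directly from the ranges listed in the summary. This is a rough sketch with a hypothetical helper `int_choices`; it only counts grid points and says nothing about how the tuner samples them.

```python
from math import prod


def int_choices(min_value, max_value, step):
    """Number of values an integer range with inclusive bounds can take."""
    return len(range(min_value, max_value + 1, step))


choices = {
    "conv1_filters": int_choices(32, 128, 16),
    "conv1_kernel_size": 2,
    "conv2_filters": int_choices(32, 64, 16),
    "conv2_kernel_size": 2,
    "Dense layer_1": int_choices(32, 512, 16),
    "Dense layer_2": int_choices(32, 512, 16),
    "learning_rate": 2,
}

total = prod(choices.values())
print(choices)
print(total)
```

With `max_trials=5`, the search evaluates only a handful of these combinations, so the "best" model is best among the few trials that were run.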


In [42]:
model=tuner_search.get_best_models(num_models=1)[0]

In [23]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 28, 28, 96)        960       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 96)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 14, 14, 32)        27680     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 32)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1568)              0         
_________________________________________________________________
dense (Dense)                (None, 448)               702912    
_________________________________________________________________
dropout (Dropout)            (None, 448)               0
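
The parameter counts in the summary can be reproduced by hand: a convolution has one weight per kernel position and input channel plus one bias, for each filter, and a dense layer has one weight per input plus one bias, for each unit. The helper names below are hypothetical.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights (kernel * kernel * in_channels) plus one bias, per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input plus one bias, per unit."""
    return (inputs + 1) * units


# Values taken from the best trial: 3x3 kernels, 96 and 32 filters, 448 units.
conv1 = conv2d_params(kernel=3, in_channels=1, filters=96)
conv2 = conv2d_params(kernel=3, in_channels=96, filters=32)
dense1 = dense_params(inputs=7 * 7 * 32, units=448)

print(conv1, conv2, dense1)
```

Most of the parameters sit in the first dense layer, not in the convolutions.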

In [24]:
tuner_search.results_summary()

Results summary
Results in output/MNIST Fashion
Showing 10 best trials
Objective(name='val_accuracy', direction='max')
Trial summary
Hyperparameters:
conv1_filters: 96
conv1_kernel_size: 3
conv2_filters: 32
conv2_kernel_size: 3
Dense layer_1: 448
Dense layer_2: 208
learning_rate: 0.001
Score: 0.9048333168029785
Trial summary
Hyperparameters:
conv1_filters: 32
conv1_kernel_size: 3
conv2_filters: 32
conv2_kernel_size: 5
Dense layer_1: 32
Dense layer_2: 512
learning_rate: 0.001
Score: 0.8973333239555359
Trial summary
Hyperparameters:
conv1_filters: 32
conv1_kernel_size: 5
conv2_filters: 64
conv2_kernel_size: 3
Dense layer_1: 48
Dense layer_2: 480
learning_rate: 0.001
Score: 0.8961666822433472
Trial summary
Hyperparameters:
conv1_filters: 80
conv1_kernel_size: 5
conv2_filters: 48
conv2_kernel_size: 5
Dense layer_1: 272
Dense layer_2: 352
learning_rate: 0.001
Score: 0.8961666822433472
Trial summary
Hyperparameters:
conv1_filters: 128
conv1_kernel_size: 5
conv2_filters: 48
conv2_kernel_size:

In [25]:
model_history=model.fit(train_images,train_labels,epochs=100,validation_split=0.1,initial_epoch=3)

Epoch 4/100
Epoch 5/100
...
Epoch 80/100
Epoch

In [26]:
print(model_history.history.keys())

dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
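
Each key maps to a list with one value per trained epoch. As a sketch of how these lists can be used, the example below finds the best epoch in a made-up `history` dictionary, taking into account that training resumed with `initial_epoch=3`, so the first recorded entry corresponds to epoch 4.

```python
# Made-up history values for illustration; the real lists come from model_history.history.
history = {
    "val_loss": [0.30, 0.27, 0.28, 0.31, 0.35],
    "val_accuracy": [0.900, 0.910, 0.915, 0.912, 0.910],
}
initial_epoch = 3  # same value as passed to model.fit above

n_recorded = len(history["val_loss"])
# Epoch numbers as displayed by Keras (1-based), one per recorded entry.
epoch_numbers = list(range(initial_epoch + 1, initial_epoch + 1 + n_recorded))

best_acc_idx = max(range(n_recorded), key=lambda i: history["val_accuracy"][i])
best_loss_idx = min(range(n_recorded), key=lambda i: history["val_loss"][i])

best_acc_epoch = epoch_numbers[best_acc_idx]
best_loss_epoch = epoch_numbers[best_loss_idx]
print(epoch_numbers, best_acc_epoch, best_loss_epoch)
```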


In [27]:
import plotly.express as px
import plotly.graph_objects as go

fig=go.Figure()

# One x value per recorded epoch; training resumed at epoch 4 (initial_epoch=3).
x=np.arange(4, 4+len(model_history.history['loss']))

fig.add_trace(go.Scatter(x=x, y=model_history.history['accuracy'],
                    mode='lines',
                    name='Training accuracy'))
fig.add_trace(go.Scatter(x=x, y=model_history.history['val_accuracy'],
                    mode='lines',
                    name='Validation accuracy'))
fig.add_trace(go.Scatter(x=x, y=model_history.history['loss'],
                    mode='lines',
                    name='Training loss'))
fig.add_trace(go.Scatter(x=x, y=model_history.history['val_loss'],
                    mode='lines',
                    name='Validation loss'))

fig.show()

#### Adding Early Stopping and ModelCheckpoint

In [45]:
# Import from tensorflow.keras so the callbacks match the tf.keras model built above
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.models import load_model

# simple early stopping
"""
Source: https://machinelearningmastery.com/how-to-stop-training-deep-neural-networks-at-the-right-time-using-early-stopping/

“patience” argument:
In this case, we will wait 200 epochs before training is stopped. 
This means that we will allow training to continue for up to an additional 
200 epochs after the point that validation loss started to degrade, giving 
the training process an opportunity to get across flat spots or find some 
additional improvement.
"""
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=200)

"""
We monitor accuracy on the validation set in the ModelCheckpoint callback.
We could instead monitor validation loss, but the epoch with the lowest loss
may or may not be the epoch with the best accuracy.
Note: with save_best_only=False the file is overwritten after every epoch, so it
ends up holding the last epoch's model; set save_best_only=True to keep only
the best model observed during training.
"""
mc = ModelCheckpoint('/content/best_model.h5', monitor='val_accuracy', mode='max', verbose=1, save_best_only=False)

#get best model from kerastuner
model=tuner_search.get_best_models(num_models=1)[0]

# fit model
model_history=model.fit(train_images,train_labels,epochs=4000,validation_split=0.1,initial_epoch=3, callbacks=[es, mc])

Epoch 4/4000
Epoch 00004: saving model to /content/best_model.h5
Epoch 5/4000
Epoch 00005: saving model to /content/best_model.h5
...
Epoch 19/4000
Ep

In [47]:
import plotly.express as px
import plotly.graph_objects as go

fig=go.Figure()

# One x value per recorded epoch; training resumed at epoch 4 (initial_epoch=3).
x=np.arange(4, 4+len(model_history.history['loss']))

fig.add_trace(go.Scatter(x=x, y=model_history.history['accuracy'],
                    mode='lines',
                    name='Training accuracy'))
fig.add_trace(go.Scatter(x=x, y=model_history.history['val_accuracy'],
                    mode='lines',
                    name='Validation accuracy'))
fig.add_trace(go.Scatter(x=x, y=model_history.history['loss'],
                    mode='lines',
                    name='Training loss'))
fig.add_trace(go.Scatter(x=x, y=model_history.history['val_loss'],
                    mode='lines',
                    name='Validation loss'))

fig.show()

In [48]:
# evaluate the model; evaluate() returns [loss, accuracy]

train_metrics = model.evaluate(train_images,train_labels, verbose=0)
test_metrics = model.evaluate(test_images,test_labels, verbose=0)
print("Training Performance:",dict(zip(model.metrics_names, train_metrics)))
print("Testing performance:",dict(zip(model.metrics_names, test_metrics)))

Training Performance: {'loss': 0.1709606647491455, 'accuracy': 0.9675499796867371}
Testing performance: {'loss': 1.1751949787139893, 'accuracy': 0.9111999869346619}
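
One way to read these numbers is to look at the gap between training and test performance. The sketch below simply does the arithmetic on the values printed above; interpreting a large gap as overfitting is a common rule of thumb, not something the output proves by itself.

```python
# Values copied from the evaluation output above.
train = {"loss": 0.1709606647491455, "accuracy": 0.9675499796867371}
test = {"loss": 1.1751949787139893, "accuracy": 0.9111999869346619}

# Difference in accuracy and ratio of losses between the two sets.
accuracy_gap = train["accuracy"] - test["accuracy"]
loss_ratio = test["loss"] / train["loss"]

print(f"accuracy gap: {accuracy_gap:.4f}")
print(f"test loss / train loss: {loss_ratio:.2f}")
```

The test loss is several times the training loss while accuracy drops by only a few points, which suggests the model is making a small number of very confident wrong predictions on unseen data.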
