# Using MNIST dataset - database of handwritten digits
- **CNN model** 
- **CNN model with stride**
- **CNN model with padding**
- **CNN with stride and padding**

In [1]:
# !pip install tensorflow

In [2]:
# pip install keras

In [3]:
# import the required libraries
import tensorflow as tf
from tensorflow import keras
from keras.layers import Dense, Flatten, Conv2D
from keras.models import Sequential, Model
from keras.datasets import mnist

In [4]:
# loading and splitting data
(x_train, y_train),(x_test, y_test) = mnist.load_data()

In [5]:
x_train.shape

(60000, 28, 28)

In [6]:
x_test.shape

(10000, 28, 28)

# CNN Model

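#### Conv2D layers are stacked in a Sequential model
- **Conv2D Layer**: Added to a `Sequential` model here so the layers stack in order. `Conv2D` itself does not require `Sequential`; it also works with the functional API.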

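#### 16 filters with randomly initialised weights (Glorot/Xavier uniform by default; He and others are available)
- **Filters**: The Conv2D layer creates 16 filters with random initial weights. Keras uses Glorot (Xavier) uniform initialization by default; other schemes such as He or plain random initialization can be selected with `kernel_initializer`.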

#### A 3x3 kernel means a 3x3 window is taken from the 28x28 input; the trailing 1 in (28, 28, 1) is the single grayscale channel
- **Kernel Size**: A 3x3 matrix (kernel) slides over the input of shape 28x28x1 (grayscale image).

#### No padding is used (`padding='valid'`), so the output is smaller than the input
- **Padding**: No padding is used (`padding='valid'`, which is also the Keras default when `padding` is omitted), so each 3x3 convolution shrinks the height and width by 2.

#### Flatten converts the feature maps to a 1D vector
- **Flatten Layer**: Converts the 3D output of the Conv2D layers (height x width x channels) into a 1D vector per image.

#### Using Dense to perform classification based on the features extracted by the Conv2D layers
- **Dense Layers**: Fully connected layers that perform classification based on features extracted by the Conv2D layers.

#### Dense layer 1 is the hidden layer; Dense layer 2 is the output layer
- **Hidden Layer**: The first Dense layer acts as a hidden layer.
- **Output Layer**: The second Dense layer acts as the output layer.

#### Output layer 10 neurons for multiclass classification
- **Output Layer**: The final Dense layer has 10 neurons, suitable for multi-class classification (e.g., digit classification with 10 classes).
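
As a rough illustration of what "randomly initialised" means for the filters: Keras's `Conv2D` defaults to Glorot (Xavier) uniform initialization, which draws weights from a uniform range of plus or minus `sqrt(6 / (fan_in + fan_out))`. The sketch below reproduces that rule in NumPy for the first layer (3x3 kernel, 1 input channel, 16 filters). The helper name `glorot_uniform_kernel` is made up for this example and is not a Keras function.

```python
import numpy as np

def glorot_uniform_kernel(kernel_h, kernel_w, in_channels, filters, seed=0):
    """Sample a Conv2D kernel with the Glorot/Xavier uniform rule (illustrative helper)."""
    receptive_field = kernel_h * kernel_w
    fan_in = receptive_field * in_channels
    fan_out = receptive_field * filters
    limit = float(np.sqrt(6.0 / (fan_in + fan_out)))
    rng = np.random.default_rng(seed)
    # Keras stores Conv2D kernels as (height, width, in_channels, filters)
    kernel = rng.uniform(-limit, limit, size=(kernel_h, kernel_w, in_channels, filters))
    return kernel, limit

kernel, limit = glorot_uniform_kernel(3, 3, 1, 16)
print(kernel.shape)     # (3, 3, 1, 16)
print(round(limit, 3))  # 0.198
```

Every weight starts inside that small symmetric range; training then moves the weights away from these starting values.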


In [7]:
model = Sequential()
model.add(Conv2D(16, kernel_size=(3,3), activation='relu', input_shape=(28,28,1)))
model.add(Conv2D(16, kernel_size=(3,3), activation='relu', padding='valid'))
model.add(Flatten())
model.add(Dense(16, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.summary()



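### Conclusion: with `padding='valid'`, each 3x3 convolution trims one pixel per side, so the shape goes from (28, 28) to (26, 26) after the first layer and from (26, 26) to (24, 24) after the second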
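
The shrinkage can be checked with the standard output-size rule, floor((n - k + 2p) / s) + 1. A minimal sketch in plain Python; the function name `conv_output_size` is chosen for this example only:

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution: floor((n - k + 2p) / s) + 1."""
    return (n - kernel + 2 * padding) // stride + 1

size = 28
for layer in (1, 2):
    size = conv_output_size(size, kernel=3)  # padding='valid' means p = 0
    print(f"after conv {layer}: {size} x {size}")
# after conv 1: 26 x 26
# after conv 2: 24 x 24
```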

# CNN model with Padding

In [8]:
model1 = Sequential()
model1.add(Conv2D(16, kernel_size=(3,3), activation='relu', padding = "same", input_shape=(28,28,1)))
model1.add(Conv2D(16, kernel_size=(3,3), activation='relu', padding="same"))
model1.add(Flatten())
model1.add(Dense(16, activation='relu'))
model1.add(Dense(10, activation='softmax'))
model1.summary()

### Conclusion: with `padding="same"` and stride 1, the output keeps the input's spatial shape (28, 28)
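
To see why the shape is preserved, the sketch below pads a 28x28 array with one ring of zeros and then applies an unpadded 3x3 filter. This is a naive single-channel NumPy loop written for illustration, not how Keras computes convolutions internally; `conv2d_valid` and the averaging kernel are assumptions of this example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive single-channel 'valid' cross-correlation with stride 1."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.default_rng(0).random((28, 28))
kernel = np.ones((3, 3)) / 9.0        # simple averaging filter

valid_out = conv2d_valid(image, kernel)
padded = np.pad(image, pad_width=1)   # one ring of zeros: 28x28 -> 30x30
same_out = conv2d_valid(padded, kernel)

print(valid_out.shape)  # (26, 26)
print(same_out.shape)   # (28, 28)
```

The interior of `same_out` equals `valid_out`; the padding only adds the border positions that would otherwise be lost.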

# CNN Model with Stride
#### `strides=2` means the 3x3 window moves 2 pixels at a time (instead of 1) across the input, both horizontally and vertically

In [9]:
model2 = Sequential()
model2.add(Conv2D(16, kernel_size=(3,3), activation='relu',strides = 2, padding = "valid", input_shape=(28,28,1)))
model2.add(Conv2D(16, kernel_size=(3,3), activation='relu',strides = 2, padding="valid"))
model2.add(Flatten())
model2.add(Dense(16, activation='relu'))
model2.add(Dense(10, activation='softmax'))
model2.summary()

### Conclusion: the output shape is reduced from (28, 28) to (13, 13) after the first layer and to (6, 6) after the second
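
One way to picture the effect of the stride is to list where the 3x3 window can start along one axis when it jumps 2 pixels at a time with no padding. The helper `window_starts` below is a hypothetical name used only for this sketch:

```python
def window_starts(n, kernel, stride):
    """Start indices of every kernel position that fits inside n pixels (no padding)."""
    return list(range(0, n - kernel + 1, stride))

first = window_starts(28, kernel=3, stride=2)
second = window_starts(len(first), kernel=3, stride=2)

print(first[:4], "...", first[-1])  # [0, 2, 4, 6] ... 24
print(len(first), len(second))      # 13 6
```

The number of start positions per axis is the output size of each layer for the 28x28 input.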

# CNN Model with stride and padding
#### using `padding="same"` and `strides=2` (the feature map goes from 28x28 to 14x14, then to 7x7)

In [10]:
model4 = Sequential()
model4.add(Conv2D(16, kernel_size=(3,3), activation='relu',strides = 2, padding = "same", input_shape=(28,28,1)))
model4.add(Conv2D(16, kernel_size=(3,3), activation='relu',strides = 2, padding="same"))
model4.add(Flatten())
model4.add(Dense(16, activation='relu'))
model4.add(Dense(10, activation='softmax'))
model4.summary()
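
With `padding="same"` and a stride above 1, TensorFlow's documented rule is that the output size is ceil(n / s) and only as much zero padding is added as the last window needs. The sketch below follows that rule in plain Python; `same_padding` is an illustrative helper, not a TensorFlow function.

```python
import math

def same_padding(n, kernel, stride):
    """Output size and (before, after) zero padding for padding='same'."""
    out = math.ceil(n / stride)
    total = max((out - 1) * stride + kernel - n, 0)
    before = total // 2
    return out, (before, total - before)

print(same_padding(28, kernel=3, stride=2))  # (14, (0, 1))
print(same_padding(14, kernel=3, stride=2))  # (7, (0, 1))
```

For these layers a single row and column of zeros is added, on the bottom and right side only.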

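### Data augmentation can be applied here if required to compensate for the detail lost through striding
### Augmentation (for example random shifts) can stand in for the translation invariance that pooling provides, although it does not shrink the feature maps the way pooling does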

# Pooling Layer

- **Purpose**: Pooling downsamples the feature maps and gives a degree of translation invariance, which helps the model recognize a feature even when it shifts slightly within the image.
- **Operation**: The output size depends on the pool size and the stride.
- **Application**: A pooling layer is commonly placed after a convolutional layer (or a small group of them).
- **Nature**: Pooling is a non-trainable operation; it does not involve learning parameters.

#### Max Pooling:
- **Introduction**: Max pooling became popular around 2012.
- **Rule**: The standard practice is to use a stride of 2 and a pool size of 2x2.
- **Mechanism**: It takes the maximum value from each 2x2 window of the input, keeping the strongest activation in that region.
- **Dimensionality Reduction**: Max pooling reduces the dimensionality of the feature map, which helps in decreasing the computational load in the subsequent layers.
- **Efficiency**: By reducing the dimensions, the model becomes more efficient and faster, especially when the reduced feature maps are fed into dense layers for final classification or regression tasks.
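
The mechanism can be sketched in NumPy on a small 4x4 feature map. This is an illustration of 2x2 max pooling with stride 2, assuming even height and width; `max_pool_2x2` is not a Keras function.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2D array with even height and width."""
    h, w = feature_map.shape
    # Split into (block_row, row_in_block, block_col, col_in_block)
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 5],
                 [0, 1, 7, 2],
                 [6, 2, 3, 4]])
pooled = max_pool_2x2(fmap)
print(pooled)
# [[4 5]
#  [6 7]]
```

Each 2x2 block collapses to its largest value, so the 4x4 map becomes 2x2 without any trainable weights.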


In [14]:
from keras.layers import MaxPooling2D

# CNN Model with MaxPooling layer

In [36]:
model5 = Sequential()
# conv layer and relu layer
model5.add(Conv2D(25, kernel_size=(3,3), activation='relu',strides = 2, padding = "same", input_shape=(28,28,1)))
# pooling layer 1
model5.add(MaxPooling2D(pool_size=(2,2), strides = 2))
# conv layer 2 and relu layer
model5.add(Conv2D(25, kernel_size=(3,3), activation='relu',strides = 2, padding="same"))
# pooling layer 2
model5.add(MaxPooling2D(pool_size=(2,2), strides = 2))
# Flatten to convert 2D to 1D
model5.add(Flatten())
# fully connected layer
model5.add(Dense(128, activation='relu'))
# output layer: 10 units for multiclass classification
model5.add(Dense(10, activation='softmax'))
model5.summary()

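### Output Shape Calculation
Output size per spatial dimension: floor((n - k + 2p) / s) + 1, where n = input size, k = kernel (or pool) size, p = padding per side and s = stride. For `padding="same"` Keras gives ceil(n / s), which agrees with the formula using p = 1 for the 3x3 kernels here.
- **Conv2D (conv2d_13)**- floor((28 - 3 + 2x1) / 2) + 1 -> 13 + 1 -> 14 -> 14,14,25
- **MaxPooling2D (max_pooling2d_5)**- floor((14 - 2 + 2x0) / 2) + 1 -> 6 + 1 -> 7 -> 7,7,25
- **Conv2D (conv2d_14)**- floor((7 - 3 + 2x1) / 2) + 1 -> 3 + 1 -> 4 -> 4,4,25
- **MaxPooling2D (max_pooling2d_6)**- floor((4 - 2 + 2x0) / 2) + 1 -> 1 + 1 -> 2 -> 2,2,25
- **Flatten (flatten_4)**- (2x2x25) -> 100
- **Dense (dense_8)**- 128 neurons
- **Dense (dense_9)**- 10 neurons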
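
The layer-by-layer sizes listed above can be reproduced with a short trace in plain Python. It assumes the `model5` settings (stride-2 convolutions with `padding="same"`, 2x2 pooling with stride 2 and the default `padding="valid"`); the helper names are invented for this sketch.

```python
import math

def same_conv(n, stride):
    # padding='same': output size depends only on input size and stride
    return math.ceil(n / stride)

def valid_pool(n, pool, stride):
    # MaxPooling2D defaults to padding='valid'
    return (n - pool) // stride + 1

channels = 25
size = 28
trace = []
for name in ("conv", "pool", "conv", "pool"):
    size = same_conv(size, 2) if name == "conv" else valid_pool(size, 2, 2)
    trace.append((name, size, size, channels))

for row in trace:
    print(row)
print("flatten:", size * size * channels)  # flatten: 100
```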

### Parameter calculations:
- **Conv2D (conv2d_13)**- (Number of filters)*(Size of each filter)+(Number of filters) -> 25x3x3x1+25 -> 250
- **MaxPooling2D (max_pooling2d_5)**- no trainable parameters -> 0
- **Conv2D (conv2d_14)**- (Number of filters)*(Size of each filter)+(Number of filters) -> 25x3x3x25+25 -> 5650
- **MaxPooling2D (max_pooling2d_6)**- no trainable parameters -> 0
- **Flatten (flatten_4)**- no trainable parameters -> 0
- **Dense (dense_8)**- inputs x units + biases -> 100x128+128 -> 12928
- **Dense (dense_9)**- inputs x units + biases -> 128x10+10 -> 1290
- **Total**- 250 + 5650 + 12928 + 1290 -> 20118
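
The parameter counts above can be checked with two small helper functions. This is a plain-Python sketch; `conv2d_params` and `dense_params` are names chosen for this example.

```python
def conv2d_params(filters, kernel_h, kernel_w, in_channels):
    # one kernel_h x kernel_w x in_channels weight block per filter, plus one bias per filter
    return filters * kernel_h * kernel_w * in_channels + filters

def dense_params(inputs, units):
    # one weight per (input, unit) pair, plus one bias per unit
    return inputs * units + units

counts = {
    "conv 1": conv2d_params(25, 3, 3, 1),
    "conv 2": conv2d_params(25, 3, 3, 25),
    "dense 1": dense_params(100, 128),
    "dense 2": dense_params(128, 10),
}
for name, count in counts.items():
    print(name, count)
print("total:", sum(counts.values()))  # total: 20118
```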

In [39]:
x_train.shape

(60000, 28, 28)

In [40]:
# cast to float and scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float64')
x_test = x_test.astype('float64')
x_train /= 255.0
x_test /= 255.0
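
A small stand-alone sketch of the same preprocessing on made-up data, in case it helps to see the value range and shapes. Two things here differ from the cell above and are assumptions of this example: it casts to `float32` (Keras's default float type) instead of `float64`, and it adds an explicit channel axis so the array matches `input_shape=(28, 28, 1)`. The training run in this notebook worked without that extra axis.

```python
import numpy as np

# Stand-in for a few MNIST images: uint8 pixels in [0, 255], shape (N, 28, 28)
fake_images = np.random.default_rng(0).integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

scaled = fake_images.astype("float32") / 255.0  # values now in [0, 1]
with_channel = scaled[..., np.newaxis]           # (N, 28, 28) -> (N, 28, 28, 1)

print(scaled.min() >= 0.0, scaled.max() <= 1.0)  # True True
print(with_channel.shape)                        # (4, 28, 28, 1)
```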

In [41]:
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)
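
`to_categorical` turns each integer label into a one-hot vector of length 10. The NumPy sketch below shows the same idea on four example labels; it is an equivalent illustration, not the Keras implementation.

```python
import numpy as np

labels = np.array([5, 0, 4, 1])
one_hot = np.eye(10, dtype="float32")[labels]  # row i of the identity matrix encodes class i

print(one_hot.shape)           # (4, 10)
print(one_hot[0])              # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
print(one_hot.argmax(axis=1))  # [5 0 4 1]
```

Taking `argmax` along the last axis recovers the original integer labels, which is also how predicted classes are read from the softmax output.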

In [42]:
# Compile the model
model5.compile(optimizer='Adam', loss=keras.losses.categorical_crossentropy, metrics=['accuracy'])
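
For intuition about the loss chosen here, categorical cross-entropy for one sample is the negative log of the probability the model assigns to the true class. The sketch below computes it in NumPy for a 3-class toy case; the helper functions and the clipping constant `eps` are assumptions of this example rather than Keras internals.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])              # true class is index 1
confident = softmax(np.array([[0.0, 5.0, 0.0]]))  # most probability on the true class
unsure = softmax(np.array([[1.0, 1.0, 1.0]]))     # uniform over the 3 classes

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)  # the confident prediction has the lower loss
```

A uniform guess over 3 classes gives a loss of log(3); the loss falls toward 0 as the probability on the true class approaches 1.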

In [43]:
# Fit the model
history = model5.fit(x_train, y_train, epochs=10, batch_size=64, validation_data=(x_test, y_test))

Epoch 1/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.7875 - loss: 0.7267 - val_accuracy: 0.9595 - val_loss: 0.1295
Epoch 2/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.9627 - loss: 0.1244 - val_accuracy: 0.9730 - val_loss: 0.0861
Epoch 3/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.9717 - loss: 0.0900 - val_accuracy: 0.9779 - val_loss: 0.0687
Epoch 4/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.9787 - loss: 0.0713 - val_accuracy: 0.9795 - val_loss: 0.0649
Epoch 5/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.9809 - loss: 0.0634 - val_accuracy: 0.9801 - val_loss: 0.0653
Epoch 6/10
938/938 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.9834 - loss: 0.0527 - val_accuracy: 0.9823 - val_loss: 0.0515
Epoch 7/10
938/938

## Using a CNN on the MNIST dataset, we get about 98% validation accuracy (0.9823 after epoch 6)