# AIM: Design a CNN architecture for image classification on an image dataset. Perform hyperparameter tuning and record the results.



## Database
* This notebook uses the **MNIST database**, which contains 60,000 training images and 10,000 test images.
* The dataset consists of small square 28×28 pixel grayscale images of handwritten single digits between 0 and 9.
* The MNIST dataset is conveniently bundled within Keras, and we can easily analyze some of its features in Python.

In [None]:
from tensorflow import keras
from keras.datasets import mnist     # MNIST dataset is included in Keras
(X_train, y_train), (X_test, y_test) = mnist.load_data()

print("X_train shape", X_train.shape)
print("y_train shape", y_train.shape)
print("X_test shape", X_test.shape)
print("y_test shape", y_test.shape)

In [None]:
# Visualize any random image
import matplotlib.pyplot as plt
i = 50
plt.imshow(X_train[i], cmap='gray');

### Formatting the Input

In [None]:
# Single-channel input data (grey-scale)
# First apply convolutions then flatten

X_train = X_train.reshape(60000, 28, 28, 1) # single-channel input
X_test = X_test.reshape(10000, 28, 28, 1)

X_train = X_train.astype('float32')         # change integers to 32-bit floating point numbers
X_test = X_test.astype('float32')

X_train /= 255                              # scale pixel values from [0, 255] to [0, 1]
X_test /= 255

print("Training matrix shape", X_train.shape)
print("Testing matrix shape", X_test.shape)

# Convolutional Neural Network

* Convolution slides **kernels** (filters) across each image to produce **feature maps**.
* Keras `Conv2D` documentation: https://keras.io/api/layers/convolution_layers/convolution2d/
* Each kernel in a CNN learns to detect a different characteristic of an image.
* **Max pooling** downsamples each feature map. It has no learnable parameters of its own, but the smaller feature maps reduce the number of parameters in later dense layers and lower the computational and memory cost.
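
To make the two operations above concrete, here is a minimal NumPy sketch of an unpadded ("valid") convolution followed by 2×2 max pooling on a toy 6×6 array. The helper names `conv2d_valid` and `max_pool2d` and the edge-detecting kernel are illustrative assumptions, not part of Keras; like `Conv2D`, the sketch slides the kernel without flipping it.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding ('valid')."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for r in range(out_h):
        for c in range(out_w):
            # Element-wise product of the kernel with the patch under it, then sum
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling (stride equals pool size); leftover rows/columns are dropped."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(36, dtype=np.float32).reshape(6, 6)    # toy 6x6 "image"
kernel = np.array([[1, 0, -1]] * 3, dtype=np.float32)    # hypothetical vertical-edge kernel

feature_map = conv2d_valid(image, kernel)   # 6x6 input, 3x3 kernel -> 4x4 feature map
pooled = max_pool2d(feature_map)            # 4x4 feature map, 2x2 pooling -> 2x2
print(feature_map.shape, pooled.shape)
```

A real `Conv2D` layer does the same thing for every filter and sums over all input channels, with the kernel values learned during training rather than fixed by hand.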

## Building a Convolutional Neural Network

In [None]:
from keras import backend as K
from keras import __version__

print('Using Keras version:', __version__, 'backend:', K.backend())

In [None]:
# import cnn layers
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
import tensorflow as tf

In [None]:
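model = Sequential()                                 # Linear stacking of layers

# Convolution Layer 1: 8 filters, kernel size 3x3, relu activation, valid padding, stride 1
model.add(Conv2D(8, (3, 3), strides=1, padding='valid', activation='relu', input_shape=(28, 28, 1)))

# MaxPooling: pool size 2, stride 2
model.add(MaxPooling2D(pool_size=2, strides=2))

# Convolution Layer 2: 16 filters, kernel size 3x3, relu activation, valid padding, stride 1
model.add(Conv2D(16, (3, 3), strides=1, padding='valid', activation='relu'))

# MaxPooling: pool size 2, stride 2
model.add(MaxPooling2D(pool_size=2, strides=2))

# Flatten final feature matrix into a 1d array
model.add(Flatten())

# Fully Connected Layer: 64 units and relu activation
model.add(Dense(64, activation='relu'))

# Dropout layer, 0.2 rate
model.add(Dropout(0.2))

# Final output dense Layer: 10 units (one per digit), softmax gives class probabilities
model.add(Dense(10, activation='softmax'))

# Compile the model with sparse_categorical_crossentropy loss
# (the optimizer is not specified above; 'adam' is an assumption)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])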

In [None]:
model.summary()

In [None]:
# Conv1: 3x3 kernels over the single input channel, 8 such filters and 8 biases
print('Conv1: ', 3*3*1*8 + 8)
# Conv2: 3x3 kernels, one for each of the 8 channels, 16 such filters and 16 biases
print('Conv2: ', 3*3*8*16 + 16)
# Flatten: 5x5 feature maps x 16 channels form the input to the dense layer
print('Flatten:', 5*5*16)
# Dense1: 400 inputs plus 1 bias for each of the 64 units
print('Dense1: ', 400*64 + 64)
# Dense2 (output): 64 inputs plus 1 bias for each of the 10 units
print('Dense2: ', 64*10 + 10)
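
The `5*5*16` figure follows from how each layer shrinks the 28×28 input. The sketch below traces the side length through the layers described in the comments of the model cell, using the usual formula for unpadded layers, `(size - kernel) // stride + 1`, and then sums the parameter counts. The helper names `conv_out` and `pool_out` are illustrative, not Keras functions.

```python
def conv_out(size, kernel=3, stride=1):
    """Output side length of a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2, stride=2):
    """Output side length of unpadded max pooling."""
    return (size - pool) // stride + 1

side = 28
side = conv_out(side)   # Conv1:    28 -> 26
side = pool_out(side)   # MaxPool1: 26 -> 13
side = conv_out(side)   # Conv2:    13 -> 11
side = pool_out(side)   # MaxPool2: 11 -> 5 (the last row and column are dropped)

flatten_units = side * side * 16            # 5 * 5 * 16 = 400
total_params = ((3*3*1*8 + 8)               # Conv1
                + (3*3*8*16 + 16)           # Conv2
                + (flatten_units*64 + 64)   # Dense1
                + (64*10 + 10))             # Dense2
print(side, flatten_units, total_params)    # 5 400 27562
```

Pooling, flatten, and dropout layers contribute no parameters, so the total should match the "Total params" line of `model.summary()` if the model is built exactly as the comments describe.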

In [None]:
# Visualize the model
from keras.utils import plot_model   # requires pydot and graphviz to be installed
plot_model(model, show_shapes=True, show_layer_names=False)

### Train the model

* Validation data = 0.2 × 60,000 = 12,000 images
* Batch size = 128
* Number of batches per epoch = (60,000 - 12,000) / 128 = 48,000 / 128 = 375
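
The same arithmetic as a small sketch, which is handy when trying other batch sizes during hyperparameter tuning. The variable names are illustrative. The ceiling matters when the batch size does not divide the training set evenly, because the final, smaller batch still counts as a step.

```python
import math

n_samples = 60000
validation_split = 0.2
batch_size = 128

n_val = round(n_samples * validation_split)           # 12000 images held out for validation
n_train = n_samples - n_val                           # 48000 images used for weight updates
batches_per_epoch = math.ceil(n_train / batch_size)   # 375

print(n_val, n_train, batches_per_epoch)
```

With `batch_size = 100`, for example, the same calculation gives 480 batches per epoch.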




In [None]:
# Train the model
batch_size = 128
epochs = 10
hist = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size,
                 verbose=1, validation_split=0.2)

### Evaluate Model

In [None]:
score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

In [None]:
# make one prediction
print('Actual class:',y_test[0])
print('Class Probabilities:')
model.predict(X_test[0].reshape(1,28,28,1))

In [None]:
import numpy as np
yhat_test = np.argmax(model.predict(X_test), axis=-1)   # predicted class = highest probability
print(yhat_test[0:10])
print(y_test[0:10])

In [None]:
from sklearn.metrics import accuracy_score
print('Accuracy:')
print(float(accuracy_score(y_test, yhat_test))*100,'%')

In [None]:
from sklearn.metrics import confusion_matrix
print('Confusion Matrix:')
print(confusion_matrix(y_test, yhat_test))
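
One way to read the confusion matrix is sketched below on a toy 3-class matrix with made-up counts (these are not results from this notebook). In scikit-learn's convention, rows are true classes and columns are predicted classes, so the diagonal holds the correct predictions.

```python
import numpy as np

# Toy 3-class confusion matrix; the counts are invented for illustration.
# Rows are true classes, columns are predicted classes.
cm = np.array([[50,  2,  1],
               [ 3, 45,  2],
               [ 0,  5, 42]])

per_class_recall = cm.diagonal() / cm.sum(axis=1)   # correct / all true samples of each class
overall_accuracy = cm.trace() / cm.sum()            # all correct / all samples

# Zero the diagonal to find the most frequent misclassification
errors = cm.copy()
np.fill_diagonal(errors, 0)
true_cls, pred_cls = np.unravel_index(errors.argmax(), errors.shape)

print(per_class_recall, overall_accuracy, (true_cls, pred_cls))
```

The same steps should apply to the 10×10 MNIST matrix by setting `cm = confusion_matrix(y_test, yhat_test)`.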

### Plot Learning curves

In [None]:
hist.history.keys()

In [None]:
# Plot Accuracy vs epochs (DIY)


In [None]:
# Plot Loss vs epochs (DIY)
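
As a starting point for the two DIY plots, here is a minimal sketch under some assumptions. It expects a dict shaped like `hist.history`, with the keys `accuracy`, `val_accuracy`, `loss`, and `val_loss`. Some older Keras releases name the accuracy key `acc`, so the helper tries both. The function names `metric_series` and `plot_curves` are illustrative, and the numbers in `toy_history` are made up.

```python
def metric_series(history, name):
    """Return (training, validation) value lists for one metric.

    `history` is a dict shaped like `hist.history`. Both 'accuracy' and the
    older 'acc' spelling are tried for the accuracy metric.
    """
    aliases = {"accuracy": ("accuracy", "acc")}
    for key in aliases.get(name, (name,)):
        if key in history:
            return history[key], history.get("val_" + key)
    raise KeyError(f"metric {name!r} not found in {sorted(history)}")


def plot_curves(history):
    """Plot accuracy and loss against epoch number, side by side."""
    import matplotlib.pyplot as plt   # imported lazily; only needed when plotting

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, name in zip(axes, ("accuracy", "loss")):
        train, val = metric_series(history, name)
        epochs = range(1, len(train) + 1)
        ax.plot(epochs, train, label="training")
        if val is not None:
            ax.plot(epochs, val, label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(name)
        ax.legend()
    fig.tight_layout()
    return fig


# Made-up numbers for illustration only (not results from this notebook)
toy_history = {
    "accuracy": [0.90, 0.95], "val_accuracy": [0.92, 0.96],
    "loss": [0.30, 0.15], "val_loss": [0.25, 0.12],
}
print(metric_series(toy_history, "accuracy"))
```

In the notebook, calling `plot_curves(hist.history)` after training would draw both curves. A validation curve that flattens or worsens while the training curve keeps improving is the usual sign of overfitting.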
