<a href="https://colab.research.google.com/github/arkajyotiMukherjee/tensorflow_docs_prac/blob/master/MNIST_using_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#MNIST dataset classification using CNN

##Importing libraries
Import the libraries required for the classification.

**`__future__`** : These imports make Python 2 treat absolute imports, division, `print` and string literals the way Python 3 does, so the same code runs mostly unchanged on both.

The rest of the imports are straightforward.

`%%capture` is a built-in IPython cell magic that hides the cell's output, here the log from installing the packages.

In [0]:
%%capture
from __future__ import absolute_import, division, print_function, unicode_literals

!pip install tensorflow-gpu==2.0.0-beta1
import tensorflow as tf
from tensorflow.keras import layers, datasets, models

##Download and prepare the MNIST dataset

In [0]:
# Download the data and split it into train and test images
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()

# Reshape the arrays so that every image has an explicit channel axis
# Each image then has shape (28,28,1) : (height, width, channels)
# The channel size is 1 since the images are grayscale. If they were RGB the channel size would be 3
print('Before: ', train_images.shape, test_images.shape)

train_images = train_images.reshape((60000,28,28,1))
test_images = test_images.reshape((10000,28,28,1))

print('After: ', train_images.shape, test_images.shape)

# Normalize the images so that their pixel values lie between 0 and 1
# Small, consistently scaled inputs usually help gradient-based training converge
train_images, test_images = train_images/255, test_images/255

Before:  (60000, 28, 28) (10000, 28, 28)
After:  (60000, 28, 28, 1) (10000, 28, 28, 1)
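
The reshape and normalization steps can be tried on a small array without downloading MNIST. The sketch below is only an illustration: `fake_images` is a hypothetical stand-in for the real data, and using `np.newaxis` plus a cast to `float32` is one possible variant, not what the cell above does (which hard-codes the sample counts and produces `float64`).

```python
import numpy as np

# Hypothetical stand-in for MNIST: 4 random grayscale "images" of 28x28 uint8 pixels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Add a trailing channel axis without hard-coding the number of samples.
with_channel = fake_images[..., np.newaxis]

# Scale pixel values from [0, 255] to [0, 1]; float32 uses half the memory of float64.
scaled = with_channel.astype("float32") / 255.0

print(with_channel.shape)
print(scaled.dtype, scaled.min(), scaled.max())
```

Adding the axis this way works for any number of images, which can be convenient when the same preprocessing is applied to both the train and test sets.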


##Defining the model
This model consists of:

*  **Convolutional** and **MaxPooling layers**.
The first layer takes a special argument, input_shape, which tells it the shape in which the input will be provided.
The subsequent layers do not need this argument, since each receives its input from the previous layer and Keras infers the shape.
* After two pairs of Conv2D and MaxPooling2D layers and a third Conv2D layer, we use a **Flatten** layer to flatten the output of the last Conv layer into a long vector
* A **Dense** layer is used that is ***fully connected*** to each of the inputs from the Flatten layer
* Finally a **Dense** layer of **10 units** (for the 0-9 digits in the MNIST dataset) is used with a **softmax activation**
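
To see how the image shrinks on its way to the Flatten layer, the spatial size can be traced by hand. This is a rough sketch that assumes the Keras defaults used in this notebook (`'valid'` padding, stride 1 for convolutions, pooling stride equal to the pool size); `conv_out` and `pool_out` are illustrative helper names, not Keras functions.

```python
def conv_out(size, kernel=3):
    # 'valid' padding with stride 1: the output shrinks by kernel - 1.
    return size - kernel + 1


def pool_out(size, pool=2):
    # Non-overlapping pooling with 'valid' padding drops any leftover row/column.
    return size // pool


size = 28
size = conv_out(size)   # first Conv2D
size = pool_out(size)   # first MaxPooling2D
size = conv_out(size)   # second Conv2D
size = pool_out(size)   # second MaxPooling2D
size = conv_out(size)   # third Conv2D

# The last Conv2D has 64 filters, so Flatten sees size * size * 64 values.
flattened = size * size * 64
print(size, flattened)
```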







In [0]:
model = models.Sequential([
    layers.Conv2D(filters=32, kernel_size=(3,3), activation='relu', input_shape=(28,28,1)),
    layers.MaxPooling2D(pool_size=(2,2)),
    layers.Conv2D(filters=64, kernel_size=(3,3), activation='relu'),
    layers.MaxPooling2D(pool_size=(2,2)),
    layers.Conv2D(filters=64, kernel_size=(3,3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

Display the model summary (architecture).
It also shows the output shape and the number of parameters of each layer.

In [0]:
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_9 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten_2 (Flatten)          (None, 576)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 64)               
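
The `Param #` column for the convolutional rows can be reproduced with a little arithmetic: each filter has one weight per kernel position and input channel, plus one bias. The helper below is a hypothetical illustration of that formula, not part of Keras.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One kernel_h x kernel_w x in_channels kernel per filter, plus one bias per filter.
    return kernel_h * kernel_w * in_channels * filters + filters


counts = [
    conv2d_params(3, 3, 1, 32),    # first Conv2D: 1 input channel (grayscale)
    conv2d_params(3, 3, 32, 64),   # second Conv2D: 32 channels from the first
    conv2d_params(3, 3, 64, 64),   # third Conv2D: 64 channels from the second
]
print(counts)  # [320, 18496, 36928]
```

The pooling and Flatten layers show 0 parameters because they have nothing to learn.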

##Compile the model


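*   **optimizer** : The algorithm that updates the weights to minimize the loss (here Adam)
*   **loss** : The function to minimize; `sparse_categorical_crossentropy` is used because the labels are integers (0-9) rather than one-hot vectors
*   **metrics** : List of metrics to be evaluated by the model during training and testing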
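
As a rough illustration of what the loss measures, sparse categorical cross-entropy for a single example is the negative log of the probability the model assigned to the true class. The function below is a simplified sketch (Keras also clips probabilities and averages over the batch), and the probability vector is made up for the example.

```python
import math


def sparse_categorical_crossentropy(label, probabilities):
    # label is an integer class index; probabilities is a softmax output.
    return -math.log(probabilities[label])


# Hypothetical softmax output that is fairly sure the digit is a 7.
probabilities = [0.01] * 10
probabilities[7] = 0.91

loss_if_seven = sparse_categorical_crossentropy(7, probabilities)
loss_if_three = sparse_categorical_crossentropy(3, probabilities)
print(round(loss_if_seven, 4), round(loss_if_three, 4))
```

The loss is small when the true label is 7 and large when it is 3, which is the signal the optimizer uses to adjust the weights.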



In [0]:
model.compile(optimizer='adam',
             loss='sparse_categorical_crossentropy',
             metrics=['accuracy'])

##Train the model
Here we pass the training data and its labels to train the model. `epochs` is the number of passes the model makes over the entire training set.

In [0]:
model.fit(train_images, train_labels, epochs=5)

Train on 60000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f2b22ba1518>
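
Within each epoch the data is processed in mini-batches. Since `fit` is called above without a `batch_size`, Keras falls back to its default of 32, so the number of weight updates can be estimated as follows (a back-of-the-envelope sketch, not output from the run above):

```python
import math

num_samples = 60000   # size of the MNIST training set
batch_size = 32       # Keras default when batch_size is not passed to fit()
epochs = 5

# Each step processes one batch and performs one weight update.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```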

##Now we evaluate the model on unseen data

In [0]:
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print('Test accuracy : ', round(test_accuracy*100, 2), '%')

Test accuracy :  99.01 %


***Congratulations, we have an accuracy above 99% on unseen data of the MNIST dataset with our model***

(Results may vary with different runs but should generally be above 98%)
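
The accuracy metric boils down to taking the most probable class for each image and comparing it with the label. The sketch below shows that computation on made-up softmax outputs with 3 classes instead of 10; the arrays are hypothetical and not taken from the trained model.

```python
import numpy as np

# Hypothetical softmax outputs for 3 images over 3 classes (MNIST would have 10 columns).
probabilities = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
])
labels = np.array([1, 0, 1])

# The predicted class is the index of the largest probability in each row.
predicted = probabilities.argmax(axis=1)

# Accuracy is the fraction of predictions that match the labels.
accuracy = (predicted == labels).mean()
print(predicted, accuracy)
```

With the trained model, the same idea would apply to `model.predict(test_images).argmax(axis=1)` compared against `test_labels`.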