# Introduction

The MNIST dataset contains 70,000 grayscale images of handwritten digits, each 28x28 pixels, split into 60,000 training images and 10,000 test images. The task is to build a digit recognizer that classifies each image as one of the ten digits 0-9.

Check out an Exploratory Data Analysis of the MNIST dataset [here](http://varianceexplained.org/r/digit-eda/) to better understand the data.

![Look at the MNIST dataset](https://d3i71xaburhd42.cloudfront.net/243056f558c737cdd15e22f3015375446d959941/76-Figure4.6-1.png)

# Approach

Since this is a multi-class image classification problem, we are going to use a Convolutional Neural Network (CNN) with the softmax activation function in the last layer, which outputs one probability for each of the ten digit classes.
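To make the role of softmax concrete, here is a minimal pure-Python sketch. The `softmax` helper and the `scores` values are made up for illustration; they are not produced by the model in this kernel.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable
    # without changing the resulting probabilities.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the ten digit classes 0-9.
scores = [0.1, 0.3, 2.5, 0.0, -1.2, 0.4, 0.2, 4.0, 0.1, -0.5]
probs = softmax(scores)

# The predicted digit is the class with the highest probability.
predicted_digit = probs.index(max(probs))
print(predicted_digit)  # 7
```

The class with the largest raw score always receives the largest probability, so softmax does not change which digit is predicted; it only turns the scores into a distribution that the loss function can work with.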


**Importing the dependencies**

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import mnist

**Loading the Dataset**

In [None]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

**Now we will check the shape of the data to see how large the training and test sets are.**

In [None]:
train_images.shape

In [None]:
len(train_labels)

In [None]:
train_labels

In [None]:
# checking shape of the test data
test_images.shape

In [None]:
# length of the test label set
len(test_labels)

# Architecture

Since we are using a CNN, the model is built from convolution layers. Convolution layers help the network detect local spatial features that would be lost if we simply flattened an image into a vector of pixel values. The feature-extraction part of a CNN typically alternates two kinds of layers: convolutions and pooling. For pooling, we are going to use Max Pooling layers. In the final steps, we are going to feed the output of the convolutional layers into Dense layers, the last of which uses the softmax activation function to classify the digit images.
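As a rough illustration of what these two layer types compute, the sketch below applies a single 3x3 filter and a 2x2 max pooling step to a tiny 6x6 image using plain Python lists. The helper names, the image, and the kernel are assumptions chosen for the example; Keras learns its filter values during training rather than using a fixed edge detector.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]

# Toy 6x6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hypothetical hand-written filter that responds to vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool_2x2(features)         # 2x2 after pooling
print(features[0])  # [0, 3, 3, 0]
print(pooled)       # [[3, 3], [3, 3]]
```

The filter responds only where the brightness changes, and pooling halves each spatial dimension while keeping the strongest response in every window.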

In [None]:
# importing the models
from tensorflow.keras import models

In [None]:
# the model
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

In [None]:
# display architecture of the convnet model
model.summary()
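
The spatial sizes reported by the summary can be checked by hand. The sketch below assumes the Keras defaults used in the model above (no padding, stride 1 for `Conv2D`, and a pooling stride equal to the 2x2 window); `conv_out` and `pool_out` are helper names introduced only for this example.

```python
def conv_out(size, kernel=3):
    """Output width/height of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, window=2):
    """Output width/height of non-overlapping max pooling."""
    return size // window

size = 28  # MNIST images are 28x28
trace = [size]
for layer in ["conv", "pool", "conv", "pool", "conv"]:
    size = conv_out(size) if layer == "conv" else pool_out(size)
    trace.append(size)

print(trace)  # [28, 26, 13, 11, 5, 3]
```

So the last convolution layer outputs 3x3 feature maps with 64 channels each.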

In [None]:
# flatten the output of the last Conv2D layer and feed it into densely connected layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax')) # 10 for the softmax layer to classify 0-9

In [None]:
model.summary()
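
The parameter counts in the summary can also be reproduced with simple arithmetic. This is a sketch under the assumption that every layer has a bias term (the Keras default); the dictionary keys are labels chosen for this example and may not match the layer names Keras prints.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights of a square-kernel Conv2D layer plus one bias per filter."""
    return kernel * kernel * in_channels * out_channels + out_channels

def dense_params(n_inputs, n_units):
    """Weights of a Dense layer plus one bias per unit."""
    return n_inputs * n_units + n_units

flattened = 3 * 3 * 64  # size of the last conv output after Flatten: 576

counts = {
    "conv_1": conv_params(3, 1, 32),        # 320
    "conv_2": conv_params(3, 32, 64),       # 18496
    "conv_3": conv_params(3, 64, 64),       # 36928
    "dense_1": dense_params(flattened, 64), # 36928
    "dense_2": dense_params(64, 10),        # 650
}
total = sum(counts.values())
print(total)  # 93322
```

Pooling and flatten layers have no trainable parameters, which is why they do not appear in the count.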

In [None]:
from tensorflow.keras.utils import to_categorical

In [None]:
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# compiling and running the model

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64)

# evaluation of the model on test set
test_loss, test_acc = model.evaluate(test_images, test_labels)
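
To see what `to_categorical` and the `categorical_crossentropy` loss do for a single image, here is a minimal pure-Python sketch. The helpers `one_hot` and `cross_entropy`, the clipping constant `eps`, and the example predictions are assumptions made for illustration; they mirror the idea, not the exact Keras implementation.

```python
import math

def one_hot(label, num_classes=10):
    """Encode an integer label as a vector with a single 1."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def cross_entropy(y_true, y_pred, eps=1e-7):
    """Categorical cross-entropy for one sample; eps guards against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

target = one_hot(3)  # the true digit is 3

# Hypothetical model outputs: one confident and correct, one uniform guess.
confident = [0.01] * 10
confident[3] = 0.91
uniform = [0.1] * 10

loss_confident = cross_entropy(target, confident)
loss_uniform = cross_entropy(target, uniform)
print(loss_confident < loss_uniform)  # True
```

Because the target is one-hot, the loss reduces to the negative log of the probability assigned to the correct digit, so confident correct predictions are rewarded with a small loss.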

# Results

In [None]:
print(test_acc)

With that, this kernel comes to an end.

For further reference:

[Different Activation Functions](https://missinglink.ai/guides/neural-network-concepts/7-types-neural-network-activation-functions-right/#:~:text=Activation%20functions%20are%20mathematical%20equations,relevant%20for%20the%20model's%20prediction.)

[Convolutions for Deep Learning](https://towardsdatascience.com/intuitively-understanding-convolutions-for-deep-learning-1f6f42faee1)

[More on Convolutional Neural Networks](https://medium.com/technologymadeeasy/the-best-explanation-of-convolutional-neural-networks-on-the-internet-fbb8b1ad5df8#:~:text=CNNs%2C%20like%20neural%20networks%2C%20are,and%20responds%20with%20an%20output.)

[A look at different Optimizers like RMSprop and Gradient Descent](https://towardsdatascience.com/a-look-at-gradient-descent-and-rmsprop-optimizers-f77d483ef08b)

[Understanding Convolutions and Pooling](https://towardsdatascience.com/understanding-convolutions-and-pooling-in-neural-networks-a-simple-explanation-885a2d78f211)

[Loss Functions](https://towardsdatascience.com/cross-entropy-for-classification-d98e7f974451)

[EDA of the MNIST dataset](http://varianceexplained.org/r/digit-eda/)

Finally, if you liked the kernel, please cast an upvote; it helps me stay motivated.