<a href="https://colab.research.google.com/github/tyfmanlapaz/Data-Science/blob/deep-learning/01_DL_Implementation_with_MNIST_digit_dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **1.2.1 Understanding math for DL with practical implementation**

To understand deep learning (DL) in depth, one needs several mathematical concepts: tensors and tensor operations, differentiation, integration, gradient descent, and so on. Let us learn them through a practical implementation.

For this, we use the MNIST handwritten digit dataset.
Dataset: http://yann.lecun.com/exdb/mnist/

In [None]:
# adding seed for reproducibility (this seeds NumPy's random generator)
from numpy.random import seed
seed(123)

**Loading the MNIST dataset in Keras**

In [1]:
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


tr_images and tr_labels form the training set; te_images and te_labels form the test set.

The model is trained on the training set and then evaluated on the test set.

The images are encoded as NumPy arrays, and the labels are an array of digits ranging from 0 to 9.

In [20]:
tr_images = x_train
tr_labels = y_train
te_images = x_test
te_labels = y_test

**Size of training dataset:**

The shape (60000, 28, 28) gives the dimensions of the tr_images array. Here’s what each number signifies:

* 60000: The number of images in the training dataset. Each image is represented as a 2D array of pixels.
* 28: The height of each image in pixels.
* 28: The width of each image in pixels.

In summary, the training dataset consists of 60,000 images, each 28 pixels high and 28 pixels wide. Each pixel holds a grayscale intensity from 0 (background) to 255 (foreground, the digit stroke).
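
As a minimal sketch of how these dimensions show up in NumPy, the snippet below builds a small synthetic stand-in for tr_images and inspects its rank, shape, and dtype. The name `fake_images` and its random pixel values are assumptions for illustration only; they are not MNIST data.

```python
import numpy as np

# Synthetic stand-in for a few MNIST-like images (not real MNIST data).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

print(fake_images.ndim)      # number of axes (rank) of the tensor
print(fake_images.shape)     # (samples, height, width)
print(fake_images.dtype)     # 8-bit unsigned integers, so values fit in 0..255
print(fake_images[0].shape)  # a single image is a 2D array
```

The same attributes can be read from the real tr_images array once it is loaded.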

In [21]:
tr_images.shape   # get dimension of train dataset

(60000, 28, 28)

In [22]:
len(tr_labels)    # get the number of elements/rows in the array

60000

In [23]:
tr_labels         # print array

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

**Size of testing dataset:**

In [24]:
te_images.shape   # get dimension of test dataset

(10000, 28, 28)

In [25]:
len(te_labels)    # get the number of elements/rows in the array

10000

In [26]:
te_labels        # print array

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)


**Building Network Architecture:**

In [28]:
from keras import layers
from keras import models

net = models.Sequential()
net.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
net.add(layers.Dense(10, activation='softmax'))

**Step by Step code explanation:**

This code is building a simple neural network model using Keras, a popular deep learning library in Python. Here’s what each part of the code does:

* from keras import layers and from keras import models: These import the two Keras modules used here. layers provides the layer classes, and models provides the model classes.
* net = models.Sequential(): This initializes a new sequential model. Sequential is the simplest way to build a model in Keras: layers are stacked one after another, and the output of each layer becomes the input of the next.
* net.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,))): This adds the first layer to the neural network. It is a dense (also known as fully connected) layer with 512 neurons and the ReLU (Rectified Linear Unit) activation function. The input_shape=(28 * 28,) argument specifies that each input is a 1D array of size 784 (since 28*28=784). The layer does not flatten the images itself; the 28x28 MNIST images must be flattened to 1D arrays before being fed to it, which is done in the data preparation step.
* net.add(layers.Dense(10, activation='softmax')): This adds the second layer to the neural network. It is also a dense layer, with 10 neurons, one for each output class (digits 0-9 in MNIST). The softmax activation function ensures the output values lie between 0 and 1 and sum to 1, which makes it possible to interpret them as probabilities.


So, in summary, this code is creating a simple two-layer neural network for classifying the images in the MNIST dataset. The first layer has 512 neurons and uses the ReLU activation function, and the second layer has 10 neurons and uses the softmax activation function. The input images are flattened and fed into the first layer of the network. The output from the network will be a vector of 10 probabilities, one for each class.
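
To make the arithmetic behind these two layers concrete, here is a minimal NumPy sketch of the forward pass. It is not how Keras implements Dense internally. The weights `W1`, `b1`, `W2`, `b2` are hypothetical, randomly initialised arrays with the same shapes as the layers above, and the inputs are random numbers rather than MNIST images.

```python
import numpy as np

def relu(z):
    # Element-wise max(0, z).
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row-wise max for numerical stability before exponentiating.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Hypothetical parameters with the same shapes as the two Dense layers.
W1, b1 = rng.normal(scale=0.01, size=(784, 512)), np.zeros(512)
W2, b2 = rng.normal(scale=0.01, size=(512, 10)), np.zeros(10)

x = rng.random((3, 784))            # three fake flattened images in [0, 1)
hidden = relu(x @ W1 + b1)          # shape (3, 512), all values >= 0
probs = softmax(hidden @ W2 + b2)   # shape (3, 10), each row sums to 1

n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape, probs.sum(axis=1), n_params)
```

With these shapes the parameter count is 784*512 + 512 + 512*10 + 10 = 407,050, which should match the total that net.summary() reports for this architecture.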

A **layer** is the core building block of a neural network. It is a data-processing module that acts as a filter: data goes in and comes out in a more useful form. Layers extract representations from the data fed to them, and chaining layers amounts to a form of progressive data distillation. In the following code, we specify the optimizer, the loss function, and the metrics to monitor during training and testing.

In [29]:
net.compile(optimizer='rmsprop',
            loss='categorical_crossentropy',
            metrics=['accuracy'])

This code is used to configure the learning process of the model before training it. Here’s what each part of the code does:

* optimizer='rmsprop': This sets the optimizer for the model. The optimizer is the algorithm that the model uses to adjust its weights based on the data it sees and its loss function. RMSprop is a popular optimizer that works well in practice and deals with some of the shortcomings of the simple stochastic gradient descent.
* loss='categorical_crossentropy': This sets the loss function for the model. The loss function measures how far the model's predictions are from the true labels on the training data, and this value drives the weight updates. Categorical cross-entropy is a common choice for multi-class classification with one-hot encoded labels. It computes the cross-entropy between the true labels and the predicted class probabilities.
* metrics=['accuracy']: This sets the list of metrics to be evaluated by the model during training and testing. In this case, we’re just interested in accuracy. The accuracy metric calculates the proportion of the correctly predicted labels to the total number of samples.
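
The loss and the metric described above can be sketched by hand in NumPy. The arrays `y_true` and `y_pred` below are made-up values for three samples and three classes, and `eps` is an assumed small constant to avoid taking log(0). This illustrates the formulas only; it is not Keras's own implementation.

```python
import numpy as np

# Hypothetical one-hot labels and predicted probabilities (3 samples, 3 classes).
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]], dtype=float)
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.5, 0.2]])

# Categorical cross-entropy: mean over samples of -sum(y_true * log(y_pred)).
eps = 1e-7  # small constant to avoid log(0)
loss = -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1))

# Accuracy: fraction of samples whose most probable class matches the label.
accuracy = np.mean(y_pred.argmax(axis=1) == y_true.argmax(axis=1))

print(loss, accuracy)
```

The third sample is misclassified (class 1 is predicted, class 2 is true), so the accuracy is 2/3. The same sample contributes the largest share of the loss, because the model assigned only 0.2 to the correct class.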

**Data Preparation:**

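The following code uses reshape to change the dimensions of each array without changing its data: every 28x28 image is flattened into a vector of 784 values. The pixel values are then converted to float32 and divided by 255 so that they lie in the range [0, 1].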

In [30]:

tr_images = tr_images.reshape((60000, 28 * 28))
tr_images = tr_images.astype('float32') / 255

te_images = te_images.reshape((10000, 28 * 28))
te_images = te_images.astype('float32') / 255
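
As a small illustration of what reshape and the division by 255 do, the sketch below applies the same two steps to a tiny synthetic array. The array `images` and the two pixel values set in it are assumptions chosen for illustration.

```python
import numpy as np

# Tiny synthetic "dataset": 2 images of 28x28 uint8 pixels (illustrative only).
images = np.zeros((2, 28, 28), dtype=np.uint8)
images[0, 0, 1] = 255     # row 0, column 1 of the first image
images[1, 27, 27] = 128   # last pixel of the second image

flat = images.reshape((2, 28 * 28))     # image rows are laid out one after another
scaled = flat.astype('float32') / 255   # map 0..255 to 0.0..1.0

print(flat.shape, scaled.dtype, scaled.min(), scaled.max())
print(flat[0, 1], flat[1, 783])         # the same pixels at their flattened positions
```

A pixel at (row, column) ends up at index row * 28 + column in the flattened vector, so no pixel values are lost or changed by the reshape.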

Next, the labels are converted to categorical (one-hot) format:

In [34]:
from keras.utils import to_categorical

tr_labels = to_categorical(tr_labels)
te_labels = to_categorical(te_labels)
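
The effect of this one-hot encoding can be sketched in plain NumPy. This is an equivalent illustration, not the Keras implementation; the three labels are the first three training labels shown earlier, and the class count of 10 is set by hand here.

```python
import numpy as np

labels = np.array([5, 0, 4], dtype=np.uint8)  # first three training labels

# Row i of the 10x10 identity matrix is the one-hot vector for class i.
one_hot = np.eye(10, dtype='float32')[labels]

print(one_hot.shape)            # one row per label, one column per class
print(one_hot[0])               # 1.0 at index 5, 0.0 elsewhere
print(one_hot.argmax(axis=1))   # recovers the original integer labels
```

Each row has a single 1 in the column of its class, which is the label format that the categorical cross-entropy loss expects.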

**Finally, training the model using the fit function:**

We train for two epochs. The model reached an accuracy of 92% in the first epoch and 96% in the second epoch.

In [38]:
net.fit(tr_images, tr_labels, epochs=2, batch_size=128)

Epoch 1/2
Epoch 2/2


<keras.src.callbacks.History at 0x7e2e001d5d50>
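
To see how batch_size and epochs translate into weight updates, the arithmetic can be sketched as follows. The sketch assumes that the final, smaller batch of each epoch is still used rather than dropped, which is an assumption about the default behaviour for array inputs.

```python
import math

n_samples = 60000   # training images
batch_size = 128
epochs = 2

# Number of mini-batches (and weight updates) per epoch; the last one may be smaller.
steps_per_epoch = math.ceil(n_samples / batch_size)
last_batch_size = n_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```

Under this assumption each epoch consists of 469 mini-batches, the last of which holds 96 images, giving 938 weight updates over the two epochs.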

**Model Testing:**

In [39]:
te_loss, te_acc = net.evaluate(te_images, te_labels)



In [42]:
print('Test Accuracy:', round(te_acc, 2))

Test Accuracy: 0.97
