
# MNIST Data - Using Artificial Neural Networks (ANN)

The MNIST dataset is a widely recognized dataset in the field of machine learning and computer vision.

### Description:
The MNIST dataset consists of 70,000 images of handwritten digits, each represented as a 28x28-pixel grayscale image. This dataset is commonly used for training and testing image classification algorithms, particularly neural networks.

### Features:
The dataset includes the following features:

- **Pixel Values**: Each image is represented by 784 features, one for each pixel of the 28x28 grid. Each value is an integer intensity from 0 to 255; in MNIST the background is 0 and the pen strokes take the higher values.

These features collectively represent the visual information of the handwritten digits.

### Target:
The target variable is the digit represented by each image, ranging from 0 to 9. This is a multi-class classification problem where the goal is to correctly identify the digit in each image.

### Data Structure:
- **Number of samples**: 70,000
- **Number of features**: 784 (28x28 pixels)
- **Number of classes**: 10 (digits 0 through 9)

### Example Data:
Here is a sample from the dataset:

| Pixel 1 | Pixel 2 | Pixel 3 | ... | Pixel 784 | Digit |
|---------|---------|---------|-----|-----------|-------|
| 0       | 0       | 0       | ... | 0         | 5     |
| 0       | 0       | 0       | ... | 0         | 0     |
| 0       | 0       | 0       | ... | 0         | 4     |

In this notebook we will apply Artificial Neural Networks (ANN) to classify the handwritten digits in the MNIST dataset. ANNs can learn complex patterns in data, which makes them a natural fit for image recognition tasks. The aim is to show how well a simple ANN can classify handwritten digits, and so how useful these models are for visual data.


# Import libraries

You're probably getting quite used to this now; we import all of the usual libraries - plus Keras from TensorFlow.

__TensorFlow__ is an open-source machine learning framework developed by Google. It provides a comprehensive ecosystem of tools, libraries, and community resources that enable researchers and developers to build and deploy machine learning applications efficiently. TensorFlow is particularly well-suited for numerical computation and large-scale machine learning tasks.

__Keras__ is a high-level neural networks API, written in Python. Here we use the version bundled with TensorFlow (`tensorflow.keras`); older standalone releases could also run on Theano and CNTK, and Keras 3 supports JAX and PyTorch backends as well as TensorFlow. It offers a user-friendly, modular, and extensible interface for building and training deep learning models. Keras simplifies the design of neural networks by providing pre-built layers and intuitive functions, making it accessible to both beginners and experts in machine learning.

In [None]:
import time

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split

from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras import layers

# Set Things Up

We initialise the seed, as we have previously, to help with reproducibility (this seeds NumPy only; Keras keeps its own random state). We then set the maximum number of columns to show to 28 and adjust the display width, for display purposes later on. The `validation_split` of 0.1 will keep 10% of the training data for validation.

In [None]:
np.random.seed(42)
pd.set_option("display.max_columns", 28)
pd.set_option('display.width', 1000)
validation_split = 0.1


# Load the data
Just as the previous models we've looked at were built into `sklearn`, the MNIST dataset is built into `Keras`, so we can use a built-in load function. It returns 60,000 training images and 10,000 test images. We can then use `train_test_split` from `sklearn` to hold back part of the training images for validation.
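
The call in the next cell passes `stratify=y_train_val`, which asks for each digit to appear in the same proportion in the training and validation sets. A minimal NumPy sketch of that idea, on made-up labels, might look like the following. `stratified_split_indices` is a hypothetical helper written for illustration; it is not how `sklearn` implements the split.

```python
import numpy as np


def stratified_split_indices(labels, val_fraction, seed=0):
    """Return (train_idx, val_idx), holding back val_fraction of each class."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx = [], []
    for cls in np.unique(labels):
        # Positions of every sample belonging to this class, in random order.
        cls_idx = np.flatnonzero(labels == cls)
        rng.shuffle(cls_idx)
        n_val = int(round(len(cls_idx) * val_fraction))
        val_idx.extend(cls_idx[:n_val])
        train_idx.extend(cls_idx[n_val:])
    return np.array(train_idx), np.array(val_idx)


# Made-up labels: 100 samples of each digit 0..9 (not the real MNIST labels).
toy_labels = np.repeat(np.arange(10), 100)
train_idx, val_idx = stratified_split_indices(toy_labels, val_fraction=0.1)

print(len(train_idx), len(val_idx))
# Count of each digit in the validation part: equal for every class here.
print(np.bincount(toy_labels[val_idx]))
```

Splitting class by class is what keeps rare and common digits equally represented on both sides of the split.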

In [None]:
(x_train_val, y_train_val), (x_test, y_test) = mnist.load_data()
x_train, x_val, y_train, y_val = train_test_split(
    x_train_val,
    y_train_val,
    test_size=validation_split,
    stratify=y_train_val,
    random_state=7,
)

# See what we've got
We can have a look at what that leaves us with.

In [None]:
print(f"There are {x_train.shape[0]} samples and {y_train.shape[0]} labels in the training dataset.")
print(f"Each data sample in the training dataset is an image that is {x_train.shape[1]} pixels by {x_train.shape[2]} pixels.")
print(f"There are {x_val.shape[0]} samples and labels in the validation dataset.")
print(f"There are {x_test.shape[0]} samples and labels in the test dataset.")

# View the data
Let's have a look at some of the input images in the training dataset.

In [None]:
i_start = 0  # Change this to a different whole number to view different examples.
n = 25
print(f"Data samples from {i_start} to {i_start+n-1}:")
plt.figure(figsize=(10, 10))  # Create an empty figure; the 5x5 grid of subplots is added in the loop.
for j in range(n):
    plt.subplot(5, 5, j+1)
    plt.imshow(x_train[i_start+j], cmap=plt.get_cmap("binary"))
    plt.title(f"Number {y_train[i_start+j]}")
    plt.xticks(ticks=[])
    plt.yticks(ticks=[])

You might, as a person, find some of these hard to read. So think about the problems the computer may have!

# View a single number
Let's have a closer look at one of the images.

Each image is 28 pixels by 28 pixels, which means there are 28 x 28 = 784 pixels in each image, each pixel being a feature. We'll also view the raw data as integers.

In [None]:
i = 0  # Again you can change this to another whole number to view a different image.
fig, ax = plt.subplots(figsize=(10, 10))
plt.imshow(x_train[i], cmap=plt.get_cmap("binary"))
plt.title(f"A handwritten number {y_train[i]}")
minor_ticks = np.linspace(0.5, 27.5, 28)
ax.set_xticks(minor_ticks, minor=True)
ax.set_yticks(minor_ticks, minor=True)
ax.grid(which="minor")
plt.xticks(ticks=range(28))
plt.yticks(ticks=range(28));
print(f"The raw data of sample {i}. Handwritten number {y_train[i]}:")
df = pd.DataFrame(x_train[i])
print(df)

Can you see how the raw data correlates with the printed number?

# Prepare the data
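The data needs to be scaled and reshaped, and the labels need to be encoded. The scaling makes the pixel values 0..1 instead of 0..255, which is easier for the network to deal with. The labels are converted into a binary-type (one-hot) representation: a vector of ten entries that are all 0 except for a 1 at the position of the digit.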
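
The scaling and label encoding described above can be sketched with plain NumPy on a tiny made-up example. The notebook itself uses `keras.utils.to_categorical` for the labels; indexing an identity matrix, as below, is just one way to illustrate the same idea for integer labels. All array values here are invented for the example.

```python
import numpy as np

# Made-up 8-bit pixel values standing in for a small patch of an image.
pixels = np.array([[0, 51, 255], [102, 204, 0]], dtype=np.uint8)

# Rescale from the integer range 0..255 to floats in the range 0..1.
scaled = pixels.astype("float32") / 255

# Made-up digit labels.
labels = np.array([5, 0, 4])
num_classes = 10

# One-hot encode: row k of the identity matrix has a single 1 at position k.
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(scaled.min(), scaled.max())
print(one_hot)
```

Taking `one_hot.argmax(axis=1)` recovers the original integer labels, which is handy when a digit needs to be printed after encoding.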

The images are then reshaped from _n_ 28x28 squares to _n_ 1x784 strips of data.
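
As a small illustration of the reshape, the sketch below flattens a stack of made-up 28x28 arrays and checks where one pixel ends up. With NumPy's default row-major order, the pixel at row `r`, column `c` lands at position `r * 28 + c` in the flattened strip. The array contents are invented for the example.

```python
import numpy as np

n = 3  # A made-up number of images.
# Fill the stack with distinct integers so each pixel can be tracked.
images = np.arange(n * 28 * 28).reshape(n, 28, 28)

# Flatten each 28x28 image into a strip of 784 values.
flat = np.reshape(images, (images.shape[0], 28 * 28))
print(images.shape, "->", flat.shape)

# The pixel at row r, column c of image 1 sits at index r * 28 + c in its strip.
r, c = 2, 5
print(flat[1, r * 28 + c] == images[1, r, c])

# Reshaping back recovers the original squares, so no information is lost.
restored = flat.reshape(n, 28, 28)
```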



In [None]:
# Rescale the matrices of numbers so that they are 0 to 1 instead of 0 to 255.
x_train = x_train.astype("float32") / 255
x_val = x_val.astype("float32") / 255
x_test = x_test.astype("float32") / 255

# Convert each label from a number from 0 to 9 to a 1x10 vector of 0s and 1s
num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_val = keras.utils.to_categorical(y_val, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# Reshape
x_train = np.reshape(x_train, (x_train.shape[0], 28*28))
x_val = np.reshape(x_val, (x_val.shape[0], 28*28))
x_test = np.reshape(x_test, (x_test.shape[0], 28*28))

You can see how this changes the data...

In [None]:
print(f"The prepared data of sample {i}. One-hot label {y_train[i]} (handwritten number {np.argmax(y_train[i])}):")
df = pd.DataFrame(x_train[i])
print(df)

# Constructing the ANN
In this section, we will create an artificial neural network (ANN). In the code, the ANN we build will be stored in a variable named `model`.
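
To make the structure concrete, here is a NumPy sketch of what a network of this shape computes for a batch of inputs: a hidden layer of 128 ReLU units followed by a 10-way softmax, matching the layer sizes used in the cells below. The weights are random placeholders rather than trained values, and the function names are hypothetical. This is an illustration of the arithmetic, not Keras's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

num_input_nodes, num_hidden_nodes, num_output_nodes = 784, 128, 10

# Random placeholder weights and zero biases (assumptions, not trained values).
w1 = rng.normal(scale=0.05, size=(num_input_nodes, num_hidden_nodes))
b1 = np.zeros(num_hidden_nodes)
w2 = rng.normal(scale=0.05, size=(num_hidden_nodes, num_output_nodes))
b2 = np.zeros(num_output_nodes)


def relu(z):
    """Replace negative values with 0."""
    return np.maximum(z, 0.0)


def softmax(z):
    """Turn each row of scores into probabilities that sum to 1."""
    shifted = z - z.max(axis=1, keepdims=True)  # Subtract the row max for numerical stability.
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def forward(x):
    """Hidden layer with ReLU, then output layer with softmax."""
    hidden = relu(x @ w1 + b1)
    return softmax(hidden @ w2 + b2)


# A made-up batch of four flattened "images" with values in 0..1.
batch = rng.random((4, num_input_nodes))
probs = forward(batch)

print(probs.shape)        # One row of ten probabilities per input.
print(probs.sum(axis=1))  # Each row sums to 1 (up to rounding).
predicted_digits = probs.argmax(axis=1)
```

The predicted digit for each input is the position of the largest probability in its row. With untrained weights these predictions are arbitrary.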

In [None]:
# Set the size of the input layer and the output layer.
num_input_nodes = 784  # Each image has 784 pixels (28 x 28): this is the input data size.
num_output_nodes = 10  # Ten output nodes for the ten possible outputs: 0, 1, 2, 3,...,9

In [None]:
# Set the size of the hidden layers
model = keras.Sequential(
    [
        # Input layer
        keras.Input(shape=(num_input_nodes,)),
        # Hidden layers
        layers.Dense(128, activation='relu', name="hidden_layer_1"),
        # Output layer
        layers.Dense(num_output_nodes, activation="softmax", name="final_layer"),
    ]
)

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

# Set the parameters that control the training process
batch_size = 128  # The number of data samples that are input to the ANN in each training step
iterations = 5  # The number of epochs (the number of times that the entire training dataset is input to the ANN)
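
As a rough check on what these two settings imply, the sketch below works out how many weight updates training would perform. It assumes the 54,000 training images left after the 10% validation split of the 60,000 images returned by `mnist.load_data()`, and that the final, smaller batch of each pass is still used.

```python
import math

num_train_samples = 54_000  # Assumption: 60,000 images minus the 10% validation split.
batch_size = 128
iterations = 5  # Passes over the whole training set.

# Each pass needs enough batches to cover every sample, so round up.
steps_per_pass = math.ceil(num_train_samples / batch_size)
total_steps = steps_per_pass * iterations

print(steps_per_pass, total_steps)
```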

# Summarise the model
We can call up a summary of the model we have just built.

In [None]:
model.summary()
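
As a cross-check on the summary, the parameter counts can be worked out by hand: a dense layer has one weight for every input-output pair plus one bias per output. The sketch below does the arithmetic for the layer sizes used above. `dense_params` is a hypothetical helper name.

```python
def dense_params(num_inputs, num_outputs):
    """Weights (inputs x outputs) plus one bias per output."""
    return num_inputs * num_outputs + num_outputs


hidden = dense_params(784, 128)  # hidden_layer_1
output = dense_params(128, 10)   # final_layer
total = hidden + output

print(hidden, output, total)
```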