# [INSERT PROJECT TITLE]

[INSERT PROJECT DESCRIPTION]

## 1.0: Data ingestion

Load the MNIST handwritten digits data.

In [None]:
import keras

mnist = keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step


### 1.1: Inspect the data

In [12]:
print(f"No. training data = {x_train.shape[0]}")
print(f"No. test data = {x_test.shape[0]}")
print(f"Each training/test data is an image of {x_train.shape[1]} x {x_train.shape[2]} pixels.")
print(f"Y data correspond to a digit which is depicted in a corresponding 28 x 28 image. For example, for 2nd image, y = {y_train[1]}")


No. training data = 60000
No. test data = 10000
Each training/test data is an image of 28 x 28 pixels.
Y data correspond to a digit which is depicted in a corresponding 28 x 28 image. For example, for 2nd image, y = 0
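
Beyond the shapes, it can help to check the pixel dtype, the value range, and how many examples each digit has. The sketch below is one possible way to do that; the helper name summarise and the tiny stand-in arrays are hypothetical and are not part of the notebook above. With the real data you would pass x_train and y_train instead.

```python
import numpy as np

def summarise(x, y):
    """Return basic facts about an image array x and its integer labels y.

    Hypothetical helper for illustration only.
    """
    return {
        "dtype": str(x.dtype),
        "pixel_min": int(x.min()),
        "pixel_max": int(x.max()),
        # np.bincount counts how often each non-negative integer label occurs
        "label_counts": np.bincount(y).tolist(),
    }

# Hypothetical stand-in data: two 2 x 2 "images" with labels 0 and 2
x_demo = np.array([[[0, 255], [10, 20]], [[5, 5], [5, 5]]], dtype=np.uint8)
y_demo = np.array([0, 2], dtype=np.uint8)

summary = summarise(x_demo, y_demo)
print(summary)
```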


## 2.0: Data preparation

### 2.1: Normalise pixel values

Each pixel is an integer in [0, 255]. Rescaling the values to a small range centred on 0 generally helps an MLP train more effectively.

This means we have to convert them from [0, 255] to [-1, 1]: dividing by 127.5 maps the values to [0, 2], and subtracting 1 shifts them to [-1, 1].

In [13]:
x_train, x_test = x_train / 127.5 - 1, x_test / 127.5 - 1
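
To see what this rescaling does, here is a minimal sketch on a tiny stand-in array rather than the real images. The array name pixels and its values are assumptions chosen for illustration; only the formula matches the cell above.

```python
import numpy as np

# Hypothetical stand-in for image data: uint8 pixels covering the full range
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Same affine map as above: 0 -> -1 and 255 -> 1
scaled = pixels / 127.5 - 1

# The division promotes the integers to floating point
print(scaled.dtype)
print(scaled.min(), scaled.max())
```

Running the same two checks on x_train after normalisation is a quick way to confirm that the step was applied exactly once.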

### 2.2: Flatten the input data

A Multi-Layer Perceptron (MLP) does not explicitly model spatial relationships in images. Instead, it expects the input to be a one-dimensional feature vector. Therefore, each image must be flattened before being passed to the network. This means that x_train must change shape from (60000, 28, 28) to (60000, 784), since 28 x 28 = 784.

In [18]:
import numpy as np

# Find the number of pixels per image (28 * 28 = 784)
nb_features = int(np.prod(x_train.shape[1:]))

# Change the shape from (n_train, 28, 28) to (n_train, 784).
# reshape is used instead of the in-place ndarray.resize, which can raise
# a ValueError if the array is referenced elsewhere.
n_train = x_train.shape[0]
x_train = x_train.reshape(n_train, nb_features)
print(f"After flattening, the x_train shape = {x_train.shape}")

# Change the shape of the test data in the same way
n_test = x_test.shape[0]
x_test = x_test.reshape(n_test, nb_features)
print(f"After flattening, the x_test shape = {x_test.shape}")

After flattening, the x_train shape = (60000, 784)
After flattening, the x_test shape = (10000, 784)
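
Flattening only changes how the pixels are indexed, not their values. As a rough illustration, the sketch below uses a small stand-in array (2 images of 3 x 4 pixels, an assumption made to keep the example readable) to show where a given pixel ends up in the flat vector and that the original layout can be recovered.

```python
import numpy as np

# Hypothetical stand-in: 2 "images" of 3 x 4 pixels instead of 60000 of 28 x 28
images = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# -1 lets NumPy infer the feature count (3 * 4 = 12 here, 784 for MNIST)
flat = images.reshape(images.shape[0], -1)
print(flat.shape)

# Row-major order: pixel (row, col) lands at index row * width + col
row, col, width = 1, 2, images.shape[2]
assert flat[0, row * width + col] == images[0, row, col]

# Reshaping back recovers the original images
restored = flat.reshape(images.shape)
```

Because the mapping is reversible, a flattened row can be reshaped to 28 x 28 again whenever an image needs to be displayed.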
