
# Multi-layer perceptrons with Keras

In this demo, we will train and test a multi-layer perceptron model on the MNIST handwritten digits dataset using Keras.

## 1. Load dataset

Like scikit-learn, Keras provides an API to download and load the MNIST dataset. The following code snippet downloads the data, loads it into memory, and scales the pixel values from integers in [0, 255] to floats in [0, 1].

In [None]:
from tensorflow import keras

In [None]:
from keras.datasets import mnist
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
X_train = X_train.astype('float32') / 255
X_test = X_test.astype('float32') / 255

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
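To see what the scaling step does, here is a minimal NumPy sketch on a tiny made-up array. The name `toy_image` and its values are hypothetical; only the `uint8` dtype mirrors what `mnist.load_data()` returns.

```python
import numpy as np

# A hypothetical 2x2 "image" with the same uint8 dtype that mnist.load_data() returns.
toy_image = np.array([[0, 51], [204, 255]], dtype=np.uint8)

# Cast to float32 first, then divide, so the result is a float in [0, 1].
scaled = toy_image.astype("float32") / 255

print(scaled.dtype)                # float32
print(scaled.min(), scaled.max())  # 0.0 1.0
```

Casting before dividing keeps the data in `float32`, which halves memory use compared with NumPy's default `float64`.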


In [None]:
X_train.shape

(60000, 28, 28)

In [None]:
Y_train.shape

(60000,)

## 2. Flatten the inputs into vectors

Keras loads each image as a 28x28 array rather than a flat vector. Since the dense layers of an MLP expect one vector per sample, we flatten every image into a vector of 28 * 28 = 784 values.

In [None]:
X_train = X_train.reshape(X_train.shape[0], 784)
X_test = X_test.reshape(X_test.shape[0], 784)

In [None]:
X_train.shape

(60000, 784)
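As a rough illustration of how the flattening orders the pixels, the sketch below reshapes a small stand-in array. The array `toy_images` is hypothetical and filled with counting numbers so that positions are easy to check.

```python
import numpy as np

# Hypothetical stand-in for the image array: 3 "images" of 28x28 pixels.
toy_images = np.arange(3 * 28 * 28, dtype=np.float32).reshape(3, 28, 28)

# -1 lets NumPy infer the flattened length (28 * 28 = 784).
flat = toy_images.reshape(toy_images.shape[0], -1)
print(flat.shape)  # (3, 784)

# Image rows are concatenated in order: pixel (row r, column c) lands at index r * 28 + c.
r, c = 5, 7
print(flat[0, r * 28 + c] == toy_images[0, r, c])  # True
```

Using `-1` instead of a hard-coded `784` is a small design choice that keeps the code valid if the image size changes.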

## 3. Convert label vectors into one-hot encodings

The categorical cross-entropy loss that we use below expects the labels as one-hot encoded vectors, so we convert the integer class labels first. We can do this using the [to_categorical](https://www.tensorflow.org/api_docs/python/tf/keras/utils/to_categorical) function.

In [None]:
# One-hot encode the integer class labels (digits 0-9)
num_classes = 10
Y_train = keras.utils.to_categorical(Y_train, num_classes)
Y_test = keras.utils.to_categorical(Y_test, num_classes)

In [None]:
Y_test

array([[0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 1., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
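The encoding itself is simple enough to sketch by hand in NumPy. This is an illustration of the idea, not how Keras implements it; `toy_labels` is a hypothetical example array.

```python
import numpy as np

num_classes = 10
# Hypothetical integer labels, as they look before one-hot encoding.
toy_labels = np.array([7, 2, 1])

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype=np.float32)[toy_labels]
print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # [0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]

# argmax along the class axis recovers the original integer labels.
recovered = one_hot.argmax(axis=1)
print(recovered)      # [7 2 1]
```

The `argmax` step is the usual way to turn one-hot vectors (or predicted probabilities) back into class indices.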

## 4. Define the MLP model

We can define an MLP using a [Sequential](https://keras.io/api/models/sequential/) model and [Dense](https://keras.io/api/layers/core_layers/dense/) layers. In most cases, we create a Sequential model and then add layers to it one by one.

In [None]:
from keras.models import Sequential
from keras.layers import Dense

In [None]:
model = Sequential()
# Two hidden layers with 50 units each
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
# Softmax turns the 10 outputs into a probability distribution over the classes
model.add(Dense(10, activation='softmax'))
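To make the architecture concrete, here is a NumPy sketch of the forward pass this kind of network computes, assuming 784 inputs as produced by the flattening step. The weights are random stand-ins, and the helper names `relu`, `softmax`, `forward`, and `params` are hypothetical; this illustrates the computation rather than reproducing Keras internals.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row-wise max for numerical stability; the result is unchanged.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
layer_sizes = [784, 50, 50, 10]  # input, two hidden layers, output

# Randomly initialised stand-in weights: each dense layer has a weight matrix and a bias.
params = [(rng.normal(0, 0.05, size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    for W, b in params[:-1]:
        x = relu(x @ W + b)
    W, b = params[-1]
    return softmax(x @ W + b)

probs = forward(rng.random((4, 784)))       # 4 hypothetical flattened images
print(probs.shape)                          # (4, 10)
print(np.allclose(probs.sum(axis=1), 1.0))  # True

n_params = sum(W.size + b.size for W, b in params)
print(n_params)  # 42310
```

The parameter count follows from (784 * 50 + 50) + (50 * 50 + 50) + (50 * 10 + 10) = 42310.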

## 5. Compile the model

Before training a Keras model, we need to compile it to set the training options: the loss function, the optimizer, and the evaluation metrics. Here we use the categorical cross-entropy loss and the stochastic gradient descent (SGD) optimizer. Our evaluation metric is accuracy.

In [None]:
model.compile(loss="categorical_crossentropy", optimizer="SGD", metrics=["accuracy"])
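For intuition about the loss, the following NumPy sketch computes categorical cross-entropy on made-up predictions. The function and the example arrays are hypothetical, and the clipping constant `eps` is an assumption chosen for illustration.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); average the per-sample losses over the batch.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two hypothetical samples over three classes (one-hot targets).
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
confident = np.array([[0.9, 0.05, 0.05],
                      [0.1, 0.8, 0.1]])
uniform = np.full((2, 3), 1.0 / 3.0)

print(categorical_crossentropy(y_true, confident))  # small: correct classes get high probability
print(categorical_crossentropy(y_true, uniform))    # equals log(3) for a uniform guess
```

Because the targets are one-hot, only the probability assigned to the correct class contributes to each sample's loss.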

## 6. Train the model

Now we can train the model using the `fit(...)` method. We specify the number of epochs and the batch size, and we hold out 10% of the training data for validation with `validation_split=0.1`.

In [None]:
batch_size = 128
epochs = 15

In [None]:
model.fit(X_train, Y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<keras.src.callbacks.History at 0x7f24583b3e80>
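The settings above determine how many gradient updates happen in each epoch. The arithmetic can be sketched as follows, assuming the 60000 training rows shown earlier; the variable names are illustrative.

```python
import math

n_samples = 60000        # rows in X_train
validation_split = 0.1
batch_size = 128

# Keras holds out the last fraction of the data (before shuffling) for validation.
n_val = int(n_samples * validation_split)
n_train = n_samples - n_val

# The final batch of an epoch may be smaller than batch_size, hence the ceiling.
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val)   # 54000 6000
print(steps_per_epoch)  # 422
```

With 15 epochs, this comes to 15 * 422 = 6330 parameter updates in total.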

## 7. Evaluate the trained model on the test set

Finally, we can compute the model accuracy on the test set.

In [None]:
score = model.evaluate(X_test, Y_test, verbose=0)
print("Test accuracy:", score[1])

Test accuracy: 0.9330999851226807
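`evaluate` returns the loss followed by the compiled metrics, which is why the accuracy is read from `score[1]`. As a sketch of what the accuracy metric measures, the snippet below computes it by hand on a few made-up predictions; the arrays `probs` and `y_true` are hypothetical.

```python
import numpy as np

# Hypothetical predicted probabilities for 4 samples over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6],
                  [0.2, 0.5, 0.3],
                  [0.6, 0.3, 0.1]])
# One-hot ground-truth labels for classes 0, 2, 1, 1.
y_true = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0],
                   [0, 1, 0]])

# A prediction counts as correct when the most probable class matches the label.
predicted = probs.argmax(axis=1)
actual = y_true.argmax(axis=1)
accuracy = float(np.mean(predicted == actual))
print(accuracy)  # 0.75
```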
