# Demo 2 - Train a basic MLP on the MNIST dataset


## **Scenario: Handwritten Digit Recognition**
A startup working on digitizing old handwritten documents wants to build a fast, reliable model to automatically recognize handwritten digits (0–9).
They aim to use a simple Multilayer Perceptron (MLP) model to classify images from the MNIST dataset (which contains 28x28 grayscale images of handwritten digits).
Since this is an early prototype, the focus is on building and training a basic MLP without using complex architectures like CNNs.

## **Objectives:**
* Build and train a basic Multilayer Perceptron (MLP) model from scratch (or using PyTorch/Keras basic layers).

* Achieve at least 90% training accuracy.

* Understand the impact of hidden layers and activation functions (like ReLU) on model performance.

* Evaluate the model using metrics such as accuracy and loss curves.

## Step 1: Import Required Libraries
Import TensorFlow/Keras for loading the data and building the model, and Matplotlib for plotting the training curves.

In [None]:
import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt

## Step 2: Load and Prepare the MNIST Dataset
Load the MNIST handwritten digit dataset from Keras datasets.

Normalize pixel values from [0, 255] to the range [0, 1], which typically helps gradient-based training converge faster.


In [None]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
# Normalize the pixel values
x_train = x_train / 255.0
x_test = x_test / 255.0
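
To make the effect of the division concrete, here is a minimal sketch on a tiny hypothetical array (not real MNIST data). It assumes NumPy is available and uses the same `uint8` dtype that the MNIST pixels are stored in.

```python
import numpy as np

# Hypothetical 2x2 "image" using the uint8 dtype of raw MNIST pixels.
image = np.array([[0, 51], [204, 255]], dtype=np.uint8)

# Dividing by a float converts the array to floating point
# and rescales every pixel from [0, 255] to [0, 1].
scaled = image / 255.0

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```

The result is `float64`; casting with `.astype("float32")` is a common optional step to halve memory use, though the notebook above does not do this.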

## Step 3: Flatten the Images
Flatten each 28x28 image into a 784-dimensional vector.

This is necessary because MLPs expect 1D feature vectors as input.

In [None]:
x_train = x_train.reshape(-1, 28*28)
x_test = x_test.reshape(-1, 28*28)
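
The reshape can be checked on a small stand-in batch. This sketch assumes NumPy and uses a hypothetical batch of three blank images; it shows that `-1` lets NumPy infer the batch size and that pixels are laid out row by row.

```python
import numpy as np

# Hypothetical batch of 3 blank "images" with MNIST's 28x28 shape.
batch = np.zeros((3, 28, 28))
batch[0, 1, 2] = 1.0  # mark one pixel: row 1, column 2 of the first image

# -1 asks NumPy to infer the batch dimension from the remaining size.
flat = batch.reshape(-1, 28 * 28)

print(flat.shape)        # (3, 784)
print(flat[0].argmax())  # 30, i.e. row 1 * 28 + column 2
```

Flattening discards the 2D neighbourhood structure of the image, which is one reason CNNs usually outperform MLPs on image data.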

## Step 4: Build the MLP Model
Create a simple MLP with one hidden layer using ReLU activation.

The output layer has 10 units (one for each digit) with softmax activation, so its outputs can be read as class probabilities.

In [None]:
model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(784,)),  # Hidden layer with 128 neurons
    layers.Dense(10, activation='softmax')                     # Output layer with 10 classes
])
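
As a sanity check on the model size, the number of trainable parameters can be worked out by hand. The helper `dense_params` below is a hypothetical name introduced for illustration; it counts one weight per input-unit pair plus one bias per unit, which is how a fully connected layer is parameterized.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 128)  # hidden layer: 784 inputs -> 128 units
output = dense_params(128, 10)   # output layer: 128 inputs -> 10 units
total = hidden + output

print(hidden, output, total)  # 100480 1290 101770
```

Calling `model.summary()` on the model above should report the same total.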

## Step 5: Compile the Model
Define the optimizer, loss function, and evaluation metric for training.

In [None]:
model.compile(
    optimizer='adam',                          # Adam optimizer for faster convergence
    loss='sparse_categorical_crossentropy',     # Multi-class loss for integer labels (0-9)
    metrics=['accuracy']                        # Evaluate using accuracy
)
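
To show what this loss measures, here is a simplified pure-Python sketch of softmax followed by sparse categorical cross-entropy for a single sample. The function names and the three-class scores are illustrative assumptions; Keras's own implementation is batched and also clips probabilities for numerical safety.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(label, probs):
    """Negative log-probability assigned to the true integer label."""
    return -math.log(probs[label])

# Hypothetical scores for a 3-class problem; class 0 is favoured.
probs = softmax([2.0, 1.0, 0.1])
loss_correct = sparse_categorical_crossentropy(0, probs)  # true label is the favoured class
loss_wrong = sparse_categorical_crossentropy(2, probs)    # true label is the least likely class

# A model that guesses uniformly over 10 digits has loss ln(10).
chance_loss = sparse_categorical_crossentropy(3, [0.1] * 10)

print(loss_correct < loss_wrong)  # True
print(round(chance_loss, 4))      # 2.3026
```

The "sparse" variant takes integer labels such as `7` directly, which matches the `y_train` array loaded above; the non-sparse variant would need one-hot encoded labels instead.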

## Step 6: Train the Model
Train the model for 10 epochs, holding out 10% of the training data for validation.

Store the training history to visualize the learning curves.

In [None]:
history = model.fit(x_train, y_train, epochs=10, validation_split=0.1)
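
The arithmetic behind this call can be sketched as follows. It assumes the standard 60,000-image MNIST training set and the Keras default batch size of 32, since `batch_size` is not passed to `fit()` above.

```python
import math

n_samples = 60000       # size of the MNIST training set
validation_split = 0.1  # fraction held out, as in the fit() call
batch_size = 32         # Keras default when batch_size is not given

# Keras holds out the last fraction of the samples for validation.
n_train = math.floor(n_samples * (1 - validation_split))
n_val = n_samples - n_train

# Number of gradient updates (batches) per epoch.
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, steps_per_epoch)  # 54000 6000 1688
```

Note that the validation samples are taken from the end of the arrays before any shuffling, so they are never used for weight updates during training.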

## Step 8: Plot Accuracy and Loss Curves
Visualize how the model's accuracy and loss evolved on the training and validation data during training.

In [None]:
plt.figure(figsize=(12, 5))

# Accuracy curve
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Accuracy over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Loss curve
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Loss over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.show()