## MNIST Digit Recognition Model Training

This notebook trains a simple neural network model for classifying handwritten digits from the MNIST dataset using Keras 3.

### 1. Setup and Imports

In [1]:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Input, Dense, Flatten
from keras.utils import to_categorical
import numpy as np
from pathlib import Path

print(f"Keras version: {keras.__version__}")

Keras version: 3.4.1


### 2. Load and Preprocess the MNIST Dataset

In [2]:
# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values to be between 0 and 1
x_train, x_test = x_train / 255.0, x_test / 255.0

# One-hot encode the labels
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

print(f"x_train shape: {x_train.shape}")
print(f"y_train shape: {y_train.shape}")
print(f"x_test shape: {x_test.shape}")
print(f"y_test shape: {y_test.shape}")

x_train shape: (60000, 28, 28)
y_train shape: (60000, 10)
x_test shape: (10000, 28, 28)
y_test shape: (10000, 10)
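
To make the two preprocessing steps concrete, the following sketch reproduces them in plain Python for a single label and a single pixel. The helper names `one_hot` and `scale_pixel` are hypothetical and exist only for illustration; the notebook itself relies on `to_categorical` and array division.

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside [0, {num_classes})")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def scale_pixel(value):
    """Map an 8-bit pixel intensity (0..255) into the range [0.0, 1.0]."""
    return value / 255.0


encoded = one_hot(7)
print(encoded)                            # 1.0 only at index 7
print(encoded.index(1.0))                 # recovers the original label
print(scale_pixel(0), scale_pixel(255))   # lower and upper bounds
```

Each label becomes a length-10 vector, which is why `y_train` has shape `(60000, 10)` after encoding.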


### 3. Define the Model

The model is a simple `Sequential` stack: an `Input` layer for 28x28 images, a `Flatten` layer, one hidden `Dense` layer (128 units, ReLU), and a 10-unit `Dense` output layer with softmax activation, one unit per digit class.

In [3]:
model = Sequential([
    Input(shape=(28, 28), name="input_layer"), # Keras 3 style: Input as a layer
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

model.summary()
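
The output of `model.summary()` is not reproduced here, but the parameter counts it reports can be worked out by hand from the layer sizes. The sketch below does that arithmetic; `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(inputs, units):
    """A Dense layer has inputs * units weights plus one bias per unit."""
    return inputs * units + units


flattened = 28 * 28                     # Flatten yields 784 features, no parameters
hidden = dense_params(flattened, 128)   # 784 * 128 + 128
output = dense_params(128, 10)          # 128 * 10 + 10
total = hidden + output

print(f"hidden: {hidden}, output: {output}, total: {total}")
# hidden: 100480, output: 1290, total: 101770
```

Nearly all of the parameters sit in the hidden layer, since it connects every one of the 784 input pixels to each of its 128 units.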

### 4. Compile the Model

In [4]:
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
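
For intuition about the loss chosen here, the sketch below computes softmax and categorical cross-entropy for one sample in plain Python. It is an illustration of the formulas, not Keras's implementation; the function names and the `eps` guard against `log(0)` are assumptions made for this example.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# An untrained model that scores every digit equally predicts 0.1 per class.
probs = softmax([0.0] * 10)
target = [0.0] * 10
target[7] = 1.0                              # one-hot label for the digit 7

loss = categorical_crossentropy(target, probs)
print(f"{loss:.4f}")                         # ln(10), about 2.3026
```

With a one-hot target, the sum collapses to the negative log of the probability assigned to the true class, so a loss near ln(10) corresponds to guessing uniformly across the 10 digits.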

### 5. Train the Model

In [5]:
epochs = 5
batch_size = 32

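# Note: the test set is reused as validation data here; hold out a separate
# validation split if these metrics are used for tuning or early stopping.
history = model.fit(x_train, y_train,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_data=(x_test, y_test))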

Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 875us/step - accuracy: 0.8820 - loss: 0.4222 - val_accuracy: 0.9582 - val_loss: 0.1421
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 806us/step - accuracy: 0.9666 - loss: 0.1146 - val_accuracy: 0.9712 - val_loss: 0.0944
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 1s 789us/step - accuracy: 0.9782 - loss: 0.0754 - val_accuracy: 0.9743 - val_loss: 0.0801
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 824us/step - accuracy: 0.9814 - loss: 0.0580 - val_accuracy: 0.9757 - val_loss: 0.0778
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 839us/step - accuracy: 0.9871 - loss: 0.0430 - val_accuracy: 0.9778 - val_loss: 0.0717


### 6. Evaluate the Model

In [6]:
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f'\nTest accuracy: {test_acc:.4f}')
print(f'Test loss: {test_loss:.4f}')

313/313 - 0s - 449us/step - accuracy: 0.9778 - loss: 0.0717

Test accuracy: 0.9778
Test loss: 0.0717


### 7. Save the Model

The model is saved in Keras's native `.keras` format under `../app/models/`. That path is relative to this notebook's directory (`notebooks/`), so it resolves to `project_root/app/models/`.

In [7]:
# Define the directory to save the model
# This relative path is resolved against the current working directory,
# assumed here to be the notebook's folder (notebooks/)
model_dir = Path('../app/models')

# Create the directory if it doesn't exist
model_dir.mkdir(parents=True, exist_ok=True)

# Define the full path for the model file
model_filename = 'mnist_model.keras'
model_path = model_dir / model_filename

# Save the model
model.save(model_path)
print(f'Model saved to: {model_path.resolve()}')

Model saved to: D:\Abu Hassan\Documents\Python Projects\ml-deployment\app\models\mnist_model.keras
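
As a rough illustration of how the relative path resolves, the sketch below normalizes it against an assumed working directory. The `project_root/notebooks` location is an assumption taken from the description above, and `posixpath` is used only to keep the printed result identical across platforms.

```python
import posixpath

notebook_dir = "project_root/notebooks"   # assumed working directory of the notebook
relative_model_dir = "../app/models"      # same relative path as in the cell above

resolved = posixpath.normpath(
    posixpath.join(notebook_dir, relative_model_dir, "mnist_model.keras")
)
print(resolved)  # project_root/app/models/mnist_model.keras
```

If the kernel is started from a different directory, the same relative path lands somewhere else, which is why printing `model_path.resolve()` after saving is a useful check.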


### 8. Test Loading the Saved Model and Make a Prediction (Optional)

In [8]:
try:
    loaded_model = keras.saving.load_model(model_path)
    print(f"Successfully loaded model from {model_path.resolve()}")
    
    # Take a sample from the test set
    sample_image = x_test[0]
    sample_label = y_test[0]
    
    # Model expects a batch of images, so add a batch dimension
    sample_image_batch = np.expand_dims(sample_image, axis=0)
    print(f"Sample image batch shape: {sample_image_batch.shape}")
    
    # Make a prediction
    prediction = loaded_model.predict(sample_image_batch)
    predicted_class = np.argmax(prediction[0])
    actual_class = np.argmax(sample_label)
    
    print(f"Predicted probabilities: {prediction[0]}")
    print(f"Predicted class: {predicted_class}")
    print(f"Actual class: {actual_class}")
    
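    # Report a mismatch directly; an assert here would be caught by the broad
    # except below and misreported as a loading error.
    if predicted_class == actual_class:
        print("Test prediction matches actual label.")
    else:
        print("Prediction mismatch on test sample!")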
    
except Exception as e:
    print(f"Error loading or testing model: {e}")

Successfully loaded model from D:\Abu Hassan\Documents\Python Projects\ml-deployment\app\models\mnist_model.keras
Sample image batch shape: (1, 28, 28)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step
Predicted probabilities: [1.0736654e-07 7.5545464e-10 6.7390797e-06 3.0726598e-05 1.0052134e-12
 1.0652690e-08 2.5737731e-14 9.9996078e-01 4.6570034e-07 1.2104141e-06]
Predicted class: 7
Actual class: 7
Test prediction matches actual label.
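
The predicted class is simply the index of the largest probability. The sketch below repeats that step in plain Python on the values printed above, standing in for `np.argmax`:

```python
# Probabilities copied from the output above (one value per digit 0-9).
probs = [1.0736654e-07, 7.5545464e-10, 6.7390797e-06, 3.0726598e-05, 1.0052134e-12,
         1.0652690e-08, 2.5737731e-14, 9.9996078e-01, 4.6570034e-07, 1.2104141e-06]

# Index of the maximum entry, equivalent to np.argmax for a 1-D sequence.
predicted = max(range(len(probs)), key=probs.__getitem__)

print(predicted)             # 7
print(round(sum(probs), 4))  # 1.0, as expected from a softmax output
```

Here the model puts about 99.996% of its probability mass on the digit 7, so this sample is an easy case.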
