# Deep Learning Project: MNIST Classification using Keras and TensorFlow

# Table of Contents

- [Introduction](#introduction)
- [1. Dataset Preprocessing](#1-dataset-preprocessing)
- [2. Model Creation](#2-model-creation)
- [3. Model Training, Evaluation, and Testing](#3-model-training-evaluation-and-testing)
- [4. Fine-Tuned Model (Hyperparameter Tuning)](#4-fine-tuned-model)
- [5. Summary and Conclusion](#5-summary-and-conclusion)

## Introduction

This project demonstrates the development of a fully connected neural network using Keras and TensorFlow for image classification tasks. We employ the classic MNIST dataset of handwritten digits (0–9) to walk through an end-to-end deep learning workflow.

Key steps in this project include:
- **Dataset Preprocessing:** Loading the MNIST dataset, normalizing pixel values, and splitting the data into training, validation, and test sets.
- **Model Creation:** Building a Multi-Layer Perceptron (MLP) with a flattening input layer, two hidden layers with L2 regularization, Dropout layers to reduce overfitting, and an output layer with softmax activation.
- **Training and Evaluation:** Training the model on the training data, validating its performance during training, evaluating it on unseen test data, and visualizing the training history and predictions.
- **Fine-Tuning:** Using Keras Tuner for hyperparameter optimization, exploring different architectures and parameters, comparing the original and fine-tuned models, and analyzing the differences through metrics and visualizations.

This notebook provides a comprehensive introduction to deep learning techniques with practical insights into model design, training, and optimization.

## 1. Dataset Preprocessing

This section focuses on preparing the MNIST dataset for training our deep learning model.

First, we will load the MNIST dataset directly from Keras, which provides it pre-split into training and testing sets. We will then further split the training set into training and validation sets.

In [1]:
# Import the required libraries
import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import keras_tuner as kt
from sklearn.metrics import confusion_matrix
import seaborn as sns

In [2]:
# Set random seed for reproducibility
tf.random.set_seed(42)
np.random.seed(42)

In [None]:
# Load MNIST dataset
(x_train_full, y_train_full), (x_test, y_test) = keras.datasets.mnist.load_data()

print("Dataset loaded successfully.")

The MNIST dataset is a classic dataset in machine learning and computer vision. It consists of grayscale images of handwritten digits from 0 to 9.

*   **Number of samples**: 60,000 training images and 10,000 test images.
*   **Number of classes**: 10 (digits 0-9).
*   **Input dimensions**: Each image is 28x28 pixels.

Let's explore the shapes and data types of our loaded data.

In [None]:
# Print shapes and data types of training and test datasets to verify dimensions and data format
print("x_train_full shape:", x_train_full.shape)
print("y_train_full shape:", y_train_full.shape)
print("x_test shape:", x_test.shape)
print("y_test shape:", y_test.shape)
print("Data type of x_train_full:", x_train_full.dtype)
print("Data type of y_train_full:", y_train_full.dtype)

Next, we normalize the pixel values to be in the range \[0, 1]. Currently, pixel values are integers in the range \[0, 255]. Normalization helps in faster convergence during training.

In [None]:
# Normalize pixel values to range [0, 1]
x_train_full = x_train_full.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

print("Dataset normalized.")
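As a quick sanity check of the normalization step, the sketch below applies the same scaling to a small synthetic array. The array `fake_images` is a made-up stand-in for MNIST images, not data from this notebook.

```python
import numpy as np

# Synthetic stand-in for two 28x28 uint8 images (not real MNIST data)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)
fake_images[0, 0, 0] = 0      # make sure both extremes are present
fake_images[0, 0, 1] = 255

# Same operation as in the cell above: cast to float32, then divide by 255
scaled = fake_images.astype("float32") / 255.0

print(scaled.dtype, scaled.min(), scaled.max())
```

After scaling, the minimum should be 0.0, the maximum 1.0, and the array shape unchanged.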

Now, let's create a validation set from the training set. Since the original dataset is already split into training and test sets, we split the original training set 80-20: the first 48,000 samples are used for training and the remaining 12,000 for validation.

In [None]:
# Create a validation set out of the training data
num_train = int(len(x_train_full) * 0.8)

# Split training set into training and validation sets
x_train, x_valid = x_train_full[:num_train], x_train_full[num_train:]
y_train, y_valid = y_train_full[:num_train], y_train_full[num_train:]

print("Training, validation, and test sets created.")
print("x_train shape:", x_train.shape)
print("y_train shape:", y_train.shape)
print("x_valid shape:", x_valid.shape)
print("y_valid shape:", y_valid.shape)
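The split sizes can be checked with plain arithmetic. The helper `split_sizes` below is a hypothetical name introduced only for this illustration; it mirrors the `int(len(...) * 0.8)` computation used above.

```python
def split_sizes(n_samples, train_fraction):
    """Return (n_train, n_valid) for a simple head/tail split."""
    n_train = int(n_samples * train_fraction)
    return n_train, n_samples - n_train


# MNIST ships with 60,000 training images
n_train, n_valid = split_sizes(60_000, 0.8)
print(n_train, n_valid)  # 48000 12000
```

These numbers should match the shapes printed by the cell above.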

We will visualize a few training samples and their respective labels to confirm that the dataset is loaded and processed correctly.

In [None]:
# Visualize a few samples
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_train[i], cmap=plt.cm.binary)
    plt.xlabel(y_train[i])
plt.show()

## 2. Model Creation

In this section, we will build a fully connected neural network (Multi-Layer Perceptron - MLP) using Keras for image classification.

Our model architecture is as follows:

1. **Input Layer**: The input images (28x28 pixels) are flattened into a 1D array of 784 elements.
2. **Hidden Layer 1**: A dense layer with 128 neurons using the ReLU activation function along with L2 regularization (lambda=0.01) to promote better generalization.
3. **Dropout Layer**: A dropout layer with a rate of 0.2 to help prevent overfitting by randomly dropping 20% of the neurons during training.
4. **Hidden Layer 2**: A dense layer with 64 neurons using the ReLU activation function and L2 regularization (lambda=0.01).
5. **Dropout Layer**: Another dropout layer with a 20% drop rate.
6. **Output Layer**: A dense layer with 10 neurons (one per class) and a softmax activation function to output probabilities for each class.
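To make the two regularizers more concrete, here is a minimal NumPy sketch of what they compute. It assumes the usual "inverted dropout" formulation (kept activations are rescaled by 1 / (1 - rate) during training) and an L2 penalty equal to lambda times the sum of squared weights. The function names are our own and are not part of Keras.

```python
import numpy as np


def inverted_dropout(activations, rate, rng):
    """Zero a random fraction `rate` of activations and rescale the rest."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)


def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weights."""
    return lam * np.sum(np.square(weights))


rng = np.random.default_rng(42)
activations = np.ones(100_000)
dropped = inverted_dropout(activations, rate=0.2, rng=rng)

# Roughly 20% of the values are zeroed, and the mean stays close to 1
print("fraction zeroed:", np.mean(dropped == 0.0))
print("mean after dropout:", dropped.mean())

# Toy 2x2 weight matrix: 0.01 * (1 + 4 + 9 + 16) = 0.3
toy_weights = np.array([[1.0, 2.0], [3.0, 4.0]])
print("L2 penalty:", l2_penalty(toy_weights, lam=0.01))
```

The rescaling keeps the expected activation unchanged, which is why dropout can simply be switched off at inference time.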

Below is the Keras code used to build this model:

In [None]:
# Build the model
model = keras.Sequential([
    # Flatten the 28x28 input images into a 1D array
    keras.layers.Flatten(input_shape=(28, 28)),
    # First dense layer with 128 neurons, ReLU activation and L2 regularization
    keras.layers.Dense(128, activation='relu', kernel_regularizer=keras.regularizers.L2(0.01)),
    # Dropout layer to prevent overfitting by randomly dropping 20% of neurons
    keras.layers.Dropout(0.2),
    # Second dense layer with 64 neurons, ReLU activation and L2 regularization
    keras.layers.Dense(64, activation='relu', kernel_regularizer=keras.regularizers.L2(0.01)),
    # Another dropout layer with 20% drop rate
    keras.layers.Dropout(0.2),
    # Output layer with 10 neurons (for 10 classes) and softmax activation
    keras.layers.Dense(10, activation='softmax')
])

# Model summary to see the architecture
model.summary()
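The parameter counts reported by `model.summary()` can be cross-checked by hand: a dense layer with `n_inputs` inputs and `n_units` neurons has `n_inputs * n_units` weights plus `n_units` biases, while Flatten and Dropout layers have no trainable parameters. The helper name `dense_params` is ours.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


# Flattened input (784) -> 128 -> 64 -> 10
layer_sizes = [28 * 28, 128, 64, 10]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)  # [100480, 8256, 650] 109386
```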

We will now compile the model. For compilation, we need to specify:

*   **Loss function**: `sparse_categorical_crossentropy` is used because we have sparse labels (integers) and multiple classes.
*   **Optimizer**: `adam` is a popular and efficient optimizer.
*   **Metrics**: `accuracy` to evaluate the performance of the model.

In [None]:
# Compile the model
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

print("Model compiled.")
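For intuition about the loss, the following NumPy sketch computes softmax probabilities and the sparse categorical cross-entropy for integer labels. It is a simplified illustration: the Keras implementation additionally clips probabilities for numerical stability, and the function names here are our own.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def sparse_categorical_crossentropy(probs, labels):
    """Mean of -log(probability assigned to the true class)."""
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(picked)))


# A model that outputs a uniform distribution over 10 classes has loss ln(10)
uniform_probs = softmax(np.zeros((1, 10)))
uniform_loss = sparse_categorical_crossentropy(uniform_probs, np.array([3]))
print(uniform_loss)  # ln(10), about 2.3026

# A toy 3-class prediction that puts 0.7 on the correct class
confident_probs = np.array([[0.7, 0.2, 0.1]])
confident_loss = sparse_categorical_crossentropy(confident_probs, np.array([0]))
print(confident_loss)  # -ln(0.7), about 0.3567
```

A cross-entropy near 2.3 on this 10-class problem therefore corresponds to chance-level predictions. Note that the loss reported by Keras during training also includes the L2 penalty terms.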

## 3. Model Training, Evaluation, and Testing

Now, we will train the model using the training data and validate it using the validation data. We will train for 20 epochs and use a batch size of 64.

In [None]:
# Define training hyperparameters
epochs = 20      # Number of complete passes through the training dataset
batch_size = 64  # Number of samples processed before model update

# Train the model
history = model.fit(
    x_train, y_train, 
    epochs=epochs, 
    batch_size=batch_size,
    validation_data=(x_valid, y_valid),
    verbose = 1
)

print("Model training completed.")
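The step counter shown in the training log follows directly from these settings. A small calculation, assuming the 48,000-sample training split created earlier:

```python
import math

n_train = 48_000   # 80% of the 60,000 MNIST training images
batch_size = 64
epochs = 20

# Each epoch processes every training sample once, in batches of 64
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)  # 750 15000
```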

### Visualizing Loss, Accuracy and Training History

The visualization of training history, including loss and accuracy metrics, is crucial for understanding model performance. By plotting these metrics over epochs, we can:

1. Monitor Training Progress:
   - Track how well the model is learning
   - Identify potential overfitting or underfitting
   - Determine optimal number of epochs

2. Key Metrics to Visualize:
   - Training Loss
   - Validation Loss
   - Training Accuracy
   - Validation Accuracy

3. Interpretation:
   - Decreasing loss indicates model improvement
   - Diverging training/validation metrics may signal overfitting
   - Plateauing metrics suggest learning saturation
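These checks can also be done numerically. The sketch below finds the epoch with the lowest validation loss and the final gap between training and validation accuracy. The `toy_history` values are made up for illustration and are not results from this notebook; with a real run, `history.history` could be passed instead.

```python
def summarize_history(history):
    """Return (best epoch by validation loss, final train/val accuracy gap)."""
    val_loss = history["val_loss"]
    best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
    final_gap = history["accuracy"][-1] - history["val_accuracy"][-1]
    return best_index + 1, final_gap  # epochs are reported 1-based


# Made-up curves: validation loss bottoms out at epoch 3, then rises again
toy_history = {
    "loss": [0.90, 0.60, 0.45, 0.35, 0.28],
    "val_loss": [0.85, 0.62, 0.55, 0.57, 0.61],
    "accuracy": [0.75, 0.84, 0.89, 0.92, 0.95],
    "val_accuracy": [0.77, 0.83, 0.86, 0.86, 0.85],
}

best_epoch, final_gap = summarize_history(toy_history)
print(best_epoch, round(final_gap, 2))  # 3 0.1
```

A validation loss that starts rising while the training loss keeps falling, as in this toy example, is the typical overfitting pattern described above.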




Visualize the training and validation accuracy over epochs to analyze model performance.

In [None]:
# Validation and Training Accuracy
plt.figure(figsize=(8, 5))
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()
plt.grid(True)
plt.show()

Visualize the training and validation loss curves to assess model performance and detect potential overfitting.

In [None]:
# Validation and Training Loss
plt.figure(figsize=(8, 5))
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.grid(True)
plt.show()

### Model Evaluation
After training, we evaluate the model on the test set to measure how well it generalizes to unseen data, then display its predictions for a few test samples.

In [None]:
# Evaluate the model on the test set
test_loss, test_accuracy = model.evaluate(x_test, y_test)
print(f"Test Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_accuracy:.4f}")

Let's visualize the model's predictions on some test samples.

This code snippet demonstrates how well our model performs by:
1. Making predictions on the first 10 test images
2. Displaying these images in a 2x5 grid
3. Showing both the predicted label and true label for each image, in green when they match and red when they differ
4. Using a grayscale colormap to display the MNIST digits

The visualization helps us quickly assess if the model's predictions match the actual digits.

In [None]:
# Make predictions for the first 10 test samples
predictions = model.predict(x_test[:10])
predicted_labels = np.argmax(predictions, axis=1)
true_labels = y_test[:10]

# Display predictions vs true labels
plt.figure(figsize=(12, 6))
for i in range(10):
    plt.subplot(2, 5, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_test[i], cmap='gray')
    color = 'green' if predicted_labels[i] == true_labels[i] else 'red'
    plt.xlabel(f"Pred: {predicted_labels[i]}\nTrue: {true_labels[i]}", color=color)
plt.show()
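The `np.argmax(..., axis=1)` call above turns each row of class probabilities into a single predicted label. A small sketch with made-up 3-class probabilities (not model output) shows the mechanics and how accuracy follows from it:

```python
import numpy as np

# Made-up probabilities for four samples over three classes
toy_probs = np.array([
    [0.05, 0.90, 0.05],
    [0.60, 0.30, 0.10],
    [0.20, 0.30, 0.50],
    [0.40, 0.35, 0.25],
])
toy_true = np.array([1, 0, 2, 1])

# Index of the largest probability in each row is the predicted class
toy_pred = np.argmax(toy_probs, axis=1)

# Accuracy is the fraction of predictions that match the true labels
toy_accuracy = float(np.mean(toy_pred == toy_true))

print(toy_pred, toy_accuracy)  # [1 0 2 0] 0.75
```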

## 4. Fine-Tuned Model

In this section, we will fine-tune our model using Keras Tuner to systematically search for optimal hyperparameters. We'll define a search space for various hyperparameters and let the tuner find the best combination.

We define a model builder function that allows the tuner to explore:

- Architecture Parameters:
  - Number of units in dense layers (32 to 512 units, step of 32)
  - Dropout rates (0.1 to 0.5, step of 0.1)
  - L2 regularization values (0.01, 0.1, 1.0)

- Training Parameters:
  - Choice of optimizers (adam, sgd, rmsprop)

The tuner will:

- Perform random search through the hyperparameter space
- Execute multiple trials with different combinations
- Evaluate performance using validation accuracy
- Select the best performing configuration

We will then:

- Train the best model configuration
- Compare its performance with the original model
- Visualize the differences in:
  - Training and validation accuracy
  - Training and validation loss
  - Prediction accuracy on test samples
- Create detailed comparisons of model performances

This systematic approach to hyperparameter tuning should help us identify a better model configuration than our initial implementation.

This section demonstrates the importance of hyperparameter tuning in deep learning and provides practical experience with automated tuning tools.

In [16]:
# Define hyperparameter space
def model_builder(hp):
    model = keras.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        # First hidden block: tunable width, L2 strength, and dropout rate
        keras.layers.Dense(
            units=hp.Int('dense_units_1', min_value=32, max_value=512, step=32),
            activation='relu',
            kernel_regularizer=keras.regularizers.L2(
                hp.Choice('l2_value_1', values=[0.01, 0.1, 1.0]))),
        keras.layers.Dropout(
            rate=hp.Float('dropout_rate_1', min_value=0.1, max_value=0.5, step=0.1)),
        # Second hidden block with its own independent hyperparameters
        keras.layers.Dense(
            units=hp.Int('dense_units_2', min_value=32, max_value=512, step=32),
            activation='relu',
            kernel_regularizer=keras.regularizers.L2(
                hp.Choice('l2_value_2', values=[0.01, 0.1, 1.0]))),
        keras.layers.Dropout(
            rate=hp.Float('dropout_rate_2', min_value=0.1, max_value=0.5, step=0.1)),
        keras.layers.Dense(10, activation='softmax')
    ])
    # The optimizer is tuned as well; each choice uses its default settings
    model.compile(optimizer=hp.Choice('optimizer', values=['adam', 'sgd', 'rmsprop']),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

In [None]:
# Perform hyperparameter tuning
tuner = kt.RandomSearch(model_builder,
                        objective='val_accuracy',
                        max_trials=10,
                        executions_per_trial=2,
                        directory='hyperparameter_tuning',
                        project_name='mnist_tuning'
)
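To put `max_trials=10` in perspective, we can count how many distinct configurations the search space contains. This is a back-of-the-envelope sketch that assumes both endpoints of each range are included; the variable names are ours.

```python
# Values each hyperparameter can take (endpoints assumed inclusive)
units_choices = list(range(32, 512 + 1, 32))                  # 16 values
dropout_choices = [round(0.1 * k, 1) for k in range(1, 6)]    # 5 values
l2_choices = [0.01, 0.1, 1.0]                                 # 3 values
optimizer_choices = ["adam", "sgd", "rmsprop"]                # 3 values

# Each hidden block has its own units, L2, and dropout hyperparameters
per_block = len(units_choices) * len(l2_choices) * len(dropout_choices)
total_configs = per_block ** 2 * len(optimizer_choices)

# Settings passed to the tuner above
max_trials = 10
executions_per_trial = 2
training_runs = max_trials * executions_per_trial
coverage = max_trials / total_configs

print(total_configs, training_runs)  # 172800 20
print(f"fraction of space sampled: {coverage:.6f}")
```

With 10 trials, the random search samples only a tiny fraction of the space, so the "best" configuration is best among those tried rather than a global optimum. Raising `max_trials` trades compute time for broader coverage.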

In [None]:
# Display a summary of the search space 
tuner.search_space_summary()

In [None]:
# Search for the best hyperparameters using training and validation data
tuner.search(x_train, y_train,
             epochs=10,
             validation_data=(x_valid, y_valid),
             verbose=2)

In [None]:
# Display a summary of the results 
tuner.results_summary() 

In [None]:
# Get the best hyperparameters found by the search
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print("Best hyperparameters:")
for key, value in best_hps.values.items():
    print(f"{key}: {value}")

In [None]:
# Build a fresh, untrained model from the best hyperparameters
fine_tuned_model = tuner.hypermodel.build(best_hps)

# Train fine-tuned model
fine_tuned_history = fine_tuned_model.fit(
    x_train, y_train,
    epochs=20,
    validation_data=(x_valid, y_valid),
    verbose=1
)

In [None]:
# Evaluate the fine-tuned model on the test set
fine_tuned_test_loss, fine_tuned_test_accuracy = fine_tuned_model.evaluate(x_test, y_test)
print(f"Fine-tuned Test Loss: {fine_tuned_test_loss:.4f}")
print(f"Fine-tuned Test Accuracy: {fine_tuned_test_accuracy:.4f}")

With the fine-tuned model trained and evaluated, we now visualize its training history and predictions, and compare its performance with the original model.

In [None]:
# Visualize training history of fine-tuned model
# DataFrame.plot() opens its own figure, so a separate plt.figure() call would only leave an empty one
pd.DataFrame(fine_tuned_history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.gca().set_ylim(0, 1) # Set the vertical range to [0-1]
plt.xlabel("Epochs")
plt.ylabel("Value")
plt.title('Fine-tuned Model Training and Validation Accuracy and Loss')

plt.show()

In [None]:
# Make predictions with the fine-tuned model for the first 10 test samples
fine_tuned_predictions = fine_tuned_model.predict(x_test[:10])
fine_tuned_predicted_labels = np.argmax(fine_tuned_predictions, axis=1)

# Display predictions vs true labels for fine-tuned model
plt.figure(figsize=(12, 6))
for i in range(10):
    plt.subplot(2, 5, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_test[i], cmap=plt.cm.binary)
    plt.xlabel(f"Predicted: {fine_tuned_predicted_labels[i]}\nTrue: {true_labels[i]}")
plt.suptitle('Fine-tuned Model Predictions')  # suptitle labels the whole grid, not just the last subplot
plt.show()

In [None]:
# Compare model performance
comparison_data = {
    "Model": ["Original Model", "Fine-Tuned Model"],
    "Test Accuracy": [test_accuracy, fine_tuned_test_accuracy],
    "Test Loss": [test_loss, fine_tuned_test_loss]
}

comparison_df = pd.DataFrame(comparison_data)
print(comparison_df.to_string(index=False))

In [None]:
# Visualize and compare validation loss between original and fine-tuned models
plt.figure(figsize=(12, 4))

plt.plot(history.history['val_loss'], label='Original Model Val Loss')
plt.plot(fine_tuned_history.history['val_loss'], label='Tuned Model Val Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.title('Validation Loss Comparison')

plt.show()

In [None]:
# Visualize and compare validation accuracy between original and fine-tuned models
plt.figure(figsize=(12, 4))

plt.plot(history.history['val_accuracy'], label='Original Model Val Accuracy')
plt.plot(fine_tuned_history.history['val_accuracy'], label='Tuned Model Val Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.title('Validation Accuracy Comparison')

plt.show()

In [None]:
# Visualize and compare training loss between original and fine-tuned models
plt.figure(figsize=(12, 4))

plt.plot(history.history['loss'], label='Original Model training loss')
plt.plot(fine_tuned_history.history['loss'], label='Tuned Model training loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.title('Training Loss Comparison')

plt.show()

In [None]:
# Visualize and compare training accuracy between original and fine-tuned models
plt.figure(figsize=(12, 4))

plt.plot(history.history['accuracy'], label='Original Model training accuracy')
plt.plot(fine_tuned_history.history['accuracy'], label='Tuned Model training accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.title('Training Accuracy Comparison')

plt.show()

In [None]:
# Predictions comparison
fine_tuned_predictions = fine_tuned_model.predict(x_test)

print("\nPredictions Comparison (First 10 Test Samples):")
for i in range(10):
    original_predicted = np.argmax(predictions[i])
    fine_tuned_predicted = np.argmax(fine_tuned_predictions[i])
    true_label = y_test[i]
    print(f"Sample {i+1}: Original Predicted: {original_predicted}, Tuned Predicted: {fine_tuned_predicted}, True: {true_label}")

### Confusion Matrix

To further analyze model performance, we present a confusion matrix for the test set. This helps identify which digits are most frequently misclassified.

In [None]:
# Get predictions for the entire test set
y_pred = np.argmax(model.predict(x_test), axis=1)

# Compute confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Plot confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', cbar=False)
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix - Original Model')
plt.show()

For the fine-tuned model, we repeat the same steps with `fine_tuned_model`.

In [None]:
y_pred_tuned = np.argmax(fine_tuned_model.predict(x_test), axis=1)
cm_tuned = confusion_matrix(y_test, y_pred_tuned)
plt.figure(figsize=(8, 6))
sns.heatmap(cm_tuned, annot=True, fmt='d', cmap='Greens', cbar=False)
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix - Fine-Tuned Model')
plt.show()
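Beyond the heatmap, a confusion matrix can be summarized numerically. The sketch below uses a made-up 3-class matrix (rows are true labels, columns are predicted labels, matching the scikit-learn convention) to compute per-class recall and precision and to locate the most frequent confusion. With real results, `cm` or `cm_tuned` could be used in place of `toy_cm`.

```python
import numpy as np

# Made-up confusion matrix: rows = true class, columns = predicted class
toy_cm = np.array([
    [50, 2, 3],
    [4, 40, 6],
    [1, 9, 45],
])

# Recall: correct predictions divided by the number of true samples per class
recall = np.diag(toy_cm) / toy_cm.sum(axis=1)
# Precision: correct predictions divided by the number of predictions per class
precision = np.diag(toy_cm) / toy_cm.sum(axis=0)

# Largest off-diagonal entry = most frequent (true, predicted) confusion
off_diag = toy_cm.copy()
np.fill_diagonal(off_diag, 0)
true_cls, pred_cls = np.unravel_index(np.argmax(off_diag), off_diag.shape)

print("recall:", np.round(recall, 3))
print("precision:", np.round(precision, 3))
print("most confused (true, predicted):", (int(true_cls), int(pred_cls)))
```

In this toy matrix, class 2 is most often mistaken for class 1 (9 samples).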

### Misclassified Examples

Below, we display a few test images where the model's prediction did not match the true label. This can provide insights into common sources of error and potential areas for improvement.

In [None]:
# Find misclassified indices for the original model
misclassified_idx = np.where(y_pred != y_test)[0]

# Display a few misclassified examples
num_to_show = 10
plt.figure(figsize=(15, 4))
for i, idx in enumerate(misclassified_idx[:num_to_show]):
    plt.subplot(2, 5, i+1)
    plt.imshow(x_test[idx], cmap='gray')
    plt.title(f"True: {y_test[idx]}, Pred: {y_pred[idx]}", color='red')
    plt.axis('off')
plt.suptitle('Misclassified Examples - Original Model')
plt.tight_layout(rect=[0, 0, 1, 0.92])
plt.show()

For the fine-tuned model, we do the same using `y_pred_tuned`.

In [None]:
misclassified_idx_tuned = np.where(y_pred_tuned != y_test)[0]
plt.figure(figsize=(15, 4))
for i, idx in enumerate(misclassified_idx_tuned[:num_to_show]):
    plt.subplot(2, 5, i+1)
    plt.imshow(x_test[idx], cmap='gray')
    plt.title(f"True: {y_test[idx]}, Pred: {y_pred_tuned[idx]}", color='red')
    plt.axis('off')
plt.suptitle('Misclassified Examples - Fine-Tuned Model')
plt.tight_layout(rect=[0, 0, 1, 0.92])
plt.show()

## 5. Summary and Conclusion

In this project, we developed and fine-tuned a fully connected neural network for classifying handwritten digits from the MNIST dataset using Keras and TensorFlow. We implemented a systematic approach to model development, training, and optimization.

**Key Accomplishments:**

1. Data Preparation

    - Successfully preprocessed the MNIST dataset
    - Implemented proper data normalization
    - Created appropriate training, validation, and test splits
    - Visualized sample data for better understanding

2. Initial Model Development

    - Implemented a multi-layer perceptron architecture
    - Incorporated regularization techniques (L2 and dropout)
    - Achieved baseline performance for comparison

3. Hyperparameter Optimization

    - Implemented systematic hyperparameter tuning using Keras Tuner
    - Explored various model configurations including:
        - Different network architectures (32-512 units per layer)
        - Various dropout rates (0.1-0.5)
        - Different optimizers (adam, sgd, rmsprop)
        - Different L2 regularization values

4. Performance Analysis

    - Conducted thorough comparison between original and tuned models
    - Visualized training and validation metrics
    - Analyzed prediction accuracy on test samples

**Potential Areas for Improvement:**

1. Model Architecture

    - Experiment with different network architectures
    - Consider implementing Convolutional Neural Networks (CNNs)
    - Try different activation functions

2. Training Strategy

    - Implement learning rate scheduling
    - Explore different optimization algorithms
    - Try different batch sizes and training durations

3. Data Enhancement

    - Implement data augmentation techniques
    - Explore different preprocessing methods
    - Consider using additional datasets

4. Regularization

    - Test different dropout patterns
    - Experiment with other regularization techniques
    - Implement early stopping strategies

This project provides a solid foundation for understanding and implementing deep learning workflows for image classification. By systematically exploring the areas for improvement, you can further enhance the model's performance and gain deeper insights into deep learning practices.