# Model Evaluation and Comparison for MNIST

This notebook evaluates and compares the performance of three different classifiers implemented from scratch:
1. **Softmax Regression**
2. **Neural Network (Baseline)**
3. **Neural Network (Tuned)**

We will use the utility functions from `src/utils.py` to generate reports, confusion matrices, and other visualizations.

## 1. Setup and Data Loading

In [1]:
import numpy as np
import sys
sys.path.append('..') # Add the project root to the path

# Import our models and utility functions
from src.data_loader import load_mnist
from src.softmax_regression import SoftmaxRegression
from src.neural_network import NeuralNetwork
from src.utils import evaluate_model, plot_learning_curves

# Load the data
print("Loading data...")
X_train, X_test, y_train, y_test = load_mnist()

# Define class names for plotting
class_names = [str(i) for i in range(10)]

print("Data loaded successfully.")

ModuleNotFoundError: No module named 'seaborn'

# Experiment 0: One-vs-Rest Logistic Regression 

In [None]:
from src.logistic_regression import LogisticRegression, OneVsRestClassifier

# Initialize and train the OvR model.

lr_base = LogisticRegression(learning_rate=0.1, n_iterations=1000, regularization_lambda=0.01)
ovr_model = OneVsRestClassifier(base_classifier=lr_base, n_classes=10)

# The fit method in this class already prints progress, so we just call it.
ovr_model.fit(X_train, y_train)

# Evaluate the model
evaluate_model(ovr_model, X_test, y_test, class_names)

## 2. Experiment 1: Softmax Regression

First, we evaluate the Softmax Regression model. It's a strong linear baseline.

In [None]:
# Initialize and train the Softmax Regression model
softmax_model = SoftmaxRegression(learning_rate=0.1, n_iterations=1000, regularization_lambda=0.01)
softmax_model.fit(X_train, y_train)

# Evaluate the model
evaluate_model(softmax_model, X_test, y_test, class_names)

**Observation:** The Softmax model provides a solid baseline accuracy. The confusion matrix shows which digits are commonly confused (e.g., 4 vs. 9, 3 vs. 5). Being a linear model, its ability to separate complex patterns is limited.

## 3. Experiment 2: Neural Network (Baseline)

Now, let's evaluate the simple feed-forward neural network with its default hyperparameters.

In [None]:
# Initialize and train the baseline Neural Network
nn_baseline = NeuralNetwork(learning_rate=0.001, n_epochs=10, batch_size=64)
history_baseline = nn_baseline.fit(X_train, y_train)

# Plot its learning curve
plot_learning_curves(history_baseline, title="Baseline NN Training Loss")

# Evaluate the model
evaluate_model(nn_baseline, X_test, y_test, class_names)

**Observation:** The baseline neural network significantly outperforms the softmax model, demonstrating the power of non-linear hidden layers. The learning curve shows a steady decrease in loss. The accuracy is much higher, and the number of misclassifications in the confusion matrix is visibly lower.

## 4. Experiment 3: Hyperparameter Tuning the Neural Network

Let's try to improve the neural network. We will train it for more epochs and use a slightly smaller learning rate to allow for finer convergence. This is a common tuning strategy.

In [None]:
# Initialize and train the tuned Neural Network
nn_tuned = NeuralNetwork(learning_rate=0.0005, n_epochs=20, batch_size=64)
history_tuned = nn_tuned.fit(X_train, y_train)

# Plot its learning curve
plot_learning_curves(history_tuned, title="Tuned NN Training Loss")

# Evaluate the tuned model
evaluate_model(nn_tuned, X_test, y_test, class_names)

## 5. Conclusion

Based on the evaluations:

- The **Neural Network** architecture is far superior to the linear **Softmax Regression** model for this image classification task.
- **Hyperparameter tuning** (adjusting epochs and learning rate) provided a noticeable boost in accuracy for the Neural Network.
- The visualization tools in `utils.py` were crucial for diagnosing model performance, comparing results, and understanding where the models fail (e.g., via the confusion matrix and incorrect prediction plots).