# Softmax Regression with TensorFlow - Interactive Demo

This notebook implements and trains a Softmax Regression model in TensorFlow to classify MNIST digits.

## Features:
- Data preprocessing and visualization
- Model creation and compilation
- Training with monitoring
- Evaluation and analysis
- Custom training loop demonstration
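
As a reference for the steps listed above, the computation behind softmax regression can be sketched in plain NumPy. This is a minimal illustration and not the code in `src/model/softmax_regression.py`; the names `softmax`, `cross_entropy`, `W`, and `b` are chosen here for illustration only.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, one_hot_labels, eps=1e-12):
    """Mean categorical cross-entropy over the batch."""
    return float(-np.mean(np.sum(one_hot_labels * np.log(probs + eps), axis=1)))

rng = np.random.default_rng(0)
X = rng.random((4, 784))          # 4 synthetic flattened 28x28 "images"
Y = np.eye(10)[[3, 1, 4, 1]]      # one-hot labels for 10 classes
W = np.zeros((784, 10))           # weight matrix (assumed shape: inputs x classes)
b = np.zeros(10)                  # bias vector

probs = softmax(X @ W + b)        # shape (4, 10); each row sums to 1
loss = cross_entropy(probs, Y)    # close to ln(10) ≈ 2.3026 with all-zero weights
```

With all-zero weights every class gets probability 0.1, so the starting loss near ln(10) is a handy sanity check for the first epoch of any 10-class run.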

In [None]:
# Import libraries
import sys

# Add the parent directory to the path so the `src` package can be imported
sys.path.append('../')

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from src.model.softmax_regression import SoftmaxRegression
from src.data.data_preprocessing import load_and_preprocess_mnist
from src.training.trainer import ModelTrainer
from src.training.custom_trainer import CustomTrainer
from src.utils.visualization import *
from src.utils.evaluation import *

print(f"TensorFlow version: {tf.__version__}")

In [None]:
# Load and preprocess MNIST data
(X_train, Y_train), (X_val, Y_val) = load_and_preprocess_mnist()

print(f"Training samples: {X_train.shape[0]}")
print(f"Validation samples: {X_val.shape[0]}")
print(f"Input features: {X_train.shape[1]}")
print(f"Output classes: {Y_train.shape[1]}")
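
The implementation of `load_and_preprocess_mnist` is not shown in this notebook. Judging from the shapes printed above (flattened inputs, one-hot labels), it plausibly performs steps like the ones sketched below. The `preprocess` helper, the scaling to [0, 1], and the synthetic arrays are assumptions made for illustration.

```python
import numpy as np

def preprocess(images, labels, num_classes=10):
    """Flatten images to vectors, scale pixels to [0, 1], one-hot encode labels."""
    x = images.reshape(images.shape[0], -1).astype("float32") / 255.0
    y = np.eye(num_classes, dtype="float32")[labels]
    return x, y

# Synthetic stand-in for raw MNIST arrays (uint8 images, integer labels)
rng = np.random.default_rng(0)
raw_images = rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8)
raw_labels = np.array([5, 0, 4, 1, 9, 2])

X_demo, Y_demo = preprocess(raw_images, raw_labels)
print(X_demo.shape, Y_demo.shape)   # (6, 784) (6, 10)
```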

In [None]:
# Visualize sample images
fig, axes = plt.subplots(2, 5, figsize=(12, 6))
for i in range(10):
    row = i // 5
    col = i % 5
    
    # Reshape back to 28x28 for display
    image = X_train[i].numpy().reshape(28, 28)
    true_label = tf.argmax(Y_train[i]).numpy()
    
    axes[row, col].imshow(image, cmap='gray')
    axes[row, col].set_title(f'Label: {true_label}')
    axes[row, col].axis('off')

plt.tight_layout()
plt.show()

In [None]:
# Create and compile model
model_builder = SoftmaxRegression(input_size=784, num_classes=10)
model = model_builder.create_and_compile(
    learning_rate=0.01,
    optimizer='sgd'
)

model_builder.get_model_summary()

In [None]:
# Train model
trainer = ModelTrainer(model)

history = trainer.train(
    X_train, Y_train,
    X_val, Y_val,
    epochs=15,
    batch_size=128,
    verbose=1
)

In [None]:
# Plot training history
plot_training_history(history.history, show_plot=True)

In [None]:
# Evaluate model
val_loss, val_accuracy = trainer.evaluate(X_val, Y_val)
print(f"Validation Loss: {val_loss:.4f}")
print(f"Validation Accuracy: {val_accuracy:.4f}")

# Sample predictions
Y_val_classes = tf.argmax(Y_val, axis=1).numpy()
results = trainer.predict_samples_with_confidence(
    X_val[:10], Y_val_classes[:10], num_samples=10
)
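
How `predict_samples_with_confidence` computes its output is not shown here. A common convention, assumed in this sketch, is to report the arg-max class as the prediction and the probability assigned to it as the confidence. The probabilities below are made up for illustration.

```python
import numpy as np

# Hypothetical predicted probabilities for 3 samples over 4 classes
probs = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.20, 0.50, 0.20, 0.10],
    [0.25, 0.25, 0.40, 0.10],
])
true_classes = np.array([0, 1, 3])

pred_classes = probs.argmax(axis=1)   # most likely class per sample
confidence = probs.max(axis=1)        # probability assigned to that class
accuracy = float((pred_classes == true_classes).mean())

for t, p, c in zip(true_classes, pred_classes, confidence):
    print(f"true={t} pred={p} confidence={c:.2f}")
```

The third sample is misclassified with a confidence of only 0.40, which is the kind of low-confidence error this view makes easy to spot.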

In [None]:
# Demonstrate custom training loop on a small subset
# (1,000 training samples and 200 validation samples)
custom_model = SoftmaxRegression(784, 10).create_model()
custom_trainer = CustomTrainer(custom_model)

print("Training with custom loop...")
custom_history = custom_trainer.train(
    X_train[:1000], Y_train[:1000],
    X_val[:200], Y_val[:200],
    epochs=5,
    batch_size=64,
    verbose=1
)

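# Plot comparison: first 5 epochs of standard training (full training set)
# against the custom loop (1,000-sample subset), so the curves are only indicative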
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

ax1.plot(history.history['accuracy'][:5], label='Standard Training')
ax1.plot(custom_history['accuracy'], label='Custom Training')
ax1.set_title('Training Accuracy Comparison')
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Accuracy')
ax1.legend()

ax2.plot(history.history['loss'][:5], label='Standard Training')
ax2.plot(custom_history['loss'], label='Custom Training')
ax2.set_title('Training Loss Comparison')
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Loss')
ax2.legend()

plt.tight_layout()
plt.show()
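
The internals of `CustomTrainer` are not shown in this notebook. To illustrate what a hand-written training loop for softmax regression involves, here is a NumPy sketch of mini-batch SGD that uses the analytic gradient `(p - y)` with respect to the logits. It is a stand-in under stated assumptions, not the project's implementation; `train_softmax` and the toy data are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilise exp()
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax(X, Y, epochs=5, batch_size=64, lr=0.1, seed=0):
    """Mini-batch SGD for softmax regression; returns weights and per-epoch history."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = Y.shape[1]
    W, b = np.zeros((d, k)), np.zeros(k)
    history = {"loss": [], "accuracy": []}
    for _ in range(epochs):
        order = rng.permutation(n)          # reshuffle every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], Y[idx]
            # Gradient of mean cross-entropy w.r.t. logits is (p - y) / batch size
            grad_logits = (softmax(xb @ W + b) - yb) / len(idx)
            W -= lr * xb.T @ grad_logits
            b -= lr * grad_logits.sum(axis=0)
        # Record metrics on the full training data after each epoch
        probs = softmax(X @ W + b)
        loss = -np.mean(np.sum(Y * np.log(probs + 1e-12), axis=1))
        history["loss"].append(float(loss))
        history["accuracy"].append(float((probs.argmax(1) == Y.argmax(1)).mean()))
    return W, b, history

# Synthetic, well-separated toy data: 3 classes in 2 dimensions
rng = np.random.default_rng(1)
centers = np.array([[0.0, 4.0], [4.0, 0.0], [-4.0, -4.0]])
labels = rng.integers(0, 3, size=300)
X_toy = centers[labels] + rng.normal(scale=0.5, size=(300, 2))
Y_toy = np.eye(3)[labels]

W_toy, b_toy, toy_history = train_softmax(X_toy, Y_toy)
print(toy_history["accuracy"][-1])
```

The returned dictionary has the same `'loss'` and `'accuracy'` keys that the comparison plot reads from `custom_history`, which is why a plain dict is used here.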

## Key Insights

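1. **Model Performance**: Softmax Regression reaches about 90% accuracy on MNIST
2. **Training Speed**: Training converges quickly because the model is a single linear layer (784 inputs to 10 outputs)
3. **Custom vs Standard Training**: Both approaches should yield similar results on the same data; the custom loop above uses only 1,000 training samples for 5 epochs, so its curves are not directly comparable to the full run
4. **Gradient Flow**: With a single linear layer feeding the softmax, gradients reach the weights directly, so vanishing gradients are not a concern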
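
The gradient-flow point can be made concrete with the standard gradients of softmax cross-entropy. The notation is introduced here and does not come from the project code: for one example with input $x$, one-hot label $y$, logits $z = W^\top x + b$, and $p = \mathrm{softmax}(z)$:

```latex
\begin{aligned}
L &= -\sum_{k} y_k \log p_k, \qquad p_k = \frac{e^{z_k}}{\sum_{j} e^{z_j}}, \\
\frac{\partial L}{\partial z} &= p - y, \\
\frac{\partial L}{\partial W} &= x\,(p - y)^\top, \qquad
\frac{\partial L}{\partial b} = p - y.
\end{aligned}
```

The weight gradient is the input times the prediction error, with no chain of intermediate layers that could shrink it.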

## Next Steps

- Try different optimizers (Adam, RMSprop)
- Experiment with learning rate scheduling
- Add regularization techniques
- Compare with more complex models
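
As a starting point for the learning-rate scheduling experiment, an exponential decay schedule can be sketched in a few lines. The function name and the decay settings below are illustrative assumptions; the formula is the continuous (non-staircase) form `initial_lr * decay_rate ** (step / decay_steps)`.

```python
def exponential_decay(initial_lr, decay_rate, decay_steps, step):
    """Learning rate after `step` updates under continuous exponential decay."""
    return initial_lr * decay_rate ** (step / decay_steps)

# Start from the notebook's learning rate of 0.01 and halve it every 1,000 steps
schedule = [exponential_decay(0.01, 0.5, 1000, s) for s in (0, 1000, 2000)]
print(schedule)   # [0.01, 0.005, 0.0025]
```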