# Training LeNet-5 on EMNIST

Complete training pipeline: data loading, model setup, training loop, and evaluation.

## EMNIST Dataset

EMNIST is an extended version of MNIST:
- **814,255 samples** of handwritten digits and letters (ByClass split)
- **62 classes**: 10 digits + 26 lowercase + 26 uppercase
- **28×28 grayscale images**
- Official train/test splits; hold out part of the training set for validation

For this notebook, we'll use just the **digits subset** (10 classes; the EMNIST Digits split has 280,000 samples: 240,000 train and 40,000 test).

## Step 1: Data Loading

In production, the Mojo data loader handles this step. The cell below only prints the expected data layout:

In [None]:
from notebooks.utils import run_mojo_script
import numpy as np

# In practice, data loading happens in Mojo.
# This cell loads no data; it only prints the expected layout.
print("Dataset will be loaded from Mojo training script")
print("Expected dimensions:")
print("  - Images: (N, 28, 28) for N samples")
print("  - Labels: (N,) with values 0-9")
print("  - Batches: 32 samples each (configurable)")
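
To make the layout above concrete, here is a minimal NumPy sketch that builds random stand-in data with the same shapes and splits it into batches. The helper names `make_synthetic_digits` and `iterate_batches` are hypothetical and not part of the Mojo data loader; the pixel values are random noise, not real EMNIST images.

```python
import numpy as np

def make_synthetic_digits(n_samples, seed=0):
    """Return random images (N, 28, 28) and labels (N,) as stand-in data."""
    rng = np.random.default_rng(seed)
    images = rng.random((n_samples, 28, 28), dtype=np.float32)  # values in [0, 1)
    labels = rng.integers(0, 10, size=n_samples)                # values 0-9
    return images, labels

def iterate_batches(images, labels, batch_size=32):
    """Yield consecutive (images, labels) batches; the last may be smaller."""
    for start in range(0, len(images), batch_size):
        stop = start + batch_size
        yield images[start:stop], labels[start:stop]

images, labels = make_synthetic_digits(100)
batches = list(iterate_batches(images, labels, batch_size=32))
print(f"{len(batches)} batches, first: {batches[0][0].shape}, last: {batches[-1][0].shape}")
```

With 100 samples and a batch size of 32, the last batch holds the 4 leftover samples. A real loader would typically also shuffle the training data each epoch.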

## Step 2: Model Setup

LeNet-5 configuration:

In [None]:
config = {
    'model': 'LeNet-5',
    'input_shape': (1, 28, 28),
    'num_classes': 10,
    'architecture': [
        {'layer': 'Conv2D', 'in': 1, 'out': 6, 'kernel': 5},
        {'layer': 'ReLU'},
        {'layer': 'MaxPool2D', 'kernel': 2},
        {'layer': 'Conv2D', 'in': 6, 'out': 16, 'kernel': 5},
        {'layer': 'ReLU'},
        {'layer': 'MaxPool2D', 'kernel': 2},
        {'layer': 'Flatten'},
        {'layer': 'Linear', 'in': 256, 'out': 120},
        {'layer': 'ReLU'},
        {'layer': 'Linear', 'in': 120, 'out': 84},
        {'layer': 'ReLU'},
        {'layer': 'Linear', 'in': 84, 'out': 10},
    ]
}

print("Model Configuration:")
print(f"  Model: {config['model']}")
print(f"  Input: {config['input_shape']}")
print(f"  Output classes: {config['num_classes']}")
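# 156 + 2,416 + 30,840 + 10,164 + 850 = 44,426 weights and biases for 28x28 input
# (the often-quoted ~60k figure applies to the 32x32-input variant)
print("  Total parameters: ~44k")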
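
The parameter count and the `Linear` input size can be checked directly from the layer list. The sketch below assumes stride-1 convolutions without padding, pooling with stride equal to the kernel size, and a bias term in every `Conv2D` and `Linear` layer; the notebook does not state these details, so treat them as assumptions. The helper names are hypothetical.

```python
architecture = [
    {'layer': 'Conv2D', 'in': 1, 'out': 6, 'kernel': 5},
    {'layer': 'ReLU'},
    {'layer': 'MaxPool2D', 'kernel': 2},
    {'layer': 'Conv2D', 'in': 6, 'out': 16, 'kernel': 5},
    {'layer': 'ReLU'},
    {'layer': 'MaxPool2D', 'kernel': 2},
    {'layer': 'Flatten'},
    {'layer': 'Linear', 'in': 256, 'out': 120},
    {'layer': 'ReLU'},
    {'layer': 'Linear', 'in': 120, 'out': 84},
    {'layer': 'ReLU'},
    {'layer': 'Linear', 'in': 84, 'out': 10},
]

def count_parameters(layers):
    """Sum weights and biases of Conv2D and Linear layers."""
    total = 0
    for layer in layers:
        if layer['layer'] == 'Conv2D':
            total += layer['in'] * layer['out'] * layer['kernel'] ** 2 + layer['out']
        elif layer['layer'] == 'Linear':
            total += layer['in'] * layer['out'] + layer['out']
    return total

def flattened_size(input_shape, layers):
    """Track (channels, height, width) up to the Flatten layer."""
    channels, height, width = input_shape
    for layer in layers:
        if layer['layer'] == 'Conv2D':      # assumed: stride 1, no padding
            channels = layer['out']
            height -= layer['kernel'] - 1
            width -= layer['kernel'] - 1
        elif layer['layer'] == 'MaxPool2D':  # assumed: stride == kernel
            height //= layer['kernel']
            width //= layer['kernel']
        elif layer['layer'] == 'Flatten':
            break
    return channels * height * width

total_params = count_parameters(architecture)
flat_features = flattened_size((1, 28, 28), architecture)
print(f"Parameters: {total_params:,}  Flattened features: {flat_features}")
```

Under these assumptions the network has 44,426 parameters, and the feature map entering the first `Linear` layer is 16 × 4 × 4 = 256, which matches the `'in': 256` entry in the configuration.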

## Step 3: Training Configuration

In [None]:
train_config = {
    'epochs': 10,
    'batch_size': 32,
    'learning_rate': 0.001,
    'optimizer': 'Adam',
    'loss': 'CrossEntropy',
    'dtype': 'float32',
    'device': 'CPU',  # or 'CUDA' if available
    'checkpoint_dir': './checkpoints/',
}

print("Training Configuration:")
for key, value in train_config.items():
    print(f"  {key}: {value}")
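
One way to keep this dictionary and the command-line flags used in Step 5 in sync is to derive the argument list from the config. This is a sketch: the key-to-flag mapping (for example `dtype` to `--precision`) is an assumption inferred from the flags shown in Step 5, and `build_train_args` is a hypothetical helper.

```python
def build_train_args(cfg):
    """Convert a training config dict into a flat list of CLI arguments."""
    # Assumed mapping between config keys and the script's flags.
    flag_for_key = {
        'epochs': '--epochs',
        'batch_size': '--batch-size',
        'learning_rate': '--lr',
        'dtype': '--precision',
    }
    args = []
    for key, flag in flag_for_key.items():
        args += [flag, str(cfg[key])]
    return args

example_config = {
    'epochs': 10,
    'batch_size': 32,
    'learning_rate': 0.001,
    'dtype': 'float32',
}
example_args = build_train_args(example_config)
print(example_args)
```

The resulting list could then be passed as the `args` parameter of `run_mojo_script`.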

## Step 4: Training Loop

The training loop runs in Mojo for performance. The cell below prints its structure:

In [None]:
from notebooks.utils import TrainingProgressBar
import numpy as np

# Simulate what the training loop does
print("Training Loop Structure (in Mojo):")
print("""
for epoch in range(num_epochs):
    train_loss = 0.0
    train_acc = 0.0
    
    for batch in train_loader:  # Iterate over batches
        images = batch.images   # Shape: (batch_size, 1, 28, 28)
        labels = batch.labels   # Shape: (batch_size,)
        
        # Forward pass
        predictions = model.forward(images)
        loss = criterion(predictions, labels)
        
        # Backward pass
        loss.backward()
        
        # Update weights
        optimizer.step()
        optimizer.zero_grad()
        
        # Accumulate metrics
        train_loss += loss.item()
        train_acc += accuracy(predictions, labels)
    
    # Average over batches
    train_loss /= len(train_loader)
    train_acc /= len(train_loader)
    
    # Validation
    val_loss, val_acc = evaluate(model, val_loader)
    
    print(f"Epoch {epoch+1}: Loss={train_loss:.4f}, Acc={train_acc:.4f}, "
          f"Val Loss={val_loss:.4f}, Val Acc={val_acc:.4f}")
    
    # Save checkpoint
    model.save(f'checkpoint_{epoch}.pt')
""")

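The loop above relies on a `criterion` and an `accuracy` function. As a rough NumPy sketch of what these compute (not the Mojo implementation), cross-entropy over raw class scores and batch accuracy can be written as follows; the function names and the use of integer class labels are assumptions.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy for raw scores (batch, classes) and integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # avoid overflow in exp
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def accuracy(logits, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    return float((logits.argmax(axis=1) == labels).mean())

uniform_logits = np.zeros((4, 10))        # a model with no preference
example_labels = np.array([0, 1, 0, 2])
print(cross_entropy(uniform_logits, example_labels))
print(accuracy(uniform_logits, example_labels))
```

A model that assigns equal scores to all 10 classes has a loss of ln(10) ≈ 2.303, so a first-batch loss far above that value is an early sign that something is wrong with initialization or the learning rate.
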
## Step 5: Running Training

To train the actual model in Mojo:

In [None]:
# This would run the actual training
# Uncomment to train (takes ~5-10 minutes depending on hardware)

# result = run_mojo_script(
#     "examples/lenet-emnist/run_train.mojo",
#     args=[
#         "--epochs", "10",
#         "--batch-size", "32",
#         "--lr", "0.001",
#         "--precision", "float32",
#     ],
#     timeout=600  # 10 minutes; raise this on slower hardware
# )

# if result['success']:
#     print(result['stdout'])
# else:
#     print(f"Training failed: {result['stderr']}")

print("Training would run here. Illustrative output format:")
print("""
Epoch 1/10
  Batch 100/3600 - Loss: 2.2341 (3%)
  Batch 200/3600 - Loss: 1.8234 (6%)
  ...
  ✓ Loss: 0.3234, Accuracy: 0.9823, Time: 45.2s

Epoch 2/10
  ✓ Loss: 0.0956, Accuracy: 0.9934, Time: 44.8s

...

Training complete!
  Best accuracy: 0.9958 at epoch 8
""")

## Step 6: Evaluation

After training, evaluate on the test set:

In [None]:
from notebooks.utils import plot_confusion_matrix

# Simulate test results
cm = np.array([
    [980,   0,   1,   0,   0,   0,   1,   0,   0,   0],
    [  0, 1130,   1,   0,   0,   0,   1,   0,   0,   0],
    [  0,   1, 1026,   1,   0,   0,   0,   0,   1,   0],
    [  0,   0,   0, 1009,   0,   0,   0,   0,   0,   0],
    [  0,   0,   0,   0, 978,   0,   0,   0,   0,   0],
    [  0,   0,   0,   1,   0, 889,   0,   0,   0,   0],
    [  0,   0,   0,   0,   0,   0, 958,   0,   0,   0],
    [  0,   0,   0,   0,   0,   0,   0, 1026,   0,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0, 974,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0,   0, 1007],
])

class_names = [str(i) for i in range(10)]
fig = plot_confusion_matrix(cm, class_names, title="LeNet-5 Test Set Confusion Matrix")
print(f"\nTest Accuracy: {np.trace(cm) / cm.sum():.4f}")
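
Overall accuracy can hide weak classes, so per-class precision and recall are often worth computing from the same matrix. The sketch below assumes rows are true classes and columns are predicted classes, which the notebook does not state; `per_class_metrics` is a hypothetical helper, and the small 3×3 matrix is made-up example data.

```python
import numpy as np

def per_class_metrics(cm):
    """Return (precision, recall) arrays from a square confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    true_positives = np.diag(cm)
    recall = true_positives / cm.sum(axis=1)     # row sums: samples per true class
    precision = true_positives / cm.sum(axis=0)  # column sums: samples per predicted class
    return precision, recall

example_cm = np.array([
    [5, 1, 0],
    [0, 4, 0],
    [1, 0, 9],
])
precision, recall = per_class_metrics(example_cm)
for cls, (p, r) in enumerate(zip(precision, recall)):
    print(f"class {cls}: precision={p:.3f}, recall={r:.3f}")
```

The same function can be applied to the 10×10 `cm` above. A class with no predictions or no true samples would produce a division by zero, which this sketch does not handle.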

## Tips for Successful Training

1. **Start simple**: Train for 1-2 epochs first to catch issues early
2. **Monitor loss**: The loss should trend downward; some batch-to-batch noise is normal
3. **Watch for NaN**: Usually means the learning rate is too high
4. **Use validation set**: Detect overfitting early
5. **Save checkpoints**: Keep the best model so a crashed run can be resumed
6. **Try different seeds**: Results vary due to random initialization
7. **Profile performance**: Identify bottlenecks
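
Tips 3 and 5 can be combined in a small per-epoch check. The sketch below is a hypothetical `TrainingMonitor` class, not part of the Mojo training script: it stops on a non-finite loss and reports when validation accuracy improves, which is the point at which a "best" checkpoint would be saved. The epoch values in the usage example are made up.

```python
import math

class TrainingMonitor:
    """Track the best validation accuracy and fail fast on NaN/inf loss."""

    def __init__(self):
        self.best_acc = float('-inf')
        self.best_epoch = None

    def update(self, epoch, train_loss, val_acc):
        """Return True if this epoch is the best so far."""
        if not math.isfinite(train_loss):
            raise FloatingPointError(
                f"Non-finite loss at epoch {epoch}; try a lower learning rate"
            )
        if val_acc > self.best_acc:
            self.best_acc = val_acc
            self.best_epoch = epoch
            return True
        return False

monitor = TrainingMonitor()
# (epoch, train_loss, val_acc): made-up values for illustration
for epoch, loss, val_acc in [(1, 0.32, 0.981), (2, 0.10, 0.993), (3, 0.09, 0.990)]:
    if monitor.update(epoch, loss, val_acc):
        print(f"Epoch {epoch}: new best ({val_acc:.3f}), save checkpoint here")

print(f"Best epoch: {monitor.best_epoch}")
```

Saving only on improvement keeps disk usage low, at the cost of not being able to resume from the most recent epoch.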

Next: Visualize and analyze results!