# Visualization and Analysis

This notebook plots and analyzes training results, model behavior, and predictions.

## Training Curves

Visualizing loss and accuracy over epochs reveals training dynamics:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from notebooks.utils import plot_training_curves

# Simulate realistic training curves
epochs = 10
train_losses = np.array([0.450, 0.200, 0.085, 0.042, 0.025, 0.018, 0.012, 0.010, 0.008, 0.007])
val_losses = np.array([0.480, 0.210, 0.095, 0.055, 0.040, 0.035, 0.032, 0.031, 0.031, 0.032])
train_accs = np.array([0.860, 0.938, 0.975, 0.986, 0.992, 0.994, 0.996, 0.997, 0.997, 0.998])
val_accs = np.array([0.850, 0.930, 0.970, 0.982, 0.988, 0.990, 0.991, 0.991, 0.991, 0.990])

fig = plot_training_curves(
    train_losses, val_losses,
    train_accs, val_accs,
    figsize=(12, 4)
)
plt.show()

print("Observations:")
print("  - Training converges smoothly (no NaN or spikes)")
print(f"  - Final train loss: {train_losses[-1]:.4f}")
print(f"  - Final train acc: {train_accs[-1]:.4f}")
print(f"  - Final val loss: {val_losses[-1]:.4f}")
print(f"  - Final val acc: {val_accs[-1]:.4f}")
print(f"  - Generalization gap (train acc - val acc): {train_accs[-1] - val_accs[-1]:.4f}")

## Interpreting Training Curves

### Good Training Indicators ✓
- Training loss decreases smoothly, with at most minor noise
- Accuracy increases steadily
- Val loss tracks train loss without diverging
- No sudden spikes or NaN values

### Warning Signs ⚠️
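- **Loss plateaus**: Learning rate may be too low, or optimization is stuck in a flat region or poor local minimum
- **Loss diverges**: Learning rate is likely too high; reduce it
- **Val loss rises while train loss keeps falling**: Overfitting; add regularization (dropout, L2) or stop early. A small, stable gap with val loss slightly above train loss is normal
- **NaN values**: Numerical instability; try a lower learning rate or gradient clipping
- **Sawtooth pattern**: Batch size too small or learning rate fluctuating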
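
Some of these warning signs can be checked programmatically. The following is a minimal sketch, not part of `notebooks.utils`: `diagnose_curves` and its `patience` parameter are hypothetical names, and the overfitting test (validation loss has not improved for `patience` epochs while training loss kept falling) is a heuristic assumption rather than a definitive rule.

```python
import numpy as np


def diagnose_curves(train_losses, val_losses, patience=2):
    """Flag common warning signs in per-epoch loss curves (heuristic sketch)."""
    train = np.asarray(train_losses, dtype=float)
    val = np.asarray(val_losses, dtype=float)

    # NaN or inf anywhere usually points to numerical instability.
    if not (np.isfinite(train).all() and np.isfinite(val).all()):
        return {"non_finite": True}

    best_epoch = int(np.argmin(val))  # 0-based epoch with the lowest val loss
    epochs_since_best = len(val) - 1 - best_epoch
    train_still_falling = train[-1] < train[best_epoch]

    return {
        "non_finite": False,
        "best_epoch": best_epoch,
        # Final gap between validation and training loss.
        "final_gap": float(val[-1] - train[-1]),
        # Assumed heuristic: no val improvement for `patience` epochs
        # while the training loss continued to decrease.
        "possible_overfitting": bool(
            epochs_since_best >= patience and train_still_falling
        ),
    }


# Usage with the simulated curves from the Training Curves section.
report = diagnose_curves(
    [0.450, 0.200, 0.085, 0.042, 0.025, 0.018, 0.012, 0.010, 0.008, 0.007],
    [0.480, 0.210, 0.095, 0.055, 0.040, 0.035, 0.032, 0.031, 0.031, 0.032],
)
print(report)
```

A flag from this helper is a prompt to look at the plot, not a verdict; the `patience` threshold in particular should be tuned to how noisy the validation loss is.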

## Confusion Matrix Analysis

In [None]:
from notebooks.utils import plot_confusion_matrix
import numpy as np

# Confusion matrix from LeNet-5 training
cm = np.array([
    [980,   0,   1,   0,   0,   0,   1,   0,   0,   0],
    [  0, 1130,   1,   0,   0,   0,   1,   0,   0,   0],
    [  0,   1, 1026,   1,   0,   0,   0,   0,   1,   0],
    [  0,   0,   0, 1009,   0,   0,   0,   0,   0,   0],
    [  0,   0,   0,   0, 978,   0,   0,   0,   0,   0],
    [  0,   0,   0,   1,   0, 889,   0,   0,   0,   0],
    [  0,   0,   0,   0,   0,   0, 958,   0,   0,   0],
    [  0,   0,   0,   0,   0,   0,   0, 1026,   0,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0, 974,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0,   0, 1007],
])

class_names = [str(i) for i in range(10)]
fig = plot_confusion_matrix(cm, class_names, title="Confusion Matrix - LeNet-5 on EMNIST")
plt.show()

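# Per-class accuracy (recall): correct predictions divided by the row sum,
# assuming rows are true labels and columns are predicted labels
accuracy = np.diag(cm) / cm.sum(axis=1)
print("\nPer-class Accuracy (recall):")
for i, acc in enumerate(accuracy):
    print(f"  {i}: {acc:.4f}")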

print(f"\nOverall Accuracy: {np.trace(cm) / cm.sum():.4f}")

## Common Confusion Patterns

### Visually Similar Classes
- **4↔9**: A 4 written with a closed top resembles a 9
- **3↔8**: Similar stacked curves
- **1↔7**: Both are dominated by a single near-vertical stroke

### Diagonal Dominance
- Strong diagonal = good predictions
- Off-diagonal = misclassifications
- Asymmetric confusion = one-way errors
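
The patterns above can be read off a confusion matrix directly. The sketch below assumes rows are true labels and columns are predictions; `top_confusions` and `asymmetry` are hypothetical helper names, not part of `notebooks.utils`, and the 3x3 matrix is made-up illustrative data.

```python
import numpy as np


def top_confusions(cm, class_names, k=3):
    """Return up to k largest off-diagonal cells as (true, predicted, count)."""
    off_diag = np.array(cm)        # copy so the caller's matrix is untouched
    np.fill_diagonal(off_diag, 0)  # ignore correct predictions
    # Flat indices of cells sorted by count, largest first.
    order = np.argsort(off_diag, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(order, off_diag.shape)
    return [
        (class_names[r], class_names[c], int(off_diag[r, c]))
        for r, c in zip(rows, cols)
        if off_diag[r, c] > 0      # skip cells with no errors
    ]


def asymmetry(cm):
    """Entry [i, j] > 0 means i is mistaken for j more often than j for i."""
    cm = np.asarray(cm)
    return cm - cm.T


# Illustrative (made-up) 3-class matrix.
example_cm = np.array([
    [50,  2,  0],
    [ 5, 40,  1],
    [ 0,  0, 30],
])
names = ["a", "b", "c"]
pairs = top_confusions(example_cm, names, k=3)
skew = asymmetry(example_cm)
print(pairs)
print(skew)
```

Large entries in `asymmetry(cm)` point to one-way errors, which often suggest a labeling or preprocessing issue specific to one class rather than general visual similarity.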

## Tensor Visualization

In [None]:
from notebooks.utils import visualize_tensor
import numpy as np

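# Simulate activation maps from the first conv layer with random noise
# (placeholder data, not real activations)
rng = np.random.default_rng(0)  # fixed seed so the figure is reproducible
fig, axes = plt.subplots(2, 3, figsize=(10, 7))
for i in range(6):
    # Simulate one filter's output
    ax = axes[i // 3, i % 3]
    activation = rng.standard_normal((24, 24))
    activation = np.tanh(activation)  # Squash into (-1, 1)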
    
    im = ax.imshow(activation, cmap='viridis')
    ax.set_title(f'Filter {i+1}')
    ax.axis('off')
    plt.colorbar(im, ax=ax)

plt.suptitle('Conv1 Activation Maps')
plt.tight_layout()
plt.show()

## Class Distribution

In [None]:
from notebooks.utils import plot_class_distribution

# Simulated labels standing in for EMNIST digits: 1000 samples per class
labels = np.concatenate([np.full(1000, i) for i in range(10)])
np.random.shuffle(labels)

fig = plot_class_distribution(labels, title="EMNIST Digits Class Distribution")
plt.show()

print("\nClass balance: the simulated labels are exactly balanced (1000 samples per class)")
print("For imbalanced datasets, consider:")
print("  - Weighted loss (penalize minority classes more)")
print("  - Oversampling minority classes")
print("  - Focal loss (focuses on hard examples)")
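
As a sketch of the weighted-loss option, per-class weights can be derived from label counts. This uses a common inverse-frequency heuristic, `n_samples / (num_classes * class_count)`; the function name `inverse_frequency_weights` is hypothetical, and how the weights are passed to a loss function depends on the training framework, which this notebook does not specify.

```python
import numpy as np


def inverse_frequency_weights(labels, num_classes):
    """Weight each class by n_samples / (num_classes * class_count)."""
    labels = np.asarray(labels)
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    weights = np.zeros(num_classes)
    present = counts > 0  # classes with no samples keep weight 0
    weights[present] = labels.size / (num_classes * counts[present])
    return weights


# Balanced labels: every class gets weight 1.0.
balanced = np.concatenate([np.full(1000, i) for i in range(10)])
balanced_weights = inverse_frequency_weights(balanced, num_classes=10)

# Imbalanced toy example: 90 samples of class 0, 10 of class 1.
skewed = np.array([0] * 90 + [1] * 10)
skewed_weights = inverse_frequency_weights(skewed, num_classes=2)
print(balanced_weights)
print(skewed_weights)
```

With balanced data the weights are all 1, so the weighted loss reduces to the ordinary one; the minority class in the toy example gets a proportionally larger weight.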

## Exporting Results

Save figures for reports and presentations:

In [None]:
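# Save figures. Note that `fig` was reassigned in each cell above, so it now
# refers to the last figure only; keep separate handles (e.g. `curves_fig`,
# `cm_fig`, hypothetical names) to save each plot under the right filename.
# Uncomment to write the files:
# fig.savefig('training_curves.png', dpi=150, bbox_inches='tight')
# fig.savefig('confusion_matrix.pdf', bbox_inches='tight')  # For papers

print("Example output files (written only once the lines above are uncommented):")
print("  - training_curves.png")
print("  - confusion_matrix.pdf")
print("\nUse a high DPI for presentations (150-300)")
print("Use PDF format for papers (vector graphics, no quality loss when scaled)")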

## Key Visualization Takeaways

1. **Loss curves** reveal training stability and learning rate issues
2. **Confusion matrices** show which classes are confused with each other
3. **Activation maps** reveal learned features
4. **Class distribution** determines whether loss weighting or resampling is needed
5. **Saved figures** make results easy to include in papers and reports

Next: Explore advanced techniques!