# Module 01 — Mathematical & Programming Foundations

## 01-04: Visualization with Matplotlib

**Objective:** Master the Matplotlib Figure/Axes object model and learn to create the plot types used throughout this course: scatter plots, histograms, heatmaps, training curves, confusion matrices, and decision boundaries.

**Prerequisites:** 01-01 (Python, NumPy & Tensor Speed), 01-02 (Advanced NumPy & PyTorch Operations), 01-03 (Pandas for Tabular Data)

---

## Part 0 — Setup & Prerequisites

Every notebook in this course uses Matplotlib for visualization: training curves, confusion matrices, feature distributions, attention heatmaps, generated images, and decision boundaries. This notebook teaches you Matplotlib's object model so you can create, customize, and debug any plot.

We will cover:

- **Figure/Axes model**: the two-layer architecture behind every Matplotlib plot
- **Core plot types**: line, scatter, bar, histogram, heatmap, contour
- **Subplots & layouts**: multi-panel figures with `subplots()` and `GridSpec`
- **Styling & annotation**: colors, legends, text, arrows, mathematical labels
- **ML-specific plots**: training curves, confusion matrices, decision boundaries, image grids, and attention heatmaps

We use synthetic data and the California Housing dataset from 01-03.

In [None]:
# ── Imports ──────────────────────────────────────────────────────────────────
import sys
import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from matplotlib.patches import Patch, FancyArrowPatch
from matplotlib.colors import ListedColormap, Normalize
import matplotlib.cm as cm
import torch

from sklearn.datasets import (
    fetch_california_housing, make_moons, make_blobs, make_classification,
)
from sklearn.model_selection import train_test_split

print(f'Python: {sys.version.split()[0]}')
print(f'NumPy: {np.__version__}')
print(f'Matplotlib: {plt.matplotlib.__version__}')
print(f'PyTorch: {torch.__version__}')
if torch.cuda.is_available():
    print(f'CUDA: {torch.version.cuda}')
    print(f'GPU: {torch.cuda.get_device_name(0)}')

In [None]:
# ── Reproducibility ──────────────────────────────────────────────────────────
import random

SEED = 1103
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(SEED)

In [None]:
# ── Configuration ────────────────────────────────────────────────────────────
# Default figure aesthetics for the course
plt.rcParams.update({
    'figure.figsize': (10, 6),
    'font.size': 11,
    'axes.titlesize': 13,
    'axes.labelsize': 11,
    'legend.fontsize': 10,
    'xtick.labelsize': 10,
    'ytick.labelsize': 10,
    'figure.dpi': 100,
    'savefig.dpi': 150,
    'axes.grid': False,
})

# Course color palette
COLORS = {
    'blue': '#1E88E5',
    'red': '#E53935',
    'green': '#43A047',
    'orange': '#FF9800',
    'purple': '#9C27B0',
    'brown': '#795548',
    'teal': '#00897B',
    'gray': '#757575',
}
COLOR_LIST = list(COLORS.values())
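
The `plt.rcParams.update(...)` call above changes defaults globally for the rest of the session. When a single figure needs different settings, `plt.rc_context` scopes the override to a `with` block. A minimal sketch follows; the font size and grid setting are illustrative values, not course settings:

```python
import matplotlib.pyplot as plt

before = plt.rcParams['font.size']

# Settings passed to rc_context apply only inside the with-block.
with plt.rc_context({'font.size': 16, 'axes.grid': True}):
    inside = plt.rcParams['font.size']
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.plot([0, 1, 2], [0, 1, 4])
    ax.set_title('Temporary style')

# Once the block exits, the previous defaults are restored.
after = plt.rcParams['font.size']
print(before, inside, after)
```

This keeps one-off styling from leaking into later cells, which can otherwise be hard to track down in a long notebook.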

### Data Loading

We load the California Housing dataset for realistic scatter/histogram examples, plus synthetic datasets for classification visualizations.

In [None]:
# California Housing
housing = fetch_california_housing(as_frame=True)
df = housing.frame
print(f'California Housing: {df.shape}')

# Synthetic classification datasets
X_moons, y_moons = make_moons(n_samples=500, noise=0.25, random_state=SEED)
X_blobs, y_blobs = make_blobs(n_samples=300, centers=3, random_state=SEED, cluster_std=1.5)
print(f'Moons: {X_moons.shape}, Blobs: {X_blobs.shape}')

---

## Part 1 — Matplotlib Fundamentals from Scratch

Matplotlib has two interfaces:

- **pyplot (implicit)**: `plt.plot()`, `plt.scatter()`; quick but limited
- **Object-oriented (explicit)**: `fig, ax = plt.subplots()`; full control

We always use the **object-oriented interface** in this course because it's clearer, more flexible, and easier to debug. The key insight is that every Matplotlib plot has two layers:

- **`Figure`**: the entire canvas/window (controls size, DPI, background)
- **`Axes`**: a single plot area within the figure (controls data, labels, ticks)

One Figure can contain multiple Axes (subplots).
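
To make the contrast concrete, here is a small sketch of both interfaces side by side. The data and titles are made up for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 50)

# Implicit (pyplot) style: every call acts on the "current" Axes.
plt.figure()
plt.plot(x, x ** 2)
plt.title('Implicit')
implicit_ax = plt.gca()  # the Axes that pyplot was drawing on

# Explicit (object-oriented) style: we hold the objects and call their methods.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))
ax_left.plot(x, x ** 2)
ax_left.set_title('Left panel')
ax_right.plot(x, np.sqrt(x))
ax_right.set_title('Right panel')

# One Figure holds several Axes, and each Axes knows its parent Figure.
print(len(fig.axes))          # 2
print(ax_left.figure is fig)  # True
```

With the explicit style there is never any doubt about which panel a call modifies, which matters as soon as a figure has more than one Axes.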

### 1.1 The Figure/Axes Object Model

Let's build up from the simplest plot to understand each component.

In [None]:
def demonstrate_figure_axes_model() -> None:
    """Show the Figure/Axes hierarchy with labeled components."""
    # Create a figure with one axes
    fig, ax = plt.subplots(figsize=(10, 6))

    # Plot some data
    x = np.linspace(0, 2 * np.pi, 100)
    ax.plot(x, np.sin(x), color=COLORS['blue'], linewidth=2, label='sin(x)')
    ax.plot(x, np.cos(x), color=COLORS['red'], linewidth=2, label='cos(x)')

    # Axes components
    ax.set_xlabel('x (radians)')       # X-axis label
    ax.set_ylabel('Amplitude')          # Y-axis label
    ax.set_title('Trigonometric Functions')  # Axes title
    ax.legend(loc='upper right')        # Legend
    ax.set_xlim(0, 2 * np.pi)          # X limits
    ax.set_ylim(-1.5, 1.5)             # Y limits
    ax.axhline(y=0, color='gray', linestyle='--', alpha=0.3)  # Reference line
    ax.grid(True, alpha=0.3)            # Grid

    # Figure-level title
    fig.suptitle('Figure → Axes → Plot Elements', fontsize=14, fontweight='bold')

    # Annotate the architecture
    ax.annotate('Axes (plot area)',
                xy=(0.02, 0.98), xycoords='axes fraction',
                fontsize=9, color='gray', va='top',
                bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

    plt.tight_layout()
    plt.show()

    # Print the hierarchy
    print('Matplotlib Object Hierarchy:')
    print(f'  Figure: {type(fig).__name__}')
    print(f'    └── Axes: {type(ax).__name__}')
    print(f'          ├── XAxis: {type(ax.xaxis).__name__}')
    print(f'          ├── YAxis: {type(ax.yaxis).__name__}')
    print(f'          ├── Lines: {len(ax.lines)} Line2D objects')
    print(f'          └── Legend: {type(ax.get_legend()).__name__}')


demonstrate_figure_axes_model()
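
Since the Figure controls size and DPI, the pixel dimensions of the output follow from simple arithmetic: size in inches times dots per inch. The sketch below checks this for a 10 × 6 inch figure and saves it to an in-memory buffer at a higher DPI. The buffer is used only to avoid writing a file, and the on-screen numbers assume a standard-DPI backend:

```python
import io

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 6), dpi=100)
ax.plot([0, 1], [0, 1])

# On-screen size in pixels = figsize (inches) * dpi.
width_px, height_px = fig.get_size_inches() * fig.dpi
print(width_px, height_px)  # 1000.0 600.0 on a standard-DPI backend

# Saving with a different dpi rescales the raster output only;
# the figure's own size in inches is unchanged.
buffer = io.BytesIO()
fig.savefig(buffer, format='png', dpi=150)
print(fig.get_size_inches())
```

At `dpi=150` the saved PNG is 1500 × 900 pixels, which is why the course configuration sets `savefig.dpi` higher than `figure.dpi`: saved figures come out sharper than the on-screen preview.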

### 1.2 Core Plot Types

Matplotlib supports dozens of plot types. We focus on the eight most common in ML work, each demonstrated with real or realistic data.

In [None]:
def demonstrate_core_plot_types() -> None:
    """Show the 8 most common plot types for ML with best practices."""
    fig, axes = plt.subplots(2, 4, figsize=(20, 9))
    np.random.seed(SEED)

    # 1. Line plot — training curves
    epochs = np.arange(1, 51)
    train_loss = 2.5 * np.exp(-0.08 * epochs) + 0.3 + np.random.randn(50) * 0.05
    val_loss = 2.5 * np.exp(-0.06 * epochs) + 0.5 + np.random.randn(50) * 0.08
    axes[0, 0].plot(epochs, train_loss, color=COLORS['blue'], label='Train', linewidth=1.5)
    axes[0, 0].plot(epochs, val_loss, color=COLORS['red'], label='Val', linewidth=1.5)
    axes[0, 0].set_xlabel('Epoch')
    axes[0, 0].set_ylabel('Loss')
    axes[0, 0].set_title('1. Line Plot (Training Curves)')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)

    # 2. Scatter plot — feature relationships
    sample_idx = np.random.choice(len(df), 1000, replace=False)
    scatter = axes[0, 1].scatter(
        df['MedInc'].iloc[sample_idx],
        df['MedHouseVal'].iloc[sample_idx],
        c=df['HouseAge'].iloc[sample_idx],
        cmap='viridis', s=10, alpha=0.6)
    axes[0, 1].set_xlabel('Median Income')
    axes[0, 1].set_ylabel('House Value')
    axes[0, 1].set_title('2. Scatter (color=HouseAge)')
    plt.colorbar(scatter, ax=axes[0, 1], shrink=0.8, label='Age')

    # 3. Histogram — distribution
    axes[0, 2].hist(df['MedInc'], bins=50, color=COLORS['green'],
                     alpha=0.7, edgecolor='white')
    axes[0, 2].axvline(df['MedInc'].mean(), color=COLORS['red'],
                        linestyle='--', label=f'Mean={df["MedInc"].mean():.2f}')
    axes[0, 2].axvline(df['MedInc'].median(), color=COLORS['orange'],
                        linestyle='--', label=f'Median={df["MedInc"].median():.2f}')
    axes[0, 2].set_xlabel('Median Income')
    axes[0, 2].set_ylabel('Count')
    axes[0, 2].set_title('3. Histogram')
    axes[0, 2].legend(fontsize=8)

    # 4. Bar chart — model comparison
    models = ['Baseline', 'LinReg', 'RF', 'XGBoost']
    scores = [0.50, 0.72, 0.87, 0.91]
    bar_colors = [COLORS['gray']] + [COLORS['blue']] * 3
    bars = axes[0, 3].bar(models, scores, color=bar_colors)
    bars[-1].set_color(COLORS['green'])  # Highlight best
    axes[0, 3].set_ylabel('R² Score')
    axes[0, 3].set_title('4. Bar Chart (Model Comparison)')
    axes[0, 3].set_ylim(0, 1.0)
    for bar, score in zip(bars, scores):
        axes[0, 3].text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.02,
                         f'{score:.2f}', ha='center', fontsize=9)

    # 5. Heatmap — correlation matrix
    corr = df[['MedInc', 'HouseAge', 'AveRooms', 'Population', 'MedHouseVal']].corr()
    im = axes[1, 0].imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)
    axes[1, 0].set_xticks(range(len(corr)))
    axes[1, 0].set_xticklabels(corr.columns, rotation=45, ha='right', fontsize=8)
    axes[1, 0].set_yticks(range(len(corr)))
    axes[1, 0].set_yticklabels(corr.columns, fontsize=8)
    axes[1, 0].set_title('5. Heatmap (Correlation)')
    for i in range(len(corr)):
        for j in range(len(corr)):
            axes[1, 0].text(j, i, f'{corr.iloc[i,j]:.2f}', ha='center', va='center', fontsize=7)
    plt.colorbar(im, ax=axes[1, 0], shrink=0.8)

    # 6. Box plot — distribution comparison
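    # pd.qcut builds equal-count bins (true quintiles); pd.cut would give equal-width bins.
    quintile_labels = ['Q1', 'Q2', 'Q3', 'Q4', 'Q5']
    income_bins = pd.qcut(df['MedInc'], q=5, labels=quintile_labels)
    box_data = [df['MedHouseVal'][income_bins == q].values for q in quintile_labels]
    bp = axes[1, 1].boxplot(box_data, patch_artist=True)
    # Label ticks explicitly: the `labels=` keyword is deprecated in Matplotlib 3.9+.
    axes[1, 1].set_xticklabels(quintile_labels)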
    for patch, color in zip(bp['boxes'], COLOR_LIST[:5]):
        patch.set_facecolor(color)
        patch.set_alpha(0.6)
    axes[1, 1].set_xlabel('Income Quintile')
    axes[1, 1].set_ylabel('House Value')
    axes[1, 1].set_title('6. Box Plot')

    # 7. Contour / filled contour — 2D functions
    xx, yy = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
    zz = np.sin(xx) * np.cos(yy)
    axes[1, 2].contourf(xx, yy, zz, levels=20, cmap='viridis')
    axes[1, 2].contour(xx, yy, zz, levels=10, colors='white', linewidths=0.5, alpha=0.5)
    axes[1, 2].set_xlabel('x')
    axes[1, 2].set_ylabel('y')
    axes[1, 2].set_title('7. Contour Plot')

    # 8. Image display — for ML we show images
    image = np.random.rand(28, 28)
    axes[1, 3].imshow(image, cmap='gray')
    axes[1, 3].set_title('8. Image (imshow)')
    axes[1, 3].axis('off')

    fig.suptitle('8 Essential Plot Types for ML', fontsize=15, fontweight='bold')
    plt.tight_layout()
    plt.show()


demonstrate_core_plot_types()

### 1.3 Subplots & Multi-Panel Layouts

ML notebooks frequently need multi-panel figures: comparing models, showing different metrics side by side, or displaying image grids. Matplotlib offers three ways to create multi-panel layouts:

1. **`plt.subplots(nrows, ncols)`**: uniform grid (most common)
2. **`GridSpec`**: panels of different sizes
3. **`fig.add_subplot()`**: add panels one at a time

In [None]:
def demonstrate_subplot_layouts() -> None:
    """Show different multi-panel layout strategies."""
    # Layout 1: Simple uniform grid
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))
    np.random.seed(SEED)

    for idx, (title, data) in enumerate([
        ('Normal', np.random.randn(1000)),
        ('Uniform', np.random.uniform(-3, 3, 1000)),
        ('Exponential', np.random.exponential(1, 1000)),
    ]):
        axes[idx].hist(data, bins=40, color=COLOR_LIST[idx], alpha=0.7, edgecolor='white')
        axes[idx].set_title(f'{title} (μ={data.mean():.2f}, σ={data.std():.2f})')
        axes[idx].set_xlabel('Value')
        axes[idx].set_ylabel('Count')

    fig.suptitle('Layout 1: Uniform 1×3 Grid', fontsize=13)
    plt.tight_layout()
    plt.show()

    # Layout 2: GridSpec for unequal panels
    fig = plt.figure(figsize=(14, 6))
    gs = gridspec.GridSpec(2, 3, width_ratios=[2, 1, 1], height_ratios=[1, 1],
                           hspace=0.4, wspace=0.3)

    # Large panel on left
    ax_main = fig.add_subplot(gs[:, 0])
    ax_main.scatter(df['MedInc'].iloc[:500], df['MedHouseVal'].iloc[:500],
                    s=10, alpha=0.5, color=COLORS['blue'])
    ax_main.set_xlabel('Median Income')
    ax_main.set_ylabel('House Value')
    ax_main.set_title('Main Plot (spans both rows)')

    # Small panels on right
    ax_top_r1 = fig.add_subplot(gs[0, 1])
    ax_top_r1.hist(df['MedInc'], bins=30, color=COLORS['green'], alpha=0.7)
    ax_top_r1.set_title('Income Dist')

    ax_top_r2 = fig.add_subplot(gs[0, 2])
    ax_top_r2.hist(df['MedHouseVal'], bins=30, color=COLORS['orange'], alpha=0.7)
    ax_top_r2.set_title('Value Dist')

    ax_bot_r1 = fig.add_subplot(gs[1, 1])
    ax_bot_r1.hist(df['HouseAge'], bins=30, color=COLORS['purple'], alpha=0.7)
    ax_bot_r1.set_title('Age Dist')

    ax_bot_r2 = fig.add_subplot(gs[1, 2])
    ax_bot_r2.hist(df['AveRooms'], bins=30, color=COLORS['teal'], alpha=0.7)
    ax_bot_r2.set_title('Rooms Dist')

    fig.suptitle('Layout 2: GridSpec (unequal panels)', fontsize=13)
    plt.show()


demonstrate_subplot_layouts()
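
The third option listed earlier, `fig.add_subplot()`, appears above only in combination with `GridSpec`. As a rough sketch, it can also build a grid one panel at a time from an `(nrows, ncols, index)` triple. The random data and the seed below are arbitrary:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for this sketch
fig = plt.figure(figsize=(9, 3))

# add_subplot(nrows, ncols, index) uses a 1-based index, counted row by row.
panels = []
for index, scale in enumerate([0.5, 1.0, 2.0], start=1):
    ax = fig.add_subplot(1, 3, index)
    ax.hist(rng.normal(0, scale, size=500), bins=30)
    ax.set_title(f'scale={scale}')
    panels.append(ax)

fig.tight_layout()
print(len(fig.axes))  # 3
```

This form is handy when the number of panels is only known at run time, for example one panel per model in a list.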

### 1.4 Styling, Colors, and Annotation

Professional-looking plots need careful styling. We'll cover:

- Color palettes and colormaps
- Text annotations and arrows
- Mathematical notation in labels
- Tick formatting and spine removal

In [None]:
def demonstrate_styling() -> None:
    """Show styling techniques for publication-quality plots."""
    fig, axes = plt.subplots(1, 3, figsize=(16, 5))
    np.random.seed(SEED)

    # ── Panel 1: Color and style options ─────────────────────────────────────
    x = np.linspace(0, 4 * np.pi, 200)
    styles = [
        {'color': COLORS['blue'], 'linestyle': '-', 'linewidth': 2, 'label': 'Solid'},
        {'color': COLORS['red'], 'linestyle': '--', 'linewidth': 2, 'label': 'Dashed'},
        {'color': COLORS['green'], 'linestyle': '-.', 'linewidth': 2, 'label': 'Dash-dot'},
        {'color': COLORS['orange'], 'linestyle': ':', 'linewidth': 2.5, 'label': 'Dotted'},
    ]
    for i, style in enumerate(styles):
        axes[0].plot(x, np.sin(x + i * 0.5) * (1 - i * 0.15), **style)
    axes[0].legend()
    axes[0].set_title('Line Styles & Colors')
    axes[0].set_xlabel('x')
    axes[0].set_ylabel('y')
    axes[0].grid(True, alpha=0.2)

    # ── Panel 2: Annotations and arrows ──────────────────────────────────────
    x_data = np.linspace(0, 10, 100)
    y_data = np.sin(x_data) * np.exp(-0.1 * x_data)
    axes[1].plot(x_data, y_data, color=COLORS['blue'], linewidth=2)
    axes[1].fill_between(x_data, y_data, alpha=0.1, color=COLORS['blue'])

    # Annotate maximum
    max_idx = np.argmax(y_data)
    axes[1].annotate(
        f'Peak = {y_data[max_idx]:.2f}',
        xy=(x_data[max_idx], y_data[max_idx]),
        xytext=(x_data[max_idx] + 2, y_data[max_idx] + 0.3),
        fontsize=10,
        arrowprops=dict(arrowstyle='->', color=COLORS['red'], lw=1.5),
        bbox=dict(boxstyle='round,pad=0.3', facecolor='yellow', alpha=0.7),
    )
    axes[1].set_title('Annotations & Fill')
    axes[1].set_xlabel('x')
    axes[1].set_ylabel('Damped sine')

    # ── Panel 3: LaTeX math labels ──────────────────────────────────────────
    x_gauss = np.linspace(-4, 4, 200)
    for mu, sigma, color in [(0, 1, COLORS['blue']),
                              (0, 0.5, COLORS['red']),
                              (1, 1.5, COLORS['green'])]:
        y_gauss = (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x_gauss - mu) / sigma) ** 2)
        axes[2].plot(x_gauss, y_gauss, color=color, linewidth=2,
                     label=f'$\\mu={mu}, \\sigma={sigma}$')
    axes[2].set_xlabel('$x$')
    axes[2].set_ylabel('$p(x)$')
    axes[2].set_title(r'$\mathcal{N}(\mu, \sigma^2)$ — Gaussian PDF')
    axes[2].legend()
    axes[2].grid(True, alpha=0.2)

    plt.tight_layout()
    plt.show()


demonstrate_styling()
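
Tick formatting, mentioned in the list for this section, is not covered by the demo above. One possible approach uses the formatter and locator classes in `matplotlib.ticker`. The accuracy values below are invented for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MaxNLocator, PercentFormatter

epochs = np.arange(1, 11)
accuracy = np.linspace(0.55, 0.92, 10)  # made-up values in [0, 1]

fig, ax = plt.subplots(figsize=(6, 3.5))
ax.plot(epochs, accuracy, marker='o')

# xmax=1.0 tells the formatter that a data value of 1.0 means 100%.
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1.0, decimals=0))
# Epochs are whole numbers, so restrict the x ticks to integers.
ax.xaxis.set_major_locator(MaxNLocator(integer=True))

ax.set_xlabel('Epoch')
ax.set_ylabel('Accuracy')
fig.tight_layout()
```

Formatting the ticks leaves the underlying data untouched, so the same array can still be used for computations without converting to and from percentages.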

### 1.5 Colormaps: Choosing the Right Palette

Choosing the right colormap is critical for readability:

- **Sequential** (`viridis`, `plasma`, `inferno`): for ordered data (values, intensities)
- **Diverging** (`coolwarm`, `RdBu`): for data with a meaningful center (correlations, errors)
- **Qualitative** (`tab10`, `Set2`): for categorical data (classes, clusters)

**Always avoid `jet`**: it creates visual artifacts and is not perceptually uniform.

In [None]:
def demonstrate_colormaps() -> None:
    """Show colormap categories with examples."""
    fig, axes = plt.subplots(1, 3, figsize=(16, 5))
    np.random.seed(SEED)

    # Sequential — for ordered values
    data_seq = np.random.randn(20, 20)
    im0 = axes[0].imshow(data_seq, cmap='viridis')
    axes[0].set_title('Sequential: viridis\n(ordered data)')
    plt.colorbar(im0, ax=axes[0], shrink=0.8)

    # Diverging — centered at zero
    data_div = np.random.randn(20, 20)
    im1 = axes[1].imshow(data_div, cmap='coolwarm', vmin=-2, vmax=2)
    axes[1].set_title('Diverging: coolwarm\n(centered data)')
    plt.colorbar(im1, ax=axes[1], shrink=0.8)

    # Qualitative — for categories
    X_demo, y_demo = make_blobs(n_samples=200, centers=4, random_state=SEED)
    cmap_qual = ListedColormap(COLOR_LIST[:4])
    axes[2].scatter(X_demo[:, 0], X_demo[:, 1], c=y_demo, cmap=cmap_qual,
                     s=30, alpha=0.7, edgecolors='white', linewidths=0.5)
    axes[2].set_title('Qualitative: custom palette\n(categorical data)')
    axes[2].set_xlabel('Feature 1')
    axes[2].set_ylabel('Feature 2')

    plt.tight_layout()
    plt.show()

    # Colormap reference table
    cmap_ref = pd.DataFrame({
        'Type': ['Sequential', 'Sequential', 'Diverging', 'Diverging', 'Qualitative'],
        'Name': ['viridis', 'plasma', 'coolwarm', 'RdBu_r', 'tab10'],
        'Use Case': ['Loss values, probabilities', 'Attention weights',
                     'Correlations, residuals', 'Weight matrices', 'Class labels, clusters'],
        'Perceptual': ['Uniform', 'Uniform', 'Symmetric', 'Symmetric', 'Distinct'],
    })
    print('=== Colormap Reference ===')
    print(cmap_ref.to_string(index=False))


demonstrate_colormaps()
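
The diverging panel above works because `vmin=-2` and `vmax=2` are symmetric around zero. When the data range is lopsided, the colormap's neutral midpoint drifts away from the meaningful center. One way to pin it back, sketched here with made-up residuals, is `TwoSlopeNorm` from `matplotlib.colors`:

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import TwoSlopeNorm

rng = np.random.default_rng(0)  # arbitrary seed for this sketch
# Asymmetric data: values spread over roughly [-1, 4].
residuals = rng.uniform(-1.0, 4.0, size=(20, 20))

# vcenter pins the colormap's neutral midpoint to zero,
# even though the range is wider on the positive side.
norm = TwoSlopeNorm(vmin=-1.0, vcenter=0.0, vmax=4.0)

fig, ax = plt.subplots(figsize=(5, 4))
im = ax.imshow(residuals, cmap='coolwarm', norm=norm)
fig.colorbar(im, ax=ax, shrink=0.8)

print(float(norm(0.0)))  # 0.5, i.e. the middle of the colormap
```

The trade-off is that the two halves of the colorbar use different scales, so equal color steps no longer mean equal value steps on both sides of zero.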

### 1.6 Clean Aesthetics: Spines, Ticks, and Minimal Style

Removing visual clutter makes plots more professional. Common techniques:

- Remove top/right spines
- Reduce tick density
- Use subtle gridlines
- Choose appropriate font sizes

In [None]:
def demonstrate_clean_aesthetics() -> None:
    """Show how to create clean, publication-quality plots."""
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    np.random.seed(SEED)
    x = np.arange(1, 11)
    y = np.array([2.3, 3.1, 4.7, 5.2, 7.1, 8.3, 9.0, 10.5, 11.2, 12.8])

    # Left: Default (cluttered)
    axes[0].plot(x, y, 'o-', color=COLORS['blue'], linewidth=2, markersize=8)
    axes[0].set_title('Default Style (cluttered)')
    axes[0].set_xlabel('Experiment')
    axes[0].set_ylabel('Accuracy (%)')
    axes[0].grid(True)

    # Right: Clean style
    axes[1].plot(x, y, 'o-', color=COLORS['blue'], linewidth=2, markersize=8)
    axes[1].set_title('Clean Style (professional)')
    axes[1].set_xlabel('Experiment')
    axes[1].set_ylabel('Accuracy (%)')

    # Remove top and right spines
    axes[1].spines['top'].set_visible(False)
    axes[1].spines['right'].set_visible(False)

    # Subtle grid on y-axis only
    axes[1].grid(True, axis='y', alpha=0.2, linestyle='--')

    # Better tick formatting
    axes[1].set_xticks(x)
    axes[1].set_ylim(0, 15)

    plt.tight_layout()
    plt.show()


demonstrate_clean_aesthetics()

---

## Part 2 — Putting It All Together: MLPlotter Class

Let's assemble the individual plotting techniques into a reusable `MLPlotter` class that provides the standard visualizations needed throughout this course.

In [None]:
class MLPlotter:
    """Reusable plotting utilities for ML/DL notebooks.

    Provides standard visualizations for training curves, confusion matrices,
    decision boundaries, image grids, and more.

    Attributes:
        colors: Color palette used for all plots.
    """

    def __init__(self, colors: list[str] | None = None) -> None:
        """Initialize with optional custom color palette.

        Args:
            colors: List of hex color strings. Uses course default if None.
        """
        self.colors = colors or COLOR_LIST

    def plot_training_curves(
        self,
        train_losses: list[float],
        val_losses: list[float],
        train_accs: list[float] | None = None,
        val_accs: list[float] | None = None,
        title: str = 'Training Curves',
    ) -> None:
        """Plot training and validation loss/accuracy curves.

        Args:
            train_losses: Training loss per epoch.
            val_losses: Validation loss per epoch.
            train_accs: Optional training accuracy per epoch.
            val_accs: Optional validation accuracy per epoch.
            title: Figure title.
        """
        # Add the accuracy panel only when both series are given, matching the check below.
        n_panels = 2 if (train_accs and val_accs) else 1
        fig, axes = plt.subplots(1, n_panels, figsize=(6 * n_panels, 4))
        if n_panels == 1:
            axes = [axes]

        epochs = range(1, len(train_losses) + 1)

        # Loss
        axes[0].plot(epochs, train_losses, color=self.colors[0], label='Train', linewidth=1.5)
        axes[0].plot(epochs, val_losses, color=self.colors[1], label='Val', linewidth=1.5)
        best_epoch = np.argmin(val_losses) + 1
        axes[0].axvline(best_epoch, color='gray', linestyle='--', alpha=0.5,
                         label=f'Best epoch: {best_epoch}')
        axes[0].set_xlabel('Epoch')
        axes[0].set_ylabel('Loss')
        axes[0].set_title('Loss')
        axes[0].legend()
        axes[0].grid(True, alpha=0.2)

        # Accuracy
        if train_accs and val_accs:
            axes[1].plot(epochs, train_accs, color=self.colors[0], label='Train', linewidth=1.5)
            axes[1].plot(epochs, val_accs, color=self.colors[1], label='Val', linewidth=1.5)
            axes[1].set_xlabel('Epoch')
            axes[1].set_ylabel('Accuracy')
            axes[1].set_title('Accuracy')
            axes[1].legend()
            axes[1].grid(True, alpha=0.2)

        fig.suptitle(title, fontsize=13)
        plt.tight_layout()
        plt.show()

    def plot_confusion_matrix(
        self,
        confusion_mat: np.ndarray,
        class_names: list[str] | None = None,
        title: str = 'Confusion Matrix',
        normalize: bool = False,
    ) -> None:
        """Plot a confusion matrix heatmap.

        Args:
            confusion_mat: Square confusion matrix array.
            class_names: Class label names. Uses indices if None.
            title: Plot title.
            normalize: If True, normalize rows to show percentages.
        """
        if normalize:
            row_sums = confusion_mat.sum(axis=1, keepdims=True)
            confusion_mat = confusion_mat.astype(float) / row_sums.clip(min=1)

        n_classes = confusion_mat.shape[0]
        if class_names is None:
            class_names = [str(i) for i in range(n_classes)]

        fig, ax = plt.subplots(figsize=(max(6, n_classes * 0.8), max(5, n_classes * 0.7)))
        im = ax.imshow(confusion_mat, cmap='Blues')

        ax.set_xticks(range(n_classes))
        ax.set_xticklabels(class_names, rotation=45, ha='right')
        ax.set_yticks(range(n_classes))
        ax.set_yticklabels(class_names)
        ax.set_xlabel('Predicted')
        ax.set_ylabel('Actual')
        ax.set_title(title)

        fmt = '.2f' if normalize else 'd'
        thresh = confusion_mat.max() / 2
        for i in range(n_classes):
            for j in range(n_classes):
                color = 'white' if confusion_mat[i, j] > thresh else 'black'
                ax.text(j, i, format(confusion_mat[i, j], fmt),
                         ha='center', va='center', color=color, fontsize=10)

        plt.colorbar(im, ax=ax, shrink=0.8)
        plt.tight_layout()
        plt.show()

    def plot_decision_boundary(
        self,
        X: np.ndarray,
        y: np.ndarray,
        predict_fn: callable,
        title: str = 'Decision Boundary',
        resolution: int = 200,
    ) -> None:
        """Plot 2D decision boundary for a classifier.

        Args:
            X: Feature array of shape (n_samples, 2).
            y: Label array of shape (n_samples,).
            predict_fn: Callable that takes (n, 2) array and returns labels.
            title: Plot title.
            resolution: Grid resolution for the boundary.
        """
        assert X.shape[1] == 2, f'Need 2D features, got {X.shape[1]}D'

        x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
        y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
        xx, yy = np.meshgrid(
            np.linspace(x_min, x_max, resolution),
            np.linspace(y_min, y_max, resolution),
        )
        grid_points = np.c_[xx.ravel(), yy.ravel()]
        Z = predict_fn(grid_points).reshape(xx.shape)

        n_classes = len(np.unique(y))
        cmap_bg = ListedColormap(self.colors[:n_classes])

        fig, ax = plt.subplots(figsize=(8, 6))
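        # Put level edges at -0.5, 0.5, ... so each integer label 0..n_classes-1 gets its own band.
        levels = np.arange(n_classes + 1) - 0.5
        ax.contourf(xx, yy, Z, alpha=0.3, cmap=cmap_bg, levels=levels)
        ax.contour(xx, yy, Z, levels=levels[1:-1], colors='white', linewidths=0.8, alpha=0.5)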

        for cls in np.unique(y):
            mask = y == cls
            ax.scatter(X[mask, 0], X[mask, 1], c=self.colors[cls],
                        s=30, label=f'Class {cls}', edgecolors='white',
                        linewidths=0.5, alpha=0.8)

        ax.set_xlabel('Feature 1')
        ax.set_ylabel('Feature 2')
        ax.set_title(title)
        ax.legend()
        plt.tight_layout()
        plt.show()

    def plot_image_grid(
        self,
        images: np.ndarray,
        labels: list[str] | None = None,
        n_cols: int = 8,
        title: str = 'Image Grid',
        cmap: str = 'gray',
    ) -> None:
        """Display a grid of images.

        Args:
            images: Array of shape (n, H, W) or (n, H, W, C).
            labels: Optional label for each image.
            n_cols: Number of columns in the grid.
            title: Figure title.
            cmap: Colormap for grayscale images.
        """
        n_images = len(images)
        n_rows = (n_images + n_cols - 1) // n_cols

        fig, axes = plt.subplots(n_rows, n_cols, figsize=(n_cols * 1.5, n_rows * 1.8))
        axes = np.array(axes).ravel()

        for idx in range(len(axes)):
            if idx < n_images:
                axes[idx].imshow(images[idx], cmap=cmap)
                if labels is not None:
                    axes[idx].set_title(str(labels[idx]), fontsize=8)
            axes[idx].axis('off')

        fig.suptitle(title, fontsize=13)
        plt.tight_layout()
        plt.show()

    def plot_attention_heatmap(
        self,
        attention_weights: np.ndarray,
        x_labels: list[str] | None = None,
        y_labels: list[str] | None = None,
        title: str = 'Attention Weights',
    ) -> None:
        """Plot attention weights as a heatmap.

        Args:
            attention_weights: 2D array of shape (query_len, key_len).
            x_labels: Labels for keys (columns).
            y_labels: Labels for queries (rows).
            title: Plot title.
        """
        fig, ax = plt.subplots(figsize=(max(6, len(attention_weights[0]) * 0.6),
                                         max(4, len(attention_weights) * 0.5)))
        im = ax.imshow(attention_weights, cmap='viridis', aspect='auto')

        if x_labels is not None:
            ax.set_xticks(range(len(x_labels)))
            ax.set_xticklabels(x_labels, rotation=45, ha='right')
        if y_labels is not None:
            ax.set_yticks(range(len(y_labels)))
            ax.set_yticklabels(y_labels)

        ax.set_xlabel('Key')
        ax.set_ylabel('Query')
        ax.set_title(title)
        plt.colorbar(im, ax=ax, shrink=0.8)
        plt.tight_layout()
        plt.show()

Let's verify the MLPlotter class works correctly with sample data.

In [None]:
def test_ml_plotter() -> None:
    """Test all MLPlotter methods with synthetic data."""
    plotter = MLPlotter()
    np.random.seed(SEED)

    # 1. Training curves
    epochs = 30
    t_loss = [2.5 * np.exp(-0.1 * i) + 0.3 + np.random.randn() * 0.05 for i in range(epochs)]
    v_loss = [2.5 * np.exp(-0.08 * i) + 0.5 + np.random.randn() * 0.08 for i in range(epochs)]
    t_acc = [1 - loss / 3 for loss in t_loss]
    v_acc = [1 - loss / 3 for loss in v_loss]
    plotter.plot_training_curves(t_loss, v_loss, t_acc, v_acc, title='Test: Training Curves')
    print('Training curves: ✓')

    # 2. Confusion matrix
    # Named conf_mat so it does not shadow the `matplotlib.cm` import (`cm`).
    conf_mat = np.array([[45, 3, 2], [5, 38, 7], [1, 4, 45]])
    plotter.plot_confusion_matrix(conf_mat, class_names=['Cat', 'Dog', 'Bird'], title='Test: Confusion Matrix')
    print('Confusion matrix: ✓')

    # 3. Decision boundary
    from sklearn.neighbors import KNeighborsClassifier
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_moons, y_moons)
    plotter.plot_decision_boundary(X_moons, y_moons, knn.predict, title='Test: Decision Boundary (KNN)')
    print('Decision boundary: ✓')

    # 4. Image grid
    fake_images = np.random.rand(16, 28, 28)
    fake_labels = [f'Digit {i % 10}' for i in range(16)]
    plotter.plot_image_grid(fake_images, labels=fake_labels, n_cols=8, title='Test: Image Grid')
    print('Image grid: ✓')

    # 5. Attention heatmap
    tokens = ['The', 'cat', 'sat', 'on', 'the', 'mat']
    attn = np.random.dirichlet(np.ones(6), size=6)  # Each row sums to 1
    plotter.plot_attention_heatmap(attn, x_labels=tokens, y_labels=tokens,
                                    title='Test: Attention Heatmap')
    print('Attention heatmap: ✓')

    print()
    print('All MLPlotter methods passed!')


test_ml_plotter()
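
The test above calls `plot_confusion_matrix` with raw counts. The `normalize=True` option divides each row by its total instead. Here is that arithmetic worked through on the same 3×3 matrix, as a standalone sketch that does not call the class:

```python
import numpy as np

conf_mat = np.array([[45, 3, 2],
                     [5, 38, 7],
                     [1, 4, 45]])

# Divide each row by its total so that every row sums to 1.
row_sums = conf_mat.sum(axis=1, keepdims=True)
normalized = conf_mat / row_sums.clip(min=1)  # clip guards against empty rows

# Rows are actual classes, so the diagonal is the per-class recall.
per_class_recall = np.round(np.diag(normalized), 2).tolist()
print(per_class_recall)  # [0.9, 0.76, 0.9]
```

Each row here sums to 50, so the raw and normalized views tell the same story. Normalizing matters most when the classes are imbalanced, because a large class would otherwise dominate the color scale.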

---

## Part 3 — Application: ML Visualization Workflows

Now we apply our plotting skills to realistic ML scenarios. These are the exact visualization patterns you'll use in Modules 2–20.

### 3.1 Complete EDA Visualization

A thorough EDA visualization should show distributions, correlations, geographic patterns, and outliers, all in a single, coherent figure.

In [None]:
def complete_eda_visualization(data: pd.DataFrame) -> None:
    """Create a comprehensive EDA dashboard for the housing dataset.

    Args:
        data: California Housing DataFrame.
    """
    fig = plt.figure(figsize=(18, 12))
    gs = gridspec.GridSpec(3, 3, hspace=0.35, wspace=0.3)

    # (0,0) Target distribution
    ax0 = fig.add_subplot(gs[0, 0])
    ax0.hist(data['MedHouseVal'], bins=50, color=COLORS['blue'], alpha=0.7, edgecolor='white')
    ax0.set_xlabel('Median House Value ($100K)')
    ax0.set_ylabel('Count')
    ax0.set_title('Target Distribution')
    ax0.axvline(data['MedHouseVal'].mean(), color=COLORS['red'], linestyle='--')

    # (0,1) Income vs Value scatter
    ax1 = fig.add_subplot(gs[0, 1])
    sample = data.sample(2000, random_state=SEED)
    ax1.scatter(sample['MedInc'], sample['MedHouseVal'], s=5, alpha=0.4, color=COLORS['blue'])
    ax1.set_xlabel('Median Income')
    ax1.set_ylabel('House Value')
    ax1.set_title('Income vs House Value')

    # (0,2) Correlation heatmap
    ax2 = fig.add_subplot(gs[0, 2])
    corr = data.corr()
    im = ax2.imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)
    ax2.set_xticks(range(len(corr)))
    ax2.set_xticklabels(corr.columns, rotation=45, ha='right', fontsize=7)
    ax2.set_yticks(range(len(corr)))
    ax2.set_yticklabels(corr.columns, fontsize=7)
    ax2.set_title('Correlation Matrix')
    plt.colorbar(im, ax=ax2, shrink=0.7)

    # (1,0:2) Geographic scatter (wide)
    ax3 = fig.add_subplot(gs[1, :2])
    sc = ax3.scatter(
        data['Longitude'], data['Latitude'],
        c=data['MedHouseVal'], cmap='viridis',
        s=2, alpha=0.5)
    ax3.set_xlabel('Longitude')
    ax3.set_ylabel('Latitude')
    ax3.set_title('Geographic Distribution (color = house value)')
    plt.colorbar(sc, ax=ax3, shrink=0.8, label='Value ($100K)')

    # (1,2) Population histogram
    ax4 = fig.add_subplot(gs[1, 2])
    ax4.hist(np.log1p(data['Population']), bins=50, color=COLORS['green'],
              alpha=0.7, edgecolor='white')
    ax4.set_xlabel('log(Population + 1)')
    ax4.set_ylabel('Count')
    ax4.set_title('Population (log-transformed)')

    # (2,0) HouseAge distribution
    ax5 = fig.add_subplot(gs[2, 0])
    ax5.hist(data['HouseAge'], bins=50, color=COLORS['orange'], alpha=0.7, edgecolor='white')
    ax5.set_xlabel('House Age (years)')
    ax5.set_ylabel('Count')
    ax5.set_title('House Age Distribution')

    # (2,1) Feature importance by correlation
    ax6 = fig.add_subplot(gs[2, 1])
    target_corr = corr['MedHouseVal'].drop('MedHouseVal').sort_values()
    bar_colors = [COLORS['green'] if v > 0 else COLORS['red'] for v in target_corr.values]
    ax6.barh(target_corr.index, target_corr.values, color=bar_colors)
    ax6.set_xlabel('Correlation with MedHouseVal')
    ax6.set_title('Feature-Target Correlation')  # linear association only, not model importance
    ax6.axvline(0, color='gray', linestyle='-', alpha=0.3)
    ax6.grid(True, axis='x', alpha=0.2)

    # (2,2) Box plot of target by income quintile
    ax7 = fig.add_subplot(gs[2, 2])
    q_labels = ['Q1', 'Q2', 'Q3', 'Q4', 'Q5']
    quintiles = pd.qcut(data['MedInc'], q=5, labels=q_labels)
    box_data = [data.loc[quintiles == q, 'MedHouseVal'].values for q in q_labels]
    # Set tick labels separately: boxplot's `labels` keyword is deprecated in Matplotlib 3.9+
    bp = ax7.boxplot(box_data, patch_artist=True)
    ax7.set_xticklabels(q_labels)
    for patch, color in zip(bp['boxes'], COLOR_LIST[:5]):
        patch.set_facecolor(color)
        patch.set_alpha(0.6)
    ax7.set_xlabel('Income Quintile')
    ax7.set_ylabel('House Value')
    ax7.set_title('Value by Income')

    fig.suptitle('California Housing — Comprehensive EDA', fontsize=16, fontweight='bold')
    plt.show()


complete_eda_visualization(df)
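
The dashboard above depends on `GridSpec` slicing (for example `gs[1, :2]`) to let one panel span several columns. The following is a minimal sketch of that mechanism alone, using synthetic data rather than the housing DataFrame; the variable names and the 2×3 layout are illustrative assumptions, not part of the course code.

```python
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import numpy as np

rng = np.random.default_rng(0)           # synthetic stand-in data (assumption)
values = rng.normal(size=500)

fig = plt.figure(figsize=(9, 6))
gs = gridspec.GridSpec(2, 3, hspace=0.4, wspace=0.3)

ax_wide = fig.add_subplot(gs[0, :2])     # row 0, columns 0-1: one wide panel
ax_side = fig.add_subplot(gs[0, 2])      # row 0, column 2
ax_full = fig.add_subplot(gs[1, :])      # row 1, all three columns

ax_wide.plot(np.cumsum(values))
ax_wide.set_title('gs[0, :2]')
ax_side.hist(values, bins=30)
ax_side.set_title('gs[0, 2]')
ax_full.scatter(values[:-1], values[1:], s=5, alpha=0.4)
ax_full.set_title('gs[1, :]')

# Panel widths in figure coordinates: the wide panel covers two columns plus the gap
wide_w = ax_wide.get_position().width
side_w = ax_side.get_position().width
print(f'wide / side width ratio: {wide_w / side_w:.2f}')
```

Slices follow NumPy indexing rules, so `gs[0, :2]` means "row 0, columns 0 and 1". In a notebook the figure renders at the end of the cell; in a script, add `plt.show()`.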

### 3.2 Decision Boundary Visualization

Decision boundaries are one of the most insightful ML visualizations. They show how a classifier partitions the feature space. Let's compare several classifiers on the same dataset.

In [None]:
def compare_decision_boundaries() -> None:
    """Compare decision boundaries of different classifiers."""
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.svm import SVC

    classifiers = [
        ('Logistic Regression', LogisticRegression()),
        ('KNN (k=5)', KNeighborsClassifier(n_neighbors=5)),
        ('Decision Tree', DecisionTreeClassifier(max_depth=5, random_state=SEED)),
        ('SVM (RBF)', SVC(kernel='rbf', gamma='auto')),
    ]

    fig, axes = plt.subplots(1, 4, figsize=(20, 4.5))
    cmap_bg = ListedColormap([COLORS['blue'], COLORS['red']])

    # Create mesh grid
    x_min, x_max = X_moons[:, 0].min() - 1, X_moons[:, 0].max() + 1
    y_min, y_max = X_moons[:, 1].min() - 1, X_moons[:, 1].max() + 1
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, 200),
                          np.linspace(y_min, y_max, 200))
    grid_points = np.c_[xx.ravel(), yy.ravel()]

    for idx, (name, clf) in enumerate(classifiers):
        clf.fit(X_moons, y_moons)
        Z = clf.predict(grid_points).reshape(xx.shape)
        # Scored on the same points used for fitting, so this is training accuracy
        accuracy = clf.score(X_moons, y_moons)

        axes[idx].contourf(xx, yy, Z, alpha=0.3, cmap=cmap_bg)
        axes[idx].contour(xx, yy, Z, colors='white', linewidths=0.8, alpha=0.5)

        for cls, color in zip([0, 1], [COLORS['blue'], COLORS['red']]):
            mask = y_moons == cls
            axes[idx].scatter(X_moons[mask, 0], X_moons[mask, 1],
                               c=color, s=15, edgecolors='white',
                               linewidths=0.3, alpha=0.7)

        axes[idx].set_title(f'{name}\nTrain acc: {accuracy:.2%}')
        axes[idx].set_xlabel('Feature 1')
        if idx == 0:
            axes[idx].set_ylabel('Feature 2')

    plt.suptitle('Decision Boundaries: Moons Dataset', fontsize=14, fontweight='bold')
    plt.tight_layout()
    plt.show()


compare_decision_boundaries()
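
The core of the function above is a four-step pattern: build a grid, flatten it, predict every grid point, and reshape the predictions back to the grid. The sketch below isolates that pattern with NumPy only. The nearest-centroid `predict` function and the two Gaussian clusters are hypothetical stand-ins for a fitted scikit-learn classifier and the moons data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two synthetic 2-D clusters (stand-in data, not the moons dataset)
X = np.vstack([rng.normal([-1.0, 0.0], 0.5, size=(50, 2)),
               rng.normal([1.0, 0.0], 0.5, size=(50, 2))])
y = np.repeat([0, 1], 50)

# Hypothetical toy classifier: assign each point to the nearest class centroid
centroids = np.array([X[y == k].mean(axis=0) for k in (0, 1)])


def predict(points: np.ndarray) -> np.ndarray:
    """Return the index of the closest centroid for each row of `points`."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


# 1) grid covering the data, 2) flatten to (n_points, 2),
# 3) predict each grid point, 4) reshape back to the grid shape
xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 100),
                     np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 80))
grid_points = np.c_[xx.ravel(), yy.ravel()]
Z = predict(grid_points).reshape(xx.shape)

print(xx.shape, grid_points.shape, Z.shape)
```

Note that `meshgrid` returns arrays of shape `(len(y_values), len(x_values))`, here `(80, 100)`, which is the layout `ax.contourf(xx, yy, Z)` expects.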

### 3.3 Training Simulation & Visualization

Let's simulate a full training run and create the standard training curve visualization that appears in every DL notebook.

In [None]:
def simulate_training_and_plot() -> None:
    """Simulate a training run and create publication-quality training curves."""
    np.random.seed(SEED)
    n_epochs = 50

    # Simulate realistic training curves
    train_losses: list[float] = []
    val_losses: list[float] = []
    train_accs: list[float] = []
    val_accs: list[float] = []

    for epoch in range(n_epochs):
        # Training loss decreases with noise
        t_loss = 2.0 * np.exp(-0.06 * epoch) + 0.2 + np.random.randn() * 0.03
        # Validation loss decreases then increases (overfitting)
        v_loss = 2.0 * np.exp(-0.04 * epoch) + 0.4 + 0.003 * max(0, epoch - 25) ** 1.5
        v_loss += np.random.randn() * 0.05
        # Accuracies
        t_acc = 1.0 - t_loss / 3.0 + np.random.randn() * 0.01
        v_acc = 1.0 - v_loss / 3.0 + np.random.randn() * 0.01

        train_losses.append(max(t_loss, 0.01))
        val_losses.append(max(v_loss, 0.01))
        train_accs.append(min(max(t_acc, 0), 1))
        val_accs.append(min(max(v_acc, 0), 1))

    # Full training curve visualization
    fig, axes = plt.subplots(1, 3, figsize=(18, 5))

    # Loss curves
    axes[0].plot(range(1, n_epochs + 1), train_losses, color=COLORS['blue'],
                 label='Train Loss', linewidth=1.5)
    axes[0].plot(range(1, n_epochs + 1), val_losses, color=COLORS['red'],
                 label='Val Loss', linewidth=1.5)
    best_epoch = np.argmin(val_losses) + 1
    axes[0].axvline(best_epoch, color='gray', linestyle='--', alpha=0.5,
                     label=f'Best: epoch {best_epoch}')
    # Shade overfitting region
    overfit_start = 25
    axes[0].axvspan(overfit_start, n_epochs, alpha=0.05, color='red', label='Overfitting')
    axes[0].set_xlabel('Epoch')
    axes[0].set_ylabel('Loss')
    axes[0].set_title('Loss Curves')
    axes[0].legend(fontsize=9)
    axes[0].grid(True, alpha=0.2)

    # Accuracy curves
    axes[1].plot(range(1, n_epochs + 1), train_accs, color=COLORS['blue'],
                 label='Train Acc', linewidth=1.5)
    axes[1].plot(range(1, n_epochs + 1), val_accs, color=COLORS['red'],
                 label='Val Acc', linewidth=1.5)
    axes[1].axvline(best_epoch, color='gray', linestyle='--', alpha=0.5)
    axes[1].set_xlabel('Epoch')
    axes[1].set_ylabel('Accuracy')
    axes[1].set_title('Accuracy Curves')
    axes[1].legend(fontsize=9)
    axes[1].grid(True, alpha=0.2)

    # Generalization gap
    gap = np.array(val_losses) - np.array(train_losses)
    axes[2].plot(range(1, n_epochs + 1), gap, color=COLORS['purple'], linewidth=1.5)
    axes[2].fill_between(range(1, n_epochs + 1), gap, alpha=0.2, color=COLORS['purple'])
    axes[2].axhline(0, color='gray', linestyle='--', alpha=0.3)
    axes[2].set_xlabel('Epoch')
    axes[2].set_ylabel('Val Loss - Train Loss')
    axes[2].set_title('Generalization Gap')
    axes[2].grid(True, alpha=0.2)

    fig.suptitle('Training Analysis Dashboard', fontsize=14, fontweight='bold')
    plt.tight_layout()
    plt.show()

    # Summary
    summary = pd.DataFrame({
        'Metric': ['Best Epoch', 'Best Val Loss', 'Best Val Acc',
                   'Final Train Loss', 'Final Gap'],
        'Value': [best_epoch, val_losses[best_epoch - 1],
                  val_accs[best_epoch - 1], train_losses[-1],
                  gap[-1]],
    })
    print('=== Training Summary ===')
    print(summary.to_string(index=False))


simulate_training_and_plot()
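
Two quantities drive the dashboard above: the best epoch (the minimum of the validation loss) and the generalization gap (validation loss minus training loss). The sketch below computes both on a short, hand-written pair of loss lists, so the indexing is easy to check by eye. The numbers are illustrative, not real training output.

```python
import numpy as np

# Hand-written illustrative losses (not real training output)
train_losses = [1.00, 0.70, 0.50, 0.40, 0.35, 0.30]
val_losses = [1.10, 0.85, 0.70, 0.65, 0.72, 0.80]

best_idx = int(np.argmin(val_losses))   # 0-based position of the minimum
best_epoch = best_idx + 1               # epochs are plotted starting at 1
gap = np.array(val_losses) - np.array(train_losses)

print(f'best epoch: {best_epoch}, val loss there: {val_losses[best_idx]:.2f}')
print(f'gap at best epoch: {gap[best_idx]:.2f}, final gap: {gap[-1]:.2f}')
```

Here the validation loss bottoms out at epoch 4 while the training loss keeps falling, so the gap grows from 0.25 at the best epoch to 0.50 at the end. The `+ 1` and `- 1` conversions between list index and epoch number are the usual source of off-by-one errors in these plots.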

---

## Part 4 — Evaluation & Analysis

Let's review visualization best practices and common mistakes, and build a comprehensive reference for the course.

### 4.1 Common Visualization Mistakes and Fixes

Bad visualizations can actively mislead. Let's demonstrate common mistakes and their corrections.

In [None]:
def demonstrate_visualization_mistakes() -> None:
    """Show common visualization mistakes and their fixes."""
    np.random.seed(SEED)

    # Mistake 1: Truncated y-axis
    fig, axes = plt.subplots(1, 2, figsize=(14, 4))
    values = [0.89, 0.91, 0.90, 0.92]
    models_m = ['A', 'B', 'C', 'D']

    axes[0].bar(models_m, values, color=COLORS['blue'])
    axes[0].set_ylim(0.88, 0.93)  # Exaggerates differences
    axes[0].set_title('BAD: Truncated Y-axis\n(exaggerates differences)')
    axes[0].set_ylabel('Accuracy')

    axes[1].bar(models_m, values, color=COLORS['green'])
    axes[1].set_ylim(0, 1.0)  # Honest scale
    axes[1].set_title('GOOD: Full Y-axis\n(honest comparison)')
    axes[1].set_ylabel('Accuracy')
    for ax in axes:
        for bar, v in zip(ax.patches, values):
            ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.005,
                     f'{v:.2f}', ha='center', fontsize=9)

    plt.tight_layout()
    plt.show()

    # Mistake 2: Too many colors / no legend
    fig, axes = plt.subplots(1, 2, figsize=(14, 4))
    x_vals = np.linspace(0, 10, 100)

    for i in range(8):
        axes[0].plot(x_vals, np.sin(x_vals + i * 0.5) * (1 - i * 0.1),
                      linewidth=1)
    axes[0].set_title('BAD: 8 lines, no legend, default colors')

    for i, (ls, lw) in enumerate([
        ('-', 2.5), ('--', 2), ('-.', 1.5)]):
        axes[1].plot(x_vals, np.sin(x_vals + i * 0.5) * (1 - i * 0.1),
                      color=COLOR_LIST[i], linestyle=ls, linewidth=lw,
                      label=f'Model {i+1}')
    axes[1].set_title('GOOD: 3 lines, clear legend, distinct styles')
    axes[1].legend()

    plt.tight_layout()
    plt.show()

    # Mistake 3: Overplotting
    fig, axes = plt.subplots(1, 2, figsize=(14, 4))
    n_pts = 5000
    x_over = np.random.randn(n_pts)
    y_over = x_over + np.random.randn(n_pts) * 0.5

    axes[0].scatter(x_over, y_over, s=20, color=COLORS['blue'])
    axes[0].set_title('BAD: Overplotting\n(can\'t see density)')
    axes[0].set_xlabel('x')
    axes[0].set_ylabel('y')

    axes[1].scatter(x_over, y_over, s=5, alpha=0.1, color=COLORS['blue'])
    axes[1].set_title('GOOD: Small + transparent\n(reveals density)')
    axes[1].set_xlabel('x')
    axes[1].set_ylabel('y')

    plt.tight_layout()
    plt.show()

    print('Visualization Rules of Thumb:')
    rules = pd.DataFrame({
        'Rule': [
            'Start y-axis at zero for bar charts',
            'Max 5-6 lines per plot',
            'Use alpha < 0.5 for > 1000 points',
            'Always add labels and title',
            'Use tight_layout() to prevent clipping',
            'Use viridis/coolwarm, never jet',
        ],
        'Why': [
            'Prevents exaggerating small differences',
            'More lines become unreadable',
            'Reveals density in scatter plots',
            'Plot should be self-explanatory',
            'Prevents text/label overlap',
            'jet distorts data with uneven lightness steps',
        ],
    })
    print(rules.to_string(index=False))


demonstrate_visualization_mistakes()
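
To put a number on the truncated-axis problem, it helps to compare the ratio of the drawn bar heights with the ratio of the underlying values. The sketch below does this for models A (0.89) and D (0.92) from the example above; `visual_ratio` is a hypothetical helper introduced only for this illustration.

```python
def visual_ratio(a: float, b: float, y_min: float) -> float:
    """Ratio of drawn bar heights for values a and b when the y-axis starts at y_min."""
    return (b - y_min) / (a - y_min)


low, high = 0.89, 0.92                  # accuracies of models A and D
true_ratio = high / low
truncated = visual_ratio(low, high, y_min=0.88)
honest = visual_ratio(low, high, y_min=0.0)

print(f'true ratio:        {true_ratio:.3f}')
print(f'drawn, y_min=0.88: {truncated:.1f}x')
print(f'drawn, y_min=0:    {honest:.3f}x')
```

With the axis starting at 0.88, model D's bar is drawn about four times as tall as model A's, although its accuracy is only about 3% higher. Starting the axis at zero makes the drawn ratio equal to the true ratio.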

### 4.2 Saving Figures for Reports and Papers

Saving figures correctly matters for papers, presentations, and documentation. Let's cover the key options.

In [None]:
def demonstrate_saving() -> None:
    """Show how to save figures in different formats and resolutions."""
    fig, ax = plt.subplots(figsize=(8, 5))
    x = np.linspace(0, 2 * np.pi, 100)
    ax.plot(x, np.sin(x), color=COLORS['blue'], linewidth=2, label='sin(x)')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_title('Example Figure for Saving')
    ax.legend()
    plt.tight_layout()

    # Show save options (don't actually save to keep notebook clean)
    print('=== Figure Saving Reference ===')
    save_ref = pd.DataFrame({
        'Format': ['PNG', 'PDF', 'SVG', 'EPS'],
        'Command': [
            'fig.savefig("plot.png", dpi=300)',
            'fig.savefig("plot.pdf")',
            'fig.savefig("plot.svg")',
            'fig.savefig("plot.eps")',
        ],
        'Use Case': [
            'Web, presentations, notebooks',
            'Papers, reports (vector)',
            'Web (vector, editable)',
            'LaTeX papers (vector)',
        ],
        'Key Options': [
            'dpi=300 for print quality',
            'bbox_inches="tight" to crop',
            'Scales perfectly at any size',
            'Legacy format, prefer PDF',
        ],
    })
    print(save_ref.to_string(index=False))
    print()
    # dpi only affects raster output (and rasterized elements inside vector files)
    print('Best practice: fig.savefig("plot.pdf", bbox_inches="tight"); add dpi=300 when saving PNG')

    plt.show()


demonstrate_saving()
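
The reference table above only prints the commands. The following sketch actually writes a PNG and a PDF, using a temporary directory so nothing is left on disk, and reads the PNG back to confirm that pixel size equals figure size in inches times `dpi`. The file names and the small 4×3 inch figure are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(figsize=(4, 3))
x = np.linspace(0, 2 * np.pi, 100)
ax.plot(x, np.sin(x))
ax.set_xlabel('x')
ax.set_ylabel('sin(x)')

with tempfile.TemporaryDirectory() as tmp:      # throwaway directory (assumption)
    png_path = Path(tmp) / 'plot.png'
    pdf_path = Path(tmp) / 'plot.pdf'

    fig.savefig(png_path, dpi=100)              # raster: pixels = inches * dpi
    fig.savefig(pdf_path, bbox_inches='tight')  # vector: crop surrounding whitespace

    png_shape = plt.imread(str(png_path)).shape  # (height, width, channels)
    pdf_bytes = pdf_path.stat().st_size

plt.close(fig)                                  # free the figure once it is saved
print(png_shape[:2], pdf_bytes > 0)
```

A 4×3 inch figure at `dpi=100` yields a 400×300 pixel image. Note that `bbox_inches='tight'` changes the saved size, so leave it off when an exact pixel size is required.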

### 4.3 Plot Type Decision Guide

Choosing the right plot type is half the battle. Here's a decision guide mapping data characteristics to the best visualization.

In [None]:
def build_plot_decision_guide() -> None:
    """Create a comprehensive plot type decision guide."""
    guide = pd.DataFrame({
        'Data Type': [
            'One continuous variable',
            'Two continuous variables',
            'Continuous + categorical',
            'Two categorical variables',
            'Time series / sequence',
            'Matrix / 2D grid',
            'Comparison of groups',
            'Distribution comparison',
            'Geographic / spatial',
            'Images',
            'Model predictions vs actual',
            'Feature importance',
        ],
        'Plot Type': [
            'Histogram / KDE',
            'Scatter plot',
            'Box plot / Violin',
            'Heatmap / Mosaic',
            'Line plot',
            'Heatmap / imshow',
            'Bar chart',
            'Overlaid histograms / KDE',
            'Scatter with lat/lon',
            'Image grid (imshow)',
            'Scatter (45° line)',
            'Horizontal bar chart',
        ],
        'Example in ML': [
            'Target distribution',
            'Feature vs target',
            'Accuracy by model type',
            'Confusion matrix',
            'Training curves',
            'Attention weights, correlation',
            'Model comparison',
            'Train vs val loss distribution',
            'Housing prices on map',
            'MNIST samples, generated images',
            'Regression evaluation',
            'Permutation importance',
        ],
    })
    print('=== Plot Type Decision Guide ===')
    print(guide.to_string(index=False))
    print()
    print(f'Total data scenarios covered: {len(guide)}')


build_plot_decision_guide()
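
One row of the guide, "Model predictions vs actual" with a 45° line, can be sketched as follows. The targets and "predictions" are synthetic (true values plus Gaussian noise), so this shows only the plotting pattern, not a real model evaluation.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.uniform(0.5, 5.0, size=200)            # synthetic targets (assumption)
y_pred = y_true + rng.normal(scale=0.4, size=200)   # synthetic "predictions"

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(y_true, y_pred, s=10, alpha=0.5)

# Identity line: points on it are perfect predictions
lo = min(y_true.min(), y_pred.min())
hi = max(y_true.max(), y_pred.max())
ax.plot([lo, hi], [lo, hi], color='gray', linestyle='--', label='y = x')

ax.set_xlim(lo, hi)
ax.set_ylim(lo, hi)
ax.set_aspect('equal')   # same scale on both axes so the line sits at 45 degrees
ax.set_xlabel('Actual')
ax.set_ylabel('Predicted')
ax.legend()
```

Sharing the axis limits and setting an equal aspect ratio is what makes the diagonal a true 45° reference. Points above the line are over-predictions, points below are under-predictions.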

### 4.4 Error Analysis: Misleading Visualizations

Let's demonstrate how the same data can tell different stories depending on visualization choices, and how to spot misleading plots.

In [None]:
def demonstrate_misleading_plots() -> None:
    """Show how visualization choices affect interpretation."""
    np.random.seed(SEED)

    # Same data, different binning
    data_hist = np.concatenate([np.random.normal(3, 0.5, 500),
                                 np.random.normal(5, 0.8, 300)])

    fig, axes = plt.subplots(1, 3, figsize=(16, 4))

    axes[0].hist(data_hist, bins=5, color=COLORS['blue'], alpha=0.7, edgecolor='white')
    axes[0].set_title('5 bins: hides bimodality')
    axes[0].set_xlabel('Value')
    axes[0].set_ylabel('Count')

    axes[1].hist(data_hist, bins=30, color=COLORS['green'], alpha=0.7, edgecolor='white')
    axes[1].set_title('30 bins: reveals two peaks')
    axes[1].set_xlabel('Value')

    axes[2].hist(data_hist, bins=100, color=COLORS['red'], alpha=0.7, edgecolor='white')
    axes[2].set_title('100 bins: too noisy')
    axes[2].set_xlabel('Value')

    plt.suptitle('Same Data, Different Binning', fontsize=13)
    plt.tight_layout()
    plt.show()

    # Simpson's paradox visualization
    group_a_x = np.random.uniform(1, 5, 100)
    group_a_y = -0.5 * group_a_x + 5 + np.random.randn(100) * 0.3
    group_b_x = np.random.uniform(4, 8, 100)
    group_b_y = -0.5 * group_b_x + 8 + np.random.randn(100) * 0.3

    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    # Combined: positive trend
    all_x = np.concatenate([group_a_x, group_b_x])
    all_y = np.concatenate([group_a_y, group_b_y])
    axes[0].scatter(all_x, all_y, s=15, color=COLORS['gray'], alpha=0.5)
    z = np.polyfit(all_x, all_y, 1)
    axes[0].plot(np.sort(all_x), np.polyval(z, np.sort(all_x)),
                  color=COLORS['red'], linewidth=2, label=f'Overall: slope={z[0]:.2f}')
    axes[0].set_title('Combined Data: Positive Trend')
    axes[0].set_xlabel('x')
    axes[0].set_ylabel('y')
    axes[0].legend()

    # Grouped: negative trends
    axes[1].scatter(group_a_x, group_a_y, s=15, color=COLORS['blue'], alpha=0.5, label='Group A')
    axes[1].scatter(group_b_x, group_b_y, s=15, color=COLORS['red'], alpha=0.5, label='Group B')
    za = np.polyfit(group_a_x, group_a_y, 1)
    zb = np.polyfit(group_b_x, group_b_y, 1)
    axes[1].plot(np.sort(group_a_x), np.polyval(za, np.sort(group_a_x)),
                  color=COLORS['blue'], linewidth=2)
    axes[1].plot(np.sort(group_b_x), np.polyval(zb, np.sort(group_b_x)),
                  color=COLORS['red'], linewidth=2)
    axes[1].set_title(f"Simpson's Paradox: Both Negative (A={za[0]:.2f}, B={zb[0]:.2f})")
    axes[1].set_xlabel('x')
    axes[1].set_ylabel('y')
    axes[1].legend()

    plt.tight_layout()
    plt.show()

    print('Key lesson: Always consider subgroups in your data.')
    print('A trend in aggregated data may reverse when you look at groups separately.')


demonstrate_misleading_plots()
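
The sign reversal in the Simpson's paradox panel can be checked numerically without plotting. The sketch below uses a noise-free version of the same two groups (same slopes, intercepts, and x-ranges as above, but evenly spaced x values instead of random ones) and fits a line to each group and to the pooled data.

```python
import numpy as np

# Noise-free version of the two groups, so the fitted slopes are easy to verify
xa = np.linspace(1, 5, 50)
ya = -0.5 * xa + 5
xb = np.linspace(4, 8, 50)
yb = -0.5 * xb + 8

slope_a = np.polyfit(xa, ya, 1)[0]
slope_b = np.polyfit(xb, yb, 1)[0]
slope_all = np.polyfit(np.concatenate([xa, xb]),
                       np.concatenate([ya, yb]), 1)[0]

print(f'group A slope: {slope_a:+.2f}')
print(f'group B slope: {slope_b:+.2f}')
print(f'pooled slope:  {slope_all:+.2f}')
```

Each group has slope -0.5, yet the pooled fit is positive because group B sits both further right and higher up than group A. The difference between the group means outweighs the within-group trend.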

---

## Part 5 — Summary & Lessons Learned

### Key Takeaways

1. **Always use the object-oriented interface** (`fig, ax = plt.subplots()`). It gives full control over every element and makes multi-panel figures straightforward.
2. **Label everything.** Every plot needs axis labels, a title, and a legend (if multiple series). Use `tight_layout()` to prevent clipping. A plot without labels is useless.
3. **Choose colormaps deliberately.** Sequential (`viridis`) for ordered data, diverging (`coolwarm`) for centered data, qualitative for categories. Never use `jet`.
4. **Match plot type to data type.** Scatter for two continuous variables, histograms for distributions, bar charts for comparisons, heatmaps for matrices.
5. **Beware of misleading visualizations.** Truncated axes, wrong bin counts, and aggregation can hide or invent patterns. Always consider subgroups (Simpson's paradox) and use appropriate scales.

### What's Next

→ **01-05 (Data Loading with PyTorch)** teaches the Dataset/DataLoader pattern for efficiently feeding data to neural networks, the bridge between Pandas preprocessing and model training.

### Going Further

- [Matplotlib Cheat Sheets](https://matplotlib.org/cheatsheets/): official quick reference cards
- [Ten Simple Rules for Better Figures](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003833): publication-quality visualization principles
- [Fundamentals of Data Visualization (Claus Wilke)](https://clauswilke.com/dataviz/): free online book on visualization theory