# Autoencoders: Build, Train, Explore

**Module 6.1, Lesson 2** | CourseAI

In this notebook you will:

1. **Build a convolutional autoencoder**: encoder (the CNN you already know) + decoder (ConvTranspose2d) with a 32-dimensional bottleneck
2. **Train it on Fashion-MNIST** and visualize reconstructions to see what the bottleneck preserves and what it loses
3. **Experiment with bottleneck size**: compare 2D, 8D, and 32D latent spaces and observe the quality tradeoff
4. **Visualize the 2D latent space**: color-code encoded images by clothing class and observe the clustering
5. **Build a denoising autoencoder**: add noise to inputs and train the network to reconstruct clean images

**For each exercise, PREDICT the output before running the cell.**

---

## Setup

Run this cell to import libraries, select the device, and download Fashion-MNIST.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np

# Reproducible results
torch.manual_seed(42)
np.random.seed(42)

# Use GPU if available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')

# Nice plots
plt.style.use('dark_background')
plt.rcParams['figure.figsize'] = [10, 4]

# Download Fashion-MNIST
transform = transforms.Compose([
    transforms.ToTensor(),  # scales pixels to [0, 1]
])

train_dataset = torchvision.datasets.FashionMNIST(
    root='./data', train=True, download=True, transform=transform
)
test_dataset = torchvision.datasets.FashionMNIST(
    root='./data', train=False, download=True, transform=transform
)

train_loader = torch.utils.data.DataLoader(
    train_dataset, batch_size=128, shuffle=True
)
test_loader = torch.utils.data.DataLoader(
    test_dataset, batch_size=128, shuffle=False
)

# Class names for visualization
CLASS_NAMES = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'
]

print(f'Training samples: {len(train_dataset)}')
print(f'Test samples:     {len(test_dataset)}')
print(f'Image shape:      {train_dataset[0][0].shape}')

---

## Shared Helpers

Utility functions used across multiple exercises.

In [None]:
def show_reconstructions(model, test_loader, n=8, title='Reconstructions'):
    """Show original images (top row) and reconstructions (bottom row)."""
    model.eval()
    images, _ = next(iter(test_loader))
    images = images[:n].to(device)

    with torch.no_grad():
        recon = model(images)

    images_np = images.cpu().numpy()
    recon_np = recon.cpu().numpy()

    fig, axes = plt.subplots(2, n, figsize=(n * 1.5, 3))
    for i in range(n):
        axes[0, i].imshow(images_np[i].squeeze(), cmap='gray')
        axes[0, i].axis('off')
        if i == 0:
            axes[0, i].set_title('Original', fontsize=10)

        axes[1, i].imshow(recon_np[i].squeeze(), cmap='gray')
        axes[1, i].axis('off')
        if i == 0:
            axes[1, i].set_title('Reconstructed', fontsize=10)

    fig.suptitle(title, fontsize=13)
    plt.tight_layout()
    plt.show()


def train_autoencoder(model, train_loader, num_epochs=10, lr=1e-3):
    """Train an autoencoder with MSE reconstruction loss. Returns loss history."""
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)
    history = []

    for epoch in range(num_epochs):
        model.train()
        running_loss = 0.0
        total = 0

        for images, _ in train_loader:  # labels ignored!
            images = images.to(device)

            recon = model(images)
            loss = criterion(recon, images)  # target IS the input

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            running_loss += loss.item() * images.size(0)
            total += images.size(0)

        epoch_loss = running_loss / total
        history.append(epoch_loss)
        print(f'  Epoch {epoch+1:2d}/{num_epochs}  Loss: {epoch_loss:.6f}')

    return history

print('Helpers loaded.')

---

## Exercise 1: Build a Simple Autoencoder [Guided]

The autoencoder has an **hourglass** shape: the encoder compresses the input through a bottleneck, and the decoder reconstructs it.

- **Encoder**: Conv2d layers (the CNN you already know) that shrink spatial dimensions and end at a small latent vector
- **Bottleneck**: A 32-dimensional vector, the latent code
- **Decoder**: Linear + Unflatten + ConvTranspose2d layers that expand back to 28x28

The code below is complete. Read through it carefully before running.

**Before running, predict:**
- How many parameters does this model have? (Hint: the encoder is like a small CNN ending at 32 outputs, and the decoder mirrors it.)
- What is the compression ratio? (784 input pixels down to 32 latent numbers)
- Why does the decoder end with `Sigmoid` instead of `ReLU`?

In [None]:
class Autoencoder(nn.Module):
    def __init__(self, bottleneck_size=32):
        super().__init__()

        # Encoder: same CNN pattern you already know
        # Spatial shrinks, channels grow, then flatten to the bottleneck
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 1x28x28 -> 16x14x14
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 16x14x14 -> 32x7x7
            nn.ReLU(),
            nn.Flatten(),                                # 32*7*7 = 1568
            nn.Linear(32 * 7 * 7, bottleneck_size),      # 1568 -> bottleneck
        )

        # Decoder: reverse the encoder
        # Linear -> unflatten back to spatial -> ConvTranspose2d to upsample
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck_size, 32 * 7 * 7),      # bottleneck -> 1568
            nn.Unflatten(1, (32, 7, 7)),                  # reshape to 32x7x7
            nn.ConvTranspose2d(32, 16, 3, stride=2,       # 32x7x7 -> 16x14x14
                               padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2,        # 16x14x14 -> 1x28x28
                               padding=1, output_padding=1),
            nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, x):
        latent = self.encoder(x)     # compress
        recon = self.decoder(latent)  # reconstruct
        return recon


# Create the model and examine it
model = Autoencoder(bottleneck_size=32).to(device)

# Dimension check
test_input = torch.randn(1, 1, 28, 28).to(device)
test_output = model(test_input)
print(f'Input shape:  {test_input.shape}')    # [1, 1, 28, 28]
print(f'Output shape: {test_output.shape}')   # [1, 1, 28, 28], same as input!
print()

# Count parameters
total_params = sum(p.numel() for p in model.parameters())
encoder_params = sum(p.numel() for p in model.encoder.parameters())
decoder_params = sum(p.numel() for p in model.decoder.parameters())
print(f'Encoder params: {encoder_params:,}')
print(f'Decoder params: {decoder_params:,}')
print(f'Total params:   {total_params:,}')
print()

# Compression ratio
input_size = 28 * 28  # 784 pixels
bottleneck_size = 32
print(f'Compression: {input_size} pixels -> {bottleneck_size} latent numbers')
print(f'Ratio: {input_size / bottleneck_size:.1f}x compression')

**What happened:**

The model takes a 1x28x28 image and returns a 1x28x28 reconstruction. The hourglass shape is visible in the parameter counts: the encoder and decoder are roughly symmetric.
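
If you want to check the shapes and parameter counts by hand, the sketch below redoes the arithmetic in plain Python. It is not part of the notebook: the helper names `conv_out` and `conv_transpose_out` are made up here, and the formulas are the standard output-size rules for `Conv2d` and `ConvTranspose2d` with dilation 1.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a Conv2d along one spatial dimension (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding, output_padding):
    """Output size of a ConvTranspose2d along one spatial dimension (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# Encoder spatial sizes: 28 -> 14 -> 7
h1 = conv_out(28, kernel=3, stride=2, padding=1)
h2 = conv_out(h1, kernel=3, stride=2, padding=1)

# Decoder spatial sizes: 7 -> 14 -> 28
u1 = conv_transpose_out(h2, kernel=3, stride=2, padding=1, output_padding=1)
u2 = conv_transpose_out(u1, kernel=3, stride=2, padding=1, output_padding=1)
print('Spatial sizes:', h1, h2, u1, u2)

bottleneck = 32
flat = 32 * h2 * h2  # channels * height * width after the second conv

# Each layer contributes weights + biases
enc_by_hand = (
    (1 * 16 * 3 * 3 + 16)             # Conv2d(1, 16, 3)
    + (16 * 32 * 3 * 3 + 32)          # Conv2d(16, 32, 3)
    + (flat * bottleneck + bottleneck)  # Linear(1568, 32)
)
dec_by_hand = (
    (bottleneck * flat + flat)        # Linear(32, 1568)
    + (32 * 16 * 3 * 3 + 16)          # ConvTranspose2d(32, 16, 3)
    + (16 * 1 * 3 * 3 + 1)            # ConvTranspose2d(16, 1, 3)
)
print(f'Encoder: {enc_by_hand:,}  Decoder: {dec_by_hand:,}  Total: {enc_by_hand + dec_by_hand:,}')
```

The decoder comes out slightly larger than the encoder (56,513 vs 55,008), mainly because its Linear layer carries 1,568 biases where the encoder's carries 32.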

The compression ratio is 24.5x: 784 pixels squeezed into 32 numbers. The network must decide what to keep. The `Sigmoid` at the end ensures output pixels are in [0, 1], matching the input range from `ToTensor()`.
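
To see why `Sigmoid` fits the [0, 1] pixel range and `ReLU` does not, here is a small plain-Python comparison. The pre-activation values are made up for illustration; `sigmoid` and `relu` are hand-written stand-ins for the PyTorch layers.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    return max(0.0, x)


# Made-up values a final layer might produce before the activation
pre_activations = [-10.0, -1.0, 0.0, 1.0, 10.0]

sig = [sigmoid(v) for v in pre_activations]
rel = [relu(v) for v in pre_activations]

print('sigmoid:', [round(s, 4) for s in sig])  # always strictly between 0 and 1
print('relu:   ', rel)                         # unbounded above
```

With `ReLU` as the last layer, nothing stops an output pixel from exceeding 1, which is outside the range of the targets.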

---

## Exercise 2: Train and Visualize Reconstructions [Guided]

Now we train the autoencoder. The training loop is identical to every model you have trained, with one key difference: **the target IS the input**. Labels are ignored.

```python
loss = criterion(recon, images)  # not criterion(output, labels)!
```

The loss is MSE between input and reconstruction. It measures what the bottleneck fails to preserve.
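
As a rough illustration of what that number means, the sketch below computes MSE by hand on a made-up 4-pixel "image". The `mse` helper is hypothetical; it averages over all elements, which is what `nn.MSELoss` does with its default `reduction='mean'`.

```python
def mse(recon, target):
    """Mean squared error over all elements."""
    assert len(recon) == len(target)
    return sum((r - t) ** 2 for r, t in zip(recon, target)) / len(target)


original = [0.0, 0.5, 1.0, 0.25]  # made-up pixel values in [0, 1]
perfect = list(original)          # a reconstruction that lost nothing
blurry = [0.1, 0.4, 0.8, 0.25]    # a reconstruction that smoothed the contrast

loss_perfect = mse(perfect, original)
loss_blurry = mse(blurry, original)

print(f'Perfect reconstruction: {loss_perfect:.4f}')
print(f'Blurry reconstruction:  {loss_blurry:.4f}')
```

A loss of zero would mean every pixel survived the bottleneck. Anything above zero is information the latent code did not carry through.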

**Before running, predict:**
- Will the reconstructions be sharp or blurry? (32 numbers to represent 784 pixels...)
- Will you be able to tell which clothing category each reconstruction belongs to?
- What kind of detail do you expect to be lost first: overall shape, or fine texture?

In [None]:
print('Training autoencoder (bottleneck=32)...')
print('=' * 45)
history_32 = train_autoencoder(model, train_loader, num_epochs=15)
print('=' * 45)
print('Done!')

In [None]:
# Plot the training loss curve
plt.figure(figsize=(8, 3))
plt.plot(range(1, len(history_32) + 1), history_32, 'o-', linewidth=2)
plt.xlabel('Epoch')
plt.ylabel('MSE Loss')
plt.title('Autoencoder Training Loss (bottleneck=32)')
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

In [None]:
# Visualize reconstructions
show_reconstructions(model, test_loader, n=10, title='Autoencoder Reconstructions (bottleneck=32)')

**What happened:**

The reconstructions are recognizable but **blurry**. You can clearly tell a trouser from a sneaker from a dress: the overall shape survives the bottleneck. But fine details like texture, stitching patterns, and sharp edges are lost.

This is the bottleneck at work: 32 numbers can capture the broad structure (silhouette, category, rough proportions) but cannot encode every pixel perfectly. The network learned to keep what matters most for reconstruction and discard the rest.

Notice: the loss compares the reconstruction to the **input**, not to any label. The data is its own target. Labels are completely ignored (the `_` in `for images, _ in train_loader`).

---

## Exercise 3: Experiment with Bottleneck Size [Supported]

The bottleneck size controls the **tradeoff between compression and reconstruction quality**:
- Too small → the network loses important information, reconstructions are very blurry
- Too large → the network can copy pixels without learning meaningful features (the overcomplete trap)

Your task: train autoencoders with **2D** and **8D** bottlenecks, then compare all three (2D, 8D, 32D) side by side.

**Think first:** With only 2 latent dimensions, what can possibly survive the compression? With 8?

In [None]:
# TODO: Create and train an autoencoder with bottleneck_size=2
# 1. Create the model: Autoencoder(bottleneck_size=2).to(device)
# 2. Train it: train_autoencoder(model_2d, train_loader, num_epochs=15)

model_2d = None   # TODO: replace
history_2d = None  # TODO: replace

print()

# TODO: Create and train an autoencoder with bottleneck_size=8
# Same pattern as above

model_8d = None   # TODO: replace
history_8d = None  # TODO: replace

<details>
<summary>💡 Solution</summary>

The key insight: we use the exact same `Autoencoder` class. The only thing that changes is the `bottleneck_size` argument. The architecture adapts automatically because the Linear layers connecting to the bottleneck adjust their dimensions.

```python
# 2D bottleneck
model_2d = Autoencoder(bottleneck_size=2).to(device)
print('Training autoencoder (bottleneck=2)...')
history_2d = train_autoencoder(model_2d, train_loader, num_epochs=15)

print()

# 8D bottleneck
model_8d = Autoencoder(bottleneck_size=8).to(device)
print('Training autoencoder (bottleneck=8)...')
history_8d = train_autoencoder(model_8d, train_loader, num_epochs=15)
```

Nothing new: we are reusing the architecture from Exercise 1 with different bottleneck sizes.

</details>

In [None]:
# Compare reconstructions across all three bottleneck sizes
# Get a fixed batch of test images for fair comparison
fixed_images, _ = next(iter(test_loader))
fixed_images = fixed_images[:8].to(device)

fig, axes = plt.subplots(4, 8, figsize=(12, 6))

# Row 0: originals
for i in range(8):
    axes[0, i].imshow(fixed_images[i].cpu().squeeze(), cmap='gray')
    axes[0, i].axis('off')
# Row labels use set_title: axis('off') hides ylabels, but titles stay visible
axes[0, 0].set_title('Original', fontsize=10, loc='left')

# Row 1: bottleneck=2
with torch.no_grad():
    recon_2d = model_2d(fixed_images).cpu().numpy()
for i in range(8):
    axes[1, i].imshow(recon_2d[i].squeeze(), cmap='gray')
    axes[1, i].axis('off')
axes[1, 0].set_title('2D', fontsize=10, loc='left')  # ylabel would be hidden by axis('off')

# Row 2: bottleneck=8
with torch.no_grad():
    recon_8d = model_8d(fixed_images).cpu().numpy()
for i in range(8):
    axes[2, i].imshow(recon_8d[i].squeeze(), cmap='gray')
    axes[2, i].axis('off')
axes[2, 0].set_title('8D', fontsize=10, loc='left')  # ylabel would be hidden by axis('off')

# Row 3: bottleneck=32
with torch.no_grad():
    recon_32d = model(fixed_images).cpu().numpy()
for i in range(8):
    axes[3, i].imshow(recon_32d[i].squeeze(), cmap='gray')
    axes[3, i].axis('off')
axes[3, 0].set_title('32D', fontsize=10, loc='left')  # ylabel would be hidden by axis('off')

fig.suptitle('Reconstruction Quality vs Bottleneck Size', fontsize=13)
plt.tight_layout()
plt.show()

print('\nFinal MSE loss by bottleneck size:')
print(f'  2D:  {history_2d[-1]:.6f}')
print(f'  8D:  {history_8d[-1]:.6f}')
print(f'  32D: {history_32[-1]:.6f}')

**What happened:**

With **2 dimensions**: only the crudest shape survives. You might tell a trouser from a bag, but details are completely gone. Two numbers can only encode the broadest features.

With **8 dimensions**: recognizable category and rough shape, but still quite blurry. The network has more room to encode distinguishing features.

With **32 dimensions**: the best reconstruction of the three. Outlines are clearer and proportions are more accurate, but it is still imperfect.

This is the bottleneck tradeoff: fewer dimensions force the network to keep only what truly matters (more compression, worse reconstruction). More dimensions allow better reconstruction but less meaningful compression. And if the bottleneck is too large (>= 784), the network can just copy the input and learn nothing. That is the **overcomplete trap**.
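
The overcomplete trap can be seen in a toy example with no learning at all. The sketch below uses a made-up 6-pixel input and two hypothetical functions, `encode_copy` and `decode_copy`, that simply shuffle values through a "latent" list. It is a caricature of the idea, not a model of what a trained network does.

```python
def encode_copy(x, latent_size):
    """Toy 'encoder': keep the first latent_size values, zero-pad if there is room."""
    kept = x[:latent_size]
    return kept + [0.0] * (latent_size - len(kept))


def decode_copy(z, input_size):
    """Toy 'decoder': read values straight back, fill anything missing with 0."""
    out = z[:input_size]
    return out + [0.0] * (input_size - len(out))


def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


x = [0.2, 0.9, 0.4, 0.7, 0.1, 0.6]  # made-up 6-pixel "image"

errors = {}
for latent_size in (2, 6, 8):
    z = encode_copy(x, latent_size)
    recon = decode_copy(z, len(x))
    errors[latent_size] = mse(recon, x)
    print(f'latent={latent_size}  MSE={errors[latent_size]:.4f}')
```

Once the latent is at least as large as the input, plain copying reaches zero loss, so the reconstruction objective no longer rewards learning any structure. Below that size, copying is not enough and something has to be summarized.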

---

## Exercise 4: Visualize the 2D Latent Space [Supported]

The 2D autoencoder has a special property: we can **plot the latent space directly** on a scatter plot. Each image encodes to a point (z1, z2) in 2D space.

Your task: encode all test images through the 2D encoder and plot them, color-coded by clothing class. This reveals whether the autoencoder organizes similar items near each other.

**Think first:** Do you expect clear clusters by class? Remember: the autoencoder was never told about labels. It only learned to reconstruct.

In [None]:
# Encode all test images into the 2D latent space
model_2d.eval()
all_latents = []
all_labels = []

with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        # TODO: Use model_2d.encoder (not the full model) to get latent codes
        # Hint: latent = model_2d.encoder(images)
        latent = None  # TODO: replace

        all_latents.append(latent.cpu())
        all_labels.append(labels)

all_latents = torch.cat(all_latents).numpy()  # shape: [10000, 2]
all_labels = torch.cat(all_labels).numpy()     # shape: [10000]

print(f'Latent vectors shape: {all_latents.shape}')
print(f'Labels shape: {all_labels.shape}')

<details>
<summary>💡 Solution</summary>

The key insight: we use `model_2d.encoder` (just the encoder half), not `model_2d` (the full autoencoder). The full model returns the reconstruction; the encoder returns the latent code.

```python
latent = model_2d.encoder(images)
```

This gives us the 2D latent vector for each image: the compressed representation that the bottleneck learned.

</details>

In [None]:
# TODO: Create a scatter plot of the 2D latent space, color-coded by class
#
# Steps:
# 1. Create a figure: plt.figure(figsize=(10, 8))
# 2. For each class (0-9), plot its points with a different color:
#    mask = all_labels == class_idx
#    plt.scatter(all_latents[mask, 0], all_latents[mask, 1],
#                s=2, alpha=0.5, label=CLASS_NAMES[class_idx])
# 3. Add legend, title, axis labels
#
# Hint: loop over range(10) for the 10 classes

pass  # TODO: replace with your plotting code

print('\nNotice: the autoencoder was NEVER told about class labels.')
print('Any clustering emerged purely from learning to reconstruct.')

<details>
<summary>💡 Solution</summary>

The key insight: we loop over each class and plot its points separately so each class gets its own color and legend entry. The scatter plot reveals structure that the autoencoder discovered on its own.

```python
plt.figure(figsize=(10, 8))

for class_idx in range(10):
    mask = all_labels == class_idx
    plt.scatter(
        all_latents[mask, 0],
        all_latents[mask, 1],
        s=2, alpha=0.5,
        label=CLASS_NAMES[class_idx]
    )

plt.legend(markerscale=5, fontsize=9)
plt.xlabel('Latent Dimension 1')
plt.ylabel('Latent Dimension 2')
plt.title('2D Latent Space (colored by class)')
plt.grid(alpha=0.2)
plt.tight_layout()
plt.show()

print('\nNotice: the autoencoder was NEVER told about class labels.')
print('Any clustering emerged purely from learning to reconstruct.')
```

Common mistake: forgetting to set a small `s` (point size) and `alpha` (transparency). With 10,000 points, large opaque dots create an unreadable blob.

</details>

**What happened:**

You should see partial clustering: items of the same class tend to group together, but the boundaries are messy and overlapping. Trousers are probably in one region, bags in another, but T-shirts and shirts overlap heavily (because they look similar).

The remarkable thing: **the autoencoder was never told about labels**. It learned to organize similar-looking items near each other purely from the pressure to reconstruct. To compress a trouser into 2 numbers and get a trouser-shaped reconstruction, the network must encode "this is trouser-shaped" somewhere in those 2 numbers.

But notice the **gaps**. There are regions of the 2D space where no training image was encoded. If you fed a random 2D point from one of these gaps to the decoder, you would get garbage, not a recognizable image. This is exactly why the autoencoder is **not a generative model**. The latent space has structure where data was encoded, but uncharted gaps between the clusters.
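
One rough way to put a number on "gap" is the distance from a latent point to the nearest encoded image. The sketch below is an assumption-laden illustration: the two clusters are synthetic stand-ins for real latent codes, and `nearest_distance` is a hypothetical helper. A large distance only says the decoder has never been trained near that point; it does not by itself prove the decoded image is garbage.

```python
import math
import random

random.seed(0)

# Synthetic stand-in for encoded images: two made-up clusters in a 2D latent space
cluster_a = [(random.gauss(-3.0, 0.3), random.gauss(0.0, 0.3)) for _ in range(200)]
cluster_b = [(random.gauss(3.0, 0.3), random.gauss(0.0, 0.3)) for _ in range(200)]
latents = cluster_a + cluster_b


def nearest_distance(point, codes):
    """Euclidean distance from point to the closest latent code."""
    return min(math.dist(point, c) for c in codes)


inside_point = (-3.0, 0.0)  # middle of a cluster
gap_point = (0.0, 0.0)      # empty region between the clusters

d_inside = nearest_distance(inside_point, latents)
d_gap = nearest_distance(gap_point, latents)

print(f'Nearest code to inside point: {d_inside:.3f}')
print(f'Nearest code to gap point:    {d_gap:.3f}')
```

In the notebook you could try the same idea on the real codes by passing rows of `all_latents` as `codes`, then decode a few near and far points with `model_2d.decoder` and compare the images.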

---

## Exercise 5: Build a Denoising Autoencoder [Independent]

A **denoising autoencoder** receives a noisy version of the input and must reconstruct the **clean** original. This forces the network to learn even more robust features: it cannot just memorize or copy, because the noise is different every time.

**Your task:**

1. Write a function `add_noise(images, noise_factor=0.3)` that adds Gaussian noise to images and clamps the result to [0, 1]
2. Create a denoising model with bottleneck_size=32 (the architecture is the same as `Autoencoder`, so you can reuse that class)
3. Write a training loop where:
   - The **input** to the model is `noisy_images = add_noise(images)`
   - The **target** for the loss is the **clean** `images`
   - `loss = criterion(model(noisy_images), images)`
4. Train for 15 epochs
5. Visualize: show noisy input (top row), denoised reconstruction (middle row), clean original (bottom row)

**Hint:** The architecture is identical to `Autoencoder`. The only change is in the training loop: you feed noisy images in but compute loss against clean images.

In [None]:
# Your code here: build and train a denoising autoencoder



<details>
<summary>💡 Solution</summary>

The key insight: the architecture does not change at all. The only difference from a regular autoencoder is **what you feed in** and **what you compare against**. The model receives noisy images but the loss is computed against the clean originals. This forces the bottleneck to learn features that are robust to noise: it must figure out what the "real" image looks like underneath the noise.

```python
def add_noise(images, noise_factor=0.3):
    """Add Gaussian noise to images and clamp to [0, 1]."""
    noise = torch.randn_like(images) * noise_factor
    noisy = images + noise
    return torch.clamp(noisy, 0.0, 1.0)


# Same architecture: the denoising behavior comes from the training loop
denoising_model = Autoencoder(bottleneck_size=32).to(device)

criterion = nn.MSELoss()
optimizer = optim.Adam(denoising_model.parameters(), lr=1e-3)

print('Training denoising autoencoder...')
print('=' * 45)

for epoch in range(15):
    denoising_model.train()
    running_loss = 0.0
    total = 0

    for images, _ in train_loader:
        images = images.to(device)
        noisy_images = add_noise(images)  # corrupt the input

        recon = denoising_model(noisy_images)  # feed noisy
        loss = criterion(recon, images)         # compare to CLEAN

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)
        total += images.size(0)

    epoch_loss = running_loss / total
    print(f'  Epoch {epoch+1:2d}/15  Loss: {epoch_loss:.6f}')

print('=' * 45)
print('Done!')

# Visualize: noisy input → denoised → clean original
denoising_model.eval()
test_images, _ = next(iter(test_loader))
test_images = test_images[:8].to(device)
noisy_test = add_noise(test_images, noise_factor=0.3)

with torch.no_grad():
    denoised = denoising_model(noisy_test)

fig, axes = plt.subplots(3, 8, figsize=(12, 5))
for i in range(8):
    axes[0, i].imshow(noisy_test[i].cpu().squeeze(), cmap='gray')
    axes[0, i].axis('off')

    axes[1, i].imshow(denoised[i].cpu().squeeze(), cmap='gray')
    axes[1, i].axis('off')

    axes[2, i].imshow(test_images[i].cpu().squeeze(), cmap='gray')
    axes[2, i].axis('off')

# Row labels use set_title: axis('off') hides ylabels, but titles stay visible
axes[0, 0].set_title('Noisy', fontsize=10, loc='left')
axes[1, 0].set_title('Denoised', fontsize=10, loc='left')
axes[2, 0].set_title('Original', fontsize=10, loc='left')

fig.suptitle('Denoising Autoencoder Results', fontsize=13)
plt.tight_layout()
plt.show()
```

Common mistake: computing loss against the noisy images instead of the clean originals. That would just train a regular autoencoder on noisy data. The denoising objective specifically requires `criterion(recon, clean_images)`.
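
A tiny numeric example can make the difference between the two targets concrete. The sketch below is plain Python with made-up pixel values; `clamp01` and `mse` are hypothetical helpers, and the "model" is just a function that copies its input, which is the behavior we want the loss to discourage.

```python
import random

random.seed(0)


def clamp01(v):
    return min(1.0, max(0.0, v))


def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


clean = [0.0, 0.2, 0.5, 0.8, 1.0, 0.5]  # made-up clean pixels
noisy = [clamp01(p + random.gauss(0.0, 0.3)) for p in clean]

# A "model" that simply copies whatever it is given, noise included
copy_output = list(noisy)

loss_vs_noisy = mse(copy_output, noisy)  # the mistaken target
loss_vs_clean = mse(copy_output, clean)  # the denoising target

print(f'Loss against noisy target: {loss_vs_noisy:.4f}')
print(f'Loss against clean target: {loss_vs_clean:.4f}')
```

Against the noisy target, copying the noise scores a perfect zero, so nothing in the loss asks the network to remove it. Against the clean target, every bit of noise left in the output is penalized.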

</details>

---

## Key Takeaways

1. **The autoencoder is an hourglass: compress through a bottleneck, then reconstruct.** The encoder is a CNN you already know. The decoder reverses it with ConvTranspose2d. The bottleneck forces the network to learn what matters about the input.

2. **Reconstruction loss = MSE between input and output. The target IS the input.** No labels needed. The data supervises itself. The loss measures what the bottleneck fails to preserve.

3. **Bottleneck size controls the compression-quality tradeoff.** Smaller bottleneck = more compression, blurrier reconstruction, but the latent code captures only what truly matters. Too large and the network just copies pixels (the overcomplete trap).

4. **The latent space has structure even without labels.** Similar items cluster together because the network must encode similar-looking images to similar latent codes to reconstruct them well.

5. **The autoencoder is NOT a generative model.** The latent space has gaps, and random points produce garbage. Only points near real encoded images are meaningful. Making the latent space smooth enough to sample from is what the Variational Autoencoder does (next lesson).