# 6. Analysis of LeNet-5 Learned Features

We've seen that our LeNet model performs very well, but *what* has it learned? In this notebook, we'll peek inside the model by extracting embeddings from its intermediate layers. Unlike CLIP's general-purpose embeddings, these are feature representations specifically optimized for MNIST digit classification.

We will analyze these embeddings on the *training data* to understand how the model organizes the concepts it has learned and to identify interesting samples within our training set.

**Key concepts covered:**
*   Extracting embeddings from intermediate PyTorch model layers
*   Storing custom model embeddings in FiftyOne
*   Visualizing custom embeddings with PCA and UMAP
*   Analyzing uniqueness and representativeness of training samples

![](https://github.com/andandandand/fiftyone/blob/develop/docs/source/getting_started_experiences/Classification/assets/pca_lenet_embeddings.webp?raw=1)

## Setup

First, we'll set up our environment, reloading the necessary datasets, model definitions, and helper functions.

In [1]:
import os
from PIL import Image
import numpy as np
from tqdm import tqdm
from pathlib import Path

import torch
import torch.nn as nn
import torch.nn.functional as Fun
import torchvision.transforms.v2 as transforms
from torch.utils.data import Dataset

import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.brain as fob
from fiftyone import ViewField as F

# Redefine the model architecture
class ModernLeNet5(nn.Module):
    def __init__(self, num_classes=10):
        super(ModernLeNet5, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, kernel_size=5)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        self.conv3 = nn.Conv2d(16, 120, kernel_size=4)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(120, 84)
        self.fc2 = nn.Linear(84, num_classes)
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        x = self.pool(Fun.relu(self.conv1(x)))
        x = self.pool(Fun.relu(self.conv2(x)))
        x = Fun.relu(self.conv3(x))
        x = x.view(x.size(0), -1)
        x = Fun.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        return x

# Redefine the custom dataset class
class CustomTorchImageDataset(torch.utils.data.Dataset):
    def __init__(self, fiftyone_dataset, image_transforms=None, label_map=None, gt_field="ground_truth"):
        self.fiftyone_dataset = fiftyone_dataset
        self.image_paths = self.fiftyone_dataset.values("filepath")
        self.str_labels = self.fiftyone_dataset.values(f"{gt_field}.label")
        self.image_transforms = image_transforms
        self.label_map = label_map if label_map is not None else {str(i): i for i in range(10)}

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image_path = self.image_paths[idx]
        image = Image.open(image_path).convert('L')
        if self.image_transforms:
            image = self.image_transforms(image)
        label_str = self.str_labels[idx]
        label_idx = self.label_map.get(label_str, -1)
        return image, torch.tensor(label_idx, dtype=torch.long)
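
To see why `fc1` expects 120 input features, it can help to trace the spatial size of a digit through the layers. The sketch below assumes 28×28 MNIST inputs and uses a hypothetical helper, `conv_out`, that is not part of the notebook.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv or pooling layer applied to a square input."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                                   # assumed MNIST input: 1 x 28 x 28
size = conv_out(size, kernel=5)             # conv1 -> 24
size = conv_out(size, kernel=2, stride=2)   # pool  -> 12
size = conv_out(size, kernel=5)             # conv2 -> 8
size = conv_out(size, kernel=2, stride=2)   # pool  -> 4
size = conv_out(size, kernel=4)             # conv3 -> 1

flattened = 120 * size * size               # conv3 has 120 output channels
print(flattened)
```

Under that assumption, `conv3` reduces each image to a 120×1×1 tensor, which is what `x.view(x.size(0), -1)` flattens before `fc1`.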

## Creating Embeddings from the LeNet-5 Model

We'll use a **PyTorch hook** to capture the output of an intermediate layer during a forward pass. A forward hook is a function registered on a module that runs every time that module produces an output. We'll grab the 84-dimensional output of the `fc1` layer, which serves as a rich feature embedding just before the final classification. Because `forward` applies ReLU and dropout functionally *after* calling `fc1`, the hook sees the raw linear output of the layer, before the activation.
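
Before applying this to LeNet, here is a minimal sketch of the hook mechanism on a toy network. The `toy` model, the `captured` dictionary, and the `save_output` function are illustrative names, not part of the notebook.

```python
import torch
import torch.nn as nn

# Toy stand-in for the last layers of a classifier (not the notebook's model)
toy = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

captured = {}

def save_output(module, inputs, output):
    # Runs right after `module.forward`; `output` is that layer's own output
    captured["features"] = output.detach().cpu()

# Register the hook on the first linear layer and keep the handle
handle = toy[0].register_forward_hook(save_output)

with torch.inference_mode():
    logits = toy(torch.randn(3, 8))

handle.remove()  # remove the hook once the features have been collected

print(captured["features"].shape)  # torch.Size([3, 4])
print(logits.shape)                # torch.Size([3, 2])
```

The hook fires as a side effect of the ordinary forward pass, so the model's own output is unchanged. Removing the handle afterwards keeps later forward passes from overwriting the captured tensor.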

In [2]:
def extract_lenet_embeddings(model, dataloader, device, layer_name='fc1'):
    """Extracts embeddings from a specified layer of the LeNet model using PyTorch hooks."""
    embeddings_dict = {}

    def hook_fn(module, input, output):
        if len(output.shape) > 2:
            embeddings_dict['embeddings'] = output.view(output.size(0), -1).cpu().detach()
        else:
            embeddings_dict['embeddings'] = output.cpu().detach()

    target_layer = getattr(model, layer_name)
    hook_handle = target_layer.register_forward_hook(hook_fn)

    model.eval()
    all_embeddings = []
    with torch.inference_mode():
        for images, _ in tqdm(dataloader, desc=f"Extracting {layer_name} embeddings"):
            _ = model(images.to(device))
            all_embeddings.append(embeddings_dict['embeddings'].numpy())
    
    hook_handle.remove()
    return np.concatenate(all_embeddings, axis=0)

Now, we'll apply this function to our training dataset. We need a `DataLoader` for the training set **without shuffling** to ensure the extracted embeddings correctly map back to the samples in our FiftyOne dataset.
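
The reason shuffling matters can be seen on toy data. In this sketch, each fake "embedding" encodes the index of the sample it came from, and a reversed visiting order stands in for a shuffled loader; all names are illustrative.

```python
import numpy as np

sample_ids = np.arange(10)        # stand-in for the dataset's sample order
features = sample_ids * 10.0      # fake "embedding" that encodes its sample id

def batched(order, batch_size=4):
    """Yield batches of features, visiting samples in the given order."""
    for start in range(0, len(order), batch_size):
        yield features[order[start:start + batch_size]]

in_order = np.concatenate(list(batched(sample_ids)))
out_of_order = np.concatenate(list(batched(sample_ids[::-1])))

print(np.array_equal(in_order, features))      # True: row i belongs to sample i
print(np.array_equal(out_of_order, features))  # False: rows no longer line up
```

Values are later attached to samples purely by position, so any reordering inside the loader would silently pair embeddings with the wrong images.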

In [3]:
# Load datasets and model
device = "cuda" if torch.cuda.is_available() else "cpu"
train_dataset = fo.load_dataset("mnist-training-set")
model_save_path = Path(os.getcwd()) / 'best_lenet.pth'
loaded_model = ModernLeNet5().to(device)
loaded_model.load_state_dict(torch.load(model_save_path, map_location=device))
loaded_model.eval()

# Recreate transforms
mean_intensity, std_intensity = 0.1307, 0.3081 # Pre-computed
image_transforms = transforms.Compose([
    transforms.ToImage(),
    transforms.ToDtype(torch.float32, scale=True),
    transforms.Normalize((mean_intensity,), (std_intensity,))
])
dataset_classes = sorted(train_dataset.distinct("ground_truth.label"))
label_map = {label: i for i, label in enumerate(dataset_classes)}

# Create a non-shuffled DataLoader for the training set
torch_train_set = CustomTorchImageDataset(train_dataset, image_transforms=image_transforms, label_map=label_map)
train_inference_loader = torch.utils.data.DataLoader(torch_train_set, batch_size=64, shuffle=False, num_workers=os.cpu_count())

# Extract embeddings
lenet_embeddings = extract_lenet_embeddings(loaded_model, train_inference_loader, device, 'fc1')

# Store embeddings in the FiftyOne dataset
train_dataset.set_values("lenet_embeddings", lenet_embeddings)
train_dataset.save()
print("LeNet embeddings stored in the training dataset.")

Extracting fc1 embeddings: 100%|██████████| 797/797 [00:09<00:00, 84.14it/s] 


LeNet embeddings stored in the training dataset.


## Visualizing the Learned Embeddings

Just as we did with CLIP, we can compute and visualize 2D projections of our new LeNet embeddings. This will show us how our custom-trained model has learned to separate the different digit classes in its own feature space.

In [8]:
fob.compute_visualization(
    train_dataset,
    embeddings="lenet_embeddings",
    num_dims=2,
    method="umap",
    brain_key="umap_lenet_embeddings"
)

fob.compute_visualization(
    train_dataset,
    embeddings="lenet_embeddings",
    num_dims=2,
    method="pca",
    brain_key="pca_lenet_embeddings"
)

session = fo.launch_app(train_dataset, auto=False)
print("UMAP and PCA visualizations for LeNet embeddings are ready in the App.")
print(session.url)

Generating visualization...
UMAP( verbose=True)
Sun Jul  6 16:11:59 2025 Construct fuzzy simplicial set
Sun Jul  6 16:11:59 2025 Finding Nearest Neighbors
Sun Jul  6 16:11:59 2025 Building RP forest with 16 trees
Sun Jul  6 16:11:59 2025 NN descent for 16 iterations
	 1  /  16
	 2  /  16
	 3  /  16
	Stopping threshold met -- exiting after 3 iterations
Sun Jul  6 16:12:10 2025 Finished Nearest Neighbor Search
Sun Jul  6 16:12:11 2025 Construct embedding


	completed  0  /  200 epochs
	completed  20  /  200 epochs
	completed  40  /  200 epochs
	completed  60  /  200 epochs
	completed  80  /  200 epochs
	completed  100  /  200 epochs
	completed  120  /  200 epochs
	completed  140  /  200 epochs
	completed  160  /  200 epochs
	completed  180  /  200 epochs
Sun Jul  6 16:14:58 2025 Finished embedding
Generating visualization...
Session launched. Run `session.show()` to open the App in a cell output.
UMAP and PCA visualizations for LeNet embeddings are ready in the App.
http://0.0.0.0:5151/


Open the **Embeddings** panel in the App and select `umap_lenet_embeddings`. Color the points by `ground_truth.label`. You should see remarkably clean and well-separated clusters for each digit, demonstrating that our model has learned very effective features for this task.

#### UMAP 2D Projection of the LeNet Embedding Space
![](https://github.com/andandandand/practical-computer-vision/blob/main/images/umap_lenet_embeddings.png?raw=true)
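
For intuition about what a 2D PCA projection does, the sketch below projects random stand-in vectors onto their top two principal directions with NumPy. It is an illustration of the technique only, not a description of how `fob.compute_visualization` is implemented, and the random `embeddings` array is an assumption standing in for the real `fc1` features.

```python
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 84))   # random stand-in for 84-d fc1 embeddings

# PCA: center the data, then project onto the top right-singular vectors
centered = embeddings - embeddings.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
points_2d = centered @ vt[:2].T           # rows of vt are ordered by explained variance

print(points_2d.shape)  # (200, 2)
```

PCA is a linear projection, so it preserves global directions of variance. UMAP is nonlinear and favors local neighborhood structure, which is why the two plots of the same embeddings can look quite different.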

## Uniqueness and Representativeness

Using these new embeddings, we can compute **uniqueness** and **representativeness** scores for our training samples. 

- **Uniqueness**: Scores how dissimilar a sample is from its nearest neighbors in embedding space, so high values flag outliers and edge cases.
- **Representativeness**: Identifies samples that are archetypes of their class, sitting right at the center of their clusters.

These metrics are invaluable for data cleaning and for finding the most informative samples to use for tasks like data augmentation.
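
Both scores can be pictured with simple distance-based proxies. The sketch below is only an intuition aid on toy 2D data: it is not how `fob.compute_uniqueness` or `fob.compute_representativeness` are implemented, and `uniqueness_proxy` and `centroid_dist` are names made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
cluster = rng.normal(loc=0.0, scale=0.1, size=(50, 2))  # tight toy cluster
outlier = np.array([[5.0, 5.0]])                         # one far-away point
points = np.vstack([cluster, outlier])                   # outlier has index 50

# Uniqueness proxy: distance from each point to its nearest other point
dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
np.fill_diagonal(dists, np.inf)
uniqueness_proxy = dists.min(axis=1)

# Representativeness proxy: a smaller distance to the centroid is more typical
centroid_dist = np.linalg.norm(points - points.mean(axis=0), axis=1)

print(int(uniqueness_proxy.argmax()))  # 50: the isolated point is the most unique
print(int(centroid_dist.argmin()))     # index of a point inside the cluster
```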

In [9]:
fob.compute_uniqueness(train_dataset, embeddings='lenet_embeddings')
fob.compute_representativeness(train_dataset, embeddings='lenet_embeddings')

session.refresh()
print("Uniqueness and representativeness scores computed.")

Computing uniqueness...
Uniqueness computation complete
Computing representativeness...
Computing clusters for 51000 embeddings; this may take awhile...
Representativeness computation complete
Uniqueness and representativeness scores computed.


Let's find the most unique samples in our training set. These are often the most interesting, sometimes revealing oddities or labeling errors.

In [10]:
uniqueness_quantiles = train_dataset.quantiles("uniqueness", [0.999])

most_unique_samples_view = train_dataset.match(
    F("uniqueness") > uniqueness_quantiles[-1]
).sort_by("uniqueness", reverse=True)

session.view = most_unique_samples_view
print(f"Displaying {len(most_unique_samples_view)} most unique training samples in the App: {session.url}")

Displaying 51 most unique training samples in the App: http://0.0.0.0:5151/


![](https://github.com/andandandand/fiftyone/blob/develop/docs/source/getting_started_experiences/Classification/assets/51_most_unique_samples.webp?raw=1)
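
The selection above keeps roughly the top 0.1% of samples by score. The same thresholding logic can be sketched with NumPy on random stand-in scores; the `scores` array is an assumption, not real uniqueness values, and NumPy's quantile interpolation may differ from the one FiftyOne uses.

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.random(51_000)               # random stand-in for uniqueness scores

# Threshold at the 99.9th percentile, then keep everything strictly above it
threshold = np.quantile(scores, 0.999)
selected = np.flatnonzero(scores > threshold)

# Order the selected indices from highest to lowest score
selected = selected[np.argsort(scores[selected])[::-1]]

print(len(selected))
print(scores[selected[:3]])
```

A quantile threshold adapts to the score distribution, so the size of the view scales with the dataset instead of depending on a hand-picked cutoff.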

## Exercises

1. Create views of highly unique and highly representative samples grouped by `ground_truth.label`, sorting in descending order by `uniqueness` and `representativeness` respectively.
2. Create a list of images that you would like to **remove** from the training set because of their quality. Use the `id` or `filepath` field to select them.

## Next Steps

We have now deeply analyzed our training data and our model's learned features. We have identified unique, representative, and potentially problematic samples.

The final step is to use this knowledge to improve our model. We will perform targeted data augmentation on the samples our model struggles with and then fine-tune it.

Proceed to `7_retraining_with_augmentation.ipynb`.