<a href="https://colab.research.google.com/github/andandandand/fiftyone/blob/develop/docs/source/getting_started_experiences/Classification/step_1_mnist.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1. MNIST Dataset Exploration with FiftyOne

Welcome to the first notebook in our series on image classification with FiftyOne and PyTorch!

In this step, we will load the classic MNIST dataset from the FiftyOne Dataset Zoo, explore its structure, and compute and visualize its metadata. This initial exploration is crucial for understanding the data we'll be working with in the subsequent steps.

**Key concepts covered:**
*   Loading datasets from the FiftyOne Dataset Zoo
*   Computing image metadata (file size, width, height)
*   Using FiftyOne aggregations for data statistics
*   Visualizing dataset distributions in the FiftyOne App

## Installation

First, let's install the required packages.

In [1]:
# Remove > /dev/null if you encounter errors during installation
!pip install fiftyone torch torchvision numpy albumentations > /dev/null

## Imports

In [3]:
import os

# Set environment variables for reproducibility BEFORE importing torch
os.environ['PYTHONHASHSEED'] = '51'
os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'

from PIL import Image
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as Fun
import torchvision.transforms.v2 as transforms
import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.brain as fob
from torch.utils.data import Dataset, ConcatDataset
from fiftyone import ViewField as F
import fiftyone.utils.random as four
from tqdm import tqdm
from torch.optim import Adam
from pathlib import Path
import matplotlib.pyplot as plt
import gc
import albumentations as A
import cv2
import random
from typing import Optional, Dict, Tuple, Any

## Loading the MNIST Dataset from FiftyOne's Dataset Zoo

A FiftyOne dataset wraps annotations and image data into a unified, queryable structure. Unlike traditional approaches where you manage separate files for images and labels, FiftyOne treats each sample as a rich object that holds the path to the image, its ground truth labels, its metadata, and any predictions or embeddings you add later.

Loading MNIST from the [FiftyOne Dataset Zoo](https://docs.voxel51.com/dataset_zoo/index.html) is straightforward. We'll start by loading the test split.

In [4]:
# We will load the test split from the dataset first
test_dataset = foz.load_zoo_dataset("mnist", split='test', dataset_name="mnist-test-set", persistent=True)
test_dataset

Downloading split 'test' to '/root/fiftyone/mnist/test'
100%|██████████| 9.91M/9.91M [00:00<00:00, 55.9MB/s]
100%|██████████| 28.9k/28.9k [00:00<00:00, 1.72MB/s]
100%|██████████| 1.65M/1.65M [00:00<00:00, 12.7MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 3.10MB/s]
 100% |█████████████| 10000/10000 [4.6s elapsed, 0s remaining, 2.1K samples/s]
Dataset info written to '/root/fiftyone/mnist/info.json'
Loading 'mnist' split 'test'
 100% |█████████████| 10000/10000 [7.1s elapsed, 0s remaining, 1.6K samples/s]
Dataset 'mnist-test-set' created


Name:        mnist-test-set
Media type:  image
Num samples: 10000
Persistent:  True
Tags:        []
Sample fields:
    id:               fiftyone.core.fields.ObjectIdField
    filepath:         fiftyone.core.fields.StringField
    tags:             fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:         fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    created_at:       fiftyone.core.fields.DateTimeField
    last_modified_at: fiftyone.core.fields.DateTimeField
    ground_truth:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
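
To make the "sample as a rich object" idea concrete, the sketch below reads a few fields from a single sample. `summarize_sample` is a hypothetical helper, not part of FiftyOne; it assumes only the `filepath`, `tags`, and `ground_truth.label` attributes that appear in the schema above.

```python
def summarize_sample(sample):
    """Collect a few fields from a FiftyOne-style sample into a plain dict.

    Hypothetical helper for illustration: it relies only on attributes
    listed in the dataset schema (filepath, tags, ground_truth.label).
    """
    # ground_truth may be unset on samples that have no label yet
    label = sample.ground_truth.label if sample.ground_truth is not None else None
    return {
        "filepath": sample.filepath,
        "tags": list(sample.tags),
        "label": label,
    }
```

In this notebook, one way to try it would be `summarize_sample(test_dataset.first())`, which inspects the first sample of the test split.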

Let's launch the FiftyOne App to visualize the test set.

In [5]:
session = fo.launch_app(test_dataset, auto=False)
print(session.url)

Session launched. Run `session.show()` to open the App in a cell output.



Welcome to

███████╗██╗███████╗████████╗██╗   ██╗ ██████╗ ███╗   ██╗███████╗
██╔════╝██║██╔════╝╚══██╔══╝╚██╗ ██╔╝██╔═══██╗████╗  ██║██╔════╝
█████╗  ██║█████╗     ██║    ╚████╔╝ ██║   ██║██╔██╗ ██║█████╗
██╔══╝  ██║██╔══╝     ██║     ╚██╔╝  ██║   ██║██║╚██╗██║██╔══╝
██║     ██║██║        ██║      ██║   ╚██████╔╝██║ ╚████║███████╗
╚═╝     ╚═╝╚═╝        ╚═╝      ╚═╝    ╚═════╝ ╚═╝  ╚═══╝╚══════╝ v1.7.0

If you're finding FiftyOne helpful, here's how you can get involved:

|
|  ⭐⭐⭐ Give the project a star on GitHub ⭐⭐⭐
|  https://github.com/voxel51/fiftyone
|
|  🚀🚀🚀 Join the FiftyOne Discord community 🚀🚀🚀
|  https://community.voxel51.com/
|



https://5151-m-s-2fh7c0srv2q4q-d.us-east1-0.prod.colab.dev?polling=true


With `compute_metadata()`, we can populate each sample's `metadata` field with the file size in bytes, MIME type, width, height, and number of channels of its image.

In [6]:
test_dataset.compute_metadata()

Computing metadata...
 100% |█████████████| 10000/10000 [5.5s elapsed, 0s remaining, 1.8K samples/s]
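
Once the metadata is populated, a quick sanity check is to confirm that all images share the same dimensions. The sketch below is one way to do that. `image_shape_summary` is a hypothetical helper; it assumes the object passed in provides FiftyOne's `bounds()` and `distinct()` aggregations and that `compute_metadata()` has already been run.

```python
def image_shape_summary(dataset):
    """Summarize image dimensions from previously computed metadata.

    Hypothetical helper: assumes `dataset` exposes the `bounds()` and
    `distinct()` aggregations and that its `metadata` field is populated.
    """
    return {
        # (min, max) tuples; identical endpoints mean a uniform size
        "width_range": dataset.bounds("metadata.width"),
        "height_range": dataset.bounds("metadata.height"),
        # list of the distinct channel counts found in the dataset
        "channel_counts": dataset.distinct("metadata.num_channels"),
    }
```

Calling `image_shape_summary(test_dataset)` should report equal minimum and maximum values for width and height if every image has the same size, as you would expect for MNIST's 28×28 digits.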


We can perform [aggregations](https://docs.voxel51.com/user_guide/using_aggregations.html) on the dataset to explore its properties. For example, we can find the range, mean, and standard deviation of the image file sizes in bytes.

In [7]:
print(f'File size (bytes) range: {test_dataset.bounds("metadata.size_bytes")}')
print(f'File size (bytes) mean: {test_dataset.mean("metadata.size_bytes"):.2f}')
print(f'File size (bytes) std dev: {test_dataset.std("metadata.size_bytes"):.2f}')

File size (bytes) range: (483, 1033)
File size (bytes) mean: 768.61
File size (bytes) std dev: 84.01
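
To clarify what these three aggregations report, here is a plain-Python sketch that computes the same kind of statistics on a small list of numbers. The `summarize` function and the sample values are made up for illustration. It uses the population standard deviation, on the assumption that this matches the default behavior of `std()`; check the aggregation docs if the distinction matters for your analysis.

```python
import statistics


def summarize(values):
    """Return (min, max) bounds, mean, and population std dev of `values`."""
    return {
        "bounds": (min(values), max(values)),
        "mean": statistics.fmean(values),
        # pstdev divides by N (population), not N - 1 (sample)
        "std": statistics.pstdev(values),
    }


# Made-up numbers standing in for file sizes in bytes
stats = summarize([2, 4, 4, 4, 5, 5, 7, 9])
print(stats)  # {'bounds': (2, 9), 'mean': 5.0, 'std': 2.0}
```

The difference in practice is that the dataset aggregations are evaluated by FiftyOne's database backend, so the values do not have to be loaded into Python first.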


### Visualizing Distributions

In the FiftyOne App, you can visualize the distributions of any field. This is useful for checking class balance or understanding metadata distributions.

1. Click the `+` symbol next to the `Samples` tab.
2. Select `Histograms`.
3. Choose `ground_truth.label` from the dropdown to see the class distribution.
4. Add another histogram for `metadata.size_bytes`.

You should see that the classes are roughly balanced, so no single digit dominates training or evaluation.

![Histogram of ground truth label counts in the MNIST test set](https://github.com/andandandand/practical-computer-vision/blob/main/images/ground_truh_distribution_mnist.png?raw=true)
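
The same class-balance check can also be done in code rather than in the App. The sketch below is one possible approach. `class_balance` is a hypothetical helper that assumes the object passed in provides FiftyOne's `count_values()` aggregation and that the dataset is non-empty.

```python
def class_balance(dataset, field="ground_truth.label"):
    """Return per-class fractions and the max/min class-count ratio.

    Hypothetical helper: assumes `dataset.count_values(field)` returns a
    dict mapping each label to its count, and that no count is zero.
    """
    counts = dataset.count_values(field)
    total = sum(counts.values())
    # Sort by the string form of the label so a missing (None) label cannot
    # break the ordering
    fractions = {
        label: n / total
        for label, n in sorted(counts.items(), key=lambda kv: str(kv[0]))
    }
    # A ratio close to 1.0 indicates a balanced dataset
    ratio = max(counts.values()) / min(counts.values())
    return fractions, ratio
```

For example, `fractions, ratio = class_balance(test_dataset)` gives the share of each digit and a single imbalance number that you can track if you later subsample or augment the data.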

In [8]:
session.refresh()

## Next Steps

Now that we have explored the basic properties of the MNIST dataset, we are ready for the next step: creating image embeddings with CLIP to understand the semantic relationships between images.

Proceed to `step_2_clip_embeddings.ipynb`.