# 1. MNIST Dataset Exploration with FiftyOne

Welcome to the first notebook in our series on image classification with FiftyOne and PyTorch!

In this step, we will load the classic MNIST dataset from the FiftyOne Dataset Zoo, explore its structure, and compute and visualize its metadata. This initial exploration is crucial for understanding the data we'll be working with in the subsequent steps.

**Key concepts covered:**
*   Loading datasets from the FiftyOne Dataset Zoo
*   Computing image metadata (file size, width, height)
*   Using FiftyOne aggregations for data statistics
*   Visualizing dataset distributions in the FiftyOne App

## Installation

First, let's install the required packages.

In [None]:
# Remove > /dev/null if you encounter errors during installation
!pip install fiftyone==1.5.2 torch torchvision numpy albumentations > /dev/null

### FiftyOne Plugins

We'll also install FiftyOne plugins for model evaluation and data augmentation, which we will use in later notebooks.

In [None]:
# Plugin to evaluate the performance of our classification models
!fiftyone plugins download \
    https://github.com/voxel51/fiftyone-plugins \
    --plugin-names @voxel51/evaluation

# Plugin for image augmentations
!fiftyone plugins download https://github.com/jacobmarks/fiftyone-albumentations-plugin

## Imports

In [None]:
import os

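# Set environment variables for reproducibility BEFORE importing torch.
# Note: PYTHONHASHSEED is read at interpreter startup, so setting it here only
# affects child processes, not hashing in the already-running kernel.
os.environ['PYTHONHASHSEED'] = '51'
# Needed by cuBLAS for deterministic behavior on CUDA 10.2 or later
os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'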

from PIL import Image
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as Fun
import torchvision.transforms.v2 as transforms
import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.brain as fob
from torch.utils.data import Dataset, ConcatDataset
from fiftyone import ViewField as F
import fiftyone.utils.random as four
from tqdm import tqdm
from torch.optim import Adam
from pathlib import Path
import matplotlib.pyplot as plt
import gc
import albumentations as A
import cv2
import random
from typing import Optional, Dict, Tuple, Any
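
The environment variables above only cover part of reproducibility; the random number generators themselves also need a seed. A minimal sketch, assuming a hypothetical helper named `seed_everything` and reusing 51 as the seed (neither is part of the original notebook):

```python
import random


def seed_everything(seed: int = 51) -> None:
    """Seed every random number generator available in this environment.

    `seed_everything` is a hypothetical helper, not a FiftyOne or PyTorch API.
    """
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is not installed; nothing to seed
    try:
        import torch

        torch.manual_seed(seed)  # seeds the CPU and CUDA generators
    except ImportError:
        pass  # PyTorch is not installed; nothing to seed


seed_everything(51)
```

Calling it once near the top of each notebook in the series should keep shuffles and weight initialization repeatable from run to run.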

## Loading the MNIST Dataset from FiftyOne's Dataset Zoo

A FiftyOne dataset wraps together annotations and image data into a unified, queryable structure. Unlike traditional approaches where you manage separate files for images and labels, FiftyOne treats each sample as a rich object containing the image, ground truth labels, metadata, and any predictions or embeddings you add later.

Loading MNIST from the [FiftyOne Dataset Zoo](https://docs.voxel51.com/dataset_zoo/index.html) is straightforward. We'll start by loading the test split.

In [None]:
# We will load the test split from the dataset first
test_dataset = foz.load_zoo_dataset("mnist", split='test', dataset_name="mnist-test-set", persistent=True)
test_dataset

Let's launch the FiftyOne App to visualize the test set.

In [None]:
session = fo.launch_app(test_dataset, auto=False)
print(session.url)

With `compute_metadata()`, we can populate each sample's `metadata` field with the file size in bytes, MIME type, width, height, and number of channels of its image.

In [None]:
test_dataset.compute_metadata()

We can perform [aggregations](https://docs.voxel51.com/user_guide/using_aggregations.html) on the dataset to explore its properties. For example, we can find the range, mean, and standard deviation of the image file sizes in bytes.

In [None]:
print(f'File size (bytes) range: {test_dataset.bounds("metadata.size_bytes")}')
print(f'File size (bytes) mean: {test_dataset.mean("metadata.size_bytes"):.2f}')
print(f'File size (bytes) std dev: {test_dataset.std("metadata.size_bytes"):.2f}')
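
To make explicit what these three aggregations return, the sketch below reproduces them in plain Python on a small list of made-up file sizes. The helper name `summarize` and the numbers are illustrative assumptions; in the notebook, the real values could be pulled with `test_dataset.values("metadata.size_bytes")`. The sketch uses the population standard deviation, which is assumed here to match FiftyOne's default.

```python
import statistics


def summarize(values):
    """Plain-Python counterpart of the bounds / mean / std aggregations."""
    return {
        "bounds": (min(values), max(values)),
        "mean": statistics.fmean(values),
        # Population standard deviation (divides by N, not N - 1)
        "std": statistics.pstdev(values),
    }


# Hypothetical file sizes in bytes, for illustration only
sizes = [300, 320, 340, 360, 380]
stats = summarize(sizes)

print(f"File size (bytes) range: {stats['bounds']}")
print(f"File size (bytes) mean: {stats['mean']:.2f}")
print(f"File size (bytes) std dev: {stats['std']:.2f}")
```

Running the same helper on the real values is a quick way to cross-check the numbers reported by the aggregations.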

## Visualizing Distributions

In the FiftyOne App, you can visualize the distributions of any field. This is useful for checking class balance or understanding metadata distributions.

1. Click the `+` symbol next to the `Samples` tab.
2. Select `Histograms`.
3. Choose `ground_truth.label` from the dropdown to see the class distribution.
4. Add another histogram for `metadata.size_bytes`.

You should see that the classes are roughly balanced, with each digit making up about a tenth of the test split. This means overall accuracy will not be dominated by any single digit.

![Histogram of ground_truth.label counts for the MNIST test split](https://github.com/andandandand/practical-computer-vision/blob/main/images/ground_truh_distribution_mnist.png?raw=true)
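
The same check can be done numerically. `count_values()` is a FiftyOne aggregation that returns a dictionary mapping each label to its count; the sketch below shows one way to turn such a dictionary into class shares and a max/min imbalance ratio. The helper `balance_report` and the example counts are hypothetical, included only so the cell runs without a dataset.

```python
def balance_report(counts):
    """Return each class's share of the total and the max/min count ratio."""
    total = sum(counts.values())
    shares = {label: n / total for label, n in sorted(counts.items())}
    ratio = max(counts.values()) / min(counts.values())
    return shares, ratio


# In the notebook, the counts would come from FiftyOne:
#     counts = test_dataset.count_values("ground_truth.label")
# The dictionary below is made up for illustration.
counts = {"class_a": 98, "class_b": 114, "class_c": 103}

shares, ratio = balance_report(counts)
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"Largest class is {ratio:.2f}x the smallest")
```

A ratio close to 1 indicates balanced classes; a much larger value would suggest looking at per-class metrics rather than overall accuracy alone.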

In [None]:
session.refresh()

## Next Steps

Now that we have explored the basic properties of the MNIST dataset, we are ready for the next step: creating image embeddings with CLIP to understand the semantic relationships between images.

Proceed to `2_clip_embeddings.ipynb`.