# FiftyOne: Using Image Embeddings

Original tutorial: [FiftyOne Embeddings Visualization](https://docs.voxel51.com/user_guide/brain.html#visualizing-embeddings).

Table of contents:

1. Setup
2. Upload Dataset: MNIST

## 1. Setup

In [None]:
!pip install fiftyone

In [1]:
!pip install torch torchvision umap-learn

Collecting umap-learn
  Downloading umap-learn-0.5.4.tar.gz (90 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting pynndescent>=0.5 (from umap-learn)
  Using cached pynndescent-0.5.10-py3-none-any.whl
Building wheels for collected packages: umap-learn
  Building wheel for umap-learn (setup.py): started
  Building wheel for umap-learn (setup.py): finished with status 'done'
  Created wheel for umap-learn: filename=umap_learn-0.5.4-py3-none-any.whl size=86863 sha256=6c0bb9c0879e573af655e690a1f8681f7b32025033bc37e35e40e0737f7745e9
  Stored in directory: c:\users\msagardi\appdata\local\pip\cache\wheels\e1\8b\ec\51afd5b0c041b6a7dd5777ceb58cc0d645ba9454cc5a923e96
Successfully built umap-learn
Installing collected packages: pynndescent, umap-learn
Successfully installed pynndesc

In [3]:
# Use double quotes: on Windows, single quotes do not protect "<" and ">" from shell redirection
!pip install "ipywidgets>=8,<9"


## 2. Upload Dataset: MNIST

In [4]:
import fiftyone as fo
import fiftyone.zoo as foz

# Datasets downloaded to: C:\Users\Msagardi\fiftyone\mnist\
dataset = foz.load_zoo_dataset("mnist")

Downloading split 'train' to 'C:\Users\Msagardi\fiftyone\mnist\train'
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\train-images-idx3-ubyte.gz

Extracting C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\train-images-idx3-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\train-labels-idx1-ubyte.gz

Extracting C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\train-labels-idx1-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\t10k-images-idx3-ubyte.gz

Extracting C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\t10k-images-idx3-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\t10k-labels-idx1-ubyte.gz

Extracting C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\t10k-labels-idx1-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw

 100% |█████████████| 60000/60000 [4.7m elapsed, 0s remaining, 391.7 samples/s]      
Downloading split 'test' to 'C:\Users\Msagardi\fiftyone\mnist\test'
 100% |█████████████| 10000/10000 [39.6s elapsed, 0s remaining, 294.5 samples/s]      
Dataset info written to 'C:\Users\Msagardi\fiftyone\mnist\info.json'
Loading 'mnist' split 'train'
 100% |█████████████| 60000/60000 [53.9s elapsed, 0s remaining, 1.2K samples/s]      
Loading 'mnist' split 'test'
 100% |█████████████| 10000/10000 [9.1s elapsed, 0s remaining, 1.1K samples/s]      
Dataset 'mnist' created


In [5]:
# Start with the test split, which contains 10k images
test_split = dataset.match_tags("test")

In [6]:
print(test_split)

Dataset:     mnist
Media type:  image
Num samples: 10000
Sample fields:
    id:           fiftyone.core.fields.ObjectIdField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
View stages:
    1. MatchTags(tags=['test'], bool=True, all=False)
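
The next cell uses raw pixel values as embeddings: each image is flattened into one row of a `num_samples x num_pixels` array. As a dependency-free sketch of that reshaping, the snippet below uses nested lists as stand-ins for images. The 28x28 size and the synthetic pixel values are assumptions for illustration only; the notebook itself calls `cv2.imread(...).ravel()` on the real image files.

```python
def flatten_image(image):
    """Flatten a 2D image (a list of rows) into a 1D list in row-major order."""
    return [pixel for row in image for pixel in row]


# Three synthetic 28x28 grayscale "images" standing in for MNIST digits
# (hypothetical data, not read from the dataset)
height, width = 28, 28
images = [
    [[(r * width + c + k) % 256 for c in range(width)] for r in range(height)]
    for k in range(3)
]

# One row per image, one column per pixel: num_samples x num_pixels
embeddings = [flatten_image(img) for img in images]

print(len(embeddings), len(embeddings[0]))  # prints: 3 784
```

Under these assumptions each image contributes 28 * 28 = 784 values, so the full test split would yield an array of 10000 rows.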


In [None]:
import cv2
import numpy as np

import fiftyone.brain as fob

# Construct a ``num_samples x num_pixels`` array of images
embeddings = np.array([
    cv2.imread(f, cv2.IMREAD_UNCHANGED).ravel()
    for f in test_split.values("filepath")
])

# Compute 2D representation
results = fob.compute_visualization(
    test_split,
    embeddings=embeddings,
    num_dims=2,
    method="umap",
    brain_key="mnist_test",
    verbose=True,
    seed=51,
)
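
Once the 2D representation has been computed, a natural next step is to inspect it interactively. The following is a minimal sketch, not part of the original notebook, of how `results` might be plotted and linked to the FiftyOne App. The helper name `show_embeddings` and its `label_field` parameter are hypothetical; it assumes a notebook environment with `ipywidgets` installed and that labels live in the `ground_truth` field shown above.

```python
def show_embeddings(view, results, label_field="ground_truth.label"):
    """Launch the FiftyOne App and attach a scatterplot of the 2D embeddings.

    `view` is the dataset or view the embeddings were computed on and
    `results` is the object returned by `fob.compute_visualization()`.
    """
    # Imported lazily so this helper can be defined before FiftyOne is loaded
    import fiftyone as fo

    session = fo.launch_app(view=view)

    # Color each point by its label
    plot = results.visualize(labels=label_field)
    plot.show(height=720)

    # Link the plot to the App so that selecting points in the plot
    # updates the samples shown in the App
    session.plots.attach(plot)
    return session, plot


# Usage in this notebook (assumes the cells above have been run):
# session, plot = show_embeddings(test_split, results)
```

Returning both `session` and `plot` keeps them referenced in the notebook, so they can be reused in later cells.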