# FiftyOne: Using Image Embeddings

Original tutorial:

- [Using Image Embeddings](https://docs.voxel51.com/tutorials/image_embeddings.html)
- [FiftyOne Embeddings Visualization](https://docs.voxel51.com/user_guide/brain.html#visualizing-embeddings)

> Covered concepts:
> - Loading datasets from the FiftyOne Dataset Zoo
> - Using compute_visualization() to generate 2D representations of images
> - Providing custom embeddings to compute_visualization()
> - Visualizing embeddings via interactive plots connected to the FiftyOne App
>
> And we’ll demonstrate how to use embeddings to:
> - Identify anomalous/incorrect image labels
> - Find examples of scenarios of interest
> - Pre-annotate unlabeled data for training

In summary, the following animation from [FiftyOne](https://docs.voxel51.com/tutorials/image_embeddings.html#Using-Image-Embeddings) shows the main use case for embeddings: we plot the image vectors in 2D, color them by the predicted class labels, and interactively select the points that look incorrect. The selected samples are then shown in the FiftyOne web UI.

![MNIST embedding selection](../assets/mnist-interactive-fiftyone.gif)


Table of contents:

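1. Setup
2. Upload Dataset: MNIST
3. Compute Image Embeddings
4. Visualize Embeddings and Selected Samples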

## 1. Setup

In [None]:
!pip install fiftyone

In [1]:
!pip install torch torchvision umap-learn

Collecting umap-learn
  Downloading umap-learn-0.5.4.tar.gz (90 kB)
     ---------------------------------------- 0.0/90.8 kB ? eta -:--:--
     ---------------------------------------- 90.8/90.8 kB 5.0 MB/s eta 0:00:00
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting pynndescent>=0.5 (from umap-learn)
  Using cached pynndescent-0.5.10-py3-none-any.whl
Building wheels for collected packages: umap-learn
  Building wheel for umap-learn (setup.py): started
  Building wheel for umap-learn (setup.py): finished with status 'done'
  Created wheel for umap-learn: filename=umap_learn-0.5.4-py3-none-any.whl size=86863 sha256=6c0bb9c0879e573af655e690a1f8681f7b32025033bc37e35e40e0737f7745e9
  Stored in directory: c:\users\msagardi\appdata\local\pip\cache\wheels\e1\8b\ec\51afd5b0c041b6a7dd5777ceb58cc0d645ba9454cc5a923e96
Successfully built umap-learn
Installing collected packages: pynndescent, umap-learn
Successfully installed pynndesc

In [3]:
# Double quotes are required on Windows: with single quotes, cmd.exe reads `<9` as an input redirection
!pip install "ipywidgets>=8,<9"


## 2. Upload Dataset: MNIST

In [4]:
import fiftyone as fo
import fiftyone.zoo as foz

# Datasets downloaded to: C:\Users\Msagardi\fiftyone\mnist\
# Additionally, this command creates a dataset in the FiftyOne database
dataset = foz.load_zoo_dataset("mnist")

Downloading split 'train' to 'C:\Users\Msagardi\fiftyone\mnist\train'
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\train-images-idx3-ubyte.gz


  0%|          | 0/9912422 [00:00<?, ?it/s]

Extracting C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\train-images-idx3-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\train-labels-idx1-ubyte.gz


  0%|          | 0/28881 [00:00<?, ?it/s]

Extracting C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\train-labels-idx1-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\t10k-images-idx3-ubyte.gz


  0%|          | 0/1648877 [00:00<?, ?it/s]

Extracting C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\t10k-images-idx3-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\t10k-labels-idx1-ubyte.gz


  0%|          | 0/4542 [00:00<?, ?it/s]

Extracting C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw\t10k-labels-idx1-ubyte.gz to C:\Users\Msagardi\fiftyone\mnist\tmp-download\MNIST\raw

 100% |█████████████| 60000/60000 [4.7m elapsed, 0s remaining, 391.7 samples/s]      
Downloading split 'test' to 'C:\Users\Msagardi\fiftyone\mnist\test'
 100% |█████████████| 10000/10000 [39.6s elapsed, 0s remaining, 294.5 samples/s]      
Dataset info written to 'C:\Users\Msagardi\fiftyone\mnist\info.json'
Loading 'mnist' split 'train'
 100% |█████████████| 60000/60000 [53.9s elapsed, 0s remaining, 1.2K samples/s]      
Loading 'mnist' split 'test'
 100% |█████████████| 10000/10000 [9.1s elapsed, 0s remaining, 1.1K samples/s]      
Dataset 'mnist' created


In [1]:
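# Load an existing dataset
# After restarting the notebook/session, the dataset can be reloaded by name,
# provided it still exists in the FiftyOne database
# (non-persistent datasets are eventually deleted; set dataset.persistent = True to keep them)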
import fiftyone as fo

dataset = fo.load_dataset("mnist")
print(dataset)

Name:        mnist
Media type:  image
Num samples: 70000
Persistent:  False
Tags:        []
Sample fields:
    id:           fiftyone.core.fields.ObjectIdField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)


In [2]:
# We start working with the test split, which contains 10k images
test_split = dataset.match_tags("test")

In [3]:
print(test_split)

Dataset:     mnist
Media type:  image
Num samples: 10000
Sample fields:
    id:           fiftyone.core.fields.ObjectIdField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
View stages:
    1. MatchTags(tags=['test'], bool=True, all=False)


## 3. Compute Image Embeddings

In [4]:
import cv2
import numpy as np

import fiftyone.brain as fob

# Construct a ``num_samples x num_pixels`` array of images
# Usually, we would first generate image vectors/embeddings with a DL model
# and then apply UMAP/t-SNE to project them to 2D
# However, since MNIST images are so small (28x28), we can ravel them and pass
# the raw pixel values to UMAP/t-SNE as embedding vectors
embeddings = np.array([
    cv2.imread(f, cv2.IMREAD_UNCHANGED).ravel()
    for f in test_split.values("filepath")
])

# Compute 2D representation
# `method` selects the dimensionality reduction technique
# Instead of an array, `embeddings` can also name a dataset field where the vectors are stored,
# or we can pass a `model` that computes them
# https://docs.voxel51.com/api/fiftyone.brain.html#fiftyone.brain.compute_visualization
results = fob.compute_visualization(
    test_split,
    embeddings=embeddings,
    num_dims=2,
    method="umap", # "tsne", "pca", "manual"
    brain_key="mnist_test",
    verbose=True,
    seed=51,
)

Generating visualization...


  warn(f"n_jobs value {self.n_jobs} overridden to 1 by setting random_state. Use no seed for parallelism.")


UMAP(random_state=51, verbose=True)
Sat Nov  4 13:29:40 2023 Construct fuzzy simplicial set
Sat Nov  4 13:29:40 2023 Finding Nearest Neighbors
Sat Nov  4 13:29:40 2023 Building RP forest with 10 trees
Sat Nov  4 13:29:50 2023 NN descent for 13 iterations
	 1  /  13
	 2  /  13
	 3  /  13
	 4  /  13
	Stopping threshold met -- exiting after 4 iterations
Sat Nov  4 13:30:17 2023 Finished Nearest Neighbor Search
Sat Nov  4 13:30:25 2023 Construct embedding


Epochs completed:   0%|            0/500 [00:00]

	completed  0  /  500 epochs
	completed  50  /  500 epochs
	completed  100  /  500 epochs
	completed  150  /  500 epochs
	completed  200  /  500 epochs
	completed  250  /  500 epochs
	completed  300  /  500 epochs
	completed  350  /  500 epochs
	completed  400  /  500 epochs
	completed  450  /  500 epochs
Sat Nov  4 13:31:03 2023 Finished embedding


In [5]:
print(type(results))
print(results.points.shape)

<class 'fiftyone.brain.visualization.VisualizationResults'>
(10000, 2)
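
The two steps above (flattening each image into a vector, then projecting the vectors to 2D) can be reproduced without FiftyOne to check the array shapes. The sketch below is only an illustration under stated assumptions: the images are random synthetic data rather than MNIST, and the projection uses PCA computed with NumPy's SVD instead of the UMAP method used in this notebook.

```python
import numpy as np

# Synthetic stand-in for MNIST: 100 random 28x28 grayscale images (assumption)
rng = np.random.default_rng(51)
images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)

# Flatten each image into a 784-dimensional vector, as ravel() does above
embeddings = np.array([img.ravel() for img in images])
print(embeddings.shape)  # (100, 784)

# Project to 2D with PCA: center the data, then keep the top-2 principal axes
centered = embeddings.astype(np.float64) - embeddings.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
points = centered @ vt[:2].T
print(points.shape)  # (100, 2)
```

As with `results.points`, the output has one 2D point per input image.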


## 4. Visualize Embeddings and Selected Samples

### Visualization and Selection in the Web UI

We can launch the FiftyOne App in several ways; these are the two most common:

1. With code in our environment:

    ```python
    # dataset = test_split
    # Choose one of the following:
    session = fo.launch_app(view=test_split, desktop=True)  # Desktop App
    session = fo.launch_app(view=test_split, desktop=False)  # Browser (http://localhost:5151) or embedded in Jupyter
    ```

2. In the CLI:

    ```bash
    # fiftyone app launch <dataset_name>
    # Note that launching via the CLI loads the entire dataset,
    # not only the test_split
    fiftyone app launch "mnist"
    # Browser: http://localhost:5151
    ```

Note that the main use case of embeddings is to visualize them in a 2D projection in order to select samples with dubious locations/labels. Keep in mind that we generated the `embeddings` vectors only for the subset of samples `test_split = dataset.match_tags("test")`, and that `fob.compute_visualization()` stored the resulting 2D projection on the dataset under the brain key `mnist_test`.

If we launch the app via the CLI, loading the entire dataset, we need to:

- Add a filter/stage `MatchTags(test)` to narrow down to the samples in `test_split`; i.e., we reproduce the command `test_split = dataset.match_tags("test")` in the UI.
- Create an `Embeddings` view in the tabs of the main frame, using:
  - brain key: `mnist_test` - that was created with `fob.compute_visualization()`
  - color by: `ground_truth.label`

Once the embeddings are visualized, we can select odd/borderline points in the plot and switch to the Samples panel to see which images they correspond to.

Recall the basic usage of the UI:

- Left frame: select tags / labels / primitives (features added in code)
- Main frame: we can visualize several **panels**
  - Samples: we can click on each of them and a detailed view is opened
  - Histograms: we can select which variables to plot: labels, scalar values, etc.
  - Embeddings: we can plot scatterplots that represent the dataset
- We can add stages or filters, e.g.:
  - `Limit(int)`: takes the number of samples we specify
  - `MatchTags(str)`: keeps only the samples that have the given tag
  - `SortBy`: sorts the samples by a field
  - ...

![MNIST embedding and sample selection](../assets/mnist_embedding_selection.png)
![MNIST embedding and sample selection](../assets/mnist_embedding_selection_samples.png)
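
The manual selection of borderline points described above could also be approximated in code. The following is a minimal sketch, not part of the original tutorial: it uses synthetic 2D points instead of the UMAP output, plants one wrong label on purpose, and flags every point whose nearest neighbors mostly carry a different label. The threshold `0.5` and `k = 5` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2D points: two well-separated blobs (assumption, not MNIST data)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2))
points = np.vstack([blob_a, blob_b])
labels = np.array([0] * 50 + [1] * 50)

# Plant one suspicious label: a point inside blob A labeled as class 1
labels[3] = 1

# Pairwise Euclidean distances; exclude each point from its own neighbors
dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
np.fill_diagonal(dists, np.inf)

# For each point, the fraction of its k nearest neighbors with another label
k = 5
neighbors = np.argsort(dists, axis=1)[:, :k]
disagreement = (labels[neighbors] != labels[:, None]).mean(axis=1)

# Indices of points whose neighborhood mostly disagrees with their label
suspicious = np.flatnonzero(disagreement > 0.5)
print(suspicious)
```

The dense distance matrix is only practical for small point sets; for the 10,000 test samples a nearest-neighbor index would be a better fit.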

### Visualization and Selection in Interactive Notebook Plots

In [6]:
# Launch App instance from the code using the SDK
# The advantage of this approach is that we get the session object
# Recommendations:
# - Use Jupyter lab
# - Right click on this cell and select: "Create New View for Cell Output"
# - Place cell output side-by-side
# - Continue interacting with the UI using code! :)
session = fo.launch_app(view=test_split)


Could not connect session, trying again in 10 seconds



In [7]:
# Plot embeddings colored by ground truth label
# Using the SDK interaction approach
# we can plot a scatterplot of the embeddings (plotly)
# zoom and select the weird samples
# Then, the UI view is updated! :)
# We can make use of the tools in plotly plots:
# Zoom, lasso selection, label-based selection, etc.
plot = results.visualize(labels="ground_truth.label")
plot.show(height=720)

# Attach plot to session
session.plots.attach(plot)






![MNIST: Selection of embeddings side-by-side](../assets/mnist_embedding_selection_jupyter.png)

In [None]:
session.freeze()  # screenshots App and plot for sharing

### Pre-Annotation of Samples

In [5]:
# Let’s see how compute_visualization() can be used to efficiently pre-annotate the train split with minimal effort.
# First, load existing dataset
# If we restart the notebook/session, we can load the dataset as follows
import fiftyone as fo

dataset = fo.load_dataset("mnist")
print(dataset)

Name:        mnist
Media type:  image
Num samples: 70000
Persistent:  False
Tags:        []
Sample fields:
    id:           fiftyone.core.fields.ObjectIdField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)


In [7]:
import cv2
import numpy as np

import fiftyone.brain as fob

# Now, let's create embeddings for the entire dataset
# Since MNIST images are so small, we ravel them and take their 28x28 pixel values as the embedding vectors
# Construct a ``num_samples x num_pixels`` array of images
embeddings = np.array([
    cv2.imread(f, cv2.IMREAD_UNCHANGED).ravel()
    for f in dataset.values("filepath")
])

# Compute 2D representation
results = fob.compute_visualization(
    dataset,
    embeddings=embeddings,
    num_dims=2,
    method="umap",
    brain_key="mnist",
    verbose=True,
    seed=51,
)

Generating visualization...


  warn(f"n_jobs value {self.n_jobs} overridden to 1 by setting random_state. Use no seed for parallelism.")


UMAP(random_state=51, verbose=True)
Sat Nov  4 14:37:40 2023 Construct fuzzy simplicial set
Sat Nov  4 14:37:40 2023 Finding Nearest Neighbors
Sat Nov  4 14:37:40 2023 Building RP forest with 18 trees
Sat Nov  4 14:37:49 2023 NN descent for 16 iterations
	 1  /  16
	 2  /  16
	 3  /  16
	 4  /  16
	Stopping threshold met -- exiting after 4 iterations
Sat Nov  4 14:38:10 2023 Finished Nearest Neighbor Search
Sat Nov  4 14:38:15 2023 Construct embedding


Epochs completed:   0%|            0/200 [00:00]

	completed  0  /  200 epochs
	completed  20  /  200 epochs
	completed  40  /  200 epochs
	completed  60  /  200 epochs
	completed  80  /  200 epochs
	completed  100  /  200 epochs
	completed  120  /  200 epochs
	completed  140  /  200 epochs
	completed  160  /  200 epochs
	completed  180  /  200 epochs
Sat Nov  4 14:39:34 2023 Finished embedding


In [8]:
from fiftyone import ViewField as F

# Of course, our dataset already has ground truth labels for the train split,
# but let’s pretend that’s not the case
# Label `test` split samples by their ground truth label
# Mark all samples in `train` split as `unlabeled`
expr = F("$tags").contains("test").if_else(F("label"), "unlabeled")
labels = dataset.values("ground_truth", expr=expr)
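
To make the expression above easier to follow, here is a plain-Python sketch of the same rule. It is only an illustration: the `samples` list and the `plot_label` helper are hypothetical stand-ins and do not use the FiftyOne API.

```python
# Hypothetical in-memory samples standing in for FiftyOne samples
samples = [
    {"tags": ["test"], "ground_truth": {"label": "7 - seven"}},
    {"tags": ["train"], "ground_truth": {"label": "2 - two"}},
    {"tags": ["test"], "ground_truth": {"label": "0 - zero"}},
]


def plot_label(sample):
    """Return the ground truth label for test samples, 'unlabeled' otherwise."""
    if "test" in sample["tags"]:
        return sample["ground_truth"]["label"]
    return "unlabeled"


plot_labels = [plot_label(s) for s in samples]
print(plot_labels)  # ['7 - seven', 'unlabeled', '0 - zero']
```

The result is one label per sample, in dataset order, which is the shape that `results.visualize(labels=...)` expects.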

In [12]:
# Launch App instance from the code using the SDK
# The advantage of this approach is that we get the session object
# Recommendations:
# - Use Jupyter lab
# - Right click on this cell and select: "Create New View for Cell Output"
# - Place cell output side-by-side
# - Continue interacting with the UI using code! :)
session = fo.launch_app(dataset)

In [13]:
# Visualize results
# Both splits are plotted: train samples appear as "unlabeled",
# test samples with their ground truth labels (0, 1, 2, ...)
# We use the plotly tools (un/select labels, box/lasso selection, etc.)
plot = results.visualize(labels=labels)
plot.show(height=720)

# Attach plot to session
session.plots.attach(plot)






When the plot is created, the samples from both splits are visualized. The key is that the samples from the train split are labeled as "unlabeled", while the samples from the test split keep their correct labels: 0, 1, 2, ... In the UMAP projection, the unlabeled train samples fall into the same regions as the labeled test samples of the same digit. Therefore, we can select the "unlabeled" samples in each blob and annotate them with the label of that blob, using the plotly tools (un/select labels, box/lasso selection, etc.).

The annotation process is as follows:

- Plot both train & test
- Hide "unlabeled", see blob labels
- Hide labeled, visualize "unlabeled"
- Select a blob in plotly with lasso
- Samples are updated in the UI main frame
- Click on "Tag" icon, provide a tag (e.g., 6 for a blob containing images of 6)
- Apply to samples
- Now, the `sample_tags` should have another tag: "6"

In [14]:
# Take the train split that we pre-annotated
# and print some statistics
train_split = dataset.match_tags("train")

# Print stats about the tags that were added
print(train_split.count_sample_tags())

{'6': 6004, 'train': 60000}


In [15]:
# Convert the sample tags into Classification labels,
# stored in a new field called `hypothesis` that holds our guesses
# https://docs.voxel51.com/user_guide/using_datasets.html#classification
with fo.ProgressBar() as pb:
    for sample in pb(train_split):
        labels = [t for t in sample.tags if t != "train"]
        if labels:
            sample["hypothesis"] = fo.Classification(label=labels[0])
            sample.save()

# Print stats about the labels we created
print(train_split.count_values("hypothesis.label"))



 100% |█████████████| 60000/60000 [47.9s elapsed, 0s remaining, 1.2K samples/s]      
{None: 53996, '6': 6004}
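
The tag-to-label rule used in the loop above can be checked in isolation. The sketch below is a plain-Python illustration with hypothetical sample dictionaries. It stores `None` for samples without a candidate tag, whereas the FiftyOne loop simply leaves the `hypothesis` field unset for them.

```python
from collections import Counter

# Hypothetical samples: tags as they might look after tagging one blob as "6"
samples = [
    {"tags": ["train", "6"]},
    {"tags": ["train"]},
    {"tags": ["train", "6"]},
]

for sample in samples:
    # Every tag other than the split name is treated as a candidate label
    candidates = [t for t in sample["tags"] if t != "train"]
    # Only the first candidate is used, mirroring labels[0] in the loop above
    sample["hypothesis"] = candidates[0] if candidates else None

counts = Counter(s["hypothesis"] for s in samples)
print(dict(counts))  # {'6': 2, None: 1}
```

Because only the first non-split tag is used, a sample that was tagged twice by mistake keeps whichever tag comes first in its tag list.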


In [16]:
# Imagine we do that for all classes
# We would still have some samples that are not labeled
# We can filter them and tag them manually!
no_hypothesis = train_split.exists("hypothesis.label", False)
print(no_hypothesis)

Dataset:     mnist
Media type:  image
Num samples: 53996
Sample fields:
    id:           fiftyone.core.fields.ObjectIdField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
    hypothesis:   fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
View stages:
    1. MatchTags(tags=['train'], bool=True, all=False)
    2. Exists(field='hypothesis.label', bool=False)



```python
# Export options:
# https://docs.voxel51.com/user_guide/export_datasets.html

# Export `hypothesis` labels as a classification directory tree format
# `exists()` ensures that we only export samples with a hypothesis
train_split.exists("hypothesis.label").export(
    export_dir="/path/for/dataset",
    dataset_type=fo.types.ImageClassificationDirectoryTree,
    label_field="hypothesis",
)

# Export **only** labels in the `hypothesis` field as classification labels
# with absolute image filepaths
train_split.exists("hypothesis.label").export(
    dataset_type=fo.types.FiftyOneImageClassificationDataset,
    labels_path="mnist_hypothesis.json",
    label_field="hypothesis",
    abs_paths=True,
)
```

In [19]:
# Export **only** labels in the `hypothesis` field as classification label
# with absolute image filepaths
train_split.exists("hypothesis.label").export(
    dataset_type=fo.types.FiftyOneImageClassificationDataset,
    labels_path="mnist_hypothesis.json",
    label_field="hypothesis",
    abs_paths=True
)



 100% |███████████████| 6004/6004 [4.5s elapsed, 0s remaining, 1.0K samples/s]        


### Re-Load Previous Visualizations

In [20]:
# If you provide the brain_key argument to compute_visualization(),
# then the visualization results that you generate will be saved
# and you can recall them later.
# List brain runs saved on the dataset
print(dataset.list_brain_runs())

# Load the results for a brain run
results = dataset.load_brain_results("mnist_test")
print(type(results))

# Load the dataset view on which the results were computed
results_view = dataset.load_brain_view("mnist_test")
print(len(results_view))

['mnist', 'mnist_test']
<class 'fiftyone.brain.visualization.VisualizationResults'>
10000
