# Tips and Tricks for Using FiftyOne inside Google Colab

Working with FiftyOne in Google Colab presents unique challenges: temporary instances that lose your data, installation overhead on every restart, and the need to manage datasets across sessions. This notebook shares battle-tested techniques to make your FiftyOne workflow in Colab faster, more reliable, and reproducible.

You'll learn how to:
- Install FiftyOne using modern tools like `uv`
- Persist your datasets, models, and configurations across sessions with Google Drive
- Share datasets with collaborators
- Leverage GPU acceleration for faster inference
- Control the FiftyOne App for a better development experience

Whether you're curating datasets, running computer vision experiments, or building ML pipelines, these tips will help you work more efficiently in the ephemeral Colab environment.

## Tip: Clean and fast FiftyOne installation with `uv` and `%%capture`

Using `uv` for package installation and `%%capture` to suppress output are great practices for creating clean and efficient Colab notebooks.

In Google Colab, the virtual machine instance you are using is temporary. When you close your browser tab or the notebook becomes idle for too long, the instance is recycled and all installed packages are lost. This is why having a quick way to re-install libraries like FiftyOne is very valuable, and `uv` helps significantly with this.

### Why use `uv`?

`uv` is a modern, fast Python package installer and resolver. It is designed to be significantly quicker than traditional tools like `pip`. Using `uv` can drastically reduce the time it takes to set up your environment, especially when dealing with many dependencies. This is useful in Colab notebooks, as the environment needs to be set up again each time the instance is recycled.

### Why use `%%capture`?

The `%%capture` magic command is used to suppress the standard output and standard error streams of a cell. When installing packages, the output can often be very verbose. Using `%%capture` keeps your notebook clean and focused on the results of your code, rather than the installation process details.

### Explicitly specifying package versions

It's always a good practice to make the version of the library you are using explicit in your notebook. This is crucial for reproducibility. Different versions of libraries like FiftyOne can have API changes or different dependencies. By specifying the version, you ensure that anyone running your notebook in the future will use the exact same environment you did, saving time and effort in reproducing your work.

Let's look at the code cells in this section:

This cell uses `uv` to install the `fiftyone` library with a specific version (`==1.8.0`). The `!` at the beginning indicates a shell command, and `%%capture` suppresses the installation output.

In [None]:
%%capture
# %%capture must be the first line of the cell; it keeps the install trace out of the notebook
!uv pip install fiftyone==1.8.0

This cell imports the FiftyOne library, aliasing it as `fo` for easier use in the notebook.

In [None]:
import fiftyone as fo


This cell prints the installed version of FiftyOne to confirm that the correct version was installed.

In [None]:
print(f"FiftyOne version installed: {fo.__version__}")

FiftyOne version installed: 1.8.0
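
To make the version pin enforceable rather than just visible, a notebook can compare the installed version against the expected one and warn early. The following is a minimal sketch using only the standard library; the helper name `installed_version` and the `EXPECTED` constant are illustrative and not part of FiftyOne.

```python
from importlib import metadata

EXPECTED = "1.8.0"  # hypothetical pin; keep it in sync with the install cell


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("fiftyone")
if found != EXPECTED:
    print(f"Warning: expected fiftyone=={EXPECTED}, found {found}")
```

Reading the version from package metadata avoids importing the library just to check it, so the check can run right after the install cell.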


## Tip: use Google Drive for persistent storage of data and models across sessions

This is an important workflow for any serious work in Colab. It ensures that your data, models, and datasets are not lost when the virtual machine instance is recycled.

Colab instances are typically recycled after a period of inactivity (usually around 90 minutes) or when you close your browser tab. Instances also have a maximum lifetime (currently 12 hours), after which they are also recycled. This means any data and work not saved to persistent storage will be lost.

We will use our Google Drive account as persistent storage so that we can save our work. Keep in mind that if we share the notebook with collaborators or the outside world, they will only be able to access the data in our Google Drive that we have explicitly shared with them. We will explore this soon.

In [None]:
# You will be asked to authorize access to Drive after running this.
# Note that this connects to your own Google Drive account.
# It doesn't connect to folders from others unless they have been shared with you.
from google.colab import drive
drive.mount('/gdrive')
%cd /gdrive

Mounted at /gdrive
/gdrive


Export and import your persisted data using your Google Drive path.

In [None]:
import os
from pathlib import Path
# Notice that this is a Google Drive path
save_path = Path('/gdrive/MyDrive/fiftyone_dataset_curation')
os.makedirs(save_path, exist_ok=True)

We specify the location of our MongoDB database for FiftyOne in the folder that we have chosen.

In [None]:
# path to the MongoDB database
database_path = save_path / "mongodb"
os.makedirs(database_path, exist_ok=True)
fo.config.database_dir = str(database_path)

Let's demo this with a sample from the COCO-2017 dataset.

In [None]:
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    max_samples=200,
)

Downloading split 'validation' to '/root/fiftyone/coco-2017/validation' if necessary
Downloading annotations to '/root/fiftyone/coco-2017/tmp-download/annotations_trainval2017.zip'
 100% |██████|    1.9Gb/1.9Gb [4.5s elapsed, 0s remaining, 518.7Mb/s]
Extracting annotations to '/root/fiftyone/coco-2017/raw/instances_val2017.json'
Downloading 200 images
 100% |██████████████████| 200/200 [6.9s elapsed, 0s remaining, 31.6 images/s]
Writing annotations for 200 downloaded samples to '/root/fiftyone/coco-2017/validation/labels.json'
Dataset info written to '/root/fiftyone/coco-2017/info.json'
You are running the oldest supported major version of MongoDB. Please refer to https://deprecation.voxel51.com for deprecation notices. You can suppress this exception by setting your `database_validation` config parameter to `False`. See https://docs.voxel51.com/user_guide/config.html#configuring-a-mongodb-connection for more information
Loading 'coco-2017' split 'validation'
 100% |█████████████████| 200/200 [1.0s elapsed, 0s remaining, 198.7 samples/s]
Dataset 'coco-2017-validation-200' created


Export the dataset to a Google Drive path.

In [None]:
# Export the dataset to a folder
export_dir = save_path / "coco-2017"
dataset.export(
    export_dir=str(export_dir),
    dataset_type=fo.types.FiftyOneDataset,
    # set to True to export the original images alongside annotations
    export_media=True,
)

Directory '/gdrive/MyDrive/fiftyone_dataset_curation/coco-2017' already exists; export will be merged with existing files
Exporting samples...
 100% |████████████████████| 200/200 [1.5m elapsed, 0s remaining, 2.8 docs/s]


We can later reload the data using the same path.

In [None]:
reloaded_dataset = fo.Dataset.from_dir(
    dataset_dir=export_dir,
    dataset_type=fo.types.FiftyOneDataset,
)

Importing samples...
 100% |█████████████████| 200/200 [12.6ms elapsed, 0s remaining, 15.9K samples/s]



## Tip: use `gdown` to download shared zipped folders

When a dataset is shared as a Google Drive folder, using a tool like `gdown` is more convenient than manually downloading and re-uploading the data to your Colab environment. `gdown` is a command-line utility that allows you to download files and folders directly from a public Google Drive link.

This approach is particularly useful because it automates the download process and can be included directly in your notebook, making your workflow fully reproducible.

First, let's install `gdown` using `uv`, keeping the notebook clean with `%%capture`.

In [None]:
%%capture
# %%capture must be the first line of the cell; it keeps the install trace out of the notebook
!uv pip install gdown==5.2.0

Next, we'll define the URL of the shared folder and specify a local path where we want to save the data. We will download the same COCO-2017 sample dataset from the link provided below.

I have created a shared link with public access here: https://drive.google.com/drive/folders/1G6JKGm0sy5d5ViEpDktXMxIqq5cl2Nrc?usp=drive_link


In [None]:
# This assumes 'save_path' is defined from a previous cell.
# Let's ensure it exists.
save_path = Path('/gdrive/MyDrive/fiftyone_dataset_curation')
download_output_path = save_path / "gdown_downloads"
os.makedirs(download_output_path, exist_ok=True)

# The public Google Drive folder URL for the COCO-2017 samples
folder_url = "https://drive.google.com/drive/folders/1G6JKGm0sy5d5ViEpDktXMxIqq5cl2Nrc?usp=drive_link"

# Use gdown to download the entire folder.
# The --folder flag specifies that we are downloading a folder.
# The -O flag sets the output directory.
!gdown --folder "{folder_url}" -O "{download_output_path}"

Notice that we have hit a limit:

```bash
The gdrive folder with url: https://drive.google.com/drive/folders/1fe
	UZBqLJm_2OoxLKTxV53SGPZvUYPa49?hl=en has more than 50 files, gdrive
	can't download more than this limit.
```

The folder has to be zipped first to bypass this limit, and we must then use the shared link to the zipped file.

In [None]:
import os
import zipfile

# Define the path to the folder you want to zip
folder_to_zip = '/gdrive/MyDrive/fiftyone_dataset_curation/coco-2017'

# Define the name and path for the output zip file
output_zip_file = '/gdrive/MyDrive/fiftyone_dataset_curation/coco-2017.zip'

# Create a ZipFile object in write mode
with zipfile.ZipFile(output_zip_file, 'w', zipfile.ZIP_DEFLATED) as zipf:
    # Walk through all the files and subdirectories in the folder
    for root, dirs, files in os.walk(folder_to_zip):
        for file in files:
            # Create the full path to the file
            file_path = os.path.join(root, file)
            # Add the file to the zip archive, preserving the directory structure
            zipf.write(file_path, os.path.relpath(file_path, folder_to_zip))

print(f"Folder '{folder_to_zip}' successfully zipped to '{output_zip_file}'")
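
Before sharing a folder, it can help to check whether it exceeds the 50-file limit reported above and zip it only when needed. The sketch below uses `shutil.make_archive` as a shorter alternative to walking the tree by hand. The helper names `count_files` and `zip_folder` and the constant `GDOWN_FOLDER_LIMIT` are illustrative, and a temporary folder stands in for the Drive folder so the snippet runs anywhere.

```python
import os
import shutil
import tempfile
from pathlib import Path

GDOWN_FOLDER_LIMIT = 50  # limit taken from the gdown message above


def count_files(folder):
    """Count regular files under `folder`, including subdirectories."""
    return sum(len(files) for _, _, files in os.walk(folder))


def zip_folder(folder, zip_stem):
    """Zip the contents of `folder` into `<zip_stem>.zip` and return its path."""
    return shutil.make_archive(str(zip_stem), "zip", root_dir=str(folder))


# Self-contained demo on a throwaway folder (a stand-in for the Drive folder)
demo_root = Path(tempfile.mkdtemp())
demo_folder = demo_root / "dataset"
(demo_folder / "data").mkdir(parents=True)
for i in range(60):
    (demo_folder / "data" / f"img_{i}.txt").write_text("x")

n_files = count_files(demo_folder)
archive = None
if n_files > GDOWN_FOLDER_LIMIT:
    # Write the archive next to the folder, not inside it
    archive = zip_folder(demo_folder, demo_root / "dataset")
    print(f"{n_files} files found; zipped to {archive}")
```

In the notebook, you would pass the Drive paths used in the cell above instead of the temporary folder.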

We need to create the shared link again:

![](https://github.com/andandandand/practical-computer-vision/blob/main/images/create_sharing_link.png?raw=true)

You will get a "Link copied" message, then you will need to click on `Manage Access`. Be sure to give `Anyone with the link` the `Viewer` permission if you want this file to be shared with others.

![](https://github.com/andandandand/practical-computer-vision/blob/main/images/manage_access.png?raw=true)

Now try running the download again! You will need to unzip the file to get the loading to work.

In [None]:
# The public Google Drive file URL for the zipped COCO-2017 samples
file_url = "https://drive.google.com/file/d/1UyL0clgoIPSYDWHlPjLP0IPZoFRQQJDE/view?usp=drive_link"

# Use gdown to download a single file (no --folder flag this time).
# The --fuzzy flag lets gdown extract the file ID from a regular share link.
# The -O flag sets the output file path.
!gdown --fuzzy "{file_url}" -O "{download_output_path}/coco-2017.zip"

Or in pure Python:

In [None]:
import gdown
import os
import zipfile
from pathlib import Path
import fiftyone as fo

# --- 1. Setup Paths ---
# This assumes 'save_path' is defined from a previous cell.
# Let's ensure it exists.
save_path = Path('/gdrive/MyDrive/fiftyone_dataset_curation')
download_dir = save_path / "gdown_downloads"
os.makedirs(download_dir, exist_ok=True)

# Define the full path for the downloaded zip file
zip_file_path = download_dir / "coco_samples_dataset.zip"

# Define the directory where the contents will be unzipped
unzip_dir = download_dir / "coco_samples_unzipped"
os.makedirs(unzip_dir, exist_ok=True)


# --- 2. Download the ZIP file from Google Drive ---
# The public Google Drive file URL for the zipped COCO-2017 samples
file_url = "https://drive.google.com/file/d/1UyL0clgoIPSYDWHlPjLP0IPZoFRQQJDE/view?usp=drive_link"

print(f"Downloading dataset from Google Drive to '{zip_file_path}'...")
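# Use gdown.download to fetch the file; fuzzy=True lets gdown
# extract the file ID from a regular ".../file/d/<id>/view" share link
gdown.download(url=file_url, output=str(zip_file_path), quiet=False, fuzzy=True)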
print("Download complete.")


# --- 3. Unzip the File using the `zipfile` library ---
print(f"Unzipping '{zip_file_path}' to '{unzip_dir}'...")

# Use a 'with' statement to safely open and extract the zip archive
try:
    with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
        zip_ref.extractall(unzip_dir)
    print("Unzipping successful.")
except zipfile.BadZipFile:
    print("Error: The downloaded file is not a valid zip file or is corrupted.")
except Exception as e:
    print(f"An error occurred during unzipping: {e}")


# --- 4. Load the dataset into FiftyOne ---
# The directory containing the unzipped FiftyOne dataset
# Note: The actual dataset might be in a subdirectory within `unzip_dir`.
# For this specific zip file, the data is in the root of the unzipped folder.
dataset_dir = unzip_dir

print(f"Loading FiftyOne dataset from '{dataset_dir}'...")

# Check if the directory exists and appears to be a FiftyOne dataset
if (Path(dataset_dir) / "metadata.json").exists():
    # Use Dataset.from_dir to load the dataset from its source directory
    dataset = fo.Dataset.from_dir(
        dataset_dir=str(dataset_dir),
        dataset_type=fo.types.FiftyOneDataset,
        name="coco-samples-from-drive" # Give the dataset a unique name
    )

    print("\nDataset loaded successfully!")
    print(dataset)

    # You can now work with the dataset, for example, launch the App
    # session = fo.launch_app(dataset)
    # print(session.url)
else:
    print(f"Error: Could not find a valid FiftyOne dataset at '{dataset_dir}'.")
    print("Please check the contents of the unzipped folder.")

**But then you might get a message like this:**

```bash
Cannot retrieve the folder information from the link. You may need to
	change the permission to 'Anyone with the link', or have had many
	accesses. Check FAQ in https://github.com/wkentaro/gdown?tab=readme-
	ov-file#faq.
```
**"...or have had many accesses"**: what can you do then?
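
When debugging link problems like these, it helps to confirm which ID a share link actually points to and whether it is a file or a folder link. The sketch below extracts the ID with a regular expression. The helper `extract_drive_id` is hypothetical, and the patterns are an assumption about the share-link formats used in this notebook (`/file/d/<id>/`, `/drive/folders/<id>`, and `?id=<id>`), not an official specification.

```python
import re


def extract_drive_id(url):
    """Return the ID embedded in a Drive file or folder share link, or None."""
    match = re.search(r"/(?:file/d|drive/folders)/([\w-]+)", url)
    if match:
        return match.group(1)
    # Fall back to links that carry the ID as a query parameter
    match = re.search(r"[?&]id=([\w-]+)", url)
    return match.group(1) if match else None


file_url = "https://drive.google.com/file/d/1UyL0clgoIPSYDWHlPjLP0IPZoFRQQJDE/view?usp=drive_link"
folder_url = "https://drive.google.com/drive/folders/1G6JKGm0sy5d5ViEpDktXMxIqq5cl2Nrc?usp=drive_link"

print("file ID:  ", extract_drive_id(file_url))
print("folder ID:", extract_drive_id(folder_url))
```

A link containing `/drive/folders/` needs gdown's folder mode, while a `/file/d/` link is a single-file download.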

## Tip: share datasets instantly by using `"Add Shortcut to Drive..."`

Adding a shortcut from a shared Google Drive folder to your own account allows you fast access to the data **without having to download it, or compress and decompress it**. It's a much more immediate option than using `gdown`.

Follow these steps to add a shortcut to a shared Google Drive folder in your own Drive for quick access.

You can try the following steps using this public access link to samples from the COCO-2017 validation set that I created: https://drive.google.com/drive/folders/1G6JKGm0sy5d5ViEpDktXMxIqq5cl2Nrc?usp=sharing




1. Open Google Drive

- Go to [drive.google.com](https://drive.google.com) and sign in with your Google account if you are not already signed in.

2. (Optional) Locate the Shared Folder

- If someone shared the folder with you through an invitation (instead of a link), look under the **"Shared with me"** section on the left sidebar.

3. Add the Shortcut

- Right-click (or two-finger click on a touchpad) on the folder name in Google Drive.
![](https://github.com/andandandand/practical-computer-vision/blob/main/images/add_shortcut.png?raw=true)

- Select **Organize > Add shortcut to Drive** from the menu.

4. Choose Shortcut Location

- A dialog box will appear. Select **My Drive** or any other folder in your Drive where you want the shortcut to appear.
- Click **Add shortcut** to confirm.

5. Access Your Shortcut

- Go to the location you selected (e.g., "My Drive").
- You will see the shortcut there, marked with a small arrow icon.
- You will be able to refer to this folder using the path specified:

```python
from pathlib import Path
import os
path = Path("/gdrive/MyDrive/<folder_name>")
# This should print the filenames inside the folder
os.listdir(path)
```




Shortcuts are just pointers to the original folder: they do not make a copy or use your storage quota. Any changes you make to files within the shortcut are reflected in the original folder. They are an instant way to share data.

## Tip: save your FiftyOne setup on Drive by modifying `fo.config`

We point FiftyOne at the Google Drive folder for dataset downloads, models, and the MongoDB database by modifying `fo.config`.

In [None]:
# Check the state of fo.config before doing any modification
print(fo.config)

In [None]:
# Where we will download the data when using the FiftyOne dataset zoo
# https://docs.voxel51.com/dataset_zoo/index.html
dataset_zoo_path = save_path / "fo_dataset_zoo"
os.makedirs(dataset_zoo_path, exist_ok=True)
fo.config.dataset_zoo_dir = str(dataset_zoo_path)

# path to the MongoDB database
database_path = save_path / "mongodb"
os.makedirs(database_path, exist_ok=True)
fo.config.database_dir = str(database_path)

models_path = str(save_path / "models")
os.makedirs(models_path, exist_ok=True)
fo.config.model_zoo_dir = models_path

model = foz.load_zoo_model("clip-vit-base32-torch")
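
The three assignments above follow the same pattern, so they can be grouped in one helper that creates the folders and returns the paths keyed by the `fo.config` attribute they belong to. This is a minimal sketch: the function name `make_fiftyone_dirs` is illustrative, and a temporary folder stands in for the Drive path so the snippet runs outside Colab.

```python
import tempfile
from pathlib import Path


def make_fiftyone_dirs(base_dir):
    """Create the storage folders under `base_dir` and return their paths.

    Keys match the fo.config attributes used in the cell above.
    """
    layout = {
        "dataset_zoo_dir": Path(base_dir) / "fo_dataset_zoo",
        "database_dir": Path(base_dir) / "mongodb",
        "model_zoo_dir": Path(base_dir) / "models",
    }
    for path in layout.values():
        path.mkdir(parents=True, exist_ok=True)
    return {key: str(path) for key, path in layout.items()}


# Demo with a temporary folder standing in for the Drive path
dirs = make_fiftyone_dirs(tempfile.mkdtemp())
for key, value in dirs.items():
    print(f"{key} -> {value}")
    # In the notebook you would apply each entry with:
    # setattr(fo.config, key, value)
```

Keeping the layout in one place makes it easier to use the same folder structure across notebooks.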

## Tip: add your HuggingFace token as a secret in Colab

Adding a HuggingFace token as a secret to your Google Colab notebook allows you to share your work using models and data from HF without exposing your personal information.

### Why You Need a Token

*   **Access Restricted Models:** Many powerful models on HuggingFace, like Meta's DINOv3 family, are "gated," meaning you must agree to their terms before you can download them. Your token verifies that you have the necessary permissions.
*   **Avoid Rate Limiting:** Without a token, you are an anonymous user and subject to strict download limits. Using a token identifies you and grants you significantly higher download rates, preventing interruptions in your workflow.
*   **Upload Your Own Work:** If you plan to share your fine-tuned models or datasets with the community, your token acts as your credential, allowing you to upload and manage content on the HuggingFace Hub.

---

### Step 1: Create a HuggingFace Account

1.  Navigate to the [HuggingFace website](https://huggingface.co/).
2.  Click the "**Sign Up**" button located in the top-right corner.
3.  You can register using your email or by linking your Google or GitHub account.
4.  Complete the process by verifying your email address.

### Step 2: Generate Your Access Token

1.  Once logged in, click on your profile picture at the top-right and select "**Settings**" from the dropdown menu.
2.  In the left sidebar, click on "**Access Tokens**."
3.  Click the "**New token**" button.
4.  Give your token a descriptive name (e.g., "Google Colab Projects").
5.  Assign it the "**Write**" role to enable both downloading and uploading.
6.  Click "**Generate a token**."
7.  Immediately copy the generated token and store it in a secure location, as you will not be able to see it again.

### Step 3: Use Your Token in Google Colab

The most secure way to use your token in Google Colab is with the built-in Secrets manager. This prevents your token from being exposed in your code.

1.  **Install Required Libraries:**
    In a Colab cell, run the following command to install the necessary HuggingFace libraries:
    ```python
    !pip install -q huggingface_hub transformers
    ```

2.  **Store the Token in Colab Secrets:**
    *   In your Colab notebook, click the **key icon** (🔑) in the left sidebar.
    *   Click "**Add a new secret**."
    *   For the **Name**, enter `HUGGINGFACE_TOKEN`.
    *   In the **Value** field, paste your copied access token.
    *   Ensure the "**Notebook access**" toggle is enabled.

3.  **Access the Token in Your Notebook:**
    Use the following code to securely load your token from Colab Secrets into your environment.
    ```python
    import os
    from google.colab import userdata

    # Load the token from Colab secrets
    hf_token = userdata.get('HUGGINGFACE_TOKEN')
    # huggingface_hub looks for the token in the HF_TOKEN environment variable
    os.environ["HF_TOKEN"] = hf_token

    # You can now use the HuggingFace API
    from huggingface_hub import HfApi
    api = HfApi(token=hf_token)

    # Example: ask the Hub which account the token belongs to,
    # which fails if the token is invalid
    user_info = api.whoami()
    print(f"Successfully connected as: {user_info['name']}")
    ```

---

### Security Best Practices

*   **Never expose your token** in plain text in your notebook cells or share it publicly.
*   **Use the Secrets manager** as the primary method for handling tokens in Colab.
*   If you believe your token has been compromised, go to your HuggingFace settings and revoke it immediately.
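
If you want to confirm that a token was loaded without printing it in full, you can show only a short prefix. The sketch below is one way to do that; `mask_token` is a hypothetical helper, and the token shown is a made-up placeholder rather than a real credential.

```python
def mask_token(token, visible=4):
    """Return `token` with everything after the first `visible` characters hidden."""
    if not token:
        return "<missing>"
    if len(token) <= visible:
        return "*" * len(token)
    return token[:visible] + "*" * (len(token) - visible)


# Placeholder value for illustration only; never hard-code a real token
example_token = "hf_abcdefgh"
print("Token loaded:", mask_token(example_token))
print("Token loaded:", mask_token(None))
```

This keeps notebook outputs safe to share while still showing whether the secret was found.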


### Getting a dataset from HuggingFace's Hub

Having the HF token set up, interaction with the platform is much smoother.

In [None]:
from fiftyone.utils.huggingface import load_from_hub

other_datasets_path = save_path / "other_datasets"
os.makedirs(other_datasets_path, exist_ok=True)
fo.config.default_dataset_dir = str(other_datasets_path)

curated_mnist_dataset = load_from_hub("Voxel51/curated-mnist",
                                      max_samples=100)

## Tip: control the launch of the FiftyOne app in a separate browser tab

Launching the FiftyOne app in a separate browser tab provides a full window view. This is often more convenient and provides a better user experience than launching the app directly within a notebook cell.

In [None]:
# Passing auto=False prevents the app from launching in its own notebook cell
session = fo.launch_app(dataset, auto=False)
# Printing session.url gives us a URL that we can click on whenever we want
print(f"Just click here to get to the app (whenever you want, no auto launch) {session.url}")


![](https://github.com/andandandand/practical-computer-vision/blob/main/images/full_window_view.png?raw=true)

`session.open_tab()` is another option; however, it opens the app automatically, which can feel disruptive. I prefer to use `print(session.url)` to control when I access the FiftyOne app.


In [None]:
# session.open_tab() is another option, we just don't get to see the URL here
#session.open_tab()

## Tip: keep a single instance of the app running

Each time you call `fo.launch_app()`, FiftyOne launches a new server process in the background to power the app. While you can launch multiple apps, it's good practice to manage a single instance to conserve resources and avoid confusion. If you see the FiftyOne app "flickering", it's often because two instances of it are running at the same time.

When you launch the app, store the returned `session` object. You can use this object to programmatically control the app. If you have a session running and want to get a handle on it again, you can call `fo.launch_app()` again, and FiftyOne will simply give you a handle to the existing session without creating a new one.

Let's see this in action. First, we launch the app and get our session object.

In [None]:
# The dataset is already loaded from a previous cell
# dataset = foz.load_zoo_dataset("coco-2017", split="validation", max_samples=200)

# If it's flickering it's because we have already called fo.launch_app() inside the notebook
# This will gracefully shut down the App server associated with the previous session
session.close()

# Launch the app and store the session.
session = fo.launch_app(dataset)

Now you can perform other operations in your notebook. When you want to interact with the app again, for example, to load a new view, you can just use the `session` object.

In [None]:
from fiftyone import ViewField as F

dog_view = dataset.filter_labels("ground_truth", F("label") == "dog")
session.view = dog_view

If you close the App's cell or tab, the underlying Python process is still running. You can simply open a new tab to the App's URL. If you need to stop the App server completely, you can close the session.
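
One stumbling block with this pattern is that `session.close()` raises a `NameError` if no earlier cell has created `session`, for example when cells are run out of order. A minimal sketch of a guard follows. `close_if_open` and `_FakeSession` are hypothetical names used only for illustration, and the stub stands in for a real FiftyOne session so the snippet runs anywhere.

```python
def close_if_open(namespace, name="session"):
    """Close namespace[name] if it exists and has a close() method.

    Returns True if something was closed, False otherwise.
    """
    existing = namespace.get(name)
    if existing is not None and hasattr(existing, "close"):
        existing.close()
        return True
    return False


class _FakeSession:
    """Stand-in for a FiftyOne session, used only in this demo."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


demo_namespace = {"session": _FakeSession()}
closed_existing = close_if_open(demo_namespace)
closed_missing = close_if_open({})
print(closed_existing, closed_missing)
```

In the notebook, you could call `close_if_open(globals())` right before `session = fo.launch_app(dataset)`.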

## Tip: use the GPU to run inference faster

To significantly speed up inference and other computations in FiftyOne, you can leverage the power of a GPU. Google Colab provides free access to GPUs, which we can enable by going to:

```
Runtime -> Change Runtime Type -> T4 GPU
```

This gives us access to an NVIDIA T4 GPU with about 16 GB of VRAM. Inference runs much faster with it enabled, which significantly affects how quickly we can obtain predictions or embeddings from our models. We can try this with a speed test.

In [None]:
import torch
# Check if the GPU is available
torch.cuda.is_available()

In [None]:
import time
import fiftyone.brain as fob

# Helper: measure time for computing embeddings into a field
def time_compute_embeddings_and_project(model, label="gpu"):
    start = time.time()
    # Use Brain to both compute embeddings and reduce dimensionality
    res = fob.compute_visualization(
        dataset,
        model=model,
        embeddings="clip_embeddings_" + label,
        brain_key="vis_" + label,
        method='pca',
        batch_size=16,
    )
    end = time.time()
    return end - start

# Load CLIP (OpenAI ViT-B/32) from the model zoo
clip_model = foz.load_zoo_model("clip-vit-base32-torch")
print("Model has embeddings:", getattr(clip_model, "has_embeddings", None))

# Benchmark on GPU (if available)
gpu_time = None
if torch.cuda.is_available():
    # Many FiftyOne zoo models run on GPU automatically if available;
    # timing reflects GPU execution.
    gpu_time = time_compute_embeddings_and_project(clip_model, label="gpu")
    print(f"GPU time (s): {gpu_time:.2f}")

# Force a CPU run for comparison.
# Re-load a fresh model instance to avoid device cross-talk.
# Assumption: the zoo model's config accepts a `device` argument, which
# load_zoo_model() forwards as a keyword. If your FiftyOne version does not,
# set CUDA_VISIBLE_DEVICES="" before launching Python instead.
clip_model_cpu = foz.load_zoo_model("clip-vit-base32-torch", device="cpu")
cpu_time = time_compute_embeddings_and_project(clip_model_cpu, label="cpu")
print(f"CPU time (s): {cpu_time:.2f}")
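
The timing logic in the benchmark can be factored into a small reusable context manager, which keeps the measurement separate from the work being measured. This is a sketch with illustrative names (`timed`, `timings`); it uses `time.perf_counter`, which is better suited to measuring elapsed time than `time.time`, and a dummy workload stands in for the embedding computation.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the elapsed wall-clock seconds of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("demo", timings):
    # Placeholder workload; in the notebook this would be the
    # fob.compute_visualization(...) call
    total = sum(i * i for i in range(100_000))

print(f"demo took {timings['demo']:.4f} s")
```

With both `timings["gpu"]` and `timings["cpu"]` recorded this way, the speedup is simply their ratio.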

## Tip: use the extra RAM if you have it

When working with large datasets, especially with high-resolution images or videos, you may run into memory limitations on the standard Colab runtime (which typically provides around 12GB of RAM). Colab Pro offers a "High-RAM" runtime that provides significantly more memory (around 25GB or more), which can be essential for memory-intensive tasks.

To enable a high-RAM runtime, go to:
```
Runtime -> Change Runtime Type -> Runtime shape -> High-RAM
```
This option will give your notebook access to a machine with a larger memory capacity, allowing you to load and process more data without the session crashing. This is very useful when you need to load a large number of samples into FiftyOne, compute embeddings for your entire dataset, or perform other memory-heavy operations.

FiftyOne is snappy and quick even on low resources.
You can try loading a bigger version of the COCO-2017 dataset and compare the feel of the app when you have the extra RAM available.

In [None]:
import psutil

# Check available RAM
total_ram_bytes = psutil.virtual_memory().total
# Convert 16GB to bytes
sixteen_gb_bytes = 16 * 1024 * 1024 * 1024

if total_ram_bytes > sixteen_gb_bytes:
    try:
        big_dataset = foz.load_zoo_dataset(
            "coco-2017",
            split="validation",
            max_samples=5000,  # Loading 5000 samples requires more memory
        )
        print(f"Successfully loaded a larger dataset with {len(big_dataset)} samples.")
        # You can now launch the app with this larger dataset
        # session = fo.launch_app(big_dataset)
        # print(session.url)
    except Exception as e:
        print(f"An error occurred: {e}")
        print("This may be due to insufficient RAM. Try switching to a High-RAM runtime.")
else:
    print("Skipping loading the larger dataset: Insufficient RAM (less than 16GB) available.")

## Tip: work around `Invalid Notebook` messages on GitHub using the `githubtocolab` URL trick

Many Jupyter and Colab notebooks produce artifacts that don't allow them to render properly on GitHub previews.

![](https://github.com/andandandand/practical-computer-vision/blob/main/images/invalid_notebook.png?raw=true)

Here we have an example: https://github.com/andandandand/practical-computer-vision/blob/main/notebooks/Food_Dataset_Curation_with_Fiftyone.ipynb

If we change the domain in the URL from `github.com` to `githubtocolab.com`, the same notebook opens in Colab and renders immediately. Try it out!

https://githubtocolab.com/andandandand/practical-computer-vision/blob/main/notebooks/Food_Dataset_Curation_with_Fiftyone.ipynb

![](https://github.com/andandandand/practical-computer-vision/blob/main/notebooks/working_colab.png?raw=true)
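
The domain swap is easy to automate when you have many notebook links to convert. The sketch below rewrites only the host part of the URL; `to_colab_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def to_colab_url(github_url):
    """Rewrite a github.com notebook URL to its githubtocolab.com equivalent."""
    parts = urlsplit(github_url)
    if parts.netloc != "github.com":
        raise ValueError(f"Not a github.com URL: {github_url}")
    return urlunsplit(parts._replace(netloc="githubtocolab.com"))


notebook_url = (
    "https://github.com/andandandand/practical-computer-vision/"
    "blob/main/notebooks/Food_Dataset_Curation_with_Fiftyone.ipynb"
)
colab_url = to_colab_url(notebook_url)
print(colab_url)
```

Parsing the URL instead of using a plain string replace avoids touching any `github.com` text that appears later in the path.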

## Conclusion

You now have the tools to work with FiftyOne in Google Colab without fighting the environment. By using `uv` for fast installs, Google Drive for persistent storage, and GPU acceleration for inference, you can focus on your computer vision work instead of environment setup.

These patterns apply beyond FiftyOne. The techniques for managing data persistence, sharing resources, and optimizing compute translate to other ML workflows in Colab.

### Next Steps

- Explore the [FiftyOne documentation](https://docs.voxel51.com) for deeper dives into dataset curation and model evaluation
- Check out the [FiftyOne tutorials](https://docs.voxel51.com/tutorials/index.html) for domain-specific applications


Happy curating!