[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/harpreetsahota204/medsiglip/blob/main/using_medsiglip_model.ipynb)

Note: If you are running this notebook in Colab, install the following packages first:

`pip install fiftyone huggingface-hub accelerate`

### ℹ️  Important! Be sure to request access to the model!

This is a gated model, so you will need to fill out the form on the model card: https://huggingface.co/google/medsiglip-448

Approval should be instantaneous.

You'll also have to set your Hugging Face token in your environment:

```bash
export HF_TOKEN="your_token"
```

Or sign in to Hugging Face via the CLI:

```bash
huggingface-cli login
```
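
Before downloading the gated weights, it can help to confirm from Python that the token is actually visible to the notebook process. The following is a minimal sketch using only the standard library. The helper name `has_hf_token` is hypothetical, and it only inspects the `HF_TOKEN` environment variable, so a token stored by the CLI login would not be detected by this check.

```python
import os


def has_hf_token(env=None):
    """Return True if a non-empty HF_TOKEN is present in the given mapping.

    `env` defaults to the current process environment. This helper is
    illustrative only and is not part of any Hugging Face library.
    """
    env = os.environ if env is None else env
    return bool(env.get("HF_TOKEN", "").strip())


if has_hf_token():
    print("HF_TOKEN is set for this process")
else:
    print("HF_TOKEN is not set; export it or sign in via the CLI")
```

If the check reports that the variable is missing inside a notebook, the usual cause is that it was exported in a different shell than the one that started the kernel.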

# How to use MedSigLIP Model for Embeddings and Text Similarity Search

In [1]:
import fiftyone as fo

from fiftyone.utils.huggingface import load_from_hub

dataset = load_from_hub(
    "Voxel51/SLAKE",
    name="SLAKE",
    overwrite=True,
    max_samples=10
    )

Downloading config file fiftyone.yml from Voxel51/SLAKE
Loading dataset
Importing samples...
 100% |███████████████████| 10/10 [2.9ms elapsed, 0s remaining, 3.4K samples/s]      


# Setup Zoo Model

In [2]:
import fiftyone.zoo as foz

foz.register_zoo_model_source("https://github.com/harpreetsahota204/medsiglip", overwrite=True)

Downloading https://github.com/harpreetsahota204/medsiglip...
  120.8Kb [17.5ms elapsed, ? remaining, 6.8Mb/s] 
Overwriting existing model source '/Users/paularamos/fiftyone/__models__/medsiglip'


In [3]:
foz.download_zoo_model(
    "https://github.com/harpreetsahota204/medsiglip",
    model_name="google/medsiglip-448",
)

Fetching 9 files: 100%|██████████| 9/9 [00:07<00:00,  1.26it/s]


(<fiftyone.zoo.models.RemoteZooModel at 0x12e865c90>,
 '/Users/paularamos/fiftyone/__models__/medsiglip/medsiglip-448')

In [None]:
#!pip install sentencepiece protobuf

In [5]:
import fiftyone.zoo as foz

model = foz.load_zoo_model(
    "google/medsiglip-448"
    )

Fetching 9 files: 100%|██████████| 9/9 [00:00<00:00, 6040.76it/s]


# Compute embeddings

In [6]:
dataset.compute_embeddings(
    model=model,
    embeddings_field="medsiglip_embeddings",
)

 100% |███████████████████| 10/10 [15.1s elapsed, 0s remaining, 1.1 samples/s]    


# Compute visualization of embeddings

Note: this step requires `umap-learn` to be installed. Currently, `umap-learn` only supports `numpy<=2.1.0`.
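
Given that version ceiling, a quick check of the installed NumPy version can save a failed run. The sketch below uses only the standard library. The helper names `parse_release` and `numpy_ok_for_umap` are hypothetical, and the `2.1.0` ceiling is taken from the note above, so adjust it if the constraint changes.

```python
import re
from importlib import metadata

# Ceiling taken from the note above; treat it as an assumption that may change.
MAX_NUMPY = (2, 1, 0)


def parse_release(version):
    """Turn a version string such as '2.1.0' or '2.2.0rc1' into a 3-tuple of ints."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as '2.1' to three components
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def numpy_ok_for_umap(version, ceiling=MAX_NUMPY):
    """Return True if `version` is at or below the assumed ceiling."""
    return parse_release(version) <= ceiling


try:
    installed = metadata.version("numpy")
    status = "OK" if numpy_ok_for_umap(installed) else "newer than the stated ceiling"
    print(f"numpy {installed}: {status}")
except metadata.PackageNotFoundError:
    print("numpy is not installed")
```

If the installed version is newer than the ceiling, pinning NumPy when installing (for example `pip install "numpy<=2.1.0" umap-learn`) is one way to satisfy the constraint.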

In [7]:
import fiftyone.brain as fob

results = fob.compute_visualization(
    dataset,
    embeddings="medsiglip_embeddings",
    method="umap",
    brain_key="medsiglip_viz",
    num_dims=2,
)

Generating visualization...


UMAP( verbose=True)
Fri Jul 11 12:33:21 2025 Construct fuzzy simplicial set
Fri Jul 11 12:33:21 2025 Finding Nearest Neighbors
Fri Jul 11 12:33:23 2025 Finished Nearest Neighbor Search
Fri Jul 11 12:33:24 2025 Construct embedding


Epochs completed: 100%| ██████████ 500/500 [00:00]

	completed  0  /  500 epochs
	completed  50  /  500 epochs
	completed  100  /  500 epochs
	completed  150  /  500 epochs
	completed  200  /  500 epochs
	completed  250  /  500 epochs
	completed  300  /  500 epochs
	completed  350  /  500 epochs
	completed  400  /  500 epochs
	completed  450  /  500 epochs
Fri Jul 11 12:33:24 2025 Finished embedding





# Build a similarity index for natural language search

You can [visit the docs](https://docs.voxel51.com/api/fiftyone.brain.html?highlight=compute_similarity#fiftyone.brain.compute_similarity) for more information on similarity search.

In [8]:
import fiftyone.brain as fob

text_img_index = fob.compute_similarity(
    dataset,
    model="google/medsiglip-448",  # or pass the already instantiated `model` object
    brain_key="medsiglip_sim",
)

Computing embeddings...
 100% |███████████████████| 10/10 [8.5s elapsed, 0s remaining, 1.2 samples/s]      


Verify that we can support text search:

In [9]:
print(text_img_index.config.supports_prompts)  # True

True


In [10]:
sims = text_img_index.sort_by_similarity(
    "healthy chest x-rays"
)
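
Conceptually, a text query like the one above is embedded into the same space as the images, and the samples are then ordered by how close their embeddings are to the query embedding. The following is a minimal pure-Python sketch of that ranking step, assuming cosine similarity as the metric. The function names and the three-dimensional toy vectors are made up for illustration and are not the embeddings or the internals of the similarity index.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, embeddings, k=None):
    """Return sample ids ordered from most to least similar to `query`.

    `embeddings` maps a sample id to its vector; `k` optionally keeps
    only the top-k results.
    """
    scored = sorted(
        embeddings.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    ids = [sample_id for sample_id, _ in scored]
    return ids if k is None else ids[:k]


# Toy data: hypothetical 3-dimensional "image embeddings"
toy_embeddings = {
    "chest_xray": [0.9, 0.1, 0.0],
    "brain_mri": [0.1, 0.9, 0.2],
    "abdomen_ct": [0.5, 0.5, 0.5],
}
toy_query = [1.0, 0.0, 0.0]  # stands in for an embedded text prompt

print(rank_by_similarity(toy_query, toy_embeddings, k=2))
```

In the notebook itself, the same idea of keeping only the closest matches is available by passing `k` to `sort_by_similarity`, for example `text_img_index.sort_by_similarity("healthy chest x-rays", k=5)`.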

In [11]:
sims

Dataset:     SLAKE
Media type:  image
Num samples: 10
Sample fields:
    id:                   fiftyone.core.fields.ObjectIdField
    filepath:             fiftyone.core.fields.StringField
    tags:                 fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:             fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    created_at:           fiftyone.core.fields.DateTimeField
    last_modified_at:     fiftyone.core.fields.DateTimeField
    detections:           fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    segmentation:         fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Segmentation)
    location:             fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
    modality:             fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
    base_type:            fiftyone.core.fields.EmbeddedDocumentField(fift

In the App, select your dataset from the dropdown menu, open the embeddings panel by clicking the `+` next to the Samples viewer, and choose the embeddings you want to display from the dropdown menu in that panel.

To search via natural language in the App, click the `🔎` button and type in your query. The samples most similar to the query will be shown in decreasing order of similarity.

In [12]:
fo.launch_app(dataset)

Connected to FiftyOne on port 5151 at localhost.
If you are not connecting to a remote session, you may need to start a new session and specify a port


Dataset:          SLAKE
Media type:       image
Num samples:      10
Selected samples: 0
Selected labels:  0
Session URL:      http://localhost:5151/