# Zero-Shot Classification in FiftyOne



# Download Dataset

In [None]:
import fiftyone as fo
import fiftyone.utils.huggingface as fouh


dataset = fouh.load_from_hub("Voxel51/ImageNet-O")

Let's grab the classes from this dataset using the [`distinct` method](https://beta-docs.voxel51.com/api/fiftyone.core.aggregations.distinct.html) of the [Dataset](https://beta-docs.voxel51.com/getting_started/basic/datasets_samples_fields/). These will be the classes we use for zero-shot classification.

In [None]:
dataset_classes = dataset.distinct("ground_truth.label")
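
To make clear what `dataset_classes` holds: `distinct` aggregates a field across all samples and returns the unique values it finds. The plain-Python sketch below mimics that behavior on a few made-up records. The `toy_samples` list and the `distinct_labels` helper are hypothetical and not part of FiftyOne; the sketch assumes the values come back sorted and that samples without a label are skipped.

```python
# Hypothetical stand-in for a few samples; real FiftyOne samples are not dicts
toy_samples = [
    {"ground_truth": {"label": "banana"}},
    {"ground_truth": {"label": "acorn"}},
    {"ground_truth": {"label": "banana"}},
    {"ground_truth": None},  # a sample with no label
]


def distinct_labels(samples):
    """Return the sorted unique labels, skipping samples without one."""
    labels = set()
    for sample in samples:
        gt = sample.get("ground_truth")
        if gt is not None and gt.get("label") is not None:
            labels.add(gt["label"])
    return sorted(labels)


print(distinct_labels(toy_samples))
```

Each class name appears once in the result, which is the form the zero-shot models expect for their `classes` argument.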

# FiftyOne Model Zoo

The [FiftyOne Model Zoo](https://beta-docs.voxel51.com/models/model_zoo/) provides a powerful interface for downloading models and applying them to your FiftyOne datasets. It provides native access to hundreds of pre-trained models, and it also supports downloading arbitrary public or private models whose definitions are provided via GitHub repositories or URLs.


All of the zero-shot classification models in this walkthrough accept a `text_prompt` keyword argument, which lets you override the prompt template used to embed the class names. Zero-shot classification results can vary depending on this text!
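
One way to picture what the prompt template does: each class name is combined with the template to form the sentence that the text encoder embeds. The sketch below assumes the template and class name are joined by a single space. The `build_prompts` helper and the class names are hypothetical, and the exact formatting inside FiftyOne may differ.

```python
def build_prompts(text_prompt, classes):
    """Combine a prompt template with each class name (assumed: joined by one space)."""
    # strip() keeps a trailing space in the template from producing a double space
    return [f"{text_prompt.strip()} {name}" for name in classes]


example_classes = ["dragonfly", "lemon"]  # made-up class names for illustration

for template in ("A photo of a", "A close-up picture of a"):
    print(build_prompts(template, example_classes))
```

Trying a couple of templates like this and comparing the resulting predictions is a cheap way to see how sensitive a model is to the wording.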

https://beta-docs.voxel51.com/api/fiftyone.zoo.models.html#load_zoo_model

In [None]:
import torch 
import fiftyone.zoo as foz

clip_zoo_model = foz.load_zoo_model(
    model_name="clip-vit-base-patch32",
    text_prompt="A photo of a",
    classes=dataset_classes,
    device="cuda" if torch.cuda.is_available() else "cpu",
    # install_requirements=True # uncomment this line if you are running this code for the first time
)

dataset.apply_model(clip_zoo_model, label_field="clip_classification")

# Plugins

# Open CLIP Integration


https://beta-docs.voxel51.com/integrations/openclip/

In [None]:
!pip install open_clip_torch

In [None]:
import torch 
import fiftyone.zoo as foz

open_clip_model = foz.load_zoo_model(
    "open-clip-torch",
    text_prompt="A photo of a",
    classes=dataset_classes,
    install_requirements=True,  # installs any missing dependencies on first run
    device="cuda" if torch.cuda.is_available() else "cpu",
)

dataset.apply_model(open_clip_model, label_field="open_clip_classification")

You can also specify different model architectures and pretrained weights by passing in optional parameters. Pretrained models can be loaded directly from OpenCLIP with the following syntax:


```python
meta_clip = foz.load_zoo_model(
    "open-clip-torch",
    clip_model="ViT-B-32-quickgelu",
    pretrained="metaclip_400m",
    text_prompt="A photo of a",
    classes=dataset_classes,
)
```


Alternatively, you can load a model from Hugging Face’s Model Hub with the following syntax:


```python
import fiftyone.zoo as foz

open_clip_model = foz.load_zoo_model(
    "open-clip-torch",
    clip_model="hf-hub:repo-name/model-name",
    pretrained="",
)
```

As a concrete example, if you were interested in the [StreetCLIP model](https://huggingface.co/geolocal/StreetCLIP) you would use:

```python
street_clip_model = foz.load_zoo_model(
    "open-clip-torch",
    pretrained="",
    clip_model="hf-hub:geolocal/StreetCLIP"
)
```

# Hugging Face Integration


The `zero-shot-classification-transformer-torch` model name specifies that you want to load a zero-shot image classification model from the Hugging Face Transformers library. You can then choose the model via the `name_or_path` argument, which should be the repository name or model identifier of the model you want to load.

https://beta-docs.voxel51.com/integrations/huggingface/#zero-shot-classification




In [None]:
import torch 
import fiftyone.zoo as foz

siglip_model = foz.load_zoo_model(
    "zero-shot-classification-transformer-torch",
    name_or_path="google/siglip2-base-patch32-256",
    classes=dataset_classes,
    device="cuda" if torch.cuda.is_available() else "cpu",
    # install_requirements=True # uncomment this line if you are running this code for the first time
)

dataset.apply_model(siglip_model, label_field="siglip_classification")

# Arbitrary Model

Any model that can be run in a Hugging Face pipeline for the `zero-shot-image-classification` task can be loaded as a Zoo model.

A good first step is simply to pass the model name into `name_or_path` in the [`load_zoo_model`](https://beta-docs.voxel51.com/api/fiftyone.zoo.models.html#fiftyone.zoo.models.load_zoo_model) function. If a Hugging Face model is not compatible with the integration, you'll see an error to the effect of:

```python
ValueError: Unrecognized model in <whatever-model-name>
```

In this case, you will need to run the model manually. All this means is that you need to instantiate the model and its processor, and write some logic to parse the model output into a FiftyOne Classification.

Here's an example of how you can do this:
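
The parsing step looks much the same regardless of which model you run. Below is a minimal sketch, assuming a CLIP-style model that returns one raw similarity score (logit) per class for an image, and assuming a softmax over those scores is a reasonable confidence. SigLIP-style models are trained with a sigmoid instead, so adjust accordingly. The `logits_to_prediction` helper, the class names, and the example numbers are hypothetical.

```python
import math


def logits_to_prediction(logits, classes):
    """Turn per-class logits for one image into a (label, confidence) pair."""
    if len(logits) != len(classes):
        raise ValueError("expected one logit per class")
    # Numerically stable softmax: subtract the max before exponentiating
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the highest-probability class (first one wins on ties)
    best = max(range(len(classes)), key=probs.__getitem__)
    return classes[best], probs[best]


# Made-up logits for a single image scored against three made-up classes
label, confidence = logits_to_prediction([2.0, 0.5, -1.0], ["cat", "dog", "bird"])
print(label, round(confidence, 3))
```

With real model output, the resulting pair would be wrapped as `fo.Classification(label=label, confidence=confidence)`, assigned to a field on each sample, and persisted with `sample.save()`.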



## Next Steps
https://beta-docs.voxel51.com/tutorials/zero_shot_classification/

https://voxel51.com/blog/a-history-of-clip-model-training-data-advances/

https://voxel51.com/blog/this-visual-illusions-benchmark-makes-me-question-the-power-of-vlms/