Visual Question Answering Plugin



  • 2024-05-07: Major updates:

    • Added support for Moondream2 model.
    • Added support for reading the question from a field on the sample.
    • Added support for storing the answer in a field on the sample.
    • Added support for applying to all samples in the current view (one at a time).
    • Added support for delegated execution.
    • Added support for Python operator execution.
  • 2024-05-03: @harpreetsahota204 added support for Idefics-8b model from Replicate.

  • 2023-10-24: Added support for Llava-13b and Fuyu-8b models from Replicate.

Plugin Overview

This Python plugin answers visual questions about the images in your dataset!

Supported Models

Per the update notes above, the models supported by this version of the plugin include Moondream2, Idefics-8b, Llava-13b, and Fuyu-8b.

Feel free to fork this plugin and add support for other models!




  1. To run models through Hugging Face transformers, install the library:
pip install transformers
  2. To run models hosted on Replicate, install the Replicate library:
pip install replicate

And add your Replicate API key to your environment:

export REPLICATE_API_TOKEN=<your-api-token>
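
Before running the plugin, it can help to confirm that the optional dependencies and the API token are actually visible to Python. The following is a minimal sketch using only the standard library. The function name `check_vqa_prerequisites` is hypothetical and is not part of the plugin. It only checks that the packages can be found and that the variable is non-empty, not that the token is valid.

```python
import importlib.util
import os


def check_vqa_prerequisites(environ=None):
    """Report which optional backends look usable in this environment.

    Hypothetical helper, not part of the plugin. `environ` defaults to
    os.environ and can be replaced with a plain dict for testing.
    """
    environ = os.environ if environ is None else environ
    return {
        # find_spec returns None when a top-level package is not installed
        "transformers_installed": importlib.util.find_spec("transformers") is not None,
        "replicate_installed": importlib.util.find_spec("replicate") is not None,
        # Only checks presence of a non-empty value, not validity of the token
        "replicate_token_set": bool(environ.get("REPLICATE_API_TOKEN")),
    }


if __name__ == "__main__":
    for name, ok in check_vqa_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If `replicate_token_set` is reported as missing, re-run the `export` command above in the same shell that launches Python.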

Install the plugin

fiftyone plugins download



  • Applies the selected visual question answering model to the selected sample in your dataset and outputs the answer.


The recommended interactive way to use this plugin is in the FiftyOne App with exactly one sample selected.

Python Operator Execution

To loop over the samples in your dataset or view, use the Python operator execution mode.

import fiftyone as fo
import fiftyone.operators as foo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart", max_samples=5)

## Access the operator via its URI (plugin name + operator name)
vqa = foo.get_operator("@jacobmarks/vqa/answer_visual_question")

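## Apply the operator to the dataset
## NOTE: assumes the operator is called with the dataset as its first
## argument; further arguments (e.g. model choice, answer field) are not shown
vqa(
    dataset,
    question="Describe the image",
)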

## Print the answers
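
The per-sample flow described in this README (read a question, ask a model about one image at a time, store the answer in a field) can be illustrated without FiftyOne at all. The sketch below is plain Python and is not the plugin's implementation. The `Sample` dicts, `answer_one_at_a_time`, `stub_model`, and the field names `my_question` and `vqa_answer` are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Optional

Sample = Dict[str, str]  # hypothetical stand-in for a dataset sample


def answer_one_at_a_time(
    samples: List[Sample],
    model: Callable[[str, str], str],
    question: Optional[str] = None,
    question_field: Optional[str] = None,
    answer_field: str = "vqa_answer",
) -> List[Sample]:
    """Ask `model` one question per sample and store each answer on the sample."""
    for sample in samples:
        # Prefer a per-sample question stored in a field, else the shared question
        per_sample = sample.get(question_field) if question_field else None
        text = per_sample or question
        if not text:
            continue  # nothing to ask for this sample
        sample[answer_field] = model(sample["filepath"], text)
    return samples


def stub_model(filepath: str, question: str) -> str:
    """Placeholder for a real VQA model call (assumption: image path + question in, text out)."""
    return f"(no real model) {question} -> {filepath}"


samples = [
    {"filepath": "a.jpg", "my_question": "What color is the car?"},
    {"filepath": "b.jpg"},
]
answer_one_at_a_time(samples, stub_model, question_field="my_question")
```

Here the first sample receives an answer because it carries its own question, while the second is skipped because neither a per-sample nor a shared question is available. Passing `question="Describe the image"` would cover both.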