In [None]:
!pip install fiftyone

I've created plugins which allow you to easily use [🌔Moondream2](https://github.com/harpreetsahota204/moondream2-plugin) and [🐋Janus-Pro](https://github.com/harpreetsahota204/janus-vqa-fiftyone) with your FiftyOne dataset.

Let's start by downloading the plugins and installing their dependencies.

> The plugin framework lets you extend and customize the functionality of FiftyOne to suit your needs. To learn more about plugins, consider attending one of our monthly workshops. You can [see the full schedule here](https://voxel51.com/computer-vision-events/) and look for the *Advanced Computer Vision Data Curation and Model Evaluation workshop*.

In [None]:
!fiftyone plugins download https://github.com/harpreetsahota204/janus-vqa-fiftyone

In [None]:
!fiftyone plugins requirements @harpreetsahota/janus_vqa --install

In [None]:
!fiftyone plugins download https://github.com/harpreetsahota204/moondream2-plugin

In [None]:
!fiftyone plugins requirements @harpreetsahota/moondream2 --install

We also need to set an environment variable.

In [1]:
import os

os.environ['FIFTYONE_ALLOW_LEGACY_ORCHESTRATORS'] = 'true'

I found [this website, scott.ai from Scott Penberthy](https://scott.ai/2019-08-06-memeified-ng), which has some awesome machine learning memes on it. I parsed these memes into a FiftyOne dataset. You can also download the dataset [from Hugging Face](https://huggingface.co/datasets/harpreetsahota/memes-dataset).

In [None]:
import fiftyone as fo
from fiftyone.utils import huggingface as fouh

ml_memes_dataset = fouh.load_from_hub(
    "harpreetsahota/ml-memes",
    name="ml-memes",
    overwrite=True
    )

Let's quickly explore the dataset:

In [None]:
fo.launch_app(ml_memes_dataset)

Now, let's instantiate our plugins as operators via the FiftyOne SDK.

Alternatively, you can use the App and fill out the operator form. Press the backtick key (`) to open the operator menu, type "Moondream" or "Janus", and select the operator. You'll be presented with a form that takes the same information we will pass in via the SDK.

In [6]:
import fiftyone.operators as foo

janus_vqa = foo.get_operator("@harpreetsahota/janus_vqa/janus_vqa")

moondream = foo.get_operator("@harpreetsahota/moondream2/moondream")

Now let's kick off a delegated service by opening a terminal and running `fiftyone delegated launch`. This service executes the operations that we queue below with `delegate=True`.

# OCR

Using OCR with Janus

In [7]:
QUESTION = "What does the text on this image say? Respond only with the text on the image and nothing else."

await janus_vqa(
    ml_memes_dataset,
    model_path="deepseek-ai/Janus-Pro-1B",
    question=QUESTION,
    question_field="ocr_questions",
    answer_field="janus_ocr",
    delegate=True
    )

<fiftyone.operators.executor.ExecutionResult at 0x727d56dcf910>

And with Moondream2

In [None]:
await moondream(
    ml_memes_dataset,
    revision="2025-01-09",
    operation="query",
    output_field="moondream_ocr",
    query_text=QUESTION,
    delegate=True
    )

In [None]:
session = fo.launch_app(ml_memes_dataset, auto=False)
session.url
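
Browsing the two OCR fields side by side in the App is the main way to compare the models here. If you also want a rough number, the sketch below scores how closely two OCR strings agree using the standard library's `difflib`. The helper names and the sample strings are hypothetical, and it assumes each plugin writes plain strings to its output field; if that holds, the lists could come from `ml_memes_dataset.values("janus_ocr")` and `ml_memes_dataset.values("moondream_ocr")`.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def ocr_agreement(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two OCR strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Hypothetical model outputs, for illustration only. With the real dataset,
# these lists could be read from the "janus_ocr" and "moondream_ocr" fields,
# assuming the plugins store plain strings there.
janus_texts = ["ONE DOES NOT SIMPLY\ntrain a GAN", "more data"]
moondream_texts = ["One does not simply train a GAN", "more layers"]

scores = [ocr_agreement(j, m) for j, m in zip(janus_texts, moondream_texts)]

for j, m, score in zip(janus_texts, moondream_texts, scores):
    print(f"{score:.2f}  janus={j!r}  moondream={m!r}")
```

A high ratio only means the two models agree with each other, not that either one is correct, so low-scoring samples are the ones worth inspecting by eye first.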

# Meme understanding

In [None]:
memeund_view = ml_memes_dataset.clone(name="meme_understanding")

In [None]:
MEME_UNDERSTANDING_QUESTION = """This image is a meme. Describe the scene of the meme,
its characters, what they are saying, and what the
target audience of this meme might find funny about it.
"""

await janus_vqa(
    memeund_view,
    model_path="deepseek-ai/Janus-Pro-1B",
    question=MEME_UNDERSTANDING_QUESTION,
    question_field="meme_understanding_question",
    answer_field="janus_meme_understanding",
    delegate=True
    )

In [None]:
await moondream(
    memeund_view,
    revision="2025-01-09",
    operation="query",
    output_field="moondream_meme_understanding",
    query_text=MEME_UNDERSTANDING_QUESTION,
    delegate=True
    )

In [None]:
session = fo.launch_app(memeund_view, auto=False)
session.url

# Meme captioning

In [None]:
memes_dataset = fouh.load_from_hub(
    "harpreetsahota/memes-dataset",
    name="meme-captioning",
    overwrite=True
    )

uncaptioned_memes = memes_dataset.select_group_slices("template")
uncaptioned_memes = uncaptioned_memes.clone()

In [None]:
fo.launch_app(uncaptioned_memes)

In [None]:
MEME_GENERATE = """
This image is a meme. Write a caption for this meme that is
related to deep learning and artificial intelligence.
Respond only with the caption and nothing else.
"""

In [None]:
await janus_vqa(
    uncaptioned_memes,
    model_path="deepseek-ai/Janus-Pro-1B",
    question=MEME_GENERATE,
    question_field="caption_prompt",
    answer_field="janus_caption",
    delegate=True
    )

In [None]:
await moondream(
    uncaptioned_memes,
    revision="2025-01-09",
    operation="query",
    query_text=MEME_GENERATE,
    output_field="moondream_caption",
    delegate=True
)

In [None]:
session = fo.launch_app(uncaptioned_memes, auto=False)
session.url