# BiModernVBERT in FiftyOne - Quick Start

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/harpreetsahota204/bimodernvbert/blob/main/bimodernvbert_in_fo.ipynb)

This notebook demonstrates how to use BiModernVBERT for document retrieval and zero-shot classification in FiftyOne.

**BiModernVBERT** generates 768-dimensional embeddings for images and text in a shared vector space, which makes it suitable for:
- Document retrieval with text queries
- Zero-shot classification
- Similarity search
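
To make the "shared vector space" idea concrete, here is a minimal sketch of how a text query can be ranked against page images once both are embedded. It uses toy 4-dimensional vectors instead of real 768-dimensional model outputs, and it assumes cosine similarity as the comparison metric; the file names and values are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional stand-ins for the real 768-dimensional embeddings.
# All values and file names here are made up for illustration.
query_embedding = [0.9, 0.1, 0.0, 0.2]  # would come from the text encoder
page_embeddings = {
    "page_a.png": [0.8, 0.2, 0.1, 0.1],
    "page_b.png": [0.0, 0.9, 0.4, 0.0],
    "page_c.png": [0.1, 0.0, 0.9, 0.3],
}

# Rank pages from most to least similar to the query
ranked = sorted(
    page_embeddings,
    key=lambda name: cosine_similarity(query_embedding, page_embeddings[name]),
    reverse=True,
)
print(ranked)
```

Because text and images land in the same space, the same comparison serves retrieval, zero-shot classification, and similarity search.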


## Installation


In [None]:
%pip install -q fiftyone torch transformers pillow umap-learn
%pip install -q git+https://github.com/illuin-tech/colpali.git@vbert#egg=colpali-engine


## Setup


In [None]:
import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.brain as fob
from fiftyone.utils.huggingface import load_from_hub

# Register BiModernVBERT as a zoo model
foz.register_zoo_model_source(
    "https://github.com/harpreetsahota204/bimodernvbert",
    overwrite=True
)

print("✓ Setup complete")


## Load Dataset

We'll use a sample document dataset from Hugging Face:


In [None]:
# Load the document dataset (limited to 250 samples for a quick demo)
dataset = load_from_hub(
    "Voxel51/document-haystack-10pages",
    overwrite=True,
    max_samples=250
)

print(f"Loaded {len(dataset)} samples")
print(dataset)


## Document Retrieval with Text Queries


In [None]:
# Load model
model = foz.load_zoo_model("ModernVBERT/bimodernvbert")

# Compute embeddings for all documents
dataset.compute_embeddings(
    model=model,
    embeddings_field="bimodernvbert_embeddings"
)

# Verify embedding shape
print(f"Embedding shape: {dataset.first()['bimodernvbert_embeddings'].shape}")
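
Checking only the first sample can miss gaps elsewhere in the dataset. The helper below is a hypothetical sketch (`summarize_embeddings` is not part of FiftyOne) that counts missing vectors and vectors of unexpected length across a whole list; it runs on toy data here.

```python
def summarize_embeddings(vectors, expected_dim=768):
    """Count missing embeddings and vectors whose length is unexpected."""
    missing = sum(1 for v in vectors if v is None)
    wrong_dim = sum(
        1 for v in vectors if v is not None and len(v) != expected_dim
    )
    return {"total": len(vectors), "missing": missing, "wrong_dim": wrong_dim}

# Toy input: two valid vectors, one missing, one with the wrong length
toy_vectors = [[0.0] * 768, [0.1] * 768, None, [0.0] * 512]
summary = summarize_embeddings(toy_vectors)
print(summary)
```

On the real dataset, the input could be `dataset.values("bimodernvbert_embeddings")`, assuming each entry is a one-dimensional vector or `None`.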


In [None]:
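# Build a similarity index that reuses the embeddings computed above;
# the model is still required so that text queries can be embedded later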
text_img_index = fob.compute_similarity(
    dataset,
    model="ModernVBERT/bimodernvbert",
    embeddings_field="bimodernvbert_embeddings",
    brain_key="bimodernvbert_sim"
)

print("✓ Similarity index created")


In [None]:
# Query for specific content
results = text_img_index.sort_by_similarity(
    "invoice from 2024",
    k=10
)

print(f"Retrieved the top {len(results)} documents for this query")


In [None]:
# Launch FiftyOne App to explore results
session = fo.launch_app(results, auto=False)


## Zero-Shot Document Classification


In [None]:
# Load model with classification classes
model = foz.load_zoo_model(
    "ModernVBERT/bimodernvbert",
    classes=["invoice", "receipt", "form", "contract", "other"],
    text_prompt="This document is a"
)

# Apply zero-shot classification
dataset.apply_model(
    model,
    label_field="document_type"
)

print("✓ Classification complete")
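
Conceptually, zero-shot classification with an embedding model compares each image embedding against one text embedding per class prompt and picks the closest. The sketch below illustrates that idea with made-up 3-dimensional vectors. The prompt template (`text_prompt` followed by the class name) and the softmax scoring are assumptions for illustration, not a description of the model's actual internals.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(image_embedding, prompt_embeddings, classes):
    """Return (label, confidence) for the prompt closest to the image."""
    # Dot-product score; equals cosine similarity only for unit-norm vectors
    scores = [
        sum(x * y for x, y in zip(image_embedding, p))
        for p in prompt_embeddings
    ]
    probs = softmax(scores)
    best = max(range(len(classes)), key=lambda i: probs[i])
    return classes[best], probs[best]

classes = ["invoice", "receipt", "form"]
text_prompt = "This document is a"
# Assumed prompt template; the model's real template may differ
prompts = [f"{text_prompt} {c}" for c in classes]

# Made-up embeddings standing in for real model outputs
prompt_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_embedding = [0.2, 0.9, 0.1]

label, confidence = zero_shot_classify(image_embedding, prompt_embeddings, classes)
print(prompts[classes.index(label)], "->", label)
```

This is also why the class list can be changed without retraining: only the prompt embeddings need to be recomputed.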


In [None]:
# View classification results
print(dataset.first()['document_type'])

# Count predictions by class
print("\nPrediction distribution:")
print(dataset.count_values("document_type.label"))
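
Raw counts are easier to compare as percentages. The helper below is a small sketch (`to_percentages` is a hypothetical name) that takes a label-to-count mapping, which is the shape `count_values` returns, and converts it to percentages sorted by frequency. The counts are invented.

```python
def to_percentages(counts):
    """Convert a {label: count} mapping into {label: percent}, largest first."""
    total = sum(counts.values())
    if total == 0:
        return {}
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {label: round(100 * n / total, 1) for label, n in ordered}

# Made-up counts in the shape returned by dataset.count_values(...)
toy_counts = {"invoice": 40, "form": 110, "other": 100}
percentages = to_percentages(toy_counts)
print(percentages)
```

A distribution dominated by one class, especially `other`, can be a hint that the class names or the text prompt need adjusting.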


In [None]:
# Explore classifications in the App
session = fo.launch_app(dataset, auto=False)


## Visualize Embeddings with UMAP

Create interactive 2D visualizations to explore document relationships:


In [None]:
# Create UMAP visualization
results = fob.compute_visualization(
    dataset,
    method="umap",  # Also supports "tsne", "pca"
    brain_key="bimodernvbert_viz",
    embeddings="bimodernvbert_embeddings",
    num_dims=2
)

print("✓ UMAP visualization created")


In [None]:
# Open the App and explore the visualization in the embeddings panel
session = fo.launch_app(dataset, auto=False)


## Multiple Query Searches


In [None]:
# Try different queries
queries = [
    "financial report with charts",
    "contract agreement",
    "form with checkboxes"
]

for query in queries:
    results = text_img_index.sort_by_similarity(query, k=5)
    print(f"\nQuery: '{query}'")
    print(f"Found {len(results)} results")
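
Since every query with `k=5` returns exactly five samples, the counts alone say little. One way to compare queries is to measure how much their result sets overlap. The sketch below computes Jaccard overlap on made-up sample IDs; with real results, the IDs could be collected with `results.values("id")` inside the loop.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard overlap between two collections of sample IDs."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Made-up sample IDs standing in for results.values("id") per query
hits = {
    "financial report with charts": ["s1", "s2", "s3", "s4", "s5"],
    "contract agreement": ["s4", "s5", "s6", "s7", "s8"],
    "form with checkboxes": ["s9", "s10", "s11", "s12", "s13"],
}

# Compare every pair of queries
for q1, q2 in combinations(hits, 2):
    print(f"{q1!r} vs {q2!r}: {jaccard(hits[q1], hits[q2]):.2f}")
```

High overlap between unrelated queries may suggest the embeddings are not separating those concepts well on this dataset.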


## Next Steps

Explore more advanced features:
- **Other Projections**: `fob.compute_visualization(dataset, embeddings="bimodernvbert_embeddings", method="tsne")` (or `method="pca"`)
- **Uniqueness Scoring**: `fob.compute_uniqueness(dataset, embeddings="bimodernvbert_embeddings")` scores each sample's uniqueness, which helps surface near-duplicates
- **Dynamic Classification**: Change `model.classes` and `model.text_prompt` for different tasks

📚 Full documentation: [github.com/harpreetsahota204/bimodernvbert](https://github.com/harpreetsahota204/bimodernvbert)
