# Nanonets-OCR2 with FiftyOne

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/harpreetsahota204/nanonets_ocr2/blob/main/nanonets_ocr_example.ipynb)

This notebook demonstrates how to use Nanonets-OCR2 with FiftyOne for intelligent document OCR with semantic tagging.


## Installation


In [None]:
%pip install -q fiftyone huggingface_hub


## Load Dataset

We'll use a scanned receipts dataset from Hugging Face.


In [None]:
import fiftyone as fo
from fiftyone.utils.huggingface import load_from_hub

# Load the dataset
dataset = load_from_hub(
    "Voxel51/scanned_receipts",
    max_samples=200
)

print(f"Loaded {len(dataset)} samples")


## Register and Load Model

Register the Nanonets-OCR2 model source and load it.


In [None]:
import fiftyone.zoo as foz

# Register the model source
foz.register_zoo_model_source(
    "https://github.com/harpreetsahota204/nanonets_ocr2",
    overwrite=True
)

# Load the model
model = foz.load_zoo_model("nanonets/Nanonets-OCR2-3B")

print("Model loaded successfully!")


## Apply OCR to Dataset

Run the model on every image in the dataset. The extracted text, including any semantic tags, is stored in the `ocr_text` field of each sample.


In [None]:
# Apply model to dataset
dataset.apply_model(model, label_field="ocr_text")

print("OCR processing complete!")


## View Results

Launch the FiftyOne App to explore the OCR results.


In [None]:
# Install the plugin to view captions
!fiftyone plugins download https://github.com/mythrandire/caption-viewer

In [None]:
# Launch the App
session = fo.launch_app(dataset)

## Inspect Sample Results

Let's look at the extracted text from a few samples.


In [None]:
# Print OCR results from the first 3 samples
# (limit() keeps dataset order; take() would pick samples at random)
for sample in dataset.limit(3):
    print(f"\n{'='*80}")
    print(f"Sample: {sample.filepath}")
    print(f"{'='*80}")
    print(sample.ocr_text)
    print()


## What to Look For

Depending on the document, the OCR output may include these semantic tags:

- `<table>...</table>` - HTML formatted tables
- `$...$` or `$$...$$` - LaTeX equations
- `<img>...</img>` - Image descriptions
- `<watermark>...</watermark>` - Watermark text
- `<page_number>...</page_number>` - Page numbers
- `<signature>...</signature>` - Signatures
- `☐` `☑` `☒` - Checkbox states
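
Because the tags are plain text inside the OCR string, they can be pulled out with ordinary string tools. The following is a minimal sketch, assuming each tag appears as a simple non-nested `<tag>...</tag>` pair. The helper names `extract_tags` and `count_checkboxes` and the sample string are invented for illustration and are not part of the model or of FiftyOne.

```python
import re

# Tag names taken from the list above.
TAGS = ("table", "img", "watermark", "page_number", "signature")


def extract_tags(text):
    """Return {tag: [inner text, ...]} for each semantic tag found in text."""
    found = {}
    for tag in TAGS:
        # Non-greedy match so that two tags of the same kind stay separate;
        # DOTALL lets a tag body span several lines (e.g. a table).
        pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)
        matches = [m.strip() for m in pattern.findall(text)]
        if matches:
            found[tag] = matches
    return found


def count_checkboxes(text):
    """Count each checkbox symbol in text."""
    return {box: text.count(box) for box in "☐☑☒"}


# Hypothetical OCR output, written by hand for this example.
sample_text = (
    "ACME STORE\n"
    "<table><tr><td>Coffee</td><td>3.50</td></tr></table>\n"
    "<watermark>COPY</watermark>\n"
    "☑ Paid ☐ Refunded\n"
    "<page_number>1</page_number>"
)

tags = extract_tags(sample_text)
boxes = count_checkboxes(sample_text)
print(tags)
print(boxes)
```

In the notebook, the same helpers could be called on `sample.ocr_text` instead of the hand-written string.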


## Filter and Search

Use FiftyOne view expressions to find documents whose OCR text contains a given tag.


In [None]:
from fiftyone import ViewField as F

# Find samples containing tables
with_tables = dataset.match(F("ocr_text").contains_str("<table>"))
print(f"Found {len(with_tables)} samples with tables")

# Find samples with watermarks
with_watermarks = dataset.match(F("ocr_text").contains_str("<watermark>"))
print(f"Found {len(with_watermarks)} samples with watermarks")
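
To see at a glance which tags occur across the dataset, the per-tag counts can also be computed in plain Python. This is a rough sketch: `tag_frequencies` is a hypothetical helper, and the `texts` list is an invented stand-in so the block runs on its own. In the notebook, `dataset.values("ocr_text")` would supply the real list of strings.

```python
from collections import Counter

# Opening tags from the "What to Look For" list.
TAGS = ("<table>", "<img>", "<watermark>", "<page_number>", "<signature>")


def tag_frequencies(texts):
    """Count how many texts contain each tag at least once."""
    counts = Counter()
    for text in texts:
        if not text:
            # Skip samples with no OCR output (None or empty string).
            continue
        for tag in TAGS:
            if tag in text:
                counts[tag] += 1
    return counts


# Invented stand-in for dataset.values("ocr_text").
texts = [
    "<table><tr><td>1</td></tr></table>",
    "Total 4.00 <watermark>VOID</watermark>",
    None,
    "<table></table> <table></table>",
]

freq = tag_frequencies(texts)
for tag in TAGS:
    print(f"{tag}: {freq[tag]} samples")
```

Each sample is counted once per tag, so a receipt with two tables adds one to the `<table>` count, which matches what `len(with_tables)` reports.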


## Resources

- [Nanonets-OCR2 Model Card](https://huggingface.co/nanonets/Nanonets-OCR2-3B)
- [GitHub Repository](https://github.com/harpreetsahota204/nanonets_ocr2)
- [FiftyOne Documentation](https://docs.voxel51.com/)
