ScreenParser for FiftyOne

Inference results in FiftyOne using ScreenParser on a sample from the ScreenSpot-Pro dataset.

A FiftyOne remote model zoo source for ScreenParser, a YOLO11-L object detector fine-tuned by the docling-project on the ScreenParse v2 dataset (~1.45M screenshots) to localize 55 UI element classes (buttons, tables, navigation bars, text inputs, icons, etc.) in application and web screenshots.

ScreenParser is a standard Ultralytics YOLO model, so this integration uses FiftyOne's built-in fiftyone.utils.ultralytics.FiftyOneYOLOModel wrapper. There is no custom inference code, only a manifest.json that describes where to download the weights and how to deploy them.

Requirements

pip install "fiftyone>=1.0" "ultralytics>=8.3.0"

Usage

Register this repository as a remote zoo model source, then load and apply the model like any other zoo model:

import fiftyone as fo
import fiftyone.zoo as foz

# 1. Register the remote source (one time)
foz.register_zoo_model_source("https://github.com/Burhan-Q/screenparser")

# 2. (Optional) Download the weights (153 MB) ahead of time;
#    load_zoo_model in step 3 also downloads them on first use
foz.download_zoo_model(
    "https://github.com/Burhan-Q/screenparser",
    model_name="docling-project/ScreenParser",
)

# 3. Load the model
model = foz.load_zoo_model("docling-project/ScreenParser")

# 4. Apply to a dataset of screenshots
dataset = fo.Dataset.from_images_dir("/path/to/screenshots")
dataset.apply_model(model, label_field="ui_elements")

session = fo.launch_app(dataset)

Predictions are stored as fiftyone.core.labels.Detections in the ui_elements field.
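
FiftyOne stores each detection's bounding_box as [top-left-x, top-left-y, width, height] in relative coordinates between 0 and 1. The sketch below shows one way to convert such a box to pixel corners, for example before cropping a region. The helper name to_pixel_box and the sample values are hypothetical and are not part of FiftyOne or this repository.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a relative [x, y, w, h] box to pixel corners.

    Returns (left, top, right, bottom) as integers, clamped to the image.
    """
    x, y, w, h = bounding_box
    left = max(0, round(x * image_width))
    top = max(0, round(y * image_height))
    right = min(image_width, round((x + w) * image_width))
    bottom = min(image_height, round((y + h) * image_height))
    return left, top, right, bottom


# Made-up box on a 1920x1080 screenshot, for illustration only
print(to_pixel_box([0.25, 0.5, 0.5, 0.25], 1920, 1080))  # (480, 540, 1440, 810)
```

In a real dataset, the boxes would come from sample.ui_elements.detections, and the image size from sample.metadata after calling dataset.compute_metadata().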

Inference settings

The model was trained at 1280px; the manifest sets the recommended defaults of imgsz=1280, conf=0.10, iou=0.10. You can override the confidence threshold and other Ultralytics arguments at load time:

model = foz.load_zoo_model(
    "docling-project/ScreenParser",
    confidence_thresh=0.25,
    overrides={"iou": 0.10, "imgsz": 1280},
)
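
To give a feel for what the conf and iou settings control, here is a simplified, class-agnostic sketch of confidence filtering followed by greedy non-maximum suppression. It is an illustration only, not the Ultralytics implementation, and the function names and sample boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(boxes, scores, conf=0.10, iou_thresh=0.10):
    """Drop boxes scoring below conf, then greedily suppress overlaps."""
    # Indices of boxes that pass the confidence threshold, best score first
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        # Keep a box only if it does not overlap any kept box too strongly
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 100, 100), (10, 10, 110, 110), (200, 200, 300, 300), (0, 0, 50, 50)]
scores = [0.90, 0.60, 0.30, 0.05]
print(filter_and_nms(boxes, scores))                  # [0, 2]
print(filter_and_nms(boxes, scores, iou_thresh=0.7))  # [0, 1, 2]
```

With a low iou threshold such as 0.10, a box is suppressed as soon as it overlaps a higher-scoring box even slightly; raising the threshold keeps more overlapping boxes. A higher conf value trades recall for fewer low-confidence detections.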

Training Data & Detected Classes

The current main checkpoint was trained on ScreenParse v2, which provides 1,447,100 high-quality training screenshots and 25,575,213 UI element annotations. The dataset uses filtered leaf-element annotations to reduce noisy nested boxes and includes multiple viewport resolutions.
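
As a quick sanity check on annotation density, the two figures above imply the following average number of annotated elements per screenshot. This is only arithmetic on the stated totals; it says nothing about how annotations are distributed across images.

```python
screenshots = 1_447_100
annotations = 25_575_213

# Mean number of UI element annotations per training screenshot
avg = annotations / screenshots
print(f"{avg:.1f} annotations per screenshot on average")  # 17.7
```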

Limitations

The model produces bounding boxes and element labels only; it does not extract the text content of detected elements. Pair it with OCR or ScreenVLM when text extraction is needed.
The model is trained on rendered web screenshots, so performance may vary on native desktop, mobile, or application screenshots outside the training distribution.

Full class list (55 classes):

Table
Column/Browser
Button
Utility Button
App Icon
Navigation Bar
Status Bar
Search Field
Toolbar
Tooltip
Video
Tab Bar
Side Bar
Slider
Picker
ContextMenu
DockMenu
EditMenu
Image
Scroll
Switch
File Icon
Chart
Window
Screen
List
List Item
PopUp Menu
Steppers
Toggles
Text Input
Rating Indicator
Checkbox
Radiobox
Select
Avatar
Badge
Alert
Progress bar
Bottom navigation
Breadcrumb
Page control
Link
Menu
Pagination
Tab
Search Bar
Date-Time picker
Calendar
Text
Heading
Code snippet
Carousel
Notification
Logo

License

The ScreenParser FiftyOne integration source is released under the Apache-2.0 license. The model weights are licensed separately by docling-project; see the model card for details.
