
Add Promptable Concept Segmentation pipeline#43612

Open
yonigozlan wants to merge 13 commits into huggingface:main from yonigozlan:add-sam3-pipeline

Conversation

@yonigozlan
Member

Add pipeline for SAM3's PCS task

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yonigozlan changed the title from "[WIP] Add Promptable Concept Segmentation pipeline" to "Add Promptable Concept Segmentation pipeline" on Jan 30, 2026
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, sam3

Contributor

Copilot AI left a comment


Pull request overview

This PR introduces a new promptable-concept-segmentation task and pipeline built around SAM3, wiring it through the auto-model machinery, pipeline registry, tests, and documentation.

Changes:

  • Added PromptableConceptSegmentationPipeline for SAM3, including task registration in SUPPORTED_TASKS, the pipeline() overloads, and the corresponding AutoModelForPromptableConceptSegmentation mapping.
  • Implemented a comprehensive test suite for the new pipeline and integrated SAM3 with the new task in the model tests.
  • Added user-facing documentation for the PCS task, the new AutoModel head, and the SAM3 model usage via the pipeline, plus toctree entries.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `utils/update_metadata.py` | Registers the new "promptable-concept-segmentation" pipeline tag and associated auto-model mapping for metadata export and checks. |
| `utils/check_docstrings.py` | Adds PromptableConceptSegmentationPipeline to the temporary ignore list for docstring checking. |
| `tests/pipelines/test_pipelines_promptable_concept_segmentation.py` | Adds slow, end-to-end tests covering text, box, combined prompts, batching, thresholds, top‑k, dict input format, error cases, ordering, and automatic SAM3 model/processor conversion. |
| `tests/models/sam3/test_modeling_sam3.py` | Extends pipeline_model_mapping so SAM3 participates in both mask-generation and promptable-concept-segmentation pipeline tests. |
| `src/transformers/pipelines/promptable_concept_segmentation.py` | Implements PromptableConceptSegmentationPipeline, including prompt validation, per-image preprocessing, SAM3-specific postprocessing to scores/boxes/masks, label propagation from text prompts, and a Sam3Video→Sam3Processor conversion in `__init__`. |
| `src/transformers/pipelines/__init__.py` | Wires the new pipeline into imports, SUPPORTED_TASKS, default model (facebook/sam3), and the typed pipeline() overloads. |
| `src/transformers/models/auto/modeling_auto.py` | Introduces MODEL_FOR_PROMPTABLE_CONCEPT_SEGMENTATION_MAPPING_NAMES, its lazy mapping, and AutoModelForPromptableConceptSegmentation, and exports the corresponding mapping and auto class. |
| `src/transformers/__init__.py` | Exposes PromptableConceptSegmentationPipeline through the public pipelines namespace and type hints. |
| `docs/source/en/tasks/promptable_concept_segmentation.md` | New PCS task guide explaining the concept, showing pipeline usage with text/box/combined prompts, visualization, manual model+processor flows, batching, and efficient multi-prompt inference. |
| `docs/source/en/model_doc/sam3.md` | Adds a “Using the Pipeline” section for SAM3 demonstrating the promptable-concept-segmentation pipeline and clarifying its output format. |
| `docs/source/en/model_doc/auto.md` | Documents the new AutoModelForPromptableConceptSegmentation auto class. |
| `docs/source/en/main_classes/pipelines.md` | Adds PromptableConceptSegmentationPipeline to the multimodal pipelines API documentation. |
| `docs/source/en/_toctree.yml` | Adds the PCS task page to the “Multimodal” task recipes toctree (though the path currently misses the tasks/ prefix). |

)

# Should still get results, but filtered by negative box
self.assertGreaterEqual(len(outputs), 0)

Copilot AI Jan 31, 2026


The assertion self.assertGreaterEqual(len(outputs), 0) is trivially true and does not actually verify the behavior described in the comment (that there should still be results, filtered by the negative box). To make this test meaningful and catch regressions, the assertion should require at least one result (e.g. len(outputs) > 0) or otherwise validate that the negative box affected the outputs.

Suggested change
- self.assertGreaterEqual(len(outputs), 0)
+ self.assertGreater(len(outputs), 0)
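
To illustrate the point of this review comment, here is a minimal, self-contained sketch using only the standard `unittest` module. The class name `TrivialAssertionDemo` and the empty `outputs` list are assumptions made for illustration; they stand in for a pipeline call that regressed and returned nothing, and are not taken from the PR's test file.

```python
import unittest


class TrivialAssertionDemo(unittest.TestCase):
    """Hypothetical demo: compares the weak and the strict assertion."""

    def test_weak_assertion_passes_on_empty_output(self):
        # Simulates a regression where the pipeline returns no detections.
        outputs = []
        # len() can never be negative, so this check cannot fail.
        self.assertGreaterEqual(len(outputs), 0)

    def test_strict_assertion_catches_empty_output(self):
        outputs = []
        # The strict form fails on empty output, which is what we want
        # a regression test to do.
        with self.assertRaises(AssertionError):
            self.assertGreater(len(outputs), 0)
```

Both demo tests pass when run with `python -m unittest`, which shows that only the strict assertion would flag an empty result.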

Copilot uses AI. Check for mistakes.
)

# Should get results
self.assertGreaterEqual(len(outputs), 0)

Copilot AI Jan 31, 2026


The assertion self.assertGreaterEqual(len(outputs), 0) is always true and therefore does not enforce the expectation from the preceding comment that the call "Should get results" when multiple positive boxes are provided. Consider tightening this check to assert that at least one instance is returned (for example, by requiring len(outputs) > 0) so the test can actually fail on regressions.

Suggested change
- self.assertGreaterEqual(len(outputs), 0)
+ self.assertGreater(len(outputs), 0)

Comment on lines +199 to +206
if isinstance(image, str | Image.Image):
    inputs = {
        "image": image,
        "text": text,
        "input_boxes": input_boxes,
        "input_boxes_labels": input_boxes_labels,
    }
elif isinstance(image, list | tuple) and valid_images(image):

Copilot AI Jan 31, 2026


isinstance is used with a union expression (str | Image.Image and list | tuple) instead of with a type or tuple of types. On Python versions before 3.10 this raises a TypeError at runtime and breaks even valid calls to the pipeline; from Python 3.10 onward isinstance accepts such unions, so the concern applies only if older interpreters must be supported. Using a tuple of types here (e.g. isinstance(image, (str, Image.Image)) and isinstance(image, (list, tuple))) works on every version and matches the pattern used in ZeroShotObjectDetectionPipeline.__call__.

Suggested change
- if isinstance(image, str | Image.Image):
-     inputs = {
-         "image": image,
-         "text": text,
-         "input_boxes": input_boxes,
-         "input_boxes_labels": input_boxes_labels,
-     }
- elif isinstance(image, list | tuple) and valid_images(image):
+ if isinstance(image, (str, Image.Image)):
+     inputs = {
+         "image": image,
+         "text": text,
+         "input_boxes": input_boxes,
+         "input_boxes_labels": input_boxes_labels,
+     }
+ elif isinstance(image, (list, tuple)) and valid_images(image):
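
As a rough illustration of the two `isinstance` forms discussed in this thread, the sketch below uses only built-in types. The helper names `classify_input` and `union_isinstance_supported` are hypothetical and do not come from the PR. The tuple form works on every Python 3 version, while the `X | Y` form is accepted by `isinstance` only on Python 3.10 and later (PEP 604); on older interpreters evaluating `str | bytes` raises a `TypeError`.

```python
import sys


def classify_input(value):
    """Hypothetical helper: dispatch on input type using tuples of types."""
    if isinstance(value, (str, bytes)):
        return "single"
    if isinstance(value, (list, tuple)):
        return "batch"
    return "unknown"


def union_isinstance_supported():
    """Return True if isinstance accepts a PEP 604 union on this interpreter."""
    try:
        # On Python < 3.10 the expression `str | bytes` itself raises TypeError.
        return isinstance("x", str | bytes)
    except TypeError:
        return False


print(classify_input("photo.png"))
print(classify_input(["a.png", "b.png"]))
print(union_isinstance_supported(), sys.version_info >= (3, 10))
```

Whether the union form is acceptable therefore depends on the minimum Python version the project supports; the tuple form sidesteps the question.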

Comment thread docs/source/en/_toctree.yml
@Cyrilvallez
Member

Hey @yonigozlan! Do we expect other models to ever use this pipeline? In general, I'm not super sure about introducing new pipelines like this for only 1 model, since once the model kind of disappears in 1-2 years, the pipeline becomes useless...
cc @ArthurZucker @vasqu @Rocketknight1 as well

@Rocketknight1
Member

I'm uncertain! When only one model is supported by a pipeline, it might make more sense to just add the pipeline functions as a snippet in the model card or something. Alternatively, we could bind some methods to the model class that do the same thing, which cuts the maintenance burden a lot compared to a whole pipeline class.
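
The "bind some methods to the model class" alternative could look roughly like the sketch below. Everything in it is hypothetical: `ToySegmentationModel`, `segment_concept`, `Detection`, and the hard-coded scores are invented for illustration and are not the SAM3 or transformers API. The only point is that thresholding and output formatting can live in one convenience method on the model rather than in a separate pipeline class.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical output record; a real one would also carry a box and mask."""
    label: str
    score: float


class ToySegmentationModel:
    """Stand-in for a model class. NOT the real SAM3 API."""

    def forward(self, image, text):
        # Hypothetical raw outputs: (label, score) candidates for the prompt.
        return [(text, 0.9), (text, 0.2)]

    def segment_concept(self, image, text, threshold=0.5):
        """Hypothetical convenience method bundling inference and filtering."""
        raw = self.forward(image, text)
        return [Detection(label, score) for label, score in raw if score >= threshold]


model = ToySegmentationModel()
results = model.segment_concept(image=None, text="cat")
print(results)
```

Under this design the maintenance surface is a single method tied to the model's lifetime, which is the trade-off the comment above describes.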


5 participants