Add Promptable Concept Segmentation pipeline #43612
yonigozlan wants to merge 13 commits into huggingface:main
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

[For maintainers] Suggested jobs to run (before merge): run-slow: auto, sam3
Pull request overview
This PR introduces a new promptable-concept-segmentation task and pipeline built around SAM3, wiring it through the auto-model machinery, pipeline registry, tests, and documentation.
Changes:
- Added PromptableConceptSegmentationPipeline for SAM3, including task registration in SUPPORTED_TASKS, the pipeline() overloads, and the corresponding AutoModelForPromptableConceptSegmentation mapping.
- Implemented a comprehensive test suite for the new pipeline and integrated SAM3 with the new task in the model tests.
- Added user-facing documentation for the PCS task, the new AutoModel head, and the SAM3 model usage via the pipeline, plus toctree entries.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| utils/update_metadata.py | Registers the new "promptable-concept-segmentation" pipeline tag and associated auto-model mapping for metadata export and checks. |
| utils/check_docstrings.py | Adds PromptableConceptSegmentationPipeline to the temporary ignore list for docstring checking. |
| tests/pipelines/test_pipelines_promptable_concept_segmentation.py | Adds slow, end-to-end tests covering text, box, combined prompts, batching, thresholds, top-k, dict input format, error cases, ordering, and automatic SAM3 model/processor conversion. |
| tests/models/sam3/test_modeling_sam3.py | Extends pipeline_model_mapping so SAM3 participates in both mask-generation and promptable-concept-segmentation pipeline tests. |
| src/transformers/pipelines/promptable_concept_segmentation.py | Implements PromptableConceptSegmentationPipeline, including prompt validation, per-image preprocessing, SAM3-specific postprocessing to scores/boxes/masks, label propagation from text prompts, and a Sam3Video→Sam3Processor conversion in __init__. |
| src/transformers/pipelines/__init__.py | Wires the new pipeline into imports, SUPPORTED_TASKS, default model (facebook/sam3), and the typed pipeline() overloads. |
| src/transformers/models/auto/modeling_auto.py | Introduces MODEL_FOR_PROMPTABLE_CONCEPT_SEGMENTATION_MAPPING_NAMES, its lazy mapping, and AutoModelForPromptableConceptSegmentation, and exports the corresponding mapping and auto class. |
| src/transformers/__init__.py | Exposes PromptableConceptSegmentationPipeline through the public pipelines namespace and type hints. |
| docs/source/en/tasks/promptable_concept_segmentation.md | New PCS task guide explaining the concept, showing pipeline usage with text/box/combined prompts, visualization, manual model+processor flows, batching, and efficient multi-prompt inference. |
| docs/source/en/model_doc/sam3.md | Adds a “Using the Pipeline” section for SAM3 demonstrating the promptable-concept-segmentation pipeline and clarifying its output format. |
| docs/source/en/model_doc/auto.md | Documents the new AutoModelForPromptableConceptSegmentation auto class. |
| docs/source/en/main_classes/pipelines.md | Adds PromptableConceptSegmentationPipeline to the multimodal pipelines API documentation. |
| docs/source/en/_toctree.yml | Adds the PCS task page to the “Multimodal” task recipes toctree (though the path currently misses the tasks/ prefix). |

    )

    # Should still get results, but filtered by negative box
    self.assertGreaterEqual(len(outputs), 0)

The assertion self.assertGreaterEqual(len(outputs), 0) is trivially true and does not actually verify the behavior described in the comment (that there should still be results, filtered by the negative box). To make this test meaningful and catch regressions, the assertion should require at least one result (e.g. len(outputs) > 0) or otherwise validate that the negative box affected the outputs.

    - self.assertGreaterEqual(len(outputs), 0)
    + self.assertGreater(len(outputs), 0)

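To illustrate the point of the comment above, here is a minimal, self-contained sketch using only the standard library. The empty list is a stand-in for a pipeline call that returned no instances; the class name `EmptyOutputChecks` and the helper `run_checks` are hypothetical and not part of the PR.

```python
import unittest


class EmptyOutputChecks(unittest.TestCase):
    # Stand-in for a pipeline call that detected nothing (assumption for illustration).
    outputs = []

    def test_weak_assertion_passes_on_empty_output(self):
        # len() is never negative, so this check cannot fail, even with no results.
        self.assertGreaterEqual(len(self.outputs), 0)

    def test_strict_assertion_fails_on_empty_output(self):
        # assertGreater rejects an empty result, which is what a regression test needs.
        with self.assertRaises(AssertionError):
            self.assertGreater(len(self.outputs), 0)


def run_checks():
    """Run both checks and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(EmptyOutputChecks)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Both tests pass here, which shows that only the strict form can catch a pipeline that silently returns nothing.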

    )

    # Should get results
    self.assertGreaterEqual(len(outputs), 0)

The assertion self.assertGreaterEqual(len(outputs), 0) is always true and therefore does not enforce the expectation from the preceding comment that the call "Should get results" when multiple positive boxes are provided. Consider tightening this check to assert that at least one instance is returned (for example, by requiring len(outputs) > 0) so the test can actually fail on regressions.

    - self.assertGreaterEqual(len(outputs), 0)
    + self.assertGreater(len(outputs), 0)


    if isinstance(image, str | Image.Image):
        inputs = {
            "image": image,
            "text": text,
            "input_boxes": input_boxes,
            "input_boxes_labels": input_boxes_labels,
        }
    elif isinstance(image, list | tuple) and valid_images(image):

isinstance is called here with a union expression (str | Image.Image and list | tuple) rather than a type or tuple of types. On Python versions before 3.10 this raises a TypeError at runtime and breaks even valid calls to the pipeline; from 3.10 onward isinstance accepts such unions, so there it is a consistency issue rather than a crash. Using a tuple of types (e.g. isinstance(image, (str, Image.Image)) and isinstance(image, (list, tuple))) works on every version and matches the pattern used in ZeroShotObjectDetectionPipeline.__call__.

    - if isinstance(image, str | Image.Image):
    -     inputs = {
    -         "image": image,
    -         "text": text,
    -         "input_boxes": input_boxes,
    -         "input_boxes_labels": input_boxes_labels,
    -     }
    - elif isinstance(image, list | tuple) and valid_images(image):
    + if isinstance(image, (str, Image.Image)):
    +     inputs = {
    +         "image": image,
    +         "text": text,
    +         "input_boxes": input_boxes,
    +         "input_boxes_labels": input_boxes_labels,
    +     }
    + elif isinstance(image, (list, tuple)) and valid_images(image):

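For reference, a small standalone check of how the two isinstance forms behave. `FakeImage` is a hypothetical stand-in for PIL's Image.Image so the snippet needs no third-party imports; the helper names are likewise illustrative only.

```python
import sys


class FakeImage:
    """Hypothetical stand-in for PIL.Image.Image."""


def is_single_image(value):
    # Tuple-of-types form: accepted by isinstance on every Python 3 version.
    return isinstance(value, (str, FakeImage))


def union_form_supported():
    # `str | FakeImage` builds a union object only on Python 3.10+;
    # on older interpreters evaluating the expression raises TypeError.
    try:
        return isinstance("path/to/image.png", str | FakeImage)
    except TypeError:
        return False


if __name__ == "__main__":
    print("Python", sys.version_info[:2])
    print("tuple form accepts str:", is_single_image("path/to/image.png"))
    print("union form usable:", union_form_supported())
```

The tuple form is the safer choice whenever the minimum supported Python version is in doubt, and it behaves identically to the union form where both are available.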

Hey @yonigozlan! Do we expect other models to ever use this pipeline? In general, I'm not sure about introducing a new pipeline for only one model: once the model more or less disappears in 1-2 years, the pipeline becomes useless...

I'm uncertain! When only one model is supported by a pipeline, it might make more sense to just add the pipeline functions as a snippet in the model card or something. Alternatively, we could bind some methods to the model class that do the same thing, which cuts the maintenance burden a lot compared to a whole pipeline class.
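
As a rough illustration of the "bind a method to the model class" alternative, here is a toy sketch. Every name in it (`ToyProcessor`, `ToyModel`, `segment`, the fixed scores) is hypothetical and does not reflect the SAM3 API; it only shows the preprocess, forward, postprocess flow living on the model instead of in a separate pipeline class.

```python
class ToyProcessor:
    """Hypothetical processor: packs inputs and filters raw predictions."""

    def preprocess(self, image, text):
        return {"pixels": image, "prompt": text}

    def postprocess(self, raw, threshold):
        # Keep only predictions at or above the score threshold.
        return [pred for pred in raw if pred["score"] >= threshold]


class ToyModel:
    """Hypothetical model exposing a convenience method instead of a pipeline."""

    def __init__(self, processor):
        self.processor = processor

    def forward(self, inputs):
        # Fixed fake predictions, standing in for a real forward pass.
        return [
            {"label": inputs["prompt"], "score": 0.9},
            {"label": inputs["prompt"], "score": 0.2},
        ]

    def segment(self, image, text, threshold=0.5):
        # The three steps a pipeline class would otherwise own.
        inputs = self.processor.preprocess(image, text)
        raw = self.forward(inputs)
        return self.processor.postprocess(raw, threshold)
```

With this shape, a caller would write something like `ToyModel(ToyProcessor()).segment(image, "cat")`, and there is no task registry or auto-class mapping to maintain.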
Add pipeline for SAM3's PCS task