Add Promptable Visual Segmentation pipeline #43613
yonigozlan wants to merge 9 commits into huggingface:main
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Pull request overview
This PR adds a new "promptable-visual-segmentation" pipeline to support SAM-family models (SAM, SAM2, EdgeTAM, SAM3Tracker) for interactive object segmentation based on visual prompts like points and bounding boxes.
Changes:
- Implements a new `PromptableVisualSegmentationPipeline` class for interactive segmentation with point and box prompts
- Adds comprehensive test coverage with 13 test methods covering various prompting scenarios
- Includes detailed documentation with usage examples and guides for both pipeline and direct model usage
- Integrates the new pipeline into the auto-model mapping system and updates model test configurations
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/transformers/pipelines/promptable_visual_segmentation.py | New pipeline implementation with preprocessing, forward pass, and postprocessing logic |
| tests/pipelines/test_pipelines_promptable_visual_segmentation.py | Test suite covering single/multiple points, boxes, multimask output, error handling, and batch processing |
| src/transformers/models/auto/modeling_auto.py | Auto-model mapping definitions for promptable visual segmentation |
| tests/models/sam/test_modeling_sam.py | Updated pipeline mapping to include promptable-visual-segmentation |
| tests/models/sam2/test_modeling_sam2.py | Updated pipeline mapping to include promptable-visual-segmentation |
| tests/models/edgetam/test_modeling_edgetam.py | Updated pipeline mapping to include promptable-visual-segmentation |
| tests/models/sam3_tracker/test_modeling_sam3_tracker.py | Updated pipeline mapping to include promptable-visual-segmentation |
| src/transformers/pipelines/__init__.py | Registered new pipeline in SUPPORTED_TASKS |
| src/transformers/__init__.py | Exported PromptableVisualSegmentationPipeline |
| docs/source/en/tasks/promptable_visual_segmentation.md | Comprehensive task guide with examples for various use cases |
| docs/source/en/model_doc/sam.md | Added pipeline usage examples for SAM |
| docs/source/en/model_doc/sam2.md | Added pipeline usage examples for SAM2 |
| docs/source/en/model_doc/edgetam.md | Added pipeline usage examples and fixed model checkpoint references |
| docs/source/en/model_doc/sam3_tracker.md | Added pipeline usage examples |
| docs/source/en/model_doc/auto.md | Added AutoModelForPromptableVisualSegmentation documentation |
| docs/source/en/main_classes/pipelines.md | Added PromptableVisualSegmentationPipeline documentation |
| utils/check_docstrings.py | Added pipeline to docstring exception list |

    final_results.append(image_results)

    # If single image, return as list with one element (for consistency)
    return final_results if batch_size > 1 or isinstance(pred_masks, (list, tuple)) else final_results

The return statement on line 393 is redundant. Both branches of the conditional expression return `final_results`, so the condition `batch_size > 1 or isinstance(pred_masks, (list, tuple))` has no effect on the result. `final_results` is already a list regardless of batch size, so this line should simply be `return final_results`.
Suggested change:

    - return final_results if batch_size > 1 or isinstance(pred_masks, (list, tuple)) else final_results
    + return final_results

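To make the redundancy concrete, here is a minimal sketch. The two function names are hypothetical stand-ins rather than code from the PR; they only mirror the shape of the flagged line. Because both arms of the conditional expression evaluate to the same object, the condition cannot change what is returned.

```python
def postprocess_original(final_results, batch_size, pred_masks):
    # Mirrors the flagged line: both arms of the conditional return `final_results`.
    return final_results if batch_size > 1 or isinstance(pred_masks, (list, tuple)) else final_results


def postprocess_simplified(final_results, batch_size, pred_masks):
    # The suggested replacement: return the list unconditionally.
    return final_results


results = [[{"mask": "m0"}]]  # placeholder data, not real pipeline output

# (batch_size, pred_masks) pairs: a non-list object, a list, and a larger batch.
cases = [(1, object()), (1, [object()]), (4, object())]

outcomes = [
    postprocess_original(results, b, m) is postprocess_simplified(results, b, m)
    for b, m in cases
]
print(outcomes)  # [True, True, True]
```

Whichever way the condition evaluates, both versions return the very same list object.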

    if is_torch_available():
        pass

The empty `pass` statement serves no purpose and should be removed. If `is_torch_available()` is True, nothing needs to be executed in this block. This code block can be deleted entirely.
Suggested change:

    - if is_torch_available():
    -     pass


    @require_torch
    @require_vision
    class PromptableVisualSegmentationPipelineTests(unittest.TestCase):

The test class is missing the `@is_pipeline_test` decorator that is consistently used in other pipeline test files in this codebase. This decorator is used to mark pipeline tests for the test infrastructure. Additionally, the test class should define a `model_mapping` class variable (like `model_mapping = MODEL_FOR_PROMPTABLE_VISUAL_SEGMENTATION_MAPPING`), which is a standard pattern in pipeline tests for automatic model discovery. Reference examples: tests/pipelines/test_pipelines_mask_generation.py:64-68, tests/pipelines/test_pipelines_zero_shot_object_detection.py:44-48.
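The pattern this comment describes might look like the sketch below. To keep it runnable without the PR branch, `is_pipeline_test` and the mapping are local stand-ins (assumptions, not the library's implementations). In the real test file the decorator would be imported from `transformers.testing_utils`, the mapping from the auto-model module, and the existing `@require_torch` and `@require_vision` decorators would stay in place.

```python
import unittest


def is_pipeline_test(test_case):
    # Stand-in for the real decorator (assumption): here it only tags the class.
    test_case.is_pipeline_test = True
    return test_case


# Stand-in for MODEL_FOR_PROMPTABLE_VISUAL_SEGMENTATION_MAPPING; contents are hypothetical.
MODEL_FOR_PROMPTABLE_VISUAL_SEGMENTATION_MAPPING = {"sam": "SamModel"}


@is_pipeline_test
class PromptableVisualSegmentationPipelineTests(unittest.TestCase):
    # Class-level mapping that shared pipeline test tooling can inspect.
    model_mapping = MODEL_FOR_PROMPTABLE_VISUAL_SEGMENTATION_MAPPING

    def test_mapping_is_declared(self):
        self.assertIn("sam", self.model_mapping)
```

The class-level attribute matters because it can be read without instantiating the test case.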

    @require_torch
    @require_vision
    class PromptableVisualSegmentationPipelineTests(unittest.TestCase):

The test class is missing required methods `get_test_pipeline` and `run_pipeline_test` that are standard in all pipeline test classes. These methods are required by the pipeline test infrastructure for running common pipeline tests. Reference examples: tests/pipelines/test_pipelines_mask_generation.py:70-94, tests/pipelines/test_pipelines_zero_shot_object_detection.py:50-90.
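As a rough sketch of the two hooks, the class below shows the shape they usually take: `get_test_pipeline` returns a pipeline together with example inputs, and `run_pipeline_test` checks the outputs. `FakeSegmenter`, the input keys (`image`, `input_points`), and the output keys (`mask`, `score`) are hypothetical stand-ins chosen for illustration. The real signatures and result format should be taken from the referenced test files and from the PR's pipeline.

```python
import unittest


class FakeSegmenter:
    """Hypothetical stand-in for a pipeline object; returns one result list per input."""

    def __call__(self, inputs):
        return [[{"mask": None, "score": 1.0}] for _ in inputs]


class PromptableVisualSegmentationPipelineTests(unittest.TestCase):
    def get_test_pipeline(self, model=None, image_processor=None, **kwargs):
        # A real hook would build the pipeline from `model` and `image_processor`.
        segmenter = FakeSegmenter()
        # Example inputs; the keys and values here are assumptions.
        examples = [{"image": "path/to/image.png", "input_points": [[[450, 600]]]}]
        return segmenter, examples

    def run_pipeline_test(self, segmenter, examples):
        outputs = segmenter(examples)
        # One list of results per input example.
        self.assertEqual(len(outputs), len(examples))
        for image_results in outputs:
            for result in image_results:
                self.assertIn("mask", result)
                self.assertIn("score", result)

    def test_hooks_work_together(self):
        segmenter, examples = self.get_test_pipeline()
        self.run_pipeline_test(segmenter, examples)
```

Splitting construction from assertions lets shared test tooling call the same checks against each model it discovers.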
[For maintainers] Suggested jobs to run (before merge) run-slow: auto, edgetam, sam, sam2, sam3_tracker
Add pipeline for sam/sam2/edgetam/sam3_tracker task