Add Promptable Visual Segmentation pipeline#43613

Open
yonigozlan wants to merge 9 commits into huggingface:main from yonigozlan:add-pvs-pipeline
@yonigozlan
Member

Add a pipeline for the task shared by sam, sam2, edgetam, and sam3_tracker.

@yonigozlan yonigozlan changed the title Add Promptable Visual Segmentation pipeline [WIP] Add Promptable Visual Segmentation pipeline Jan 30, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yonigozlan yonigozlan requested review from Copilot and vasqu January 30, 2026 23:48
@yonigozlan yonigozlan changed the title [WIP] Add Promptable Visual Segmentation pipeline Add Promptable Visual Segmentation pipeline Jan 30, 2026
@yonigozlan yonigozlan requested a review from molbap January 30, 2026 23:48
Copilot AI left a comment

Pull request overview

This PR adds a new "promptable-visual-segmentation" pipeline to support SAM-family models (SAM, SAM2, EdgeTAM, SAM3Tracker) for interactive object segmentation based on visual prompts like points and bounding boxes.

Changes:

  • Implements a new PromptableVisualSegmentationPipeline class for interactive segmentation with point and box prompts
  • Adds comprehensive test coverage with 13 test methods covering various prompting scenarios
  • Includes detailed documentation with usage examples and guides for both pipeline and direct model usage
  • Integrates the new pipeline into the auto-model mapping system and updates model test configurations

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `src/transformers/pipelines/promptable_visual_segmentation.py` | New pipeline implementation with preprocessing, forward pass, and postprocessing logic |
| `tests/pipelines/test_pipelines_promptable_visual_segmentation.py` | Test suite covering single/multiple points, boxes, multimask output, error handling, and batch processing |
| `src/transformers/models/auto/modeling_auto.py` | Auto-model mapping definitions for promptable visual segmentation |
| `tests/models/sam/test_modeling_sam.py` | Updated pipeline mapping to include promptable-visual-segmentation |
| `tests/models/sam2/test_modeling_sam2.py` | Updated pipeline mapping to include promptable-visual-segmentation |
| `tests/models/edgetam/test_modeling_edgetam.py` | Updated pipeline mapping to include promptable-visual-segmentation |
| `tests/models/sam3_tracker/test_modeling_sam3_tracker.py` | Updated pipeline mapping to include promptable-visual-segmentation |
| `src/transformers/pipelines/__init__.py` | Registered new pipeline in SUPPORTED_TASKS |
| `src/transformers/__init__.py` | Exported PromptableVisualSegmentationPipeline |
| `docs/source/en/tasks/promptable_visual_segmentation.md` | Comprehensive task guide with examples for various use cases |
| `docs/source/en/model_doc/sam.md` | Added pipeline usage examples for SAM |
| `docs/source/en/model_doc/sam2.md` | Added pipeline usage examples for SAM2 |
| `docs/source/en/model_doc/edgetam.md` | Added pipeline usage examples and fixed model checkpoint references |
| `docs/source/en/model_doc/sam3_tracker.md` | Added pipeline usage examples |
| `docs/source/en/model_doc/auto.md` | Added AutoModelForPromptableVisualSegmentation documentation |
| `docs/source/en/main_classes/pipelines.md` | Added PromptableVisualSegmentationPipeline documentation |
| `utils/check_docstrings.py` | Added pipeline to docstring exception list |

final_results.append(image_results)

# If single image, return as list with one element (for consistency)
return final_results if batch_size > 1 or isinstance(pred_masks, (list, tuple)) else final_results

Copilot AI Jan 30, 2026

The return statement on line 393 is redundant. Both branches of the conditional expression evaluate to final_results, so the condition batch_size > 1 or isinstance(pred_masks, (list, tuple)) has no effect on the returned value, whatever the batch size or the type of pred_masks. final_results is already a list, so this line should simply be return final_results.

Suggested change
- return final_results if batch_size > 1 or isinstance(pred_masks, (list, tuple)) else final_results
+ return final_results
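
To make the point concrete, here is a minimal, self-contained sketch (not the PR's code). The function names `postprocess_original` and `postprocess_simplified` and the sample inputs are hypothetical; only the return expression is copied from the reviewed line. It checks that the conditional form and the plain form return the very same object for a few representative inputs.

```python
def postprocess_original(final_results, batch_size, pred_masks):
    # Same expression as the reviewed line: both branches yield final_results.
    return final_results if batch_size > 1 or isinstance(pred_masks, (list, tuple)) else final_results


def postprocess_simplified(final_results, batch_size, pred_masks):
    # Suggested replacement: return the list unconditionally.
    return final_results


# Hypothetical inputs: a plain object stands in for a tensor of predicted masks.
results = [[{"mask": "m0"}]]
cases = [
    (1, object()),    # single image, non-list masks
    (1, [object()]),  # single image, list of masks
    (4, object()),    # batch of several images
]

all_same = all(
    postprocess_original(results, bs, pm) is postprocess_simplified(results, bs, pm)
    for bs, pm in cases
)
print(all_same)  # True
```

Because the two functions agree on every input, the simplification changes no behavior. If a single-image call was meant to return something different (for example an unwrapped result), that would need a separate change.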

Comment on lines 29 to 31
if is_torch_available():
    pass

Copilot AI Jan 30, 2026

The if is_torch_available(): block contains only a pass statement, so it does nothing whether or not torch is available. The block serves no purpose and can be deleted entirely.

Suggested change
- if is_torch_available():
-     pass

Comment on lines +37 to +39
@require_torch
@require_vision
class PromptableVisualSegmentationPipelineTests(unittest.TestCase):

Copilot AI Jan 30, 2026

The test class is missing the @is_pipeline_test decorator that is consistently used in other pipeline test files in this codebase. This decorator is used to mark pipeline tests for the test infrastructure. Additionally, the test class should define a model_mapping class variable (like model_mapping = MODEL_FOR_PROMPTABLE_VISUAL_SEGMENTATION_MAPPING), which is a standard pattern in pipeline tests for automatic model discovery. Reference examples: tests/pipelines/test_pipelines_mask_generation.py:64-68, tests/pipelines/test_pipelines_zero_shot_object_detection.py:44-48.

@require_torch
@require_vision
class PromptableVisualSegmentationPipelineTests(unittest.TestCase):

Copilot AI Jan 30, 2026

The test class is missing required methods get_test_pipeline and run_pipeline_test that are standard in all pipeline test classes. These methods are required by the pipeline test infrastructure for running common pipeline tests. Reference examples: tests/pipelines/test_pipelines_mask_generation.py:70-94, tests/pipelines/test_pipelines_zero_shot_object_detection.py:50-90.
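
As a rough illustration of the shape the two review comments ask for, the sketch below shows a test class that defines `model_mapping`, `get_test_pipeline`, and `run_pipeline_test`. It is an assumption-laden stand-in rather than the repository's code: `StubPipeline`, the example dictionary, the `"mask"`/`"score"` output keys, and the `get_test_pipeline` parameter list are all hypothetical, and the real decorators and mapping are only mentioned in comments so that the sketch runs without transformers installed.

```python
import unittest


class StubPipeline:
    """Hypothetical stand-in for the real pipeline object."""

    def __call__(self, inputs):
        # Pretend to segment: return one result dict per prompt point.
        return [{"mask": None, "score": 1.0} for _ in inputs["points"]]


# In the repository, the class would additionally be decorated with
# @is_pipeline_test, @require_torch and @require_vision.
class PromptableVisualSegmentationPipelineTests(unittest.TestCase):
    # The review suggests MODEL_FOR_PROMPTABLE_VISUAL_SEGMENTATION_MAPPING here;
    # None is a placeholder so this sketch has no external dependencies.
    model_mapping = None

    def get_test_pipeline(self, model, tokenizer=None, image_processor=None,
                          feature_extractor=None, processor=None, dtype="float32"):
        # Assumed signature; the actual test infrastructure may pass different arguments.
        pipe = StubPipeline()
        examples = [{"image": "placeholder.png", "points": [[100, 200]]}]
        return pipe, examples

    def run_pipeline_test(self, pipe, examples):
        # Check only the output structure, not the segmentation quality.
        for example in examples:
            outputs = pipe(example)
            self.assertIsInstance(outputs, list)
            for item in outputs:
                self.assertIn("mask", item)
                self.assertIn("score", item)


# Exercise the two hooks by hand, roughly as a common test harness might.
case = PromptableVisualSegmentationPipelineTests("run_pipeline_test")
pipe, examples = case.get_test_pipeline(model=None)
case.run_pipeline_test(pipe, examples)
```

The referenced files `tests/pipelines/test_pipelines_mask_generation.py` and `tests/pipelines/test_pipelines_zero_shot_object_detection.py` remain the authoritative examples of the exact signatures and return values expected.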

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, edgetam, sam, sam2, sam3_tracker
