fix: enable single-scale input support in SAM3 mask decoder #45990
Closed
TheShreyanshiDwivedi wants to merge 5 commits into
Allow Sam3MaskDecoder.forward to receive a single tensor for backbone_features in addition to the standard multi-scale list. When a bare tensor is passed it is wrapped in a list internally and a UserWarning is emitted so callers are aware they are using the single-scale fallback path.

The existing _embed_pixels method and multi-scale FPN path are unchanged: a single-element list already works correctly through _embed_pixels since it operates on the last element of the list.

Adds test_mask_decoder_single_scale_input to verify:
- single tensor input produces valid output and emits UserWarning
- multi-scale list input continues to work unchanged

Fixes huggingface#43043
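The normalization step described in this commit message could look roughly like the sketch below. It is not the code from the PR: the helper name normalize_backbone_features and the warning text are hypothetical, and the check is written against plain Python containers so it runs without torch. In the real decoder the bare input would be a tensor.

```python
import warnings
from typing import Any, List, Sequence, Union


def normalize_backbone_features(
    backbone_features: Union[Any, Sequence[Any]],
) -> List[Any]:
    """Return backbone features as a list of per-scale feature maps.

    Hypothetical helper (not part of transformers): a list or tuple is
    passed through as a list, anything else is treated as one single-scale
    feature map, wrapped in a list, and reported with a UserWarning.
    """
    if isinstance(backbone_features, (list, tuple)):
        # Standard multi-scale path: leave the contents unchanged.
        return list(backbone_features)

    # Single-scale fallback path: warn so the caller knows it was taken.
    warnings.warn(
        "backbone_features was passed as a single tensor; "
        "wrapping it in a list and using the single-scale path.",
        UserWarning,
        stacklevel=2,
    )
    return [backbone_features]
```

Because downstream code is described as operating on the last element of the list, a one-element list produced this way would flow through the same path as a multi-scale list, which is why only the input handling needs to change.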
Sam3LiteTextMaskDecoder inherits from Sam3MaskDecoder via the modular conversion system. Apply the same single-tensor normalization introduced in the parent class so the generated file stays consistent with what check_modular_conversion expects. Fixes check_repository_consistency CI failure caused by stale generated modeling_sam3_lite_text.py after the parent class was updated.
…tput Align docstring wording with what check_modular_conversion generates from the parent Sam3MaskDecoder class, resolving the remaining consistency check failure.
Contributor
[For maintainers] Suggested jobs to run (before merge)

run-slow: sam3, sam3_lite_text
Member
Blocking for opening like 5 random Claude PRs
What does this PR do?
SAM3 and SAM3-Lite mask decoders rejected single-scale feature maps unconditionally. This fixes the forward path to accept single-scale inputs.
Root cause:
Sam3MaskDecoder assumed multi-scale input with no fallback path for single-scale feature maps. Same issue propagated to Sam3LiteText mask decoder.

Changes
- Sam3MaskDecoder
- Sam3LiteText mask decoder
- Sam3LiteTextMaskDecoder to match corrected output shape

Files changed
- src/transformers/models/sam3/modeling_sam3.py
- src/transformers/models/sam3_lite_text/modeling_sam3_lite_text.py