
fix: enable _supports_flash_attn_2 for EOMT #45993

Closed

TheShreyanshiDwivedi wants to merge 3 commits into huggingface:main from TheShreyanshiDwivedi:fresh/eomt-flash-attn-support

Conversation

@TheShreyanshiDwivedi

What does this PR do?

EOMT already had full Flash Attention 2 wiring via ALL_ATTENTION_FUNCTIONS, but the gate flag _supports_flash_attn_2 was missing. This one-line fix enables FA2 for EOMT.

Root cause: EomtPreTrainedModel was missing _supports_flash_attn_2 = True despite the attention layer already being correctly wired to the FA2 backend.

Changes

  • Add _supports_flash_attn_2 = True to EomtPreTrainedModel (a sketch of the change follows the file list below)

Files changed

  • src/transformers/models/eomt/modular_eomt.py
  • src/transformers/models/eomt/modeling_eomt.py
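
For illustration, a minimal sketch of the change. Everything except the last attribute is context; the surrounding attribute names are assumed and may not match the real class body in those files:

```python
class EomtPreTrainedModel(PreTrainedModel):
    # Existing attributes shown for context only; names are assumed and may
    # differ from the actual class in modeling_eomt.py / modular_eomt.py.
    config_class = EomtConfig
    base_model_prefix = "eomt"
    supports_gradient_checkpointing = True
    _supports_sdpa = True

    # The one-line addition from this PR: advertise Flash Attention 2 support
    # so that attn_implementation="flash_attention_2" passes validation.
    _supports_flash_attn_2 = True
```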

Srijan Upadhyay and others added 3 commits May 15, 2026 15:19
EomtAttention already routes through ALL_ATTENTION_FUNCTIONS and accepts
attn_implementation at runtime, identical to other vision models that have
_supports_flash_attn = True (DINOv2, SigLIP, etc.).  The flag was missing,
causing flash_attention_2 to be rejected at the PreTrainedModel validation
level even though the attention forward is fully compatible.
…mt_dinov3)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
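
To show what the flag unlocks, a hedged usage sketch: once the flag is set, flash_attention_2 is no longer rejected at PreTrainedModel validation and can be requested at load time. The checkpoint name below is illustrative only, and it is assumed that EOMT is registered with the universal-segmentation auto class:

```python
import torch
from transformers import AutoModelForUniversalSegmentation

# Illustrative checkpoint name, not taken from this PR.
model = AutoModelForUniversalSegmentation.from_pretrained(
    "tue-mps/coco_panoptic_eomt_large_640",
    attn_implementation="flash_attention_2",  # was rejected before this fix
    torch_dtype=torch.float16,                # FA2 requires fp16 or bf16
)
```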
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: eomt, eomt_dinov3, videomt

@Rocketknight1
Member

Hey, these Claude PRs that don't address an existing issue and are just kind of opened randomly are very annoying for us! It's not actually helpful, and we can run Claude ourselves to fix stuff if we need to.

4 participants