Fix GraniteMoeHybrid _update_mamba_mask crash on attention-only models #45514
Open
tianhaocui wants to merge 1 commit into huggingface:main from
Conversation
When all layers are attention layers (no mamba layers), _update_mamba_mask calls past_key_values.has_previous_state(), which tries to find a LinearAttentionCacheLayerMixin layer. Since none exist, it raises a ValueError. This change skips the has_previous_state check entirely when the model has no mamba layers, as the mamba mask optimization is irrelevant in that case. Fixes huggingface#45507
[For maintainers] Suggested jobs to run (before merge): run-slow: granitemoehybrid
Fixes #45507
Summary
GraniteMoeHybridModel._update_mamba_mask calls past_key_values.has_previous_state() without checking whether the model actually has mamba layers. When all layers are attention-only (no mamba layers in config.layers_block_type), has_previous_state() fails to find a LinearAttentionCacheLayerMixin layer and raises ValueError.

Fix

Check config.layers_block_type for mamba layers before calling has_previous_state(). If no mamba layers exist, return the attention mask as-is since the mamba mask optimization is irrelevant.

Applied to both modeling_granitemoehybrid.py and modular_granitemoehybrid.py.
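A minimal sketch of the guard, to illustrate where the early return sits. The method signature, the past_key_values parameter, and the mask-dropping condition after the guard are assumptions for illustration, not the actual upstream diff; only the layers_block_type check and the has_previous_state() call come from the PR description.

```python
import torch

def _update_mamba_mask(self, attention_mask, past_key_values, cache_position):
    """Sketch: return None when the mamba mask can be dropped, else the mask."""
    # Guard described in this PR: attention-only configs have no mamba layers,
    # so skip the has_previous_state() lookup (which would raise ValueError
    # because no LinearAttentionCacheLayerMixin layer exists) and return the
    # attention mask unchanged.
    if "mamba" not in self.config.layers_block_type:
        return attention_mask

    mamba_mask = attention_mask
    # Pre-existing optimization (approximated here): the mamba mask can be
    # dropped when a cached state already exists or nothing is masked.
    if (past_key_values is not None and past_key_values.has_previous_state()) or (
        attention_mask is not None and torch.all(attention_mask == 1)
    ):
        mamba_mask = None
    return mamba_mask
```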