Fix TypeError in convert_rope_params_to_dict when ignore_keys is a list#44272

Merged
Rocketknight1 merged 1 commit into huggingface:main from hangjun-ezra:fix/rope-ignore-keys-type-coercion
Feb 25, 2026

Conversation

@hangjun-ezra
Contributor

What does this PR do?

Fixes a TypeError: unsupported operand type(s) for |: 'list' and 'set' in RotaryEmbeddingConfigMixin.convert_rope_params_to_dict when ignore_keys_at_rope_validation is a list instead of a set.

Root cause

In modeling_rope_utils.py line 649, the else branch passes ignore_keys_at_rope_validation through without type coercion. Line 651 then evaluates ignore_keys_at_rope_validation | {"partial_rotary_factor"}, which fails when the value is a list, since the | union operator is not supported between a list and a set in Python.
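The failure can be reproduced in isolation (a minimal sketch using key names from the traceback; no transformers import required):

```python
# A list, as it arrives after JSON deserialization, does not support `|` with a set.
ignore_keys = ["rope_theta"]
try:
    merged = ignore_keys | {"partial_rotary_factor"}
except TypeError as exc:
    print(f"TypeError: {exc}")  # unsupported operand type(s) for |: 'list' and 'set'

# The same union succeeds once the value is coerced to a set.
merged = set(ignore_keys) | {"partial_rotary_factor"}
print(sorted(merged))  # ['partial_rotary_factor', 'rope_theta']
```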

When does this happen?

Model configs define ignore_keys_at_rope_validation as a set literal in __init__, so the normal from_pretrained flow works. However, when third-party serving frameworks (e.g. vLLM) define their own config classes that pass this field through JSON deserialization, the value arrives as a list, triggering the crash.

Encountered in practice when serving Qwen3.5-27B via vLLM (which bundles its own Qwen3_5TextConfig that inherits from PretrainedConfig).
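A short illustration of why the type changes in that path (field name taken from the report; JSON has no set type, so a set-valued config field is necessarily stored and deserialized as a list):

```python
import json

# A config field that starts life as a set must be written to JSON as a list.
config = {"ignore_keys_at_rope_validation": sorted({"rope_theta"})}

# After a save/load round-trip, the value comes back as a plain list, not a set.
restored = json.loads(json.dumps(config))
print(type(restored["ignore_keys_at_rope_validation"]))  # <class 'list'>
```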

Fix

Wrap the value in set() to normalize any iterable. set(already_a_set) returns a shallow copy, so existing behavior is preserved.

- set() if ignore_keys_at_rope_validation is None else ignore_keys_at_rope_validation
+ set() if ignore_keys_at_rope_validation is None else set(ignore_keys_at_rope_validation)
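A minimal sketch of the patched behavior (the helper name `normalize_ignore_keys` is hypothetical; in the actual patch the expression is inline in `convert_rope_params_to_dict`):

```python
def normalize_ignore_keys(ignore_keys_at_rope_validation):
    # Mirrors the patched line: None becomes an empty set,
    # and any other iterable (list, tuple, set) is coerced to a set.
    return set() if ignore_keys_at_rope_validation is None else set(ignore_keys_at_rope_validation)

assert normalize_ignore_keys(None) == set()
assert normalize_ignore_keys(["a", "b"]) == {"a", "b"}  # list input now works

original = {"a"}
normalized = normalize_ignore_keys(original)
assert normalized == original and normalized is not original  # set input is copied

# The union that previously crashed no longer raises.
print(sorted(normalized | {"partial_rotary_factor"}))
```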

Traceback

File "transformers/modeling_rope_utils.py", line 651, in convert_rope_params_to_dict
    ignore_keys_at_rope_validation = ignore_keys_at_rope_validation | {"partial_rotary_factor"}
TypeError: unsupported operand type(s) for |: 'list' and 'set'

Full call chain: `Qwen3_5TextConfig.__init__` → `PretrainedConfig.__init__` → `convert_rope_params_to_dict`

In `convert_rope_params_to_dict`, the `ignore_keys_at_rope_validation`
parameter is expected to be a set but can arrive as a list when model
configs are deserialized from JSON (e.g. Qwen3.5 via vLLM). The union
operator `list | set` raises TypeError, since lists do not support `|` with sets.

Wrap the value in `set()` to normalize all iterables before the union.
`set(already_a_set)` is a no-op copy, so existing behavior is preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Rocketknight1
Member

hey @redpanda1995, please stop going around randomly approving PRs. It creates notifications for the maintainers, and you're clearly not actually reviewing them, because several of them were invalid! We'll block you across all HF repos if this continues.


@Rocketknight1 Rocketknight1 left a comment


It's a bit unusual to add code to support malformed calls from external frameworks, but it's harmless (since set(set) is a no-op copy), so I'm willing to accept this one!

@Rocketknight1 Rocketknight1 enabled auto-merge (squash) February 25, 2026 14:29
@Rocketknight1 Rocketknight1 merged commit c58e711 into huggingface:main Feb 25, 2026
25 checks passed
JohannesHa added a commit to PrimeIntellect-ai/prime-rl that referenced this pull request Mar 4, 2026
Bump vLLM to >=0.16.1.dev (nightly) which includes Qwen3.5 model support.
This requires torch 2.10 (resolved from the existing >=2.9.0 pin), an
updated flash-attn wheel built against torch 2.10, and version overrides
for nvidia-cutlass-dsl and quack-kernels.

Bump transformers pin to 5c1c72b which includes a rope validation fix
for Qwen3.5 (huggingface/transformers#44272).

Add a trainer monkey-patch for a transformers bug where Qwen3.5 passes
3D MRoPE position_ids to decoder layers instead of 2D text_position_ids,
which breaks flash attention and causes NaN gradients. The upstream fix
is pending: huggingface/transformers#44399

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
