Fix TypeError in convert_rope_params_to_dict when ignore_keys is a list #44272
Merged
Rocketknight1 merged 1 commit into huggingface:main on Feb 25, 2026
Conversation
In `convert_rope_params_to_dict`, the `ignore_keys_at_rope_validation` parameter is expected to be a set but can arrive as a list when model configs are deserialized from JSON (e.g. Qwen3.5 via vLLM). The union operator `list | set` raises TypeError on Python 3.12+. Wrap the value in `set()` to normalize all iterables before the union. `set(already_a_set)` is a no-op copy, so existing behavior is preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
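The failure mode and the fix can be reproduced in a few lines of plain Python (the key name `"mrope_section"` below is illustrative, not taken from a real config):

```python
import json

# A set-valued config field cannot survive JSON: sets are serialized
# as arrays and deserialized back as Python lists.
config = json.loads('{"ignore_keys_at_rope_validation": ["mrope_section"]}')
ignore_keys = config["ignore_keys_at_rope_validation"]
print(type(ignore_keys).__name__)  # list

# The union on the validation path then fails, because "|" is not
# defined between a list and a set.
try:
    ignore_keys | {"partial_rotary_factor"}
except TypeError as e:
    print(f"TypeError: {e}")

# The fix: coerce with set() first. set(already_a_set) is a shallow
# copy, so callers that already pass a set are unaffected.
merged = set(ignore_keys) | {"partial_rotary_factor"}
print(sorted(merged))  # ['mrope_section', 'partial_rotary_factor']
```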
redpanda1995
approved these changes
Feb 25, 2026
Member
hey @redpanda1995 please stop going around randomly approving PRs - it creates notifications for the maintainers, and also you're clearly not actually reviewing them because several of them were invalid! We'll just block you across all HF repos if this continues.
Rocketknight1
approved these changes
Feb 25, 2026
Member
Rocketknight1
left a comment
It's a bit weird to add code to support malformed calls from external frameworks, but it's very harmless (since set(set) is a no-op) so I'm willing to accept this one!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
JohannesHa added a commit to PrimeIntellect-ai/prime-rl that referenced this pull request on Mar 4, 2026
Bump vLLM to >=0.16.1.dev (nightly), which includes Qwen3.5 model support. This requires torch 2.10 (resolved from the existing >=2.9.0 pin), an updated flash-attn wheel built against torch 2.10, and version overrides for nvidia-cutlass-dsl and quack-kernels.

Bump the transformers pin to 5c1c72b, which includes a rope validation fix for Qwen3.5 (huggingface/transformers#44272).

Add a trainer monkey-patch for a transformers bug where Qwen3.5 passes 3D MRoPE position_ids to decoder layers instead of 2D text_position_ids, which breaks flash attention and causes NaN gradients. The upstream fix is pending: huggingface/transformers#44399

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
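The position_ids mismatch that the monkey-patch works around can be illustrated with a shape-only sketch. The layout below is an assumption for illustration: MRoPE stacks three index grids (temporal, height, width) on a leading axis of size 3, while text-only attention expects a single 2D (batch, seq_len) grid. `to_text_position_ids` is a hypothetical helper, not the actual patch:

```python
# Hypothetical MRoPE position_ids: three stacked index grids
# (temporal, height, width), each of shape (batch=1, seq_len=4).
mrope_position_ids = [
    [[0, 1, 2, 3]],  # temporal indices
    [[0, 1, 2, 3]],  # height indices
    [[0, 1, 2, 3]],  # width indices
]

def to_text_position_ids(position_ids):
    # Collapse the 3D stack to a 2D grid by keeping only the first
    # (temporal) component; already-2D input passes through unchanged.
    if len(position_ids) == 3 and isinstance(position_ids[0][0], list):
        return position_ids[0]
    return position_ids

print(to_text_position_ids(mrope_position_ids))  # [[0, 1, 2, 3]]
```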
What does this PR do?
Fixes a `TypeError: unsupported operand type(s) for |: 'list' and 'set'` in `RotaryEmbeddingConfigMixin.convert_rope_params_to_dict` when `ignore_keys_at_rope_validation` is a `list` instead of a `set`.
Root cause
In `modeling_rope_utils.py` line 649, the `else` branch passes through `ignore_keys_at_rope_validation` without type coercion. Line 651 then performs `ignore_keys_at_rope_validation | {"partial_rotary_factor"}`, which fails if the value is a `list` (since `list | set` is not supported in Python).
When does this happen?
Model configs define `ignore_keys_at_rope_validation` as a `set` literal in `__init__`, so the normal `from_pretrained` flow works. However, when third-party serving frameworks (e.g. vLLM) define their own config classes that pass this field through JSON deserialization, the value arrives as a `list`, triggering the crash. Encountered in practice when serving Qwen3.5-27B via vLLM (which bundles its own `Qwen3_5TextConfig` that inherits from `PretrainedConfig`).
Fix
Wrap in `set()` to normalize any iterable. `set(already_a_set)` returns a copy, so existing behavior is preserved.
Traceback
Full call chain: `Qwen3_5TextConfig.__init__` → `PretrainedConfig.__init__` → `convert_rope_params_to_dict`
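The one-line normalization described above can be sketched with a simplified stand-in function (`merge_ignore_keys` is illustrative only, not the actual transformers code):

```python
def merge_ignore_keys(ignore_keys_at_rope_validation):
    # Before the fix (sketched): `ignore_keys_at_rope_validation | {...}`
    # raised TypeError when the value was a list.
    # After the fix: set() accepts any iterable -- list, tuple, or set --
    # and set(already_a_set) returns a shallow copy, so callers that
    # already pass a set see identical behavior.
    return set(ignore_keys_at_rope_validation) | {"partial_rotary_factor"}

# Works for a list arriving from JSON deserialization:
print(merge_ignore_keys(["mrope_section"]) == {"mrope_section", "partial_rotary_factor"})  # True
# Unchanged for the normal set-literal path:
print(merge_ignore_keys({"mrope_section"}) == {"mrope_section", "partial_rotary_factor"})  # True
```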