Unwrap text_config in AutoModelFor*.from_config #45770
Conversation
if model_class.config_class == config.sub_configs.get("text_config", None):
    # TODO: Validate that copying the parent quantization config to the text sub-config preserves
    # modules_to_not_convert and skip-module matching when composite-model module prefixes differ.
    parent_config = config
    config = config.get_text_config()
    # Propagate quantization_config from the composite parent config so that
    # `get_hf_quantizer` can correctly detect the model as pre-quantized.
    if hasattr(parent_config, "quantization_config"):
        config.quantization_config = parent_config.quantization_config
return model_class._from_config(config, **kwargs)
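For context, a minimal sketch of the call path this branch enables; the `Qwen3_5MoeConfig` name is taken from the PR description, and the top-level import plus default constructor arguments are assumptions for illustration:

```python
from transformers import AutoModelForCausalLM, Qwen3_5MoeConfig  # import path assumed

# Composite (VLM-style) config that carries a text_config sub-config.
config = Qwen3_5MoeConfig()  # defaults used purely for illustration

# Without the unwrap above, the text-only model class mapped for causal LM receives the
# composite config and fails with AttributeError; with it, the class gets config.get_text_config().
model = AutoModelForCausalLM.from_config(config)
print(type(model).__name__)  # expected: the text-only *ForCausalLM class
```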
tbh I am not even sure we are actually using it, except for Qwen3.5. From what I see in the mapping, gemma models will load a VLM class, and this makes sense if a complete config is passed/downloaded.
Not sure what the idea with Qwen3.5 was, as I wasn't there when the model shipped, but a VLM class should be completely capable of running text-only samples.
Fwiw I just re-ran the minimal reproducer from #45759 for Qwen/Qwen3.6-35B-A3B, and the crash I reported does happen. This makes sense because qwen3_5_moe backs Qwen/Qwen3.6-… too.
> a VLM class should be completely capable of running text-only samples
My motivation is mainly SkyRL's FSDP setup, which wants AutoModelForCausalLM to give back a text-only class. Beyond that, this PR just brings from_config in line with from_pretrained's current behavior.
Are you thinking of something else? I may not be fully appreciating your comment here.
Hi,
Severity: action required | Category: correctness
How to fix: Guard quantization_config propagation
Agent prompt to fix (you can give this to your LLM of choice):
Mirror the existing from_pretrained text_config unwrap into from_config so the composite-config -> text-only-model "VLM compatibility" mapping (e.g. Qwen3.5 / Qwen3.5-MoE) works for meta-init paths too. Without this, AutoModelForCausalLM.from_config(Qwen3_5MoeConfig(...)) fails with AttributeError because the text-only model class receives the composite config.

Also propagate quantization_config from the parent so pre-quantized detection still works, mirroring PR huggingface#45494.

Add a parametrized regression test covering both from_pretrained and from_config in the qwen3_5 and qwen3_5_moe model test files, and extend the existing gemma3 test_automodelforcausallm to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
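For reference, a rough sketch of what such a parametrized regression test could look like; the test name, checkpoint id, and assertion are assumptions, not the PR's actual test code:

```python
import pytest

from transformers import AutoModelForCausalLM, Qwen3_5MoeConfig  # class name taken from the description above


@pytest.mark.parametrize("loader", ["from_pretrained", "from_config"])
def test_automodelforcausallm_unwraps_text_config(loader):
    if loader == "from_pretrained":
        # Hypothetical tiny checkpoint id; a real test would point at an existing tiny model.
        model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-qwen3_5_moe")
    else:
        model = AutoModelForCausalLM.from_config(Qwen3_5MoeConfig())
    # Both paths should hand the text-only class its own (text) config, not the composite one.
    assert type(model.config) is model.config_class
```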
The unwrap branch in AutoModelFor*.from_config never triggers for gemma3 (its MODEL_FOR_CAUSAL_LM_MAPPING entry routes Gemma3Config to the VLM class Gemma3ForConditionalGeneration, whose config_class is Gemma3Config itself, not Gemma3TextConfig). The new from_config parameterization therefore exercised only the pre-existing path and added no coverage of the fix.
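To make that coverage gap concrete, a quick check along the lines described above; it assumes the mapping claimed in the comment (Gemma3Config -> Gemma3ForConditionalGeneration) and the internal auto-mapping import path:

```python
from transformers import Gemma3Config
from transformers.models.auto.modeling_auto import MODEL_FOR_CAUSAL_LM_MAPPING

model_class = MODEL_FOR_CAUSAL_LM_MAPPING[Gemma3Config]  # per the comment: Gemma3ForConditionalGeneration
config = Gemma3Config()

# The unwrap condition compares against the text sub-config class, so it can never be
# true for gemma3: the mapped class's config_class is the composite Gemma3Config itself.
print(model_class.config_class == config.sub_configs.get("text_config", None))  # False
print(model_class.config_class is Gemma3Config)                                 # True
```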
Force-pushed from 80fd48f to 42d9f7a
`hasattr(parent_config, "quantization_config")` returns True even when the value is `None` (e.g. a config.json containing `"quantization_config": null`, or a dequantization step that cleared the attribute). The previous propagation copied that None onto the unwrapped text sub-config, and `from_pretrained`'s downstream `get_hf_quantizer` keys off `hasattr` to set `pre_quantized=True`, then crashes inside `AutoHfQuantizer.supports_quant_method(None)` on `None.get(...)`.

Switch to `getattr(parent_config, "quantization_config", None) is not None` at both call sites so absent and None collapse into the same "don't propagate" case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
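A minimal illustration of the distinction this commit relies on, using a stand-in object rather than a real transformers config:

```python
from types import SimpleNamespace

# Stand-in for a parent config whose config.json contained "quantization_config": null.
parent_config = SimpleNamespace(quantization_config=None)

print(hasattr(parent_config, "quantization_config"))                    # True  -> old guard propagates None
print(getattr(parent_config, "quantization_config", None) is not None)  # False -> new guard skips propagation
```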
[For maintainers] Suggested jobs to run (before merge): run-slow: auto, qwen3_5, qwen3_5_moe
Sure @Qodo-Free-For-OSS, just pushed a commit adding this at both call sites:

    # Check both `quantization_config` being present and also not null,
    # as a `config.json` can have `"quantization_config": null` in it
    parent_quant = getattr(parent_config, "quantization_config", None)
    if parent_quant is not None:
        config.quantization_config = parent_quant
zucchini-nlp left a comment
Overall I don't mind it, since we already support loading only the LM part of Qwen3.5 via from_pretrained.
My comments aren't really directed at this PR. The fact that we allowed and support loading one model type inside a different one wasn't a good idea imo. As said, it slipped off my radar, so let's get it in for consistency between from_config and from_pretrained.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Fixes #45759
@ArthurZucker @Cyrilvallez @zucchini-nlp