Skip to content

Fix shared config mutation issue in flash_attn_from_config#45678

Open
kaixuanliu wants to merge 2 commits intohuggingface:mainfrom
kaixuanliu:config-copy
Open

Fix shared config mutation issue in flash_attn_from_config#45678
kaixuanliu wants to merge 2 commits intohuggingface:mainfrom
kaixuanliu:config-copy

Conversation

@kaixuanliu
Copy link
Copy Markdown
Contributor

@kaixuanliu kaixuanliu commented Apr 28, 2026

What does this PR do?

Fixes a test isolation bug where test_from_pretrained_no_checkpoint and test_load_save_without_tied_weights fail when run after test_flash_attn_2_from_config in the same session (with RUN_SLOW=1).

Root cause

_from_config() mutates sub-configs in-place when setting dtype:

for sub_config_key in config.sub_configs:    sub_config.dtype = dtype
Some model testers (e.g. Phi4MultimodalModelTester) use mutable default arguments for sub-configs like vision_config=Phi4MultimodalVisionConfig(...). These objects are created once at class definition time and shared across all calls to prepare_config_and_inputs_for_common().

When flash_attn_from_config calls _from_config(config, dtype=torch.bfloat16), it permanently sets vision_config.dtype = torch.bfloat16 on the shared sub-config object. All subsequent tests then create models with bfloat16 vision weights instead of float32, causing weight mismatches.

Fix

deepcopy the config before passing it to _from_config in flash_attn_from_config, preventing the mutation from leaking across tests. This is a general fix that protects all models.

Tests

RUN_SLOW=1 pytest tests/models/phi4_multimodal/test_modeling_phi4_multimodal.py -k "not deepspeed"
# Before: 2 failed (test_from_pretrained_no_checkpoint, test_load_save_without_tied_weights)
# After: all pass

@ydshieh pls help review, thx!

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@ydshieh
Copy link
Copy Markdown
Collaborator

ydshieh commented Apr 28, 2026

We should avoid using mutable object in arguments. But to fix this failing test, let's use

class Phi4MultimodalModelTester:
        ...
        self.audio_config = deepcopy(audio_config)
        self.vision_config = deepcopy(vision_config)

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@kaixuanliu
Copy link
Copy Markdown
Contributor Author

Thx for your advice. Looks much better now.

@github-actions
Copy link
Copy Markdown
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: phi4_multimodal

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants