Fix shared config mutation issue in flash_attn_from_config #45678
Open
kaixuanliu wants to merge 2 commits into huggingface:main
Conversation
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Collaborator
We should avoid using mutable objects in arguments. But to fix this failing test, let's use …
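A minimal sketch of the pitfall the reviewer flags, assuming nothing from the transformers codebase (all names below are hypothetical): a default argument is evaluated once, at function definition time, so the same object is shared across every call.

```python
class DummyConfig:
    def __init__(self):
        self.dtype = "float32"

# The default DummyConfig() below is built ONCE, when the def statement
# runs, and reused on every call that omits the argument.
def prepare_config(vision_config=DummyConfig()):
    return vision_config

a = prepare_config()
a.dtype = "bfloat16"   # mutate the "default" object
b = prepare_config()
print(b.dtype)         # bfloat16 -- the mutation leaked into a later call
```

The usual idiom is `vision_config=None` plus `vision_config = vision_config or DummyConfig()` inside the function body, so each call gets a fresh object.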
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Contributor
Author
Thx for your advice. Looks much better now.
Contributor
[For maintainers] Suggested jobs to run (before merge): run-slow: phi4_multimodal
What does this PR do?
Fixes a test isolation bug where `test_from_pretrained_no_checkpoint` and `test_load_save_without_tied_weights` fail when run after `test_flash_attn_2_from_config` in the same session (with `RUN_SLOW=1`).

Root cause

`_from_config()` mutates sub-configs in place when setting the dtype: `for sub_config_key in config.sub_configs: sub_config.dtype = dtype`. Some model testers (e.g. `Phi4MultimodalModelTester`) use mutable default arguments for sub-configs, like `vision_config=Phi4MultimodalVisionConfig(...)`. These objects are created once at class definition time and shared across all calls to `prepare_config_and_inputs_for_common()`.

When `flash_attn_from_config` calls `_from_config(config, dtype=torch.bfloat16)`, it permanently sets `vision_config.dtype = torch.bfloat16` on the shared sub-config object. All subsequent tests then create models with bfloat16 vision weights instead of float32, causing weight mismatches.

Fix

Deepcopy the config before passing it to `_from_config` in `flash_attn_from_config`, preventing the mutation from leaking across tests. This is a general fix that protects all models.
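The leak and the deepcopy fix can be sketched with toy classes (the names `SubConfig`, `Config`, and `from_config` below are illustrative stand-ins, not the actual transformers implementation):

```python
import copy

class SubConfig:
    def __init__(self):
        self.dtype = "float32"

class Config:
    # Mutable default argument: this SubConfig is created once and shared
    # by every Config() built without an explicit vision_config.
    def __init__(self, vision_config=SubConfig()):
        self.vision_config = vision_config
        self.sub_configs = ["vision_config"]

def from_config(config, dtype):
    # Toy stand-in for _from_config: sets dtype on sub-configs in place.
    for key in config.sub_configs:
        getattr(config, key).dtype = dtype
    return config

shared = Config()

# The fix: deepcopy before handing the config over, so the dtype
# mutation stays on the copy.
fixed = from_config(copy.deepcopy(shared), dtype="bfloat16")
print(shared.vision_config.dtype)  # float32 -- shared default untouched

# Without the deepcopy, the shared default object itself is mutated,
# and every later Config() silently inherits bfloat16:
from_config(shared, dtype="bfloat16")
another = Config()
print(another.vision_config.dtype)  # bfloat16 -- the leak across tests
```

`copy.deepcopy` recursively copies the sub-config objects as well, which is why a shallow copy would not be enough here: the copied config would still hold references to the same shared `vision_config`.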
Tests
@ydshieh pls help review, thx!