
Fix shard finalization issue to prevent skipping non-LLM tail layers#1548

Merged
lvliang-intel merged 5 commits into main from lvl/fix_shard_finalization_skip
Mar 17, 2026

Conversation

@lvliang-intel
Contributor

Description

Root Cause

ShardWriter.finalize() uses get_lm_head_name() to identify the tied embedding head so it can skip writing it. For diffusion models such as FLUX, this returns "proj_out" (the last leaf module). Because tie_word_embeddings defaulted to True when the attribute was absent from the model config, proj_out.weight and proj_out.bias were silently skipped and never written to disk.

This bug was latent in the old commit but never triggered because FLUX RTN quantization disabled low_cpu_mem_usage on that code path, so finalize() was never called. PR #1386 changed that branch to keep low_cpu_mem_usage=True, which activated the is_immediate_saving path and exposed the bug.
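The failure mode can be illustrated with a minimal sketch. The helper `should_skip_tensor` below is hypothetical (the real check lives inside ShardWriter.finalize()), but it reproduces the condition described above: a `getattr` default of True makes a config that never defines tie_word_embeddings look tied, so the tensors under the reported head name are dropped.

```python
from types import SimpleNamespace


def should_skip_tensor(name: str, config, lm_head_name: str) -> bool:
    """Hypothetical sketch of the buggy skip condition.

    When tie_word_embeddings is absent from the config, the default
    of True wrongly marks the head as tied, so its tensors are skipped.
    """
    tied = getattr(config, "tie_word_embeddings", True)  # buggy default
    return tied and name.startswith(lm_head_name)


# A FLUX-style config that never defines tie_word_embeddings:
config = SimpleNamespace()
# get_lm_head_name() returns the last leaf module, here "proj_out",
# so both of its tensors are silently skipped:
print(should_skip_tensor("proj_out.weight", config, "proj_out"))  # True
print(should_skip_tensor("proj_out.bias", config, "proj_out"))    # True
```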

Fix

Change the default value of tie_word_embeddings in ShardWriter.finalize() from True to False, consistent with the equivalent check in utils.py:364.
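With the corrected default, absence of the attribute means "not tied" and the tail tensors are written. Again using a hypothetical helper to stand in for the real condition in ShardWriter.finalize():

```python
from types import SimpleNamespace


def should_skip_tensor(name: str, config, lm_head_name: str) -> bool:
    # Fixed default: a missing tie_word_embeddings attribute now means
    # "not tied", matching the equivalent check in utils.py:364.
    tied = getattr(config, "tie_word_embeddings", False)
    return tied and name.startswith(lm_head_name)


config = SimpleNamespace()  # FLUX-style config, attribute absent
print(should_skip_tensor("proj_out.weight", config, "proj_out"))  # False
# A genuinely tied LLM config still skips its lm_head as before:
llm_config = SimpleNamespace(tie_word_embeddings=True)
print(should_skip_tensor("lm_head.weight", llm_config, "lm_head"))  # True
```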

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Copilot AI review requested due to automatic review settings March 15, 2026 12:25
Contributor

Copilot AI left a comment


Pull request overview

Fixes a sharded-weight finalization bug where ShardWriter.finalize() could mistakenly treat the last leaf module (e.g., diffusion proj_out) as a tied LM head and silently skip writing its tensors when model.config.tie_word_embeddings is missing.

Changes:

  • Change the default tie_word_embeddings assumption in ShardWriter.finalize() from True to False.
  • Read model.config.tie_word_embeddings only when the attribute is explicitly present.


@chensuyue chensuyue modified the milestones: 0.10.3, 0.12.0 Mar 16, 2026
@lvliang-intel lvliang-intel requested a review from yiliu30 March 17, 2026 13:09
Contributor

@yiliu30 yiliu30 left a comment


LGTM

@lvliang-intel lvliang-intel merged commit 81e79a1 into main Mar 17, 2026
29 checks passed
@lvliang-intel lvliang-intel deleted the lvl/fix_shard_finalization_skip branch March 17, 2026 13:24


4 participants