Fix regex capture for Megatron KD PP layer renaming #1355
Conversation
Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>
No actionable comments were generated in the recent review. 🎉

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 6 passed
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
modelopt/torch/distill/plugins/megatron.py (1)
166-176: ⚠️ Potential issue | 🟠 Major
Replace only the matched numeric segment, not all identical substrings.
At Line 175, `submodule_name.replace(match.group(0), str(new_layer_idx))` replaces all occurrences of the matched digit sequence globally. With the new regex on Line 166 (now matching end-of-string indices), this corrupts unrelated numeric parts: for example, `block12.layers.12` becomes `block8.layers.8` (both `12`s replaced instead of just the second one). Use span-based replacement instead:
Fix
```diff
- new_submodule_name = submodule_name.replace(match.group(0), str(new_layer_idx))
+ start, end = match.span(0)
+ new_submodule_name = f"{submodule_name[:start]}{new_layer_idx}{submodule_name[end:]}"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/distill/plugins/megatron.py` around lines 166-176: the replacement currently uses `submodule_name.replace(match.group(0), ...)`, which replaces all identical digit sequences. Instead, use the match span to replace only the matched segment: get the start/end via `match.span()`, slice `submodule_name` into prefix + new index string + suffix, and assign that to `new_submodule_name` (keeping the existing logic that computes the offset via `TransformerLayer._get_layer_offset(model_cfg)` and `new_layer_idx`). This ensures only the specific matched numeric part is changed rather than every occurrence of `match.group(0)`.
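A minimal standalone sketch of the failure mode and the span-based fix. The regex and module names below are illustrative assumptions for demonstration, not the actual pattern or module tree from megatron.py:

```python
import re

# Illustrative pattern in the spirit of the fix: a layer index preceded by
# "layers." and followed by a dot or the end of the string. The real pattern
# in modelopt/torch/distill/plugins/megatron.py may differ.
LAYER_IDX_RE = re.compile(r"(?<=layers\.)\d+(?=\.|$)")

def rename_buggy(submodule_name: str, new_layer_idx: int) -> str:
    """Buggy: str.replace swaps *every* occurrence of the matched digits."""
    match = LAYER_IDX_RE.search(submodule_name)
    return submodule_name.replace(match.group(0), str(new_layer_idx))

def rename_fixed(submodule_name: str, new_layer_idx: int) -> str:
    """Fixed: splice only the matched span back together."""
    match = LAYER_IDX_RE.search(submodule_name)
    start, end = match.span(0)
    return f"{submodule_name[:start]}{new_layer_idx}{submodule_name[end:]}"

print(rename_buggy("block12.layers.12", 8))  # block8.layers.8  -- unrelated '12' corrupted
print(rename_fixed("block12.layers.12", 8))  # block12.layers.8 -- only the layer index changes
```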
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: bc20afe5-7ae0-4bc1-a159-cf01547d97e6
📒 Files selected for processing (1)
modelopt/torch/distill/plugins/megatron.py
Codecov Report
❌ Patch coverage is
Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #1355      +/-   ##
==========================================
+ Coverage   75.72%   75.82%   +0.10%
==========================================
  Files         471      471
  Lines       50375    50375
==========================================
+ Hits        38146    38199      +53
+ Misses      12229    12176      -53
```
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry.
Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>
/ok to test c0312ef
What does this PR do?
Type of change: Bug fix
Previously, the regex required a dot after the integer layer number, but that dot does not always exist: the layer index can also sit at the end of the submodule name.
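For illustration, a sketch of the kind of pattern change involved. The patterns below are assumptions for demonstration, not the actual regex in modelopt/torch/distill/plugins/megatron.py:

```python
import re

# Illustrative "before": requires a trailing dot after the layer index,
# so it misses an index that ends the submodule name.
old_pattern = re.compile(r"layers\.(\d+)\.")
# Illustrative "after": accepts a trailing dot or the end of the string.
new_pattern = re.compile(r"layers\.(\d+)(?=\.|$)")

print(bool(old_pattern.search("decoder.layers.3.mlp")))  # True
print(bool(old_pattern.search("decoder.layers.3")))      # False -- the bug
print(bool(new_pattern.search("decoder.layers.3")))      # True
```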
Usage
# Add a code snippet demonstrating how to use this

Testing
Before your PR is "Ready for review"
- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.) in CONTRIBUTING.md: ✅ / ❌ / N/A

Additional Information
Summary by CodeRabbit