[NVBug 6102977] Disable use_cache during export dummy forward for custom configs#1402
The export step's collect_shared_input_modules() runs a dummy forward pass to trace shared input modules for resmoothing/requantization. This forward call hits the same AttributeError as PR #1324 on configs that don't assign use_cache (e.g., stepfun-ai/Step-3.5-Flash's Step3p5Config):

    AttributeError: 'Step3p5Config' object has no attribute 'use_cache'

Wrap the dummy forward with the _disable_use_cache context manager so it sets/restores config.use_cache around the call, mirroring the calibration fix from #1324. This covers all callsites, since both unified_export_hf and plugins/vllm_fakequant_hf funnel through collect_shared_input_modules.

Signed-off-by: weimingc <17592131+meenchen@users.noreply.github.com>
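For reference, a minimal sketch of what a use_cache-disabling context manager can look like. This is an illustrative reimplementation, not the actual _disable_use_cache from modelopt's dataset_utils; the sentinel-based restore mirrors the behavior described above (set use_cache to False for the duration of the call, then restore the prior value, or delete the attribute if the config never defined it):

```python
from contextlib import contextmanager

_MISSING = object()  # sentinel: distinguishes "attribute absent" from "set to None"


@contextmanager
def disable_use_cache(config):
    """Force config.use_cache = False inside the block, then restore the
    original value, or delete the attribute if the config never defined it
    (as with Step3p5Config)."""
    original = getattr(config, "use_cache", _MISSING)
    config.use_cache = False
    try:
        yield config
    finally:
        if original is _MISSING:
            del config.use_cache  # leave the config exactly as we found it
        else:
            config.use_cache = original
```

Because the restore happens in a `finally` block, the config is put back in its original state even if the dummy forward raises.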
Walkthrough: A single import is added for the cache-disabling utility, and the dummy forward probe in collect_shared_input_modules is wrapped with it.
Codecov Report: ✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## main #1402 +/- ##
==========================================
+ Coverage 76.73% 76.82% +0.09%
==========================================
Files 476 476
Lines 51306 51307 +1
==========================================
+ Hits 39368 39419 +51
+ Misses 11938 11888 -50
Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
cjluo-nv left a comment (bot review):
Tiny, well-targeted bug fix (+5 -1). Reuses the existing _disable_use_cache context manager from dataset_utils to wrap the dummy forward pass in collect_shared_input_modules, mirroring the calibration fix from PR #1324. The context manager correctly handles configs that lack use_cache (sets it to False, restores/deletes on exit). No new code to speak of — just adding one more context manager to an existing with block.
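To make the failure mode concrete, here is a hedged, self-contained sketch of why the unwrapped probe crashes and the wrapped one succeeds. FakeConfig, FakeModel, and this disable_use_cache are stand-ins for illustration only, not the actual modelopt or transformers code:

```python
from contextlib import contextmanager


class FakeConfig:
    pass  # deliberately lacks use_cache, like Step3p5Config


class FakeModel:
    """Stand-in for an HF model whose forward reads config.use_cache."""

    def __init__(self):
        self.config = FakeConfig()

    def forward(self, x):
        # Without the wrapper this line raises:
        # AttributeError: 'FakeConfig' object has no attribute 'use_cache'
        use_cache = self.config.use_cache
        return x if not use_cache else (x, "kv-cache")


@contextmanager
def disable_use_cache(config):
    # illustrative equivalent of modelopt's _disable_use_cache
    had_attr = hasattr(config, "use_cache")
    prev = getattr(config, "use_cache", None)
    config.use_cache = False
    try:
        yield
    finally:
        if had_attr:
            config.use_cache = prev
        else:
            del config.use_cache


model = FakeModel()
with disable_use_cache(model.config):
    out = model.forward([0, 1, 2])  # dummy forward probe succeeds
```

On exit the attribute is deleted again, so a config that never defined use_cache is left untouched after the probe.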