Ensure removal of temp files on error in ONNX INT4 quantization #1359
vishalpandya1990 merged 2 commits into main from
Conversation
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `modelopt/onnx/quantization/int4.py`, around lines 541-543, in both `_quantize_awq_clip` and `_quantize_awq_lite`:
- Initialize `session = None` before the `try` that calls `create_inference_session(augmented_onnx_path, calibration_eps, input_shapes_profile)`, and in the `finally` block explicitly delete the session (`del session; session = None`) and call `gc.collect()` to release file locks.
- Change any `logger.warn(...)` calls to `logger.warning(...)`.
- Replace the combined try/except around the two `os.remove(...)` calls with independent removals: call each `os.remove(...)` inside its own `try` that catches only `FileNotFoundError` (silently ignored) and logs any other `OSError` with contextual info (filename and function name), so the second file is still attempted even if the first removal fails.
📒 Files selected for processing (1)
- `modelopt/onnx/quantization/int4.py`
Codecov Report: ❌ Patch coverage is

Additional details and impacted files:

@@ Coverage Diff @@
##             main    #1359      +/-   ##
==========================================
- Coverage   76.93%   76.88%   -0.05%
==========================================
  Files         471      471
  Lines       50401    50414      +13
==========================================
- Hits        38777    38762      -15
- Misses      11624    11652      +28

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Signed-off-by: vipandya <vipandya@nvidia.com>
…nd data file Signed-off-by: vipandya <vipandya@nvidia.com>
76d393e to ce5214b (Compare)
ajrasane
left a comment
The fix looks good and CI is green, but consider adding a regression test for the cleanup-on-error path — that's the actual behavior this PR introduces, and right now nothing exercises it (codecov shows the new finally branches are uncovered).
A small unit test in tests/unit/onnx/quantization/test_quantize_zint4.py would do it: monkeypatch modelopt.onnx.quantization.int4.create_inference_session to raise, call _quantize_awq_clip / _quantize_awq_lite (or the public quantize entry point with awq_clip / awq_lite), assert the exception propagates, and assert no *.onnx / *.onnx_data files are left behind in tempfile.gettempdir() matching the augmented-model pattern. Bonus: parametrize over use_external_data_format=True/False so both helper branches are exercised.
Without this, future refactors of the augmented-graph flow could silently regress the leak this PR is fixing.
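As a rough sketch of that test, here is a self-contained stand-in (toy function and file names only; the real test would monkeypatch `modelopt.onnx.quantization.int4.create_inference_session` and call the actual `_quantize_awq_clip` / `_quantize_awq_lite`, as described above):

```python
import glob
import os
import tempfile


def create_inference_session(path):
    """Stand-in for the real session factory; the test swaps it to raise."""
    return object()


def quantize_toy(session_factory=create_inference_session):
    """Toy augmented-graph flow: write temp artifacts, open a session,
    and guarantee cleanup in a finally block even if the session fails."""
    tmpdir = tempfile.gettempdir()
    onnx_path = os.path.join(tmpdir, "augmented_model_toy.onnx")
    data_path = onnx_path + "_data"
    for p in (onnx_path, data_path):
        with open(p, "w") as f:
            f.write("stub")
    try:
        session = session_factory(onnx_path)
        # ... quantization work would happen here ...
    finally:
        for p in (onnx_path, data_path):
            try:
                os.remove(p)
            except FileNotFoundError:
                pass


def test_cleanup_on_session_error():
    def boom(path):
        raise RuntimeError("session creation failed")

    raised = False
    try:
        quantize_toy(session_factory=boom)
    except RuntimeError:
        raised = True
    # The original exception must propagate...
    assert raised, "the original exception must propagate"
    # ...and no temp artifacts matching the pattern may be left behind.
    pattern = os.path.join(tempfile.gettempdir(), "augmented_model_toy*")
    assert not glob.glob(pattern), "temp artifacts leaked"
```

The same structure extends naturally to the suggested parametrization over `use_external_data_format=True/False`.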
Yes, will follow up on it separately.
What does this PR do?
Type of change: Minor bug fix
Testing
Before your PR is "Ready for review"
- Make sure you read and follow the Contributor guidelines and your commits are signed (git commit -s -S).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).
- CONTRIBUTING.md: ✅ / ❌ / N/A

Additional Information