llm_sparsity: Set warmup_steps 0 instead of 0.0 for transformers 5.x compat #1393
Conversation
…compat
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
CodeRabbit: No actionable comments were generated in the recent review. 🎉
Review configuration used: .coderabbit.yaml (profile: CHILL, plan: Enterprise). Files selected for processing: 1
📝 Walkthrough: The shell script that launches model finetuning now passes `warmup_steps` as the integer `0` instead of the float `0.0` (the type issue behind this change is sketched below, after the pre-merge check summary).
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~2 minutes. 🚥 Pre-merge checks: ✅ 6 passed.
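To make the motivation concrete: `warmup_steps` is an integer-typed field on transformers' `TrainingArguments`, and the PR title indicates that transformers 5.x no longer tolerates the float-formatted `0.0` coming from the launch script. The snippet below is a minimal sketch of that failure mode using a plain `argparse` parser as a stand-in for the script's real argument handling (the parser setup here is illustrative, not the actual launch code):

```python
import argparse

# Illustration of why "0.0" breaks: warmup_steps is declared as an int in
# transformers' TrainingArguments, so an int-typed CLI parser cannot convert
# the float-formatted string. This parser is a stand-in, not the real
# launch script's argument handling.
parser = argparse.ArgumentParser()
parser.add_argument("--warmup_steps", type=int, default=0)

print(parser.parse_args(["--warmup_steps", "0"]).warmup_steps)  # 0 -- accepted

try:
    parser.parse_args(["--warmup_steps", "0.0"])  # rejected: invalid int value
except SystemExit:
    print("'0.0' is not a valid value for an int-typed --warmup_steps")
```

Passing the literal `0` satisfies both older and newer parsers, which is all the one-character change in the launch script does.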
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Additional details and impacted files:
@@ Coverage Diff @@
## main #1393 +/- ##
==========================================
+ Coverage 75.74% 77.07% +1.32%
==========================================
Files 476 476
Lines 51057 51057
==========================================
+ Hits 38672 39350 +678
+ Misses 12385 11707 -678
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
#1416 (#1426)

## Cherry-picked PRs
- #1393
- #1389
- #1268
- #1397
- #1402
- #1411
- #1410
- #1419
- #1408
- #1416

## Summary by CodeRabbit
* **New Features**
  * SPEEDBench now uses stratified sampling for deterministic, balanced dataset selection.
  * Added legacy quantization conversion shims for INT4, MXFP8, and FP4→2DQ workflows.
  * AWQ Lite: fallback handling for uncalibrated per-expert quantizers during export.
* **Bug Fixes**
  * Clamp FP8 scales in NVFP4 quantization to avoid NaNs.
  * Fixed warmup steps formatting in the finetune launch script.
* **Improvements**
  * LM-Eval integration updated for v0.4.10+ compatibility.
  * TensorRT execution routed through a dedicated trtexec helper.
* **Tests**
  * Added regression tests covering quantization shims, FP8 scale behavior, export fallbacks, and LM eval.

(Change stack: https://app.coderabbit.ai/change-stack/NVIDIA/Model-Optimizer/pull/1426)

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Signed-off-by: Alexandre Milesi <milesial@users.noreply.github.com>
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
Signed-off-by: weimingc <17592131+meenchen@users.noreply.github.com>
Signed-off-by: weimingc <weimingc@nvidia.com>
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Co-authored-by: milesial <milesial@users.noreply.github.com>
Co-authored-by: Gwena Cunha <4861122+gcunhase@users.noreply.github.com>
Co-authored-by: Chenjie Luo <108829653+cjluo-nv@users.noreply.github.com>
Co-authored-by: Wei-Ming Chen <17592131+meenchen@users.noreply.github.com>
Co-authored-by: sugunav14 <178320438+sugunav14@users.noreply.github.com>
Co-authored-by: Ajinkya Rasane <131806219+ajrasane@users.noreply.github.com>
Fix for NVBug 6120631: set warmup_steps to 0 instead of 0.0 in the llm_sparsity finetune launch script for transformers 5.x compatibility (a direct-Python equivalent is sketched after the summary below).
Summary by CodeRabbit
* **Bug Fixes**
  * Fixed warmup steps formatting in the finetune launch script.
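For anyone driving finetuning from Python instead of the shell script, the same rule applies when constructing `TrainingArguments` directly. A minimal sketch, assuming a standard transformers install; `output_dir` and the other values are placeholders, not settings from this repository:

```python
from transformers import TrainingArguments

# Pass warmup_steps as an int (0), not a float (0.0), so the value matches
# the int-typed field across transformers 4.x and 5.x.
args = TrainingArguments(
    output_dir="./finetune-output",   # placeholder output path
    num_train_epochs=1,               # placeholder value
    per_device_train_batch_size=1,    # placeholder value
    warmup_steps=0,                   # integer, mirroring the launch-script fix
)
print(args.warmup_steps)  # 0
```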