Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings.
📝 Walkthrough

Added a dedicated ONNX compatibility shim module and invoked it on import; removed unpinned …
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Importer as Test / Module Import
    participant Shim as modelopt.onnx._onnx_compat
    participant ONNX as onnx.helper
    participant NumPy as numpy / ml_dtypes
    Importer->>Shim: import (side-effect)
    Shim->>ONNX: try import onnx.helper
    alt helpers missing
        Shim->>NumPy: import numpy, ml_dtypes
        NumPy-->>Shim: provide conversion dtypes/utilities
        Shim->>ONNX: define float32_to_bfloat16 and float32_to_float8e4m3
    else imports fail
        Shim-->>Importer: no patch applied (silent)
    end
    Importer->>ONNX: subsequent code/tests may use helpers
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed (1 warning)
Force-pushed 9356aec to 4dfe4c1
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelopt/onnx/_onnx_compat.py`:
- Around line 36-54: The shimmed helper functions use incomplete signatures
causing TypeError when callers pass legacy kwargs; update _float32_to_bfloat16
and _float32_to_float8e4m3 to match ONNX 1.19.x: change
_float32_to_bfloat16(value) to _float32_to_bfloat16(fval, truncate: bool =
False) and change _float32_to_float8e4m3(value, fn=True, uz=False) to
_float32_to_float8e4m3(fval, scale: float = 1.0, fn: bool = True, uz: bool =
False, saturate: bool = True); inside each function use fval, apply scale for
float8 (e.g., np.float32(fval) * scale) and accept but safely ignore
truncate/saturate if not needed, then perform the existing dtype conversion and
return the same integer representation; update the assignments
onnx.helper.float32_to_bfloat16 and onnx.helper.float32_to_float8e4m3 to
reference the renamed/updated functions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 68e21319-7789-4c62-a7c0-a6d5a73e29d5
📒 Files selected for processing (7)
- examples/windows/onnx_ptq/genai_llm/requirements.txt
- examples/windows/onnx_ptq/whisper/requirements.txt
- modelopt/onnx/__init__.py
- modelopt/onnx/_onnx_compat.py
- pyproject.toml
- tests/unit/onnx/conftest.py
- tests/unit/onnx/quantization/test_quant_utils.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelopt/onnx/__init__.py`:
- Around line 21-22: The import of the ONNX compatibility shim
(patch_onnx_helper_removed_apis) is happening before the optional-deps
try/except guard and will raise if onnx/ml_dtypes are missing; wrap the shim
import in the same optional-deps try/except (or move it inside that try block)
so ImportError is caught and handled by your normalized error reporting logic,
i.e., import modelopt.onnx._onnx_compat.patch_onnx_helper_removed_apis inside
the existing try/except that checks for onnx and ml_dtypes rather than at module
top-level.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 584f28d9-70d9-4199-88eb-5662e6e25976
📒 Files selected for processing (7)
- examples/windows/onnx_ptq/genai_llm/requirements.txt
- examples/windows/onnx_ptq/whisper/requirements.txt
- modelopt/onnx/__init__.py
- modelopt/onnx/_onnx_compat.py
- pyproject.toml
- tests/unit/onnx/conftest.py
- tests/unit/onnx/quantization/test_quant_utils.py
💤 Files with no reviewable changes (2)
- examples/windows/onnx_ptq/whisper/requirements.txt
- examples/windows/onnx_ptq/genai_llm/requirements.txt
✅ Files skipped from review due to trivial changes (2)
- pyproject.toml
- tests/unit/onnx/conftest.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/unit/onnx/quantization/test_quant_utils.py
- modelopt/onnx/_onnx_compat.py
Codecov Report: ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #1207      +/-   ##
==========================================
+ Coverage   75.42%   77.34%   +1.92%
==========================================
  Files         353      353
  Lines       40603    40595       -8
==========================================
+ Hits        30623    31398     +775
+ Misses       9980     9197     -783
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
kevalmorabia97 left a comment:

LGTM. PTAL at CodeRabbit comments as well.
ONNX 1.20+ removed deprecated helper functions (`float32_to_bfloat16`, `float32_to_float8e4m3`, `pack_float32_to_4bit`) that `onnx_graphsurgeon` 0.5.x still references at import time. This adds a compatibility shim that restores these functions using `ml_dtypes` before any `onnx_graphsurgeon` imports occur.

Changes:
- Bump `onnx~=1.19.0` to `onnx~=1.21.0` in `pyproject.toml`
- Add `modelopt/onnx/_onnx_compat.py` compatibility shim for removed APIs
- Import shim in `modelopt/onnx/__init__.py` and test `conftest.py`
- Update `test_quant_utils.py` to remove usage of removed `onnx.helper.pack_float32_to_4bit`; validate against hardcoded expected values instead
- Update example requirements (`genai_llm`, `whisper`) to `onnx==1.21.0`

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Force-pushed 4dfe4c1 to 68ed5a2
♻️ Duplicate comments (1)
modelopt/onnx/__init__.py (1)
Lines 21-22: ⚠️ Potential issue | 🟠 Major

Move the compatibility shim import inside the try/except guard.

The import of `patch_onnx_helper_removed_apis` at line 22 occurs before the dependency guard (lines 26–30). If `onnx` or `ml_dtypes` is missing, this could fail early and bypass the normalized optional-dependency error message. As noted in the previous review, move the import inside the existing try/except block to ensure consistent error handling:

Proposed fix

```diff
-# Apply ONNX compatibility shim before any onnx_graphsurgeon imports.
-from modelopt.onnx._onnx_compat import patch_onnx_helper_removed_apis
-
 MIN_PYTHON_VERSION = (3, 10)

 try:
+    # Apply ONNX compatibility shim before any onnx_graphsurgeon imports.
+    from modelopt.onnx._onnx_compat import patch_onnx_helper_removed_apis
+
     patch_onnx_helper_removed_apis()
     from . import quantization
     from .logging_config import configure_logging, logger
 except ImportError as e:
     raise ImportError(f"{e}\nPlease install optional ``[onnx]`` dependencies.")
```

As per coding guidelines: `modelopt/**/*.py` should "gate optional dependencies" rather than importing at module top-level before the guard.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/onnx/__init__.py` around lines 21 - 22, The import of patch_onnx_helper_removed_apis is executed before the optional-dependency guard, which can raise errors before the normalized message; move the import statement into the existing try/except that checks for onnx and ml_dtypes so the compatibility shim is applied only when those deps are present and any ImportError is handled by the guard. Locate the top-level import of patch_onnx_helper_removed_apis in modelopt.onnx.__init__ and relocate it inside the same try block that imports onnx and ml_dtypes (the dependency guard), ensuring the except block still raises the normalized optional-dependency error if imports fail. Ensure no other code references the shim before the try block so behavior remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@modelopt/onnx/__init__.py`:
- Around line 21-22: The import of patch_onnx_helper_removed_apis is executed
before the optional-dependency guard, which can raise errors before the
normalized message; move the import statement into the existing try/except that
checks for onnx and ml_dtypes so the compatibility shim is applied only when
those deps are present and any ImportError is handled by the guard. Locate the
top-level import of patch_onnx_helper_removed_apis in modelopt.onnx.__init__ and
relocate it inside the same try block that imports onnx and ml_dtypes (the
dependency guard), ensuring the except block still raises the normalized
optional-dependency error if imports fail. Ensure no other code references the
shim before the try block so behavior remains unchanged.
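The gating pattern the guideline describes can be illustrated generically as below. The helper name `import_optional` and its message format are made up for this sketch; they are not modelopt APIs:

```python
import importlib


def import_optional(module_name: str, extra: str = "onnx"):
    # Normalize ImportError for any missing optional dependency so callers
    # always see the "install the extra" hint instead of a raw traceback.
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise ImportError(
            f"{e}\nPlease install optional ``[{extra}]`` dependencies."
        ) from e
```

The key point is that every import of an optional dependency, including side-effect imports like the shim, goes through the same try/except so failures surface with one consistent message.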
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8161e52e-c1d5-4f7f-b478-a92ae3efd27c
📒 Files selected for processing (7)
- examples/windows/onnx_ptq/genai_llm/requirements.txt
- examples/windows/onnx_ptq/whisper/requirements.txt
- modelopt/onnx/__init__.py
- modelopt/onnx/_onnx_compat.py
- pyproject.toml
- tests/unit/onnx/conftest.py
- tests/unit/onnx/quantization/test_quant_utils.py
💤 Files with no reviewable changes (2)
- examples/windows/onnx_ptq/whisper/requirements.txt
- examples/windows/onnx_ptq/genai_llm/requirements.txt
✅ Files skipped from review due to trivial changes (3)
- pyproject.toml
- tests/unit/onnx/conftest.py
- modelopt/onnx/_onnx_compat.py
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Force-pushed 05c1a3e to 3fd3336
Seems like this test is failing, but it looks unrelated to the change, since I saw multiple PRs fail at this same test yesterday.
`tests/unit/torch/quantization/test_quantize_cpu.py::test_save_restore` is disabled on Windows now in the main branch as it's flaky.
… onnx 1.21.0

`onnx-graphsurgeon` 0.6.1 no longer references the removed `onnx.helper` functions (`float32_to_bfloat16`, `float32_to_float8e4m3`), so the compatibility shim and its tests are no longer needed. Pin `onnx-graphsurgeon>=0.6.1` in `pyproject.toml`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
tests/unit/onnx/conftest.py (Outdated)
cjluo-nv left a comment:

Please address @kevalmorabia97's comment before landing.
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
…_template (#1225)

## Summary
- `apply_chat_template(..., return_tensors="pt")` returns a `BatchEncoding` in transformers 4.46+, which no longer subclasses `dict`
- The old guard `isinstance(tokenized, dict)` evaluates to `False` for `BatchEncoding`, so `input_ids` was set to the whole `BatchEncoding` object
- Calling `.shape[1]` on a `BatchEncoding` triggers `__getattr__("shape")` → `AttributeError`
- Fix: check `isinstance(tokenized, torch.Tensor)` instead, which correctly handles both old transformers (plain tensor) and new transformers (`BatchEncoding`)

This is causing `test_collect_hidden_states` to fail in the speculative decoding CI for all open PRs (#1207, #1210, #1221).

## Test plan
- [ ] `torch-pr (speculative_decoding, 26.01)` CI passes
- [ ] Verify fix handles both `torch.Tensor` return (old transformers) and `BatchEncoding` return (new transformers 4.46+)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Signed-off-by: Ye Yu <yeyu@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
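The guard change described in that commit can be sketched as follows. `extract_input_ids` is a hypothetical helper standing in for the real collection code:

```python
import torch


def extract_input_ids(tokenized):
    # transformers 4.46+ returns a BatchEncoding from
    # apply_chat_template(..., return_tensors="pt"); it is mapping-like but
    # no longer a dict subclass, so the old isinstance(tokenized, dict)
    # guard misfires. Checking for a plain tensor covers both cases.
    if isinstance(tokenized, torch.Tensor):
        return tokenized
    return tokenized["input_ids"]
```

Either way the caller gets a tensor back, so `.shape[1]` is safe afterwards.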
What does this PR do?

Type of change: new feature

Upgrade ONNX dependency from `~=1.19.0` to `~=1.21.0`. ONNX 1.20+ removed several deprecated helper functions (`float32_to_bfloat16`, `float32_to_float8e4m3`, `pack_float32_to_4bit`) that `onnx_graphsurgeon` 0.5.x still references at import time. This PR adds a compatibility shim (`modelopt/onnx/_onnx_compat.py`) that restores these functions using `ml_dtypes` before any `onnx_graphsurgeon` import occurs. This supersedes the partial inline fix from #1204 by also handling `float32_to_float8e4m3`.

Changes:
- Bump `onnx~=1.19.0` to `onnx~=1.21.0` in `pyproject.toml`
- Add `modelopt/onnx/_onnx_compat.py` compatibility shim for removed ONNX APIs
- Import shim in `modelopt/onnx/__init__.py` and `tests/unit/onnx/conftest.py`
- Remove usage of removed `onnx.helper.pack_float32_to_4bit` in `test_quant_utils.py`
- Update example requirements (`genai_llm`, `whisper`) to `onnx==1.21.0`

TensorRT Compatibility: TRT 10.16-GA supports opsets 9–24. ModelOpt quantization modes use opsets 19–23, all within range. ONNX 1.21 does not force opset 26.

Usage

Testing
- `nvcr.io/nvidia/tensorrt:25.06-py3` (1 pre-existing ORT `CopyTensorAsync` EP issue, not ONNX-related)
- `torch_onnx` integration tests pass (fp8, int8, nvfp4, mxfp8, int4_awq, auto)
- `torch_onnx` → TRT engine build → ImageNet eval: 85.3% top-1, 97.8% top-5
- `onnx_ptq` → TRT engine build succeeds

Before your PR is "Ready for review"
- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: ✅

Additional Information

Related: #1204 (partial fix for `float32_to_bfloat16` only; this PR supersedes it with full coverage)

🤖 Generated with Claude Code