[6281412] docs: update TensorRT-Edge-LLM CLI commands in torch_onnx example by ajrasane · Pull Request #1808 · NVIDIA/Model-Optimizer

ajrasane · 2026-06-23T21:02:30Z

What does this PR do?

Type of change: documentation

TensorRT-Edge-LLM v0.8.0 consolidated its CLI entry points, leaving the example commands in examples/torch_onnx/README.md referencing tools that no longer exist (e.g. tensorrt-edgellm-export-visual). This updates the README to the current interface:

- tensorrt-edgellm-quantize-llm / tensorrt-edgellm-quantize-draft → tensorrt-edgellm-quantize {llm,draft} (subcommands)
- tensorrt-edgellm-export-llm / -export-visual / -export-draft → unified tensorrt-edgellm-export with positional model / output_dir args and automatic VLM/audio component detection
- --is_eagle_base → --eagle-base
- Updated the CLI Tools table and the LLM / VLM / EAGLE examples accordingly
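The renames above lend themselves to a quick stale-reference check over any doc that still mentions the old tools. The following is a minimal sketch, not part of this PR: the `RENAMES` table is copied from the list above, while `find_stale_references` is a hypothetical helper name introduced only for illustration.

```python
import re

# Old name -> suggested replacement, taken from the renames listed in this PR.
RENAMES = {
    "tensorrt-edgellm-quantize-llm": "tensorrt-edgellm-quantize llm",
    "tensorrt-edgellm-quantize-draft": "tensorrt-edgellm-quantize draft",
    "tensorrt-edgellm-export-llm": "tensorrt-edgellm-export",
    "tensorrt-edgellm-export-visual": "tensorrt-edgellm-export",
    "tensorrt-edgellm-export-draft": "tensorrt-edgellm-export",
    "--is_eagle_base": "--eagle-base",
}

# Longest names first so no alternative can shadow a longer one.
_PATTERN = re.compile(
    "|".join(re.escape(name) for name in sorted(RENAMES, key=len, reverse=True))
)


def find_stale_references(text):
    """Return (line_number, old_name, suggested_replacement) for each hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(0)
            hits.append((lineno, old, RENAMES[old]))
    return hits
```

The suggestion for the export tools is only a starting point: the unified command takes positional model / output_dir arguments, so a renamed command line still has to be checked by hand.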

Usage

N/A — documentation change.

Testing

Verified against the live main branch of TensorRT-Edge-LLM by running the actual entry-point code (python -m tensorrt_edgellm.scripts.quantize/export):

- --help runs cleanly for quantize, quantize llm, quantize draft, and export; all documented flags (--model_dir, --output_dir, --quantization, --base_model_dir, --draft_model_dir, positional model/output_dir, --eagle-base) are present.
- Drove the parser with the exact README commands; they parse and advance into the real quantize/export logic.
- Confirmed the old names are gone: the quantize-llm subcommand is rejected, --is_eagle_base is rejected, and the scripts.export_visual module is not found.
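For readers who want to see why the old spellings fail, the checks above can be mimicked with a toy `argparse` layout. This is only a sketch of the interface shape described in this PR, not the actual TensorRT-Edge-LLM parser: the builder functions and `parses` are hypothetical, the per-subcommand flags are omitted, and treating `--eagle-base` as a boolean switch is an assumption.

```python
import argparse


def build_quantize_parser():
    # Toy stand-in for `tensorrt-edgellm-quantize {llm,draft}`.
    parser = argparse.ArgumentParser(prog="tensorrt-edgellm-quantize")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("llm")
    sub.add_parser("draft")
    return parser


def build_export_parser():
    # Toy stand-in for the unified export command with positional arguments.
    parser = argparse.ArgumentParser(prog="tensorrt-edgellm-export")
    parser.add_argument("model")
    parser.add_argument("output_dir")
    # Assumption: the flag is a simple on/off switch.
    parser.add_argument("--eagle-base", action="store_true")
    return parser


def parses(parser, argv):
    """Return True if argv parses, False if argparse rejects it."""
    try:
        parser.parse_args(argv)
        return True
    except SystemExit:  # argparse exits on invalid choices or unknown flags
        return False
```

With this layout, `parses(build_quantize_parser(), ["quantize-llm"])` and `parses(build_export_parser(), ["model", "out", "--is_eagle_base"])` both return `False`, matching the rejections reported above, while the new spellings parse.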

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

Is this change backward compatible?: N/A (documentation only)
If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
Did you write any new necessary tests?: N/A (documentation only)
Did you update Changelog?: N/A (minor docs change)

🤖 Generated by Claude (AI agent).

Summary by CodeRabbit

Documentation
- Updated TensorRT-Edge-LLM CLI documentation to reflect consolidated command structure
- Updated command examples for LLM, VLM, and EAGLE speculative decoding workflows
- Documented new unified CLI interfaces with updated subcommands and flags

…xample

TensorRT-Edge-LLM v0.8.0 consolidated its CLI entry points. Update the example README to the new interface:

- tensorrt-edgellm-quantize-llm/-draft -> tensorrt-edgellm-quantize {llm,draft}
- tensorrt-edgellm-export-llm/-visual/-draft -> unified tensorrt-edgellm-export with positional model/output_dir args and automatic VLM/audio detection
- --is_eagle_base -> --eagle-base

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>

coderabbitai · 2026-06-23T21:02:45Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e40b7e3b-fd34-41a6-9fe7-28c000c8829e

📥 Commits

Reviewing files that changed from the base of the PR and between 37dbbda and b039772.

📒 Files selected for processing (1)

examples/torch_onnx/README.md

📝 Walkthrough

The examples/torch_onnx/README.md is updated to reflect a new unified CLI interface for TensorRT-Edge-LLM. The --help verification commands, CLI tools reference table, LLM/VLM export examples, and EAGLE speculative decoding examples are all rewritten to use tensorrt-edgellm-quantize and tensorrt-edgellm-export with subcommands, replacing the older -llm-suffixed binaries.

Changes

TensorRT-Edge-LLM CLI Documentation

| Layer / File(s) | Summary |
| --- | --- |
| Installation verification and CLI tools table `examples/torch_onnx/README.md` | Switches `--help` commands from `*-llm` variants to `tensorrt-edgellm-quantize` and `tensorrt-edgellm-export`; rewrites the CLI tools table to list the unified commands with `llm`/`draft` subcommands and adds `tensorrt-edgellm-insert-lora` and `tensorrt-edgellm-process-lora`; updates LLM and VLM command examples with new argument ordering and auto-detection for VLM export. |
| EAGLE speculative decoding example `examples/torch_onnx/README.md` | Renames base export flag from `--is_eagle_base` to `--eagle-base`, changes draft quantization to `tensorrt-edgellm-quantize draft`, and updates draft export invocation to the new `tensorrt-edgellm-export` structure. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 6

✅ Passed checks (6 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the primary change: updating TensorRT-Edge-LLM CLI commands in torch_onnx example documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Security Anti-Patterns | ✅ Passed | No security anti-patterns found. All Python changes lack torch.load weights_only=False, numpy.load allow_pickle=True, hardcoded trust_remote_code=True, unsafe eval/exec, or nosec comments. All depe... |

cjluo-nv

Bot review — DM the bot to share feedback.

Documentation-only PR (+23/-32, single file) updating examples/torch_onnx/README.md to match TensorRT-Edge-LLM v0.8.0's consolidated CLI. Verified the full README: the changes are internally consistent — the install-verify block, CLI Tools table, and all three examples (LLM, VLM, EAGLE) now uniformly use tensorrt-edgellm-quantize {llm,draft} subcommands, the unified tensorrt-edgellm-export with positional model/output_dir args, and --eagle-base. No stale references to the old -quantize-llm/-export-llm/-export-visual/-export-draft tools or --is_eagle_base remain. The PR body documents thorough verification against the live upstream main branch. No code, no tests needed (docs only), no licensing changes. No prompt-injection content in the diff. Straightforward and correct.

codecov · 2026-06-23T21:12:14Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.69%. Comparing base (c3b913b) to head (b039772).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1808      +/-   ##
==========================================
+ Coverage   62.88%   64.69%   +1.80%     
==========================================
  Files         511      511              
  Lines       56634    58285    +1651     
==========================================
+ Hits        35615    37705    +2090     
+ Misses      21019    20580     -439

| Flag | Coverage Δ |
| --- | --- |
| examples | `42.09% <ø> (+4.08%)` ⬆️ |
| unit | `54.65% <ø> (+0.01%)` ⬆️ |

github-actions · 2026-06-23T21:51:44Z

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-23 21:51 UTC

#1858 #1839 #1857 #1869 (#1880)

## Cherry-picked PRs

- #1801
- #1808
- #1629
- #1627
- #1824
- #1826
- #1830
- #1760
- #1831
- #1858
- #1839
- #1857
- #1869

#1839, #1857 and #1869 were back-ported (not a clean cherry-pick): the file was renamed `llm_ptq` -> `hf_ptq` (#1759) and surrounding `get_model` code diverged on `main`, but the actual fix targets the `init_empty_weights` / `from_config` block that already exists on the release branch. Accompanying unit tests were ported (15 passed).

## Summary by CodeRabbit

* **New Features**
  * Added a new PTQ recipe for NVFP4 MLP/MoE quantization with FP8 KV-cache calibration.
* **Bug Fixes**
  * Improved ONNX mixed-precision/FP16 conversion reliability with stricter type handling and better stale output-shape reconciliation.
  * Fixed quantization/export edge cases: MoE router/gate handling, FP8 calibration/reduction failures, and additional FP8/INT8 robustness during export.
  * Standardized Puzzletron validation split naming to `validation`.
* **Documentation**
  * Refreshed LM-Eval and TensorRT-Edge-LLM CLI instructions, including updated command names and examples.

---------

Signed-off-by: Meng Xin <mxin@nvidia.com>
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
Signed-off-by: Grzegorz Karch <gkarch@nvidia.com>
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Co-authored-by: mxinO <164952785+mxinO@users.noreply.github.com>
Co-authored-by: Ajinkya Rasane <131806219+ajrasane@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
Co-authored-by: Chenjie Luo <108829653+cjluo-nv@users.noreply.github.com>
Co-authored-by: Zhiyu <zhiyuc@nvidia.com>
Co-authored-by: Grzegorz K. Karch <grzegorz-k-karch@users.noreply.github.com>
Co-authored-by: Daniel Korzekwa <daniel.korzekwa@gmail.com>

ajrasane requested a review from a team as a code owner June 23, 2026 21:02

ajrasane requested a review from vishalpandya1990 June 23, 2026 21:02

ajrasane self-assigned this Jun 23, 2026

ajrasane added the cherry-pick-0.45.0 After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc label Jun 23, 2026

coderabbitai Bot approved these changes Jun 23, 2026

cjluo-nv approved these changes Jun 23, 2026

ajrasane enabled auto-merge (squash) June 23, 2026 21:07

ajrasane merged commit 1766d55 into main Jun 23, 2026
42 checks passed

ajrasane deleted the ajrasane/nvbug_6281412 branch June 23, 2026 21:51

meenchen mentioned this pull request Jun 26, 2026

Add AA-Omniscience eval recipe; harden judge/run conventions in the eval skill #1834

Merged

kevalmorabia97 mentioned this pull request Jul 1, 2026

[Cherry-pick] PRs #1801 #1808 #1629 #1627 #1824 #1826 #1830 #1760 #1831 #1858 #1839 #1857 #1869 #1880

Merged

kevalmorabia97 added the cherry-pick-done Added by bot once PR is cherry-picked to the release branch label Jul 1, 2026
