
Fix spec dec example tests#1183

Merged
kevalmorabia97 merged 4 commits into main from kmorabia/spec-dec-tests
Apr 7, 2026

Conversation

Collaborator

@kevalmorabia97 kevalmorabia97 commented Apr 6, 2026

What does this PR do?

Type of change: Test fix

  • Fix tests/examples/speculative_decoding, which were previously silently skipped
  • Avoid pulling nemotron-post-training-dataset-v2 in tests to reduce the chance of HF loading timeouts in CI/CD
  • Make slow and redundant tests manual-only to speed up CI/CD

Testing

  • Tests passing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: ✅
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: ✅
  • Did you update Changelog?: N/A

Summary by CodeRabbit

  • Chores

    • Removed git-LFS install step from CI and deleted an automated branch-cleanup workflow
    • Trimmed example environment dependencies and relaxed transformers compatibility; added an optional tokenization dependency
  • Tests

    • Switched tests to generate datasets dynamically and improved fixture handling
    • Standardized PTQ test parameters (explicit calibration dataset) and refined GPU/test selection
  • Bug Fixes

    • Improved device-awareness and numeric handling in speculative decoding attention paths

@coderabbitai
Contributor

coderabbitai Bot commented Apr 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 52a90650-75eb-418c-ac8b-96b9e3113c48

📥 Commits

Reviewing files that changed from the base of the PR and between 361b4a4 and 0c7f0ed.

📒 Files selected for processing (1)
  • modelopt/torch/speculative/plugins/transformers.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • modelopt/torch/speculative/plugins/transformers.py

📝 Walkthrough


Removed git-lfs from CI, adjusted example dependency lists, relaxed a transformers constraint, added tiktoken to hf extras, updated speculative/transformers plugin rope/cache/dtype handling, refactored PTQ test utilities and tests to use centralized command + explicit calibration dataset, and made speculative-decoding tests generate datasets at runtime.

Changes

  • CI workflow change: .github/workflows/_example_tests_runner.yml
    Removed installation of git-lfs from the example tests runner.
  • Workflow removal: .github/workflows/delete_outdated_pr_branches.yml
    Deleted the workflow that pruned remote pull-request/<num> branches.
  • Example requirements: examples/llm_eval/requirements.txt, examples/llm_ptq/requirements.txt, examples/speculative_decoding/requirements.txt
    Removed tiktoken from llm_eval and llm_ptq, removed torchvision from llm_eval, and changed transformers==5.0.0rc1 to transformers<5.4 in speculative_decoding.
  • Project extras: pyproject.toml
    Added tiktoken to the hf optional-dependencies extra.
  • Speculative/Transformers plugin: modelopt/torch/speculative/plugins/transformers.py
    Replaced the helper cache factory with direct DynamicCache(config=...); extended _maybe_init_rope(device=...) and updated callers to pass the device; rebuilt eagle_config from a mutable arch config and injected a top-level rope_theta into rope_scaling when missing; computes the TTT attention mask using an effective dtype resolved from the base config or layer weights.
  • PTQ test utilities: tests/_test_utils/examples/llm_ptq_utils.py
    Replaced the local subprocess invocation with run_llm_ptq_command(...); removed trust_remote_code flag handling; added calib_dataset: str = "cnn_dailymail" to PTQCommand; stopped forwarding max_sm; extracts quant separately; removed unreachable returns after pytest skips.
  • PTQ test update: tests/examples/llm_ptq/test_llm_ptq.py
    Now passes calib_dataset="peoples_speech" when constructing the Whisper PTQCommand.
  • Speculative decoding tests: tests/examples/speculative_decoding/conftest.py, tests/examples/speculative_decoding/test_eagle.py
    The fixture now generates a temporary dataset via YAML + make_dataset.py instead of using a static daring-anteater.jsonl; test_eagle.py imports AutoConfig directly, adds a num_gpus param to test_llama_eagle3 and uses it for GPU-skip logic, marks two remote-model params as manual, and uses cfg.text_config when present.
  • Export test tweak: tests/gpu/torch/export/test_unified_hf_export_and_check_safetensors.py
    Added an explicit dataset="cnn_dailymail" argument to the hf_ptq.py invocation.
  • Misc. tests: tests/...
    Various test parameter forwarding and dataset-handling adjustments to align with the centralized command and dataset-generation changes.
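The calib_dataset addition to PTQCommand might look like the sketch below; the fields other than calib_dataset and the argv layout are illustrative, not the actual test utility:

```python
from dataclasses import dataclass, field


@dataclass
class PTQCommand:
    model: str
    quant: str
    # Explicit calibration dataset, defaulting to cnn_dailymail so tests
    # no longer pull nemotron-post-training-dataset-v2 from the Hub.
    calib_dataset: str = "cnn_dailymail"
    extra_args: list[str] = field(default_factory=list)

    def to_argv(self) -> list[str]:
        """Build an hf_ptq.py-style argument list (flag names are assumed)."""
        return [
            "--model", self.model,
            "--quant", self.quant,
            "--dataset", self.calib_dataset,
            *self.extra_args,
        ]
```

A caller that needs audio calibration data, like the Whisper test, would then override the default with calib_dataset="peoples_speech".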

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 71.43%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (3)
  • Description Check ✅: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅: The title 'Fix spec dec example tests' directly aligns with the main change objective: fixing tests in tests/examples/speculative_decoding that were previously skipped.
  • Security Anti-Patterns ✅: The PR complies with all security coding practices outlined in SECURITY.md. Security-sensitive patterns appear only in test files, which are exempt from these rules. The new tiktoken dependency is MIT-licensed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
@kevalmorabia97 kevalmorabia97 force-pushed the kmorabia/spec-dec-tests branch from 2a82a07 to c3526bb on April 6, 2026 19:31
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
@github-actions
Contributor

github-actions Bot commented Apr 6, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-04-07 05:24 UTC

Comment thread modelopt/torch/speculative/plugins/transformers.py
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
Comment thread modelopt/torch/speculative/plugins/transformers.py Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
pyproject.toml (1)

85-85: Consider adding a version constraint for tiktoken.

While several dependencies in the hf group are unpinned (e.g., nltk, wonderwords), many critical ones have version constraints (e.g., transformers>=4.56,<5.0, peft>=0.17.0, sentencepiece>=0.2.1). Adding a minimum version constraint for tiktoken would help ensure compatibility and prevent potential issues with older versions.

📌 Example version constraint
-    "tiktoken",
+    "tiktoken>=0.5.0",

Note: The specific version should be chosen based on the minimum version required by your codebase.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` at line 85, The dependency entry "tiktoken" in the hf extras
of pyproject.toml is unpinned; update the hf extras to include a minimum version
constraint for tiktoken (for example "tiktoken>=X.Y.Z" or a range like
"tiktoken>=X.Y.Z,<NextMajor") so your code is protected from incompatible older
releases; modify the "tiktoken" entry in the hf extras list to the chosen
constraint string and run dependency resolution to verify compatibility with
existing constraints like transformers and peft.
tests/examples/speculative_decoding/test_eagle.py (2)

284-284: Same trust_remote_code=True pattern - consider documenting.

Same observation as line 234. An inline comment would clarify why this is needed for the Kimi checkpoint.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/examples/speculative_decoding/test_eagle.py` at line 284, The call to
AutoConfig.from_pretrained(checkpoint_dir, trust_remote_code=True) needs an
inline comment explaining why trust_remote_code=True is required for the Kimi
checkpoint; update the invocation site (the AutoConfig.from_pretrained call
where checkpoint_dir is used) to add a short comment like “# required for Kimi
checkpoint because model code is provided in the checkpoint and must be trusted”
(or similar) so readers understand the reason.

234-236: Hardcoded trust_remote_code=True - acceptable for test code but consider documenting.

Per coding guidelines, trust_remote_code=True should not be hardcoded. However, this is test code (excluded from Bandit checks) testing specific remote models (Kimi, MiniMax) that require remote code execution.

Consider adding an inline comment explaining why this is necessary:

Suggested documentation
+    # trust_remote_code=True required for moonshotai/MiniMaxAI models that use custom modeling code
     cfg = AutoConfig.from_pretrained(model_path, trust_remote_code=True)

As per coding guidelines: "Do not hardcode trust_remote_code=True when loading Hugging Face Transformers models."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/examples/speculative_decoding/test_eagle.py` around lines 234 - 236,
The test currently hardcodes trust_remote_code=True when calling
AutoConfig.from_pretrained(model_path, trust_remote_code=True) (assigning to
cfg) which violates the general guideline; update the test to keep
trust_remote_code=True but add a clear inline comment adjacent to the
AutoConfig.from_pretrained call explaining that this is test-only, that the
tests exercise remote-models (e.g., Kimi, MiniMax) which require remote code
execution, and that this file is excluded from Bandit checks—so leave the flag
as-is for these specific models and do not change runtime behavior elsewhere.
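One way to keep trust_remote_code=True both documented and contained, in the spirit of the comment above, is to gate it behind an explicit allowlist. The helper and the model IDs below are hypothetical illustrations, not code from this PR:

```python
# Models whose checkpoints ship custom modeling code that the tests must
# execute; anything not listed here loads with the safe default (False).
# These IDs are hypothetical examples, not the PR's actual test parameters.
REMOTE_CODE_ALLOWLIST = {
    "example-org/kimi-style-model",
    "example-org/minimax-style-model",
}


def needs_remote_code(model_id: str) -> bool:
    """Return True only for vetted checkpoints that require remote code."""
    return model_id in REMOTE_CODE_ALLOWLIST


def load_config(model_id: str):
    """Load an AutoConfig, enabling trust_remote_code only when vetted."""
    from transformers import AutoConfig  # deferred so the helper imports without transformers

    return AutoConfig.from_pretrained(
        model_id, trust_remote_code=needs_remote_code(model_id)
    )
```

This keeps the justification in one place instead of scattering bare trust_remote_code=True calls across test files.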
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelopt/torch/speculative/plugins/transformers.py`:
- Around line 750-753: The code assumes self._base_llm_config has attribute
dtype but standard HF configs use torch_dtype; update the dtype selection logic
used before computing dtypemin so it first checks getattr(self._base_llm_config,
"dtype", None) then getattr(self._base_llm_config, "torch_dtype", None) and only
then falls back to self.eagle_module.layers[0].input_layernorm.weight.dtype;
ensure the variable named dtype is set from that prioritized lookup so
torch.finfo(dtype).min remains valid for dtypemin computation.

In `@pyproject.toml`:
- Line 85: Remove the "tiktoken" dependency entry from pyproject.toml (the
current string "tiktoken") and ensure the dependency remains declared only in
the example-specific requirements files (e.g.,
examples/specdec_bench/requirements.txt and examples/llm_eval/requirements.txt);
update those requirements files if missing and then regenerate any
lock/installed environment artifacts as needed (e.g., update poetry lock or CI
deps) so core package modelopt has no direct tiktoken dependency.
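The dtype-resolution fix described in the first inline comment above can be sketched as a small helper. The function name is an assumption, and real HF configs may store torch_dtype as a string, which the actual plugin would need to normalize:

```python
import torch


def resolve_effective_dtype(base_config, fallback_weight: torch.Tensor) -> torch.dtype:
    """Prefer config.dtype, then config.torch_dtype, then a real weight's dtype."""
    dtype = getattr(base_config, "dtype", None)
    if dtype is None:
        dtype = getattr(base_config, "torch_dtype", None)
    if dtype is None:
        # e.g. self.eagle_module.layers[0].input_layernorm.weight.dtype
        dtype = fallback_weight.dtype
    # torch.finfo(dtype).min then stays valid for building the TTT attention mask
    return dtype
```

The mask fill value would then be computed as torch.finfo(resolve_effective_dtype(cfg, weight)).min rather than assuming a dtype attribute exists.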


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1e1939be-54e5-460a-a4cd-e7b373cc56ad

📥 Commits

Reviewing files that changed from the base of the PR and between 4a5ef01 and c3526bb.

📒 Files selected for processing (11)
  • .github/workflows/_example_tests_runner.yml
  • examples/llm_eval/requirements.txt
  • examples/llm_ptq/requirements.txt
  • examples/speculative_decoding/requirements.txt
  • modelopt/torch/speculative/plugins/transformers.py
  • pyproject.toml
  • tests/_test_utils/examples/llm_ptq_utils.py
  • tests/examples/llm_ptq/test_llm_ptq.py
  • tests/examples/speculative_decoding/conftest.py
  • tests/examples/speculative_decoding/test_eagle.py
  • tests/gpu/torch/export/test_unified_hf_export_and_check_safetensors.py
💤 Files with no reviewable changes (3)
  • examples/llm_ptq/requirements.txt
  • examples/llm_eval/requirements.txt
  • .github/workflows/_example_tests_runner.yml

Comment thread modelopt/torch/speculative/plugins/transformers.py
Comment thread pyproject.toml
Comment thread modelopt/torch/speculative/plugins/transformers.py
@kevalmorabia97 kevalmorabia97 requested a review from h-guo18 April 6, 2026 19:49
Collaborator

@shengliangxu shengliangxu left a comment


LGTM

@codecov

codecov Bot commented Apr 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.25%. Comparing base (df80a0f) to head (0c7f0ed).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1183      +/-   ##
==========================================
+ Coverage   74.77%   76.25%   +1.47%     
==========================================
  Files         351      351              
  Lines       40072    41891    +1819     
==========================================
+ Hits        29964    31943    +1979     
+ Misses      10108     9948     -160     
Flag       Coverage Δ
examples   45.20% <100.00%> (+4.98%) ⬆️
gpu        56.93% <6.25%> (-0.17%) ⬇️
unit       54.83% <75.00%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown.


Comment thread modelopt/torch/speculative/plugins/transformers.py Outdated
Co-authored-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
@kevalmorabia97 kevalmorabia97 force-pushed the kmorabia/spec-dec-tests branch from 361b4a4 to 0c7f0ed on April 7, 2026 03:22
@kevalmorabia97 kevalmorabia97 requested a review from h-guo18 April 7, 2026 03:22
@kevalmorabia97 kevalmorabia97 enabled auto-merge (squash) April 7, 2026 04:31
@kevalmorabia97 kevalmorabia97 merged commit 80d2f02 into main Apr 7, 2026
45 checks passed
@kevalmorabia97 kevalmorabia97 deleted the kmorabia/spec-dec-tests branch April 7, 2026 05:23