Example demonstrating how to train CosmosReason2 Eagle3 #965

Merged
h-guo18 merged 7 commits into NVIDIA:main from skierat:skierat/CR2_guide
Mar 5, 2026

Conversation

@skierat
Contributor

@skierat skierat commented Mar 3, 2026

What does this PR do?

Type of change: New example

Overview:
Adds examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb, a step-by-step Jupyter notebook that walks through the full EAGLE3 draft-head training workflow for nvidia/Cosmos-Reason2-8B. The notebook covers:

  1. Installing dependencies
  2. Authenticating with Hugging Face
  3. Preparing training data from the Nemotron-Post-Training-Dataset-v2 (chat split) using a curated row-selection mapping (guides/nemotron_mapping.csv)
  4. Inspecting the bundled EAGLE3 config (guides/CR2_eagle_config.json) tuned for Cosmos-Reason2 (YaRN RoPE, FlexAttention, reduced draft vocabulary)
  5. (Optional) Calibrating the draft vocabulary to 32k tokens for faster training and inference
  6. Launching training via launch_train.sh with FSDP2 multi-GPU support
  7. Exporting the checkpoint to HF format and serving with vLLM

Also includes guides/nemotron_mapping.csv and guides/CR2_eagle_config.json as companion files.
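For orientation, here is a rough sketch of the config fields named in item 4 above (reduced draft vocabulary, FlexAttention, YaRN RoPE). The key names follow the HF-style fields listed in this PR's change summary; the values are illustrative placeholders, not the actual tuned settings shipped in guides/CR2_eagle_config.json:

```json
{
  "draft_vocab_size": 32768,
  "attn_implementation": "flex_attention",
  "rope_theta": 1000000.0,
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "beta_fast": 32,
    "beta_slow": 1,
    "original_max_position_embeddings": 32768,
    "truncate": false
  }
}
```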

Usage

Open and run examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb cell by cell. After training, serve the exported checkpoint with:

vllm serve nvidia/Cosmos-Reason2-8B \
    --host 0.0.0.0 \
    --port 8000 \
    --speculative-model export/cosmos-reason2-8b-eagle3 \
    --num-speculative-tokens 3 \
    --dtype bfloat16

Testing

Tested end-to-end on 4x B100 GPUs. The exported checkpoint was validated with specdec_bench.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: No
  • Did you add or update any necessary documentation?: Yes (the notebook is self-documenting)
  • Did you update Changelog?: No

Additional Information

Cosmos-Reason2-8B requires at least one 80 GB GPU (H100/A100).

Summary by CodeRabbit

  • New Features

    • Added a speculative-decoding configuration covering the draft model and rotary/attention behavior.
    • Added an end-to-end training notebook demonstrating training, export, and deployment of a speculative-decoding draft head on Cosmos-Reason2.
    • Added a data-preparation tool to download, normalize, and convert Nemotron chat conversations into a standardized conversation format.
  • Documentation

    • Notebook documents environment setup, data prep, training/validation cadence, export, and deployment steps.

@skierat skierat requested a review from a team as a code owner March 3, 2026 12:23
@skierat skierat requested a review from h-guo18 March 3, 2026 12:23
@copy-pr-bot

copy-pr-bot bot commented Mar 3, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Mar 3, 2026

📝 Walkthrough

Walkthrough

Adds EAGLE3 speculative-decoding support for Cosmos-Reason2-8B: a model config JSON, a training notebook documenting end-to-end training/export/deployment, and a script to download and normalize Nemotron chat conversations into the repository's conversation format.

Changes

Cohort / File(s) Summary
EAGLE3 Configuration
examples/speculative_decoding/guides/CR2_eagle_config.json
New JSON config adding draft vocab and model hyperparameters, attention implementation set to flex_attention, RoPE/rope_scaling settings (beta_fast, beta_slow, factor, original_max_position_embeddings, rope_type, truncate) and rope_theta.
Training Notebook
examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb
New Jupyter notebook documenting dependency setup, HF auth, Nemotron data preparation and mapping, draft vocab calibration, training launch (FSDP2/multi-GPU notes), export to HF format, and vLLM deployment guidance.
Data Preparation Script
examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py
New CLI script to discover Nemotron parquet shard URLs on HF Hub, stream or load by index mapping, parse/normalize chat roles and content fields into standardized conversation messages, generate deterministic conversation IDs, and write output split files.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant HF_Hub as "Hugging Face Hub"
    participant Parser as "add_nemotron_chat.py"
    participant Trainer as "Training Launcher / Trainer"
    participant Export as "Checkpoint Export"
    participant vLLM as "vLLM Deployment"

    User->>HF_Hub: request Nemotron parquet shard URLs / authenticate
    HF_Hub-->>Parser: return parquet shard URLs
    User->>Parser: run parser (mapping or streaming mode)
    Parser-->>Parser: normalize roles/fields, build conversations, write dataset
    Parser->>Trainer: provide dataset + config + draft vocab
    Trainer->>Trainer: train EAGLE3 draft head
    Trainer->>Export: export HF-compatible checkpoint
    Export->>vLLM: deploy exported checkpoint
    User->>vLLM: run inference with speculative decoding

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 75.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (3 passed)

  • Description Check ✅ Passed: check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title clearly and specifically describes the main contribution: a new example demonstrating how to train EAGLE3 on Cosmos-Reason2.
  • Security Anti-Patterns ✅ Passed: the pull request adds three files with no security anti-patterns detected against SECURITY.md guidelines. The Python script safely uses np.fromfile() with explicit dtype specification.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (2)
examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb (2)

121-127: Avoid hard-coded expected row count in a reusable guide.

Line 127 hard-codes 89511; this breaks when the mapping file evolves. Derive expected count from guides/nemotron_mapping.csv instead.

🔧 Proposed fix
-# Verify the output: the file should contain exactly 89 511 conversations.
+# Verify the output against the current mapping file.
 data_path = "input_conversations/nemotron-chat.jsonl"
 
-with open(data_path) as f:
+with open(data_path, encoding="utf-8") as f:
     line_count = sum(1 for _ in f)
 
-assert line_count == 89511, f"Expected 89511 lines, got {line_count}"
+with open("guides/nemotron_mapping.csv", encoding="utf-8") as f:
+    expected_count = sum(1 for line in f if line.strip())
+
+assert line_count == expected_count, f"Expected {expected_count} lines, got {line_count}"
 print(f"✓ {line_count} conversations written to {data_path}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`
around lines 121 - 127, Replace the hard-coded expected row count check for
data_path ("input_conversations/nemotron-chat.jsonl") by computing the expected
number of conversations from the mapping CSV; open and read
"guides/nemotron_mapping.csv" (or use an existing loader) to count its rows
(e.g., excluding header) and store that as expected_count, then assert
line_count == expected_count using that computed value so the guide remains
correct when the mapping file changes; update the variables and the assert near
the line_count calculation to reference expected_count instead of 89511.

45-46: Pin training dependencies in requirements.txt instead of using -U.

Line 45 uses -U, which pulls the latest nvidia-modelopt version each time. For reproducibility in a guide, either remove -U and pin the exact version in requirements.txt, or ensure requirements.txt includes all pinned dependencies—currently it only specifies transformers==5.0.0rc1.
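A pinned requirements file along the lines this comment suggests might look like the sketch below; the nvidia-modelopt version is a placeholder to be replaced with the exact tested release, since only transformers==5.0.0rc1 is pinned in the repo today:

```text
transformers==5.0.0rc1
# Pin to the exact release the notebook was tested against (placeholder):
nvidia-modelopt[hf]==<tested-version>
```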

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`
around lines 45 - 46, The notebook currently runs "!pip install -U
nvidia-modelopt[hf]" which forces the latest nvidia-modelopt and breaks
reproducibility; replace this by removing the -U flag in the notebook and pin
the nvidia-modelopt[hf] version (and any other runtime deps) inside
requirements.txt (which currently only pins transformers==5.0.0rc1) so the
notebook installs from the pinned requirements; update the "!pip install -r
../requirements.txt" usage to be the single source of truth and ensure
requirements.txt contains exact versions for nvidia-modelopt[hf] and any other
training/runtime packages.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`:
- Around line 85-86: Update the step wording to match the executed mode: replace
the claim that the script “streams only the required parquet shards” with a
conditional description that when run normally it streams shards but when
invoked with the --mapping-file option it uses the non-streaming random-access
path and writes the conversation jsonl for launch_train.sh; reference the
notebook cell text around the current explanation and mention the --mapping-file
flag and launch_train.sh so readers understand the different behaviors.

In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`:
- Around line 130-133: Guard against non-string msg fields before calling
.lower() or treating as text: check that msg is a dict and that msg.get("from")
or msg.get("role") is an instance of str before assigning role =
msg["from"].lower() / role = msg["role"].lower(); if not, set a safe fallback
(e.g., role = "unknown") or skip the message and log a warning. Do the same for
message content used later (ensure msg.get("content") is str before using it,
otherwise coerce to str or skip with a warning). Apply these checks around the
role and content handling blocks (the code that assigns role and later reads
content) so malformed rows (None, list, object) don't raise exceptions and stop
processing.
- Around line 92-97: The help text for the --mapping-file option claims CSV
input but the parser treats each line as a plain integer (calls int(...) on
whole lines), which breaks on header/column CSVs; update the parsing logic in
add_nemotron_chat (the --mapping-file handling code) to robustly read CSVs (use
csv.reader or split on commas), skip empty lines and an optional header, and
extract the integer index from the first column (stripping whitespace) before
converting to int so both "one-integer-per-line" files and standard CSVs work;
alternatively, if you prefer not to accept CSVs, change the --mapping-file help
text to say "plain text with one 0-based integer index per line" to match the
current int(...) parsing.
- Around line 55-67: The filtered parquet shard list (split_files) can be
returned in nondeterministic order because list_repo_files/list_repo_tree is
unordered; sort split_files before converting to hub URLs so mapping mode loads
shards deterministically. Update the logic around split_files in the function
that builds the shard URLs (use sorted(split_files, key=Path(f).name) or
similar) prior to calling hf_hub_url for each file so hf_hub_url(dataset_id, f,
repo_type="dataset") is produced in a stable, filename-ordered sequence.

---

Nitpick comments:
In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`:
- Around line 121-127: Replace the hard-coded expected row count check for
data_path ("input_conversations/nemotron-chat.jsonl") by computing the expected
number of conversations from the mapping CSV; open and read
"guides/nemotron_mapping.csv" (or use an existing loader) to count its rows
(e.g., excluding header) and store that as expected_count, then assert
line_count == expected_count using that computed value so the guide remains
correct when the mapping file changes; update the variables and the assert near
the line_count calculation to reference expected_count instead of 89511.
- Around line 45-46: The notebook currently runs "!pip install -U
nvidia-modelopt[hf]" which forces the latest nvidia-modelopt and breaks
reproducibility; replace this by removing the -U flag in the notebook and pin
the nvidia-modelopt[hf] version (and any other runtime deps) inside
requirements.txt (which currently only pins transformers==5.0.0rc1) so the
notebook installs from the pinned requirements; update the "!pip install -r
../requirements.txt" usage to be the single source of truth and ensure
requirements.txt contains exact versions for nvidia-modelopt[hf] and any other
training/runtime packages.
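The CSV-tolerant index parsing suggested for --mapping-file above could be sketched like this; the function name and exact behavior are illustrative, not the script's actual code:

```python
import csv
import io


def read_row_indices(text: str) -> list[int]:
    """Parse 0-based dataset row indices from either one-integer-per-line
    text or a CSV whose first column holds the index; blank lines and a
    non-numeric header row are skipped."""
    indices = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # blank line
        first = row[0].strip()
        if not first:
            continue
        try:
            indices.append(int(first))
        except ValueError:
            continue  # header or other non-numeric row
    return indices
```

This accepts both the plain "one index per line" format the parser currently expects and a standard CSV with a header, resolving the mismatch between the help text and the parsing logic.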

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 860d0b4 and 5fb37f7.

⛔ Files ignored due to path filters (1)
  • examples/speculative_decoding/guides/nemotron_mapping.csv is excluded by !**/*.csv
📒 Files selected for processing (3)
  • examples/speculative_decoding/guides/CR2_eagle_config.json
  • examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb
  • examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py

Comment on lines +85 to +86
"The script streams only the required parquet shards and writes a conversation file in the\n",
"standard `jsonl` format expected by `launch_train.sh`."

⚠️ Potential issue | 🟡 Minor

Step 3 wording mismatches the executed mode.

Lines 85–86 say the script “streams,” but Line 110 passes --mapping-file, which triggers the non-streaming random-access path in the script.

Also applies to: 110-112

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`
around lines 85 - 86, Update the step wording to match the executed mode:
replace the claim that the script “streams only the required parquet shards”
with a conditional description that when run normally it streams shards but when
invoked with the --mapping-file option it uses the non-streaming random-access
path and writes the conversation jsonl for launch_train.sh; reference the
notebook cell text around the current explanation and mention the --mapping-file
flag and launch_train.sh so readers understand the different behaviors.

Comment on lines +130 to +133
if "from" in msg:
role = msg["from"].lower()
elif "role" in msg:
role = msg["role"].lower()

⚠️ Potential issue | 🟠 Major

Guard role/content type assumptions to prevent run-wide failure.

Lines 131/133 and Line 159 assume string values. A single malformed row (None, list, object) will raise and stop processing.

🔧 Proposed fix
-        if "from" in msg:
-            role = msg["from"].lower()
-        elif "role" in msg:
-            role = msg["role"].lower()
+        if "from" in msg:
+            raw_role = msg["from"]
+        elif "role" in msg:
+            raw_role = msg["role"]
         else:
             continue
+        if not isinstance(raw_role, str):
+            continue
+        role = raw_role.lower()
@@
-        content = content.strip()
+        if not isinstance(content, str):
+            continue
+        content = content.strip()

Also applies to: 150-160

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`
around lines 130 - 133, Guard against non-string msg fields before calling
.lower() or treating as text: check that msg is a dict and that msg.get("from")
or msg.get("role") is an instance of str before assigning role =
msg["from"].lower() / role = msg["role"].lower(); if not, set a safe fallback
(e.g., role = "unknown") or skip the message and log a warning. Do the same for
message content used later (ensure msg.get("content") is str before using it,
otherwise coerce to str or skip with a warning). Apply these checks around the
role and content handling blocks (the code that assigns role and later reads
content) so malformed rows (None, list, object) don't raise exceptions and stop
processing.

Signed-off-by: Slawek Kierat <skierat@nvidia.com>
@skierat skierat force-pushed the skierat/CR2_guide branch from 5fb37f7 to a1b7c7a Compare March 3, 2026 12:49
Signed-off-by: Slawek Kierat <skierat@nvidia.com>

@h-guo18 h-guo18 left a comment


lgtm

@h-guo18 h-guo18 enabled auto-merge (squash) March 3, 2026 23:16
@codecov

codecov bot commented Mar 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.12%. Comparing base (e8f9687) to head (717948f).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #965   +/-   ##
=======================================
  Coverage   72.12%   72.12%           
=======================================
  Files         209      209           
  Lines       23628    23628           
=======================================
  Hits        17042    17042           
  Misses       6586     6586           

☔ View full report in Codecov by Sentry.

Signed-off-by: Slawek Kierat <skierat@nvidia.com>
auto-merge was automatically disabled March 4, 2026 08:52

Head branch was pushed to by a user without write access


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py (2)

90-93: ⚠️ Potential issue | 🟡 Minor

--mapping-file help text does not match the parser contract.

The help says CSV, but parsing expects one integer token per line. Either parse CSV first-column values or update wording to plain text indices.

🔧 Minimal wording fix
-            "Path to a CSV file containing one 0-based dataset row index per line, "
+            "Path to a text file containing one 0-based dataset row index per line, "

Also applies to: 179-182

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`
around lines 90 - 93, The help text for the --mapping-file option is misleading
(it says CSV) while the parser expects one integer index per line; update the
help string(s) referencing "--mapping-file" (both occurrences around the
existing help text blocks) to clearly state "Path to a plain text file
containing one 0-based dataset row index per line, e.g.:\n  429801\n 
271023\nRows are loaded in the order they appear" so the wording matches the
actual parsing logic (or alternatively adjust the parser to accept CSV
first-column values if you prefer CSV behavior).

126-156: ⚠️ Potential issue | 🟠 Major

Harden parsing against malformed message objects/content types.

A non-dict msg or non-string content will still raise (msg.get, content.strip) and can stop the whole run. Guard both before access.

🔧 Proposed fix
 def parse_nemotron_conversation(raw_conversations: list) -> list[dict] | None:
@@
-    msgs = []
-    for msg in raw_conversations:
+    if not isinstance(raw_conversations, list):
+        return None
+
+    msgs = []
+    for msg in raw_conversations:
+        if not isinstance(msg, dict):
+            continue
@@
-        content = content.strip()
+        if not isinstance(content, str):
+            continue
+        content = content.strip()
         if content:
             msgs.append({"role": role, "content": content})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`
around lines 126 - 156, The loop over raw_conversations assumes each msg is a
dict and that content is a string; add guards so malformed messages don't crash
the run: at the start of the loop check isinstance(msg, dict) and continue if
not, then keep using msg.get("from")/msg.get("role") as before; after resolving
content (from "value"/"content"/"text") verify isinstance(content, str) (or
coerce safe-to-string types if desired) before calling content.strip(), and
continue for non-string/None content. Reference: raw_conversations loop,
variables msg, raw_role, role, and content.
examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb (1)

79-80: ⚠️ Potential issue | 🟡 Minor

Step 3 text mismatches the executed mode.

This cell uses --mapping-file, which triggers non-streaming random-access mode in the script, not streaming mode.

🔧 Proposed wording update
-        "The script streams only the required parquet shards and writes a conversation file in the\n",
-        "standard `jsonl` format expected by `launch_train.sh`."
+        "Without `--mapping-file`, the script streams chat parquet shards.\n",
+        "With `--mapping-file` (used below), it loads chat shards in non-streaming mode for index-based selection,\n",
+        "then writes the standard `jsonl` file expected by `launch_train.sh`."

Also applies to: 90-93

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`
around lines 79 - 80, The Step 3 description incorrectly states streaming
behavior while the command uses the --mapping-file flag which enables
non-streaming random-access mode; update the text that mentions "streams only
the required parquet shards" and the expected format to instead state that using
--mapping-file triggers non-streaming random-access access and still writes the
conversation file in the jsonl format expected by launch_train.sh; apply the
same wording fix to the corresponding occurrences around the later block (lines
90-93) so both descriptions match the actual behavior when --mapping-file is
used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`:
- Around line 104-106: The echo message prints a different path than the one
being counted, which is misleading; update the echo to reference the same path
used by the wc command (i.e., change echo \"${count} conversations in
../input_conversations/nemotron-chat.jsonl\" to echo \"${count} conversations in
input_conversations/nemotron-chat.jsonl\"), or refactor to assign the path to a
variable (e.g., file=input_conversations/nemotron-chat.jsonl) and use that
variable in both the count= and echo lines so they remain consistent.
- Around line 217-223: The export commands use paths that are relative to the
repo root but the notebook runs from guides/, so update the CKPT_DIR,
EXPORT_PATH and the script invocation to use correct relative paths (e.g.,
prefix with ../ or change to absolute paths) so that
scripts/export_hf_checkpoint.py and ckpts/... resolve properly when executed
from the notebook; locate the block setting CKPT_DIR, EXPORT_PATH and the python
call (the lines referencing CKPT_DIR, EXPORT_PATH and
scripts/export_hf_checkpoint.py) and change their paths accordingly or add a
preliminary cd to the repository root before invoking the script.

---

Duplicate comments:
In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`:
- Around line 79-80: The Step 3 description incorrectly states streaming
behavior while the command uses the --mapping-file flag which enables
non-streaming random-access mode; update the text that mentions "streams only
the required parquet shards" and the expected format to instead state that using
--mapping-file triggers non-streaming random-access access and still writes the
conversation file in the jsonl format expected by launch_train.sh; apply the
same wording fix to the corresponding occurrences around the later block (lines
90-93) so both descriptions match the actual behavior when --mapping-file is
used.

In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`:
- Around line 90-93: The help text for the --mapping-file option is misleading
(it says CSV) while the parser expects one integer index per line; update the
help string(s) referencing "--mapping-file" (both occurrences around the
existing help text blocks) to clearly state "Path to a plain text file
containing one 0-based dataset row index per line, e.g.:\n  429801\n 
271023\nRows are loaded in the order they appear" so the wording matches the
actual parsing logic (or alternatively adjust the parser to accept CSV
first-column values if you prefer CSV behavior).
- Around line 126-156: The loop over raw_conversations assumes each msg is a
dict and that content is a string; add guards so malformed messages don't crash
the run: at the start of the loop check isinstance(msg, dict) and continue if
not, then keep using msg.get("from")/msg.get("role") as before; after resolving
content (from "value"/"content"/"text") verify isinstance(content, str) (or
coerce safe-to-string types if desired) before calling content.strip(), and
continue for non-string/None content. Reference: raw_conversations loop,
variables msg, raw_role, role, and content.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4c24b250-19f8-4133-a8a6-451046fb95d8

📥 Commits

Reviewing files that changed from the base of the PR and between 5fb37f7 and 5ff0638.

⛔ Files ignored due to path filters (1)
  • examples/speculative_decoding/guides/nemotron_mapping.csv is excluded by !**/*.csv
📒 Files selected for processing (3)
  • examples/speculative_decoding/guides/CR2_eagle_config.json
  • examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb
  • examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/speculative_decoding/guides/CR2_eagle_config.json

Comment on lines +217 to +223
"%%bash\n",
"CKPT_DIR=ckpts/cosmos-reason2-8b-eagle3/checkpoint-110000\n",
"EXPORT_PATH=export/cosmos-reason2-8b-eagle3\n",
"\n",
"python scripts/export_hf_checkpoint.py \\\n",
" --model_path $CKPT_DIR \\\n",
" --export_path $EXPORT_PATH"

⚠️ Potential issue | 🟠 Major

Export command paths are inconsistent with notebook working directory.

From guides/, scripts/export_hf_checkpoint.py and ckpts/... resolve incorrectly. This step likely fails unless you cd .. or prefix with ../.

🔧 Proposed fix
         "%%bash\n",
-        "CKPT_DIR=ckpts/cosmos-reason2-8b-eagle3/checkpoint-110000\n",
-        "EXPORT_PATH=export/cosmos-reason2-8b-eagle3\n",
+        "cd ..\n",
+        "CKPT_DIR=ckpts/cosmos-reason2-8b-eagle3/checkpoint-110000\n",
+        "EXPORT_PATH=export/cosmos-reason2-8b-eagle3\n",
         "\n",
-        "python scripts/export_hf_checkpoint.py \\\n",
+        "python scripts/export_hf_checkpoint.py \\\n",
         "    --model_path $CKPT_DIR \\\n",
         "    --export_path $EXPORT_PATH"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`
around lines 217 - 223, The export commands use paths that are relative to the
repo root but the notebook runs from guides/, so update the CKPT_DIR,
EXPORT_PATH and the script invocation to use correct relative paths (e.g.,
prefix with ../ or change to absolute paths) so that
scripts/export_hf_checkpoint.py and ckpts/... resolve properly when executed
from the notebook; locate the block setting CKPT_DIR, EXPORT_PATH and the python
call (the lines referencing CKPT_DIR, EXPORT_PATH and
scripts/export_hf_checkpoint.py) and change their paths accordingly or add a
preliminary cd to the repository root before invoking the script.

Signed-off-by: Slawek Kierat <skierat@nvidia.com>
auto-merge was automatically disabled March 5, 2026 10:35

Head branch was pushed to by a user without write access


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py (1)

127-157: ⚠️ Potential issue | 🟠 Major

Guard message/content types before .get() and .strip() to avoid run-wide crashes.

Line 129 assumes each raw_conversations element is a dict, and Line 156 assumes content is a string. A malformed row can still raise and abort the full processing run.

🔧 Proposed fix
 def parse_nemotron_conversation(raw_conversations: list) -> list[dict] | None:
@@
-    msgs = []
+    if not isinstance(raw_conversations, list):
+        return None
+
+    msgs = []
     for msg in raw_conversations:
+        if not isinstance(msg, dict):
+            continue
         # Resolve role field (datasets may use "from" or "role")
         raw_role = msg.get("from") or msg.get("role")
         if not isinstance(raw_role, str):
             continue
@@
-        content = content.strip()
+        if not isinstance(content, str):
+            continue
+        content = content.strip()
         if content:
             msgs.append({"role": role, "content": content})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`
around lines 127 - 157, The loop over raw_conversations assumes each msg is a
dict and content is a str; add safeguards in the for msg in raw_conversations
loop to skip non-dict messages (check isinstance(msg, dict)) before calling
msg.get or indexing, validate raw_role is a str before .lower() (you already
check this but ensure raw_role retrieval only after confirming msg is dict), and
ensure the resolved content is a string before calling .strip() (use
isinstance(content, str) and skip or coerce otherwise). Update the content
resolution logic in add_nemotron_chat.py (the raw_conversations loop, raw_role
and content branches) so malformed rows are skipped early rather than causing
exceptions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`:
- Around line 127-157: The loop over raw_conversations assumes each msg is a
dict and content is a str; add safeguards in the for msg in raw_conversations
loop to skip non-dict messages (check isinstance(msg, dict)) before calling
msg.get or indexing, validate raw_role is a str before .lower() (you already
check this but ensure raw_role retrieval only after confirming msg is dict), and
ensure the resolved content is a string before calling .strip() (use
isinstance(content, str) and skip or coerce otherwise). Update the content
resolution logic in add_nemotron_chat.py (the raw_conversations loop, raw_role
and content branches) so malformed rows are skipped early rather than causing
exceptions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fb944350-eec1-4b6f-828e-290e735daac9

📥 Commits

Reviewing files that changed from the base of the PR and between 1f8944f and 717948f.

⛔ Files ignored due to path filters (1)
  • examples/speculative_decoding/guides/nemotron_mapping.bin is excluded by !**/*.bin
📒 Files selected for processing (2)
  • examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb
  • examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb

@h-guo18 h-guo18 enabled auto-merge (squash) March 5, 2026 21:16
@h-guo18 h-guo18 merged commit dd16a96 into NVIDIA:main Mar 5, 2026
38 checks passed