example demonstrating how to train CosmosReason2 Eagle3 #965
h-guo18 merged 7 commits into NVIDIA:main from
Conversation
📝 Walkthrough

Adds EAGLE3 speculative-decoding support for Cosmos-Reason2-8B: a model config JSON, a training notebook documenting end-to-end training/export/deployment, and a script to download and normalize Nemotron chat conversations into the repository's conversation format.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant HF_Hub as "Hugging Face Hub"
    participant Parser as "add_nemotron_chat.py"
    participant Trainer as "Training Launcher / Trainer"
    participant Export as "Checkpoint Export"
    participant vLLM as "vLLM Deployment"
    User->>HF_Hub: request Nemotron parquet shard URLs / authenticate
    HF_Hub-->>Parser: return parquet shard URLs
    User->>Parser: run parser (mapping or streaming mode)
    Parser-->>Parser: normalize roles/fields, build conversations, write dataset
    Parser->>Trainer: provide dataset + config + draft vocab
    Trainer->>Trainer: train EAGLE3 draft head
    Trainer->>Export: export HF-compatible checkpoint
    Export->>vLLM: deploy exported checkpoint
    User->>vLLM: run inference with speculative decoding
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ❌ Failed checks (1 warning) | ✅ Passed checks (3 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 4
🧹 Nitpick comments (2)
examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb (2)
121-127: Avoid hard-coded expected row count in a reusable guide.

Line 127 hard-codes 89511; this breaks when the mapping file evolves. Derive the expected count from `guides/nemotron_mapping.csv` instead.

🔧 Proposed fix

```diff
-# Verify the output: the file should contain exactly 89 511 conversations.
+# Verify the output against the current mapping file.
 data_path = "input_conversations/nemotron-chat.jsonl"
-with open(data_path) as f:
+with open(data_path, encoding="utf-8") as f:
     line_count = sum(1 for _ in f)
-assert line_count == 89511, f"Expected 89511 lines, got {line_count}"
+with open("guides/nemotron_mapping.csv", encoding="utf-8") as f:
+    expected_count = sum(1 for line in f if line.strip())
+
+assert line_count == expected_count, f"Expected {expected_count} lines, got {line_count}"
 print(f"✓ {line_count} conversations written to {data_path}")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb` around lines 121 - 127, Replace the hard-coded expected row count check for data_path ("input_conversations/nemotron-chat.jsonl") by computing the expected number of conversations from the mapping CSV; open and read "guides/nemotron_mapping.csv" (or use an existing loader) to count its rows (e.g., excluding header) and store that as expected_count, then assert line_count == expected_count using that computed value so the guide remains correct when the mapping file changes; update the variables and the assert near the line_count calculation to reference expected_count instead of 89511.
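The suggested derivation can be sketched as follows. The in-memory strings below stand in for the two files; a real notebook cell would open `guides/nemotron_mapping.csv` and the generated jsonl from disk instead:

```python
import csv
import io

# Hypothetical file contents for illustration only.
mapping_csv = "429801\n271023\n38412\n"
jsonl_text = '{"id": 0}\n{"id": 1}\n{"id": 2}\n'

# Count non-empty mapping rows (first column holds the 0-based dataset index).
expected_count = sum(1 for row in csv.reader(io.StringIO(mapping_csv)) if row)

# Count conversations actually written.
line_count = sum(1 for _ in io.StringIO(jsonl_text))

assert line_count == expected_count, f"Expected {expected_count}, got {line_count}"
print(f"{line_count} conversations match the mapping file")
```

The check now stays correct whenever the mapping file grows or shrinks, which is the point of the review comment.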
45-46: Pin training dependencies in requirements.txt instead of using `-U`.

Line 45 uses `-U`, which pulls the latest nvidia-modelopt version each time. For reproducibility in a guide, either remove `-U` and pin the exact version in `requirements.txt`, or ensure `requirements.txt` includes all pinned dependencies; currently it only specifies `transformers==5.0.0rc1`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb` around lines 45 - 46, The notebook currently runs "!pip install -U nvidia-modelopt[hf]" which forces the latest nvidia-modelopt and breaks reproducibility; replace this by removing the -U flag in the notebook and pin the nvidia-modelopt[hf] version (and any other runtime deps) inside requirements.txt (which currently only pins transformers==5.0.0rc1) so the notebook installs from the pinned requirements; update the "!pip install -r ../requirements.txt" usage to be the single source of truth and ensure requirements.txt contains exact versions for nvidia-modelopt[hf] and any other training/runtime packages.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`:
- Around line 85-86: Update the step wording to match the executed mode: replace
the claim that the script “streams only the required parquet shards” with a
conditional description that when run normally it streams shards but when
invoked with the --mapping-file option it uses the non-streaming random-access
path and writes the conversation jsonl for launch_train.sh; reference the
notebook cell text around the current explanation and mention the --mapping-file
flag and launch_train.sh so readers understand the different behaviors.
In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`:
- Around line 130-133: Guard against non-string msg fields before calling
.lower() or treating as text: check that msg is a dict and that msg.get("from")
or msg.get("role") is an instance of str before assigning role =
msg["from"].lower() / role = msg["role"].lower(); if not, set a safe fallback
(e.g., role = "unknown") or skip the message and log a warning. Do the same for
message content used later (ensure msg.get("content") is str before using it,
otherwise coerce to str or skip with a warning). Apply these checks around the
role and content handling blocks (the code that assigns role and later reads
content) so malformed rows (None, list, object) don't raise exceptions and stop
processing.
- Around line 92-97: The help text for the --mapping-file option claims CSV
input but the parser treats each line as a plain integer (calls int(...) on
whole lines), which breaks on header/column CSVs; update the parsing logic in
add_nemotron_chat (the --mapping-file handling code) to robustly read CSVs (use
csv.reader or split on commas), skip empty lines and an optional header, and
extract the integer index from the first column (stripping whitespace) before
converting to int so both "one-integer-per-line" files and standard CSVs work;
alternatively, if you prefer not to accept CSVs, change the --mapping-file help
text to say "plain text with one 0-based integer index per line" to match the
current int(...) parsing.
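One way to satisfy both accepted input shapes is a small tolerant reader. The function name and header handling below are illustrative, not the script's actual API:

```python
import csv
import io

def read_mapping_indices(text: str) -> list[int]:
    """Parse 0-based row indices from either a plain-text file (one integer
    per line) or a simple CSV (index in the first column). Blank lines and a
    single non-numeric header row are skipped."""
    indices = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        first = row[0].strip()
        if not first:
            continue
        try:
            indices.append(int(first))
        except ValueError:
            # Tolerate a header row such as "row_index".
            continue
    return indices

# Both shapes the help text could describe now work:
assert read_mapping_indices("429801\n271023\n") == [429801, 271023]
assert read_mapping_indices("row_index,split\n429801,chat\n271023,chat\n") == [429801, 271023]
```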
- Around line 55-67: The filtered parquet shard list (split_files) can be
returned in nondeterministic order because list_repo_files/list_repo_tree is
unordered; sort split_files before converting to hub URLs so mapping mode loads
shards deterministically. Update the logic around split_files in the function
that builds the shard URLs (use sorted(split_files, key=Path(f).name) or
similar) prior to calling hf_hub_url for each file so hf_hub_url(dataset_id, f,
repo_type="dataset") is produced in a stable, filename-ordered sequence.
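A minimal sketch of the deterministic ordering, using made-up shard names in place of the real repository listing:

```python
from pathlib import Path

# Unordered listing as it might come back from list_repo_files
# (filenames are illustrative; the real dataset's shards may differ).
split_files = [
    "data/chat/train-00002-of-00003.parquet",
    "data/chat/train-00000-of-00003.parquet",
    "data/chat/train-00001-of-00003.parquet",
]

# Sort by filename so mapping mode always loads shards in the same order.
ordered = sorted(split_files, key=lambda f: Path(f).name)

assert [Path(f).name for f in ordered] == [
    "train-00000-of-00003.parquet",
    "train-00001-of-00003.parquet",
    "train-00002-of-00003.parquet",
]
```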
---
Nitpick comments:
In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`:
- Around line 121-127: Replace the hard-coded expected row count check for
data_path ("input_conversations/nemotron-chat.jsonl") by computing the expected
number of conversations from the mapping CSV; open and read
"guides/nemotron_mapping.csv" (or use an existing loader) to count its rows
(e.g., excluding header) and store that as expected_count, then assert
line_count == expected_count using that computed value so the guide remains
correct when the mapping file changes; update the variables and the assert near
the line_count calculation to reference expected_count instead of 89511.
- Around line 45-46: The notebook currently runs "!pip install -U
nvidia-modelopt[hf]" which forces the latest nvidia-modelopt and breaks
reproducibility; replace this by removing the -U flag in the notebook and pin
the nvidia-modelopt[hf] version (and any other runtime deps) inside
requirements.txt (which currently only pins transformers==5.0.0rc1) so the
notebook installs from the pinned requirements; update the "!pip install -r
../requirements.txt" usage to be the single source of truth and ensure
requirements.txt contains exact versions for nvidia-modelopt[hf] and any other
training/runtime packages.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
examples/speculative_decoding/guides/nemotron_mapping.csv is excluded by !**/*.csv
📒 Files selected for processing (3)
- examples/speculative_decoding/guides/CR2_eagle_config.json
- examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb
- examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py
| "The script streams only the required parquet shards and writes a conversation file in the\n", | ||
| "standard `jsonl` format expected by `launch_train.sh`." |
Step 3 wording mismatches the executed mode.
Lines 85–86 say the script “streams,” but Line 110 passes --mapping-file, which triggers the non-streaming random-access path in the script.
Also applies to: 110-112
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`
around lines 85 - 86, Update the step wording to match the executed mode:
replace the claim that the script “streams only the required parquet shards”
with a conditional description that when run normally it streams shards but when
invoked with the --mapping-file option it uses the non-streaming random-access
path and writes the conversation jsonl for launch_train.sh; reference the
notebook cell text around the current explanation and mention the --mapping-file
flag and launch_train.sh so readers understand the different behaviors.
| if "from" in msg: | ||
| role = msg["from"].lower() | ||
| elif "role" in msg: | ||
| role = msg["role"].lower() |
Guard role/content type assumptions to prevent run-wide failure.
Lines 131/133 and Line 159 assume string values. A single malformed row (None, list, object) will raise and stop processing.
🔧 Proposed fix

```diff
-        if "from" in msg:
-            role = msg["from"].lower()
-        elif "role" in msg:
-            role = msg["role"].lower()
+        if "from" in msg:
+            raw_role = msg["from"]
+        elif "role" in msg:
+            raw_role = msg["role"]
         else:
             continue
+        if not isinstance(raw_role, str):
+            continue
+        role = raw_role.lower()
@@
-        content = content.strip()
+        if not isinstance(content, str):
+            continue
+        content = content.strip()
```

Also applies to: 150-160
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`
around lines 130 - 133, Guard against non-string msg fields before calling
.lower() or treating as text: check that msg is a dict and that msg.get("from")
or msg.get("role") is an instance of str before assigning role =
msg["from"].lower() / role = msg["role"].lower(); if not, set a safe fallback
(e.g., role = "unknown") or skip the message and log a warning. Do the same for
message content used later (ensure msg.get("content") is str before using it,
otherwise coerce to str or skip with a warning). Apply these checks around the
role and content handling blocks (the code that assigns role and later reads
content) so malformed rows (None, list, object) don't raise exceptions and stop
processing.
Signed-off-by: Slawek Kierat <skierat@nvidia.com>
Force-pushed 5fb37f7 to a1b7c7a
Signed-off-by: Slawek Kierat <skierat@nvidia.com>
Codecov Report: ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main     #965   +/-   ##
=======================================
  Coverage   72.12%   72.12%
=======================================
  Files         209      209
  Lines       23628    23628
=======================================
  Hits        17042    17042
  Misses       6586     6586
```

☔ View full report in Codecov by Sentry.
Signed-off-by: Slawek Kierat <skierat@nvidia.com>
Head branch was pushed to by a user without write access
Actionable comments posted: 2
♻️ Duplicate comments (3)
examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py (2)
90-93: ⚠️ Potential issue | 🟡 Minor

`--mapping-file` help text does not match the parser contract. The help says CSV, but parsing expects one integer token per line. Either parse CSV first-column values or update the wording to plain-text indices.

🔧 Minimal wording fix

```diff
-    "Path to a CSV file containing one 0-based dataset row index per line, "
+    "Path to a text file containing one 0-based dataset row index per line, "
```

Also applies to: 179-182
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py` around lines 90 - 93, The help text for the --mapping-file option is misleading (it says CSV) while the parser expects one integer index per line; update the help string(s) referencing "--mapping-file" (both occurrences around the existing help text blocks) to clearly state "Path to a plain text file containing one 0-based dataset row index per line, e.g.:\n 429801\n 271023\nRows are loaded in the order they appear" so the wording matches the actual parsing logic (or alternatively adjust the parser to accept CSV first-column values if you prefer CSV behavior).
126-156: ⚠️ Potential issue | 🟠 Major

Harden parsing against malformed message objects/content types.

A non-dict `msg` or non-string `content` will still raise (`msg.get`, `content.strip`) and can stop the whole run. Guard both before access.

🔧 Proposed fix

```diff
 def parse_nemotron_conversation(raw_conversations: list) -> list[dict] | None:
@@
-    msgs = []
-    for msg in raw_conversations:
+    if not isinstance(raw_conversations, list):
+        return None
+
+    msgs = []
+    for msg in raw_conversations:
+        if not isinstance(msg, dict):
+            continue
@@
-        content = content.strip()
+        if not isinstance(content, str):
+            continue
+        content = content.strip()
         if content:
             msgs.append({"role": role, "content": content})
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py` around lines 126 - 156, The loop over raw_conversations assumes each msg is a dict and that content is a string; add guards so malformed messages don't crash the run: at the start of the loop check isinstance(msg, dict) and continue if not, then keep using msg.get("from")/msg.get("role") as before; after resolving content (from "value"/"content"/"text") verify isinstance(content, str) (or coerce safe-to-string types if desired) before calling content.strip(), and continue for non-string/None content. Reference: raw_conversations loop, variables msg, raw_role, role, and content.

examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb (1)
79-80: ⚠️ Potential issue | 🟡 Minor

Step 3 text mismatches the executed mode.

This cell uses `--mapping-file`, which triggers non-streaming random-access mode in the script, not streaming mode.

🔧 Proposed wording update

```diff
-    "The script streams only the required parquet shards and writes a conversation file in the\n",
-    "standard `jsonl` format expected by `launch_train.sh`."
+    "Without `--mapping-file`, the script streams chat parquet shards.\n",
+    "With `--mapping-file` (used below), it loads chat shards in non-streaming mode for index-based selection,\n",
+    "then writes the standard `jsonl` file expected by `launch_train.sh`."
```

Also applies to: 90-93
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb` around lines 79 - 80, The Step 3 description incorrectly states streaming behavior while the command uses the --mapping-file flag which enables non-streaming random-access mode; update the text that mentions "streams only the required parquet shards" and the expected format to instead state that using --mapping-file triggers non-streaming random-access access and still writes the conversation file in the jsonl format expected by launch_train.sh; apply the same wording fix to the corresponding occurrences around the later block (lines 90-93) so both descriptions match the actual behavior when --mapping-file is used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`:
- Around line 104-106: The echo message prints a different path than the one
being counted, which is misleading; update the echo to reference the same path
used by the wc command (i.e., change echo \"${count} conversations in
../input_conversations/nemotron-chat.jsonl\" to echo \"${count} conversations in
input_conversations/nemotron-chat.jsonl\"), or refactor to assign the path to a
variable (e.g., file=input_conversations/nemotron-chat.jsonl) and use that
variable in both the count= and echo lines so they remain consistent.
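The fix amounts to routing both commands through one variable, for example (illustrative filename; the notebook uses input_conversations/nemotron-chat.jsonl):

```shell
# One variable for the path means the counted file and the reported file
# can never drift apart.
file=nemotron-chat.jsonl
printf '%s\n' '{"a":1}' '{"a":2}' > "$file"

count=$(grep -c '' "$file")   # portable line count without wc's padding
echo "${count} conversations in ${file}"
rm -f "$file"
```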
- Around line 217-223: The export commands use paths that are relative to the
repo root but the notebook runs from guides/, so update the CKPT_DIR,
EXPORT_PATH and the script invocation to use correct relative paths (e.g.,
prefix with ../ or change to absolute paths) so that
scripts/export_hf_checkpoint.py and ckpts/... resolve properly when executed
from the notebook; locate the block setting CKPT_DIR, EXPORT_PATH and the python
call (the lines referencing CKPT_DIR, EXPORT_PATH and
scripts/export_hf_checkpoint.py) and change their paths accordingly or add a
preliminary cd to the repository root before invoking the script.
---
Duplicate comments:
In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`:
- Around line 79-80: The Step 3 description incorrectly states streaming
behavior while the command uses the --mapping-file flag which enables
non-streaming random-access mode; update the text that mentions "streams only
the required parquet shards" and the expected format to instead state that using
--mapping-file triggers non-streaming random-access access and still writes the
conversation file in the jsonl format expected by launch_train.sh; apply the
same wording fix to the corresponding occurrences around the later block (lines
90-93) so both descriptions match the actual behavior when --mapping-file is
used.
In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`:
- Around line 90-93: The help text for the --mapping-file option is misleading
(it says CSV) while the parser expects one integer index per line; update the
help string(s) referencing "--mapping-file" (both occurrences around the
existing help text blocks) to clearly state "Path to a plain text file
containing one 0-based dataset row index per line, e.g.:\n 429801\n
271023\nRows are loaded in the order they appear" so the wording matches the
actual parsing logic (or alternatively adjust the parser to accept CSV
first-column values if you prefer CSV behavior).
- Around line 126-156: The loop over raw_conversations assumes each msg is a
dict and that content is a string; add guards so malformed messages don't crash
the run: at the start of the loop check isinstance(msg, dict) and continue if
not, then keep using msg.get("from")/msg.get("role") as before; after resolving
content (from "value"/"content"/"text") verify isinstance(content, str) (or
coerce safe-to-string types if desired) before calling content.strip(), and
continue for non-string/None content. Reference: raw_conversations loop,
variables msg, raw_role, role, and content.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4c24b250-19f8-4133-a8a6-451046fb95d8
⛔ Files ignored due to path filters (1)
examples/speculative_decoding/guides/nemotron_mapping.csv is excluded by !**/*.csv
📒 Files selected for processing (3)
- examples/speculative_decoding/guides/CR2_eagle_config.json
- examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb
- examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/speculative_decoding/guides/CR2_eagle_config.json
| "%%bash\n", | ||
| "CKPT_DIR=ckpts/cosmos-reason2-8b-eagle3/checkpoint-110000\n", | ||
| "EXPORT_PATH=export/cosmos-reason2-8b-eagle3\n", | ||
| "\n", | ||
| "python scripts/export_hf_checkpoint.py \\\n", | ||
| " --model_path $CKPT_DIR \\\n", | ||
| " --export_path $EXPORT_PATH" |
Export command paths are inconsistent with notebook working directory.
From guides/, scripts/export_hf_checkpoint.py and ckpts/... resolve incorrectly. This step likely fails unless you cd .. or prefix with ../.
🔧 Proposed fix

```diff
 "%%bash\n",
-"CKPT_DIR=ckpts/cosmos-reason2-8b-eagle3/checkpoint-110000\n",
-"EXPORT_PATH=export/cosmos-reason2-8b-eagle3\n",
+"cd ..\n",
+"CKPT_DIR=ckpts/cosmos-reason2-8b-eagle3/checkpoint-110000\n",
+"EXPORT_PATH=export/cosmos-reason2-8b-eagle3\n",
 "\n",
-"python scripts/export_hf_checkpoint.py \\\n",
+"python scripts/export_hf_checkpoint.py \\\n",
 "    --model_path $CKPT_DIR \\\n",
 "    --export_path $EXPORT_PATH"
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "%%bash\n", | |
| "CKPT_DIR=ckpts/cosmos-reason2-8b-eagle3/checkpoint-110000\n", | |
| "EXPORT_PATH=export/cosmos-reason2-8b-eagle3\n", | |
| "\n", | |
| "python scripts/export_hf_checkpoint.py \\\n", | |
| " --model_path $CKPT_DIR \\\n", | |
| " --export_path $EXPORT_PATH" | |
| "%%bash\n", | |
| "cd ..\n", | |
| "CKPT_DIR=ckpts/cosmos-reason2-8b-eagle3/checkpoint-110000\n", | |
| "EXPORT_PATH=export/cosmos-reason2-8b-eagle3\n", | |
| "\n", | |
| "python scripts/export_hf_checkpoint.py \\\n", | |
| " --model_path $CKPT_DIR \\\n", | |
| " --export_path $EXPORT_PATH" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb`
around lines 217 - 223, The export commands use paths that are relative to the
repo root but the notebook runs from guides/, so update the CKPT_DIR,
EXPORT_PATH and the script invocation to use correct relative paths (e.g.,
prefix with ../ or change to absolute paths) so that
scripts/export_hf_checkpoint.py and ckpts/... resolve properly when executed
from the notebook; locate the block setting CKPT_DIR, EXPORT_PATH and the python
call (the lines referencing CKPT_DIR, EXPORT_PATH and
scripts/export_hf_checkpoint.py) and change their paths accordingly or add a
preliminary cd to the repository root before invoking the script.
Signed-off-by: Slawek Kierat <skierat@nvidia.com>
Signed-off-by: Slawek Kierat <skierat@nvidia.com>
Head branch was pushed to by a user without write access
♻️ Duplicate comments (1)
examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py (1)
127-157: ⚠️ Potential issue | 🟠 Major

Guard message/content types before `.get()` and `.strip()` to avoid run-wide crashes.

Line 129 assumes each `raw_conversations` element is a dict, and Line 156 assumes content is a string. A malformed row can still raise and abort the full processing run.

🔧 Proposed fix

```diff
 def parse_nemotron_conversation(raw_conversations: list) -> list[dict] | None:
@@
-    msgs = []
+    if not isinstance(raw_conversations, list):
+        return None
+
+    msgs = []
     for msg in raw_conversations:
+        if not isinstance(msg, dict):
+            continue
         # Resolve role field (datasets may use "from" or "role")
         raw_role = msg.get("from") or msg.get("role")
         if not isinstance(raw_role, str):
             continue
@@
-        content = content.strip()
+        if not isinstance(content, str):
+            continue
+        content = content.strip()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py` around lines 127 - 157, The loop over raw_conversations assumes each msg is a dict and content is a str; add safeguards in the for msg in raw_conversations loop to skip non-dict messages (check isinstance(msg, dict)) before calling msg.get or indexing, validate raw_role is a str before .lower() (you already check this but ensure raw_role retrieval only after confirming msg is dict), and ensure the resolved content is a string before calling .strip() (use isinstance(content, str) and skip or coerce otherwise). Update the content resolution logic in add_nemotron_chat.py (the raw_conversations loop, raw_role and content branches) so malformed rows are skipped early rather than causing exceptions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py`:
- Around line 127-157: The loop over raw_conversations assumes each msg is a
dict and content is a str; add safeguards in the for msg in raw_conversations
loop to skip non-dict messages (check isinstance(msg, dict)) before calling
msg.get or indexing, validate raw_role is a str before .lower() (you already
check this but ensure raw_role retrieval only after confirming msg is dict), and
ensure the resolved content is a string before calling .strip() (use
isinstance(content, str) and skip or coerce otherwise). Update the content
resolution logic in add_nemotron_chat.py (the raw_conversations loop, raw_role
and content branches) so malformed rows are skipped early rather than causing
exceptions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fb944350-eec1-4b6f-828e-290e735daac9
⛔ Files ignored due to path filters (1)
examples/speculative_decoding/guides/nemotron_mapping.bin is excluded by !**/*.bin
📒 Files selected for processing (2)
- examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb
- examples/speculative_decoding/prepare_input_conversations/add_nemotron_chat.py
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb
What does this PR do?
Type of change: New example
Overview:
Adds examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb, a step-by-step Jupyter notebook that walks through the full EAGLE3 draft-head training workflow for nvidia/Cosmos-Reason2-8B. The notebook covers:
Also includes guides/nemotron_mapping.csv and guides/CR2_eagle_config.json as companion files.
Usage
Open and run examples/speculative_decoding/guides/train_eagle_head_cosmos_reason2.ipynb cell by cell. After training, serve the exported checkpoint with:
Testing
Tested end-to-end on 4x B100 GPUs. The exported checkpoint was validated with specdec_bench.
Before your PR is "Ready for review"
Additional Information
Cosmos-Reason2-8B requires at least one 80 GB GPU (H100/A100)
Summary by CodeRabbit
New Features
Documentation