cp: test: Update on-policy distillation release tests (1363) into r0.4.0#1376

Merged
zpqiu merged 1 commit into r0.4.0 from cherry-pick-1363-r0.4.0
Oct 17, 2025
@chtruong814 chtruong814 commented Oct 16, 2025

beep boop [🤖]: Hi @zpqiu 👋,

we've cherry picked #1363 into  for you! 🚀

Please review and approve this cherry pick at your convenience!

Summary by CodeRabbit

  • New Features

    • Added new distillation configuration example for sequence packing workflows.
  • Bug Fixes

    • Updated validation accuracy thresholds and loss metrics in test suites for improved reliability.
  • Chores

    • Streamlined example distillation configurations by removing redundant batch size and optimizer settings.
    • Optimized test runtime parameters and reorganized test suite entries for better maintainability.
    • Removed deprecated distillation configuration examples.

Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
@chtruong814 chtruong814 requested review from a team as code owners October 16, 2025 09:13
@chtruong814 chtruong814 requested review from zpqiu and removed request for a team October 16, 2025 09:13
coderabbitai bot commented Oct 16, 2025

Walkthrough

This PR updates and restructures multiple distillation example configurations and test scripts for Qwen model distillation. Changes include increasing validation batch sizes, removing training/generation batch size and scheduler block parameters from policy/teacher configurations, updating test success criteria with stricter loss thresholds and validation accuracy checks, and consolidating/removing deprecated configuration files and test entries.

Changes

Cohort / File(s) Summary
Config updates: batch size & scheduler removal
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml, examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml
Increased distillation.val_batch_size (32→256), removed policy.train_global_batch_size, policy.generation_batch_size, policy.scheduler block and equivalent fields from teacher section.
Config major restructuring
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
Increased val_batch_size (32→512), removed max_val_samples, added loss_fn block (kl_type: reverse), reduced checkpointing.save_period (50→10), updated max_total_sequence_length values, removed optimizer/scheduler/batching configs from policy/teacher, added generation.vllm_cfg.tensor_parallel_size.
New config file
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml
Added complete distillation configuration with sequence packing enabled, tensor parallel settings, and logging paths.
Deleted config files
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-instruct-2n8g-fsdp2tp2-seqpack.v1.yaml, examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.yaml, examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.yaml
Removed entire YAML configuration files, eliminating distillation pipeline configurations.
Test script metric & timing updates
tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh, tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh, tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh, tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh
Reduced time budgets (NUM_MINUTES: 240→120 or 1200→240), tightened training loss thresholds, added validation/accuracy checks, removed GPU memory usage constraints.
Deleted test scripts
tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.sh, tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.sh
Removed entire Bash test scripts.
Release manifest update
tests/test_suites/release.txt
Removed 8B convergence distillation test entries, updated section heading from "Long 4b and 8b convergence" to "Long 4b convergence", replaced instruct-focused test with base seqpack test entry.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

The changes span multiple files with a consistent pattern of configuration restructuring (removal of batch size and scheduler fields across policy/teacher blocks). While the repetition across files reduces individual reasoning effort per file, the heterogeneity of changes (config updates, new additions, deletions, test metric modifications) and the need to verify test success criteria alignment require moderate review complexity.

Suggested labels

r0.4.0

Suggested reviewers

  • terrykong

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name | Status | Explanation | Resolution
Title Check | ⚠️ Warning | The title includes explicit cherry-pick metadata and branch references rather than clearly describing the update to on-policy distillation release tests. It does not succinctly convey the main content of the changeset and adds unnecessary noise. A concise title focusing on the test updates would make the purpose clearer. | Please revise the title to a concise summary of the change, for example “Update on-policy distillation release tests,” and remove the cherry-pick and branch references.
Test Results For Major Changes | ⚠️ Warning | The PR makes extensive changes to distillation configurations and test scripts that directly affect convergence criteria and performance thresholds, but its description contains only a generic cherry-pick notice without any evidence of validation, regression tests, or performance benchmarks; therefore it lacks the required documentation of test results to ensure that these major changes do not introduce regressions. | Please update the PR description to include detailed test results or validation data (such as before-and-after loss and accuracy metrics, convergence checks, and the specific configurations used) to demonstrate that the revised batch sizes, scheduler removals, and threshold updates maintain or improve numerical stability and performance.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped.

@zpqiu zpqiu enabled auto-merge (squash) October 16, 2025 09:17
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/test_suites/release.txt (1)

46-49: Fix typos and hyphenation in the comment line.

Use “20-step” and fix “seqence” -> “sequence”.

-# 20 step functional tests on dynamic batching, non-colocated and seqence packing features
+# 20-step functional tests on dynamic batching, non-colocated and sequence packing features
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml (1)

12-20: Explicitly set TP=2 for policy to match fsdp2tp2.

Currently only context_parallel_size is set. Add either policy.dtensor_cfg.tensor_parallel_size: 2 or generation.vllm_cfg.tensor_parallel_size: 2 (matching the long recipe).

Example (match long recipe):

 policy:
   model_name: Qwen/Qwen3-4B-Base
   dtensor_cfg:
     context_parallel_size: 1
+  generation:
+    vllm_cfg:
+      tensor_parallel_size: 2
   dynamic_batching:
     enabled: false
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b81f38 and 86df9f1.

📒 Files selected for processing (14)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml (1 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml (1 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml (1 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml (1 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-instruct-2n8g-fsdp2tp2-seqpack.v1.yaml (0 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.yaml (0 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.yaml (0 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh (2 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh (2 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh (2 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh (2 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.sh (0 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.sh (0 hunks)
  • tests/test_suites/release.txt (1 hunks)
💤 Files with no reviewable changes (5)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-instruct-2n8g-fsdp2tp2-seqpack.v1.yaml
  • tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.sh
🧰 Additional context used
📓 Path-based instructions (7)
**/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.sh: Follow the Google Shell Style Guide for all shell scripts
Use uv run to execute Python scripts in shell/driver scripts instead of activating virtualenvs and calling python directly
Add the NVIDIA copyright header (with current year) at the top of all shell scripts, excluding tests/ and test-only scripts

Files:

  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh
tests/test_suites/llm/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

LLM driver script filenames must mirror the YAML base name and follow the same pattern with .sh extension

Files:

  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh
tests/test_suites/**

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Place driver shell scripts and common.env under tests/test_suites// and list nightly tests in tests/test_suites/nightly.txt

Files:

  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh
  • tests/test_suites/release.txt
examples/configs/recipes/**/*.yaml

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

examples/configs/recipes/**/*.yaml: Recipe YAMLs under examples/configs/recipes/** are runnable snapshots and may omit documentation
When adding support for a new model, add a recipe YAML under examples/configs/recipes/ in the appropriate domain (llm/ or vlm/) with the correct name

Files:

  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml
examples/configs/recipes/llm/*.yaml

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

LLM recipe YAML filenames must follow: --ng-[-modifiers][-long][.vN].yaml

Files:

  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml
examples/configs/recipes/**/*.{yaml,sh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Known exception: Deepscaler recipes may encode context length in place of the cluster tuple (e.g., grpo-deepscaler-1.5b-8K.*); allowed but document intended hardware in the script

Files:

  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml
examples/configs/recipes/**

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Place recipe YAMLs under examples/configs/recipes//

Files:

  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml
🧠 Learnings (1)
📚 Learning: 2025-10-12T14:46:57.171Z
Learnt from: zpqiu
PR: NVIDIA-NeMo/RL#1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:6-11
Timestamp: 2025-10-12T14:46:57.171Z
Learning: Test scripts in tests/test_suites/llm/ follow a standard configuration pattern that includes NUM_NODES, STEPS_PER_RUN, MAX_STEPS, NUM_RUNS (calculated as `$(( (MAX_STEPS + STEPS_PER_RUN - 1) / STEPS_PER_RUN ))`), and NUM_MINUTES. These variables are part of the test infrastructure's standard interface and should not be flagged as unused even if not directly referenced within the individual script, as they are consumed by external launch tooling or common.env.

Applied to files:

  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh
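
The NUM_RUNS expression quoted in the learning above is integer ceiling division. A minimal Python sketch of the same arithmetic follows; the function name and the step counts are illustrative assumptions, not values taken from the test scripts.

```python
def num_runs(max_steps: int, steps_per_run: int) -> int:
    """Ceiling division, mirroring
    $(( (MAX_STEPS + STEPS_PER_RUN - 1) / STEPS_PER_RUN )) from the learning."""
    if steps_per_run <= 0:
        raise ValueError("steps_per_run must be positive")
    return (max_steps + steps_per_run - 1) // steps_per_run


# Illustrative values only; not taken from the actual test scripts.
print(num_runs(100, 100))  # exact multiple: 1 run
print(num_runs(100, 30))   # 4 runs (3 full runs plus a partial one)
print(num_runs(20, 50))    # 1 run (fewer steps than one run allows)
```

Adding `STEPS_PER_RUN - 1` before dividing rounds any partial final run up to a whole run, so the launcher never schedules too few runs to reach MAX_STEPS.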
🪛 LanguageTool
tests/test_suites/release.txt

[grammar] ~43-~43: There might be a mistake here.
Context: ... ################ # Long 4b convergence tests/test_suites/llm/distillation-qwen3...

(QB_NEW_EN)


[grammar] ~46-~46: Use a hyphen to join words.
Context: ...-4b-base-2n8g-fsdp2tp2-long.v1.sh # 20 step functional tests on dynamic batchin...

(QB_NEW_EN_HYPHEN)


[grammar] ~46-~46: Ensure spelling is correct
Context: ... on dynamic batching, non-colocated and seqence packing features tests/test_suites/llm/...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~47-~47: There might be a mistake here.
Context: ...4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh tests/test_suites/llm/distillation-qwen3...

(QB_NEW_EN)


[grammar] ~48-~48: There might be a mistake here.
Context: ...4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh tests/test_suites/llm/distillation-qwen3...

(QB_NEW_EN)


[grammar] ~49-~49: There might be a mistake here.
Context: ...b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh

(QB_NEW_EN)

🪛 Shellcheck (0.11.0)
tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh

[warning] 10-10: NUM_MINUTES appears unused. Verify use (or export if used externally).

(SC2034)

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh

[warning] 10-10: NUM_MINUTES appears unused. Verify use (or export if used externally).

(SC2034)

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh

[warning] 9-9: NUM_RUNS appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 10-10: NUM_MINUTES appears unused. Verify use (or export if used externally).

(SC2034)

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh

[warning] 10-10: NUM_MINUTES appears unused. Verify use (or export if used externally).

(SC2034)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: CI quality check
  • GitHub Check: Lint check
  • GitHub Check: Lint check
  • GitHub Check: Post submodule check comment / Comment on PR
  • GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (14)
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml (1)

5-7: All checks passed; no changes needed.

  • val_batch_size is used globally by the dataloader and max_batches = max_val_samples // val_batch_size yields one eval batch as intended.
  • examples/configs/distillation_math.yaml provides optimizer and scheduler via the POLICY_BASE anchor, inherited by both policy and teacher.
  • Recipe’s checkpointing.save_period=50 versus max_num_steps=20 means no checkpoint is written—acceptable for fast release tests.
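
The arithmetic behind the first and third bullets above can be sketched as follows. This is a hedged illustration: the helper names are hypothetical, the max_val_samples value of 256 is an assumption (the review does not state it), and the sketch assumes checkpoints are written on steps that are multiples of save_period.

```python
def max_val_batches(max_val_samples: int, val_batch_size: int) -> int:
    """Number of full validation batches, as in
    max_batches = max_val_samples // val_batch_size."""
    return max_val_samples // val_batch_size


def checkpoint_steps(max_num_steps: int, save_period: int) -> list[int]:
    """Steps at which a checkpoint would be written, assuming saves
    happen on multiples of save_period (an assumption of this sketch)."""
    return [s for s in range(1, max_num_steps + 1) if s % save_period == 0]


# max_val_samples=256 is a hypothetical value chosen for illustration.
print(max_val_batches(256, 256))  # 1 eval batch with the new val_batch_size
print(max_val_batches(256, 32))   # 8 eval batches with the old val_batch_size
print(checkpoint_steps(20, 50))   # [] : no checkpoint in a 20-step run
```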
tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh (2)

10-10: NUM_MINUTES is OK; consumed by infra.

SC2034 can be ignored here; standard test infra reads this var from common.env/external tooling.

Based on learnings


38-40: Stricter thresholds look good; verify metric key presence at step 20.

Ensure validation runs at step 20 so data["validation/accuracy"]["20"] exists (val_period=10 in the YAML).
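
A quick way to reason about this check is sketched below. It assumes validation runs on steps that are multiples of val_period and that the metrics object is a dict keyed by metric name and then by step as a string, which is inferred from the `data["validation/accuracy"]["20"]` expression above. The helper names, the sample metrics, and the threshold are hypothetical.

```python
def validation_steps(max_steps: int, val_period: int) -> list[int]:
    """Steps on which validation would run, assuming it fires on
    multiples of val_period (an assumption of this sketch)."""
    return [s for s in range(1, max_steps + 1) if s % val_period == 0]


def accuracy_above(data: dict, step: int, threshold: float) -> bool:
    """Look up validation accuracy at a step and compare it to a threshold.
    Raises KeyError with a readable message if the step was never validated."""
    series = data.get("validation/accuracy", {})
    key = str(step)  # steps are assumed to be stored as string keys
    if key not in series:
        raise KeyError(f"no validation/accuracy logged at step {step}")
    return series[key] > threshold


# Hypothetical metrics; real values come from the test run's logged output.
metrics = {"validation/accuracy": {"10": 0.05, "20": 0.12}}
print(validation_steps(20, 10))           # [10, 20]: step 20 is covered
print(accuracy_above(metrics, 20, 0.1))   # True for these made-up numbers
```

With val_period=10 and a 20-step run, step 20 is a validation step under this assumption, so the key should exist when the test evaluates its criteria.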

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh (2)

10-10: NUM_MINUTES is OK; consumed by infra.

SC2034 can be ignored for this suite variable.

Based on learnings


38-40: Tightened loss + added val acc check are reasonable.

Confirm val_period ensures a step-20 validation to populate the metric.

examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml (2)

5-5: val_batch_size=256: LGTM.

Matches test cadence; improves validation throughput.


12-22: Add explicit tensor_parallel_size: 2 under policy.dtensor_cfg.

The base distillation_math.yaml defines &DTENSOR_BASE with tensor_parallel_size: 2, but policy.dtensor_cfg only overrides context_parallel_size, which won’t inherit root values. Append tensor_parallel_size: 2 to policy.dtensor_cfg to match the fsdp2tp2 naming.
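
The concern in the comment above is the difference between a shallow and a recursive override of a nested mapping. The sketch below uses plain Python dicts to illustrate it; it does not reproduce how the project's config loader actually merges recipes, and the base context_parallel_size value is a made-up placeholder.

```python
import copy


def shallow_override(base: dict, override: dict) -> dict:
    """Top-level merge: a nested dict in override replaces the base dict whole."""
    merged = dict(base)
    merged.update(override)
    return merged


def deep_override(base: dict, override: dict) -> dict:
    """Recursive merge: nested dicts are combined key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_override(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base values standing in for the shared dtensor settings.
base = {"dtensor_cfg": {"tensor_parallel_size": 2, "context_parallel_size": 2}}
recipe = {"dtensor_cfg": {"context_parallel_size": 1}}

print(shallow_override(base, recipe))  # tensor_parallel_size is dropped
print(deep_override(base, recipe))     # tensor_parallel_size: 2 is kept
```

If the loader behaves like the shallow case, setting tensor_parallel_size explicitly in the recipe removes the ambiguity; if it merges recursively, the explicit value is redundant but harmless.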

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh (2)

10-10: NUM_MINUTES is OK; consumed by infra.

SC2034 can be ignored for this standard suite variable.

Based on learnings


38-40: Added validation accuracy check is good; ensure val at step 20.

Confirms training quality; just verify val_period hits step 20 in the YAML.

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh (2)

7-11: Run budget reductions are sensible for release cadence.

New STEPS_PER_RUN/MAX_STEPS/NUM_MINUTES align with tightened criteria.

Based on learnings


38-40: Tightened loss and added val acc at step 100: good signal.

Works with val_period=50; step 100 will have validation logged.

examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml (1)

31-37: Cluster=num_nodes=2 aligns with 2n8g tests.

Config/test naming consistency looks good.

examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml (2)

5-11: Validation and checkpoint cadence improvements look good.

val_batch_size=512 and save_period=10 suit the tightened test criteria.


14-17: Setting generation.vllm_cfg.tensor_parallel_size=2 matches fsdp2tp2.

Good alignment with file naming/strategy.

@terrykong terrykong added the CI:L0 Run doctests and unit tests label Oct 16, 2025
@zpqiu zpqiu merged commit a3b700a into r0.4.0 Oct 17, 2025
83 of 88 checks passed
@zpqiu zpqiu deleted the cherry-pick-1363-r0.4.0 branch October 17, 2025 05:07
terrykong pushed a commit that referenced this pull request Nov 19, 2025
…r0.4.0` (#1376)

Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
Co-authored-by: alexchiu <alexq@nvidia.com>
Labels

cherry-pick CI:L0 Run doctests and unit tests Run CICD
