Skip to content

feat: add lora config for dpo dtensor backend#1826

Merged
terrykong merged 3 commits intomainfrom
ruit/dpo_lora
Feb 3, 2026
Merged

feat: add lora config for dpo dtensor backend#1826
terrykong merged 3 commits intomainfrom
ruit/dpo_lora

Conversation

@RayenTian
Copy link
Contributor

@RayenTian RayenTian commented Jan 26, 2026

closes #1680

Result

image

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • New Features

    • Introduced Low-Rank Adaptation (LoRA) configuration options with customizable parameters for model optimization
  • Tests

    • Expanded functional test coverage with new DPO workflow tests including LoRA-based scenarios to ensure reliability

✏️ Tip: You can customize this high-level summary in your review settings.

@RayenTian RayenTian marked this pull request as ready for review January 26, 2026 06:39
@RayenTian RayenTian requested review from a team as code owners January 26, 2026 06:39
@RayenTian RayenTian added the CI:L1 Run doctests, unit tests, and functional tests label Jan 26, 2026
@RayenTian RayenTian requested a review from yuki-97 January 26, 2026 06:39
@RayenTian RayenTian requested a review from terrykong January 26, 2026 06:39
@RayenTian RayenTian removed the CI:L1 Run doctests, unit tests, and functional tests label Jan 26, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 26, 2026

📝 Walkthrough

Walkthrough

This PR adds LoRA configuration options to the DPO example configuration file and introduces new functional GPU tests, including a test script for LoRA-based automodel DPO training with comprehensive test orchestration including metric validation.

Changes

Cohort / File(s) Summary
LoRA Configuration
examples/configs/dpo.yaml
Adds new LoRA settings block under policy.dtensor_cfg with parameters: enabled, target_modules, exclude_modules, match_all_linear, dim, alpha, dropout, dropout_position, lora_A_init, and use_triton
Test Registration
tests/functional/L1_Functional_Tests_GPU.sh
Registers two new functional tests: dpo_automodel_lora.sh and dpo_megatron.sh in the GPU test suite execution sequence
New Test Script
tests/functional/dpo_automodel_lora.sh
Implements end-to-end DPO LoRA test orchestration: environment setup, runs DPO automodel training with LoRA enabled, captures metrics to JSON, and validates training loss at step 3 is below 0.8

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

CI:L1

Suggested reviewers

  • terrykong
  • yuki-97
  • joyang-nv
🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Test Results For Major Changes ⚠️ Warning PR introduces major changes but PR description lacks documented test results, convergence validation, or performance information despite adding functional tests. Update PR description with: (1) Results from running dpo_automodel_lora.sh and dpo_megatron.sh tests including training loss and convergence, (2) Confirmation of no regression in numerics/convergence, (3) Performance measurements. Fix bash syntax error in line 11: change set -eou pipefail to set -euo pipefail.
Title check ❓ Inconclusive The PR title mentions adding a LoRA config for DPO dtensor backend, which aligns with the primary change (adding LoRA configuration to dpo.yaml), but conflicts with PR objectives that describe SFT (supervised fine-tuning) configuration. Clarify whether the PR is for LoRA or SFT configuration. The title says 'lora config' but the PR objectives mention 'SFT config.' Update the title and objectives to be consistent.
✅ Passed checks (2 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ruit/dpo_lora

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@RayenTian RayenTian changed the title feat: add sft config for dpo dtensor backend feat: add lora config for dpo dtensor backend Jan 26, 2026
@RayenTian RayenTian added the CI:L1 Run doctests, unit tests, and functional tests label Jan 26, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/functional/dpo_automodel_lora.sh`:
- Line 11: The shell script uses an incorrectly ordered set invocation "set -eou
pipefail" where the -o flag's argument must immediately follow, causing an
invalid option; update the command to "set -euo pipefail" so -e and -u are
enabled and -o pipefail is applied (replace the existing set -eou pipefail
invocation).

yuki-97
yuki-97 previously approved these changes Jan 26, 2026
@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 28, 2026
yuki-97
yuki-97 previously approved these changes Jan 28, 2026
@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 29, 2026
terrykong
terrykong previously approved these changes Feb 2, 2026
@terrykong terrykong enabled auto-merge (squash) February 2, 2026 22:00
@terrykong terrykong added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Feb 2, 2026
Signed-off-by: ruit <ruit@nvidia.com>
Signed-off-by: ruit <ruit@nvidia.com>
@RayenTian RayenTian dismissed stale reviews from terrykong and yuki-97 via d5dd46c February 3, 2026 02:44
@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Feb 3, 2026
terrykong
terrykong previously approved these changes Feb 3, 2026
Signed-off-by: ruit <ruit@nvidia.com>
@RayenTian RayenTian added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Feb 3, 2026
@terrykong terrykong merged commit c9db946 into main Feb 3, 2026
41 of 42 checks passed
@terrykong terrykong deleted the ruit/dpo_lora branch February 3, 2026 13:52
@yuki-97
Copy link
Contributor

yuki-97 commented Feb 3, 2026

thanks a lot for helping fix run_grpo_math.py stuffs in #1841 for newly added scripts!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI:L1 Run doctests, unit tests, and functional tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DPO LoRa

3 participants