Rayless sft val loss #5

maharajamihir · 2025-12-22T17:25:47Z

No description provided.

Copilot

Pull request overview

This PR adds validation loss calculation capability to the SFT training script, enabling periodic evaluation on a separate validation dataset during training.

Key Changes:

Introduces validation data source initialization and periodic validation loss calculation
Adds --val-prompt-data, --val-interval, and --val-steps command-line arguments
Implements calculate_val_loss() and _val_step() methods for validation execution
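
A minimal sketch of how the periodic validation described above could be structured. The names calculate_val_loss and _val_step come from this PR, but the signatures below and the should_validate helper are assumptions for illustration, not the code in train_sft.py. A real implementation would presumably also switch the model to eval mode and disable gradient tracking.

```python
from typing import Callable, Iterable, Optional


def calculate_val_loss(
    val_batches: Iterable[object],
    val_step: Callable[[object], float],
    val_steps: int,
) -> Optional[float]:
    """Average the loss over at most `val_steps` validation batches.

    `val_step` stands in for the PR's `_val_step()`; its signature here
    is hypothetical.
    """
    losses = []
    for step, batch in enumerate(val_batches):
        if step >= val_steps:
            break
        losses.append(val_step(batch))
    # With no validation batches there is nothing to report.
    return sum(losses) / len(losses) if losses else None


def should_validate(train_step: int, val_interval: Optional[int]) -> bool:
    """True on every `val_interval`-th training step; False if validation is off."""
    return bool(val_interval) and train_step > 0 and train_step % val_interval == 0
```

In a training loop, should_validate(step, args.val_interval) would gate the call to calculate_val_loss, so runs without validation data skip it entirely.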

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 11 comments.

File	Description
train_sft.py	Adds validation loss calculation methods, updates data source initialization to support separate validation data, and integrates periodic validation into the training loop
miles/rollout/data_source.py	Modifies RolloutDataSource constructor to accept optional prompt_data parameter, enabling reuse for both training and validation datasets
miles/utils/arguments.py	Adds three new command-line arguments for configuring validation: data path, interval, and number of steps
scripts/run-sft-torchrun.sh	Provides example usage of the new validation arguments with validation data path and configuration
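
For orientation, the three flags could be registered roughly as follows. This is a sketch using plain argparse. The flag names are from this PR, while the helper name, the defaults for --val-interval and --val-steps, and the help texts other than the validation data path are assumptions, not the contents of miles/utils/arguments.py.

```python
import argparse


def add_validation_arguments(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Register the validation flags (hypothetical helper, for illustration)."""
    parser.add_argument(
        "--val-prompt-data",
        type=str,
        default=None,
        help="The path to the validation prompt data.",
    )
    parser.add_argument(
        "--val-interval",
        type=int,
        default=None,  # assumed default: validation disabled
        help="Run validation every this many training steps.",
    )
    parser.add_argument(
        "--val-steps",
        type=int,
        default=None,  # assumed default
        help="Number of validation steps per validation run.",
    )
    return parser


parser = add_validation_arguments(argparse.ArgumentParser())
args = parser.parse_args(["--val-prompt-data", "val.jsonl", "--val-interval", "100"])
```

argparse converts the dashes to underscores, so the values are read as args.val_prompt_data, args.val_interval, and args.val_steps.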

emergenz · 2025-12-23T16:48:45Z

scripts/run-sft-torchrun.sh

 SFT_ARGS=(
   --rollout-function-path miles.rollout.sft_rollout.generate_rollout
   --prompt-data /fast/project/HFMI_SynergyUnit/tab_model/huggingface/nemo_hf_part_jsonl_4k_tokens.parquet
+   --val-prompt-data /fast/project/HFMI_SynergyUnit/tab_model/huggingface/nemo_hf_part_jsonl_4k_tokens_validation.parquet


Use jsonl for now.

emergenz · 2025-12-23T18:43:48Z

miles/utils/arguments.py

                type=str,
                default=None,
                help=(
                    "The path to the prompt data. "


Ah shit sorry, I didn't see this. In that case, let's add "The path to the validation prompt data." below.

emergenz · 2025-12-23T18:45:48Z

scripts/run-sft-torchrun.sh

   --prompt-data /fast/project/HFMI_SynergyUnit/tab_model/huggingface/nemo_hf_part_jsonl_4k_tokens.parquet
+   --val-prompt-data /fast/project/HFMI_SynergyUnit/tab_model/huggingface/nemo_hf_part_jsonl_4k_tokens_validation.parquet
+   --val-interval 100
+   --val-steps 50


That sounds like a lot of steps for an interval of 100.
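
A rough way to quantify this concern. The helper below is hypothetical, and the relative cost of a validation step is an assumed parameter: a forward-only step is usually cheaper than a training step, but the exact ratio depends on the setup.

```python
def validation_overhead(val_interval: int, val_steps: int, val_step_cost: float = 1.0) -> float:
    """Extra compute spent on validation, as a fraction of training compute.

    `val_step_cost` is the cost of one validation step relative to one
    training step (an assumption; 1.0 is the pessimistic case).
    """
    return val_steps * val_step_cost / val_interval


# 50 validation steps every 100 training steps, at equal cost per step:
pessimistic = validation_overhead(val_interval=100, val_steps=50)

# Same schedule, assuming a validation step costs about a third of a training step:
cheaper = validation_overhead(val_interval=100, val_steps=50, val_step_cost=1 / 3)
```

With equal per-step cost the schedule adds 50% on top of training compute. Under the assumed one-third ratio it is still roughly 17%.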

maharajamihir added 2 commits December 22, 2025 18:23

implemented val loss for rayless branch

ec67eaa

cosmetic fix

23e361f

maharajamihir requested review from Copilot and emergenz December 22, 2025 17:25

maharajamihir self-assigned this Dec 22, 2025

Copilot started reviewing on behalf of maharajamihir December 22, 2025 17:26

Copilot AI reviewed Dec 22, 2025

minor fix, add docstring and remove profiling during val loss

4d1210a

emergenz reviewed Dec 23, 2025

train_sft.py (outdated, resolved)

emergenz reviewed Dec 23, 2025

miles/utils/arguments.py (outdated, resolved)

emergenz reviewed Dec 23, 2025

miles/utils/arguments.py (outdated, resolved)

emergenz reviewed Dec 23, 2025

miles/utils/arguments.py (outdated, resolved)

emergenz reviewed Dec 23, 2025

miles/utils/arguments.py (outdated, resolved)

emergenz reviewed Dec 23, 2025

miles/rollout/data_source.py (resolved)

emergenz requested changes Dec 23, 2025

train_sft.py (resolved)

maharajamihir and others added 3 commits December 23, 2025 19:32

Apply suggestions from code review; mostly docstrings and comments

9112e08

Co-authored-by: Franz Srambical <79149449+emergenz@users.noreply.github.com>

Apply suggestion from @emergenz

2239baf

Co-authored-by: Franz Srambical <79149449+emergenz@users.noreply.github.com>

fix: only initialize val rolloutmanager if val data is provided

1a44568

emergenz reviewed Dec 23, 2025

emergenz approved these changes Dec 23, 2025

train_sft.py (resolved)

maharajamihir added 4 commits December 23, 2025 20:04

fix bug and modify message for val prompt data

a6bb3a6

adjust num steps for realistic run

6738abd

change defaults in run script

07e5ab4

set num rollout to 1k

d568331

maharajamihir merged commit 18ab1d2 into rayless-sft Dec 24, 2025

emergenz pushed a commit that referenced this pull request Dec 31, 2025

feat: Rayless sft val loss (#5)

8eb054f

3 participants