This repository was archived by the owner on Nov 19, 2025. It is now read-only.

@jveronvialard
Collaborator

What does this PR do?

This PR enables reward model training with validation_drop_last=False.

Changelog

  • Please update the CHANGELOG.md under next version with high level changes in this PR.

Usage

python ${NEMO_ALIGNER_PATH}/examples/nlp/gpt/train_reward_model.py \
    ...
    ++model.data.validation_drop_last=False \
    ...
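
For readers unfamiliar with the option, the following is a minimal sketch of what dropping versus keeping the last validation batch means. It is plain Python, not NeMo-Aligner code: `batch_indices` is a hypothetical helper written only for illustration, and the sample count and batch size are made-up values.

```python
def batch_indices(num_samples, batch_size, drop_last):
    """Split range(num_samples) into consecutive batches of sample indices.

    Hypothetical helper for illustration only; not part of NeMo-Aligner.
    """
    batches = [
        list(range(start, min(start + batch_size, num_samples)))
        for start in range(0, num_samples, batch_size)
    ]
    # With drop_last=True, a trailing batch smaller than batch_size is discarded,
    # so those samples never contribute to the validation metrics.
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


# Made-up sizes: 10 validation samples, batch size 4.
dropped = batch_indices(10, 4, drop_last=True)
kept = batch_indices(10, 4, drop_last=False)
print(len(dropped), len(kept))
```

With these assumed sizes, keeping the last batch means the two leftover samples are still evaluated, at the cost of one batch that is smaller than the others.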

Before your PR is "Ready for review"

Pre checks:

Checklist when contributing a new algorithm

  • Does the trainer resume and restore all model states?
  • Does the trainer support all parallelism techniques (PP, TP, DP)?
  • Does the trainer support max_steps=-1 and validation?
  • Does the trainer only call APIs defined in alignable_interface.py?
  • Does the trainer have proper logging?

Additional Information

  • Related to # (issue)

Signed-off-by: Julien Veron Vialard <jveronvialar@nvidia.com>
Signed-off-by: Julien Veron Vialard <jveronvialar@nvidia.com>
for more information, see https://pre-commit.ci

Signed-off-by: NeMo-Aligner CI <nemo-aligner-ci@nvidia.com>
@jveronvialard jveronvialard added the Run CICD Set + un-set to retrigger (add after r*.*.* labels) label Apr 10, 2025
Collaborator

@odelalleau odelalleau left a comment

Great, thanks! Just a few minor suggestions, then it should be good to go.

Signed-off-by: Julien Veron Vialard <jveronvialar@nvidia.com>
…com:NVIDIA/NeMo-Aligner into jveronvialar/rm-not-dropping-last-val-batch
@jveronvialard jveronvialard added Run CICD Set + un-set to retrigger (add after r*.*.* labels) and removed Run CICD Set + un-set to retrigger (add after r*.*.* labels) labels Apr 10, 2025
Collaborator

@odelalleau odelalleau left a comment

Good to go, thanks!

@odelalleau odelalleau merged commit 9faab40 into main Apr 11, 2025
24 checks passed
@odelalleau odelalleau deleted the jveronvialar/rm-not-dropping-last-val-batch branch April 11, 2025 02:59

Labels

Algorithms Run CICD Set + un-set to retrigger (add after r*.*.* labels)

3 participants