
PPOTrainer ignores limit_val_batches when running validation #99

Closed
gleibovich-nvidia opened this issue Feb 5, 2024 · 2 comments
Labels: bug (Something isn't working)

@gleibovich-nvidia (Collaborator)

There's an inconsistency between PPOTrainer and the SFT and DPO trainers: PPOTrainer always evaluates just a single (and the same) validation batch on every run, ignoring the rest of the validation set. The original goal was to have validation track the same set of held-out samples throughout training, which is why the validation iterator is reset on every validation run.

The expected behavior is to let the user control how much of the validation set is used for evaluation. One possible solution is to have the validation dataloader return batches of size mbs * dp and let the user control how many batches to use for validation via limit_val_batches, as is done with the other trainers; see the sketch below.
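For illustration, a minimal sketch of the proposed validation loop (the function and argument names here are hypothetical, not the trainer's actual API):

```python
from itertools import islice


def run_validation(val_dataloader, limit_val_batches, evaluate_batch):
    """Evaluate up to `limit_val_batches` batches of size mbs * dp.

    The iterator is still reset on every validation run, so validation
    keeps tracking the same held-out samples throughout training, but
    the user now controls how many batches are consumed, as with the
    SFT and DPO trainers.
    """
    val_iter = iter(val_dataloader)  # reset: same held-out samples each run
    losses = []
    for batch in islice(val_iter, limit_val_batches):
        losses.append(evaluate_batch(batch))  # stand-in for per-batch eval
    return sum(losses) / max(len(losses), 1)  # average validation metric
```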

@gleibovich-nvidia added the bug label on Feb 5, 2024
@odelalleau (Collaborator)

Duplicate of #27?

@gleibovich-nvidia (Collaborator, Author)

Yep. Missed it.
Sorry, closing.
