What is the difference between PPOv2Trainer and PPOTrainer? #1763

Closed
mst272 opened this issue Jun 22, 2024 · 1 comment

mst272 commented Jun 22, 2024

What is the difference between PPOv2Trainer and PPOTrainer? Also, there are two ppo.py files, trl\examples\scripts\ppo\ppo.py and trl\examples\scripts\ppo.py. Can you tell me how they differ?

Collaborator

vwxyzjn commented Jun 24, 2024

Hi @mst272. PPOv2Trainer is the new experimental PPO trainer, and it is the one we now recommend to users. It is a refactor of PPOTrainer that introduces more uniform APIs, better logging, better documentation, and more benchmark results.
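
For anyone unsure which of the two trainers their installed copy of trl exposes, here is a minimal sketch that checks for both class names. The helper `available_ppo_trainers` is hypothetical (it is not part of trl), and it only tests whether the names PPOTrainer and PPOv2Trainer can be resolved on the top-level package. It says nothing about how the two classes differ in behavior.

```python
import importlib
import importlib.util


def available_ppo_trainers(module_name="trl"):
    """Report which of the two trainer class names the given module exposes.

    Hypothetical helper for illustration; it is not part of trl.
    """
    names = ("PPOTrainer", "PPOv2Trainer")
    # If the package is not installed at all, neither class is available.
    if importlib.util.find_spec(module_name) is None:
        return {name: False for name in names}
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # The package exists but failed to import (e.g. a missing dependency).
        return {name: False for name in names}
    found = {}
    for name in names:
        try:
            # Attribute access may trigger a lazy import, so catch broadly.
            getattr(module, name)
            found[name] = True
        except Exception:
            found[name] = False
    return found


if __name__ == "__main__":
    for name, present in available_ppo_trainers().items():
        print(f"{name}: {'available' if present else 'not available'}")
```

Running the script prints one line per class name, which can help when deciding which example script matches the installed version.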

vwxyzjn closed this as completed Jun 24, 2024