What is the difference between PPOv2Trainer and PPOTrainer? Also, there are two ppo.py files, trl\examples\scripts\ppo\ppo.py and trl\examples\scripts\ppo.py. Can you tell me what is different between them?
Hi, @mst272. PPOv2Trainer is the new experimental PPO trainer that we now recommend to users. It is a refactor of PPOTrainer that introduces more uniform APIs, better logging and documentation, and more benchmark results.