Improve logging for PPO + Docs page by natolambert · Pull Request #243 · huggingface/trl

natolambert · 2023-03-22T21:13:21Z

I wanted experiments to show the difference between the reward model score and the KL penalty more clearly, so I opened this PR to do that :).
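For readers unfamiliar with the distinction, here is a minimal sketch of how a PPO reward is commonly assembled from a reward model score and a per-token KL penalty, and how the two could be logged separately. It is an illustration only, not the code changed in this PR: the function name `split_reward`, the `kl_coef` parameter, and the stat keys are all hypothetical.

```python
def split_reward(score, logprobs, ref_logprobs, kl_coef=0.2):
    """Hypothetical helper: build per-token rewards and report each component."""
    # Per-token KL estimate between the policy and the reference model.
    kl = [lp - ref for lp, ref in zip(logprobs, ref_logprobs)]
    # The KL penalty applies at every token.
    kl_penalty = [-kl_coef * k for k in kl]
    # The reward model score is added on the final token only.
    rewards = list(kl_penalty)
    rewards[-1] += score
    # Keep the components apart so a logger can plot them individually.
    stats = {
        "score": score,
        "kl_penalty_sum": sum(kl_penalty),
        "total_reward": sum(rewards),
    }
    return rewards, stats


rewards, stats = split_reward(
    score=1.0,
    logprobs=[-1.0, -2.0, -0.5],
    ref_logprobs=[-1.5, -2.0, -1.0],
    kl_coef=0.5,
)
print(stats)
```

Logging `score` and `kl_penalty_sum` as separate series makes it possible to tell whether a change in total reward comes from the reward model or from drift away from the reference model.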

HuggingFaceDocBuilderDev · 2023-03-22T21:17:28Z

The documentation is not available anymore as the PR was closed or merged.

younesbelkada

Thanks a lot for this!
The proposed changes should fix the failing tests.

natolambert · 2023-03-23T15:39:32Z

Ah @younesbelkada the suggestion overwrote a needed variable. I'll fix it today, shouldn't be so bad :)

natolambert · 2023-03-23T17:11:15Z

feel free to merge when ready / happy @younesbelkada @lvwerra

lvwerra

Thanks @natolambert!

* init pr
* try and fix docpreview
* fix
* try to fix tests
* nit
* fix tests
* convert to tensor

init pr

05b77b6

try and fix docpreview

fd161b5

younesbelkada reviewed Mar 23, 2023


Comment thread trl/trainer/ppo_trainer.py Outdated

fix

e59dc57

natolambert force-pushed the ppo_logging branch from 914c839 to e59dc57 on March 23, 2023 16:13

Nathan Lambert added 4 commits March 23, 2023 09:32

try to fix tests

4048161

nit

4108de2

fix tests

b536910

convert to tensor

36fcf08

lvwerra approved these changes Mar 24, 2023


lvwerra merged commit 404621f into main Mar 24, 2023

lvwerra deleted the ppo_logging branch March 24, 2023 08:34

yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request Apr 20, 2025

Improve logging for PPO + Docs page (huggingface#243)

904aa66

* init pr
* try and fix docpreview
* fix
* try to fix tests
* nit
* fix tests
* convert to tensor

lvwerra merged 7 commits into main from ppo_logging

4 participants