Standardise example scripts by lewtun · Pull Request #842 · huggingface/trl

lewtun · 2023-10-07T13:35:40Z

This PR standardises all the example scripts to follow the run_xxx.py convention, where xxx typically refers to the algorithm rather than the task (e.g. a single PPO example instead of one called "sentiment tuning"). The resulting structure is as follows:

examples/scripts
├── run_ddpo.py
├── run_dpo.py
├── run_ppo.py
├── run_ppo_multi_adapter.py
├── run_reward_modeling.py
└── run_sft.py

IMO this makes it a bit easier for newcomers to know what each script does by filename instead of guessing whether e.g. multi adapter RL refers to PPO or something else.

I also deleted a duplicate multi adapter RL script, multi_adapter_rl.py, which seems to be outdated.

Eventually, we could harmonize the scripts so that the SFT and reward models produced by run_sft.py and run_reward_modeling.py are the same ones that feed into run_ppo.py and run_dpo.py. This would give a true end to end pipeline that is maintained & solid for many people to work from :)
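The pipeline idea above can be sketched as a small dependency graph. This is only an illustration, not code from the PR. The mapping name `STAGE_INPUTS`, the helper `execution_order`, and the exact edges are assumptions: PPO is assumed to consume both the SFT and reward models, and DPO only the SFT model.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stage graph: each script maps to the scripts whose outputs it consumes.
# The edges are an assumption based on the PR description, not taken from the repo.
STAGE_INPUTS = {
    "run_sft.py": set(),
    "run_reward_modeling.py": set(),
    "run_ppo.py": {"run_sft.py", "run_reward_modeling.py"},
    "run_dpo.py": {"run_sft.py"},
}


def execution_order(stage_inputs):
    """Return one valid order in which every stage runs after its inputs."""
    return list(TopologicalSorter(stage_inputs).static_order())


order = execution_order(STAGE_INPUTS)
print(order)
```

Any order returned this way runs run_sft.py before run_ppo.py and run_dpo.py, which is the property a harmonized pipeline would need.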

HuggingFaceDocBuilderDev · 2023-10-07T13:41:00Z

The documentation is not available anymore as the PR was closed or merged.

vwxyzjn · 2023-10-09T12:56:24Z

/benchmark-trl-experiments benchmark/benchmark_level1.sh

github-actions · 2023-10-09T12:59:00Z

Benchmark on Comment: succeeded ✅
https://github.com/huggingface/trl/actions/runs/6456929199

vwxyzjn

Love the standardization! Very nice change. I assume multi_adapter_rl.py is deprecated in favor of multi_adapter_rl_v2.py (the now run_ppo_multi_adapter.py)?

lewtun · 2023-10-09T13:35:52Z

> Love the standardization! Very nice change. I assume multi_adapter_rl.py is deprecated in favor of multi_adapter_rl_v2.py (the now run_ppo_multi_adapter.py)?

Yes, that's correct!

vwxyzjn · 2023-10-09T14:43:38Z

/benchmark-trl-experiments benchmark/benchmark_level1.sh

github-actions · 2023-10-09T14:44:57Z

Benchmark on Comment: failed ❌
https://github.com/huggingface/trl/actions/runs/6458185139

vwxyzjn · 2023-10-09T14:46:15Z

/benchmark-trl-experiments benchmark/benchmark_level1.sh

github-actions · 2023-10-09T14:47:02Z

Benchmark on Comment: succeeded ✅
https://github.com/huggingface/trl/actions/runs/6458212548

vwxyzjn · 2023-10-09T15:37:17Z

[COSTA BENCHMARK BOT]: Here are the results

lvwerra

Generally looks great, thanks! Small nit: I don't like the run_xxx.py naming that much, I think just xxx.py would do the job and be less redundant.

lewtun · 2023-10-11T13:54:06Z

> Generally looks great, thanks! Small nit: I don't like the run_xxx.py naming that much, I think just xxx.py would do the job and be less redundant.

Good idea! Done in a6d1d90

I'll merge if all the tests still pass
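As a rough illustration of the rename described above (run_xxx.py to xxx.py), the sketch below strips the `run_` prefix from the script names listed in the PR description. The helper `strip_run_prefix` and the resulting names are assumptions derived from the stated pattern. The actual file list in commit a6d1d90 is not shown in this thread.

```python
# Script names as listed in the PR description, before the rename.
OLD_NAMES = [
    "run_ddpo.py",
    "run_dpo.py",
    "run_ppo.py",
    "run_ppo_multi_adapter.py",
    "run_reward_modeling.py",
    "run_sft.py",
]


def strip_run_prefix(filename: str, prefix: str = "run_") -> str:
    """Drop a leading ``run_`` if present; other filenames pass through unchanged."""
    return filename.removeprefix(prefix)  # str.removeprefix needs Python 3.9+


# Hypothetical old-name -> new-name mapping implied by "Rename run_xxx to xxx".
renamed = {old: strip_run_prefix(old) for old in OLD_NAMES}
for old, new in renamed.items():
    print(f"{old} -> {new}")
```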

vwxyzjn · 2023-10-11T15:11:35Z

LG!

* Standardise example scripts
* fix plotting script
* Rename run_xxx to xxx
* Fix doc

---------

Co-authored-by: Costa Huang <costa.huang@outlook.com>

* enable xpu support
* fix bug
* review commits
* fix style
* add xou decorator
* refactor review commit
* fix test
* review commit
* fix test
* Update benchmark.yml (#856)
* Standardise example scripts (#842)
* Standardise example scripts
* fix plotting script
* Rename run_xxx to xxx
* Fix doc

---------

Co-authored-by: Costa Huang <costa.huang@outlook.com>

* Fix version check in import_utils.py (#853)
* dont use get_peft_model if model is already peft (#857)
* merge conflict
* add xou decorator
* resolve
* resolves
* upstream
* refactor and precommit
* fix new tests
* add device mapping for xpu

---------

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
Co-authored-by: Costa Huang <costa.huang@outlook.com>
Co-authored-by: Adam Pauls <adpauls@gmail.com>
Co-authored-by: abhishek thakur <1183441+abhishekkrthakur@users.noreply.github.com>

Standardise example scripts

17f6266

lewtun requested review from lvwerra, vwxyzjn and younesbelkada October 9, 2023 13:04

vwxyzjn reviewed Oct 9, 2023

Merge branch 'main' into order-scripts

f9fb3be

vwxyzjn approved these changes Oct 9, 2023

fix plotting script

b705d7d

lvwerra approved these changes Oct 10, 2023

Rename run_xxx to xxx

a6d1d90

lewtun commented Oct 11, 2023

Comment thread docs/source/sentiment_tuning.mdx Outdated

Fix doc

1083a9f

lewtun merged commit ddd3188 into main Oct 11, 2023

lewtun deleted the order-scripts branch October 11, 2023 15:28

neo mentioned this pull request Oct 11, 2023

Update example script link for ddpo huggingface/blog#1573

Merged

pcuenca pushed a commit to huggingface/blog that referenced this pull request Oct 20, 2023

Update example script link for ddpo (huggingface/trl#842) (#1573)

18b4484

lapp0 pushed a commit to lapp0/trl that referenced this pull request May 10, 2024

Standardise example scripts (huggingface#842)

ffd35ff

* Standardise example scripts
* fix plotting script
* Rename run_xxx to xxx
* Fix doc

---------

Co-authored-by: Costa Huang <costa.huang@outlook.com>

4 participants
