Conversation
/benchmark-trl-experiments benchmark/benchmark_level1.sh

Benchmark on Comment: succeeded ✅

The documentation is not available anymore as the PR was closed or merged.

/benchmark-trl-experiments benchmark/benchmark_level2.sh

Benchmark on Comment: succeeded ✅

/benchmark-trl-experiments benchmark/benchmark_level1.sh

/benchmark-trl-experiments benchmark/benchmark_level2.sh

Benchmark on Comment: succeeded ✅

Benchmark on Comment: succeeded ✅
The Cerebras results are expected: the run is training against a random reward model (the sentiment-analysis pipeline is backed by the base Cerebras-GPT-6.7B checkpoint, whose classification head is untrained, so its rewards are essentially noise), so its reward learning curve should be more chaotic.
lewtun left a comment:
Thanks a lot for adding this sweet benchmark 🚀! I left a comment about adding a benchmark for ZeRO-3, but that can also be a separate PR if you prefer.
```diff
@@ -1,4 +1,4 @@
-# compound
+# compound: gpt2xl + grad_accu
```
For my own understanding, is this compound arg documented somewhere?
The compound comment simply means we are using more features at once (e.g., in this case, we are using a larger model and gradient accumulation at the same time) :)
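For illustration, a compound entry of that sort might look like the sketch below. It is modeled on the Cerebras-GPT command further down in this diff; the experiment name, batch size, and the `--ppo_config.gradient_accumulation_steps` flag are assumptions, not lines from this PR.

```bash
# Hypothetical compound entry: gpt2-xl combined with gradient accumulation.
# Modeled on the Cerebras-GPT command below; flags are assumptions, not PR lines.
python benchmark/benchmark.py \
    --command "accelerate launch examples/scripts/sentiment_tuning.py --ppo_config.exp_name sentiment_tuning_gpt2xl_grad_accu --ppo_config.model_name gpt2-xl --ppo_config.mini_batch_size 16 --ppo_config.gradient_accumulation_steps 8 --ppo_config.log_with wandb"
```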
```bash
# compound: Cerebras-GPT-6.7B + deepspeed zero2 + grad_accu
python benchmark/benchmark.py \
    --command "accelerate launch --config_file examples/accelerate_configs/deepspeed_zero2.yaml examples/scripts/sentiment_tuning.py --ppo_config.exp_name sentiment_tuning_Cerebras-GPT-6.7B_grad_accu_deepspeed_stage2 --ppo_config.batch_size 32 --ppo_config.mini_batch_size 32 --ppo_config.log_with wandb --ppo_config.model_name cerebras/Cerebras-GPT-6.7B --ppo_config.reward_model sentiment-analysis:cerebras/Cerebras-GPT-6.7B" \
```
Eventually I think we should do the "proper" thing and fine-tune these models on IMDB so we have a genuinely good policy / reward model. Of course, that's not necessary for this PR, but it would be good to be as realistic as possible for the benchmark.
I think that sounds good. Perhaps we can set up an end-to-end example where we first train the reward model and then the policy model; see the sketch below.
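As a rough sketch of what such an end-to-end pipeline could look like (the `reward_modeling.py` script path, the output directory, and all flags here are hypothetical illustrations, not entry points confirmed by this PR):

```bash
# Hypothetical end-to-end flow: fine-tune a reward model on IMDB first,
# then run PPO against the resulting checkpoint. Script names and flags
# are illustrative assumptions.
accelerate launch examples/scripts/reward_modeling.py --model_name gpt2 --dataset_name imdb --output_dir models/gpt2-imdb-reward
accelerate launch examples/scripts/sentiment_tuning.py --ppo_config.model_name gpt2 --ppo_config.reward_model sentiment-analysis:models/gpt2-imdb-reward
```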
The command ends with the slurm template flag; the only change in the diff here is adding the missing trailing newline at the end of the file:

```bash
    --slurm-template-path benchmark/trl.slurm_template
```
```bash
# compound: Cerebras-GPT-6.7B + deepspeed zero2 + grad_accu
```
Should we also benchmark ZeRO-3?
Let's probably do this in a separate PR.
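For reference, a ZeRO-3 entry would likely just swap the accelerate config and experiment name in the zero2 command above; here is a sketch, assuming a `deepspeed_zero3.yaml` config exists alongside the zero2 one (neither the path nor the exp_name comes from this PR):

```bash
# Hypothetical ZeRO-3 counterpart of the zero2 benchmark entry above.
# The deepspeed_zero3.yaml path and stage3 exp_name are assumptions.
python benchmark/benchmark.py \
    --command "accelerate launch --config_file examples/accelerate_configs/deepspeed_zero3.yaml examples/scripts/sentiment_tuning.py --ppo_config.exp_name sentiment_tuning_Cerebras-GPT-6.7B_grad_accu_deepspeed_stage3 --ppo_config.batch_size 32 --ppo_config.mini_batch_size 32 --ppo_config.log_with wandb --ppo_config.model_name cerebras/Cerebras-GPT-6.7B --ppo_config.reward_model sentiment-analysis:cerebras/Cerebras-GPT-6.7B"
```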
* Add deepspeed experiment
* add deepspeed pip install
* update hello world.sh
* update comments
* remove cleanup



