Conversation
The documentation is not available anymore as the PR was closed or merged.
Benchmark and documentation are ready at https://github.com/vwxyzjn/trl/blob/refactor-benchmark/benchmark/README.md. We can probably better tune some of these models for better performance in follow-up PRs, including testing the deepspeed integration cc @lewtun.
lvwerra left a comment
Hi @vwxyzjn, looks really good to me. The openrlbenchmark code itself is a bit obscure to me, so if we could document well how it works, that would be great. Left some comments here and there, but in general I'm happy to add it. Since this is not strictly a part of the library and more a part of our test suite, we can also be a bit more experimental here :)
Regarding tyro: this looks good to me in general. I would also love to hear @younesbelkada's feedback, since he may also know about the design decisions around the CLI in transformers.
Thanks @lvwerra! I have addressed the comments :)
younesbelkada left a comment
Thanks a lot for this great work, as discussed offline :D!
Feel free to merge once the CI is green
* refactor and benchmark
* update code
* Add accelerate logging
* logs
* quick fix
* update config
* precommit
* modify training example
* fix multi-gpu all_reduce error `Tensors must be CUDA and dense`
* support more models and benchmark
* update
* add changes
* upload benchmark
* precommit
* add tyro as a dependency
* add tyro
* pre-commit
* precommit
* weird...
* lol typo
* precommit
* sigh
* push changes
* Update benchmark/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Add experiments
* upload image to tag specific folder
* add openrlbenchmark documentation
* rename
* remove unused field
* precommit
* push changes

---------

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
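One of the commits above fixes the multi-GPU `all_reduce` error `Tensors must be CUDA and dense`. A minimal sketch of the usual cause and fix, assuming the NCCL backend; the variable names are illustrative, not TRL's actual code:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()

# Stats built on CPU trip up NCCL, which only reduces dense CUDA tensors.
stats = torch.tensor([0.1, 0.2, 0.3])

# Moving the tensor to the accelerator's device before reducing avoids
# "Tensors must be CUDA and dense".
stats = stats.to(accelerator.device)
if torch.distributed.is_available() and torch.distributed.is_initialized():
    torch.distributed.all_reduce(stats, op=torch.distributed.ReduceOp.SUM)

# accelerator.reduce(stats, reduction="sum") is a backend-agnostic alternative.
```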




This PR does a few refactors. Some raw thoughts:

* `query_dataset="imdb"`
* `reward_model="sentiment-analysis:lvwerra/distilbert-imdb"`; it could work with a pipeline or a trained reward model (a minimal pipeline sketch follows this list)
* `gpt2` in place of `lvwerra/gpt2-imdb`. We have multiple benchmark axes: for a source model `M_s` (e.g., gpt2), how does that affect the performance of a target model `M_t` (e.g., falcon, gptj, llama2)? We can probably have a `train.py` that can do this.
* Uses `tyro` to eliminate duplicate code / comments. Help text also works (see the `tyro` sketch below).
* More controlled terminology and tracking config.
* Add accelerate logging (see the logging sketch below).
* Log `global_backward_batch_size`, `global_batch_size`, `world_size` (see the batch-size accounting sketch below).
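A minimal sketch of how a `reward_model` string like `sentiment-analysis:lvwerra/distilbert-imdb` could resolve to a pipeline-based reward; the `task:model-id` parsing and the POSITIVE-logit reward below are illustrative assumptions, not necessarily the exact implementation:

```python
from transformers import pipeline

# Assumed format: "<pipeline-task>:<model-id>".
reward_model = "sentiment-analysis:lvwerra/distilbert-imdb"
task, model_id = reward_model.split(":", maxsplit=1)
sentiment_pipe = pipeline(task, model=model_id)

# Score generated responses; use the raw POSITIVE logit as the scalar reward.
texts = ["This movie was surprisingly good!"]
pipe_outputs = sentiment_pipe(texts, top_k=None, function_to_apply="none")
rewards = [
    next(d["score"] for d in output if d["label"] == "POSITIVE")
    for output in pipe_outputs
]
```

A trained reward model would slot into the same place by scoring `texts` directly instead of going through a classification pipeline.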
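On `tyro`: the appeal is that a single dataclass serves as both the config object and the CLI definition, with field docstrings doubling as `--help` text, so comments are not duplicated between an argparse setup and the config. A minimal sketch with made-up field names:

```python
from dataclasses import dataclass

import tyro


@dataclass
class Args:
    exp_name: str = "sentiment_tuning"
    """the name of this experiment (rendered in --help)"""

    learning_rate: float = 1.41e-5
    """the learning rate of the PPO optimizer"""

    seed: int = 0
    """seed of the experiment"""


if __name__ == "__main__":
    args = tyro.cli(Args)  # parses sys.argv into a populated Args instance
    print(args)
```

Running the script with `--help` prints each field with its docstring, which is what eliminates the duplicate code / comments.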
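For the accelerate logging, the relevant API is `accelerate.logging.get_logger`, which wraps a stdlib logger so multi-GPU runs do not repeat every record once per process. A short sketch:

```python
import logging

from accelerate import Accelerator
from accelerate.logging import get_logger

logging.basicConfig(level=logging.INFO)
logger = get_logger(__name__)
accelerator = Accelerator()

# Emitted on the main process only by default.
logger.info(f"num processes: {accelerator.num_processes}")

# Pass main_process_only=False to log from every process.
logger.info("per-process message", main_process_only=False)
```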
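The logged batch sizes can be reconstructed with standard data-parallel accounting; the exact relationships below are an assumption for illustration, not a quote of the config code:

```python
# Per-process PPO settings (illustrative values).
batch_size = 128                  # rollout samples per process per PPO step
mini_batch_size = 32              # samples per optimizer micro-step
gradient_accumulation_steps = 1
world_size = 8                    # number of data-parallel processes

# Assumed definitions of the logged quantities.
backward_batch_size = mini_batch_size * gradient_accumulation_steps
global_backward_batch_size = backward_batch_size * world_size  # 256
global_batch_size = batch_size * world_size                    # 1024

print(global_batch_size, global_backward_batch_size, world_size)
```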