Forward rank and world size info to Torchbench models when using dynamo runner #108438
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108438
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 20aad90 with merge base 8851603.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Branch updated from 39b6fcc to e5ba94d.
@pytorchbot label "topic: not user facing"
Failures about unhandled extra_args should be resolved once the changes to torchbenchmark/util/extra_args.py (https://github.com/pytorch/benchmark/pull/1867/files#diff-6ccf656b90a64ee9ee4e55aec794320710e717b65271baeae74e69940524bb6a) land.
try:
    with tqdm(desc="loading model"):
        extra_args = []
        if hasattr(args, "rank") and hasattr(args, "world_size"):
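For reference, a minimal sketch of where this hunk appears to be heading — the surrounding try/tqdm context is elided, `args` is the runner's parsed namespace, and the `--rank`/`--world_size` flag spellings are assumptions based on the PR description rather than confirmed names:

```python
extra_args = []
if hasattr(args, "rank") and hasattr(args, "world_size"):
    # Forward the distributed rank and world size to the torchbench model
    # through its extra_args list, where the model can parse them.
    extra_args.extend([
        "--rank", str(args.rank),
        "--world_size", str(args.world_size),
    ])
# extra_args is then passed along when the benchmark model is constructed.
```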
I think we also need to update the pinned torchbench hash once your torchbench changes land, so that CI picks them up:
https://github.com/pytorch/pytorch/blob/main/.github/ci_commit_pins/torchbench.txt
Summary: Adds simple_gpt + DTensor (implemented in meta-pytorch/simple_gpt#7) to torchbench.

Tested via `python benchmarks/dynamo/torchbench.py -d cuda --output-directory=benchmark_logs --output=performance.csv --inference --performance --timing --print-memory --multiprocess --nothing --only simple_gpt`. Note: --nothing is used here to disable compile, since DTensor + compile isn't yet supported in main.

```
dev,name,batch_size,speedup,abs_latency,compilation_latency,compression_ratio,eager_peak_mem,dynamo_peak_mem,calls_captured,unique_graphs,graph_breaks,unique_graph_breaks
cuda,simple_gpt,1,0.966153,196.819773,-0.059319,1.000000,4.576880,4.576880,0,0,0,0
cuda,simple_gpt,1,0.967389,196.608152,-0.058833,1.000000,4.577404,4.577404,0,0,0,0
cuda,simple_gpt,1,0.973152,196.093583,-0.059316,1.000000,4.593133,4.593133,0,0,0,0
cuda,simple_gpt,1,0.973087,196.124046,-0.075580,1.000000,4.611483,4.611483,0,0,0,0
cuda,simple_gpt,1,0.967908,193.998484,-0.040192,1.000000,4.593133,4.593133,0,0,0,0
cuda,simple_gpt,1,0.968949,193.798088,-0.028878,1.000000,4.593133,4.593133,0,0,0,0
```

Two changes were required to the model:
- Decorate the cache setup with torch.no_grad(). Previously this was handled outside the model: the entire eval call was wrapped in a torch.no_grad() context. After running through torchbench, I noticed that even in inference-only mode gradient calculations are not disabled.
- Rank/world size: added support on the torchbench side in pytorch/pytorch#108438 and updated the model to fetch them from the provided extra_args.

Pull Request resolved: #1867
Reviewed By: msaroufim
Differential Revision: D49065244
Pulled By: xmfan
fbshipit-source-id: d4709fa3997c6a25c75e87eff7c13492b370b1af
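To make the second change concrete, here is a minimal sketch of how a model could fetch the forwarded values and decorate its cache setup; the class and method names here are illustrative assumptions, not the actual simple_gpt code:

```python
import argparse

import torch


class SimpleGPTLike(torch.nn.Module):
    """Hypothetical stand-in illustrating the two model-side changes above."""

    def __init__(self, extra_args=None):
        super().__init__()
        # Fetch rank/world_size from the extra_args forwarded by the dynamo
        # runner; parse_known_args ignores unrelated flags, so single-process
        # runs keep working with the defaults.
        parser = argparse.ArgumentParser()
        parser.add_argument("--rank", type=int, default=0)
        parser.add_argument("--world_size", type=int, default=1)
        dist_args, _ = parser.parse_known_args(extra_args or [])
        self.rank = dist_args.rank
        self.world_size = dist_args.world_size

    @torch.no_grad()
    def setup_caches(self, max_batch_size: int, max_seq_length: int) -> None:
        # Decorated with no_grad so cache tensors never track gradients, even
        # when the harness does not wrap the whole eval call in torch.no_grad().
        ...
```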
Branch updated from 266257b to 20aad90.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Adds support for passing rank and world_size to a torchbench model via its extra_args parameter: https://github.com/pytorch/benchmark/blob/main/torchbenchmark/util/model.py#L83C80-L83C90
This is used for models that distribute over multiple GPUs, e.g. simple_gpt (pytorch/benchmark#1867).
Also adds an option to skip multiprocess-only GPU models.
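A minimal sketch of how such a skip could be implemented in the runner; the set name, its contents, and the flag lookup are assumptions for illustration, not the actual torchbench.py identifiers:

```python
# Models that only make sense as multi-GPU / multiprocess runs.
# Hypothetical list; in practice this would name the affected torchbench models.
ONLY_MULTIPROCESS_MODELS = {"simple_gpt"}


def should_skip_model(model_name, args):
    # Skip multiprocess-only GPU models when the runner was not launched
    # with --multiprocess, since they cannot run in a single process.
    return model_name in ONLY_MULTIPROCESS_MODELS and not getattr(args, "multiprocess", False)
```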
Testing via:
`python benchmarks/dynamo/torchbench.py -d cuda --output=benchmark_logs/performance.csv --inference --performance --timing --print-memory --multiprocess --only simple_gpt`
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @chenyang78 @aakhundov @kadeng @anijain2305