@xmfan xmfan commented Sep 1, 2023

Adds support for passing rank and world_size to a torchbench model via its extra_args parameter: https://github.com/pytorch/benchmark/blob/main/torchbenchmark/util/model.py#L83C80-L83C90

This is needed for models that are distributed across multiple GPUs, e.g. simple_gpt (pytorch/benchmark#1867).

Also adds an option to skip multiprocess-only GPU models.

Testing via python benchmarks/dynamo/torchbench.py -d cuda --output=benchmark_logs/performance.csv --inference --performance --timing --print-memory --multiprocess --only simple_gpt
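
A minimal sketch of the forwarding described above, assuming the runner's parsed arguments may carry `rank` and `world_size` attributes. The helper name `build_extra_args` and the flag spellings `--rank` and `--world_size` are assumptions made for illustration, not confirmed by this description:

```python
import argparse
from typing import List


def build_extra_args(args: argparse.Namespace) -> List[str]:
    """Collect rank/world_size into a CLI-style list for the model.

    Hypothetical helper: the flag names below are assumed for illustration.
    """
    extra_args: List[str] = []
    # Only forward the values when the runner actually set both of them,
    # i.e. when it is running in a multiprocess (distributed) configuration.
    if hasattr(args, "rank") and hasattr(args, "world_size"):
        extra_args += [
            "--rank", str(args.rank),
            "--world_size", str(args.world_size),
        ]
    return extra_args
```

In a single-process run neither attribute is set, so the list stays empty and models that do not understand these flags are unaffected.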

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @chenyang78 @aakhundov @kadeng @anijain2305

pytorch-bot bot commented Sep 1, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108438

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 20aad90 with merge base 8851603:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@xmfan xmfan force-pushed the xmfan/distributed_torchbench branch 2 times, most recently from 39b6fcc to e5ba94d Compare September 1, 2023 21:33
@xmfan xmfan requested review from Chillee and H-Huang September 1, 2023 21:37
@xmfan xmfan changed the title wip torchbench Forward rank and world size info to Torchbench models when using dynamo runner Sep 1, 2023
@xmfan xmfan marked this pull request as ready for review September 1, 2023 21:38
xmfan commented Sep 5, 2023

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Sep 5, 2023
xmfan commented Sep 5, 2023

The failures about unhandled extra_args should be resolved once the changes to torchbenchmark/util/extra_args.py land: https://github.com/pytorch/benchmark/pull/1867/files#diff-6ccf656b90a64ee9ee4e55aec794320710e717b65271baeae74e69940524bb6a

try:
    with tqdm(desc="loading model"):
        extra_args = []
        if hasattr(args, "rank") and hasattr(args, "world_size"):

I think we also need to update the pinned hash to include your torchbench changes once they land, so that CI picks them up:

https://github.com/pytorch/pytorch/blob/main/.github/ci_commit_pins/torchbench.txt

facebook-github-bot pushed a commit to pytorch/benchmark that referenced this pull request Sep 8, 2023
Summary:
Adds simple_gpt + DTensor implemented in meta-pytorch/simple_gpt#7 to torchbench

Tested via `python benchmarks/dynamo/torchbench.py -d cuda --output-directory=benchmark_logs --output=performance.csv --inference --performance --timing --print-memory --multiprocess --nothing --only simple_gpt`. Note: --nothing is used here to disable compile, since DTensor + compile isn't yet supported in main

```
dev,name,batch_size,speedup,abs_latency,compilation_latency,compression_ratio,eager_peak_mem,dynamo_peak_mem,calls_captured,unique_graphs,graph_breaks,unique_graph_breaks
cuda,simple_gpt,1,0.966153,196.819773,-0.059319,1.000000,4.576880,4.576880,0,0,0,0
cuda,simple_gpt,1,0.967389,196.608152,-0.058833,1.000000,4.577404,4.577404,0,0,0,0
cuda,simple_gpt,1,0.973152,196.093583,-0.059316,1.000000,4.593133,4.593133,0,0,0,0
cuda,simple_gpt,1,0.973087,196.124046,-0.075580,1.000000,4.611483,4.611483,0,0,0,0
cuda,simple_gpt,1,0.967908,193.998484,-0.040192,1.000000,4.593133,4.593133,0,0,0,0
cuda,simple_gpt,1,0.968949,193.798088,-0.028878,1.000000,4.593133,4.593133,0,0,0,0
```

Two changes to the model were required:
- Decorate the caches with torch.no_grad(). Previously this was done outside the model, with the entire eval call wrapped in a torch.no_grad() context. With torchbench, I noticed that gradient calculation is not disabled even when running inference only.
- Rank/world size: added support on the torchbench side in pytorch/pytorch#108438 and updated the model to fetch them from the provided extra_args.
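
A sketch of how a model could read rank and world size from the provided extra_args, assuming they arrive as a CLI-style list of strings. The function name `parse_dist_args`, the flag names, and the single-process defaults are hypothetical and are not taken from the simple_gpt model:

```python
import argparse
from typing import List, Tuple


def parse_dist_args(extra_args: List[str]) -> Tuple[int, int]:
    """Return (rank, world_size) parsed from a CLI-style argument list.

    Hypothetical helper: flag names and defaults are assumed for illustration.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--rank", type=int, default=0)
    parser.add_argument("--world_size", type=int, default=1)
    # parse_known_args ignores flags meant for other consumers of extra_args
    # instead of raising an error on them.
    known, _unknown = parser.parse_known_args(extra_args)
    return known.rank, known.world_size
```

Falling back to rank 0 and world size 1 keeps the model usable when the runner passes no distributed arguments at all.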

Pull Request resolved: #1867

Reviewed By: msaroufim

Differential Revision: D49065244

Pulled By: xmfan

fbshipit-source-id: d4709fa3997c6a25c75e87eff7c13492b370b1af
@xmfan xmfan requested a review from a team as a code owner September 8, 2023 21:41
@xmfan xmfan force-pushed the xmfan/distributed_torchbench branch from 266257b to 20aad90 Compare September 13, 2023 21:00
xmfan commented Sep 14, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Sep 14, 2023
@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

@github-actions github-actions bot deleted the xmfan/distributed_torchbench branch March 23, 2025 02:14