[torchbench] The official benchmark for performance and accuracy check #7040
@zpcore Can you provide more details on how to run torchbench with pytorch/xla?
Here is the configuration script we use to run torchbench on TPU/GPU: https://github.com/GoogleCloudPlatform/ml-auto-solutions/blob/master/dags/pytorch_xla/configs/pytorchxla_torchbench_config.py. For example, when targeting TPU, get_torchbench_tpu_config() is the main entry function that constructs all the commands: installing dependencies and the torchbench models, running torchbench, and uploading results to a GCS bucket (which you may not need). GPU is similar, except that all commands run inside our torch_xla GPU release Docker image.
I don't think native torchbench supports the openxla backend. We need to move the model to XLA devices ourselves, which is handled in https://github.com/pytorch/xla/tree/master/benchmarks.
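As a concrete illustration, the pytorch/xla benchmarks directory drives torchbench models through an experiment runner script. The flag names below are an assumption based on the repository's benchmarks README at the time and may have changed, so treat this as a sketch rather than an authoritative invocation:

```shell
# Sketch: run a torchbench model through pytorch/xla's benchmark harness.
# Assumes pytorch/xla is cloned locally; flag names are assumptions —
# check xla/benchmarks/README.md for the current interface.
python xla/benchmarks/experiment_runner.py \
  --suite-name=torchbench \
  --accelerator=cuda \
  --xla=PJRT \
  --dynamo=openxla \
  --test=eval \
  --filter=BERT_pytorch
```

This path handles moving the model to the XLA device for you, which is what native torchbench does not do.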
We don't have plans to add an accuracy metric at this time.
Thank you for your reply. I did not use Google Cloud. It seems that this mainly uses the running method of …
Okay, I think the accuracy check is probably quite important. FYI, I modified torchbench under pytorch/benchmarks/dynamo to use its accuracy-check method for correctness verification (I moved both the model and example_inputs to the xla device). Below is the correctness verification I did for the models that remained after excluding examples that still could not run with pytorch/xla within torchbench. This may be a helpful reference for investigating correctness issues with pytorch/xla.

Environment: NVIDIA A100 80G with CUDA 12.1

Experimental control groups:
1. dynamo-openxla vs eager
2. dynamo-inductor vs eager

```
./benchmarks/dynamo/torchbench.py --device=cuda --iterations-per-run=1 --output=torchbench_training_fp32_xla.csv --output-directory=./reports_only --trace-on-xla --backend=openxla --accuracy --train --iterations=10 --xla-tolerance 0.1 --only=dcgan --float32
```

List the three control groups in the table, where:
Note: Exp1 is dynamo-openxla with tolerance=0.01. Comparing Exp1 and Exp3, it can be observed that under the same tolerance, dynamo-inductor is very close to eager in terms of accuracy, while dynamo-openxla shows a significant accuracy difference. Contrasting Exp1 and Exp2, it is noticeable that when the accuracy tolerance threshold is relaxed to a relatively high level, some cases can pass. However, the cases that still fail under Exp2 are likely due to compilation bugs, such as the issue mentioned in #7042, which is present in most …
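For intuition about what a tolerance threshold like `--xla-tolerance 0.1` means, the accuracy check is conceptually an element-wise allclose comparison between the backend's outputs and the eager baseline. The helper below is a dependency-free sketch of that idea, not the harness's actual code; the function name and the sample values are illustrative:

```python
def within_tolerance(actual, expected, rtol=0.01, atol=1e-5):
    """Element-wise |a - e| <= atol + rtol * |e|, in the spirit of allclose."""
    return all(
        abs(a - e) <= atol + rtol * abs(e)
        for a, e in zip(actual, expected)
    )

# Hypothetical outputs: eager baseline vs. a backend with small numeric drift.
eager = [1.0, -2.5, 0.125]
backend = [1.02, -2.55, 0.13]

print(within_tolerance(backend, eager, rtol=0.01))  # → False (tight tolerance)
print(within_tolerance(backend, eager, rtol=0.1))   # → True  (relaxed tolerance)
```

This mirrors the observation above: relaxing the tolerance lets cases with small numeric drift pass, while genuine compilation bugs fail at any reasonable threshold.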
There are a couple of possibilities: …
I think the easiest way to check is to use LazyTensor to run the model (pretty much just drop …). BTW, we don't really recommend users to do …
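By "use LazyTensor", the suggestion above presumably means running the model directly on an XLA device without any dynamo backend, so operations are traced lazily and compiled when the graph is cut. A minimal, untested sketch, assuming torch and torch_xla are installed and an XLA runtime is available (the model and input shapes are placeholders):

```python
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()            # XLA device (TPU, or GPU via PJRT)
model = MyModel().to(device)        # placeholder model class
inputs = torch.randn(8, 3, 224, 224, device=device)

output = model(inputs)              # recorded lazily, not yet executed
xm.mark_step()                      # cut the graph: compile and run on XLA
```

If this LazyTensor path produces correct results while dynamo-openxla does not, that points at the dynamo integration rather than the XLA compiler itself.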
❓ Questions and Help
Hi, I found two available codebases for testing torchbench with pytorch/xla:
However, for the first codebase, it seems that its dynamo + openxla backend support does not actually trigger XLA compilation. Is it no longer maintained?
As for the second one, I found it can measure performance, but it has no way to validate accuracy against eager mode, which the first benchmark tool can do. Is there any support for this?
Looking forward to your feedback.