Upgrade submodule oneDNN to v3.3.6 #122164
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/122164
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit d637dae with merge base ae983d2. BROKEN TRUNK: the following job failed but was also present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
I suggest providing more details (e.g., links to the issues) on the motivation for this upgrade.
Thanks. I have added the links.
I have tested aarch64 Linux for torch.compile and #115482.
@snadampal Thanks for the results. Are you also able to validate on Apple Silicon? We don't have access to such platforms.
@Xia-Weiwen, no, I don't have the platform either.
@milpuz01, since you will also run some tests on ARM platforms, could you share your results when available? That would give us broader coverage to verify this upgrade.
@malfet and @atalman, we've finished the upgrade tests, with the test scope and results pasted in the PR description. @milpuz01 from Arm will also help test from the ARM platform perspective to verify the upgrade. Could you review and check whether the test scope is sufficient for you? (One missing part: we have no Apple platform coverage because the hardware is not available.) There is also another issue marked by @albanD for PT 2.3: #120982. The oneDNN team seems to have fixed this issue on the main branch. Do you suggest including this fix for PT 2.3? If yes, we would need the oneDNN team to backport the fix into the v3.3 branch and tag a new version, plus another round of tests to verify it, which would take additional weeks. I'm not sure that fits the PT 2.3 schedule.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job failed: .github/workflows/trunk.yml / macos-12-py3-arm64-mps / test (mps, 1, 1, macos-m1-stable). Details for Dev Infra team: raised by workflow job.
Looks like the macos-arm64 failure is not related to this PR; it is present on the merge base as well.
@pytorchbot merge -f "All required changes are passing"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Thanks for merging it, @atalman.
As the title. Including issue fixes for aarch64:

- oneapi-src/oneDNN#1831
- oneapi-src/oneDNN#1834

---

## Validation results (on Intel CPU + Linux)

**Static quantization with Inductor on CV models**

| Quant method | Geomean throughput ratio (v3.3.6/baseline) |
| -- | -- |
| ptq | 0.982937 |
| ptq (cpp wrapper) | 0.978384 |
| qat | 0.978828 |

**Torchbench CPU userbenchmark with Inductor**

| Items | Perf Geomean Ratio (v3.3.6/baseline) |
| -- | -- |
| eager_throughtput_bf16_infer | 1.00x |
| eager_throughtput_fp32_infer | 1.00x |
| jit_llga_throughtput_amp_bf16 | 1.01x |
| jit_llga_throughtput_fp32 | 1.00x |
| eager_throughtput_fx_int8 | 1.00x |
| eager_throughtput_bf16_train | 1.46x |
| eager_throughtput_fp32_train | 1.41x |

**Dynamo benchmarks tests**

| Precision | Shape | Wrapper | Thread | Eager old/new GEOMEAN | Inductor old/new GEOMEAN |
| -- | -- | -- | -- | -- | -- |
| Float32 | Static | Default | Multiple | 1.003836812 | 1.003425 |
| Float32 | Static | Default | Single | 1.000181451 | 0.999611 |
| Float32 | Dynamic | Default | Multiple | 1.003980183 | 1.006563 |
| Float32 | Dynamic | Default | Single | 1.000076939 | 0.999969 |
| AMP | Static | Default | Multiple | 0.996824772 | 0.998715 |
| AMP | Static | Default | Single | 0.996402574 | 1.001483 |
| AMP | Dynamic | Default | Multiple | 0.994919866 | 1.000467 |
| AMP | Dynamic | Default | Single | 0.9962054 | 1.000767 |

(on Aarch64) pytorch#122164 (comment)

---

Pull Request resolved: pytorch#122164
Approved by: https://github.com/snadampal, https://github.com/malfet, https://github.com/atalman
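The geomean ratios in the tables above are geometric means of per-model new/old throughput ratios. A minimal sketch of how such a ratio is computed; the throughput numbers below are hypothetical placeholders, not actual benchmark data from this PR:

```python
import math

def geomean_ratio(new, old):
    """Geometric mean of per-model throughput ratios (new / old)."""
    ratios = [n / o for n, o in zip(new, old)]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical per-model throughputs (samples/s) for three models.
baseline = [120.0, 45.0, 300.0]
upgraded = [121.0, 45.5, 299.0]
print(f"{geomean_ratio(upgraded, baseline):.2f}x")
```

Using the geometric rather than arithmetic mean keeps the aggregate symmetric: a 2x speedup on one model and a 2x slowdown on another cancel out to 1.00x.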
Thank you all for helping land this PR. I have cherry-picked it for release/2.3 here: #122930
cc @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen