Add BFloat16 dtype support for oneDNN Graph JIT fuser #85591
Conversation
@XiaobingSuper, added you as a reviewer, because this PR uses a distinct ideep commit (same oneDNN commit in `ideep/mkl-dnn/third_party/oneDNN`) & doesn't change any ideep file, but only changes oneDNN Graph files (`ideep/mkl-dnn`). Thanks!
cc @malfet, has the option of importing the diff internally been removed?
No, but the plugin has been finicky throughout the day.
@malfet has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 additional job has failed, first few of them are: Meta Internal-Only Changes Check. Details for Dev Infra team: raised by workflow job.
Thanks again, @frank-wei! It looks like it'd have to be imported again before merging.
@frank-wei has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Hi @malfet, please help me with a query about the pytorchmergebot stale check. Thanks!
I feel like once we import this PR, it always has to be synced internally whenever we want to merge it from outside. This could be improved, since once the PR is on the diff train, the diff is supposed to be re-created internally.
Thanks for your inputs, @frank-wei! :) If the answer to my query about the pytorchmergebot stale check is yes, then IMHO the 3-day period seems a bit short, as it might require Meta engineers to frequently re-import PRs. But on the other hand, I guess it does seem to help keep the trunk CI greener, so maybe it was determined to be worth the trade-off. :)
@pytorchbot merge |
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Thanks again, @frank-wei! :)
Note to internal oncall: this PR updates ideep; it needs to be kept in sync internally.
Hi @weiwangmeta, we'll be submitting a PR to update ideep again today with a new oneDNN version. I'll add you as a reviewer on that PR. Thanks!
Fix DOS newlines in `onednn/decompose_silu.[cpp|h]` introduced by #85591, as well as one in `.github/PULL_REQUEST_TEMPLATE.md`. Pull Request resolved: #86973. Approved by: https://github.com/huydhn, https://github.com/izaitsevfb
BFloat16 dtype support for faster inference with TorchScript using oneDNN Graph
Intel Xeon Cooper Lake platforms & beyond support the `AVX512_BF16` ISA, which is essentially native BFloat16 support. oneDNN Graph delivers high inference performance with BFloat16 on such machines.

While oneDNN Graph can still be used with BFloat16 on older machines that lack the `avx512_bf16` ISA but support the `avx512bw`, `avx512vl` & `avx512dq` ISAs, BF16 performance on these older machines will be significantly poorer (probably even poorer than Float32), as they lack native BF16 support.
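As a quick way to check which of these two performance tiers a given machine falls into, the CPU feature flags can be inspected. The snippet below is a minimal, Linux-only sketch, not something provided by this PR; parsing `/proc/cpuinfo` is an assumption about the platform.

```python
# Minimal, Linux-only sketch (assumes /proc/cpuinfo exists); not part of this PR.
# It only checks CPU feature flags relevant to oneDNN Graph BF16 performance.
def cpu_flags():
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
# Native BF16 (Cooper Lake & beyond): best oneDNN Graph BF16 performance.
has_native_bf16 = "avx512_bf16" in flags
# Older machines: BF16 still runs via these ISAs, but may be slower than FP32.
bf16_usable = {"avx512bw", "avx512vl", "avx512dq"}.issubset(flags)
print(f"native BF16: {has_native_bf16}, BF16 usable without native support: {bf16_usable}")
```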
Currently, AMP support for eager mode & JIT mode is divergent in PyTorch. So, to use oneDNN Graph with BFloat16, eager-mode AMP should be leveraged while AMP for JIT mode is turned off with `torch._C._jit_set_autocast_mode(False)` in Python code, so as to avoid conflicts (see the example further below).

To view JIT logs, please use the following environment variable:
`PYTORCH_JIT_LOG_LEVEL=">>graph_helper:>>graph_fuser:>>kernel:>>interface"`
Changes being made in this PR

This PR does not change the oneDNN commit or any `ideep` files. While the `ideep` commit is being updated, only files pertaining to oneDNN Graph are being updated. oneDNN Graph is being upgraded to version 0.5.2 (alpha patch release 2).

To put things into perspective, `ideep` is a git submodule of PyTorch, `oneDNN Graph` is a git submodule of `ideep` (`ideep/mkl-dnn`), and oneDNN is a git submodule of oneDNN Graph (`ideep/mkl-dnn/third_party/oneDNN`).

We'd like to add tests based on `OpInfo` in a subsequent PR, if that'd be possible. Should we create a list of ops from opDB that are supported by oneDNN Graph, and add it to `common_methods_invocations.py`?

Example of using oneDNN Graph with BFloat16
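Below is a minimal sketch of the intended usage, assuming an inference-only FP32 model; the toy model, input shape, and number of warm-up iterations are illustrative placeholders rather than part of this PR.

```python
import torch
import torch.nn as nn

# Illustrative placeholder model & input; any inference-only FP32 model works similarly.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 16, 3), nn.ReLU()).eval()
example_input = torch.rand(1, 3, 224, 224)

# Enable the oneDNN Graph JIT fuser.
torch.jit.enable_onednn_fusion(True)
# AMP for eager mode & JIT mode is divergent, so turn off AMP for JIT mode
# and rely on eager-mode autocast instead (as described above).
torch._C._jit_set_autocast_mode(False)

with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    traced = torch.jit.trace(model, example_input)
    traced = torch.jit.freeze(traced)
    # A couple of warm-up runs let the fuser build the fused BF16 kernels.
    traced(example_input)
    traced(example_input)
    # Subsequent runs should use the oneDNN Graph BF16 fused kernels.
    output = traced(example_input)
```

Tracing and freezing inside the `no_grad` + autocast scope follows the guidance above: eager-mode autocast supplies the BFloat16 casts, while `torch._C._jit_set_autocast_mode(False)` keeps JIT-mode autocast from conflicting with them.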
TorchBench-based Benchmarks
URL: https://github.com/sanchitintel/benchmark/tree/onednn_graph_benchmark (instructions present at URL).
Batch-size(s): TorchBench-default for each model
Baseline: PyTorch JIT OFI FP32
Machine: Intel(R) Xeon(R) Platinum 8371HC (Cooper Lake)
Sockets used: 1
Number of cores on one socket: 26
Intel OpenMP & tcmalloc were preloaded
Benchmark results with single thread:
Benchmark results with 26 threads: