update quantization doc: add x86 backend as default backend of server inference #86794
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86794. Note: links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit c3d8613. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
Since x86 is now the default qengine replacing fbgemm, can we only recommend x86 for server CPUs? We can leave a note that fbgemm is still available but not recommended.
docs/source/quantization.rst (Outdated)

@@ -742,30 +748,31 @@ Backend/Hardware Support

 Today, PyTorch supports the following backends for running quantized operators efficiently:

-* x86 CPUs with AVX2 support or higher (without AVX2 some operations have inefficient implementations), via `fbgemm <https://github.com/pytorch/FBGEMM>`_
+* x86 CPUs with AVX2 support or higher (without AVX2 some operations have inefficient implementations), via `x86` to apply the optimization of `fbgemm <https://github.com/pytorch/FBGEMM>`_ and `onednn <https://github.com/oneapi-src/oneDNN>`_ (see the details at `RFC <https://github.com/pytorch/pytorch/issues/83888>`_)
Suggested change:

-* x86 CPUs with AVX2 support or higher (without AVX2 some operations have inefficient implementations), via `x86` to apply the optimization of `fbgemm <https://github.com/pytorch/FBGEMM>`_ and `onednn <https://github.com/oneapi-src/oneDNN>`_ (see the details at `RFC <https://github.com/pytorch/pytorch/issues/83888>`_)
+* x86 CPUs with AVX2 support or higher (without AVX2 some operations have inefficient implementations), via `x86` optimized by `fbgemm <https://github.com/pytorch/FBGEMM>`_ and `onednn <https://github.com/oneapi-src/oneDNN>`_ (see the details at `RFC <https://github.com/pytorch/pytorch/issues/83888>`_)
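For context on the doc line being edited: the unified `x86` qengine can be selected at runtime through `torch.backends.quantized.engine`. A minimal sketch follows; the fallback check is illustrative only, assuming just that `torch.backends.quantized.supported_engines` lists the engines compiled into the current build:

```python
import torch

# Prefer the unified 'x86' qengine (which dispatches to fbgemm/onednn
# kernels) when this PyTorch build supports it; otherwise leave the
# build's default engine untouched.
if "x86" in torch.backends.quantized.supported_engines:
    torch.backends.quantized.engine = "x86"

print(torch.backends.quantized.engine)
```

On an older build where `x86` is absent, the pre-existing engine (for example `fbgemm`) simply remains in effect.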
@pytorchbot rebase
@XiaobingSuper feel free to merge when ready.
@pytorchbot successfully started a rebase job. Check the current status here.
Rebase failed due to Command
Raised by https://github.com/pytorch/pytorch/actions/runs/3493543061
@kit1980 @jerryzh168, code is rebased.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
… inference (pytorch#86794). Pull Request resolved: pytorch#86794. Approved by: https://github.com/jgong5, https://github.com/kit1980