[ET-VK][DCE] Remove redundant quantized linear implementations #14151
As title. With the recent support added for dynamically quantized + weight-only quantized int4 and int8 linear, which are consolidated under `QuantizedLinear.cpp`, the operators implemented in `QuantizedLinearQGANW.cpp` and `QuantizedLinear_QTA8A_QGA4W.cpp` are no longer required.

This diff removes all code related to the operators implemented in those files, namely:

* `et_vk.linear_weight_int4.default`
* `et_vk.linear_qta8a_qga4w.default`

The AOT export logic needed to support those ops is also removed.

Differential Revision: [D82120824](https://our.internmc.facebook.com/intern/diff/D82120824/)
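For anyone checking whether an existing export still depends on the removed operators, a minimal sketch follows. It assumes you have already collected the operator names used by your graph as plain strings; the helper `find_removed_ops` and the example input are hypothetical and are not part of ExecuTorch.

```python
# Hypothetical helper, not part of ExecuTorch: flags references to the
# two operators removed by this diff in a list of operator names.
REMOVED_OPS = frozenset({
    "et_vk.linear_weight_int4.default",
    "et_vk.linear_qta8a_qga4w.default",
})


def find_removed_ops(op_names):
    """Return the removed operator names that appear in op_names, sorted."""
    return sorted(REMOVED_OPS.intersection(op_names))


# Example input (assumed): names as they might be gathered from an exported graph.
example_ops = ["aten.linear.default", "et_vk.linear_weight_int4.default"]
print(find_removed_ops(example_ops))
```

An empty result suggests the graph does not reference either removed operator; a non-empty result lists the names that would need to be re-exported against the consolidated implementation.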
This pull request was exported from Phabricator. Differential Revision: D82120824
Merged commit b304076 into gh/SS-JIA/326/base