Changes to enable per channel support on dynamic linear. #37623
Summary: Follows the same strategy as static linear. The same kernel now supports both per-channel and per-tensor linear. Fixed the fully connected test.
Test Plan: qnnpack tests: q8gemm, fully-connected-test
// TODO Kimish, we are allocating affine_quantized regardless of per channel or not.
// This allocation is actually used only for packing weight and thus will be freed.
// Still we should be consistent. Fix this.
Are the scale and offset even used here? Could this just be a normal CPU tensor of dtype uint8?
Yes it can be. Let me do that after this stack lands.
Didn't review the microkernel changes, but the rest looks good.
This pull request has been merged in 0a554ae.
Stack from ghstack:
Summary:
Follows the same strategy as static linear. Supplies a pointer to an array of zero points and requantization scales, so the same kernel now supports both per-channel and per-tensor linear.
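To illustrate the idea in the summary, here is a minimal reference sketch, not the QNNPACK microkernel itself. It assumes two separate arrays with one entry per output channel (`kernel_zero_points` and `scales`), a single input zero point, and a float output. The function name `dynamic_linear_ref` and all parameter names are hypothetical and do not come from this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference routine (not QNNPACK code).
// input:  [batch, in_features], row-major, uint8 with one zero point.
// weight: [out_features, in_features], row-major, uint8.
// kernel_zero_points / scales: arrays with out_features entries each.
// Assumption: scales[n] already folds the input scale and the weight
// scale for output channel n together.
std::vector<float> dynamic_linear_ref(
    const std::vector<uint8_t>& input,
    uint8_t input_zero_point,
    const std::vector<uint8_t>& weight,
    const uint8_t* kernel_zero_points,
    const float* scales,
    const std::vector<float>& bias,
    size_t batch,
    size_t in_features,
    size_t out_features) {
  std::vector<float> output(batch * out_features, 0.0f);
  for (size_t m = 0; m < batch; ++m) {
    for (size_t n = 0; n < out_features; ++n) {
      int32_t acc = 0;
      for (size_t k = 0; k < in_features; ++k) {
        const int32_t a =
            static_cast<int32_t>(input[m * in_features + k]) -
            static_cast<int32_t>(input_zero_point);
        // The zero point is looked up per output channel n.
        const int32_t w =
            static_cast<int32_t>(weight[n * in_features + k]) -
            static_cast<int32_t>(kernel_zero_points[n]);
        acc += a * w;
      }
      // The scale is also looked up per output channel n.
      output[m * out_features + n] =
          static_cast<float>(acc) * scales[n] + bias[n];
    }
  }
  return output;
}
```

Under these assumptions, the per-tensor case needs no separate code path: the caller fills both arrays with the same value for every output channel, while the per-channel case passes distinct values per channel.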
Test Plan:
qnnpack tests
q8gemm
fully-connected-test
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D21339040