-
Notifications
You must be signed in to change notification settings - Fork 689
Arm backend: Add TOSA dialect op for MATMUL #14694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14694
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ❌ 2 New Failures, 1 Cancelled Job, 1 Unrelated FailureAs of commit b52fed9 with merge base 400b2a5 ( NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
output_qparams.get_zp_per_tensor(), | ||
) | ||
|
||
def call(self, graph_module): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious why not do this in the aten.bmm node visitor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aten.bmm is not a one-to-one mapping to TOSA.MATMUL due to TOSA outputting an INT32 for INT8 inputs while aten.bmm would output an INT8. So a rescale needs to be inserted and we're trying to avoid inserting rescales in multiple node visitors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're trying to avoid inserting rescales in multiple node visitor
Yeah I saw #14584
I called this out is because this pass feels a lot like bmm lowering which we have been doing through the node visitors.
""" | ||
|
||
target = "aten.bmm.default" | ||
target = "tosa.MATMUL.default" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: perhaps we should rename this file a bit further to op_tosa_matmul.py
and others, if any, just to distinguish from aten op target lowering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me!
Adds TOSA backend dialect op for MATMUL and associating pass to rewrite edge.aten.bmm to tosa.MATMUL. Also renames the files of node_visttor that lowers a TOSA backend dialect op to op_tosa_*.py, e.g. op_tosa_matmul.py. Signed-off-by: Oscar Andersson <oscar.andersson@arm.com> Change-Id: I578e5f7333922e02402dabc24ef1b12adf383b18
ff47900
to
65e80e6
Compare
Signed-off-by: Oscar Andersson <oscar.andersson@arm.com> Change-Id: Iaa2c17c2f2a984970a91022cad2c49fbb7f3202e
Adds TOSA backend dialect op for MATMUL and associating pass to rewrite edge.aten.bmm to tosa.MATMUL.
cc @digantdesai @freddan80 @per @zingo