Qualcomm AI Engine Direct - gpu support part1 #12165
Dr. CI (automatically generated): see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12165
As of commit 5a7b62b with merge base 1fe59c8: 1 new failure, and 1 unrelated failure in a job marked as unstable, possibly due to flakiness on trunk.
@pytorchbot label "release notes: qualcomm"
#include <executorch/backends/qualcomm/runtime/backends/htpbackend/HtpContext.h>
#include <executorch/backends/qualcomm/runtime/backends/htpbackend/HtpDevice.h>
#include <executorch/backends/qualcomm/runtime/backends/htpbackend/HtpGraph.h>
#include <executorch/backends/qualcomm/runtime/backends/gpu/GpuBackend.h>
I'm slightly worried about the runtime size increase, since a small runtime is usually a requirement for production. Do we know how much the size increases with this PR? If I have a model that runs on HTP only, can the runtime include HTP only?
libqnn_executorch_backend.so grows from 630984 to 652672 bytes. We'll deprecate a few files in the next PR, which will hopefully reduce the size further.
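For scale, the two sizes quoted above work out to an increase of 21688 bytes, or roughly 3.4%. A minimal sketch of that arithmetic follows; the constant and function names are hypothetical and are not part of the ExecuTorch codebase.

```cpp
#include <cstdint>

// Sizes of libqnn_executorch_backend.so quoted in this thread, in bytes.
constexpr std::int64_t kSizeBefore = 630984;
constexpr std::int64_t kSizeAfter = 652672;

// Absolute growth in bytes (hypothetical helper, for illustration only).
constexpr std::int64_t size_delta(std::int64_t before, std::int64_t after) {
  return after - before;
}

// Relative growth as a percentage of the original size.
constexpr double size_delta_percent(std::int64_t before, std::int64_t after) {
  return 100.0 * static_cast<double>(after - before) /
         static_cast<double>(before);
}

// Checked at compile time: the library grows by 21688 bytes.
static_assert(size_delta(kSizeBefore, kSizeAfter) == 21688,
              "unexpected size delta");
```

Calling `size_delta_percent(kSizeBefore, kSizeAfter)` gives a value between 3.43 and 3.44.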
Which files will be deprecated in the next PR?
I think it will be aot/ir and runtime/backend/CustomProtocol*. Now that we have switched to the QNN IR backend (DLC) for the online-prepare path, qcir and the legacy code for multi-method compilation can be fully deprecated.
But that would break backward compatibility, since we used to wrap the preprocess result with the custom protocol. We'll probably let you decide when the right time to apply the change is.
Hi, I was wrong about the impact of deprecating those files. We still need to keep the custom protocol implementation to make the multi-graph path work.
The change is in #12583 now and will guarantee BC.
Sorry, I need to spend a bit more time on this: we don't have CI to test the pllm model, and I'm worried it will cause breakage.
No worries, I think GA decoder models are far more important than this. This PR is mainly a proof of concept that we can extend the capability of the QNN backend.
Can we prioritize adding stories.pte to CI to prevent BC breakage? Otherwise it's hard to catch failures.
|
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as |
|
We should be good to continue this PR, what do you think? |
|
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as |
|
Should we rebase this PR and land it?
We're doing some final checks and will submit the PR next week. We will also verify all the enabled models and try to support them on GPU after this PR.
f21b2b8 to a87ddf2
8f2b82e to 5e88803
|
I'm getting the error message
Can you help fix it?
Hi @cccclai, |
79baee9 to 7ce2b7a
Sorry, I was out for 3 days last week. It seems you pushed new commits, so let me check again.
- rename folders in backends/qualcomm/runtime/backends
- add gpu infra
7ce2b7a to 5a7b62b
It seems an internal test is failing; I will forward fix.
Summary
Test plan