Conversation

winskuo-quic
Collaborator

@winskuo-quic winskuo-quic commented Jan 23, 2025

Summary

Add a compile-only CI job for the QNN static Llama flow using the Stories 110M model.
Verify the PTE size in hybrid mode to confirm that it is reduced when weight sharing is enabled.
TODO: Runtime inference speed test.
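
For readers unfamiliar with the PTE size check, the idea can be sketched as below. This is a minimal illustration, not the code in this PR: the function names, file names, and byte budget are all hypothetical, and the real test may obtain the size differently.

```python
import os
import tempfile


def pte_size_bytes(pte_path: str) -> int:
    """Return the on-disk size of a .pte file in bytes."""
    return os.path.getsize(pte_path)


def weight_sharing_reduces_size(shared_pte: str, unshared_pte: str) -> bool:
    """True when the weight-sharing PTE is strictly smaller than the baseline."""
    return pte_size_bytes(shared_pte) < pte_size_bytes(unshared_pte)


def within_size_budget(pte_path: str, max_bytes: int) -> bool:
    """True when the PTE does not exceed a fixed byte budget (hypothetical threshold)."""
    return pte_size_bytes(pte_path) <= max_bytes


if __name__ == "__main__":
    # Stand-in files; a real CI job would point at the exported .pte artifacts.
    with tempfile.TemporaryDirectory() as tmp:
        shared = os.path.join(tmp, "shared.pte")
        unshared = os.path.join(tmp, "unshared.pte")
        with open(shared, "wb") as f:
            f.write(b"\0" * 100)
        with open(unshared, "wb") as f:
            f.write(b"\0" * 180)
        print(weight_sharing_reduces_size(shared, unshared))
        print(within_size_budget(shared, max_bytes=128))
```

A fixed budget catches regressions without needing a second export, while the pairwise comparison verifies the weight-sharing claim directly.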

cc @cccclai @shewu-quic

pytorch-bot bot commented Jan 23, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7884

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Pending

As of commit 9d73dd8 with merge base eacbeb7 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 23, 2025
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Contributor

@cccclai cccclai left a comment

Thank you for adding the test!

set -ex

# Download and prepare stories llama model artifacts
prepare_model_artifacts() {
Contributor

Should we use this?

download_stories_model_artifacts() {

Collaborator Author

Thanks for the recommendation. I will reuse this and remove setup-stories-llama.sh.

PYTHON_EXECUTABLE=python bash -x .ci/scripts/setup-stories-llama.sh
# Test static llama stories110m
PYTHON_EXECUTABLE=python backends/qualcomm/tests/test_qnn_delegate.py -k TestExampleScript.test_stories_single_llama --model SM8650 --build_folder build-android/ --executorch_root . --artifact_dir . --compile_only"
Contributor

For the accuracy check, we can probably borrow some logic from here

Collaborator Author

I will try to add an accuracy test to this PR.
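
One common shape for such an accuracy check is to compare the generated text against an expected opening phrase. The sketch below assumes that criterion; the conversation does not say how this PR judges accuracy, and the function names and the sample phrase are hypothetical.

```python
def normalize(text: str) -> str:
    """Collapse runs of whitespace so formatting differences do not fail the check."""
    return " ".join(text.split())


def output_is_acceptable(generated: str, expected_prefix: str) -> bool:
    """True when the generated text begins with the expected phrase."""
    return normalize(generated).startswith(normalize(expected_prefix))


if __name__ == "__main__":
    # Hypothetical model output and expected phrase, for illustration only.
    sample = "Once upon  a time, there was a little girl."
    print(output_is_acceptable(sample, "Once upon a time"))
```

A prefix match is deliberately loose: it tolerates small differences later in the output while still failing on garbage or empty generations.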

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@lucylq lucylq added the release notes: qualcomm Changes to the Qualcomm backend delegate label Jan 30, 2025
@cccclai
Contributor

cccclai commented Jan 30, 2025

Somehow the CI isn't triggered. Can you push a new commit?

billmguo pushed a commit to billmguo/executorch that referenced this pull request Jan 31, 2025
…7884)

Summary:
QNN Static Stories 110M Llama Compile only CI.
Verify PTE size for hybrid mode to ensure PTE size is reduced when weight sharing is enabled.
TODO: Runtime inference speed test and accuracy test

Pull Request resolved: pytorch#7884

Differential Revision: D68635095

Pulled By: cccclai
@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/llama_ci_for_meta branch from ed50f6a to 526a0d8 Compare February 3, 2025 02:30
billmguo pushed a commit to billmguo/executorch that referenced this pull request Feb 3, 2025
@winskuo-quic
Collaborator Author

Somehow the CI isn't triggered. Can you push a new commit?

Hi @cccclai,
I have changed this to a draft PR while I run tests.
I will notify you once the CI for the PTE size check and accuracy test passes.

Thanks

@winskuo-quic winskuo-quic marked this pull request as draft February 3, 2025 09:47
@cccclai
Contributor

cccclai commented Feb 3, 2025

I feel like doing it step by step is fine. For example:

  • one PR for export success
  • one PR for PTE size check success
  • one PR for the accuracy check

Just so we can get more and more coverage during this time.

@cccclai
Contributor

cccclai commented Feb 3, 2025

In the meantime, there is an SSH option to log in to the server if needed.

image

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/llama_ci_for_meta branch 2 times, most recently from e74a418 to fd5f7ac Compare February 4, 2025 08:42
@digantdesai digantdesai added partner: qualcomm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Qualcomm module: qnn Issues related to Qualcomm's QNN delegate and code under backends/qualcomm/ labels Feb 4, 2025
@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/llama_ci_for_meta branch 7 times, most recently from 20c42a9 to e4fdae5 Compare February 11, 2025 01:46
@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/llama_ci_for_meta branch 2 times, most recently from 9595ff5 to d565402 Compare February 11, 2025 05:26
@winskuo-quic winskuo-quic marked this pull request as ready for review February 11, 2025 05:36
@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/llama_ci_for_meta branch from 62c5ceb to 9d73dd8 Compare February 11, 2025 12:17
@winskuo-quic
Collaborator Author

winskuo-quic commented Feb 11, 2025

I feel like doing it step by step is fine. For example:

  • one PR for export success
  • one PR for PTE size check success
  • one PR for the accuracy check

Just so we can get more and more coverage during this time.

Hi @cccclai,
Actually, once the first test you mentioned works properly, enabling the PTE size check and accuracy test takes little extra effort, so I have enabled all 3 of them in this PR. I will work on the inference speed test in a future PR.
I deliberately made the accuracy and PTE size tests fail to confirm that the CI catches failures, as shown in the 2 images below. This PR is now ready and passes the 2 static llama tests I added. Please have another look.
Thanks

CI results from the intentional failures:
image
image

Contributor

@cccclai cccclai left a comment

This is awesome, thank you so much!

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai cccclai merged commit 14ddfd4 into pytorch:main Feb 11, 2025
75 of 78 checks passed

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. module: qnn Issues related to Qualcomm's QNN delegate and code under backends/qualcomm/ partner: qualcomm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Qualcomm release notes: qualcomm Changes to the Qualcomm backend delegate

5 participants