Qualcomm AI Engine Direct - CI For Llama #8512

winskuo-quic · 2025-02-17T08:21:37Z

Summary

Isolate the LLM tests into their own class in test_qnn_delegate.py.
This PR should not affect ExecuTorch's CI. It is mainly for our internal CI, which checks pte size, accuracy, and inference speed. It runs stories110m and Llama 3.2 1B.

cc @cccclai @shewu-quic @cbilgin @mergennachin @byjlw

pytorch-bot · 2025-02-17T08:21:41Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8512

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Cancelled Job

As of commit cdf332b with merge base 433e30b:

NEW FAILURE - The following job has failed:

Check Labels / Check labels (gh)
RuntimeError: Error checking labels: PR does not have required labels

CANCELLED JOB - The following job was cancelled. Please retry:

trunk / test-custom-ops-macos (cmake) / macos-job (gh)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

winskuo-quic · 2025-02-18T09:35:44Z

Hi @cccclai,
This PR is mainly for our internal CI, which runs inference speed tests on Stories Llama and Llama 3.2 1B.
ExecuTorch's CI will still work as usual and will keep testing pte size and accuracy using Stories Llama.
Please have a look.
Thanks.

facebook-github-bot · 2025-02-18T21:28:23Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cccclai

Looks good!

cccclai · 2025-02-19T05:25:59Z

examples/qualcomm/oss_scripts/llama/runner/runner.cpp

+
+  // For now, we just print the total inference time for CI, can save more info
+  // in future if needed.
+  std::ofstream outfile("outputs/inference_speed.txt");


It seems to write to this path by default. If users don't have this path, I assume it will fail? Can we make this runner more generic so it can be reused directly?

winskuo-quic · 2025-02-19

Yes, I think running the executable directly, without using llama.py, will fail. I will open a separate PR to make this more flexible for users who are not using the Python script.
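One way the concern above could be addressed is sketched below. This is only an illustration, not the change made in this PR or in the follow-up: `write_inference_speed` is a hypothetical helper name, and the idea of passing the output path in as a parameter is an assumption. It creates the parent directory when it is missing and reports failure instead of silently writing nothing.

```cpp
#include <filesystem>
#include <fstream>
#include <system_error>

// Hypothetical helper (not part of the ExecuTorch runner): writes a single
// measurement to `path`, creating the parent directory first if needed.
// Returns false when the directory or file cannot be created.
bool write_inference_speed(const std::filesystem::path& path, double value) {
  std::error_code ec;
  const std::filesystem::path parent = path.parent_path();
  if (!parent.empty()) {
    // No error is reported if the directory already exists.
    std::filesystem::create_directories(parent, ec);
    if (ec) {
      return false;
    }
  }

  std::ofstream outfile(path);
  if (!outfile.is_open()) {
    return false;
  }
  outfile << value << "\n";
  return outfile.good();
}
```

A caller could pass the path from a command-line option and fall back to `outputs/inference_speed.txt`, so the Python script keeps its current behavior while a standalone run of the executable no longer depends on the `outputs` directory already existing.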

facebook-github-bot added the CLA Signed label Feb 17, 2025

winskuo-quic changed the title ~~Qualcomm AI Engine Direct - CI For LLama~~ Qualcomm AI Engine Direct - CI For Llama Feb 17, 2025

Enable inference speed test and 1b test

cdf332b

winskuo-quic force-pushed the dev1/winskuo/add_1B_llama_UT branch from 940ee4a to cdf332b February 18, 2025 08:24

winskuo-quic marked this pull request as ready for review February 18, 2025 09:36

swolchok requested a review from cccclai February 18, 2025 18:39

cccclai approved these changes Feb 19, 2025

cccclai added the module: user experience and release notes: qualcomm labels Feb 19, 2025

cccclai merged commit f0ef51c into pytorch:main Feb 19, 2025
76 of 81 checks passed


4 participants
