Qualcomm AI Engine Direct - Reland GA Static QWEN2.5 0.5B #12582
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12582
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 2 Unrelated Failures as of commit 7272556 with merge base e81727a.
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
Hi @cccclai,
"use_scaled_rope": false,
"vocab_size": 151936,
"use_hf_rope": true,
"attention_qkv_bias": true
Ah nice
state_dict[f"layers.{layer_i}.attention.wk.weight"] = permute(
    state_dict[f"layers.{layer_i}.attention.wk.weight"], n_kv_heads
else:
    state_dict = torch.load(
Does this mean we support both the Hugging Face model and static llama?
Thanks for reviewing the PR.
The use of Hugging Face weights in static llama predates this PR. For Qwen, execution simply takes the `if` branch; Static Llama's behavior is unchanged.
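For context, the `permute` call in the diff above reorders the attention weight rows when loading a Hugging Face checkpoint. The sketch below shows the permutation commonly used when converting Hugging Face Llama/Qwen-style checkpoints to an interleaved rotary-embedding layout; the function name matches the diff, but the exact signature and body here are assumptions, not necessarily the helper in this PR.

```python
import torch

def permute(w: torch.Tensor, n_heads: int) -> torch.Tensor:
    """Reorder rows of a (dim1, dim2) attention weight so that the two
    rotary halves of each head are interleaved (a common HF-to-native
    checkpoint conversion step; sketch only, assumed layout)."""
    dim1, dim2 = w.shape
    return (
        w.view(n_heads, 2, dim1 // n_heads // 2, dim2)
        .transpose(1, 2)
        .reshape(dim1, dim2)
    )
```

Applied to `wk`, `n_heads` would be `n_kv_heads` as in the diff; the permutation only touches row order, so the tensor shape is preserved, and applying it twice recovers the original weight in this small-half case.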
Hey, can you rebase?
Force-pushed from 84bdb3e to 7272556 (Compare)
Looks good, thank you!
Summary: Forward fix for buck build with pytorch#12582 Differential Revision: D79056925
Summary
The previous PR was merged unintentionally: #12054
On top of the previous PR, this one also includes a new commit addressing code review feedback.
Test plan