Fix qwen2_5_xnnpack_q8da4w.yaml #13965
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13965. Note: links to docs will display an error until the docs builds have completed.

✅ No failures as of commit d809491 with merge base 14d0745. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
On the changed lines in `qwen2_5_xnnpack_q8da4w.yaml`:

```yaml
enabled: True
quantization:
```
For llama, we use

```yaml
quantization:
  qmode: 8da4w
  group_size: 128
  embedding_quantize: 4,32
```

Is 8da4w good enough for accuracy?
Yeah, we can do that as well. One caveat: I remember the 0.5B version not reacting well to embedding quantization.
Ah yeah, we definitely don't want to quantize the embedding. But I think we do need group-wise quantization. If we just set `qmode: 8da4w`, does it use group-wise quantization with a default group size?
Yeah, 128 would be the default group size
Makes sense. Thanks!
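For readers unfamiliar with the scheme discussed above: 8da4w means 8-bit dynamically quantized activations with 4-bit weights, where the weights are quantized group-wise (here, one scale per group of 128 values along a row, the default the thread mentions). Below is a minimal illustrative sketch of the group-wise 4-bit weight part in NumPy. It is an assumption-laden toy, not the ExecuTorch/torchao implementation; all function names are hypothetical.

```python
import numpy as np

def quantize_groupwise_4bit(weights: np.ndarray, group_size: int = 128):
    """Hypothetical sketch: symmetric 4-bit group-wise quantization.

    Each row of `weights` is split into groups of `group_size` columns;
    every group gets its own floating-point scale. Codes land in the
    signed int4 range [-8, 7] and are stored in int8 for convenience.
    """
    rows, cols = weights.shape
    assert cols % group_size == 0, "columns must divide evenly into groups"
    groups = weights.reshape(rows, cols // group_size, group_size)
    # Symmetric scheme: scale maps each group's max magnitude to the int4 limit 7.
    scales = np.abs(groups).max(axis=-1, keepdims=True) / 7.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero on all-zero groups
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Reconstruct float weights from codes and per-group scales."""
    return (q.astype(np.float32) * scales).reshape(q.shape[0], -1)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 256)).astype(np.float32)  # 2 groups of 128 per row
q, s = quantize_groupwise_4bit(w, group_size=128)
w_hat = dequantize(q, s)
# Per-group scales keep the worst-case rounding error to half a quantization step.
print(np.abs(w - w_hat).max())
```

The design point the thread is probing: with a single scale per whole row (no grouping), one outlier would inflate the scale and wash out every other 4-bit code; a group size of 128 limits each outlier's blast radius to its own group, which is why group-wise quantization matters for accuracy at 4 bits.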