
Conversation

jackzhxng
Contributor

No description provided.

@jackzhxng jackzhxng requested a review from lucylq as a code owner September 4, 2025 21:07
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 4, 2025

pytorch-bot bot commented Sep 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13965

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d809491 with merge base 14d0745:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


github-actions bot commented Sep 4, 2025

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

(Inline review thread on the config diff, at the quantization: section)
Contributor


For llama, we use

quantization:
  qmode: 8da4w
  group_size: 128
  embedding_quantize: 4,32

Is 8da4w good enough for accuracy?

Contributor Author


Yeah, we can do that as well. One caveat: I remember the 0.5B version not reacting well to embedding quantization.

Contributor


Ah yeah, we definitely don't want to quantize the embedding. But I think we still need group-wise quantization. If we just set qmode: 8da4w, does it use group-wise quantization with a default group size?

Contributor Author


Yeah, 128 would be the default group size.
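
For example, a minimal sketch of the resulting config section under that discussion, assuming it follows the same schema as the llama config quoted above; embedding_quantize is omitted per the earlier 0.5B caveat, and group_size: 128 is spelled out only to make the default visible:

quantization:
  qmode: 8da4w        # 8-bit dynamic activations, 4-bit weights
  group_size: 128     # explicit, but this is already the default for 8da4w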

Contributor

@cccclai cccclai left a comment


Makes sense. Thanks!

@jackzhxng jackzhxng merged commit 0eed262 into main Sep 5, 2025
117 of 119 checks passed
@jackzhxng jackzhxng deleted the jz/fix-qwen2_5-config branch September 5, 2025 13:51