
Conversation

@helunwencser (Contributor) commented on Oct 2, 2024

Stack from ghstack (oldest at bottom):

Differential Revision: D63714794

This PR adds support for loading QAT_LoRA checkpoints. It mainly does two things:
- Refactors the existing SpinQuant quantization flow into a separate function, which is also used to load QAT checkpoints, since the two checkpoint formats are the same.
- For QAT_LoRA checkpoints, performs one extra step after quantization: it replaces the `Int8DynActInt4WeightLinear` layers with `Int8DynActInt4WeightLinearLoRA` layers, which contain the LoRA adapters (see the sketch below).
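A minimal, purely illustrative sketch of how such a post-quantization module swap could look. This is not the PR's actual code: the helper `swap_modules` and the `Int8DynActInt4WeightLinearLoRA` constructor shown in the commented usage are assumptions.

```python
import torch.nn as nn


def swap_modules(module: nn.Module, should_swap, build_replacement) -> None:
    """Recursively replace every child module selected by `should_swap`
    with the module returned by `build_replacement`."""
    for name, child in module.named_children():
        if should_swap(child):
            setattr(module, name, build_replacement(child))
        else:
            swap_modules(child, should_swap, build_replacement)


# Usage idea for the QAT_LoRA path (hedged: the LoRA linear's constructor
# arguments are an assumption, not taken from this PR):
# swap_modules(
#     model,
#     should_swap=lambda m: type(m).__name__ == "Int8DynActInt4WeightLinear",
#     build_replacement=lambda m: Int8DynActInt4WeightLinearLoRA(...),
# )
```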

Differential Revision: [D63714794](https://our.internmc.facebook.com/intern/diff/D63714794/)

[ghstack-poisoned]

pytorch-bot (bot) commented on Oct 2, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5823

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 96e6198 with merge base 152e22d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Oct 2, 2024. (This label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed.)
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D63714794

helunwencser added a commit that referenced this pull request Oct 2, 2024

ghstack-source-id: 245945707
Pull Request resolved: #5823
@helunwencser changed the base branch from gh/helunwencser/38/base to main on October 2, 2024 at 18:38
helunwencser added a commit that referenced this pull request Oct 2, 2024
Pull Request resolved: #5823

ghstack-source-id: 245956347
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D63714794

@helunwencser changed the title from "add support for loading qat_lora checkpoints" to "add more options for loading checkpoints" on Oct 2, 2024
@mergennachin self-requested a review on October 2, 2024 at 19:20
@facebook-github-bot
Contributor

This pull request has been merged in 9ff3351.

@cccclai mentioned this pull request on Jul 22, 2025

Labels: CLA Signed, fb-exported, Merged
