[ET-VK] Add custom `VkInt4WeightOnlyQuantizer` for vulkan #6234

SS-JIA · 2024-10-15T16:47:25Z

Stack from ghstack (oldest at bottom):

## Context

This diff adds the `VkInt4WeightOnlyQuantizer` class, which enables 4-bit quantization of linear layers via source transformation. The quantizer class is copied from `torchao.quantization.GPTQ.WeightOnlyInt4Linear`, with some minor changes that are annotated in the implementation.

Note that the pt2e quantization flow does not yet support groupwise quantization, so source transformation is the only way to perform groupwise quantization at the moment.
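For readers unfamiliar with the term, here is a minimal pure-Python sketch of what groupwise 4-bit weight quantization means numerically. It is not the code in this diff and not the torchao implementation: the function names, the affine (min/scale) scheme, and the nibble packing order are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def quantize_groupwise_int4(
    row: List[float], group_size: int
) -> Tuple[List[int], List[float], List[float]]:
    """Quantize one weight row to 4-bit values, one (scale, zero) pair per group.

    Hypothetical helper: an affine scheme where each group of `group_size`
    weights is mapped onto the 16 levels 0..15.
    """
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    qvals: List[int] = []
    scales: List[float] = []
    zeros: List[float] = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        lo, hi = min(group), max(group)
        # 4 bits give 16 levels; fall back to 1.0 if the group is constant.
        scale = (hi - lo) / 15 or 1.0
        scales.append(scale)
        zeros.append(lo)
        # Round to the nearest level and clamp into 0..15.
        qvals.extend(min(15, max(0, round((w - lo) / scale))) for w in group)
    return qvals, scales, zeros


def dequantize_groupwise_int4(
    qvals: List[int], scales: List[float], zeros: List[float], group_size: int
) -> List[float]:
    """Reconstruct approximate weights using each value's group parameters."""
    return [
        q * scales[i // group_size] + zeros[i // group_size]
        for i, q in enumerate(qvals)
    ]


def pack_int4(qvals: List[int]) -> bytes:
    """Pack two 4-bit values per byte, low nibble first (assumed layout)."""
    if len(qvals) % 2 != 0:
        raise ValueError("need an even number of values to pack")
    return bytes(qvals[i] | (qvals[i + 1] << 4) for i in range(0, len(qvals), 2))


# Usage: two groups of four weights with very different ranges.
weights = [-1.0, -0.5, 0.0, 0.5, 10.0, 20.0, 30.0, 40.0]
q, scales, zeros = quantize_groupwise_int4(weights, group_size=4)
restored = dequantize_groupwise_int4(q, scales, zeros, group_size=4)
packed = pack_int4(q)
```

Because each group carries its own scale and zero point, the small-valued first group is not forced onto the coarse grid that the large-valued second group needs, which is the motivation for quantizing groupwise rather than per tensor.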

Differential Revision: D64406457

pytorch-bot · 2024-10-15T16:47:29Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6234

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d2e8d29 with merge base 8673567:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2024-10-15T16:47:36Z

This pull request was exported from Phabricator. Differential Revision: D64406457

facebook-github-bot · 2024-10-16T19:51:49Z

This pull request has been merged in 58ee33d.

pytorch-bot bot added the ciflow/periodic and module: vulkan labels Oct 15, 2024

facebook-github-bot added the CLA Signed label Oct 15, 2024

facebook-github-bot added the fb-exported label Oct 15, 2024

This was referenced Oct 15, 2024

[ET-VK] Fix implementation of int4 quantized linear #6200

Closed

[Ez] Enable Vulkan 4-bit weight only quantization in export_llama #6235

Closed

junpi3 approved these changes Oct 16, 2024

facebook-github-bot closed this in 58ee33d Oct 16, 2024

facebook-github-bot added the Merged label Oct 16, 2024

SS-JIA deleted the gh/SS-JIA/115/head branch January 24, 2025 19:45
