
[quant] PerChannelFloatQParams support for quint4x2 dtype #45594

Closed
wants to merge 5 commits

Conversation

supriyar (Contributor) commented Sep 30, 2020

Stack from ghstack:

Summary:
Adds support for per-channel quantization using float qparams for a 4-bit dtype (quint4x2).
We use the new dispatch mechanism and the existing quantize/dequantize kernels to pack the
4-bit data depending on the bit_width.
A 4-bit quantized tensor is half the size of an 8-bit quantized tensor.

Test Plan:
python test/test_quantization.py TestQuantizedTensor.test_quantize_per_channel_sub_byte

Differential Revision: D24025595
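
The packing scheme described in the summary can be sketched in plain Python. This is an illustrative model of sub-byte per-channel quantization, not the actual ATen kernel: the function name, the clamp-to-[0, 15] behavior, and the low-nibble-first packing order are all assumptions made for the example.

```python
def quantize_per_channel_4bit(rows, scales, zero_points):
    """Quantize per-channel float data to 4-bit codes and pack two codes per byte.

    rows: list of per-channel lists of floats.
    scales / zero_points: one float scale and int zero_point per channel.
    Returns one packed bytes object per channel (quint4x2-style layout).
    """
    packed_rows = []
    for row, scale, zp in zip(rows, scales, zero_points):
        # Quantize each float to a 4-bit code, clamped to the [0, 15] range.
        codes = [min(15, max(0, round(x / scale + zp))) for x in row]
        # Pad to an even length so two codes fit in each byte.
        if len(codes) % 2:
            codes.append(0)
        # Pack: first value in the low nibble, second in the high nibble
        # (nibble order here is an assumption, not the kernel's documented layout).
        packed = bytes((codes[i] | (codes[i + 1] << 4))
                       for i in range(0, len(codes), 2))
        packed_rows.append(packed)
    return packed_rows
```

Note that each channel's packed storage is `ceil(n / 2)` bytes for `n` elements, which is where the "half the size of an 8-bit quantized tensor" claim comes from.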

supriyar added a commit that referenced this pull request Sep 30, 2020
ghstack-source-id: 97553da79ec9e1eb2a63516a5ce37a9747dd9264
Pull Request resolved: #45594
raghuramank100 (Contributor) left a comment

Does dequantize() work with no changes?

supriyar (Contributor, Author) commented Oct 1, 2020

> Does dequantize() work with no changes?

No, it does require changes. I've modified dequantize_per_channel_affine_kernel to do the unpacking when dequantizing.
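
The unpacking step the dequantize kernel needs can be sketched in plain Python. As above, this is an illustrative model, not the real dequantize_per_channel_affine_kernel; the function name, the low-nibble-first order, and the trailing-pad handling via `num_elements` are assumptions for the example.

```python
def dequantize_per_channel_4bit(packed_rows, scales, zero_points, num_elements):
    """Unpack two 4-bit codes from each byte, then dequantize per channel.

    packed_rows: one packed bytes object per channel (two codes per byte).
    num_elements: logical element count per channel, so a padding nibble
    appended during packing can be dropped.
    """
    out = []
    for packed, scale, zp in zip(packed_rows, scales, zero_points):
        codes = []
        for b in packed:
            codes.append(b & 0x0F)          # low nibble holds the first value
            codes.append((b >> 4) & 0x0F)   # high nibble holds the second value
        # Drop any padding nibble, then apply the affine dequantization.
        row = [(c - zp) * scale for c in codes[:num_elements]]
        out.append(row)
    return out
```

The key point from the comment above is that the dequantize path cannot reuse the 8-bit code path unchanged: it must first widen each byte back into two codes before applying `(code - zero_point) * scale`.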

supriyar added a commit that referenced this pull request Oct 1, 2020
ghstack-source-id: a10aaeeebc5102d4e301d1d23e3a6ab257e9758d
Pull Request resolved: #45594
supriyar added a commit that referenced this pull request Oct 1, 2020
ghstack-source-id: 2e8feabb1ffa50d6219b987ff78c796e5918903b
Pull Request resolved: #45594
codecov bot commented Oct 2, 2020

Codecov Report

Merging #45594 into gh/supriyar/190/base will not change coverage.
The diff coverage is n/a.


@@                  Coverage Diff                  @@
##           gh/supriyar/190/base   #45594   +/-   ##
=====================================================
  Coverage                 68.50%   68.50%           
=====================================================
  Files                       408      408           
  Lines                     52487    52487           
=====================================================
  Hits                      35954    35954           
  Misses                    16533    16533           

Last update 320861a...bff8e10.

facebook-github-bot (Contributor) commented: This pull request has been merged in 1a2d3b6.

4 participants