
QuantTensor quantization invariant op on per tensor vs per channel #728

Open · volcacius opened this issue Oct 18, 2023 · 1 comment
Labels: good first issue (Good for newcomers)

@volcacius (Contributor)

Currently we don't check whether a QuantTensor is per-channel or per-tensor when we apply certain ops, like flatten or shuffle/unshuffle, that are sensitive to quantization granularity.

@ScXfjiang

A PyTorch Tensor has a qscheme attribute that specifies how the tensor is quantized (demonstrated in the snippet after this list), which includes:

  • torch.per_tensor_affine
  • torch.per_tensor_symmetric
  • torch.per_channel_affine
  • torch.per_channel_symmetric

In Brevitas, if granularity is our only concern so far, maybe we can assess it by examining the shape of the quantization parameters (scale, zero point), as in the sketch after this list:

  • shape [1] --> per_tensor quantization
  • otherwise --> per_channel quantization

@ScXfjiang mentioned this issue in #891, Apr 21, 2024