
Tensor Core Layout docs is not clear #386

Closed
msaroufim opened this issue Jun 16, 2024 · 4 comments


@msaroufim
Member

Right now all we have are docstrings, and they could use some work. This came up as @vayuda was looking at extending his bitpacking work to include a notion of scales.

  1. What does tensor core layout mean? It's not a googlable term, and it seems to mean the weight is packed into a format that tinygemm understands: torch.ops.aten._weight_int4pack_mm(input_tensor.contiguous(), packed_weight, groupsize, scale_and_zero) (a sketch of this call path follows the docstring below)
  2. It's unclear why scale and zero_point are packed into a single scale_and_zero tensor
  3. innerKTiles is never defined
  4. The API does not describe how it is meant to be used
@register_aqt_layout_cls("tensor_core_tiled")
class TensorCoreTiledAQTLayout(AQTLayout):
    """
    Layout storage class for the tensor_core_tiled layout of an affine quantized tensor.
    This layout is int4-only: it stores the original tensor of dimension [n][k] (int32 dtype)
    as a packed weight, a 4-d tensor of dimension [n / 8][k / (innerKTiles * 16)][32][innerKTiles / 2].
    TODO: innerKTiles is hardcoded to 8 for now; we'll make it an argument once we've
    decided on the API.
    fields:
      packed_weight (torch.Tensor): the 4-d packed tensor in the tensor_core_tiled layout
      scale_and_zero (torch.Tensor): the scale tensor and zero_point tensor, packed into a
        single tensor, used to map between the floating point tensor and the quantized tensor
    """
@jerryzh168
Contributor

jerryzh168 commented Jun 17, 2024

  1. Yeah, tensor_core_tiled layout means a layout optimized for the tensor core int4 tinygemm kernels.
  2. scale_and_zero is packed into a single tensor because tinygemm requires it.
  3. inner_k_tiles is documented here: https://github.com/pytorch/ao/blob/main/torchao/quantization/quant_api.py#L360
  4. "tensor_core_tiled" is just one of the layouts used for AffineQuantizedTensor. This is how it's used:
    return to_affine_quantized(weight, mapping_type, block_size, target_dtype, quant_min, quant_max, eps, zero_point_dtype=zero_point_dtype, preserve_zero=preserve_zero, zero_point_domain=zero_point_domain, extended_layout="tensor_core_tiled", inner_k_tiles=inner_k_tiles)
    TensorCoreTiledAQTLayout is not a top-level API.
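On (2): for illustration, here is a minimal sketch of what that packing can look like. The helper name is hypothetical, and the [k // group_size, n, 2] bfloat16 output shape is an assumption about what _weight_int4pack_mm's scale_and_zero argument expects; it is not a verbatim copy of torchao's internal helper.

import torch

def pack_scales_and_zeros(scales: torch.Tensor, zeros: torch.Tensor) -> torch.Tensor:
    # Sketch only (hypothetical helper). Assumes per-group scales and zeros
    # of shape [n, k // group_size], stacked into the single
    # [k // group_size, n, 2] bfloat16 tensor tinygemm appears to expect.
    assert scales.shape == zeros.shape
    packed = torch.stack([scales, zeros], dim=-1)  # [n, k // group_size, 2]
    return packed.transpose(0, 1).contiguous().to(torch.bfloat16)

Keeping them in one tensor rather than two arguments simply mirrors the kernel's signature, which takes a single qScaleAndZeros tensor.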

@supriyar
Contributor

@jerryzh168 is this being addressed by @jainapurva's refactor, or is it still an issue?

@jerryzh168
Contributor

Not fully, I think, but I can follow up with some doc fixes to address this.

jerryzh168 added a commit to jerryzh168/ao that referenced this issue Oct 18, 2024
Summary:
Following pytorch#988 we added TP support for int4_weight_only quantization in torchao
that's using TensorCoreTiledLayout.

Addresses one work item in pytorch#988

Also clarified docs based on pytorch#386

Also restructured the tests in test/dtypes/test_affine_quantized_tensor_parallel.py to not depend on
torchao/utils.py, to reduce the jumps people have to do to understand what is tested.

Test Plan:
python test/dtypes/test_affine_quantized_tensor_parallel.py
jerryzh168 added a commit that referenced this issue Oct 19, 2024
* Add tensor parallelism support for int4_weight_only quantization (same commit message as the commit above)

* typo
@jerryzh168
Contributor

I think we can close this now; I just landed #1120. cc @msaroufim, please let me know if there are any follow-up questions around the tensor core tiled layout, or layouts in general.
