
Tensor Core Layout docs is not clear #386

Closed
msaroufim opened this issue Jun 16, 2024 · 4 comments


@msaroufim
Member

Right now all we have are docstrings, and they could use some work. This came up as @vayuda was looking at extending his bitpacking work to include a notion of scales.

  1. What does tensor core layout mean? It's not a googlable term, and it seems to mean the weight is packed into a format that tinygemm understands: torch.ops.aten._weight_int4pack_mm(input_tensor.contiguous(), packed_weight, groupsize, scale_and_zero) (a sketch of this call path follows the docstring below)
  2. It's unclear why scale and zero_point are packed into a single scale_and_zero tensor
  3. innerKTiles is never defined
  4. The API does not describe how it is meant to be used
@register_aqt_layout_cls("tensor_core_tiled")
class TensorCoreTiledAQTLayout(AQTLayout):
    """
    Layout storage class for the tensor_core_tiled layout of an affine quantized tensor.
    This layout is int4-only: it stores the original tensor of dimension [n][k] (int32 dtype)
    as a packed weight, a 4-d tensor of dimension [n / 8][k / (innerKTiles * 16)][32][innerKTiles / 2].
    TODO: innerKTiles is hardcoded to 8 for now; we'll make it an argument once we've
    decided on the API.
    fields:
      packed_weight (torch.Tensor): the 4-d packed tensor in the tensor_core_tiled layout
      scale_and_zero (torch.Tensor): the scale tensor and zero_point tensor, packed into a
        single tensor, used to map between the floating point tensor and the quantized tensor
    """
@jerryzh168
Contributor

jerryzh168 commented Jun 17, 2024

  1. Yeah, tensor_core_tiled layout means a layout optimized for the tensor core int4 tinygemm kernels.
  2. scale_and_zero is packed into a single tensor because tinygemm requires it.
  3. inner_k_tiles is documented here: https://github.com/pytorch/ao/blob/main/torchao/quantization/quant_api.py#L360
  4. "tensor_core_tiled" is just one of the layouts used for AffineQuantizedTensor. This is how it's used:
    return to_affine_quantized(weight, mapping_type, block_size, target_dtype, quant_min, quant_max, eps, zero_point_dtype=zero_point_dtype, preserve_zero=preserve_zero, zero_point_domain=zero_point_domain, extended_layout="tensor_core_tiled", inner_k_tiles=inner_k_tiles)
    TensorCoreTiledAQTLayout is not a top-level API.
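On (2): for illustration, here is a minimal sketch of what that packing can look like. The helper name is hypothetical, and the [k // group_size, n, 2] bfloat16 output shape is an assumption about what _weight_int4pack_mm's scale_and_zero argument expects; it is not a verbatim copy of torchao's internal helper.

import torch

def pack_scales_and_zeros(scales: torch.Tensor, zeros: torch.Tensor) -> torch.Tensor:
    # Sketch only (hypothetical helper). Assumes per-group scales and zeros
    # of shape [n, k // group_size], stacked into the single
    # [k // group_size, n, 2] bfloat16 tensor tinygemm appears to expect.
    assert scales.shape == zeros.shape
    packed = torch.stack([scales, zeros], dim=-1)  # [n, k // group_size, 2]
    return packed.transpose(0, 1).contiguous().to(torch.bfloat16)

Keeping them in one tensor rather than two arguments simply mirrors the kernel's signature, which takes a single qScaleAndZeros tensor.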

@supriyar
Contributor

@jerryzh168 is this being addressed by @jainapurva's refactor, or is it still an issue?

@jerryzh168
Contributor

Not fully, I think, but I can follow up with some doc fixes to address this.

jerryzh168 added a commit to jerryzh168/ao that referenced this issue Oct 18, 2024
Summary:
Following pytorch#988 we added TP support for int4_weight_only quantization in torchao
that's using TensorCoreTiledLayout.

Addresses one work item in pytorch#988

Also clarified docs based on pytorch#386

Also restructured the tests in test/dtypes/test_affine_quantized_tensor_parallel.py to not depend on
torchao/utils.py, to reduce the jumps people have to do to understand what is tested.

Test Plan:
python test/dtypes/test_affine_quantized_tensor_parallel.py
jerryzh168 added a commit that referenced this issue Oct 19, 2024
* Add tensor parallelism support for int4_weight_only quantization (same commit message as the commit above)

* typo
@jerryzh168
Contributor

I think we can close this now; I just landed #1120. cc @msaroufim, please let me know if there are any follow-up questions around the tensor core tiled layout, or layouts in general.
