[mlir][linalg] Update pack and unpack documentation #143903
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-linalg

Author: Christopher McGirr (chrsmcgrr)

Changes
I encountered some odd variations of `linalg.pack` and `linalg.unpack` while working on some TFLite models and the definition in the documentation did not match what I saw pass in IR verification. The following changes reconcile those differences.

Full diff: https://github.com/llvm/llvm-project/pull/143903.diff

3 Files Affected:
diff --git a/mlir/include/mlir/Dialect/Linalg/IR/LinalgRelayoutOps.td b/mlir/include/mlir/Dialect/Linalg/IR/LinalgRelayoutOps.td
index 1e48a5e3a20ee..fef1900be62ea 100644
--- a/mlir/include/mlir/Dialect/Linalg/IR/LinalgRelayoutOps.td
+++ b/mlir/include/mlir/Dialect/Linalg/IR/LinalgRelayoutOps.td
@@ -94,11 +94,13 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
     and optionally transposes the tiled source tensor dimensions.

     `inner_dims_pos` (mandatory) specifies `k` source tensor dimensions that are
-    being tiled, where `0 < k <= n`. The order of the dimensions matters:
-    - The tiled dimensions (of size `inner_tiles`) are added to the end of the result
-      tensor in the order in which they appear in `inner_dims_pos`.
+    being tiled, where `0 < k <= n`.
     - `inner_dims_pos[i]` specifies the source tensor dimension tiled by
-      `inner_tiles[i]`.
+      `inner_tiles[i]` where `0 <= i < k`.
+    - the resulting tiled source dimension maps to an outer dimension of the
+      packed tensor in the order the non-tiled dimension appeared in the source
+      tensor, i.e. `shape(result)[inner_dims_pos[i]]` is equal to
+      `shape(source)[inner_dims_pos[i]] / inner_tiles[i]`.

     `inner_tiles` (mandatory) specifies `k` tile sizes. These tile sizes
     correspond to the least significant ("inner") result tensor dimension sizes,
@@ -117,6 +119,16 @@ def Linalg_PackOp : Linalg_RelayoutOp<"pack", [
         into %dest : tensor<128x256xf32> -> tensor<16x8 x 8x32 xf32>
     //                                            \  /   \  /
     //                                       outer dims  inner dims
+    // CHW to CHWhw
+    %0 = linalg.pack %source inner_dims_pos = [2, 1] inner_tiles = [4, 2]
+        into %dest : tensor<1x8x16xf32> -> tensor<1x4x4 x 4x2 xf32>
+    //                                          \  /   \  /
+    //                                     outer dims  inner dims
+    // HCW to HCWhw
+    %0 = linalg.pack %source inner_dims_pos = [2, 0] inner_tiles = [4, 2]
+        into %dest : tensor<20x1x12xf32> -> tensor<10x1x3 x 4x2xf32>
+    //                                          \    /   \  /
+    //                          Outer Dims: 10x1x3    Inner Dims: 4x2
     ```

     `outer_dims_perm` (optional) specifies a permutation for the outer
@@ -246,12 +258,14 @@ def Linalg_UnPackOp : Linalg_RelayoutOp<"unpack"> {
     The "unpack" operation converts a source tensor of rank `n` with a tiled and
     packed layout to a result tensor of rank `n - k`.

-    `inner_dims_pos` (mandatory) specifies `k` source tensor dimensions with
-    which the last `k` source tensor dimensions are combined, where
-    `0 < k <= n/2`. Each `inner_dims_pos` element must be `>= 0` and `< n - k`.
-    The order of the dimensions in `inner_dims_pos` matters: dimension
-    `inner_dims_pos[i]` is combined with dimension `n - k + i` (assuming that
-    `outer_dims_perm` is not specified).
+    `inner_dims_pos` (mandatory) specifies `k` result tensor dimensions that
+    were tiled with the `inner_tiles` to create the packed source tensor. The
+    source tensor dimensions can be combined given `inner_dims_pos` as follows:
+    the inner tile `shape(source)[n-k+i]` is combined with
+    `shape(source)[inner_dims_pos[i]]` where `0 <= i < k` and stored at
+    `shape(result)[inner_dims_pos[i]]`. The remaining dimensions are
+    `shape(result)[j] = shape(source)[j]` where `0 <= j < n-k` and `j` is not in
+    the set of `inner_dims_pos` indices.

     `inner_tiles` (mandatory) specifies `k` tile sizes. These tile sizes
     correspond to the least significant ("inner") source tensor dimension sizes.
@@ -266,7 +280,11 @@ def Linalg_UnPackOp : Linalg_RelayoutOp<"unpack"> {
     dimensions. If specified, it must have `n - k` elements. If specified, this
     permutation is applied before combining any dimensions.

-    Example:
+    Note, the amount of elements in the source (packed tensor) and the result
+    (unpacked) can be unequal, i.e. `SizeOf(source) >= SizeOf(result)`. As
+    the unpack operation may drop any padding introduced by the pack operation.
+
+    Examples:

     ```mlir
     // NCnc to NC:
@@ -277,6 +295,17 @@ def Linalg_UnPackOp : Linalg_RelayoutOp<"unpack"> {
     %0 = linalg.unpack %source outer_dims_perm = [1, 0] inner_dims_pos = [0, 1]
         inner_tiles = [8, 32] into %dest
         : tensor<8x16x8x32xf32> -> tensor<128x256xf32>
+
+    // CHWhw to CHW:
+    %0 = linalg.unpack %source inner_dims_pos = [2, 1] inner_tiles = [4, 2]
+        into %dest : tensor<1x3x2x4x2xf32> -> tensor<1x5x7xf32>
+    //                          /    \
+    //      Outer Dims: 1x3x2         Inner Dims: 4x2
+    // HCWhw to HCW:
+    %0 = linalg.unpack %source inner_dims_pos = [2, 0] inner_tiles = [4, 2]
+        into %dest : tensor<10x1x3 x 4x2xf32> -> tensor<20x1x12xf32>
+    //                      /     \
+    //    Outer Dims: 10x1x3       Inner Dims: 4x2
     ```
   }];
   let arguments = (ins AnyRankedTensor:$source,
diff --git a/mlir/test/Dialect/Linalg/invalid.mlir b/mlir/test/Dialect/Linalg/invalid.mlir
index c0c5f785e856b..fcca6dfadf723 100644
--- a/mlir/test/Dialect/Linalg/invalid.mlir
+++ b/mlir/test/Dialect/Linalg/invalid.mlir
@@ -1824,6 +1824,17 @@ func.func @unpack_invalid_outer_dims_perm(%source: tensor<128x256xf32>, %dest: t
// -----
+// Here we have the source tensor being tiled as: `source[1] / 32` and `source[0] / 16` but the inner_dims_pos does not imply
+// a transpose of the outer dimensions for the result tensor. The tiled dimensions appear in the result tensor in the order
+// they appear in the source tensor, i.e. 16x4x32x16
+func.func @pack_invalid_result_shape(%input: tensor<256x128xf32>, %output: tensor<4x16x32x16xf32>) -> tensor<4x16x32x16xf32> {
+ // expected-error@+1 {{the shape of output is not large enough to hold the packed data. Expected at least 'tensor<16x4x32x16xf32>', got 'tensor<4x16x32x16xf32>'}}
+ %0 = linalg.pack %input inner_dims_pos = [1, 0] inner_tiles = [32, 16] into %output : tensor<256x128xf32> -> tensor<4x16x32x16xf32>
+ return %0 : tensor<4x16x32x16xf32>
+}
+
+// -----
+
func.func @pack_invalid(%input: tensor<256x128xf32>, %output: tensor<8x8x32x16xf32>) -> tensor<8x8x32x16xf32> {
// expected-error@+1 {{the shape of output is not large enough to hold the packed data. Expected at least 'tensor<8x8x16x32xf32>', got 'tensor<8x8x32x16xf32>'}}
%0 = linalg.pack %input inner_dims_pos = [1, 0] inner_tiles = [16, 32] into %output : tensor<256x128xf32> -> tensor<8x8x32x16xf32>
diff --git a/mlir/test/Dialect/Linalg/named-ops.mlir b/mlir/test/Dialect/Linalg/named-ops.mlir
index 470bc1c78640c..b21b234bc7841 100644
--- a/mlir/test/Dialect/Linalg/named-ops.mlir
+++ b/mlir/test/Dialect/Linalg/named-ops.mlir
@@ -2771,6 +2771,69 @@ func.func @pad_and_pack_partially_dynamic(%source: tensor<?x?xf32>, %dest: tenso
// -----
+func.func @pack_descending_inner_dims_with_padding(%source: tensor<1x5x7xf32>, %dest: tensor<1x3x2x4x2xf32>, %pad: f32) -> tensor<1x3x2x4x2xf32> {
+ %0 = linalg.pack %source padding_value(%pad : f32) inner_dims_pos = [2, 1] inner_tiles = [4, 2] into %dest : tensor<1x5x7xf32> -> tensor<1x3x2x4x2xf32>
+ return %0 : tensor<1x3x2x4x2xf32>
+}
+
+// CHECK-LABEL: func.func @pack_descending_inner_dims_with_padding(
+// CHECK-SAME: %[[SOURCE:.*]]: tensor<1x5x7xf32>,
+// CHECK-SAME: %[[DEST:.*]]: tensor<1x3x2x4x2xf32>,
+// CHECK-SAME: %[[PAD:.*]]: f32)
+// CHECK: %{{.*}} = linalg.pack
+// CHECK-SAME: inner_dims_pos = [2, 1]
+// CHECK-SAME: inner_tiles = [4, 2]
+// CHECK-SAME: into %[[DEST]] : tensor<1x5x7xf32> -> tensor<1x3x2x4x2xf32>
+
+// -----
+
+// The function suffix "with_padding" refers to the padding that was introduced by the pack operation. But here
+// we are dropping the padding. Creating a tensor with less elements than what we started with.
+func.func @unpack_descending_inner_dims_with_padding(%source: tensor<1x3x2x4x2xf32>, %dest: tensor<1x5x7xf32>) -> tensor<1x5x7xf32> {
+ %0 = linalg.unpack %source inner_dims_pos = [2, 1] inner_tiles = [4, 2] into %dest : tensor<1x3x2x4x2xf32> -> tensor<1x5x7xf32>
+ return %0 : tensor<1x5x7xf32>
+}
+
+// CHECK-LABEL: func.func @unpack_descending_inner_dims_with_padding(
+// CHECK-SAME: %[[SOURCE:.*]]: tensor<1x3x2x4x2xf32>,
+// CHECK-SAME: %[[DEST:.*]]: tensor<1x5x7xf32>)
+// CHECK: %{{.*}} = linalg.unpack
+// CHECK-SAME: inner_dims_pos = [2, 1]
+// CHECK-SAME: inner_tiles = [4, 2]
+// CHECK-SAME: into %[[DEST]] : tensor<1x3x2x4x2xf32> -> tensor<1x5x7xf32>
+
+// -----
+
+func.func @pack_non_adjacent_inner_dims(%source: tensor<20x1x12xf32>, %dest: tensor<10x1x3x4x2xf32>) -> tensor<10x1x3x4x2xf32> {
+ %0 = linalg.pack %source inner_dims_pos = [2, 0] inner_tiles = [4, 2] into %dest : tensor<20x1x12xf32> -> tensor<10x1x3x4x2xf32>
+ return %0 : tensor<10x1x3x4x2xf32>
+}
+
+// CHECK-LABEL: func.func @pack_non_adjacent_inner_dims(
+// CHECK-SAME: %[[SOURCE:.*]]: tensor<20x1x12xf32>,
+// CHECK-SAME: %[[DEST:.*]]: tensor<10x1x3x4x2xf32>)
+// CHECK: %{{.*}} = linalg.pack
+// CHECK-SAME: inner_dims_pos = [2, 0]
+// CHECK-SAME: inner_tiles = [4, 2]
+// CHECK-SAME: into %[[DEST]] : tensor<20x1x12xf32> -> tensor<10x1x3x4x2xf32>
+
+// -----
+
+func.func @unpack_non_adjacent_inner_dims(%source: tensor<10x1x3x4x2xf32>, %dest: tensor<20x1x12xf32>) -> tensor<20x1x12xf32> {
+ %0 = linalg.unpack %source inner_dims_pos = [2, 0] inner_tiles = [4, 2] into %dest : tensor<10x1x3x4x2xf32> -> tensor<20x1x12xf32>
+ return %0 : tensor<20x1x12xf32>
+}
+
+// CHECK-LABEL: func.func @unpack_non_adjacent_inner_dims(
+// CHECK-SAME: %[[SOURCE:.*]]: tensor<10x1x3x4x2xf32>,
+// CHECK-SAME: %[[DEST:.*]]: tensor<20x1x12xf32>)
+// CHECK: %{{.*}} = linalg.unpack
+// CHECK-SAME: inner_dims_pos = [2, 0]
+// CHECK-SAME: inner_tiles = [4, 2]
+// CHECK-SAME: into %[[DEST]] : tensor<10x1x3x4x2xf32> -> tensor<20x1x12xf32>
+
+// -----
+
func.func @unpack_fully_dynamic(%source: tensor<?x?x?x?xf32>, %dest: tensor<?x?xf32>, %tile_n : index, %tile_m : index) -> tensor<?x?xf32> {
%0 = linalg.unpack %source inner_dims_pos = [0, 1] inner_tiles = [%tile_n, %tile_m] into %dest : tensor<?x?x?x?xf32> -> tensor<?x?xf32>
return %0 : tensor<?x?xf32>
@banach-space here are the documentation changes we discussed. Please comment if they do not align with your expectations.
Hey @chrsmcgrr - thanks for taking on the tricky task of improving the docs for pack and unpack. This is long overdue! I’ve left some suggestions inline, mostly based on past experience and what (in my opinion) makes the current version a bit hard to follow - mainly too much prose and not enough code or math. That said, if others feel differently, I’m happy to iterate. Goes without saying, this is already a massive improvement 🙏🏻
     - `inner_dims_pos[i]` specifies the source tensor dimension tiled by
-      `inner_tiles[i]`.
+      `inner_tiles[i]` where `0 <= i < k`.
+    - the resulting tiled source dimension maps to an outer dimension of the
- Since this is a sentence (full stop at the end), start with a capital letter.
- Introduce `inner_tiles` before this bullet point. Otherwise, this is referring to a concept yet to be introduced.
- This sentence is a bit complex to parse. I have no good suggestion how to improve it - the underlying logic is just very tricky. In cases like this, maths/code/diagrams are better than words. I would remove text and use code instead: `shape(result)[inner_dims_pos[i]] = shape(source)[inner_dims_pos[i]] / inner_tiles[i]`. You could add: "The following relationship for the tiled dimensions holds: ...".
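For concreteness, the relationship under discussion can be sketched outside MLIR. The following is a minimal Python sketch; `pack_result_shape` is a hypothetical helper (not an MLIR API), assuming static shapes, with ceiling division modelling the padded case:

```python
# Minimal sketch of the proposed pack shape relationship; assumes static
# shapes. `pack_result_shape` is a hypothetical helper, not an MLIR API.
from math import ceil

def pack_result_shape(source, inner_dims_pos, inner_tiles, outer_dims_perm=None):
    outer = list(source)
    # shape(result)[inner_dims_pos[i]] = shape(source)[inner_dims_pos[i]] / inner_tiles[i]
    # (ceiling division models the padded case).
    for pos, tile in zip(inner_dims_pos, inner_tiles):
        outer[pos] = ceil(source[pos] / tile)
    if outer_dims_perm is not None:
        outer = [outer[p] for p in outer_dims_perm]
    # The tile sizes are appended as the trailing ("inner") result dims,
    # in the order in which they appear in inner_dims_pos.
    return outer + list(inner_tiles)

# NC to NCnc example from the docs: tensor<128x256xf32> -> tensor<16x8x8x32xf32>.
assert pack_result_shape([128, 256], [0, 1], [8, 32]) == [16, 8, 8, 32]
# HCW to HCWhw example: tensor<20x1x12xf32> -> tensor<10x1x3x4x2xf32>.
assert pack_result_shape([20, 1, 12], [2, 0], [4, 2]) == [10, 1, 3, 4, 2]
```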
-    - The tiled dimensions (of size `inner_tiles`) are added to the end of the result
-      tensor in the order in which they appear in `inner_dims_pos`.
I find this particular sentence helpful. Would you mind keeping it? You could rewrite it using code: `shape(result)[rank(result) - k + i] = inner_tiles[i]` for `0 <= i < k`.
Sure, the "order in which they appear in inner_dims_pos" is what threw me off. That is why I removed it. I will add the code example.
-    The order of the dimensions in `inner_dims_pos` matters: dimension
-    `inner_dims_pos[i]` is combined with dimension `n - k + i` (assuming that
-    `outer_dims_perm` is not specified).
+    `inner_dims_pos` (mandatory) specifies `k` result tensor dimensions that
The meaning of `result` vs `source` is getting confusing when comparing `linalg.pack` and `linalg.unpack`, so here's a small suggestion.
-    `inner_dims_pos` (mandatory) specifies `k` result tensor dimensions that
+    `inner_dims_pos` (mandatory) specifies `k` result tensor (i.e. unpacked tensor) dimensions that
-    `outer_dims_perm` is not specified).
+    `inner_dims_pos` (mandatory) specifies `k` result tensor dimensions that
+    were tiled with the `inner_tiles` to create the packed source tensor. The
+    source tensor dimensions can be combined given `inner_dims_pos` as follows:
-    source tensor dimensions can be combined given `inner_dims_pos` as follows:
+    source tensor (i.e. packed tensor) dimensions can be combined given `inner_dims_pos` as follows:
+    Note, the amount of elements in the source (packed tensor) and the result
+    (unpacked) can be unequal, i.e. `SizeOf(source) >= SizeOf(result)`. As
+    the unpack operation may drop any padding introduced by the pack operation.
- I'd keep this shorter and favour code over text.
- Personally, I feel that it makes more sense to say that the unpacking op may drop elements and, as an outcome of that, the resulting tensor might be smaller than the source tensor.
- I suggest the more explicit `NumElementsOf` instead of `SizeOf` ("size" is a bit context-sensitive).
-    Note, the amount of elements in the source (packed tensor) and the result
-    (unpacked) can be unequal, i.e. `SizeOf(source) >= SizeOf(result)`. As
-    the unpack operation may drop any padding introduced by the pack operation.
+    Note, the unpack operation may drop any padding introduced by the pack operation and hence the following holds `NumElementsOf(source) >= NumElementsOf(result)`.
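As a quick numeric check of the suggested inequality on the padded example from this PR (packed `1x3x2x4x2` vs. unpacked `1x5x7`):

```python
# NumElementsOf(source) >= NumElementsOf(result) on the padded example:
# packed tensor<1x3x2x4x2xf32> vs. unpacked tensor<1x5x7xf32>.
from math import prod

assert prod([1, 3, 2, 4, 2]) >= prod([1, 5, 7])  # 48 >= 35
```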
     `inner_tiles` (mandatory) specifies `k` tile sizes. These tile sizes
     correspond to the least significant ("inner") source tensor dimension sizes.
Did you mean to delete this? That's currently repeated above.
Not really. But now, I have rephrased and shuffled the text a bit. Does that help?
+    the inner tile `shape(source)[n-k+i]` is combined with
+    `shape(source)[inner_dims_pos[i]]` where `0 <= i < k` and stored at
+    `shape(result)[inner_dims_pos[i]]`. The remaining dimensions are
+    `shape(result)[j] = shape(source)[j]` where `0 <= j < n-k` and `j` is not in
+    the set of `inner_dims_pos` indices.
Could you reduce text and perhaps use code instead? For example, it is not clear what "combined" means.
No problem, I will re-write this.
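For reference, one way to express what "combined" means is a minimal Python sketch; `unpack_result_shape` is a hypothetical helper (not an MLIR API), assuming static shapes and no `outer_dims_perm`:

```python
# Hypothetical sketch of how unpack recombines dimensions; assumes static
# shapes and no outer_dims_perm. `dest_shape` carries the unpadded sizes.
def unpack_result_shape(source, inner_dims_pos, dest_shape):
    n = len(source)          # rank of the packed source tensor
    k = len(inner_dims_pos)  # number of tiled dimensions
    result = list(source[: n - k])  # untouched dims keep their source size
    for i, pos in enumerate(inner_dims_pos):
        # Outer dim `pos` is combined with inner tile `n - k + i`; any
        # padding introduced by pack is dropped, hence the clamp.
        result[pos] = min(source[pos] * source[n - k + i], dest_shape[pos])
    return result

# CHWhw to CHW with padding dropped:
# tensor<1x3x2x4x2xf32> -> tensor<1x5x7xf32>.
assert unpack_result_shape([1, 3, 2, 4, 2], [2, 1], [1, 5, 7]) == [1, 5, 7]
```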
+// Here we have the source tensor being tiled as: `source[1] / 32` and `source[0] / 16` but the inner_dims_pos does not imply
+// a transpose of the outer dimensions for the result tensor. The tiled dimensions appear in the result tensor in the order
+// they appear in the source tensor, i.e. 16x4x32x16
Hm, isn't this example simply missing `outer_dims_perm = [1, 0]`? So yes, the order of the outer dims should not be permuted.
This example was more to reinforce what the operation does because reading from the documentation I found it difficult to follow the semantics, with the interplay between `inner_dims_pos`, `inner_tiles` and `outer_dims_perm`.

I can remove the example if you like.
> I can remove the example if you like.

I do like this example, so let's keep it. My comment was more along the lines of: "Perhaps this long description could be trimmed to sth shorter referring to missing `outer_dims_perm`"? Here's a specific suggestion:

The outer dims in the output tensor are incorrectly/unexpectedly transposed. This could be fixed by adding `outer_dims_perm = [1, 0]` (the default value assumes no transpose).
-    being tiled, where `0 < k <= n`. The order of the dimensions matters:
-    - The tiled dimensions (of size `inner_tiles`) are added to the end of the result
-      tensor in the order in which they appear in `inner_dims_pos`.
+    being tiled, where `0 < k <= n`.
I just found that k can be equal to zero, which makes it a transpose op. Can you update the doc since you're touching it?
```mlir
func.func @foo(%src: tensor<1x2x3xf32>, %dest: tensor<3x2x1xf32>) -> tensor<3x2x1xf32> {
  %0 = linalg.pack %src outer_dims_perm = [2, 1, 0] inner_dims_pos = [] inner_tiles = [] into %dest : tensor<1x2x3xf32> -> tensor<3x2x1xf32>
  return %0 : tensor<3x2x1xf32>
}
```
It's a pretty interesting example, maybe it deserves a test case in `named-ops.mlir`?
We already use such a form in the data-layout propagation and decomposition tests. But yeah, we can add a test to named-ops.mlir since it is special.
I will add the example :)
* Clarified the `inner_dim_pos` attribute in the case of high-dimensionality tensors.
* Added 5D examples to showcase the use-cases that triggered this update.
* Added a reminder for linalg.unpack that the number of elements is not required to be the same between input/output due to padding being dropped.

I encountered some odd variations of `linalg.pack` and `linalg.unpack` while working on some TFLite models and the definition in the documentation did not match what I saw pass in IR verification. The following changes reconcile those differences.

Signed-off-by: Christopher McGirr <mcgirr@roofline.ai>
Force-pushed from ec054a3 to 565ee3e.
I've added your feedback. Let me know what you think.
Thanks for the updates, Chris!
This is much clearer now and I like the overall structure. There are some fine details that are not clear to me, see my questions inline. Could you take a look and see what you think?
Thanks!
@@ -2771,6 +2771,69 @@ func.func @pad_and_pack_partially_dynamic(%source: tensor<?x?xf32>, %dest: tenso

 // -----

+func.func @pack_descending_inner_dims_with_padding(%source: tensor<1x5x7xf32>, %dest: tensor<1x3x2x4x2xf32>, %pad: f32) -> tensor<1x3x2x4x2xf32> {
Wouldn't "transposed" be more descriptive? Similar comment for the example below.
func.func @pack_descending_inner_dims_with_padding(%source: tensor<1x5x7xf32>, %dest: tensor<1x3x2x4x2xf32>, %pad: f32) -> tensor<1x3x2x4x2xf32> { | |
func.func @pack_transposed_inner_dims_with_padding(%source: tensor<1x5x7xf32>, %dest: tensor<1x3x2x4x2xf32>, %pad: f32) -> tensor<1x3x2x4x2xf32> { |
True, I will change the name.
     `inner_tiles` (mandatory) specifies `k` tile sizes. These tile sizes
     correspond to the least significant ("inner") result tensor dimension sizes,
     in the same order. Tile sizes can be static or dynamic.

     `inner_dims_pos` (mandatory) specifies `k` source tensor dimensions that are
+    being tiled, where `0 <= k <= n`.
I've just realised that this was incorrect. Please double-check :)

-    being tiled, where `0 <= k <= n`.
+    being tiled, where `0 <= k < n`.
Why is it incorrect? I think you can pack all the dimensions? E.g., llvm-project/mlir/test/Dialect/Linalg/named-ops.mlir, lines 2697 to 2701 in 0a2b6f6:

```mlir
func.func @pack_nc_to_ncnc(%source: tensor<128x256xf32>, %dest: tensor<4x16x32x16xf32>) -> tensor<128x256xf32> {
  %0 = linalg.pack %source inner_dims_pos = [0, 1] inner_tiles = [32, 16] into %dest : tensor<128x256xf32> -> tensor<4x16x32x16xf32>
  %1 = tensor.empty() : tensor<128x256xf32>
  %2 = linalg.unpack %0 inner_dims_pos = [0, 1] inner_tiles = [32, 16] into %1 : tensor<4x16x32x16xf32> -> tensor<128x256xf32>
  return %2 : tensor<128x256xf32>
```
> I think you can pack all the dimensions?

Yes, since there are n dimensions, the indices should go from `0` to `n - 1`, right? So `0 <= k < n`, not `0 <= k <= n`.

Hopefully not having an off-by-one moment here 😊
The "pack" operation converts a source tensor of rank `n` into a result
tensor of rank
n + k
with a tiled and packed layout (maybe with padding)
and optionally transposes the tiled source tensor dimensions.
If we read from the start, I think k
indicates the number of dimensions to pack, which is also the number of elements in inner_dims_pos
. So it is 0 <= k <= n
.
You are right that the indices should go from 0
to n-1
, and we may say that in the following statement like
`inner_tiles[i]` where `0 <= i < k`. All the values in `inner_dims_pos` are within [0, n).
I've incorporated the last statement from @hanhanW into the docs.
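The agreed constraints could be summarised in a small sketch; `check_pack_attrs` is a hypothetical checker mirroring the wording above, not verifier code from MLIR:

```python
# Hypothetical checker mirroring the agreed wording: k is the number of
# tiled dims (0 <= k <= n), and every value in inner_dims_pos is in [0, n).
def check_pack_attrs(n, inner_dims_pos, inner_tiles):
    k = len(inner_dims_pos)
    assert 0 <= k <= n, "k counts tiled dims, so 0 <= k <= n"
    assert len(inner_tiles) == k, "one tile size per tiled dim"
    assert all(0 <= p < n for p in inner_dims_pos), "indices lie within [0, n)"
    assert len(set(inner_dims_pos)) == k, "no dimension is tiled twice"

check_pack_attrs(3, [], [])            # k == 0: the transpose-only pack above
check_pack_attrs(2, [0, 1], [32, 16])  # k == n: all dimensions are packed
```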
+    packed source tensor. The source tensor (i.e. packed tensor) dimensions can
+    be unpacked given `inner_dims_pos` as follows.
+    - For `0 <= i < k` the following relationship holds:
+      `shape(result)[inner_dims_pos[i]] = shape(source)[n-k+i] + shape(source)[inner_dims_pos[i]]`.
If there's `/` in `tensor.pack`, then there should be `*` in `tensor.unpack`, right? Also, because of padding, it should be `<=` rather than `=`, right? Please double check :)

-      `shape(result)[inner_dims_pos[i]] = shape(source)[n-k+i] + shape(source)[inner_dims_pos[i]]`.
+      `shape(result)[inner_dims_pos[i]] <= shape(source)[n-k+i] * shape(source)[inner_dims_pos[i]]`.
Totally right, it should be `*` and I did forget about the padding case. The new inequality would hold now.
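Checking the corrected relation against the padded example from this PR (a quick sketch under the same assumptions as above):

```python
# Corrected unpack relation on the padded example from this PR:
# linalg.unpack : tensor<1x3x2x4x2xf32> -> tensor<1x5x7xf32>
# with inner_dims_pos = [2, 1] and inner_tiles = [4, 2].
source, result = [1, 3, 2, 4, 2], [1, 5, 7]
n, k = len(source), 2
for i, pos in enumerate([2, 1]):  # inner_dims_pos
    # shape(result)[pos] <= shape(source)[n-k+i] * shape(source)[pos]
    assert result[pos] <= source[n - k + i] * source[pos]  # 7 <= 4*2; 5 <= 2*3
```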
Signed-off-by: Christopher McGirr <mcgirr@roofline.ai>
I've implemented the new feedback. Please have a look @banach-space @adam-smnk @hanhanW
LGTM, thanks for improving this, it's so much clearer now!
Thanks