[mlir][linalg] fix OuterUnitDims linalg.pack decomposition pattern #141613

chrsmcgrr · 2025-05-27T14:40:12Z

Given the following example:

module {
  func.func @main(%arg0: tensor<1x1x1x4x1xf32>, %arg1: tensor<1x1x4xf32>) -> tensor<1x1x1x4x1xf32> {
    %pack = linalg.pack %arg1 outer_dims_perm = [1, 2, 0] inner_dims_pos = [2, 0] inner_tiles = [4, 1] into %arg0 : tensor<1x1x4xf32> -> tensor<1x1x1x4x1xf32>
    return %pack : tensor<1x1x1x4x1xf32>
  }
}

We would generate an invalid transpose operation because the calculated permutation would be [0, 2, 0], which is semantically incorrect: the permutation must contain unique integers corresponding to the source tensor dimensions.
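
To make the failure concrete, here is a minimal standalone sketch of the pre-fix loop from the diff, written against `std::vector` rather than the MLIR types. The helper names `oldTransposePerm` and `isPermutation` are hypothetical and exist only for illustration, and the sketch assumes `numTiles` equals the number of entries in `inner_dims_pos`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper mirroring the pre-fix logic: take the leading
// (srcRank - numTiles) indices, then append inner_dims_pos unchanged.
std::vector<int64_t> oldTransposePerm(int64_t srcRank,
                                      const std::vector<int64_t> &innerDimsPos) {
  // Assumption: numTiles is the number of inner_dims_pos entries.
  const int64_t numTiles = static_cast<int64_t>(innerDimsPos.size());
  assert(numTiles <= srcRank && "more tiles than source dimensions");
  std::vector<int64_t> perm;
  for (int64_t i = 0; i < srcRank - numTiles; ++i)
    perm.push_back(i);
  perm.insert(perm.end(), innerDimsPos.begin(), innerDimsPos.end());
  return perm;
}

// Hypothetical check: true if `perm` holds each of 0..size-1 exactly once.
bool isPermutation(const std::vector<int64_t> &perm) {
  const int64_t rank = static_cast<int64_t>(perm.size());
  std::vector<bool> seen(perm.size(), false);
  for (int64_t idx : perm) {
    if (idx < 0 || idx >= rank || seen[idx])
      return false;
    seen[idx] = true;
  }
  return true;
}
```

With `srcRank = 3` and `inner_dims_pos = [2, 0]`, the loop emits only index 0 before appending `[2, 0]`, so index 0 appears twice and index 1 never appears.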

The following change modifies how we calculate the permutation array and ensures that the dimension indices given in the permutation array are unique.

The above example then translates to a transpose with a permutation of [1, 2, 0], following the rule that inner_dims_pos is appended to the permutation array and the preceding indices are filled with the remaining dimensions.
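
The rule can be sketched outside MLIR as follows. This is an illustrative approximation using `std::vector` and `std::find` in place of `SmallVector` and `llvm::is_contained`; the function name `transposePermFor` is hypothetical, and the sketch assumes the entries of `inner_dims_pos` are unique and within `[0, srcRank)`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: build the transpose permutation by first collecting
// every source dimension that is NOT tiled, then appending inner_dims_pos.
std::vector<int64_t> transposePermFor(int64_t srcRank,
                                      const std::vector<int64_t> &innerDimsPos) {
  std::vector<int64_t> perm;
  for (int64_t i = 0; i < srcRank; ++i) {
    // Skip dimensions that will be appended as tiled (inner) dimensions.
    if (std::find(innerDimsPos.begin(), innerDimsPos.end(), i) !=
        innerDimsPos.end())
      continue;
    perm.push_back(i);
  }
  perm.insert(perm.end(), innerDimsPos.begin(), innerDimsPos.end());
  // Holds when inner_dims_pos is unique and in range (assumed above).
  assert(static_cast<int64_t>(perm.size()) == srcRank &&
         "inner_dims_pos must be unique and in range");
  return perm;
}
```

For the example above (`srcRank = 3`, `inner_dims_pos = [2, 0]`) the only remaining dimension is 1, giving `[1, 2, 0]`. For a rank-4 source with `inner_dims_pos = [3, 0]`, the remaining dimensions are `[1, 2]` and the result is `[1, 2, 3, 0]`, matching the comment in the diff.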

llvmbot · 2025-05-27T14:40:49Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-linalg

Author: Christopher McGirr (chrsmcgrr)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/141613.diff

2 Files Affected:

(modified) mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp (+15-8)
(modified) mlir/test/Dialect/Linalg/decompose-pack.mlir (+19)

diff --git a/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp b/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp
index 8718c57b9e86c..7b6c8243d1040 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp
@@ -1205,16 +1205,23 @@ LogicalResult DecomposeOuterUnitDimsPackOpPattern::matchAndRewrite(
   //    %init = tensor.empty()
   //    %transposed_tile = linalg.transpose ins(%source_or_padded_source),
   //                                        outs(%init)
-  // Two assumptions are made:
-  //  1. All outer dims are 1 - the corresponding transposition doesn't matter.
-  //  2. Inner dims position correspond to the trailing `numTiles` dims.
-  SmallVector<int64_t> tilesPermNormalized =
-      getPackUnpackNormalizedPerm(srcRank, packOp.getInnerDimsPos());
+  // Assumptions made:
+  //  1. Inner dims position correspond to the trailing `numTiles` dims.
   SmallVector<int64_t> srcPermForTranspose;
-  for (int64_t i = 0; i < (srcRank - numTiles); i++)
+  ArrayRef<int64_t> innerDimPos(packOp.getInnerDimsPos());
+  for (int64_t i = 0; i < srcRank; i++) {
+    // As we assume the trailing dimensions of the inner dim position correspond
+    // to the trailing indices of the transpose permutation, we need to
+    // calculate the remaining indicies of the transpose permutation. This is
+    // done by adding the indices not contained in the inner dimension position.
+    //   For example if we have a source tensor of dimensions [0, 1, 2, 3]
+    //   and inner dim position of [3, 0], the remaining indices are [1, 2].
+    //   and the transpose will be [1, 2, 3, 0].
+    if (llvm::is_contained(innerDimPos, i))
+      continue;
     srcPermForTranspose.push_back(i);
-
-  srcPermForTranspose.append(SmallVector<int64_t>(packOp.getInnerDimsPos()));
+  }
+  srcPermForTranspose.append(innerDimPos.begin(), innerDimPos.end());
 
   LLVM_DEBUG(DBGS() << "Pack permutation: " << packOp << "\n"
                     << "perm: " << llvm::interleaved(srcPermForTranspose)
diff --git a/mlir/test/Dialect/Linalg/decompose-pack.mlir b/mlir/test/Dialect/Linalg/decompose-pack.mlir
index 911b453f919c3..6d091406a639c 100644
--- a/mlir/test/Dialect/Linalg/decompose-pack.mlir
+++ b/mlir/test/Dialect/Linalg/decompose-pack.mlir
@@ -229,3 +229,22 @@ func.func @simple_KCRS_to_KRSCsr(%arg0: tensor<1x1x32x8xf32>, %arg1: tensor<1x1x
 // CHECK:         %[[INSERT:.+]] = tensor.insert_slice %[[TRANSP]] into %[[DEST]]
 // CHECK-SAME:      [0, 0, 0, 0, 0, 0] [1, 1, 1, 1, 8, 32] [1, 1, 1, 1, 1, 1]
 // CHECK:         return %[[INSERT]]
+
+// -----
+
+func.func @pack_with_unit_outer_dims_and_unit_inner(%arg0: tensor<1x1x4xf32>, %arg1: tensor<1x1x1x4x1xf32>) -> tensor<1x1x1x4x1xf32> {
+  %pack = linalg.pack %arg0 outer_dims_perm = [1, 2, 0] inner_dims_pos = [2, 0] inner_tiles = [4, 1] into %arg1 : tensor<1x1x4xf32> -> tensor<1x1x1x4x1xf32>
+  return %pack : tensor<1x1x1x4x1xf32>
+}
+
+// CHECK-LABEL: func.func @pack_with_unit_outer_dims_and_unit_inner
+// CHECK-SAME:    %[[SRC:[a-zA-Z0-9]+]]
+// CHECK-SAME:    %[[DEST:[a-zA-Z0-9]+]]
+// CHECK:         %[[EMPTY:.+]] = tensor.empty() : tensor<1x4x1xf32>
+// CHECK:         %[[TRANSP:.+]] = linalg.transpose
+// CHECK-SAME:      ins(%[[SRC]] : tensor<1x1x4xf32>)
+// CHECK-SAME:      outs(%[[EMPTY]] : tensor<1x4x1xf32>)
+// CHECK-SAME:      permutation = [1, 2, 0]
+// CHECK:         %[[INSERT:.+]] = tensor.insert_slice %[[TRANSP]] into %[[DEST]]
+// CHECK-SAME:      [0, 0, 0, 0, 0] [1, 1, 1, 4, 1] [1, 1, 1, 1, 1] : tensor<1x4x1xf32> into tensor<1x1x1x4x1xf32>
+// CHECK:         return %[[INSERT]]
\ No newline at end of file

hanhanW

IIUC, you extend the pattern to handle the case that there are non ones in unpacked outer dimensions? I.e., should we relax the check in line 1165 - 1169? Then you are not fixing a corner case. Instead, you extend the support in general?

mlir/test/Dialect/Linalg/decompose-pack.mlir

hanhanW · 2025-05-27T15:00:38Z

module {
func.func @main(%arg0: tensor<1x1x1x4x1xf32>, %arg1: tensor<1x1x4xf32>) -> tensor<1x1x1x4x1xf32> {
%pack = linalg.pack %arg1 outer_dims_perm = [1, 2, 0] inner_dims_pos = [2, 0] inner_tiles = [4, 1] into %arg0 : tensor<1x1x4xf32> -> tensor<1x1x1x4x1xf32>
return %pack : tensor<1x1x1x4x1xf32>
}
}

E.g., I'd expect the test case below to work with your change, if I read your intention correctly.

func.func @main(%arg0: tensor<2x1x1x4x1xf32>, %arg1: tensor<1x2x4xf32>) -> tensor<2x1x1x4x1xf32> {
  %pack = linalg.pack %arg1 outer_dims_perm = [1, 2, 0] inner_dims_pos = [2, 0] inner_tiles = [4, 1] into %arg0 : tensor<1x2x4xf32> -> tensor<2x1x1x4x1xf32>
  return %pack : tensor<2x1x1x4x1xf32>
}

mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp

chrsmcgrr · 2025-05-28T07:22:44Z

IIUC, you extend the pattern to handle the case that there are non ones in unpacked outer dimensions? I.e., should we relax the check in line 1165 - 1169? Then you are not fixing a corner case. Instead, you extend the support in general?

Thanks for the quick reply @hanhanW

Not quite, at least for my use case I am still only concerned with unit outer dimensions in the unpacked case. AFAIK, the outer dimension in my case would be index [1] and the inner dimensions would be [2, 0] when looking at the source, unpacked tensor. Correct me if I am wrong.

My change is more about the adjacent trailing dimensions as @banach-space has now explained.

I would be happy to extend 1165-1169 if anyone needs it.

banach-space · 2025-05-28T09:36:41Z

I would be happy to extend 1165-1169 if anyone needs it.

If it's not required, I would refrain from extending it right now. These "decomposition" patterns are already riddled with assumptions that we neither document nor test (like the case with non-adjacent dims that you discovered). Extending them could lead to even more un-verified assumptions.

Btw, @chrsmcgrr, could you also check the DecomposeOuterUnitDimsUnPackOpPattern? We should keep these "decompose" patterns symmetrical 😅

Thanks!

chrsmcgrr · 2025-06-02T14:30:34Z

@banach-space @hanhanW I've updated the comments and removed the adjacent trailing dimensions check as it is no longer needed. This change will allow for that use-case.

I have also added the corresponding test to the unpack version which works fine out-of-the-box. Looking at the unpack pattern I can't see a clean way of making the patterns symmetrical. So I will leave it for now.

Let me know what you think.

banach-space

Thanks for the updates! Some minor comments inline.

mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp

mlir/test/Dialect/Linalg/decompose-unpack.mlir

mlir/test/Dialect/Linalg/decompose-pack.mlir

mlir/test/Dialect/Linalg/decompose-unpack.mlir

banach-space

Nice, thank you, LGTM!

hanhanW · 2025-06-26T17:35:57Z

I think all the comments are addressed, so we can land the PR? @chrsmcgrr let us know if you need any of us to help merge it.

chrsmcgrr · 2025-06-27T06:51:42Z

@hanhanW Yes if you could merge it that would be great :)

llvm-ci · 2025-06-27T08:02:14Z

LLVM Buildbot has detected a new failure on builder sanitizer-aarch64-linux-bootstrap-hwasan running on sanitizer-buildbot12 while building mlir at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/55/builds/13385

Here is the relevant piece of the build log for the reference

Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure) (timed out)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++abi-shared.cfg.in) All available features: add-latomic-workaround, buildhost=linux, c++26, can-create-symlinks, character-conversion-warnings, clang, clang-21, clang-21.0, clang-21.0.0, diagnose-if-support, enable-benchmarks=no, gcc-style-warnings, glibc-old-ru_RU-decimal-point, has-1024-bit-atomics, has-64-bit-atomics, has-fblocks, has-fconstexpr-steps, has-unix-headers, hwasan, large_tests, libcpp-abi-version=1, libcpp-has-no-availability-markup, libcpp-has-no-experimental-syncstream, libcpp-has-no-experimental-tzdb, libcpp-has-no-incomplete-pstl, libcpp-has-thread-api-pthread, linux, long_tests, lsan, objective-c++, optimization=none, sanitizer-new-delete, std-at-least-c++03, std-at-least-c++11, std-at-least-c++14, std-at-least-c++17, std-at-least-c++20, std-at-least-c++23, std-at-least-c++26, stdlib=libc++, stdlib=llvm-libc++, target=aarch64-unknown-linux-gnu, thread-safety, verify-support
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{cxx} substitution: '/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build0/bin/clang++'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{flags} substitution: '-pthread --target=aarch64-unknown-linux-gnu -g -fno-omit-frame-pointer -fsanitize=hwaddress'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{compile_flags} substitution: '-nostdinc++ -I %{target-include-dir} -I %{include-dir} -I %{libcxx-dir}/test/support -std=c++26 -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-noexcept-type -Wno-atomic-alignment -Wno-reserved-module-identifier -Wdeprecated-copy -Wdeprecated-copy-dtor -Wshift-negative-value -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -Wno-local-type-template-args -Wno-c++11-extensions -Wno-unknown-pragmas -Wno-pass-failed -Wno-mismatched-new-delete -Wno-redundant-move -Wno-self-move -Wno-nullability-completeness -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_NONE -Werror=thread-safety -Wuser-defined-warnings'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{link_flags} substitution: '-lc++experimental -nostdlib++ -L %{lib-dir} -Wl,-rpath,%{lib-dir} -lc++ -latomic'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{benchmark_flags} substitution: '-I /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxx/test/benchmarks/google-benchmark/include -L /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxx/test/benchmarks/google-benchmark/lib -L /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxx/test/benchmarks/google-benchmark/lib64 -l benchmark'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{exec} substitution: '%{executor} --execdir %T -- '
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) All available features: add-latomic-workaround, buildhost=linux, c++26, c++experimental, can-create-symlinks, character-conversion-warnings, clang, clang-21, clang-21.0, clang-21.0.0, diagnose-if-support, enable-benchmarks=no, gcc-style-warnings, glibc-old-ru_RU-decimal-point, has-1024-bit-atomics, has-64-bit-atomics, has-fblocks, has-fconstexpr-steps, has-unix-headers, hwasan, large_tests, libcpp-abi-version=1, libcpp-hardening-mode=none, libcpp-has-no-availability-markup, libcpp-has-thread-api-pthread, linux, lsan, objective-c++, optimization=none, sanitizer-new-delete, std-at-least-c++03, std-at-least-c++11, std-at-least-c++14, std-at-least-c++17, std-at-least-c++20, std-at-least-c++23, std-at-least-c++26, stdlib=libc++, stdlib=llvm-libc++, target=aarch64-unknown-linux-gnu, thread-safety, verify-support
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/main.py:73: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 1500 seconds was requested on the command line. Forcing timeout to be 1500 seconds.
-- Testing: 10814 of 10833 tests, 72 workers --
command timed out: 1200 seconds without output running [b'python', b'../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=2020.667521
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Step 10 (stage2/hwasan check-cxx) failure: stage2/hwasan check-cxx (failure)
...
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cfloat.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cinttypes.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/climits.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/clocale.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cmath.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/csetjmp.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/csignal.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cstdarg.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cstddef.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cstdint.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cstdio.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cstdlib.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cstring.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/ctime.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cuchar.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cwchar.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat/cwctype.inc
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.cppm
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/share/libc++/v1/std.compat.cppm
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/lib/libc++.modules.json
-- Install configuration: "Release"
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/lib/libc++.so.1.0
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/lib/libc++.so.1
-- Set non-toolchain portion of runtime path of "/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/lib/libc++.so.1.0" to ""
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/lib/libc++.so
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/lib/libc++.a
-- Installing: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxxabi/test-suite-install/lib/libc++experimental.a
[4/5] Running runtimes regression tests
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++abi-shared.cfg.in) Using %{cxx} substitution: '/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build0/bin/clang++'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++abi-shared.cfg.in) Using %{flags} substitution: ' --target=aarch64-unknown-linux-gnu -g -fno-omit-frame-pointer -fsanitize=hwaddress'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++abi-shared.cfg.in) Using %{compile_flags} substitution: '-nostdinc++ -I %{include} -I %{cxx-include} -I %{cxx-target-include} %{maybe-include-libunwind} -I %{libcxx}/test/support -I %{libcxx}/src -D_LIBCPP_ENABLE_CXX17_REMOVED_UNEXPECTED_FUNCTIONS -std=c++26 -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-noexcept-type -Wno-atomic-alignment -Wno-reserved-module-identifier -Wdeprecated-copy -Wdeprecated-copy-dtor -Wshift-negative-value -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -Wno-local-type-template-args -Wno-c++11-extensions -Wno-unknown-pragmas -Wno-pass-failed -Wno-mismatched-new-delete -Wno-redundant-move -Wno-self-move -Wno-nullability-completeness -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -Werror=thread-safety -Wuser-defined-warnings'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++abi-shared.cfg.in) Using %{link_flags} substitution: '-nostdlib++ -L %{lib} -Wl,-rpath,%{lib} -lc++ -lc++abi -pthread -latomic'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++abi-shared.cfg.in) Using %{benchmark_flags} substitution: ''
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++abi-shared.cfg.in) Using %{exec} substitution: '%{executor} --execdir %T -- '
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++abi-shared.cfg.in) All available features: add-latomic-workaround, buildhost=linux, c++26, can-create-symlinks, character-conversion-warnings, clang, clang-21, clang-21.0, clang-21.0.0, diagnose-if-support, enable-benchmarks=no, gcc-style-warnings, glibc-old-ru_RU-decimal-point, has-1024-bit-atomics, has-64-bit-atomics, has-fblocks, has-fconstexpr-steps, has-unix-headers, hwasan, large_tests, libcpp-abi-version=1, libcpp-has-no-availability-markup, libcpp-has-no-experimental-syncstream, libcpp-has-no-experimental-tzdb, libcpp-has-no-incomplete-pstl, libcpp-has-thread-api-pthread, linux, long_tests, lsan, objective-c++, optimization=none, sanitizer-new-delete, std-at-least-c++03, std-at-least-c++11, std-at-least-c++14, std-at-least-c++17, std-at-least-c++20, std-at-least-c++23, std-at-least-c++26, stdlib=libc++, stdlib=llvm-libc++, target=aarch64-unknown-linux-gnu, thread-safety, verify-support
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{cxx} substitution: '/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build0/bin/clang++'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{flags} substitution: '-pthread --target=aarch64-unknown-linux-gnu -g -fno-omit-frame-pointer -fsanitize=hwaddress'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{compile_flags} substitution: '-nostdinc++ -I %{target-include-dir} -I %{include-dir} -I %{libcxx-dir}/test/support -std=c++26 -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wunused-template -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-noexcept-type -Wno-atomic-alignment -Wno-reserved-module-identifier -Wdeprecated-copy -Wdeprecated-copy-dtor -Wshift-negative-value -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -Wno-local-type-template-args -Wno-c++11-extensions -Wno-unknown-pragmas -Wno-pass-failed -Wno-mismatched-new-delete -Wno-redundant-move -Wno-self-move -Wno-nullability-completeness -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_NONE -Werror=thread-safety -Wuser-defined-warnings'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{link_flags} substitution: '-lc++experimental -nostdlib++ -L %{lib-dir} -Wl,-rpath,%{lib-dir} -lc++ -latomic'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{benchmark_flags} substitution: '-I /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxx/test/benchmarks/google-benchmark/include -L /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxx/test/benchmarks/google-benchmark/lib -L /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_build_hwasan/libcxx/test/benchmarks/google-benchmark/lib64 -l benchmark'
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) Using %{exec} substitution: '%{executor} --execdir %T -- '
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/libcxx/utils/libcxx/test/config.py:24: note: (llvm-libc++-shared.cfg.in) All available features: add-latomic-workaround, buildhost=linux, c++26, c++experimental, can-create-symlinks, character-conversion-warnings, clang, clang-21, clang-21.0, clang-21.0.0, diagnose-if-support, enable-benchmarks=no, gcc-style-warnings, glibc-old-ru_RU-decimal-point, has-1024-bit-atomics, has-64-bit-atomics, has-fblocks, has-fconstexpr-steps, has-unix-headers, hwasan, large_tests, libcpp-abi-version=1, libcpp-hardening-mode=none, libcpp-has-no-availability-markup, libcpp-has-thread-api-pthread, linux, lsan, objective-c++, optimization=none, sanitizer-new-delete, std-at-least-c++03, std-at-least-c++11, std-at-least-c++14, std-at-least-c++17, std-at-least-c++20, std-at-least-c++23, std-at-least-c++26, stdlib=libc++, stdlib=llvm-libc++, target=aarch64-unknown-linux-gnu, thread-safety, verify-support
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/main.py:73: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 1500 seconds was requested on the command line. Forcing timeout to be 1500 seconds.
-- Testing: 10814 of 10833 tests, 72 workers --

command timed out: 1200 seconds without output running [b'python', b'../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=2020.667521
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..

chrsmcgrr requested review from dcaballe, hanhanW and nicolasvasilache as code owners May 27, 2025 14:40

llvmbot added mlir:linalg mlir labels May 27, 2025

hanhanW reviewed May 27, 2025

banach-space reviewed May 27, 2025

chrsmcgrr requested review from banach-space and hanhanW June 2, 2025 14:30

chrsmcgrr force-pushed the fix-decomposition-of-unit-outer-dims branch from 5c41b17 to 4f378a5 Compare June 4, 2025 11:23

banach-space reviewed Jun 11, 2025

chrsmcgrr force-pushed the fix-decomposition-of-unit-outer-dims branch from 4f378a5 to 79e0ff5 Compare June 12, 2025 15:00

hanhanW approved these changes Jun 18, 2025

banach-space approved these changes Jun 19, 2025

chrsmcgrr added 4 commits June 26, 2025 07:59

Update(1) [mlir][linalg] fix OuterUnitDims linalg.pack decomposition pattern (861c6a2)

Update(2) [mlir][linalg] fix OuterUnitDims linalg.pack decomposition pattern (c3b4e61)

Update(3) [mlir][linalg] fix OuterUnitDims linalg.pack decomposition pattern (af4d38d)

chrsmcgrr force-pushed the fix-decomposition-of-unit-outer-dims branch from 79e0ff5 to af4d38d Compare June 26, 2025 05:59

rYm-A mentioned this pull request Jun 26, 2025

[MLIR][linalg] DecomposeOuterUnitDimsPackOpPattern not checking trailing dimension for tiling #145861

Closed

AGindinson merged commit 96c1611 into llvm:main Jun 27, 2025
7 checks passed
