
[mlir][tosa] Work around GCC bug in tosa-to-tensor #91521

Merged 1 commit into llvm:main on May 11, 2024
Conversation

sabauma (Contributor) commented May 8, 2024

GCC 12 and 13 generate incorrect code for a pattern in the tosa-to-tensor pass responsible for lowering tosa.reshape. This results in the tosa.reshape lowering producing IR which fails to verify. I've narrowed down the set of cmake flags needed to reproduce the issue to this:

cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS="mlir" \
  -DLLVM_TARGETS_TO_BUILD=host \
  -DCMAKE_BUILD_TYPE="Release" \
  -DCMAKE_CXX_FLAGS_RELEASE="-O2" \
  -DCMAKE_CXX_FLAGS="-O2" \
  -DCMAKE_CXX_COMPILER=g++ \
  -DCMAKE_C_COMPILER=gcc

This is the failing test case:

func.func @fails_in_gcc_12(%arg0: tensor<?xf32>) -> tensor<1x1x1x?xf32> {
  %0 = tosa.reshape %arg0 {new_shape = array<i64: 1, 1, 1, -1>} : (tensor<?xf32>) -> tensor<1x1x1x?xf32>
  return %0 : tensor<1x1x1x?xf32>
}

This should lower to a tensor.expand_shape operation like so:

func.func @foo(%arg0: tensor<?xf32>) -> tensor<1x1x1x?xf32> {
  %c0 = arith.constant 0 : index
  %dim = tensor.dim %arg0, %c0 : tensor<?xf32>
  %c1 = arith.constant 1 : index
  %expanded = tensor.expand_shape %arg0 [[0, 1, 2, 3]] output_shape [1, 1, 1, %dim] : tensor<?xf32> into tensor<1x1x1x?xf32>
  return %expanded : tensor<1x1x1x?xf32>
}

Under GCC 12/13 with the above cmake configuration, the generated tensor.expand_shape looks like this:

%2 = "tensor.expand_shape"(%arg0) <{reassociation = [[0, 1, 2, 3]], static_output_shape = array<i64>}> : (tensor<?xf32>) -> tensor<?x1x1x?xf32>

The key difference is the computed output type of tensor<?x1x1x?xf32> rather than the expected tensor<1x1x1x?xf32>. This expand_shape fails to verify with this error message:

error: 'tensor.expand_shape' op expected number of static shape dims to be equal to the output rank (4) but found 0 inputs instead

The problematic code calculates the intermediate shape of the generated tensor.expand_shape operation in the expand_shape/collapse_shape sequence that implements tosa.reshape.

// Compute result shape
bool resultIsStatic = true;
auto resultShape = llvm::map_to_vector(newShape, [&](int64_t size) {
  // Omitted

  // If we do not know the total size of the tensor, keep this dimension
  // dynamic in the result shape.
  if (!inputIsStatic) {
    resultIsStatic = false;
    return ShapedType::kDynamic;
  }
});

if (resultIsStatic) {
  // do something
  return;
}

// do something else
return;

The failure point seems to be the update of the resultIsStatic variable in the lambda body: the assignment of false is not propagated to its use in the if-statement, so the branch is taken when it should not be.
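
To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern (hypothetical code, not the exact LLVM sources; a reduction this small may not retrigger the miscompile, which appears to depend on how GCC inlines the map helper):

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <limits>
#include <string>
#include <vector>

// Stand-in for ShapedType::kDynamic.
constexpr int64_t kDynamic = std::numeric_limits<int64_t>::min();

std::vector<int64_t> inferShape(const std::vector<int64_t> &newShape,
                                bool inputIsStatic) {
  bool resultIsStatic = true;

  std::vector<int64_t> resultShape(newShape.size());
  std::transform(newShape.begin(), newShape.end(), resultShape.begin(),
                 [&](int64_t size) {
                   if (size != -1)
                     return size; // explicit dimension: keep it
                   if (inputIsStatic)
                     return int64_t{1}; // placeholder computation elided here
                   // Dynamic input: the placeholder stays dynamic. This store
                   // through the by-reference capture is the write that the
                   // miscompiled binary loses.
                   resultIsStatic = false;
                   return kDynamic;
                 });

  // tensor.expand_shape cannot expand a dynamic input into a fully static
  // result, so a fully static result gets its first dimension made dynamic.
  // If resultIsStatic incorrectly reads true here, this branch fires and the
  // expected 1x1x1x? becomes ?x1x1x?.
  if (resultIsStatic && !inputIsStatic)
    resultShape.front() = kDynamic;

  return resultShape;
}

int main() {
  for (int64_t d : inferShape({1, 1, 1, -1}, /*inputIsStatic=*/false))
    std::cout << (d == kDynamic ? std::string("?") : std::to_string(d)) << ' ';
  std::cout << '\n'; // correct output: 1 1 1 ?
}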

I've found several modifications to the code that get around the bug. The version I settled on makes the logic a little more obvious.
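
For illustration, here is one possible restructuring in the style of the sketch above (hypothetical; not necessarily the exact change this PR makes): keep the lambda pure and derive the flag from the computed shape afterwards, so nothing is written through a by-reference capture.

// The lambda now only maps sizes; whether the result is static is recomputed
// from resultShape itself, so there is no cross-lambda store for the
// compiler to lose.
std::vector<int64_t> resultShape(newShape.size());
std::transform(newShape.begin(), newShape.end(), resultShape.begin(),
               [&](int64_t size) {
                 if (size != -1)
                   return size;                                // explicit dim
                 return inputIsStatic ? int64_t{1} : kDynamic; // placeholder
               });

bool resultIsStatic =
    std::none_of(resultShape.begin(), resultShape.end(),
                 [](int64_t d) { return d == kDynamic; });

Deriving the flag from the data also makes the invariant explicit: "static" simply means no kDynamic entries, rather than a property threaded through a side effect.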

sabauma self-assigned this May 8, 2024
llvmbot added the mlir label May 8, 2024
llvmbot (Collaborator) commented May 8, 2024

@llvm/pr-subscribers-mlir

Author: Spenser Bauman (sabauma)

Changes

(Restates the PR description above.)

Full diff: https://github.com/llvm/llvm-project/pull/91521.diff

1 file affected:

  • (modified) mlir/lib/Conversion/TosaToTensor/TosaToTensor.cpp (+1)
diff --git a/mlir/lib/Conversion/TosaToTensor/TosaToTensor.cpp b/mlir/lib/Conversion/TosaToTensor/TosaToTensor.cpp
index cd6da35582469..488c24e90f5a0 100644
--- a/mlir/lib/Conversion/TosaToTensor/TosaToTensor.cpp
+++ b/mlir/lib/Conversion/TosaToTensor/TosaToTensor.cpp
@@ -84,6 +84,7 @@ TensorType inferReshapeExpandedType(TensorType inputType,
     return totalSize / totalSizeNoPlaceholder;
   });
 
+
   // A syntactic restriction in 'tensor.expand_shape' forbids a dynamically
   // shaped input from being reshaped into a statically shaped result. We may
   // simply turn the first result dimension dynamic to address this.


github-actions bot commented May 8, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

pinskia commented May 9, 2024

Is there a GCC bug report about this? If not, please file one, and double-check that this is really a GCC bug rather than a workaround for something else.

sabauma (Contributor, Author) commented May 9, 2024

> Is there a GCC bug report about this? If not, please file one, and double-check that this is really a GCC bug rather than a workaround for something else.

I'm working on a GCC bug report at the moment. Currently, I'm blocked on account creation on their bug tracker. I'll update this report when I have an issue filed. Based on GCC's IR dumps, I do think this is a bug in GCC, though I'm not that knowledgeable in their IR. There is a rather obvious removal of the relevant branch after the full redundancy elimination pass.

sjarus (Contributor) left a comment

Thank you for the deep dive into this problem and for the resolution!

(The commit message restates the PR description above.)

sabauma (Contributor, Author) commented May 10, 2024

Just to close the loop on this, I've filed the GCC bug report here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115033

rafaelubalmw (Contributor) commented
Interesting GCC bug report, and great double contribution, Spenser. Hopefully GCC folks can take it from there. Makes me wonder about the potentially critical ramifications of this bug in lambda variable capture.

sabauma merged commit 9d66dca into llvm:main on May 11, 2024
4 checks passed