Skip to content

[MLIR][Affine] Fix dead store elimination for vector stores with different types#189248

Merged
joker-eph merged 1 commit into
llvm:mainfrom
joker-eph:fix/issue-113687
Apr 1, 2026
Merged

[MLIR][Affine] Fix dead store elimination for vector stores with different types#189248
joker-eph merged 1 commit into
llvm:mainfrom
joker-eph:fix/issue-113687

Conversation

@joker-eph
Copy link
Copy Markdown
Contributor

@joker-eph joker-eph commented Mar 29, 2026

affine-scalrep's findUnusedStore incorrectly classified an affine.vector_store as dead when a subsequent store wrote to the same base index but with a smaller vector type. A vector<1xi64> store at [0,0] does not fully overwrite a vector<5xi64> store at [0,0], so the first store must be preserved.

The loadCSE function in the same file already had the correct type-equality check for loads; this patch adds the analogous check for stores in findUnusedStore.

Fixes #113687

Assisted-by: Claude Code

@joker-eph joker-eph requested a review from bondhugula March 29, 2026 14:14
@joker-eph joker-eph changed the title [MLIR][Affine] Fix dead store elimination for vector stores with diff… [MLIR][Affine] Fix dead store elimination for vector stores with different types Mar 29, 2026
@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Mar 29, 2026

@llvm/pr-subscribers-mlir-affine

@llvm/pr-subscribers-mlir

Author: Mehdi Amini (joker-eph)

Changes

affine-scalrep's findUnusedStore incorrectly classified an affine.vector_store as dead when a subsequent store wrote to the same base index but with a smaller vector type. A vector<1xi64> store at [0,0] does not fully overwrite a vector<5xi64> store at [0,0], so the first store must be preserved.

The loadCSE function in the same file already had the correct type-equality check for loads; this patch adds the analogous check for stores in findUnusedStore.

Fixes #113687

Assisted-by: Claude Code


Full diff: https://github.com/llvm/llvm-project/pull/189248.diff

2 Files Affected:

  • (modified) mlir/lib/Dialect/Affine/Utils/Utils.cpp (+10-2)
  • (modified) mlir/test/Dialect/Affine/scalrep.mlir (+37)
diff --git a/mlir/lib/Dialect/Affine/Utils/Utils.cpp b/mlir/lib/Dialect/Affine/Utils/Utils.cpp
index 7b976943e1bb7..d691a4ac192f2 100644
--- a/mlir/lib/Dialect/Affine/Utils/Utils.cpp
+++ b/mlir/lib/Dialect/Affine/Utils/Utils.cpp
@@ -908,8 +908,9 @@ mlir::affine::hasNoInterveningEffect<mlir::MemoryEffects::Read,
 // This attempts to find stores which have no impact on the final result.
 // A writing op writeA will be eliminated if there exists an op writeB if
 // 1) writeA and writeB have mathematically equivalent affine access functions.
-// 2) writeB postdominates writeA.
-// 3) There is no potential read between writeA and writeB.
+// 2) writeB writes the same type as writeA (so it fully covers writeA's bytes).
+// 3) writeB postdominates writeA.
+// 4) There is no potential read between writeA and writeB.
 static void findUnusedStore(AffineWriteOpInterface writeA,
                             SmallVectorImpl<Operation *> &opsToErase,
                             PostDominanceInfo &postDominanceInfo,
@@ -936,6 +937,13 @@ static void findUnusedStore(AffineWriteOpInterface writeA,
     if (srcAccess != destAccess)
       continue;
 
+    // Check that the store types match. If types differ, writeB may not cover
+    // all bytes written by writeA (e.g. a narrower vector type), so
+    // conservatively assume writeA is not dead.
+    if (writeA.getValueToStore().getType() !=
+        writeB.getValueToStore().getType())
+      continue;
+
     // writeB must postdominate writeA.
     if (!postDominanceInfo.postDominates(writeB, writeA))
       continue;
diff --git a/mlir/test/Dialect/Affine/scalrep.mlir b/mlir/test/Dialect/Affine/scalrep.mlir
index 901e15911f4ea..fb6eef941790a 100644
--- a/mlir/test/Dialect/Affine/scalrep.mlir
+++ b/mlir/test/Dialect/Affine/scalrep.mlir
@@ -997,3 +997,40 @@ func.func @zero_d_memrefs() {
   }
   return
 }
+
+// CHECK-LABEL: func @vector_store_dead_elim_different_types
+// A vector_store with a different vector type at the same base index must NOT
+// cause the earlier vector_store to be treated as a dead store.
+// (GitHub issue #113687)
+func.func @vector_store_dead_elim_different_types(%arg0: memref<20x1xi64>) {
+  %c0 = arith.constant 0 : index
+  // CHECK-DAG: %[[CST1:.+]] = arith.constant dense<1>
+  // CHECK-DAG: %[[CST2:.+]] = arith.constant dense<2>
+  %cst1 = arith.constant dense<1> : vector<5xi64>
+  %cst2 = arith.constant dense<2> : vector<1xi64>
+  // Both stores must be preserved: the second (narrow) store does not fully
+  // overwrite the first (wide) store.
+  // CHECK: affine.vector_store %[[CST1]], %arg0[%c0, %c0] : memref<20x1xi64>, vector<5xi64>
+  affine.vector_store %cst1, %arg0[%c0, %c0] : memref<20x1xi64>, vector<5xi64>
+  // CHECK: affine.vector_store %[[CST2]], %arg0[%c0, %c0] : memref<20x1xi64>, vector<1xi64>
+  affine.vector_store %cst2, %arg0[%c0, %c0] : memref<20x1xi64>, vector<1xi64>
+  return
+}
+
+// CHECK-LABEL: func @vector_store_dead_elim_same_type
+// A vector_store with the same type at the same base index should still cause
+// the earlier store to be treated as a dead store.
+func.func @vector_store_dead_elim_same_type(%arg0: memref<20x1xi64>) {
+  %c0 = arith.constant 0 : index
+  %cst1 = arith.constant dense<1> : vector<5xi64>
+  %cst2 = arith.constant dense<2> : vector<5xi64>
+  // CHECK-DAG: %[[CST2:.+]] = arith.constant dense<2>
+  // The first store is dead — the second store (same type, same index)
+  // fully overwrites it.
+  // CHECK-NOT: arith.constant dense<1>
+  // CHECK-NOT: affine.vector_store {{.*}} dense<1>
+  // CHECK: affine.vector_store %[[CST2]], %arg0[%c0, %c0] : memref<20x1xi64>, vector<5xi64>
+  affine.vector_store %cst1, %arg0[%c0, %c0] : memref<20x1xi64>, vector<5xi64>
+  affine.vector_store %cst2, %arg0[%c0, %c0] : memref<20x1xi64>, vector<5xi64>
+  return
+}

// Check that the store types match. If types differ, writeB may not cover
// all bytes written by writeA (e.g. a narrower vector type), so
// conservatively assume writeA is not dead.
if (writeA.getValueToStore().getType() !=
Copy link
Copy Markdown
Member

@hockyy hockyy Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

less conservative solution is to check whether the bits written in writeA is smaller than writeB scalarTypeBitWidth * vectorSize. But this one is not wrong

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are surprising cases with things like vector<4xi6> vs vector<3xi8> depending on how bitpacking is interpreted, which in turn may be data layout-specific. So I'd rather have exact type match until there's a need to relax.

Copy link
Copy Markdown
Member

@hockyy hockyy Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I never considered that. By bit packing you mean that it's possible that there are padding in the former vector<4xi6> either container wise or element wise right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will make such a note in the comment here.

// Check that the store types match. If types differ, writeB may not cover
// all bytes written by writeA (e.g. a narrower vector type), so
// conservatively assume writeA is not dead.
if (writeA.getValueToStore().getType() !=
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are surprising cases with things like vector<4xi6> vs vector<3xi8> depending on how bitpacking is interpreted, which in turn may be data layout-specific. So I'd rather have exact type match until there's a need to relax.

@hockyy
Copy link
Copy Markdown
Member

hockyy commented Apr 1, 2026

lgtm~

…erent types

affine-scalrep's findUnusedStore incorrectly classified an affine.vector_store
as dead when a subsequent store wrote to the same base index but with a smaller
vector type. A vector<1xi64> store at [0,0] does not fully overwrite a
vector<5xi64> store at [0,0], so the first store must be preserved.

The loadCSE function in the same file already had the correct type-equality
check for loads; this patch adds the analogous check for stores in
findUnusedStore.

Fixes llvm#113687

Assisted-by: Claude Code
@joker-eph joker-eph enabled auto-merge (squash) April 1, 2026 10:29
@joker-eph joker-eph merged commit f6ffdbc into llvm:main Apr 1, 2026
10 checks passed
joaovam pushed a commit to joaovam/llvm-project that referenced this pull request Apr 2, 2026
…erent types (llvm#189248)

affine-scalrep's findUnusedStore incorrectly classified an
affine.vector_store as dead when a subsequent store wrote to the same
base index but with a smaller vector type. A vector<1xi64> store at
[0,0] does not fully overwrite a vector<5xi64> store at [0,0], so the
first store must be preserved.

The loadCSE function in the same file already had the correct
type-equality check for loads; this patch adds the analogous check for
stores in findUnusedStore.

Fixes llvm#113687

Assisted-by: Claude Code
zwu-2025 pushed a commit to zwu-2025/llvm-project that referenced this pull request May 17, 2026
…erent types (llvm#189248)

affine-scalrep's findUnusedStore incorrectly classified an
affine.vector_store as dead when a subsequent store wrote to the same
base index but with a smaller vector type. A vector<1xi64> store at
[0,0] does not fully overwrite a vector<5xi64> store at [0,0], so the
first store must be preserved.

The loadCSE function in the same file already had the correct
type-equality check for loads; this patch adds the analogous check for
stores in findUnusedStore.

Fixes llvm#113687

Assisted-by: Claude Code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[mlir] Inconsistent results for affine.vector_store

4 participants