Conversation

cofibrant
Contributor

@cofibrant cofibrant commented Oct 16, 2025

a1ef81d added overloads for llvm.matrix.column.major.store and llvm.matrix.column.major.load that allow strides to occupy an arbitrary bitwidth. This change wasn't reflected in the verifier, causing an assertion to trip when given strides overflowing 64-bit. This patch explicitly caps the bitwidth at 64, repairing the crash and avoiding future complexity dealing with strides that overflow 64 bits.
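
The shape of the capped check can be sketched in standalone C++. This is an illustration under stated assumptions, not the code in `Verifier.cpp`: `ConstInt` and `checkStride` are hypothetical stand-ins for LLVM's constant-integer type and the verifier's `Check` macro, and an empty return string is used here to mean "accepted". Only the two error strings are taken from this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for an integer constant: its declared bit width and,
// when that width is at most 64, its zero-extended value.
struct ConstInt {
  unsigned bitWidth;
  std::uint64_t zextValue; // only meaningful when bitWidth <= 64
};

// Returns an empty string if the stride is accepted, otherwise an error
// message. The width check runs first, so the value of a constant wider
// than 64 bits is never read.
std::string checkStride(const ConstInt &stride, const ConstInt &numRows) {
  assert(numRows.bitWidth <= 64 && "row counts are assumed to be narrow");
  if (stride.bitWidth > 64)
    return std::string("Stride bitwidth cannot exceed 64!");
  if (stride.zextValue < numRows.zextValue)
    return std::string(
        "Stride must be greater or equal than the number of rows!");
  return std::string();
}
```

Ordering the checks this way is the point of the cap: once the width is known to be at most 64, reading the value as a 64-bit unsigned integer cannot fail.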

CC @fhahn

@llvmbot
Member

llvmbot commented Oct 16, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-ir

Author: Nathan Corbyn (cofibrant)

Changes

a1ef81d added overloads for llvm.matrix.column.major.store and llvm.matrix.column.major.load that allow strides to occupy an arbitrary bit-width. This change wasn't reflected in the verifier, causing an assertion to trip when given strides overflowing 64-bit. This patch updates the verifier to avoid this crash.

CC @fhahn


Full diff: https://github.com/llvm/llvm-project/pull/163729.diff

3 Files Affected:

  • (modified) llvm/lib/IR/Verifier.cpp (+7-2)
  • (added) llvm/test/Verifier/matrix-intrinsics-strides.ll (+29)
  • (modified) llvm/test/Verifier/matrix-intrinsics.ll (+2-3)
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index c79a95087dbdd..6f38020cdd33b 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -6479,9 +6479,14 @@ void Verifier::visitIntrinsicCall(Intrinsic::ID ID, CallBase &Call) {
               NumRows->getZExtValue() * NumColumns->getZExtValue(),
           "Result of a matrix operation does not fit in the returned vector!");
 
-    if (Stride)
-      Check(Stride->getZExtValue() >= NumRows->getZExtValue(),
+    if (Stride) {
+      // Stride can occupy an arbitrary bit-width, while rows and columns are
+      // always 32-bit, so zero extend to the largest common bit-width to
+      // compare.
+      unsigned BitWidth = std::max(Stride->getBitWidth(), NumRows->getBitWidth());
+      Check(Stride->getValue().zext(BitWidth).uge(NumRows->getValue().zext(BitWidth)),
             "Stride must be greater or equal than the number of rows!", IF);
+    }
 
     break;
   }
diff --git a/llvm/test/Verifier/matrix-intrinsics-strides.ll b/llvm/test/Verifier/matrix-intrinsics-strides.ll
new file mode 100644
index 0000000000000..5ba324eebe090
--- /dev/null
+++ b/llvm/test/Verifier/matrix-intrinsics-strides.ll
@@ -0,0 +1,29 @@
+; RUN: opt %s -p verify -S -disable-output
+
+; This test ensures that the verifier correctly handles very wide and very narrow
+; strides.
+
+define <4 x float> @column.major_load_stride_i8(ptr %m, i32 %arg) {
+  %result.1 = call <4 x float> @llvm.matrix.column.major.load.v4f32.i128(ptr %m, i8 16, i1 false, i32 2, i32 2)
+  ret <4 x float> %result.1
+}
+
+define <4 x float> @column.major_load_stride_i128(ptr %m, i32 %arg) {
+  %result.1 = call <4 x float> @llvm.matrix.column.major.load.v4f32.i128(ptr %m, i128 u0x10000000000000000, i1 false, i32 2, i32 2)
+  ret <4 x float> %result.1
+}
+
+define void @column.major_store_stride_i8(ptr %m, i64 %arg) {
+  call void @llvm.matrix.column.major.store.v4f32.i128(<4 x float> zeroinitializer, ptr %m, i8 16, i1 false, i32 2, i32 2)
+  ret void
+}
+
+define void @column.major_store_stride_i128(ptr %m, i64 %arg) {
+  call void @llvm.matrix.column.major.store.v4f32.i128(<4 x float> zeroinitializer, ptr %m, i128 u0x10000000000000000, i1 false, i32 2, i32 2)
+  ret void
+}
+
+declare <6 x float> @llvm.matrix.column.major.load.v6f32.i8(ptr, i8, i1, i32, i32)
+declare void @llvm.matrix.column.major.store.v4p0.i8(<4 x ptr>, ptr, i8, i1, i32, i32)
+declare <6 x float> @llvm.matrix.column.major.load.v6f32.i128(ptr, i64, i1, i32, i32)
+declare void @llvm.matrix.column.major.store.v4p0.i128(<4 x ptr>, ptr, i64, i1, i32, i32)
diff --git a/llvm/test/Verifier/matrix-intrinsics.ll b/llvm/test/Verifier/matrix-intrinsics.ll
index b6d5ad9a3cc49..e208d47c1d88e 100644
--- a/llvm/test/Verifier/matrix-intrinsics.ll
+++ b/llvm/test/Verifier/matrix-intrinsics.ll
@@ -1,8 +1,7 @@
-; RUN: not llvm-as < %s -o /dev/null 2>&1 | FileCheck %s
+; RUN: not opt -S %s -p verify 2>&1 | FileCheck %s
 
 define <4 x float> @transpose(<4 x float> %m, i32 %arg) {
-; CHECK: assembly parsed, but does not verify as correct!
-; CHECK-NEXT: Result of a matrix operation does not fit in the returned vector!
+; CHECK: Result of a matrix operation does not fit in the returned vector!
 ; CHECK-NEXT: Result of a matrix operation does not fit in the returned vector!
 ; CHECK-NEXT: Result of a matrix operation does not fit in the returned vector!
 ; CHECK-NEXT: immarg operand has non-immediate parameter
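
The first revision in the diff above compares the stride and the row count by zero-extending both to the larger of the two bit widths and then doing an unsigned comparison. A minimal standalone sketch of that idea, assuming a hypothetical `Limbs` type (a vector of 64-bit words, least significant first) in place of LLVM's arbitrary-precision integer; `zeroExtend` and `unsignedGreaterOrEqual` are illustrative names, not LLVM APIs:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical arbitrary-width unsigned integer: 64-bit limbs, least
// significant limb first.
using Limbs = std::vector<std::uint64_t>;

// Zero-extend by padding with zero limbs at the most significant end.
Limbs zeroExtend(Limbs value, std::size_t numLimbs) {
  assert(numLimbs >= value.size() && "zero extension cannot shrink a value");
  value.resize(numLimbs, 0);
  return value;
}

// Unsigned a >= b, after widening both operands to the same number of limbs.
bool unsignedGreaterOrEqual(const Limbs &a, const Limbs &b) {
  const std::size_t width = std::max(a.size(), b.size());
  const Limbs wideA = zeroExtend(a, width);
  const Limbs wideB = zeroExtend(b, width);
  // Compare from the most significant limb down; the first difference decides.
  for (std::size_t i = width; i-- > 0;) {
    if (wideA[i] != wideB[i])
      return wideA[i] > wideB[i];
  }
  return true; // all limbs equal
}
```

With this representation, the 128-bit stride used in the new test (`u0x10000000000000000`, that is, 2^64) would be `Limbs{0, 1}` and a row count of 2 would be `Limbs{2}`; the comparison then succeeds without ever narrowing the stride to 64 bits.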

@github-actions

github-actions bot commented Oct 16, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@fhahn fhahn requested review from farzonl, jroelofs and nikic October 16, 2025 10:27
Contributor

@nikic nikic left a comment

Alternatively you could also verify that the bit width is <= 64, which might save additional handling elsewhere. IIRC you don't need more than 64 bits as long as that's the maximum size_t type. No strong opinion though.

@cofibrant cofibrant force-pushed the cofibrant/matrix-ir-verifier-large-strides branch from 8436cb4 to 3b06181 Compare October 16, 2025 10:55
@cofibrant
Contributor Author

> Alternatively you could also verify that the bit width is <= 64, which might save additional handling elsewhere. IIRC you don't need more than 64 bits as long as that's the maximum size_t type. No strong opinion though.

I will defer to @fhahn on this

@cofibrant cofibrant force-pushed the cofibrant/matrix-ir-verifier-large-strides branch from 3b06181 to 4d7ea22 Compare October 16, 2025 10:58
Contributor

@fhahn fhahn left a comment

It should be fine to restrict the stride to be <= 64 bits

@cofibrant cofibrant force-pushed the cofibrant/matrix-ir-verifier-large-strides branch from 4d7ea22 to 8bfaff3 Compare October 16, 2025 12:06
@cofibrant cofibrant changed the title from "[Matrix][IR] Don't crash when verifying strides with more than 64 bits" to "[Matrix][IR] Cap stride bitwidth at 64" Oct 16, 2025
@cofibrant
Contributor Author

Ok that's a simpler change, I've updated the PR to reflect this.

Contributor

@fhahn fhahn left a comment

Could you also update LangRef together with the verifier change?

@cofibrant cofibrant force-pushed the cofibrant/matrix-ir-verifier-large-strides branch from 8bfaff3 to 1594632 Compare October 16, 2025 13:47
@cofibrant cofibrant force-pushed the cofibrant/matrix-ir-verifier-large-strides branch from 1594632 to a0f9895 Compare October 16, 2025 13:48

-    if (Stride)
+    if (Stride) {
+      Check(Stride->getBitWidth() <= 64, "Stride bitwidth cannot exceed 64!",
Member

this looks fine to me. Are there any cases where it could be less than 64 where we might want this to be configurable?

Contributor

Yes, in practice on targets with 32 bit index width (or lower), it may be smaller, matching the index width

Contributor Author

I understood Farzon's question as asking whether we might want the cap to be configurable so that, e.g., we can reject 64-bit strides when verifying IR targeting 32-bit platforms

Contributor

The restriction in the verifier is mostly just so we can use getZExtValue() on it.

Regardless of the type width, the stride, columns, and rows must be such that we can access all data without wrapping the address space. Restricting the stride to i32 on 32-bit targets wouldn't really help to enforce that, as it would still be possible to provide arguments that would cause the accesses to wrap.
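
A small standalone sketch of why narrowing the stride type does not rule out wrapping. It assumes the stride is measured in elements and that column `j` starts `j * stride` elements past the base pointer, so the access spans `(cols - 1) * stride + rows` elements. `fitsIn32BitAddressSpace` is a hypothetical helper written for this illustration, not something in LLVM:

```cpp
#include <cassert>
#include <cstdint>

// Returns true if a column-major access with these parameters stays within a
// 32-bit address space when it starts at byte address `base`.
// Assumption: stride is in elements and column j starts at j * stride.
bool fitsIn32BitAddressSpace(std::uint32_t stride, std::uint32_t rows,
                             std::uint32_t cols, std::uint32_t elemSize,
                             std::uint32_t base = 0) {
  assert(rows >= 1 && cols >= 1 && elemSize >= 1);
  const std::uint64_t limit = std::uint64_t{1} << 32;
  // Number of elements from the base up to and including the last one read.
  // This cannot overflow 64 bits because every factor is below 2^32.
  const std::uint64_t elems = std::uint64_t{cols - 1} * stride + rows;
  // elems * elemSize <= limit - base, written with a division so the
  // multiplication cannot overflow.
  return elems <= (limit - base) / elemSize;
}
```

For example, a stride of `0x40000000` with 2 rows, 4 columns, and 4-byte elements is rejected by this helper even though every argument fits in 32 bits, which is the situation described in the comment above.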

Contributor Author

Makes sense, thanks!


@cofibrant cofibrant force-pushed the cofibrant/matrix-ir-verifier-large-strides branch from a0f9895 to a455a6c Compare October 17, 2025 09:07
Contributor

@fhahn fhahn left a comment

LGTM, thanks

@fhahn
Contributor

fhahn commented Oct 17, 2025

Test failure is unrelated #163937

@fhahn fhahn merged commit b00c4ff into llvm:main Oct 17, 2025
9 of 11 checks passed
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 17, 2025
a1ef81d added overloads for `llvm.matrix.column.major.store` and
`llvm.matrix.column.major.load` that allow strides to occupy an
arbitrary bitwidth. This change wasn't reflected in the verifier,
causing an assertion to trip when given strides overflowing 64-bit. This
patch explicitly caps the bitwidth at 64, repairing the crash and
avoiding future complexity dealing with strides that overflow 64 bits.

PR: llvm/llvm-project#163729
@cofibrant cofibrant deleted the cofibrant/matrix-ir-verifier-large-strides branch October 22, 2025 14:47
5 participants