[Clang] Use undef for padded vectors before storing. #164821

fhahn · 2025-10-23T13:59:40Z

Currently Clang usually leaves padding bits uninitialized, which means
they are undef at the moment.

When expanding stores of vector types to include padding, the padding
lanes will be poison, hence the padding bits will be poison.

This interacts badly with coercion of arguments and return values, where
3 x float vectors are loaded as an i128 integer; poisoned padding bits
make the whole value poison.

Not sure if there's a better way, but I think we have a number of places
that currently rely on the padding being undef, not poison.
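
To make the failure mode concrete, here is a minimal toy model in plain C++, not LLVM code. `State` and `loadAsWideInteger` are hypothetical names introduced only for this illustration; the model assumes what the description states, namely that one poison padding lane makes the whole coerced integer poison, while undef padding does not.

```cpp
#include <array>
#include <cstddef>

// Toy model only: these names are illustrative and are not LLVM API.
enum class State { Defined, Undef, Poison };

// Models reading N vector lanes back as one wide integer, as argument and
// return coercion does when a padded <3 x float> is loaded as i128.
// A single poison lane poisons the whole result; undef lanes only leave
// some bits of the result unspecified.
template <std::size_t N>
State loadAsWideInteger(const std::array<State, N> &lanes) {
  static_assert(N > 0, "need at least one lane");
  State result = State::Defined;
  for (State lane : lanes) {
    if (lane == State::Poison)
      return State::Poison;
    if (lane == State::Undef)
      result = State::Undef;
  }
  return result;
}
```

With three defined lanes and one poison padding lane the model returns `State::Poison`; with an undef padding lane it returns `State::Undef`, so the three real lanes stay usable.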

llvmbot · 2025-10-23T14:00:22Z

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Florian Hahn (fhahn)

Full diff: https://github.com/llvm/llvm-project/pull/164821.diff

2 Files Affected:

(modified) clang/lib/CodeGen/CGExpr.cpp (+3)
(added) clang/test/CodeGen/AArch64/ext-vector-coercion.c (+43)

diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 301d5770cf78f..a581cf821092f 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -2300,6 +2300,9 @@ void CodeGenFunction::EmitStoreOfScalar(llvm::Value *Value, Address Addr,
         SmallVector<int, 16> Mask(NewVecTy->getNumElements(), -1);
         std::iota(Mask.begin(), Mask.begin() + VecTy->getNumElements(), 0);
         Value = Builder.CreateShuffleVector(Value, Mask, "extractVec");
+        // The extra lanes will be poison. Freeze the whole vector to make sure
+        // the padding memory is not poisoned, which may break coercion.
+        Value = Builder.CreateFreeze(Value);
         SrcTy = NewVecTy;
       }
       if (Addr.getElementType() != SrcTy)
diff --git a/clang/test/CodeGen/AArch64/ext-vector-coercion.c b/clang/test/CodeGen/AArch64/ext-vector-coercion.c
new file mode 100644
index 0000000000000..44638fac1560a
--- /dev/null
+++ b/clang/test/CodeGen/AArch64/ext-vector-coercion.c
@@ -0,0 +1,43 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 6
+// RUN: %clang_cc1 -fenable-matrix -triple arm64-apple-macosx %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s
+
+typedef float float3 __attribute__((ext_vector_type(3)));
+struct Vec3 {
+  union {
+    struct {
+      float x;
+      float y;
+      float z;
+    };
+  float vec __attribute__((ext_vector_type(3)));
+  };
+};
+
+// CHECK-LABEL: define i128 @add(
+// CHECK-SAME: i128 [[A_COERCE:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[RETVAL:%.*]] = alloca [[STRUCT_VEC3:%.*]], align 16
+// CHECK-NEXT:    [[A:%.*]] = alloca [[STRUCT_VEC3]], align 16
+// CHECK-NEXT:    [[COERCE_DIVE:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[A]], i32 0, i32 0
+// CHECK-NEXT:    store i128 [[A_COERCE]], ptr [[COERCE_DIVE]], align 16
+// CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[A]], i32 0, i32 0
+// CHECK-NEXT:    [[LOADVECN:%.*]] = load <4 x float>, ptr [[TMP0]], align 16
+// CHECK-NEXT:    [[EXTRACTVEC:%.*]] = shufflevector <4 x float> [[LOADVECN]], <4 x float> poison, <3 x i32> <i32 0, i32 1, i32 2>
+// CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[A]], i32 0, i32 0
+// CHECK-NEXT:    [[LOADVECN1:%.*]] = load <4 x float>, ptr [[TMP1]], align 16
+// CHECK-NEXT:    [[EXTRACTVEC2:%.*]] = shufflevector <4 x float> [[LOADVECN1]], <4 x float> poison, <3 x i32> <i32 0, i32 1, i32 2>
+// CHECK-NEXT:    [[ADD:%.*]] = fadd <3 x float> [[EXTRACTVEC]], [[EXTRACTVEC2]]
+// CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[RETVAL]], i32 0, i32 0
+// CHECK-NEXT:    [[EXTRACTVEC3:%.*]] = shufflevector <3 x float> [[ADD]], <3 x float> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 poison>
+// CHECK-NEXT:    [[TMP3:%.*]] = freeze <4 x float> [[EXTRACTVEC3]]
+// CHECK-NEXT:    store <4 x float> [[TMP3]], ptr [[TMP2]], align 16
+// CHECK-NEXT:    [[COERCE_DIVE4:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[RETVAL]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP4:%.*]] = load i128, ptr [[COERCE_DIVE4]], align 16
+// CHECK-NEXT:    ret i128 [[TMP4]]
+//
+struct Vec3 add(struct Vec3 a) {
+  struct Vec3 res;
+  res.vec = a.vec + a.vec;
+  return res;
+}
+

nikic · 2025-10-23T14:04:18Z

clang/lib/CodeGen/CGExpr.cpp

        Value = Builder.CreateShuffleVector(Value, Mask, "extractVec");
+        // The extra lanes will be poison. Freeze the whole vector to make sure
+        // the padding memory is not poisoned, which may break coercion.
+        Value = Builder.CreateFreeze(Value);


Can we shuffle in undef instead of freezing? Or does that end up being worse?

fhahn · 2025-10-23

Happy to go with either. Not having the freeze is likely to be slightly better (e.g. we may be able to prove that some of the non-padding lanes could be poison); I just thought initially we might want to stay consistent w.r.t. shuffle with poison.

Updated now.
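
For readers following the thread, the difference between the two mask forms can be sketched in plain C++ without LLVM. `wideningMask` is a hypothetical helper, and the assumption that the updated patch points the padding lanes at an undef second shuffle operand is inferred from the "shuffle in undef" suggestion and the later fixup commit titles; it is not copied from the final diff.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Builds a widening shuffle mask from srcLanes to dstLanes.
// padIndex is the mask entry used for every padding lane:
//   -1       selects no source lane, so the padding lane is poison
//            (the form used in the diff above);
//   srcLanes selects lane 0 of the second shuffle operand, which is
//            undef if that operand is undef (assumed alternative).
std::vector<int> wideningMask(int srcLanes, int dstLanes, int padIndex) {
  assert(srcLanes > 0 && srcLanes <= dstLanes);
  std::vector<int> mask(dstLanes, padIndex);
  // The first srcLanes entries pass the original lanes through unchanged.
  std::iota(mask.begin(), mask.begin() + srcLanes, 0);
  return mask;
}
```

For the 3-to-4 lane case, `wideningMask(3, 4, -1)` gives `{0, 1, 2, -1}` and `wideningMask(3, 4, 3)` gives `{0, 1, 2, 3}`. Only the padding entry changes, so the non-padding lanes are left untouched in both forms.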

github-actions · 2025-10-23T18:35:48Z

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:

git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef([^a-zA-Z0-9_-]|$)|UndefValue::get)' 'HEAD~1' HEAD clang/test/CodeGen/AArch64/ext-vector-coercion.c clang/lib/CodeGen/CGExpr.cpp clang/test/CodeGenCXX/matrix-vector-bit-int.cpp

The following files introduce new uses of undef:

clang/lib/CodeGen/CGExpr.cpp

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

nikic

LGTM

Referenced in commits: PR: llvm/llvm-project#164821; rdar://162655662 (cherry-picked from 5378584)

fhahn requested review from AaronBallman, aemerson, efriedma-quic, nikic and rjmccall October 23, 2025 13:59

llvmbot added the clang (Clang issues not falling into any other category) and clang:codegen (IR generation bugs: mangling, exceptions, etc.) labels Oct 23, 2025

nikic reviewed Oct 23, 2025

fhahn force-pushed the clang-freeze-padding-in-vectors branch from fd3df1e to ae3e4af Compare October 23, 2025 18:33

fhahn requested a review from jroelofs October 23, 2025 19:36

jroelofs reviewed Oct 27, 2025

clang/test/CodeGen/AArch64/ext-vector-coercion.c (outdated, resolved)

clang/test/CodeGen/AArch64/ext-vector-coercion.c (outdated, resolved)

fhahn added 4 commits October 28, 2025 02:52

[Clang] Add test

982b215

!fixup use undef instead of poison for padding.

8ef57e7

!fixup don't use poison mask

dd821bb

fhahn force-pushed the clang-freeze-padding-in-vectors branch from ae3e4af to dd821bb Compare October 28, 2025 03:20

nikic approved these changes Oct 28, 2025

jroelofs approved these changes Oct 28, 2025

Merge branch 'main' into clang-freeze-padding-in-vectors

19809d1

fhahn merged commit 5378584 into llvm:main Oct 29, 2025
9 of 10 checks passed

fhahn changed the title ~~[Clang] Freeze padded vectors before storing.~~ [Clang] Use undef for padded vectors before storing. Oct 29, 2025

fhahn deleted the clang-freeze-padding-in-vectors branch October 29, 2025 03:25

fhahn mentioned this pull request Oct 30, 2025

[Clang] Freeze padded vectors before storing. (#164821) swiftlang/llvm-project#11721

Open
