@fhahn fhahn commented Oct 23, 2025

Currently Clang usually leaves padding bits uninitialized, which means
they are undef at the moment.

When expanding stores of vector types to include padding, the padding
lanes will be poison, hence the padding bits will be poison.

This interacts badly with coercion of arguments and return values, where
3 x float vectors will be loaded as an i128 integer; poisoning the padding
bits will make the whole value poison.

Not sure if there's a better way, but I think we have a number of places
that currently rely on the padding being undef, not poison.
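
To make the failure mode above concrete, here is a minimal toy model in C++. It is only a sketch: `LaneKind`, `Lane`, `widen`, `freeze` and `loadedI128IsPoison` are hypothetical names invented for illustration, and none of this reflects how LLVM represents values internally. It assumes the rule stated in the description, namely that a scalar integer loaded from memory is poison as a whole if any of its bits are poison, while undef bits are merely arbitrary.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy model, not LLVM's real representation: each 32-bit lane is either a
// concrete value, undef (arbitrary bits), or poison.
enum class LaneKind { Concrete, Undef, Poison };

struct Lane {
  LaneKind kind;
  std::uint32_t bits; // only meaningful when kind == LaneKind::Concrete
};

using Vec4 = std::array<Lane, 4>;

// Widen a 3-lane vector to 4 lanes; the extra lane models the padding.
Vec4 widen(const std::array<std::uint32_t, 3> &src, LaneKind padding) {
  Vec4 out{};
  for (std::size_t i = 0; i < src.size(); ++i)
    out[i] = Lane{LaneKind::Concrete, src[i]};
  out[3] = Lane{padding, 0};
  return out;
}

// Model of freezing the widened vector: every undef or poison lane is
// replaced by some fixed value (0 here, picked arbitrarily for the model).
Vec4 freeze(Vec4 v) {
  for (Lane &lane : v)
    if (lane.kind != LaneKind::Concrete)
      lane = Lane{LaneKind::Concrete, 0};
  return v;
}

// Model of "store <4 x float>, then load i128": the scalar integer is
// poison as a whole if any lane is poison; undef lanes do not poison it.
bool loadedI128IsPoison(const Vec4 &v) {
  for (const Lane &lane : v)
    if (lane.kind == LaneKind::Poison)
      return true;
  return false;
}
```

In this model, `loadedI128IsPoison(widen(v, LaneKind::Poison))` is true, whereas widening with `LaneKind::Undef`, or applying `freeze` to the poison-padded vector, yields a coerced value that is not poison.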

@llvmbot llvmbot added the clang and clang:codegen labels Oct 23, 2025
llvmbot commented Oct 23, 2025

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Florian Hahn (fhahn)

Full diff: https://github.com/llvm/llvm-project/pull/164821.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGExpr.cpp (+3)
  • (added) clang/test/CodeGen/AArch64/ext-vector-coercion.c (+43)
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 301d5770cf78f..a581cf821092f 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -2300,6 +2300,9 @@ void CodeGenFunction::EmitStoreOfScalar(llvm::Value *Value, Address Addr,
         SmallVector<int, 16> Mask(NewVecTy->getNumElements(), -1);
         std::iota(Mask.begin(), Mask.begin() + VecTy->getNumElements(), 0);
         Value = Builder.CreateShuffleVector(Value, Mask, "extractVec");
+        // The extra lanes will be poison. Freeze the whole vector to make sure
+        // the padding memory is not poisoned, which may break coercion.
+        Value = Builder.CreateFreeze(Value);
         SrcTy = NewVecTy;
       }
       if (Addr.getElementType() != SrcTy)
diff --git a/clang/test/CodeGen/AArch64/ext-vector-coercion.c b/clang/test/CodeGen/AArch64/ext-vector-coercion.c
new file mode 100644
index 0000000000000..44638fac1560a
--- /dev/null
+++ b/clang/test/CodeGen/AArch64/ext-vector-coercion.c
@@ -0,0 +1,43 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 6
+// RUN: %clang_cc1 -fenable-matrix -triple arm64-apple-macosx %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s
+
+typedef float float3 __attribute__((ext_vector_type(3)));
+struct Vec3 {
+  union {
+    struct {
+      float x;
+      float y;
+      float z;
+    };
+  float vec __attribute__((ext_vector_type(3)));
+  };
+};
+
+// CHECK-LABEL: define i128 @add(
+// CHECK-SAME: i128 [[A_COERCE:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[RETVAL:%.*]] = alloca [[STRUCT_VEC3:%.*]], align 16
+// CHECK-NEXT:    [[A:%.*]] = alloca [[STRUCT_VEC3]], align 16
+// CHECK-NEXT:    [[COERCE_DIVE:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[A]], i32 0, i32 0
+// CHECK-NEXT:    store i128 [[A_COERCE]], ptr [[COERCE_DIVE]], align 16
+// CHECK-NEXT:    [[TMP0:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[A]], i32 0, i32 0
+// CHECK-NEXT:    [[LOADVECN:%.*]] = load <4 x float>, ptr [[TMP0]], align 16
+// CHECK-NEXT:    [[EXTRACTVEC:%.*]] = shufflevector <4 x float> [[LOADVECN]], <4 x float> poison, <3 x i32> <i32 0, i32 1, i32 2>
+// CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[A]], i32 0, i32 0
+// CHECK-NEXT:    [[LOADVECN1:%.*]] = load <4 x float>, ptr [[TMP1]], align 16
+// CHECK-NEXT:    [[EXTRACTVEC2:%.*]] = shufflevector <4 x float> [[LOADVECN1]], <4 x float> poison, <3 x i32> <i32 0, i32 1, i32 2>
+// CHECK-NEXT:    [[ADD:%.*]] = fadd <3 x float> [[EXTRACTVEC]], [[EXTRACTVEC2]]
+// CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[RETVAL]], i32 0, i32 0
+// CHECK-NEXT:    [[EXTRACTVEC3:%.*]] = shufflevector <3 x float> [[ADD]], <3 x float> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 poison>
+// CHECK-NEXT:    [[TMP3:%.*]] = freeze <4 x float> [[EXTRACTVEC3]]
+// CHECK-NEXT:    store <4 x float> [[TMP3]], ptr [[TMP2]], align 16
+// CHECK-NEXT:    [[COERCE_DIVE4:%.*]] = getelementptr inbounds nuw [[STRUCT_VEC3]], ptr [[RETVAL]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP4:%.*]] = load i128, ptr [[COERCE_DIVE4]], align 16
+// CHECK-NEXT:    ret i128 [[TMP4]]
+//
+struct Vec3 add(struct Vec3 a) {
+  struct Vec3 res;
+  res.vec = a.vec + a.vec;
+  return res;
+}
+

Value = Builder.CreateShuffleVector(Value, Mask, "extractVec");
// The extra lanes will be poison. Freeze the whole vector to make sure
// the padding memory is not poisoned, which may break coercion.
Value = Builder.CreateFreeze(Value);
Contributor

Instead of freezing, can we shuffle in undef instead? Or does that end up being worse?

Contributor Author

Happy to go with either. Not having the freeze is likely to be slightly better (e.g. we may still be able to prove that some of the non-padding lanes could be poison); I just thought initially that we might want to stay consistent w.r.t. shuffling with poison.

Updated now.
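
The updated patch is not shown in this thread, so the following is only a hedged sketch of the mask arithmetic behind the two options discussed above. It assumes the usual `shufflevector` convention: a mask element of -1 produces a poison lane, and an element of at least the source lane count selects from the second operand (which could be an undef vector). `makeWideningMask`, `classify`, `LaneSource` and `kPoisonLane` are hypothetical helper names, not LLVM or Clang APIs.

```cpp
#include <numeric>
#include <vector>

// In an LLVM shufflevector mask, -1 marks a poison result lane.
constexpr int kPoisonLane = -1;

enum class LaneSource { FirstOperand, SecondOperand, Poison };

// Build a widening mask like the one in the diff: the first numSrc lanes
// select source lanes 0..numSrc-1, the remaining lanes are set to `fill`.
// Assumes numSrc <= numDst.
std::vector<int> makeWideningMask(unsigned numSrc, unsigned numDst, int fill) {
  std::vector<int> mask(numDst, fill);
  std::iota(mask.begin(), mask.begin() + numSrc, 0);
  return mask;
}

// Report where a result lane comes from for a given mask element, when both
// shuffle operands have numSrc lanes.
LaneSource classify(int maskElem, unsigned numSrc) {
  if (maskElem < 0)
    return LaneSource::Poison;
  return static_cast<unsigned>(maskElem) < numSrc ? LaneSource::FirstOperand
                                                  : LaneSource::SecondOperand;
}
```

With `fill = kPoisonLane` the 3-to-4 mask is `{0, 1, 2, -1}`, so the padding lane is poison, as in the original diff. With `fill = 3` the mask is `{0, 1, 2, 3}`, and the padding lane is taken from lane 0 of the second operand, so passing an undef vector there would leave the padding undef rather than poison.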

@fhahn fhahn force-pushed the clang-freeze-padding-in-vectors branch from fd3df1e to ae3e4af Compare October 23, 2025 18:33
@github-actions

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef([^a-zA-Z0-9_-]|$)|UndefValue::get)' 'HEAD~1' HEAD clang/test/CodeGen/AArch64/ext-vector-coercion.c clang/lib/CodeGen/CGExpr.cpp clang/test/CodeGenCXX/matrix-vector-bit-int.cpp

The following files introduce new uses of undef:

  • clang/lib/CodeGen/CGExpr.cpp

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@fhahn fhahn requested a review from jroelofs October 23, 2025 19:36
fhahn added 4 commits October 28, 2025 02:52
@fhahn fhahn force-pushed the clang-freeze-padding-in-vectors branch from ae3e4af to dd821bb Compare October 28, 2025 03:20
@nikic nikic left a comment

LGTM

@fhahn fhahn merged commit 5378584 into llvm:main Oct 29, 2025
9 of 10 checks passed
@fhahn fhahn changed the title from "[Clang] Freeze padded vectors before storing." to "[Clang] Use undef for padded vectors before storing." Oct 29, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 29, 2025
PR: llvm/llvm-project#164821
@fhahn fhahn deleted the clang-freeze-padding-in-vectors branch October 29, 2025 03:25
fhahn added a commit to fhahn/llvm-project that referenced this pull request Oct 30, 2025
PR: llvm#164821

rdar://162655662

(cherry-picked from 5378584)
aokblast pushed a commit to aokblast/llvm-project that referenced this pull request Oct 30, 2025
PR: llvm#164821