[InstCombine] Avoid Allocating Arrays Too Large For the Target #70980

qiongsiwu · 2023-11-01T21:00:57Z

The current logic in simplifyAllocaArraySize may result in an array too large for a target. For example, %1 = alloca i32, i32 -1 may result in an array type of [4294967295 x i32]. Such an array can cause a crash in the code generator when the 32 bit mode is in effect, because the array's size is larger than what a 32 bit pointer can index. This PR teaches instcombine to avoid creating an over-sized array.

llvmbot · 2023-11-01T21:01:29Z

@llvm/pr-subscribers-llvm-transforms

Author: Qiongsi Wu (qiongsiwu)

Changes

The current logic in simplifyAllocaArraySize may result in an array to large for a target. For example, %1 = alloca i32, i32 -1 may result in an array type of [4294967295 x i32]. Such an array can cause a crash in the code generator. This PR adds logic to check the size of the array type, and sets the array size to 0 if the array is too big for a target. This behaviour is consistent with what LLVM does when an alloca's first operand is too large for the target.

Full diff: https://github.com/llvm/llvm-project/pull/70980.diff

2 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp (+12-3)
(modified) llvm/test/Transforms/InstCombine/alloca-big.ll (+21-1)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp b/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
index b72b68c68d985bd..8d90fc9b63ca049 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
@@ -210,14 +210,23 @@ static Instruction *simplifyAllocaArraySize(InstCombinerImpl &IC,
   if (const ConstantInt *C = dyn_cast<ConstantInt>(AI.getArraySize())) {
     if (C->getValue().getActiveBits() <= 64) {
       Type *NewTy = ArrayType::get(AI.getAllocatedType(), C->getZExtValue());
+
+      // Make sure we do not create an array type larger than pointers on the
+      // target can index.
+      unsigned MaxArrSizeBitWidth =
+          IC.getDataLayout().getPointerTypeSizeInBits(AI.getType());
+      APInt ArrayAllocSize(64, IC.getDataLayout().getTypeAllocSize(NewTy));
+      if (ArrayAllocSize.getActiveBits() > MaxArrSizeBitWidth)
+        NewTy = ArrayType::get(AI.getAllocatedType(), 0);
+
       AllocaInst *New = IC.Builder.CreateAlloca(NewTy, AI.getAddressSpace(),
                                                 nullptr, AI.getName());
       New->setAlignment(AI.getAlign());
 
       replaceAllDbgUsesWith(AI, *New, *New, DT);
 
-      // Scan to the end of the allocation instructions, to skip over a block of
-      // allocas if possible...also skip interleaved debug info
+      // Scan to the end of the allocation instructions, to skip over a block
+      // of allocas if possible...also skip interleaved debug info
       //
       BasicBlock::iterator It(New);
       while (isa<AllocaInst>(*It) || isa<DbgInfoIntrinsic>(*It))
@@ -243,7 +252,7 @@ static Instruction *simplifyAllocaArraySize(InstCombinerImpl &IC,
     return IC.replaceInstUsesWith(AI, Constant::getNullValue(AI.getType()));
 
   // Ensure that the alloca array size argument has type equal to the offset
-  // size of the alloca() pointer, which, in the tyical case, is intptr_t,
+  // size of the alloca() pointer, which, in the typical case, is intptr_t,
   // so that any casting is exposed early.
   Type *PtrIdxTy = IC.getDataLayout().getIndexType(AI.getType());
   if (AI.getArraySize()->getType() != PtrIdxTy) {
diff --git a/llvm/test/Transforms/InstCombine/alloca-big.ll b/llvm/test/Transforms/InstCombine/alloca-big.ll
index 6925f1ba988dc1e..2ccfb5afa67493d 100644
--- a/llvm/test/Transforms/InstCombine/alloca-big.ll
+++ b/llvm/test/Transforms/InstCombine/alloca-big.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+; RUN: opt < %s -passes=instcombine -data-layout="p:64:64" -S | FileCheck %s
+; RUN: opt < %s -passes=instcombine -data-layout="p:32:32" -S | FileCheck --check-prefix=CHECK32 %s
 
 ; OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5223
 define void @test_bigalloc(ptr %dst) {
@@ -7,8 +8,27 @@ define void @test_bigalloc(ptr %dst) {
 ; CHECK-NEXT:    [[TMP1:%.*]] = alloca [18446744069414584320 x i8], align 1
 ; CHECK-NEXT:    store ptr [[TMP1]], ptr [[DST:%.*]], align 8
 ; CHECK-NEXT:    ret void
+; CHECK32-LABEL: @test_bigalloc(
+; CHECK32-NEXT:    [[TMP1:%.*]] = alloca [0 x i8], align 1
+; CHECK32-NEXT:    store ptr [[TMP1]], ptr [[DST:%.*]], align 4
+; CHECK32-NEXT:    ret void
 ;
   %1 = alloca i8, i864 -4294967296
   store ptr %1, ptr %dst
   ret void
 }
+
+define void @test_negalloc(ptr %dst) {
+; CHECK-LABEL: @test_negalloc(
+; CHECK-NEXT:    [[TMP1:%.*]] = alloca [4294967295 x i32], align 4
+; CHECK-NEXT:    store ptr [[TMP1]], ptr [[DST:%.*]], align 8
+; CHECK-NEXT:    ret void
+; CHECK32-LABEL: @test_negalloc(
+; CHECK32-NEXT:    [[TMP1:%.*]] = alloca [0 x i32], align 4
+; CHECK32-NEXT:    store ptr [[TMP1]], ptr [[DST:%.*]], align 4
+; CHECK32-NEXT:    ret void
+;
+  %1 = alloca i32, i32 -1
+  store ptr %1, ptr %dst
+  ret void
+}

arsenm

I don't think this is really a solution to codegen dying on oversized arrays. The underlying issue still exists regardless of whether InstCombine cleans up constant sized array allocations.

I think a better fix would be to adjust codegen to treat oversized allocations like dynamic stack allocations

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp

qiongsiwu · 2023-11-02T14:31:42Z

I don't think this is really a solution to codegen dying on oversized arrays. The underlying issue still exists regardless of whether InstCombine cleans up constant sized array allocations.

I think a better fix would be to adjust codegen to treat oversized allocations like dynamic stack allocations

Thanks for chiming in Matt @arsenm !! I agree that this may not fix all the cases that the codegen does not like an oversized constant. That said, it also seems incorrect that we turn alloca i32, i32 -1 into alloca [4294967295 x i32], i32 1 (due to zero extending i32 -1 to uint64_t). An earlier fix I tried was to avoid this optimization when the alloca NumElements is negative. But that failed the existing test case. Should we do nothing when the alloca NumElements value is negative?

qiongsiwu · 2023-11-03T15:25:14Z

@arsenm @nikic Could you comment on what is the defined behaviour for an alloca when NumElements is negative? Is it an undefined behaviour? If so, can we replace such allocas with a poison value?

Thanks so much!

arsenm · 2023-11-03T15:26:43Z

The allocation size is unsigned, it's merely printed as a negative value

nikic · 2023-11-03T15:30:58Z

It's generally understood that an allocation larger than half the address space is undefined behavior -- using a negative alloca size would result in such an allocation.

LangRef isn't super explicit on this though, it only has this note in the GEP specification:

These rules are based on the assumption that no allocated object may cross the unsigned address space boundary, and no allocated object may be larger than half the pointer index type space.

nikic · 2023-11-03T15:32:46Z

LangRef also isn't clear on what exactly the semantics are around that type size * num elements multiplication -- in what bit width is it performed? What happens on overflow?

We should probably clarify these things, otherwise it's hard to say what the correct behavior is.

qiongsiwu · 2023-11-03T16:14:57Z

Thanks for the comment @nikic and @arsenm !! The PR is updated to address Matt's comments. As mentioned #70980 (comment) and #70980 (comment), it is not clear what the correct behivour is without further discussion, so this PR now leaves the allocas alone if the implied array type is too big to minimize the changes to current behaviours.

qiongsiwu · 2023-11-07T01:24:28Z

Closing this PR. A different fix (#71472) is done in the code generator. @arsenm Could you help me take a look? Thanks so much!

Stop instcombine from alloca-ing arrays that are too large

2ac39f7

qiongsiwu requested a review from nikic as a code owner November 1, 2023 21:00

qiongsiwu self-assigned this Nov 1, 2023

llvmbot added the llvm:transforms label Nov 1, 2023

qiongsiwu requested review from RKSimon, w2yehia, scui-ibm and RolandF77 November 1, 2023 21:01

arsenm requested changes Nov 2, 2023

View reviewed changes

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp Outdated Show resolved Hide resolved

llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp Outdated Show resolved Hide resolved

Use DL.getIndexSizeInBits instead of DL.getPointerTypeSizeInBits

6cdf12a

Do not clamp array size to 0 when the size overflows

583cbe9

qiongsiwu requested a review from arsenm November 3, 2023 16:33

qiongsiwu closed this Nov 7, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[InstCombine] Avoid Allocating Arrays Too Large For the Target #70980

[InstCombine] Avoid Allocating Arrays Too Large For the Target #70980

qiongsiwu commented Nov 1, 2023 •

edited

llvmbot commented Nov 1, 2023

arsenm left a comment

qiongsiwu commented Nov 2, 2023 •

edited

qiongsiwu commented Nov 3, 2023

arsenm commented Nov 3, 2023

nikic commented Nov 3, 2023

nikic commented Nov 3, 2023

qiongsiwu commented Nov 3, 2023

qiongsiwu commented Nov 7, 2023

[InstCombine] Avoid Allocating Arrays Too Large For the Target #70980

[InstCombine] Avoid Allocating Arrays Too Large For the Target #70980

Conversation

qiongsiwu commented Nov 1, 2023 • edited

llvmbot commented Nov 1, 2023

arsenm left a comment

Choose a reason for hiding this comment

qiongsiwu commented Nov 2, 2023 • edited

qiongsiwu commented Nov 3, 2023

arsenm commented Nov 3, 2023

nikic commented Nov 3, 2023

nikic commented Nov 3, 2023

qiongsiwu commented Nov 3, 2023

qiongsiwu commented Nov 7, 2023

qiongsiwu commented Nov 1, 2023 •

edited

qiongsiwu commented Nov 2, 2023 •

edited