Skip to content

Conversation

qiongsiwu
Copy link
Contributor

@qiongsiwu qiongsiwu commented Nov 7, 2023

Situations may arise leading to negative NumElements argument of an alloca instruction. In this case the NumElements is treated as a large unsigned value. Such large arrays may cause the size constant to overflow during code generation under 32 bit mode, leading to a crash. This PR limits the constant's bit width to the width of the pointer on the target. With this fix,

alloca i32, i32 -1

and

alloca [4294967295 x i32], i32 1

generates the exact same PowerPC assembly code under 32 bit mode.

@qiongsiwu qiongsiwu self-assigned this Nov 7, 2023
@llvmbot llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Nov 7, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 7, 2023

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-selectiondag

Author: Qiongsi Wu (qiongsiwu)

Changes

instcombine currently generates large arrays when the NumElements argument of an alloca instruction is negative. Such large arrays may cause the size constant to overflow during code generation under 32 bit mode, leading to a crash. This PR limits the constant's bit width to the width of the pointer on the target. With this fix,

alloca i32, i32 -1

and

alloca [4294967295 x i32], i32 1

generates the exact same PowerPC assembly code under 32 bit mode.


Full diff: https://github.com/llvm/llvm-project/pull/71472.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+4-3)
  • (added) llvm/test/CodeGen/PowerPC/alloca-neg-size.ll (+46)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index aab0d5c5a348bfe..d5ffaf28ca2d499 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -4138,9 +4138,10 @@ void SelectionDAGBuilder::visitAlloca(const AllocaInst &I) {
                                           APInt(IntPtr.getScalarSizeInBits(),
                                                 TySize.getKnownMinValue())));
   else
-    AllocSize =
-        DAG.getNode(ISD::MUL, dl, IntPtr, AllocSize,
-                    DAG.getConstant(TySize.getFixedValue(), dl, IntPtr));
+    AllocSize = DAG.getNode(ISD::MUL, dl, IntPtr, AllocSize,
+                            DAG.getConstant(APInt(IntPtr.getScalarSizeInBits(),
+                                                  TySize.getFixedValue()),
+                                            dl, IntPtr));
 
   // Handle alignment.  If the requested alignment is less than or equal to
   // the stack alignment, ignore it.  If the size is greater than or equal to
diff --git a/llvm/test/CodeGen/PowerPC/alloca-neg-size.ll b/llvm/test/CodeGen/PowerPC/alloca-neg-size.ll
new file mode 100644
index 000000000000000..ba22c0a71294b8d
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/alloca-neg-size.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3
+; The instcombine pass can turn
+;     alloca i32, i32 -1
+; to
+;     alloca [4294967295 x i32], i32 1
+; because it zero extends the NumElements to unit64_t.
+; The zero extension can lead to oversized arrays on a 32 bit system.
+; Alloca-ing an array of size bigger than half of the address space
+; is most likely an undefined behaviour, but the code generator
+; should not crash in such situations.
+; RUN: llc < %s -mtriple=powerpc-ibm-aix-xcoff | FileCheck %s
+define void @test_negalloc(ptr %dst, i32 %cond) {
+; CHECK-LABEL: test_negalloc:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    stw 31, -4(1)
+; CHECK-NEXT:    stwu 1, -80(1)
+; CHECK-NEXT:    cmplwi 4, 0
+; CHECK-NEXT:    mr 31, 1
+; CHECK-NEXT:    beq 0, L..BB0_2
+; CHECK-NEXT:  # %bb.1: # %if.then
+; CHECK-NEXT:    li 4, 0
+; CHECK-NEXT:    addi 5, 31, 80
+; CHECK-NEXT:    stwux 5, 1, 4
+; CHECK-NEXT:    addi 4, 1, 32
+; CHECK-NEXT:    b L..BB0_3
+; CHECK-NEXT:  L..BB0_2:
+; CHECK-NEXT:    addi 4, 31, 44
+; CHECK-NEXT:  L..BB0_3: # %if.end
+; CHECK-NEXT:    stw 4, 0(3)
+; CHECK-NEXT:    lwz 1, 0(1)
+; CHECK-NEXT:    lwz 31, -4(1)
+; CHECK-NEXT:    blr
+entry:
+  %0 = alloca [8 x i32], i32 1, align 4
+  %tobool = icmp ne i32 %cond, 0
+  br i1 %tobool, label %if.then, label %if.end
+
+if.then:
+  %vla1 = alloca [4294967295 x i32], i32 1, align 4
+  br label %if.end
+
+if.end:
+  %arr = phi ptr [%0, %entry], [%vla1, %if.then]
+  store ptr %arr, ptr %dst
+  ret void
+}

@arsenm
Copy link
Contributor

arsenm commented Nov 7, 2023

Does GlobalISel need the same change?

@qiongsiwu
Copy link
Contributor Author

qiongsiwu commented Nov 7, 2023

Does GlobalISel need the same change?

Thanks for pointing out GlobalISel. I suspect we do not need the exact same change because IRTranslator already takes care of restricting the constant width. See

getOrCreateVReg(*ConstantInt::get(IntPtrIRTy, DL->getTypeAllocSize(Ty)));

The new test case can proceed to the legalizer with -mtriple=riscv[64|32]-unknown-linux-gnu -global-isel, but both fail in the legalizer with errors

# 64 bit
LLVM ERROR: unable to legalize instruction: %14:_(p0) = G_DYN_STACKALLOC %13:_(s64), 1 (in function: test_oversized)
# 32 bit
LLVM ERROR: unable to legalize instruction: %12:_(p0) = G_DYN_STACKALLOC %11:_(s32), 1 (in function: test_oversized)

This does not look like the same issue as the one this PR is fixing. If it is reasonable, I prefer leaving this PR fixing the selection DAG, and open an separate issue for GlobalISel.

Does this sound reasonable?

@qiongsiwu qiongsiwu changed the title [CodeGen] Handling Oversized Alloca Types under 32 bit Mode to Avoid Code Generator Crash [SelectionDAG] Handling Oversized Alloca Types under 32 bit Mode to Avoid Code Generator Crash Nov 7, 2023
@qiongsiwu
Copy link
Contributor Author

@arsenm Hi Matt! The PR description is also updated to avoid mentioning any specific optimization passes. Earlier change requests are either addressed, or I have some follow up questions and I would like to seek your clarification (here and here).

Could you get back to me on the two questions? Thanks so much!

@qiongsiwu qiongsiwu requested a review from arsenm November 8, 2023 14:28
@qiongsiwu
Copy link
Contributor Author

Hi @arsenm! Could you provide some feedback/clarifications? We would like to get this fix going since there are other work that depends on it. I appreciate your timely feedback!

Thanks!

@arsenm
Copy link
Contributor

arsenm commented Nov 10, 2023

This does not look like the same issue as the one this PR is fixing. If it is reasonable, I prefer leaving this PR fixing the selection DAG, and open an separate issue for GlobalISel.

Does this sound reasonable?

Sounds like there's no issue, and could just use a test. The target also doesn't matter, you should be able to use aarch64 with fewer issues

@qiongsiwu
Copy link
Contributor Author

This does not look like the same issue as the one this PR is fixing. If it is reasonable, I prefer leaving this PR fixing the selection DAG, and open an separate issue for GlobalISel.
Does this sound reasonable?

Sounds like there's no issue, and could just use a test. The target also doesn't matter, you should be able to use aarch64 with fewer issues

Thanks! The test is revised to cover aarch64 GlobalISel in addition to powerpc.

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with nit

@qiongsiwu
Copy link
Contributor Author

Landing this PR now since it technically is approved (#71472 (review)) and it passes all checks. Thanks again for your inputs @arsenm !

@qiongsiwu qiongsiwu merged commit c8b1109 into llvm:main Nov 14, 2023
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
…void Code Generator Crash (llvm#71472)

Situations may arise leading to negative `NumElements` argument of an
`alloca` instruction. In this case the `NumElements` is treated as a
large unsigned value. Such large arrays may cause the size constant to
overflow during code generation under 32 bit mode, leading to a crash.
This PR limits the constant's bit width to the width of the pointer on
the target. With this fix,
```
alloca i32, i32 -1
```
and
```
alloca [4294967295 x i32], i32 1
```
generates the exact same PowerPC assembly code under 32 bit mode.
@chfast
Copy link
Member

chfast commented Nov 21, 2023

This is related to #63377.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants