[GlobalOpt] Prevent widenDestArray from shrinking an alloca. #144641
If the destination alloca for one of the memcpy calls we are modifying is already larger than our desired size, we shouldn't replace it with a smaller alloca.
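The intended rule can be sketched outside LLVM as a small standalone function. This is only a simplified model under stated assumptions, not the GlobalOpt code: `paddedElementCount` and its parameter names are hypothetical, and sizes are plain integers rather than LLVM types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the guard: returns the element count the destination
// array should have after the transform. The result is never smaller than
// the current element count.
std::uint64_t paddedElementCount(std::uint64_t currentElems,
                                 std::uint64_t bytesToCopy,
                                 std::uint64_t bytesToPad,
                                 std::uint64_t elemByteWidth) {
  assert(elemByteWidth != 0 && "element width must be non-zero");
  const std::uint64_t totalBytes = bytesToCopy + bytesToPad;
  // Round up to whole elements, mirroring divideCeil in the patch.
  const std::uint64_t wanted = (totalBytes + elemByteWidth - 1) / elemByteWidth;
  // Keep the existing array if it is already wide enough.
  return currentElems >= wanted ? currentElems : wanted;
}
```

With the values from the new test (3 bytes copied, 1 byte of padding, 1-byte elements), a 3-element array grows to 4 elements while a 5-element array keeps its 5 elements.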
@llvm/pr-subscribers-llvm-transforms
Author: Craig Topper (topperc)
Full diff: https://github.com/llvm/llvm-project/pull/144641.diff
2 Files Affected:
diff --git a/llvm/lib/Transforms/IPO/GlobalOpt.cpp b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
index 7db0586386506..f0cd00df23959 100644
--- a/llvm/lib/Transforms/IPO/GlobalOpt.cpp
+++ b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
@@ -2103,6 +2103,10 @@ static void widenDestArray(CallInst *CI, const unsigned NumBytesToPad,
unsigned ElementByteWidth = SourceDataArray->getElementByteSize();
unsigned int TotalBytes = NumBytesToCopy + NumBytesToPad;
unsigned NumElementsToCopy = divideCeil(TotalBytes, ElementByteWidth);
+ // Don't change size if already wide enough.
+ if (Alloca->getAllocatedType()->getArrayNumElements() >= NumElementsToCopy)
+ return;
+
// Update destination array to be word aligned (memcpy(X,...,...))
IRBuilder<> BuildAlloca(Alloca);
AllocaInst *NewAlloca = BuildAlloca.CreateAlloca(ArrayType::get(
diff --git a/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-large-alloca.ll b/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-large-alloca.ll
new file mode 100644
index 0000000000000..4fca1ffefdcaf
--- /dev/null
+++ b/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-large-alloca.ll
@@ -0,0 +1,27 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -mtriple=arm-none-eabi -passes=globalopt -S | FileCheck %s
+
+@.i8 = private unnamed_addr constant [3 x i8] [i8 1, i8 2, i8 3] , align 1
+
+define void @memcpy() {
+; CHECK-LABEL: define void @memcpy() local_unnamed_addr {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[ALLOCA1:%.*]] = alloca [4 x i8], align 1
+; CHECK-NEXT: [[ALLOCA2:%.*]] = alloca [5 x i8], align 1
+; CHECK-NEXT: [[CALL1:%.*]] = call i32 @bar(ptr nonnull [[ALLOCA1]])
+; CHECK-NEXT: [[CALL2:%.*]] = call i32 @bar(ptr nonnull [[ALLOCA2]])
+; CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(3) [[ALLOCA1]], ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 4, i1 false)
+; CHECK-NEXT: call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(5) [[ALLOCA2]], ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 4, i1 false)
+; CHECK-NEXT: ret void
+;
+entry:
+ %alloca1 = alloca [3 x i8], align 1
+ %alloca2 = alloca [5 x i8], align 1
+ %call1 = call i32 @bar(ptr nonnull %alloca1)
+ %call2 = call i32 @bar(ptr nonnull %alloca2)
+ call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(3) %alloca1, ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 3, i1 false)
+ call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(5) %alloca2, ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 3, i1 false)
+ ret void
+}
+
+declare i32 @bar(...)
The current implementation is also wrong in the case where one of the memcpys previously copied less than the full size of the global. It gets changed to copy the new size, which is larger and may overwrite other memory.
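A minimal standalone illustration of that hazard, assuming a hypothetical 4-byte padded source and an 8-byte destination that already holds other data (`copyPrefix` is an invented helper, not part of LLVM):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the first `len` bytes of a padded source
// {1, 2, 3, 0} into an 8-byte destination pre-filled with 0xAA, which
// stands in for unrelated data already living in the destination.
std::array<std::uint8_t, 8> copyPrefix(std::size_t len) {
  const std::array<std::uint8_t, 4> paddedSource = {1, 2, 3, 0};
  assert(len <= paddedSource.size() && "cannot read past the source");
  std::array<std::uint8_t, 8> dest;
  dest.fill(0xAA);
  std::memcpy(dest.data(), paddedSource.data(), len);
  return dest;
}
```

A 2-byte copy leaves byte 2 of the destination untouched, while rewriting the same call to copy the padded 4 bytes overwrites bytes 2 and 3.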
I believe you need to replicate all the checks from this condition (which is currently only checked for a single alloca, instead of all of them):
llvm-project/llvm/lib/Transforms/IPO/GlobalOpt.cpp
Lines 2185 to 2189 in 4f5b59f
// For safety purposes lets add a constraint and only pad when
// NumElementsToCopy == destination array size ==
// source which is a constant
if (NumElementsToCopy != DZSize || DZSize != SZSize)
  continue;
Actually, that check is also incorrect because it uses element-based instead of byte-based logic.
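To illustrate why the distinction matters, here is a small sketch with a hypothetical `ArrayShape` type (not an LLVM class): two arrays can agree on element count while differing in byte size, so a comparison of element counts does not establish that a copy fits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an array type: element count and element width.
struct ArrayShape {
  std::uint64_t numElements;
  std::uint64_t elementBytes;
};

// Total size in bytes, which is the quantity a memcpy length refers to.
std::uint64_t byteSize(ArrayShape a) {
  assert(a.elementBytes != 0 && "element width must be non-zero");
  return a.numElements * a.elementBytes;
}

// Element-based comparison: ignores the element width.
bool sameElementCount(ArrayShape a, ArrayShape b) {
  return a.numElements == b.numElements;
}

// Byte-based comparison: accounts for the element width.
bool sameByteSize(ArrayShape a, ArrayShape b) {
  return byteSize(a) == byteSize(b);
}
```

For example, a 3-element array of 4-byte elements and a 3-element array of 1-byte elements pass the element-based comparison but occupy 12 and 3 bytes respectively.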
TBH, the more I look at this code, the more unhappy I become. I submitted a full revert of the GlobalOpt code at #144652.