Skip to content

[GlobalOpt] Prevent widenDestArray from shrinking an alloca. #144641

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

topperc
Copy link
Collaborator

@topperc topperc commented Jun 18, 2025

If the destination alloca for one of the memcpy calls we are
modifying is already larger than our desired size, we shouldn't
replace it with a smaller alloca.

topperc added 2 commits June 17, 2025 22:35
If the destination alloca for one of the memcpy calls we are
modifying is already larger than our desired size, we shouldn't
replace it with a smaller alloca.
@llvmbot
Copy link
Member

llvmbot commented Jun 18, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Craig Topper (topperc)

Changes

If the destination alloca for one of the memcpy calls we are
modifying is already larger than our desired size, we shouldn't
replace it with a smaller alloca.


Full diff: https://github.com/llvm/llvm-project/pull/144641.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/GlobalOpt.cpp (+4)
  • (added) llvm/test/Transforms/GlobalOpt/ARM/arm-widen-large-alloca.ll (+27)
diff --git a/llvm/lib/Transforms/IPO/GlobalOpt.cpp b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
index 7db0586386506..f0cd00df23959 100644
--- a/llvm/lib/Transforms/IPO/GlobalOpt.cpp
+++ b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
@@ -2103,6 +2103,10 @@ static void widenDestArray(CallInst *CI, const unsigned NumBytesToPad,
     unsigned ElementByteWidth = SourceDataArray->getElementByteSize();
     unsigned int TotalBytes = NumBytesToCopy + NumBytesToPad;
     unsigned NumElementsToCopy = divideCeil(TotalBytes, ElementByteWidth);
+    // Don't change size if already wide enough.
+    if (Alloca->getAllocatedType()->getArrayNumElements() >= NumElementsToCopy)
+      return;
+
     // Update destination array to be word aligned (memcpy(X,...,...))
     IRBuilder<> BuildAlloca(Alloca);
     AllocaInst *NewAlloca = BuildAlloca.CreateAlloca(ArrayType::get(
diff --git a/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-large-alloca.ll b/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-large-alloca.ll
new file mode 100644
index 0000000000000..4fca1ffefdcaf
--- /dev/null
+++ b/llvm/test/Transforms/GlobalOpt/ARM/arm-widen-large-alloca.ll
@@ -0,0 +1,27 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -mtriple=arm-none-eabi -passes=globalopt -S | FileCheck %s
+
+@.i8 = private unnamed_addr constant [3 x i8] [i8 1, i8 2, i8 3] , align 1
+
+define void @memcpy()  {
+; CHECK-LABEL: define void @memcpy() local_unnamed_addr {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[ALLOCA1:%.*]] = alloca [4 x i8], align 1
+; CHECK-NEXT:    [[ALLOCA2:%.*]] = alloca [5 x i8], align 1
+; CHECK-NEXT:    [[CALL1:%.*]] = call i32 @bar(ptr nonnull [[ALLOCA1]])
+; CHECK-NEXT:    [[CALL2:%.*]] = call i32 @bar(ptr nonnull [[ALLOCA2]])
+; CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(3) [[ALLOCA1]], ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 4, i1 false)
+; CHECK-NEXT:    call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(5) [[ALLOCA2]], ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 4, i1 false)
+; CHECK-NEXT:    ret void
+;
+entry:
+  %alloca1 = alloca [3 x i8], align 1
+  %alloca2 = alloca [5 x i8], align 1
+  %call1 = call i32 @bar(ptr nonnull %alloca1)
+  %call2 = call i32 @bar(ptr nonnull %alloca2)
+  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(3) %alloca1, ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 3, i1 false)
+  call void @llvm.memcpy.p0.p0.i32(ptr noundef nonnull align 1 dereferenceable(5) %alloca2, ptr noundef nonnull align 1 dereferenceable(3) @.i8, i32 3, i1 false)
+  ret void
+}
+
+declare i32 @bar(...)

@topperc topperc changed the title [GlobalOpt] Preven widenDestArray from shrinking an alloca. [GlobalOpt] Prevent widenDestArray from shrinking an alloca. Jun 18, 2025
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation is also wrong in the case where one of the memcpys previously copied less than the full size of the global. It gets changes to copy the new size, which is larger and may overwrite other memory.

I believe you need to replicate all the checks from this condition (which is currently only checked for a single alloca, instead of all of them):

// For safety purposes lets add a constraint and only pad when
// NumElementsToCopy == destination array size ==
// source which is a constant
if (NumElementsToCopy != DZSize || DZSize != SZSize)
continue;

@nikic
Copy link
Contributor

nikic commented Jun 18, 2025

Actually, that check is also incorrect because of the incorrect element instead of byte based logic.

@nikic
Copy link
Contributor

nikic commented Jun 18, 2025

TBH, the more I look at this code, the more unhappy I become. I submitted a full revert of the GlobalOpt code at #144652.

@topperc topperc closed this Jun 18, 2025
nikic added a commit that referenced this pull request Jun 30, 2025
Partially reverts e37d736.

The transform has a number of correctness and code quality issues, and
will benefit from a from-scratch re-review more than incremental fixes.

The correctness issues are hinted at in
#144641, but I think it needs a
larger rework to stop working on ArrayTypes and the implementation could
use some other improvements (like callInstIsMemcpy should just be
`dyn_cast<MemCpyInst>`). I can comment in more detail on a resubmission
of the patch.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jun 30, 2025
Partially reverts e37d736.

The transform has a number of correctness and code quality issues, and
will benefit from a from-scratch re-review more than incremental fixes.

The correctness issues are hinted at in
llvm/llvm-project#144641, but I think it needs a
larger rework to stop working on ArrayTypes and the implementation could
use some other improvements (like callInstIsMemcpy should just be
`dyn_cast<MemCpyInst>`). I can comment in more detail on a resubmission
of the patch.
rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
Partially reverts e37d736.

The transform has a number of correctness and code quality issues, and
will benefit from a from-scratch re-review more than incremental fixes.

The correctness issues are hinted at in
llvm#144641, but I think it needs a
larger rework to stop working on ArrayTypes and the implementation could
use some other improvements (like callInstIsMemcpy should just be
`dyn_cast<MemCpyInst>`). I can comment in more detail on a resubmission
of the patch.
rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
Partially reverts e37d736.

The transform has a number of correctness and code quality issues, and
will benefit from a from-scratch re-review more than incremental fixes.

The correctness issues are hinted at in
llvm#144641, but I think it needs a
larger rework to stop working on ArrayTypes and the implementation could
use some other improvements (like callInstIsMemcpy should just be
`dyn_cast<MemCpyInst>`). I can comment in more detail on a resubmission
of the patch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants