Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Argument Promotion needs to be smarter #1259

Closed
llvmbot opened this issue Aug 26, 2006 · 18 comments · Fixed by #78735
Closed

Argument Promotion needs to be smarter #1259

llvmbot opened this issue Aug 26, 2006 · 18 comments · Fixed by #78735
Assignees
Labels
bugzilla Issues migrated from bugzilla confirmed Verified by a second party ipo Interprocedural optimizations quality-of-implementation

Comments

@llvmbot
Copy link
Collaborator

llvmbot commented Aug 26, 2006

Bugzilla Link 887
Version trunk
OS All
Reporter LLVM Bugzilla Contributor
CC @lattner,@jdoerfert,@nlewycky

Extended Description

in the case below, argument promotion should promote the int* x to be passed by
value.

internal int %foo(int* %x, int %n, int %m) {
entry:
        %tmp = setne int %n, 0          ; <bool> [#uses=1]
        br bool %tmp, label %cond_true, label %cond_false

cond_true:              ; preds = %entry
        %tmp2 = load int* %x            ; <int> [#uses=1]
        br label %return

cond_false:             ; preds = %entry
        %tmp5 = load int* %x            ; <int> [#uses=1]
        %tmp7 = sub int %n, 1           ; <int> [#uses=1]
        %tmp9 = call int %foo( int* %x, int %tmp7, int %tmp5 )          ; <int>
[#uses=1]
        %tmp11 = sub int %n, 2          ; <int> [#uses=1]
        %tmp14 = call int %foo( int* %x, int %tmp11, int %m )           ; <int>
[#uses=1]
        %tmp15 = add int %tmp9, %tmp14          ; <int> [#uses=1]
        br label %return

cond_next:              ; No predecessors!
        br label %return

return:         ; preds = %cond_next, %cond_false, %cond_true
        %retval.0 = phi int [ %tmp2, %cond_true ], [ %tmp15, %cond_false ], [
undef, %cond_next ]               ; <int> [#uses=1]
        ret int %retval.0
}

int %bar(int* %x, int %n, int %m) {
entry:
        %tmp3 = call int %foo( int* %x, int %n, int %m )                ; <int>
[#uses=1]
        br label %return

return:         ; preds = %entry
        ret int %tmp3
}
@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 26, 2006

assigned to @jdoerfert

@lattner
Copy link
Collaborator

lattner commented Aug 26, 2006

t'would be nice.  What sort of program does this occur in?

-Chris

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 28, 2006

This causes an extra load (and global address resolve) in every recursive call
in pointer compressed olden. This adds about 5% more instructions to those code
paths.

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 30, 2006

I'll take a stab at this.

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 30, 2006

this patch, while being utterly unsafe, does catch most of the cases I've seen,
so it may or may not help you.

--- lib/Transforms/IPO/ArgumentPromotion.cpp   
8737d6632f76363034264b8c3d20d28757e69ab2
+++ lib/Transforms/IPO/ArgumentPromotion.cpp   
dabd426db30ca38abf3282618f9036ee70ed9923
@@ -240,6 +240,9 @@ bool ArgPromotion::isSafeToPromoteArgume
         }
         GEPIndices.push_back(Operands);
       }
+    } else if (CallInst* CA = dyn_cast<CallInst>(*UI)) {
+      if (CA->getCalledFunction() != Arg->getParent())
+       return false;
     } else {
       return false;  // Not a load or a GEP.
     }
@@ -360,15 +363,17 @@ Function *ArgPromotion::DoPromotion(Func
       for (Value::use_iterator UI = I->use_begin(), E = I->use_end(); UI != E;
            ++UI) {
         Instruction *User = cast<Instruction>(*UI);
-        assert(isa<LoadInst>(User) || isa<GetElementPtrInst>(User));
-        std::vector<Value*> Indices(User->op_begin()+1, User->op_end());
-        ArgIndices.insert(Indices);
-        LoadInst *OrigLoad;
-        if (LoadInst *L = dyn_cast<LoadInst>(User))
-          OrigLoad = L;
-        else
-          OrigLoad = cast<LoadInst>(User->use_back());
-        OriginalLoads[Indices] = OrigLoad;
+       if (!isa<CallInst>(User)) {
+         assert(isa<LoadInst>(User) || isa<GetElementPtrInst>(User));
+         std::vector<Value*> Indices(User->op_begin()+1, User->op_end());
+         ArgIndices.insert(Indices);
+         LoadInst *OrigLoad;
+         if (LoadInst *L = dyn_cast<LoadInst>(User))
+           OrigLoad = L;
+         else
+           OrigLoad = cast<LoadInst>(User->use_back());
+         OriginalLoads[Indices] = OrigLoad;
+       }
       }

       // Add a parameter to the function for each element passed in.
@@ -504,7 +509,8 @@ Function *ArgPromotion::DoPromotion(Func
           LI->getParent()->getInstList().erase(LI);
           DEBUG(std::cerr << "*** Promoted load of argument '" << I->getName()
                           << "' in function '" << F->getName() << "'\n");
-        } else {
+        } else if (CallInst *CI = dyn_cast<CallInst>(I->use_back())) {
+       } else {
           GetElementPtrInst *GEP = cast<GetElementPtrInst>(I->use_back());
           std::vector<Value*> Operands(GEP->op_begin()+1, GEP->op_end());

@llvmbot
Copy link
Collaborator Author

llvmbot commented Sep 3, 2006

The recursion issue is fixed here:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20060828/037393.html

Your actual testcase won't work just yet. ArgumentPromotion still needs to get a bit smarter about what
values are promotable. Particularly, it won't promote *x for you because it's not loaded in the entry block.
I'll try to make that better soon.

@llvmbot
Copy link
Collaborator Author

llvmbot commented Oct 17, 2006

This is not fixed. The previous patch had to be reverted, and the solution seems more complicated that I
initially believed. I do intend to get this done at some point, but I don't have the time to do it soon.

@llvmbot
Copy link
Collaborator Author

llvmbot commented Jan 23, 2007

I'm not going to have the time to do this any time soon, so I'm marking this unassigned.

@jdoerfert
Copy link
Member

I'll look into this.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 3, 2021
@arsenm
Copy link
Contributor

arsenm commented Aug 4, 2023

Modernized testcase:

define internal i32 @foo(ptr %x, i32 %n, i32 %m) {
entry:
  %tmp = icmp ne i32 %n, 0
  br i1 %tmp, label %cond_true, label %cond_false

cond_true:                                        ; preds = %entry
  %tmp2 = load i32, ptr %x, align 4
  br label %return

cond_false:                                       ; preds = %entry
  %tmp5 = load i32, ptr %x, align 4
  %tmp7 = sub i32 %n, 1
  %tmp9 = call i32 @foo(ptr %x, i32 %tmp7, i32 %tmp5)
  %tmp11 = sub i32 %n, 2
  %tmp14 = call i32 @foo(ptr %x, i32 %tmp11, i32 %m)
  %tmp15 = add i32 %tmp9, %tmp14
  br label %return

cond_next:                                        ; No predecessors!
  br label %return

return:                                           ; preds = %cond_next, %cond_false, %cond_true
  %retval.0 = phi i32 [ %tmp2, %cond_true ], [ %tmp15, %cond_false ], [ undef, %cond_next ]
  ret i32 %retval.0
}

define i32 @bar(ptr %x, i32 %n, i32 %m) {
entry:
  %tmp3 = call i32 @foo(ptr %x, i32 %n, i32 %m)
  br label %return

return:                                           ; preds = %entry
  ret i32 %tmp3
}


@Endilll Endilll added the confirmed Verified by a second party label Aug 5, 2023
@vedantparanjape-amd
Copy link
Member

I am looking at this, I was able to get it working in a hacky way. It passes the testcase added, but fails two different testcase. Can someone please take a look at it, and help me improve the code ?

From 2adde4fac85aa8e594478cd9cff08e87879900d6 Mon Sep 17 00:00:00 2001
From: Vedant Paranjape <vedant.paranjape@amd.com>
Date: Thu, 18 Jan 2024 19:52:06 +0000
Subject: [PATCH] Arg promotion promote recursive args as well

---
 llvm/lib/Transforms/IPO/ArgumentPromotion.cpp | 81 ++++++++++++++++++-
 1 file changed, 78 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
index 8058282c4225..b71bafae52cf 100644
--- a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
+++ b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
@@ -436,8 +436,11 @@ static bool allCallersPassValidPointerForArgument(Argument *Arg,
   // direct callees.
   return all_of(Callee->users(), [&](User *U) {
     CallBase &CB = cast<CallBase>(*U);
-    return isDereferenceableAndAlignedPointer(CB.getArgOperand(Arg->getArgNo()),
+    CB.dump();
+    bool a = isDereferenceableAndAlignedPointer(CB.getArgOperand(Arg->getArgNo()),
                                               NeededAlign, Bytes, DL);
+    dbgs() << a << "\n";
+    return a;
   });
 }

@@ -549,6 +552,66 @@ static bool findArgParts(Argument *Arg, const DataLayout &DL, AAResults &AAR,
     return true;
   };

+  auto HandleRecursiveEndUser = [&](auto *I, Type *Ty,
+                           bool GuaranteedToExecute, unsigned ArgNo) -> std::optional<bool> {
+    Value *Ptr = I->getArgOperand(ArgNo);
+    Align PtrAlign = Ptr->getPointerAlignment(DL);
+    APInt Offset(DL.getIndexTypeSizeInBits(Ptr->getType()), 0);
+    Ptr = Ptr->stripAndAccumulateConstantOffsets(DL, Offset,
+                                                 /* AllowNonInbounds */ true);
+    // if (Ptr != Arg)
+    //   return std::nullopt;
+
+    if (Offset.getSignificantBits() >= 64)
+      return false;
+
+    TypeSize Size = DL.getTypeStoreSize(Ty);
+    // Don't try to promote scalable types.
+    if (Size.isScalable())
+      return false;
+
+    int64_t Off = Offset.getSExtValue();
+    auto Pair = ArgParts.try_emplace(
+        Off, ArgPart{Ty, PtrAlign, GuaranteedToExecute ? I : nullptr});
+    ArgPart &Part = Pair.first->second;
+    // bool OffsetNotSeenBefore = Pair.second;
+
+    // We limit promotion to only promoting up to a fixed number of elements of
+    // the aggregate.
+    if (MaxElements > 0 && ArgParts.size() > MaxElements) {
+      LLVM_DEBUG(dbgs() << "ArgPromotion of " << *Arg << " failed: "
+                        << "more than " << MaxElements << " parts\n");
+      return false;
+    }
+
+    // // If this instruction is not guaranteed to execute, and we haven't seen a
+    // // load or store at this offset before (or it had lower alignment), then we
+    // // need to remember that requirement.
+    // // Note that skipping instructions of previously seen offsets is only
+    // // correct because we only allow a single type for a given offset, which
+    // // also means that the number of accessed bytes will be the same.
+    // if (!GuaranteedToExecute &&
+    //     (OffsetNotSeenBefore || Part.Alignment < PtrAlign)) {
+    //   // We won't be able to prove dereferenceability for negative offsets.
+    //   if (Off < 0)
+    //     return false;
+
+    //   // If the offset is not aligned, an aligned base pointer won't help.
+    //   if (!isAligned(PtrAlign, Off))
+    //     return false;
+    //   dbgs() << "ndb: " << NeededDerefBytes << "\n";
+    //   dbgs() << "na: " << NeededAlign.value();
+
+    //   NeededDerefBytes = std::max(NeededDerefBytes, Off + Size.getFixedValue());
+    //   NeededAlign = std::max(NeededAlign, PtrAlign);
+    // }
+
+    // dbgs() << "ndb: " << NeededDerefBytes << "\n";
+    // dbgs() << "na: " << NeededAlign.value() << "\n";
+    Part.Alignment = std::max(Part.Alignment, PtrAlign);
+    return true;
+  };
+
   // Look for loads and stores that are guaranteed to execute on entry.
   for (Instruction &I : Arg->getParent()->getEntryBlock()) {
     std::optional<bool> Res{};
@@ -589,10 +652,13 @@ static bool findArgParts(Argument *Arg, const DataLayout &DL, AAResults &AAR,
       AppendUses(V);
       continue;
     }
-
+    dbgs() << "recursion: " << IsRecursive << "\n";
+    V->dump();
     if (auto *LI = dyn_cast<LoadInst>(V)) {
+      LI->dump();
       if (!*HandleEndUser(LI, LI->getType(), /* GuaranteedToExecute */ false))
         return false;
+      dbgs() << "Load didn't return\n";
       Loads.push_back(LI);
       continue;
     }
@@ -608,14 +674,23 @@ static bool findArgParts(Argument *Arg, const DataLayout &DL, AAResults &AAR,
       // Only stores TO the argument is allowed, all the other stores are
       // unknown users
     }
+    dbgs() << "Store didn't return\n";

+    auto *CI = dyn_cast<CallInst>(V);
+    if (IsRecursive && CI && CI->getFunction() == Arg->getParent()) {
+      dbgs() << "Found recursive call\n";
+      if (!HandleRecursiveEndUser(CI, CI->getType(), U->getOperandNo(),
+                          /* GuaranteedToExecute */ false).value_or(false))
+        return false;
+      continue;
+    }
     // Unknown user.
     LLVM_DEBUG(dbgs() << "ArgPromotion of " << *Arg << " failed: "
                       << "unknown user " << *V << "\n");
     return false;
   }

-  if (NeededDerefBytes || NeededAlign > 1) {
+  if (!IsRecursive && (NeededDerefBytes || NeededAlign > 1)) {
     // Try to prove a required deref / aligned requirement.
     if (!allCallersPassValidPointerForArgument(Arg, NeededAlign,
                                                NeededDerefBytes)) {
--
2.41.0

@Endilll
Copy link
Contributor

Endilll commented Jan 19, 2024

@vedantparanjape-amd You should open a PR. You'll get more feedback this way.

@vedantparanjape-amd
Copy link
Member

@vedantparanjape-amd You should open a PR. You'll get more feedback this way.

Sure, will do so.

@vedantparanjape-amd
Copy link
Member

vedantparanjape-amd commented Jan 19, 2024

@Endilll @jdoerfert @arsenm I have added a PR now.

@vedantparanjape-amd
Copy link
Member

ping!

@Endilll
Copy link
Contributor

Endilll commented Jan 22, 2024

You should wait for at least a week before pinging people. And it's better done in PR

@vedantparanjape-amd
Copy link
Member

You should wait for at least a week before pinging people. And it's better done in PR

Okay, will keep in mind, cheers!

@vedantparanjape-amd
Copy link
Member

This issue is now fixed, closing it. (happy to close the 2nd oldest issue on LLVM)

PR: #78735

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla confirmed Verified by a second party ipo Interprocedural optimizations quality-of-implementation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants