Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MemProf] Fix tailcall discovery checking for multiple callee chains #92632

Merged
merged 2 commits into from
May 24, 2024

Conversation

teresajohnson
Copy link
Contributor

When looking for missing frames due to tail calls, we were not checking
the output parameter of the recursive call in the correct place.
Make sure we check for the case when that recursive call returned false
due to multiple possible callee chains.

Extended the existing test a bit to catch this case.

When looking for missing frames due to tail calls, we were not checking
the output parameter of the recursive call in the correct place.
Make sure we check for the case when that recursive call returned false
due to multiple possible callee chains.

Extended the existing test a bit to catch this case.
@llvmbot llvmbot added LTO Link time optimization (regular/full LTO or ThinLTO) llvm:transforms labels May 18, 2024
@llvmbot
Copy link

llvmbot commented May 18, 2024

@llvm/pr-subscribers-lto

@llvm/pr-subscribers-llvm-transforms

Author: Teresa Johnson (teresajohnson)

Changes

When looking for missing frames due to tail calls, we were not checking
the output parameter of the recursive call in the correct place.
Make sure we check for the case when that recursive call returned false
due to multiple possible callee chains.

Extended the existing test a bit to catch this case.


Full diff: https://github.com/llvm/llvm-project/pull/92632.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp (+6-6)
  • (modified) llvm/test/ThinLTO/X86/memprof-tailcall-nonunique.ll (+20-21)
  • (modified) llvm/test/Transforms/MemProfContextDisambiguation/tailcall-nonunique.ll (+16-19)
diff --git a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
index b9d84d583f495..23e067714e754 100644
--- a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
+++ b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
@@ -1889,15 +1889,15 @@ bool ModuleCallsiteContextGraph::findProfiledCalleeThroughTailCalls(
       } else if (findProfiledCalleeThroughTailCalls(
                      ProfiledCallee, CalledFunction, Depth + 1,
                      FoundCalleeChain, FoundMultipleCalleeChains)) {
-        if (FoundMultipleCalleeChains)
-          return false;
+        assert(!FoundMultipleCalleeChains);
         if (FoundSingleCalleeChain) {
           FoundMultipleCalleeChains = true;
           return false;
         }
         FoundSingleCalleeChain = true;
         SaveCallsiteInfo(&I, CalleeFunc);
-      }
+      } else if (FoundMultipleCalleeChains)
+        return false;
     }
   }
 
@@ -2004,8 +2004,7 @@ bool IndexCallsiteContextGraph::findProfiledCalleeThroughTailCalls(
       } else if (findProfiledCalleeThroughTailCalls(
                      ProfiledCallee, CallEdge.first, Depth + 1,
                      FoundCalleeChain, FoundMultipleCalleeChains)) {
-        if (FoundMultipleCalleeChains)
-          return false;
+        assert(!FoundMultipleCalleeChains);
         if (FoundSingleCalleeChain) {
           FoundMultipleCalleeChains = true;
           return false;
@@ -2015,7 +2014,8 @@ bool IndexCallsiteContextGraph::findProfiledCalleeThroughTailCalls(
         // Add FS to FSToVIMap  in case it isn't already there.
         assert(!FSToVIMap.count(FS) || FSToVIMap[FS] == FSVI);
         FSToVIMap[FS] = FSVI;
-      }
+      } else if (FoundMultipleCalleeChains)
+        return false;
     }
   }
 
diff --git a/llvm/test/ThinLTO/X86/memprof-tailcall-nonunique.ll b/llvm/test/ThinLTO/X86/memprof-tailcall-nonunique.ll
index d7cfafec89feb..49c22bf590e6b 100644
--- a/llvm/test/ThinLTO/X86/memprof-tailcall-nonunique.ll
+++ b/llvm/test/ThinLTO/X86/memprof-tailcall-nonunique.ll
@@ -14,10 +14,11 @@
 ; RUN:  -r=%t.o,_Z4baz1v,plx \
 ; RUN:  -r=%t.o,_Z4baz2v,plx \
 ; RUN:  -r=%t.o,_Z3foob,plx \
+; RUN:  -r=%t.o,xyz,plx \
 ; RUN:  -r=%t.o,main,plx \
 ; RUN:  -r=%t.o,_Znam, \
 ; RUN:  -stats -debug -save-temps \
-; RUN:  -o %t.out 2>&1 | FileCheck %s --check-prefix=STATS
+; RUN:  -o %t.out 2>&1 | FileCheck %s --check-prefix=STATS --check-prefix=DEBUG
 
 ; RUN: llvm-dis %t.out.1.4.opt.bc -o - | FileCheck %s --check-prefix=IR
 
@@ -31,22 +32,20 @@
 ; RUN:  -r=%t.o,_Z4baz1v,plx \
 ; RUN:  -r=%t.o,_Z4baz2v,plx \
 ; RUN:  -r=%t.o,_Z3foob,plx \
+; RUN:  -r=%t.o,xyz,plx \
 ; RUN:  -r=%t.o,main,plx \
 ; RUN:  -r=%t.o,_Znam, \
 ; RUN:  -stats -debug \
-; RUN:  -o %t2.out 2>&1 | FileCheck %s --check-prefix=STATS
+; RUN:  -o %t2.out 2>&1 | FileCheck %s --check-prefix=STATS --check-prefix=DEBUG
 
 ;; Run ThinLTO backend
 ; RUN: opt -passes=memprof-context-disambiguation \
 ; RUN:  -memprof-import-summary=%t.o.thinlto.bc \
 ; RUN:  -stats %t.o -S 2>&1 | FileCheck %s --check-prefix=IR
 
-; DEBUG: Not found through unique tail call chain: _Z3barv from main that actually called _Z3foob (found multiple possible chains)
-; DEBUG: Not found through unique tail call chain: _Z3barv from main that actually called _Z3foob (found multiple possible chains)
-; DEBUG: Not found through unique tail call chain: _Z3barv from main that actually called _Z3foob (found multiple possible chains)
-; DEBUG: Not found through unique tail call chain: _Z3barv from main that actually called _Z3foob (found multiple possible chains)
+; DEBUG: Not found through unique tail call chain: 17377440600225628772 (_Z3barv) from 15822663052811949562 (main) that actually called 8716735811002003409 (xyz) (found multiple possible chains)
 
-; STATS: 4 memprof-context-disambiguation - Number of profiled callees found via multiple tail call chains
+; STATS: 1 memprof-context-disambiguation - Number of profiled callees found via multiple tail call chains
 
 ;; Check that all calls in the IR are to the original functions, leading to a
 ;; non-cold operator new call.
@@ -125,17 +124,24 @@ return:                                           ; preds = %if.else, %if.then
 }
 
 ; Function Attrs: noinline
-; IR-LABEL: @main()
-define dso_local i32 @main() local_unnamed_addr #0 {
+; IR-LABEL: @xyz()
+define dso_local i32 @xyz() local_unnamed_addr #0 {
 delete.end13:
   ; IR: call ptr @_Z3foob(i1 true)
-  %call = tail call ptr @_Z3foob(i1 true), !callsite !10
+  %call = tail call ptr @_Z3foob(i1 true)
   ; IR: call ptr @_Z3foob(i1 true)
-  %call1 = tail call ptr @_Z3foob(i1 true), !callsite !11
+  %call1 = tail call ptr @_Z3foob(i1 true)
   ; IR: call ptr @_Z3foob(i1 false)
-  %call2 = tail call ptr @_Z3foob(i1 false), !callsite !12
+  %call2 = tail call ptr @_Z3foob(i1 false)
   ; IR: call ptr @_Z3foob(i1 false)
-  %call3 = tail call ptr @_Z3foob(i1 false), !callsite !13
+  %call3 = tail call ptr @_Z3foob(i1 false)
+  ret i32 0
+}
+
+define dso_local i32 @main() local_unnamed_addr #0 {
+delete.end13:
+  ; IR: call i32 @xyz()
+  %call1 = tail call i32 @xyz(), !callsite !11
   ret i32 0
 }
 
@@ -145,17 +151,10 @@ attributes #0 = { noinline }
 attributes #1 = { nobuiltin allocsize(0) }
 attributes #2 = { builtin allocsize(0) }
 
-!0 = !{!1, !3, !5, !7}
-!1 = !{!2, !"notcold"}
-!2 = !{i64 3186456655321080972, i64 6307901912192269588}
-!3 = !{!4, !"cold"}
-!4 = !{i64 3186456655321080972, i64 6792096022461663180}
+!0 = !{!5, !7}
 !5 = !{!6, !"notcold"}
 !6 = !{i64 3186456655321080972, i64 8632435727821051414}
 !7 = !{!8, !"cold"}
 !8 = !{i64 3186456655321080972, i64 -3421689549917153178}
 !9 = !{i64 3186456655321080972}
-!10 = !{i64 8632435727821051414}
 !11 = !{i64 -3421689549917153178}
-!12 = !{i64 6307901912192269588}
-!13 = !{i64 6792096022461663180}
diff --git a/llvm/test/Transforms/MemProfContextDisambiguation/tailcall-nonunique.ll b/llvm/test/Transforms/MemProfContextDisambiguation/tailcall-nonunique.ll
index b49c9a139cfc1..13c056fc0b8c3 100644
--- a/llvm/test/Transforms/MemProfContextDisambiguation/tailcall-nonunique.ll
+++ b/llvm/test/Transforms/MemProfContextDisambiguation/tailcall-nonunique.ll
@@ -9,10 +9,7 @@
 ; RUN:  -stats -debug %s -S 2>&1 | FileCheck %s --check-prefix=STATS \
 ; RUN:  --check-prefix=IR --check-prefix=DEBUG
 
-; DEBUG: Not found through unique tail call chain: _Z3barv from main that actually called _Z3foob (found multiple possible chains)
-; DEBUG: Not found through unique tail call chain: _Z3barv from main that actually called _Z3foob (found multiple possible chains)
-; DEBUG: Not found through unique tail call chain: _Z3barv from main that actually called _Z3foob (found multiple possible chains)
-; DEBUG: Not found through unique tail call chain: _Z3barv from main that actually called _Z3foob (found multiple possible chains)
+; DEBUG: Not found through unique tail call chain: _Z3barv from main that actually called xyz (found multiple possible chains)
 
 ;; Check that all calls in the IR are to the original functions, leading to a
 ;; non-cold operator new call.
@@ -91,39 +88,39 @@ return:                                           ; preds = %if.else, %if.then
 }
 
 ; Function Attrs: noinline
-; IR-LABEL: @main()
-define dso_local i32 @main() local_unnamed_addr #0 {
+; IR-LABEL: @xyz()
+define dso_local i32 @xyz() local_unnamed_addr #0 {
 delete.end13:
   ; IR: call ptr @_Z3foob(i1 true)
-  %call = tail call ptr @_Z3foob(i1 true), !callsite !10
+  %call = tail call ptr @_Z3foob(i1 true)
   ; IR: call ptr @_Z3foob(i1 true)
-  %call1 = tail call ptr @_Z3foob(i1 true), !callsite !11
+  %call1 = tail call ptr @_Z3foob(i1 true)
   ; IR: call ptr @_Z3foob(i1 false)
-  %call2 = tail call ptr @_Z3foob(i1 false), !callsite !12
+  %call2 = tail call ptr @_Z3foob(i1 false)
   ; IR: call ptr @_Z3foob(i1 false)
-  %call3 = tail call ptr @_Z3foob(i1 false), !callsite !13
+  %call3 = tail call ptr @_Z3foob(i1 false)
+  ret i32 0
+}
+
+define dso_local i32 @main() local_unnamed_addr #0 {
+delete.end13:
+  ; IR: call i32 @xyz()
+  %call1 = tail call i32 @xyz(), !callsite !11
   ret i32 0
 }
 
 ; IR: attributes #[[NOTCOLD]] = { builtin allocsize(0) "memprof"="notcold" }
 
-; STATS: 4 memprof-context-disambiguation - Number of profiled callees found via multiple tail call chains
+; STATS: 1 memprof-context-disambiguation - Number of profiled callees found via multiple tail call chains
 
 attributes #0 = { noinline }
 attributes #1 = { nobuiltin allocsize(0) }
 attributes #2 = { builtin allocsize(0) }
 
-!0 = !{!1, !3, !5, !7}
-!1 = !{!2, !"notcold"}
-!2 = !{i64 3186456655321080972, i64 6307901912192269588}
-!3 = !{!4, !"cold"}
-!4 = !{i64 3186456655321080972, i64 6792096022461663180}
+!0 = !{!5, !7}
 !5 = !{!6, !"notcold"}
 !6 = !{i64 3186456655321080972, i64 8632435727821051414}
 !7 = !{!8, !"cold"}
 !8 = !{i64 3186456655321080972, i64 -3421689549917153178}
 !9 = !{i64 3186456655321080972}
-!10 = !{i64 8632435727821051414}
 !11 = !{i64 -3421689549917153178}
-!12 = !{i64 6307901912192269588}
-!13 = !{i64 6792096022461663180}

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

}

define dso_local i32 @main() local_unnamed_addr #0 {
delete.end13:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think this delete.end13 label is referenced anywhere in this or other tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed from this test.

@teresajohnson teresajohnson merged commit b2f6d32 into llvm:main May 24, 2024
4 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:transforms LTO Link time optimization (regular/full LTO or ThinLTO)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants