Skip to content

Conversation

@boomanaiden154
Copy link
Contributor

@boomanaiden154 boomanaiden154 commented Nov 29, 2025

This removes the presence of global state from within the pass which is blocking some efforts around test daemonization and is not good design practice in general for LLVM. See
https://discourse.llvm.org/t/rfc-reducing-process-creation-overhead-in-llvm-regression-tests/88612/11 for more discussion.

This patch replaces the usage of global state with a DebugCounter, which helps fix the global state problem and also increases the flexibility of the option as now an explicit range can be passed.

Co-authored-by: Mingming Liu mingmingl@google.com

This removes the presence of global state from within the pass which is
blocking some efforts around test daemonization and is not good design
practice in general for LLVM. See
https://discourse.llvm.org/t/rfc-reducing-process-creation-overhead-in-llvm-regression-tests/88612/11
for more discussion.

This patch replaces the usage of global state with a DebugCounter, which
helps fix the global state problem and also increases the flexibility of
the option as now an explicit range can be passed.
@llvmbot
Copy link
Member

llvmbot commented Nov 29, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Aiden Grossman (boomanaiden154)

Changes

This removes the presence of global state from within the pass which is blocking some efforts around test daemonization and is not good design practice in general for LLVM. See
https://discourse.llvm.org/t/rfc-reducing-process-creation-overhead-in-llvm-regression-tests/88612/11 for more discussion.

This patch replaces the usage of global state with a DebugCounter, which helps fix the global state problem and also increases the flexibility of the option as now an explicit range can be passed.


Full diff: https://github.com/llvm/llvm-project/pull/170009.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp (+5-15)
  • (modified) llvm/test/Transforms/WholeProgramDevirt/import.ll (+5-5)
diff --git a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
index 2dd0fde6b34d6..4642da0abdc13 100644
--- a/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
+++ b/llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
@@ -99,6 +99,7 @@
 #include "llvm/IR/ProfDataUtils.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/CommandLine.h"
+#include "llvm/Support/DebugCounter.h"
 #include "llvm/Support/Errc.h"
 #include "llvm/Support/Error.h"
 #include "llvm/Support/FileSystem.h"
@@ -130,6 +131,8 @@ STATISTIC(NumUniqueRetVal, "Number of unique return value optimizations");
 STATISTIC(NumVirtConstProp1Bit,
           "Number of 1 bit virtual constant propagations");
 STATISTIC(NumVirtConstProp, "Number of virtual constant propagations");
+DEBUG_COUNTER(CallsToDevirt, "calls-to-devirt",
+              "Controls how many calls should be devirtualized.");
 
 namespace llvm {
 
@@ -219,14 +222,6 @@ static cl::opt<bool> WholeProgramDevirtKeepUnreachableFunction(
     cl::desc("Regard unreachable functions as possible devirtualize targets."),
     cl::Hidden, cl::init(true));
 
-/// If explicitly specified, the devirt module pass will stop transformation
-/// once the total number of devirtualizations reach the cutoff value. Setting
-/// this option to 0 explicitly will do 0 devirtualization.
-static cl::opt<unsigned> WholeProgramDevirtCutoff(
-    "wholeprogramdevirt-cutoff",
-    cl::desc("Max number of devirtualizations for devirt module pass"),
-    cl::init(0));
-
 /// Mechanism to add runtime checking of devirtualization decisions, optionally
 /// trapping or falling back to indirect call on any that are not correct.
 /// Trapping mode is useful for debugging undefined behavior leading to failures
@@ -377,9 +372,6 @@ VirtualCallTarget::VirtualCallTarget(GlobalValue *Fn, const TypeMemberInfo *TM)
 
 namespace {
 
-// Tracks the number of devirted calls in the IR transformation.
-static unsigned NumDevirtCalls = 0;
-
 // A slot in a set of virtual tables. The TypeID identifies the set of virtual
 // tables, and the ByteOffset is the offset in bytes from the address point to
 // the virtual function pointer.
@@ -1216,15 +1208,13 @@ void DevirtModule::applySingleImplDevirt(VTableSlotInfo &SlotInfo,
         continue;
 
       // Stop when the number of devirted calls reaches the cutoff.
-      if (WholeProgramDevirtCutoff.getNumOccurrences() > 0 &&
-          NumDevirtCalls >= WholeProgramDevirtCutoff)
-        return;
+      if (!DebugCounter::shouldExecute(CallsToDevirt))
+        continue;
 
       if (RemarksEnabled)
         VCallSite.emitRemark("single-impl",
                              TheFn->stripPointerCasts()->getName(), OREGetter);
       NumSingleImpl++;
-      NumDevirtCalls++;
       auto &CB = VCallSite.CB;
       assert(!CB.getCalledFunction() && "devirtualizing direct call?");
       IRBuilder<> Builder(&CB);
diff --git a/llvm/test/Transforms/WholeProgramDevirt/import.ll b/llvm/test/Transforms/WholeProgramDevirt/import.ll
index de25bc10a7c12..153f18e4fbc3a 100644
--- a/llvm/test/Transforms/WholeProgramDevirt/import.ll
+++ b/llvm/test/Transforms/WholeProgramDevirt/import.ll
@@ -8,12 +8,12 @@
 ; RUN: opt -S -passes=wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-vcp-branch-funnel.yaml < %s | FileCheck --check-prefixes=CHECK,VCP,VCP-X86,VCP64,BRANCH-FUNNEL %s
 ; RUN: opt -S -passes=wholeprogramdevirt -wholeprogramdevirt-summary-action=import -wholeprogramdevirt-read-summary=%S/Inputs/import-branch-funnel.yaml < %s | FileCheck --check-prefixes=CHECK,BRANCH-FUNNEL,BRANCH-FUNNEL-NOVCP %s
 
-; Cutoff value is not explicitly set. Expect 3 remark messages.
+; Devirt calls debug counter is not explicitly set. Expect 3 remark messages.
 ; RUN: opt -S -passes=wholeprogramdevirt -wholeprogramdevirt-summary-action=import -pass-remarks=wholeprogramdevirt -wholeprogramdevirt-read-summary=%S/Inputs/import-single-impl.yaml < %s  2>&1 | grep "single-impl" | count 3
-; Cutoff value is set to 1. Expect one remark messages.
-; RUN: opt -S -passes=wholeprogramdevirt -wholeprogramdevirt-summary-action=import -pass-remarks=wholeprogramdevirt -wholeprogramdevirt-cutoff=1  -wholeprogramdevirt-read-summary=%S/Inputs/import-single-impl.yaml < %s  2>&1 | grep "single-impl" | count 1
-; Cutoff value is explicitly set to zero. Expect no remark message.
-; RUN: opt -S -passes=wholeprogramdevirt -wholeprogramdevirt-summary-action=import -pass-remarks=wholeprogramdevirt -wholeprogramdevirt-cutoff=0  -wholeprogramdevirt-read-summary=%S/Inputs/import-single-impl.yaml < %s 2>&1  | FileCheck -implicit-check-not="remark" %s
+; Devirt calls debug counter is set to 1. Expect one remark messages.
+; RUN: opt -S -passes=wholeprogramdevirt -wholeprogramdevirt-summary-action=import -pass-remarks=wholeprogramdevirt -debug-counter=calls-to-devirt=0  -wholeprogramdevirt-read-summary=%S/Inputs/import-single-impl.yaml < %s  2>&1 | grep "single-impl" | count 1
+; Devirt calls debug counter is set outside the range of calls. Expect no remark message.
+; RUN: opt -S -passes=wholeprogramdevirt -wholeprogramdevirt-summary-action=import -pass-remarks=wholeprogramdevirt -debug-counter=calls-to-devirt=9999  -wholeprogramdevirt-read-summary=%S/Inputs/import-single-impl.yaml < %s 2>&1  | FileCheck -implicit-check-not="remark" %s
 target datalayout = "e-p:64:64"
 target triple = "x86_64-unknown-linux-gnu"
 

Copy link
Contributor

@mingmingl-llvm mingmingl-llvm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the change Aiden! LGTM with some suggestions

; RUN: opt -S -passes=wholeprogramdevirt -wholeprogramdevirt-summary-action=import -pass-remarks=wholeprogramdevirt -wholeprogramdevirt-cutoff=1 -wholeprogramdevirt-read-summary=%S/Inputs/import-single-impl.yaml < %s 2>&1 | grep "single-impl" | count 1
; Cutoff value is explicitly set to zero. Expect no remark message.
; RUN: opt -S -passes=wholeprogramdevirt -wholeprogramdevirt-summary-action=import -pass-remarks=wholeprogramdevirt -wholeprogramdevirt-cutoff=0 -wholeprogramdevirt-read-summary=%S/Inputs/import-single-impl.yaml < %s 2>&1 | FileCheck -implicit-check-not="remark" %s
; Devirt calls debug counter is set to 1. Expect one remark messages.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: setting -debug-counter=calls-to-devirt=2 -print-debug-counter-queries devirts the third callsite, and grep "single-impl" also gives 1.

Could you tighten up the test to add -print-debug-counter-queries and also test that the debug counter output?

-debug-counter=calls-to-devirt=0 -print-debug-counter-queries gives the following output.

DebugCounter calls-to-devirt=0 execute
remark: <unknown>:0:0: single-impl: devirtualized a call to singleimpl1
DebugCounter calls-to-devirt=1 skip
DebugCounter calls-to-devirt=2 skip

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of

static bool isCountingEnabled() {
// Compile to nothing when debugging is off
#ifdef NDEBUG
return false;
#else
return instance().Enabled || instance().ShouldPrintCounter;
#endif
}

is that debug counter needs -DLLVM_ENABLE_ASSERTIONS=ON, and the test will need something like 'REQUIRES: asserts'.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I've split this into a separate test so that we can require assertions only where needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that we can no longer use this option to bisect on a release compiler?

Copy link
Contributor

@mingmingl-llvm mingmingl-llvm Dec 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that we can no longer use this option to bisect on a release compiler?

We cannot bisect on a release compiler.

Fortunately, our internal infrastructure has a compiler that enables debug assertions and also supports enabling assertions on a per-file basis, which should help us work around the limitation internally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an alternative that would have maintained the ability to bisect with a release compiler? Could the counter have been moved into the DevirtModule class?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably just make DebugCounters always available in release builds. I tried this a while ago (https://llvm-compile-time-tracker.com/compare.php?from=100a1d0c4caad0d0f2ec26b07d3cc73f59b9a9a8&to=341c03268229f2982df1fd7ddb5e3db7a2e7731a&stat=instructions:u) and while there is measurable overhead, it's probably acceptably small (and possibly optimizable).

@mingmingl-llvm
Copy link
Contributor

fyi I'm planning to land https://github.com/llvm/llvm-project/pull/170011/files after this change to make the multi-line RUN statement easier to modify.

@boomanaiden154
Copy link
Contributor Author

fyi I'm planning to land https://github.com/llvm/llvm-project/pull/170011/files after this change to make the multi-line RUN statement easier to modify.

I ended up subsuming that patch into this one with the test splitting. I've added you as a coauthor.

@github-actions
Copy link

github-actions bot commented Dec 1, 2025

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef([^a-zA-Z0-9_-]|$)|UndefValue::get)' 'HEAD~1' HEAD llvm/test/Transforms/WholeProgramDevirt/calls-to-devirt.ll llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp llvm/test/Transforms/WholeProgramDevirt/import.ll

The following files introduce new uses of undef:

  • llvm/test/Transforms/WholeProgramDevirt/calls-to-devirt.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@boomanaiden154
Copy link
Contributor Author

I'll submit a follow up patch to fix all the undef uses in the WPD tests at the same time.

@boomanaiden154 boomanaiden154 enabled auto-merge (squash) December 1, 2025 15:10
@boomanaiden154 boomanaiden154 merged commit 979a987 into llvm:main Dec 1, 2025
8 of 10 checks passed
@boomanaiden154 boomanaiden154 deleted the wpd-global-state-11-29-25 branch December 1, 2025 15:35
@llvm-ci
Copy link
Collaborator

llvm-ci commented Dec 1, 2025

LLVM Buildbot has detected a new failure on builder mlir-nvidia running on mlir-nvidia while building llvm at step 7 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/138/builds/22519

Here is the relevant piece of the build log for the reference
Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: Integration/GPU/CUDA/async.mlir' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-kernel-outlining  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -pass-pipeline='builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary="format=fatbin"  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-runner    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_cuda_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_async_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_runner_utils.so    --entry-point-result=void -O0  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-kernel-outlining
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt '-pass-pipeline=builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary=format=fatbin
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-runner --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_cuda_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_async_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_runner_utils.so --entry-point-result=void -O0
# .---command stderr------------
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventSynchronize(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# `-----------------------------
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# .---command stderr------------
# | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir:68:12: error: CHECK: expected string not found in input
# |  // CHECK: [84, 84]
# |            ^
# | <stdin>:1:1: note: scanning from here
# | Unranked Memref base@ = 0x5c507ec81b90 rank = 1 offset = 0 sizes = [2] strides = [1] data = 
# | ^
# | <stdin>:2:1: note: possible intended match here
# | [42, 42]
# | ^
# | 
# | Input file: <stdin>
# | Check file: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: Unranked Memref base@ = 0x5c507ec81b90 rank = 1 offset = 0 sizes = [2] strides = [1] data =  
# | check:68'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |             2: [42, 42] 
# | check:68'0     ~~~~~~~~~
# | check:68'1     ?         possible intended match
...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants