Conversation

@bababuck
Contributor

Currently, we don't optimize function-specialize cases like the following:

int foo(int a) {
  return a + 1;
}

int bar(int a) {
  return foo(a);
}

int main() {
  return bar(3);
}

because when specializing for the 3 passed to bar(), the value isn't propagated into foo() to gauge benefit from specializing foo() as well. With this patch, the optimized code would be:

int foo.specialized.1() {
  return 4;
}

int bar.specialized.2() {
  return foo.specialized.1();
}

int main() {
  return bar.specialized.2();
}

This patch required a fair amount of refactoring before making my changes. That refactoring was done as a series of commits before the main changes for this patch. I'm assuming that (if accepted) this should get merged in as a series of MRs.

The series of commits (x/6) all belong together since they are the same functional change, but I left them as separate commits for easier reading.

At a high level, each Spec element can have SubSpecs, which are functions that it forwards its constant argument(s) to and that should be specialized along with it. In the above example, the Spec for bar(3) would have a SubSpec for foo(3).
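The Spec/SubSpec relationship described above can be modeled with a minimal sketch (this is illustrative only, not the actual patch code; the `Spec`/`SubSpec` names come from the description, and all fields and helpers here are hypothetical):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: a Spec records one specialization candidate,
// and its SubSpecs are the specializations of callees that the
// constant argument(s) are forwarded to.
struct Spec {
  std::string FuncName;        // function being specialized, e.g. "bar"
  int ConstArg;                // the constant argument, e.g. 3
  std::vector<Spec> SubSpecs;  // chained callee specializations, e.g. foo(3)
};

// Build the Spec for bar(3) from the example above: bar forwards its
// constant argument 3 into foo, so foo(3) becomes a SubSpec.
Spec makeBarSpec() {
  Spec FooSpec{"foo", 3, {}};
  return Spec{"bar", 3, {FooSpec}};
}
```

A chain's profitability can then be scored by walking a Spec together with its SubSpecs, which is the recursion the commit messages below refer to.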

…han the minimum amount

If the knob for minimum code size savings is turned down low enough, then for small functions
`MinCodeSizeSavings * FuncSize / 100` will evaluate to `0` under integer division, and with a
strict less-than comparison we will accept specializations that provide no benefit at all.
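The truncation can be demonstrated with a small standalone sketch. This assumes the rejection check has the shape `CodeSizeSavings < MinCodeSizeSavings * FuncSize / 100` as the commit message implies; the real pass logic is more involved, so treat this as a model:

```cpp
#include <cassert>

// Simplified model of the rejection check described above.
// MinCodeSizeSavings is a percentage knob; FuncSize and
// CodeSizeSavings are in abstract cost units.
bool rejected(unsigned MinCodeSizeSavings, unsigned FuncSize,
              unsigned CodeSizeSavings) {
  // Integer division: for small FuncSize the threshold truncates to 0.
  unsigned Threshold = MinCodeSizeSavings * FuncSize / 100;
  // Strict '<': savings of 0 are not below a threshold of 0,
  // so a zero-benefit specialization slips through.
  return CodeSizeSavings < Threshold;
}
```

For example, with `MinCodeSizeSavings = 10` and `FuncSize = 5` the threshold is `50 / 100 == 0`, so a specialization with `CodeSizeSavings == 0` is not rejected.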
…ucture

The data structure will eventually contain extra data for chained and indirect
specialization.
…ogic to its own function

Will want to call recursively for chains.
…ization into macro

Will need to call recursively.

No functional change.
Spec contains a Function, and will need to pass extra information
with Chaining.
This used to be a single object within findSpecializations(), since
each Function entered findSpecializations() only once. But functions
will now be visited in arbitrary order with Chains.
…zation

Cannot rely on AllSpecs being in order after Chaining.
If a function is called with constants and passes those constants to another function,
try to specialize both of those functions.
…en only ever part of a chain

Will get specialized as part of the chain if the chain scores well enough.
… chains in NSpecs

Will get specialized as part of the chain, so they aren't viable standalone.
When calculating possible Chains, use the metrics saved as part
of the sub-specializations.
…ed functions

Otherwise confusing with Chaining.
In the future we won't know the Function at the time of insertion, so
need to store and index so we can look up the Argument later.
…of arguments, skip chaining

See test/Transforms/FunctionSpecialization/compiler-crash-60191.ll
…s part of a chain

This way we can still more accurately see the effect of the specialization.
@github-actions

github-actions bot commented Oct 16, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@labrinea
Collaborator

when specializing for the 3 passed to bar(), the value isn't propagated into foo() to gauge benefit from specializing foo() as well

The function InstCostVisitor::visitCallBase tries to compute benefit from constant folding if possible. In this example it cannot constant fold foo(3) while propagating the constant inside bar's body. However given a small enough codesize threshold this example gets specialized: https://godbolt.org/z/dvfMov9WM

@bababuck
Contributor Author

when specializing for the 3 passed to bar(), the value isn't propagated into foo() to gauge benefit from specializing foo() as well

The function InstCostVisitor::visitCallBase tries to compute benefit from constant folding if possible. In this example it cannot constant fold foo(3) while propagating the constant inside bar's body. However given a small enough codesize threshold this example gets specialized: https://godbolt.org/z/dvfMov9WM

Agreed, thank you for the correction! My understanding is that visitCallBase calls ConstantFolding.cpp::ConstantFoldCall, which is targeted at intrinsics and library calls (please correct me if I'm wrong).

The example on Godbolt only specializes on current upstream due to some odd behavior of the function specializer, see #164867 (that change is included in this MR as well; I have begun splitting off small chunks that can stand on their own).

@labrinea
Collaborator

I briefly looked at the patch series and I am not convinced it's the right approach. Perhaps it's best to handle non constant foldable calls in the instruction cost visitor separately, similarly to branches and switches which are not folded to a constant. I mean to compute the profitability of specializing that call instead of folding it. But then all this adds compile time complexity which I am not sure it is worth it.

@bababuck
Contributor Author

Sorry for the slow response, and thanks for taking the time to engage.

Perhaps it's best to handle non constant foldable calls in the instruction cost visitor separately, similarly to branches and switches which are not folded to a constant.

I think that is a competitive approach; here are my pros and cons of the two approaches.
Pros of the approach in this patch:

  • Able to handle indirect function calls that are constants. Since we currently visit each instruction one argument at a time, the second approach would require additional caching logic to handle this.
  • Can handle chained specialization in a single run() loop, so it doesn't run the risk of specializing the first layer and not the second (in the case where the maximum iteration count is hit or the maximum code size is reached). At least if my understanding is correct, under the other approach the cost metric would allow the outer function to specialize due to the savings of the inner function, and only in the next iteration would the inner function specialize.

Pros of the approach you suggested:

  • Much cleaner implementation, only need to extend a single area of the code

But then all this adds compile time complexity which I am not sure it is worth it.

We were looking into this for a particular case in x264 which we wanted to optimize, but I can collect data on how this code behaves on other workloads.
