[SelectionDAG] Disable FastISel for swiftasync functions #70741
Most (x86) swiftasync functions tend to use both SelectionDAGISel and FastISel lowering:
* FastISel argument lowering can only handle C calling convention.
* FastISel fails mid-BB in a number of ways, including in simple `ret void` instructions under certain circumstances.

This dance of SelectionDAG (argument) -> FastISel (some instructions) -> SelectionDAG (remaining instructions) is lossy; in particular, Argument information lowering is cleared after that first SelectionDAG run.

Since swiftasync functions rely heavily on proper Argument lowering for debug information, this patch disables the use of FastISel in such functions.

This was tested by compiling a big translation unit from the Swift concurrency library, and there was no measurable performance impact:

// Without patch (i.e. using FastISel)
Time (mean ± σ): 2.416 s ± 0.016 s [User: 2.321 s, System: 0.068 s]
Range (min … max): 2.403 s … 2.458 s 10 runs

// With patch (i.e. not using FastISel)
Time (mean ± σ): 2.407 s ± 0.011 s [User: 2.313 s, System: 0.067 s]
Range (min … max): 2.396 s … 2.424 s 10 runs
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-llvm-selectiondag
Author: Felipe de Azevedo Piovezan (felipepiovezan)
Full diff: https://github.com/llvm/llvm-project/pull/70741.diff
1 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index 5be9ff0300b0485..d9d1b7d21a3c528 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -1441,11 +1441,21 @@ static void processSingleLocVars(FunctionLoweringInfo &FuncInfo,
   }
 }

+static bool shouldEnableFastISel(const Function &Fn) {
+  // Don't enable FastISel for functions with swiftasync Arguments.
+  // Debug info on those is reliant on good Argument lowering, and FastISel is
+  // not capable of lowering the entire function. Mixing the two selectors tend
+  // to result in poor lowering of Arguments.
+  return none_of(Fn.args(), [](const Argument &Arg) {
+    return Arg.hasAttribute(Attribute::AttrKind::SwiftAsync);
+  });
+}
+
 void SelectionDAGISel::SelectAllBasicBlocks(const Function &Fn) {
   FastISelFailed = false;
   // Initialize the Fast-ISel state, if needed.
   FastISel *FastIS = nullptr;
-  if (TM.Options.EnableFastISel) {
+  if (TM.Options.EnableFastISel && shouldEnableFastISel(Fn)) {
     LLVM_DEBUG(dbgs() << "Enabling fast-isel\n");
     FastIS = TLI->createFastISel(*FuncInfo, LibInfo);
   }
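The predicate in the diff can be tried out in isolation. The following is a minimal sketch, not the LLVM code itself: `ToyArgument`, `ToyFunction`, `useFastISel` and `runExamples` are hypothetical stand-ins introduced only for illustration, and `std::none_of` takes the place of LLVM's range-based `none_of` helper.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for llvm::Argument and llvm::Function. In the patch
// the same information comes from Arg.hasAttribute(...) and Fn.args().
struct ToyArgument {
  bool IsSwiftAsync;
};

struct ToyFunction {
  std::vector<ToyArgument> Args;
};

// Same shape as the patch's predicate: FastISel stays enabled only if no
// argument carries the swiftasync attribute.
bool shouldEnableFastISel(const ToyFunction &Fn) {
  return std::none_of(Fn.Args.begin(), Fn.Args.end(),
                      [](const ToyArgument &Arg) { return Arg.IsSwiftAsync; });
}

// Models the changed condition in SelectAllBasicBlocks: the global option and
// the per-function predicate must both allow FastISel.
bool useFastISel(bool EnableFastISelOption, const ToyFunction &Fn) {
  return EnableFastISelOption && shouldEnableFastISel(Fn);
}

// Small usage example covering the interesting cases.
void runExamples() {
  ToyFunction Plain;
  Plain.Args = {ToyArgument{false}, ToyArgument{false}};
  ToyFunction Async;
  Async.Args = {ToyArgument{false}, ToyArgument{true}};

  assert(useFastISel(true, Plain));   // no swiftasync argument: FastISel allowed
  assert(!useFastISel(true, Async));  // swiftasync argument: SelectionDAG only
  assert(!useFastISel(false, Plain)); // option off: never FastISel
}
```

Because the check is made per function, other functions in the same module keep using FastISel when the option is on.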
Review thread on the line: static bool shouldEnableFastISel(const Function &Fn) {
I thought we had some kind of hook for this somewhere already; I would think this is a target specific decision
OptLevelChanger ?
Thanks for the pointers, I will have a look! It would be nice if this could be done on a per function basis though, as we wouldn't want to disable fast isel for the entire module
AFAICT all the TargetMachine hooks operate independently of the Function being lowered.
Regarding OptLevelChanger, it seems to be used to force the opt level to "None" when we have skipPass(Fn) == true, and then it queries the TM to check if FastISel is enabled for O0.
That said, some of the debug messages inside OptLevelChanger make me think we could move the check from this patch in there. What do you think?
It's worth a try; it's supposed to allow opt level override at the function level, which is precisely what we're after.
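The per-function override idea discussed in this thread is essentially a save/override/restore guard. Below is a minimal sketch of that general pattern only; `ToySelectorState`, `ScopedSelectorOverride` and `fastISelUsedWhileLowering` are hypothetical names, and nothing here is meant to reproduce the actual interface of OptLevelChanger.

```cpp
// Hypothetical, simplified selector state; these are not LLVM's real classes.
struct ToySelectorState {
  int OptLevel;     // 0 models -O0 / -Onone
  bool UseFastISel; // whether FastISel would be used
};

// RAII guard: applies an override on construction and restores the saved
// values when it goes out of scope, i.e. once the function has been lowered.
class ScopedSelectorOverride {
  ToySelectorState &State;
  ToySelectorState Saved;

public:
  ScopedSelectorOverride(ToySelectorState &S, int NewOptLevel, bool NewFastISel)
      : State(S), Saved(S) {
    State.OptLevel = NewOptLevel;
    State.UseFastISel = NewFastISel;
  }
  ~ScopedSelectorOverride() { State = Saved; }

  ScopedSelectorOverride(const ScopedSelectorOverride &) = delete;
  ScopedSelectorOverride &operator=(const ScopedSelectorOverride &) = delete;
};

// Models lowering a single function with FastISel switched off for that
// function only when it has a swiftasync argument.
bool fastISelUsedWhileLowering(ToySelectorState &State, bool HasSwiftAsyncArg) {
  ScopedSelectorOverride Guard(State, State.OptLevel,
                               State.UseFastISel && !HasSwiftAsyncArg);
  return State.UseFastISel; // value observed while the override is active
}
```

The appeal of this shape is that the module-wide setting is untouched once the guard is destroyed, so only the function being lowered is affected.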
Given that this only affects -Onone, and only x86_64, I'm fine with this (assuming all other comments are being addressed).
Do you have any tests that you can add? Particularly the improved Argument lowering handling?
Yup, good thing you asked, because I've just found a bug with my […]
Added a test and tweaked the implementation slightly.
Any other comments / concerns with the implementation?
LGTM
(cherry picked from commit 83729e6)
…le_fast_isel2 [cherry-pick][SelectionDAG] Disable FastISel for swiftasync functions (llvm#70741)