Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BOLT] Enhance fixed indirect branch handling #71324

Merged
merged 1 commit into from
Nov 16, 2023

Conversation

yota9
Copy link
Member

@yota9 yota9 commented Nov 5, 2023

Previously HasFixedIndirectBranch was set in BF to set isSimple to false
later because of unreachable bb ellimination pass which might remove the
BB with it's symbols accessed by other instructions than calls. It seems
to be that better solution would be to add extra entry point on target
offset instead of marking BF as non-simple.

@yota9
Copy link
Member Author

yota9 commented Nov 5, 2023

@maksfb Would be thankful for your comments :)

@yota9 yota9 changed the title [BOLT] Enchaise fixed indirect branch handling [BOLT] Enhance fixed indirect branch handling Nov 5, 2023
@maksfb
Copy link
Contributor

maksfb commented Nov 6, 2023

Overall, makes sense to me. I need to check internal tests with this PR before approving the change. Did you encounter a fixed indirect branch recently?

@yota9
Copy link
Member Author

yota9 commented Nov 6, 2023

@maksfb Thanks! I'm looking to support it on AArch64 and saw the opportunity for enhancement :)

@yota9
Copy link
Member Author

yota9 commented Nov 6, 2023

@maksfb Answering your question - I saw a few functions here during fixing the #70771 . As I wrote to Amir #71278 it looks like O0 file linked in with the code like:
adr x0, 1f
br x0
1: ret

So last br might be omitted at least. Plus I've found out that the adr on function makes it non-simple due to "potentially escaped address" in handleAddressRef, so I'm thinking to allow it for handling such situations.

@maksfb
Copy link
Contributor

maksfb commented Nov 7, 2023

One of our internal binaries binaries is different with this patch. I need to check why.

@yota9
Copy link
Member Author

yota9 commented Nov 7, 2023

@maksfb Sure. Probably some optimisation effects on the functions with fixed indirect branch? previously they were marked as non-simple and no optimisations were made.

@@ -54,7 +54,6 @@ enum class IndirectBranchType : char {
POSSIBLE_TAIL_CALL, /// Possibly a tail call.
POSSIBLE_JUMP_TABLE, /// Possibly a switch/jump table.
POSSIBLE_PIC_JUMP_TABLE, /// Possibly a jump table for PIC.
POSSIBLE_GOTO, /// Possibly a gcc's computed goto.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't remove it. I have a WIP diff that adds support and it's known to be present in cpython interpreter loop (_PyEval_EvalFrameDefault,_PyEval_EvalFrame_AOT_Interpreter)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you ask so. But may I ask what is the difference between POSSIBLE_TAIL_CALL, POSSIBLE_GOTO and POSSIBLE_FIXED_BRANCH? I think we need to do something with it, it looks nearly the same to me, at least from ARM perspective.

Copy link
Member Author

@yota9 yota9 Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand POSSIBLE_FIXED_BRANCH is signle-entry jump table. Shouldn't we add the same logic to POSSIBLE_TAIL_CALL with createUncondBranch to POSSIBLE_TAIL_CALL @maksfb? It is out of scope of this review, but still looks a bit inconsistent to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that tailcall is inter-function control transfer, computed goto is an indirect internal control transfer with a slightly more general address calculation pattern at least on x86 (jump table base address is computed and stored in a register?), and a possible fixed branch is an internal indirect branch to a single target known statically. So it's partially overlapping but pretty well defined classification.

Copy link
Member Author

@yota9 yota9 Nov 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well we're checking tailcall target after we would "classify" it as tailcall, so it might be both local and global. As for possible fixed branch - the same, it is even checked to be relaxed only if it is local, but not on global. As for x86 I imagine you would return this value on some kind of special pattern found, but generally speaking it seems like one of two anyway. I don't really mind, just think it needs to have better description, probably target-dependent, I imagine that on ARM we might use it a bit other way, right now it has some overlap target-specific meaning, but generally seems to overlap, yes

@yota9
Copy link
Member Author

yota9 commented Nov 14, 2023

@maksfb Fill free to write me if you need a help with difference analyses :)

@maksfb
Copy link
Contributor

maksfb commented Nov 15, 2023

@maksfb Fill free to write me if you need a help with difference analyses :)

The miscompare was nothing serious and everything else passes. My concern was that there could be a jump table with unused entries that were in unreachable code. Since we ignore the jump table and optimize the function with UCE, we could end-up with the jump table referencing undefined labels. But I couldn't find any binary with such an anomaly. Even if this happens, we can implement a better protection mechanism against undefined symbols.

TL;DR: LGTM except for POSSIBLE_GOTO change. Please update the summary too.

Previously HasFixedIndirectBranch was set in BF to set isSimple to false
later because of unreachable bb ellimination pass which might remove the
BB with it's symbols accessed by other instructions than calls. It seems
to be that better solution would be to add extra entry point on target
offset instead of marking BF as non-simple.
@yota9
Copy link
Member Author

yota9 commented Nov 15, 2023

@maksfb Thanks for review, done. Please approve the patch.

My concern was that there could be a jump table with unused entries that were in unreachable code.

I don't think this is the case here. We're optimising here BF with single-entry jump table the target of which we know. I think we do optimise functions with jump tables, so it probably should be fine. Maybe I miss something though..

@yota9 yota9 merged commit 5b59540 into llvm:main Nov 16, 2023
3 checks passed
sr-tream pushed a commit to sr-tream/llvm-project that referenced this pull request Nov 20, 2023
Previously HasFixedIndirectBranch was set in BF to set isSimple to false
later because of unreachable bb ellimination pass which might remove the
BB with it's symbols accessed by other instructions than calls. It seems
to be that better solution would be to add extra entry point on target
offset instead of marking BF as non-simple.
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
Previously HasFixedIndirectBranch was set in BF to set isSimple to false
later because of unreachable bb ellimination pass which might remove the
BB with it's symbols accessed by other instructions than calls. It seems
to be that better solution would be to add extra entry point on target
offset instead of marking BF as non-simple.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants