-
Notifications
You must be signed in to change notification settings - Fork 1.8k
C++: Speed up IRGuardCondition::controlsBlock #3654
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This tidies up the code, removing unnecessary repetition.
The `controlsBlock` predicate had some dramatic bulges in its tuple counts. To make matters worse, those bulges were in materialized intermediate predicates like `#shared` and `#antijoin_rhs`, not just in the middle of a pipeline. The problem was particularly evident on kamailio/kamailio, where `controlsBlock` was the slowest predicate in the IR libraries: IRGuards::IRGuardCondition::controlsBlock_dispred#fff#shared#4 ........ 58.8s IRGuards::IRGuardCondition::controlsBlock_dispred#fff#antijoin_rhs .... 33.4s IRGuards::IRGuardCondition::controlsBlock_dispred#fff#antijoin_rhs#1 .. 26.7s The first of the above relations had 201M rows, and the others had intermediate bulges of similar size. The bulges could be observed even on small projects although they did not cause measurable performance issues there. The `controlsBlock_dispred#fff#shared#4` relation had 3M rows on git/git, which is a lot for a project with only 1.5M IR instructions. This commit borrows an efficient implementation from Java's `Guards.qll`, tweaking it slightly to fit into `IRGuards`. Performance is now much better: IRGuards::IRGuardCondition::controlsBlock_dispred#fff ................... 6.1s IRGuards::IRGuardCondition::hasDominatingEdgeTo_dispred#ff .............. 616ms IRGuards::IRGuardCondition::hasDominatingEdgeTo_dispred#ff#antijoin_rhs . 540ms After this commit, the biggest bulge in `controlsBlock` is the size of `IRBlock::dominates`. On kamailio/kamailio this is an intermediate tuple count of 18M rows in the calculation of `controlsBlock`, which in the end produces 11M rows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic LGTM, but it's very subtle, so being based on the Java solution is reassuring!
Instead of a vague reference to a code comment for another language, the `controlsBlock` predicate now has the whole comment in it directly. I've adjusted the wording so it should be reasonably correct for C/C++. As with the other comments in this file, I don't distinguish between the condition and its block. I think that makes the explanation clearer without losing any detail we care about. To make the code fit the wording of the comment, I changed the `hasBranchEdge/2` predicate into `getBranchSuccessor/1`.
exists(IRBlock branchBlock | branchBlock = this.getBranchBlock() | | ||
branchBlock.immediatelyDominates(succ) and | ||
branchBlock.getASuccessor() = succ and | ||
forall(IRBlock pred | pred = succ.getAPredecessor() and pred != branchBlock | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This drops the requirement that pred
is reachable. I think that's now a consistency requirement, but if it's violated, would that lead to a tuple explosion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well spotted. I didn't drop the requirement on purpose, but I'd like to understand whether it's needed.
Is the problem that an unreachable pred
is in principle dominated by every node because the definition of dominance says that "pred
is dominated by succ
if for every path to pred
[...]", and this holds vacuously because there are no paths to pred
?
Does our dominates
predicate actually implement that definition? We implement it as immediatelyDominates*
, where immediatelyDominates
is computed by a HOP. I expect that immediatelyDominates
does not hold for unreachable nodes since otherwise it wouldn't be a tree. Therefore our dominates
relation shouldn't hold for unreachable nodes either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That leads to some false negatives, then. If a pred
is unreachable, then succ.dominates(pred)
won't hold, and so the whole forall
will fail. We do currently have at least one situation where the exit block of a function with an unconditional throw
is kept in later stages, but that wouldn't trigger this issue (and might be a bug in any case).
@dbartol Should unreachable blocks be a consistency violation? I'm happy to merge this if so, otherwise I'd like to re-add the not pred.isReachableFromFunctionEntry()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I buy your argument, so I've restored the case along with a comment for future me.
Testing on Kamailio, adding this case had a 2s performance cost, dominated by the recursion in isReachableFromFunctionEntry
.
If isReachableFromFunctionEntry
always holds, then I suggest removing that predicate entirely and documenting on class IRBlock
that a block is always reachable from the entry point. But that's outside the scope of this PR.
This restores the code I removed in 4642037.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still LGTM, and I note the diff job showed no changes and @rdmarsh2 's comment seems to have been addressed.
This PR is a port of @aschackmull's implementation of Java guardedness to the C++ IR, giving us a performance boost that makes a big difference on kamailio/kamailio. See commit message of the second commit for details.
As evidence for the correctness of this PR, I've checked that the row count of
IRGuards::IRGuardCondition::controlsBlock_dispred#fff
is exactly the same before and after: 11,261,160 rows on the kamailio/kamailio snapshot I got from lgtm.com.