Skip to content

Conversation

jbj
Copy link
Contributor

@jbj jbj commented Jun 9, 2020

This PR is a port of @aschackmull's implementation of Java guardedness to the C++ IR, giving us a performance boost that makes a big difference on kamailio/kamailio. See commit message of the second commit for details.

As evidence for the correctness of this PR, I've checked that the row count of IRGuards::IRGuardCondition::controlsBlock_dispred#fff is exactly the same before and after: 11,261,160 rows on the kamailio/kamailio snapshot I got from lgtm.com.

jbj added 2 commits June 9, 2020 10:33
This tidies up the code, removing unnecessary repetition.
The `controlsBlock` predicate had some dramatic bulges in its tuple
counts. To make matters worse, those bulges were in materialized
intermediate predicates like `#shared` and `#antijoin_rhs`, not just in
the middle of a pipeline.

The problem was particularly evident on kamailio/kamailio, where
`controlsBlock` was the slowest predicate in the IR libraries:

    IRGuards::IRGuardCondition::controlsBlock_dispred#fff#shared#4 ........ 58.8s
    IRGuards::IRGuardCondition::controlsBlock_dispred#fff#antijoin_rhs .... 33.4s
    IRGuards::IRGuardCondition::controlsBlock_dispred#fff#antijoin_rhs#1 .. 26.7s

The first of the above relations had 201M rows, and the others
had intermediate bulges of similar size.

The bulges could be observed even on small projects although they did
not cause measurable performance issues there. The
`controlsBlock_dispred#fff#shared#4` relation had 3M rows on git/git,
which is a lot for a project with only 1.5M IR instructions.

This commit borrows an efficient implementation from Java's
`Guards.qll`, tweaking it slightly to fit into `IRGuards`. Performance
is now much better:

    IRGuards::IRGuardCondition::controlsBlock_dispred#fff ................... 6.1s
    IRGuards::IRGuardCondition::hasDominatingEdgeTo_dispred#ff .............. 616ms
    IRGuards::IRGuardCondition::hasDominatingEdgeTo_dispred#ff#antijoin_rhs . 540ms

After this commit, the biggest bulge in `controlsBlock` is the size of
`IRBlock::dominates`. On kamailio/kamailio this is an intermediate tuple
count of 18M rows in the calculation of `controlsBlock`, which in the
end produces 11M rows.
@jbj jbj added the C++ label Jun 9, 2020
@jbj jbj requested a review from a team as a code owner June 9, 2020 10:28
@jbj
Copy link
Contributor Author

jbj commented Jun 9, 2020

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic LGTM, but it's very subtle, so being based on the Java solution is reassuring!

Instead of a vague reference to a code comment for another language, the
`controlsBlock` predicate now has the whole comment in it directly.

I've adjusted the wording so it should be reasonably correct for C/C++.
As with the other comments in this file, I don't distinguish between the
condition and its block. I think that makes the explanation clearer
without losing any detail we care about.

To make the code fit the wording of the comment, I changed the
`hasBranchEdge/2` predicate into `getBranchSuccessor/1`.
exists(IRBlock branchBlock | branchBlock = this.getBranchBlock() |
branchBlock.immediatelyDominates(succ) and
branchBlock.getASuccessor() = succ and
forall(IRBlock pred | pred = succ.getAPredecessor() and pred != branchBlock |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This drops the requirement that pred is reachable. I think that's now a consistency requirement, but if it's violated, would that lead to a tuple explosion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted. I didn't drop the requirement on purpose, but I'd like to understand whether it's needed.

Is the problem that an unreachable pred is in principle dominated by every node because the definition of dominance says that "pred is dominated by succ if for every path to pred [...]", and this holds vacuously because there are no paths to pred?

Does our dominates predicate actually implement that definition? We implement it as immediatelyDominates*, where immediatelyDominates is computed by a HOP. I expect that immediatelyDominates does not hold for unreachable nodes since otherwise it wouldn't be a tree. Therefore our dominates relation shouldn't hold for unreachable nodes either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That leads to some false negatives, then. If a pred is unreachable, then succ.dominates(pred) won't hold, and so the whole forall will fail. We do currently have at least one situation where the exit block of a function with an unconditional throw is kept in later stages, but that wouldn't trigger this issue (and might be a bug in any case).

@dbartol Should unreachable blocks be a consistency violation? I'm happy to merge this if so, otherwise I'd like to re-add the not pred.isReachableFromFunctionEntry().

Copy link
Contributor Author

@jbj jbj Jun 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I buy your argument, so I've restored the case along with a comment for future me.

Testing on Kamailio, adding this case had a 2s performance cost, dominated by the recursion in isReachableFromFunctionEntry.

If isReachableFromFunctionEntry always holds, then I suggest removing that predicate entirely and documenting on class IRBlock that a block is always reachable from the entry point. But that's outside the scope of this PR.

This restores the code I removed in 4642037.
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM, and I note the diff job showed no changes and @rdmarsh2 's comment seems to have been addressed.

@geoffw0 geoffw0 merged commit 03c6d7a into github:master Jun 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants