-
Notifications
You must be signed in to change notification settings - Fork 1.8k
C++: Speedup product dataflow #13139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
) | ||
} | ||
|
||
pragma[assume_small_delta] | ||
pragma[nomagic] | ||
private predicate revReachableInterprocEntry(Flow1::PathNode node1, Flow2::PathNode node2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are making the reverse calculation much more complicated than it has to be. Instead of mirroring the forward calculation, you can make things much simpler by using fwdIsSuccessor
as your step relation. That way there will be no need to do anything with revLocalPathStepXXX
. You'll get slightly different node-pairs, but that's because the code as-is doesn't calculate revReachableInterprocEntry
as advertised but rather the exits, so your revReachableInterprocEntry
confusingly isn't a subset of fwdReachableInterprocEntry
. With this simplification the reverse flow will simply be a subset of fwdReachableInterprocEntry
and the node pairs from the forward and reverse calculations will be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments. I've force-pushed a version that hopefully implements the simplifications you had in mind 🤞.
1947c88
to
594da1a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was almost what I meant. But you don't need to inline isSuccessor
3 places if you instead just make it fwdIsSuccessor
as before.
cpp/ql/lib/experimental/semmle/code/cpp/dataflow/ProductFlow.qll
Outdated
Show resolved
Hide resolved
cpp/ql/lib/experimental/semmle/code/cpp/dataflow/ProductFlow.qll
Outdated
Show resolved
Hide resolved
cpp/ql/lib/experimental/semmle/code/cpp/dataflow/ProductFlow.qll
Outdated
Show resolved
Hide resolved
} | ||
|
||
private predicate localPathStep1SuccPlus(Flow1::PathNode n1, Flow1::PathNode n2) = | ||
fastTC(prunedLocalPathStep1/2)(n1, n2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait a minute, are you doing recursion through fastTC? That's going to be slow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually planning on asking on Slack how to judge whether or not a fastTC
was appropriate here 😂. How do you know that this will be slow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. TIL! Is it safe to replace the fastTC
with a transitive closure operator? Or should I write out the TC manually to prevent the compiler from generating a fastTC
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to replace the
fastTC
with a transitive closure operator?
If that's the only change, then no, that just puts the choice in the compilers hands about whether to use fastTC or unrolling. Manually writing out the TC might also not be the best, but that's hard to tell. At least it wouldn't be the first option I'd try. The main issue here is a proliferation of recursive dependencies in the pruning of your TCs. If you overapproximate these slightly then you may are able to do source-sink-restricted materialisations of the TCs that are independent of the recursion. Alternatively, the "speed up isSuccessor" commit that you've felt compelled to add, might not be needed at all if you tweak the join-order - currently the join-order is likely to be somewhat restricted, since the original two applications of TC are unlikely to distribute over the disjunction, which will invariably lead to a poor join-order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the motivation for that commit was that I was seeing some very large blowups on a particular project:
Tuple counts for ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdIsSuccessor#4#ffff/4@i10#848b8wcn after 12m10s:
2500 ~0% {4} r1 = SCAN ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdReachableInterprocEntry#2#ff#prev_delta OUTPUT In.0 'pred1', In.1 'pred2', In.0 'pred1', In.1 'pred2'
1498000 ~0% {4} r2 = JOIN ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdReachableInterprocEntry#2#ff#prev_delta WITH #ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::localPathStep1#2Plus#ff ON FIRST 1 OUTPUT Rhs.1 'succ1', Lhs.1 'pred2', Lhs.0 'pred1', Lhs.1 'pred2'
2500 ~3% {3} r3 = SCAN ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdReachableInterprocEntry#2#ff#prev_delta OUTPUT In.1 'pred2', In.0 'pred1', In.0 'pred1'
1211000 ~0% {3} r4 = JOIN ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdReachableInterprocEntry#2#ff#prev_delta WITH #ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::localPathStep1#2Plus#ff ON FIRST 1 OUTPUT Lhs.1 'pred2', Lhs.0 'pred1', Rhs.1 'succ1'
1213500 ~0% {3} r5 = r3 UNION r4
3829668500 ~2% {4} r6 = JOIN r5 WITH #ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::localPathStep2#2Plus#ff ON FIRST 1 OUTPUT Lhs.2 'succ1', Rhs.1 'succ2', Lhs.1 'pred1', Lhs.0 'pred2'
3831166500 ~2% {4} r7 = r2 UNION r6
3831169000 ~2% {4} r8 = r1 UNION r7
0 ~0% {4} r9 = JOIN r8 WITH ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::isSinkPair#2#ff ON FIRST 2 OUTPUT Lhs.2 'pred1', Lhs.3 'pred2', Lhs.0 'succ1', Lhs.1 'succ2'
2000 ~0% {4} r10 = SCAN ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdReachableInterprocEntry#2#ff#prev_delta OUTPUT In.0 'pred1', In.0 'pred1', In.1 'pred2', In.1 'pred2'
69000 ~0% {4} r11 = JOIN ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdReachableInterprocEntry#2#ff#prev_delta WITH #ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::localPathStep1#2Plus#ff ON FIRST 1 OUTPUT Rhs.1 'succ1', Lhs.0 'pred1', Lhs.1 'pred2', Lhs.1 'pred2'
3828324000 ~3% {4} r12 = JOIN r5 WITH #ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::localPathStep2#2Plus#ff ON FIRST 1 OUTPUT Lhs.2 'succ1', Lhs.1 'pred1', Lhs.0 'pred2', Rhs.1 'succ2'
3828393000 ~3% {4} r13 = r11 UNION r12
3828395000 ~3% {4} r14 = r10 UNION r13
765712148 ~2% {7} r15 = JOIN r14 WITH ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::interprocEdge1#5#fffff_20134#join_rhs ON FIRST 1 OUTPUT Rhs.1, Rhs.2, Lhs.3, Rhs.4, Lhs.1 'pred1', Lhs.2 'pred2', Rhs.3 'succ1'
241499 ~9311% {4} r16 = JOIN r15 WITH ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::interprocEdge2#5#fffff_01243#join_rhs ON FIRST 4 OUTPUT Lhs.4 'pred1', Lhs.5 'pred2', Lhs.6 'succ1', Rhs.4 'succ2'
241499 ~9311% {4} r17 = r9 UNION r16
3016 ~18% {4} r18 = r17 AND NOT ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdIsSuccessor#4#ffff#prev(Lhs.0 'pred1', Lhs.1 'pred2', Lhs.2 'succ1', Lhs.3 'succ2')
return r18
and it was slightly mitigated by 594da1a. However, the following join order tweaks removes that problem altogether, and I don't even need that commit anymore 🎉
Pipeline standard for ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdIsSuccessor#4#ffff@3f482yf4 was evaluated in 7 iterations totaling 122ms (delta sizes total: 144364).
688418 ~12154% {4} r1 = JOIN ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdIsSuccessor1#6#ffffff#prev_delta WITH ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdIsSuccessor2#6#ffffff#prev ON FIRST 6 OUTPUT Lhs.0, Lhs.1, Lhs.4, Lhs.5
688418 ~12154% {4} r2 = JOIN ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdIsSuccessor2#6#ffffff#prev_delta WITH ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdIsSuccessor1#6#ffffff#prev ON FIRST 6 OUTPUT Lhs.0, Lhs.1, Lhs.4, Lhs.5
1376836 ~24407% {4} r3 = r1 UNION r2
160986 ~100% {4} r4 = r3 AND NOT ProductFlow#20473ca4::ProductFlow::GlobalWithState#InvalidPointerDeref#e13af12b::AllocToInvalidPointerConfig#::fwdIsSuccessor#4#ffff#prev(Lhs.0, Lhs.1, Lhs.2, Lhs.3)
return r4
This reverts commit 594da1a.
1cf1d9c
to
7b0afd1
Compare
7b0afd1
to
e1cc7dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now.
DCA looks good. No result changes (as expected), and a 15% speedup on Linux 🚀 |
This PR speeds up the experimental product flow library in three steps:
localPathStep
calls in the forward direction,This seems to speed up the analysis quite a lot. For example, running
cpp/invalid-pointer-deref
on https://github.com/ptitSeb/box64 goes from ~30 min to ~5 min. There's still something weird on that project, but at least it's a lot more manageable now.I haven't run DCA on it yet, but I'll do so once Actions feels better 😅.