C++: Split index calculation from BB membership #123

jbj · 2018-08-31T12:08:53Z

This change gave me a great speedup. I'm not totally done testing this, but I'm opening the PR just to mark it as a 1.18 item.

Instead of computing these two things in one predicate, they are computed in separate predicates and then joined. This splits the predicate primitive_basic_block_member, which took 77s before, into predicates that together take 18s on a medium-sized db.

Instead of computing these two things in one predicate, they are computed in separate predicates and then joined. This splits the predicate `primitive_basic_block_member`, which took 77s before, into predicates that together take 18s on a medium-sized db.

This change gave a slight speed-up by eliminating an unnecessary intermediate predicate.

jbj · 2018-09-01T06:07:52Z

I've retargeted this to rc/1.18 and got it ready for merging.

If someone wants to test this for performance, be sure not to trust the "Clause timing report" printed at the end of the log in QL4E. I've observed it to underestimate the time spent on recursive predicates by a factor of 10. It's better to search through the whole log and subtract the timestamps between the beginning and end of the evaluation of the predicate(s) of interest.

kevinbackhouse · 2018-09-01T14:48:36Z

The QL is correct. It's surprising that it gives such a big speed up! The recursion in getMemberIndex looks essentially identical to the recursion in the old version of primitive_basic_block_member. Is the evaluator faster on 2-column relations, for some reason?

jbj · 2018-09-01T16:55:38Z

I can't really explain why it gets so much faster -- or rather, why it's so slow in the first place. It could be a bad join order that just happens to get fixed by changing up the definitions. I did a little bit of fiddling with pragma[noopt] on the original 3-column relation, but I wasn't able to make it faster.

I also tried using the shortestDistances HOP to assign the numbers. That took the 77s predicate down to 41s.

jbj · 2018-09-01T17:14:59Z

By the way, I've been testing for correctness by checking that this query gives the same results before and after:

import semmle.code.cpp.controlflow.internal.PrimitiveBasicBlocks

from int i
where i = any(PrimitiveBasicBlock b).length()
select i, strictcount(PrimitiveBasicBlock b | i = b.length())

I should also follow up by checking if the IRBlock construction can benefit from the same change.

felicitymay · 2018-09-04T14:09:59Z

Does this merit a change note?

jbj · 2018-09-04T14:28:31Z

I don't think there's a need for a change note. Every release makes some things slower (but better) and other things faster, and we hope it roughly balances out.

dave-bartolomeo

LGTM
I haven't had a chance to reproduce the performance improvement, but I will try to do so today. Feel free to merge before that.

jbj · 2018-09-05T12:20:51Z

I just tested this on Wireshark and saw only a reduction from 35s to 19s. Still, I think it's worth merging even if performance is already adequate on some snapshots.

The use of transitive closure for BB index calculation has been the cause of an out-of-memory error. This commit switches the calculation to use the `shortestDistances` HOP, which still has the problem that the result needs to fit in RAM, but at least the RAM requirements are sure to be linear in the size of the result. The `shortestDistances` HOP is already used for BB index calculation for the C++ IR and for C#. We could guard even better against OOM by switching the calculation to use manual recursion, but that would undo the much-needed performance improvements we got from github#123. This change improves performance on Wireshark, which is notorious for having long basic blocks. When I benchmarked `shortestDistances` for github#123, it was slower than TC. With the current evaluator, it looks like `shortestDistances` is faster. Performance before was: PrimitiveBasicBlocks::Cached::getMemberIndex#ff ................... 9.7s (executed 8027 times) #PrimitiveBasicBlocks::Cached::member_step#ffPlus ................. 6.6s PrimitiveBasicBlocks::Cached::primitive_basic_block_entry_node#f .. 3.5s PrimitiveBasicBlocks::Cached::primitive_basic_block_member#fff .... 2.3s Performance with this commit is: PrimitiveBasicBlocks::Cached::primitive_basic_block_entry_node#f ................................................................... 3.5s shortestDistances@PrimitiveBasicBlocks::Cached::primitive_basic_block_entry_node#1@PrimitiveBasicBlocks::Cached::member_step#2#fff . 3s PrimitiveBasicBlocks::Cached::primitive_basic_block_member#fff ..................................................................... 963ms

AST: ensure and else blocks

Kotlin: More operator tests, and support for more numeric ops

…-resolution Support `super` with `instanceof`

…uper-resolution Support `super` with `instanceof`

…pelineByPropertyName PS: Flow through `ValueFromPipelineByPropertyName` parameters

jbj added the C++ label Aug 31, 2018

jbj added this to the 1.18 milestone Aug 31, 2018

C++: Tidy primitive_basic_block_member calculation

98612b9

This change gave a slight speed-up by eliminating an unnecessary intermediate predicate.

jbj changed the base branch from master to rc/1.18 September 1, 2018 06:03

jbj mentioned this pull request Sep 1, 2018

C++: Split IRBlock index from IRBlock membership #133

Merged

jbj assigned dave-bartolomeo Sep 4, 2018

dave-bartolomeo approved these changes Sep 5, 2018

View reviewed changes

semmle-qlci merged commit a70526f into github:rc/1.18 Sep 5, 2018

jbj mentioned this pull request Oct 5, 2018

C++: Speed up primitive basic block calculation #285

Merged

jbj mentioned this pull request Jun 11, 2019

C++: use shortestDistances in PrimitiveBasicBlocks #1437

Merged

kamarcum unassigned dave-bartolomeo Apr 28, 2020

aibaars added a commit that referenced this pull request Oct 14, 2021

Merge pull request #123 from github/aibaars/ast-ensure

f8ce727

AST: ensure and else blocks

smowton pushed a commit to smowton/codeql that referenced this pull request Dec 6, 2021

Merge pull request github#123 from github/igfoo/ops

045d290

Kotlin: More operator tests, and support for more numeric ops

erik-krogh pushed a commit to erik-krogh/ql that referenced this pull request Dec 15, 2021

Merge pull request github#123 from github/tausbn/add-instanceof-super…

ed17bd9

…-resolution Support `super` with `instanceof`

erik-krogh pushed a commit to erik-krogh/ql that referenced this pull request Dec 15, 2021

QL: Merge pull request github#123 from github/tausbn/add-instanceof-s…

5dbeaa1

…uper-resolution Support `super` with `instanceof`

MathiasVP added a commit to MathiasVP/ql that referenced this pull request Aug 10, 2025

Merge pull request github#123 from microsoft/flow-through-ValueFromPi…

bfa9210

…pelineByPropertyName PS: Flow through `ValueFromPipelineByPropertyName` parameters

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

C++: Split index calculation from BB membership #123

C++: Split index calculation from BB membership #123

Uh oh!

jbj commented Aug 31, 2018

Uh oh!

jbj commented Sep 1, 2018

Uh oh!

kevinbackhouse commented Sep 1, 2018

Uh oh!

jbj commented Sep 1, 2018

Uh oh!

jbj commented Sep 1, 2018

Uh oh!

felicitymay commented Sep 4, 2018

Uh oh!

jbj commented Sep 4, 2018

Uh oh!

dave-bartolomeo left a comment

Uh oh!

jbj commented Sep 5, 2018

Uh oh!

Uh oh!

C++: Split index calculation from BB membership #123

C++: Split index calculation from BB membership #123

Uh oh!

Conversation

jbj commented Aug 31, 2018

Uh oh!

jbj commented Sep 1, 2018

Uh oh!

kevinbackhouse commented Sep 1, 2018

Uh oh!

jbj commented Sep 1, 2018

Uh oh!

jbj commented Sep 1, 2018

Uh oh!

felicitymay commented Sep 4, 2018

Uh oh!

jbj commented Sep 4, 2018

Uh oh!

dave-bartolomeo left a comment

Choose a reason for hiding this comment

Uh oh!

jbj commented Sep 5, 2018

Uh oh!

Uh oh!