C++: SSA flow through fields and imprecise defs #1016
I looked at the changes to |
98bd5cf to 23a93a2
I've pushed new changes, mostly to SSAConstruction.qll. SSA construction is now grouped into separate modules for Phi Insertion and Use/Def connection, which will hopefully make it easier to review. Each module has an overview comment that outlines the general approach used. Also, I believe that all predicates in SSA construction now have QLDoc.
Here are a few more superficial comments. Let's arrange to review this with screen sharing. I'm also interested in performance.
cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/AliasedSSA.qll
cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/SSAConstruction.qll
    IntValue endBitOffset) {
  resultPointsTo(instr.getResultAddressOperand().getDefinitionInstruction(), var, startBitOffset) and
  type = instr.getResultType() and
  if exists(instr.getResultSize()) then
    endBitOffset = Ints::add(startBitOffset, Ints::mul(instr.getResultSize(), 8))
Why `Ints::mul` and not just `*`? Also below.
I think these should probably remain. The `startBitOffset` bound by `resultPointsTo()` could be the result of arbitrary pointer arithmetic. Overflow is unlikely, but possible.
I agree that `Ints::add` should remain, but my question was about `Ints::mul`.
Overflow in the `mul` calculation is even less likely, but still theoretically possible if the result of the instruction is a very large array type. Is there likely to be a significant performance cost from this use of `mul`? If so, it's probably OK to remove it, but if there's not likely to be much performance difference, I'd prefer to keep it for (pedantic) correctness.
If `hasResultMemoryAccess` is fast, then it should be fine.
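To make the overflow concern in this thread concrete, here is a rough Python model of overflow-aware helpers. It is only a sketch under stated assumptions: it assumes signed 32-bit integers and uses a hypothetical `UNKNOWN` sentinel for "no known value". The function names mirror `Ints::add` and `Ints::mul` for readability, but this is not the QL library code.

```python
# Hypothetical model of overflow-aware integer helpers (not the QL library).
INT_MIN = -(2**31)   # assumption: values are signed 32-bit integers
INT_MAX = 2**31 - 1
UNKNOWN = None       # hypothetical sentinel for "no known value"


def checked(value):
    """Return value if it fits in a signed 32-bit int, else UNKNOWN."""
    if value is UNKNOWN or not (INT_MIN <= value <= INT_MAX):
        return UNKNOWN
    return value


def add(a, b):
    """Add two values, propagating UNKNOWN and mapping overflow to UNKNOWN."""
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return checked(a + b)


def mul(a, b):
    """Multiply two values, propagating UNKNOWN and mapping overflow to UNKNOWN."""
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return checked(a * b)


def end_bit_offset(start_bit_offset, result_size_bytes):
    """Mirror the shape of the expression under review: start + size * 8."""
    return add(start_bit_offset, mul(result_size_bytes, 8))
```

In this model, a 4-byte result at bit offset 32 ends at bit 64, while a result of 2**28 bytes overflows in the multiplication alone, which is the "very large array type" case mentioned above.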
cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/SSAConstruction.qll
cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/AliasedSSA.qll
I've added a .md file that gives an overview of the whole SSA construction process, from alias analysis to the memory model to SSA itself. That doc is missing the details on the actual connection of defs to uses, but I wanted to get the other parts of it into the PR to give the necessary background to reviewers.
086aa22 to 0bdafb1
  definitionReachesRank(vvar, defBlock, defRank, useRank)
/**
 * Holds if the specified `useLocation` is live on entry to `block`. This holds if there is a use of `useLocation`
 * that is reachable from the start of `block` without passing through a definition that overlaps `useLocation`.
Should this be "without passing through a definition that totally overlaps useLocation"? If not, can you add a clarification about how this interacts with Chi nodes?
Clarification added
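The distinction the reviewer raises between total and partial overlap can be sketched as follows. This is a hypothetical single-block model in Python, not the QL predicate: memory locations are half-open bit ranges, and it assumes the reading suggested in the review comment, where only a totally overlapping definition ends liveness. The clarification that was actually added to the QLDoc is not shown in this thread, so treat that choice as an assumption.

```python
def overlap(defn, use):
    """Classify how bit range `defn` overlaps bit range `use`.

    Ranges are half-open (start, end) pairs. Returns "none", "partial",
    or "total" ("total" means `defn` covers every bit of `use`).
    """
    d_start, d_end = defn
    u_start, u_end = use
    if d_end <= u_start or u_end <= d_start:
        return "none"
    if d_start <= u_start and u_end <= d_end:
        return "total"
    return "partial"


def live_on_entry(events, use_location):
    """Single-block approximation of "live on entry".

    `events` is the ordered list of ("def" | "use", (start, end)) pairs in
    one block. Assumption: only a totally overlapping definition kills
    liveness; a partial definition lets the old value flow onward.
    A real analysis would also follow successor blocks; this sketch stops
    at the end of the block and reports False.
    """
    for kind, location in events:
        if kind == "use" and location == use_location:
            return True
        if kind == "def" and overlap(location, use_location) == "total":
            return False
    return False
```

For example, a store to the first 32 bits of a 64-bit struct followed by a load of the whole struct leaves the struct live on entry, because part of the loaded value still comes from before the block.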
Initial perf info on ChakraCore: Running constant_func.ql, which consumes aliased SSA, the total slowdown was <3%, which is actually much faster than I expected. However, it doesn't seem to be improving the precision of the use/def connections as much as I expected it to. I'm investigating that now. Wireshark numbers to come.
We decided yesterday that this is not ready for 1.20, so I'll remove that milestone. We can retarget the PR when there's been a mergeback.
The mergeback #1082 just went in, so I'll try to retarget this PR to |
794e1f6 to 7879a53
6618503 to 80ad11b
Can you summarise what has changed since I last looked at this?
There have been few real changes. Mostly, it's a rebase, plus updating test expectations for IR construction improvements that have been merged since the previous review. The dataflow test expectation updates do show a couple of concrete examples of where we get better results than the AST-based dataflow because we can model field accesses more precisely.
OK, now there's been one significant change: in an IR dump, an operand that is not an exact use of its definition instruction (e.g. load of a field defined by a store of the entire struct) now has a "~" prefix. I had implemented this earlier while working on this PR, but only enabled it just now to avoid polluting the IR diffs caused by real SSA changes.
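The "~" rule described in this comment can be modeled roughly as below. This is a hypothetical Python sketch, not the IR printer: `format_operand`, the bit-range representation, and the operand spelling `m0_1` are all made up for illustration.

```python
def format_operand(def_name, def_location, use_location):
    """Render an operand, marking inexact uses with a "~" prefix.

    Locations are (start_bit, end_bit) pairs. A use is treated as exact
    only when it reads precisely the range its definition wrote.
    """
    prefix = "" if def_location == use_location else "~"
    return prefix + def_name


# A store of an entire 64-bit struct, then a load of its second 32-bit field.
whole_struct = (0, 64)
second_field = (32, 64)
rendered = format_operand("m0_1", whole_struct, second_field)
```

Here `rendered` carries the prefix because the load reads only part of what the store defined, whereas a load of the whole struct would be printed without it.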
There are still two unaddressed comments from earlier reviews by @rdmarsh2 and me. Also, you promised performance testing on Wireshark. I think this kind of change ought to be benchmarked on a handful of big snapshots since it may be sensitive to specific patterns that confuse the alias analysis in very large functions.
d780420 to cde3256
Latest run on Wireshark shows an 8% overall slowdown to evaluate all the way to aliased SSA. This is more than the 2.5% slowdown in ChakraCore, but still consistent with Wireshark's reputation as a stress test for this sort of change.
@rdmarsh2 I believe this is ready for final approval and merge, if the 8% slowdown on Wireshark seems reasonable.
Have you looked at the change in the most expensive predicates with and without this PR? I think we can merge with 8%, but we shouldn't introduce low-hanging fruit for performance improvements.
I think I remember Dave saying that no single predicate stands out as being slow(er). Is that right?
I kept forgetting which operand on a Chi instruction was which, so I added dump labels. I added labels for the function target of a `Call`, for positional arguments, and for address operands as well.
34512ff to 7071692
There are a couple of SSA predicates that show up near the top on Wireshark.
@dave-bartolomeo I've investigated the slowness in |
The diffs in SSAConstruction.qll are pretty extensive. While the overall approach is similar to what was there before, you're probably better off just looking at the new implementation without trying to worry about line-by-line differences from the old implementation.
If you look at the diffs to the test expectations commit-by-commit, you'll see how the newly-generated SSA differs from the previous one. Primarily, you'll see uses getting hooked up to exact definitions if possible, instead of always being hooked up to the result of a `Chi`. You'll also see `Phi` instructions inserted for fields, instead of just for entire virtual variables.
Most of the changes are described in the markdown files that I've added alongside the code.