-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C++: Improve alias analysis for indirections #1736
base: main
Are you sure you want to change the base?
Conversation
This produces the expected results on our tests, but I still need to test it (including performance) on real snapshots. |
This is going to be very hard to review. Is there any way it can be split up into multiple commits that are reasonable in size and make sense on their own? |
@jbj As long as the individual commits don't have to actually work in isolation, I can definitely split it up.
I'll have it split in a couple hours. |
Make `bitsToBytesAndBits` omit the leftover bits if zero.
Adds `InlineExpectationsTest.qll`, which makes it easy to create a QL test where the expectations are provided as comments in the source code being tested.
a710609
to
352e750
Compare
OK, I've curated all of the changes into separate commits. Many of those commits are pretty trivial. I've saved |
This uses the autoformatter from internal PR 34050. Some files that will change in github#1736 have been spared. ./build -j4 target/jars/qlformat find ql/cpp/ql -name "*.ql" -print0 | xargs -0 target/jars/qlformat --input find ql/cpp/ql -name "*.qll" -print0 | xargs -0 target/jars/qlformat --input (cd ql && git checkout 'cpp/ql/src/semmle/code/cpp/ir/implementation/**/*SSA*.qll') buildutils-internal/scripts/pr-checks/sync-identical-files.py --latest
Some files that will change in github#1736 have been spared. ./build -j4 target/jars/qlformat find ql/cpp/ql -name "*.ql" -print0 | xargs -0 target/jars/qlformat --input find ql/cpp/ql -name "*.qll" -print0 | xargs -0 target/jars/qlformat --input (cd ql && git checkout 'cpp/ql/src/semmle/code/cpp/ir/implementation/**/*SSA*.qll') buildutils-internal/scripts/pr-checks/sync-identical-files.py --latest
/** | ||
* A memory allocation that can be tracked by the AliasedSSA alias analysis. | ||
* For now, we track all variables accessed within the function, including both local variables | ||
* and global variables. In the future, we will track indirect parameters as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about modeled allocators or default new
?
private Overlap getVariableMemoryLocationOverlap(VariableMemoryLocation def, VariableMemoryLocation use) { | ||
overlappingIRVariableMemoryLocations(def, use) and | ||
result = Interval::getOverlap(def.getStartBitOffset(), def.getEndBitOffset(), use.getStartBitOffset(), use.getEndBitOffset()) | ||
private Overlap getIndirectMemoryLocationOverlap(IndirectMemoryLocation def, IndirectMemoryLocation use) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was written carefully to avoid a quadratic blowup on https://lgtm.com/projects/g/ALaDyn/piccante/latest/files/src/sobol.cpp - can you make sure it's still performant?
*/ | ||
private string getBaseAddressString(ValueNumber baseAddress) { | ||
// If it's the value of a parameter, just use the parameter name. | ||
result = baseAddress.getAnInstruction().(InitializeParameterInstruction).getParameter().toString() or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also special-case VariableAddressInstruction
here?
I recently asked @dbartol to remind me what this PR does.
|
This PR improves our alias analysis and SSA construction for access to memory via indirection, including fields of the indirection. Before, we could get precise flow through fields as long as we knew exactly which variable and which field was being accessed. Now, we can often tell that two accesses touch the same memory location, even if we don't know exactly which memory location that is.
Alias Analysis Changes
AliasAnalysis.qll
is now parameterized base on aConfiguration
, which defines a type namedAllocation
.Allocation
represents a contiguous region of memory that is disjoint from all other instances ofAllocation
. Two memory accesses that belong to different allocations can never overlap. When building unaliased SSA,Allocation
represents an automatic (stack) variable, since that is all we care to model. When building aliased SSA,Allocation
represents aValueNumber
for aVariableAddressInstruction
, but in the future will also includeInitializeParameterInstruction
s and possibly calls tomalloc
-like functions.The entry points into
AliasAnalysis.qll
are now:allocationEscapes(Configuration::Allocation allocation)
: Holds if the specified allocation's address escapes.addressOperandBaseAndConstantOffset(AddressOperand addrOperand, Instruction base, int bitOffset)
: Holds if the specifiedAddressOperand
points to an address equal tobase
plus a constant offsetbitOffset
. Two accesses with the same base address overlap if and only if their offsets overlap.getAddressOperandAllocation(AddressOperand addrOperand)
: Gets the allocation into whichaddrOperand
points, if known.AliasedSSA.qll
has been changed to track the allocation and base address for every indirect memory access. The overlap relationship now uses the following rules to determine overlap:*p
and*p
always exactly overlap, andp->x
never overlapsp->y
, even if we don't know exactly whatp
points to.SimpleSSA.qll
has been changed to track everything in terms ofAllocation
s, but since anAllocation
inSimpleSSA
is just anIRAutomaticVariable
, nothing really changed here.Fixed handling of
PointerOffsetInstruction
in alias analysis to only propagate the pointer operand.Test Changes
Added the first SSA sanity check, for operands with more than one associated
MemoryLocation
, and tests that run this check on our existing IR test sources.Added a simple test for SSA for indirect parameter accesses.
Moved the
points_to.ql
test to its own directory, and made it use a newInlineExpectationsTest
class. This class lets you define the results of your test in terms of (tag, value) pairs, each associated with a location. The base class looks for matching comments of the form$tag=value
on the appropriate line, and reports an output row for each missing or unexpected result. The ".expected" file should always be an empty file. This allows us to modify the test sources, including inserting lines, without getting a bunch of diffs just due to line numbers changing.Miscellaneous
Made
duplicateOperand
have the output of a problem query, including a link to the containing function.Made
IRVariable
non-abstract, so that it is now possible to extendIRVariable
to restrict rather than extend.Fixed a bug where
PrintSSA
would print some properties on the wrong instruction.Refactored the integer constant code a bit. Bit offsets are now printed in the format
bytes:bits
, where:bits
is omitted ifbits
is zero. Thus, 21 bits is printed as2:5
, while 24 bits is printed as just3
.