C++: Fix pointer/pointee conflation #13191
exists(Node adjusted |
indirectConversionFlowStep*(adjusted, nodeFrom) and
nodeToDefOrUse(adjusted, defOrUse, uncertain) and
private predicate adjustForPointerArith(PostUpdateNode pun, UseOrPhi use) {
Check warning
Code scanning / CodeQL
Missing QLDoc for parameter
At this point just some questions to further my understanding. A second pair of eyes would be useful.
It seems that before postUpdateFlow depended on ssaFlowImpl, while we have now completely disconnected the two, moving the relevant parts of ssaFlowImpl into postUpdateFlow. Is that correct? If so, why do we no longer need a restriction on PostUpdateNodes in ssaFlowImpl?
Yep, that's correct. At the end, they both end up calling char* p = /* ... */;
write_to_argument(p + n);
// ...
use(p); because
By "a restriction on Does that make sense? |
Indeed.
Not quite. My assumption was that we needed |
Ah, I see. No, we shouldn't be able to hit the bad case from the QLDoc in |
So was |
IIRC, I included that negation because the case where |
Ok, that clarifies things. |
then preUpdate = [nFrom, getAPriorDefinition(defOrUse)]
else preUpdate = nFrom
not exists(DataFlowCall call |
isArgumentOfCallable(call, preUpdate) and isArgumentOfCallable(call, nodeTo)
Final question from my side. I think I mostly follow along now, but would definitely appreciate @rdmarsh2's input.
What is the reason for ignoring the argument positions here? Assuming this is important, could we add a test for this?
It shouldn't be important, no. The main thing is that we want to disallow flow from the PostUpdateNode and back into the function as an argument (which violates the evaluation order). So it doesn't really matter what the argument order is, since the PostUpdateNode always represents the value after we've returned from the function and the ArgumentNode always represents the value before we've entered the function.
With that said, I'd also be interested in knowing if I've missed something here (cc @rdmarsh2).
I think there would never be a correct step directly from the postupdate to a preupdate on the same call... That ought to only be possible in a loop, and there should be an intervening phi node in that case. If it does come up I think it's an IR inconsistency problem.
Yeah, I agree. The closest we can come to flow directly from a PostUpdateNode and back to the argument is something like:
int x = 0;
// ...
while(...) {
write_to_arg(&x);
}
where we'd have flow from x's post-update node, to a phi node at the loop entry and back to &x. But as Robert says, there should always be a PhiNode here.
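For readers who want to run the loop case, here is a minimal, compilable variant of the snippet above. The body of write_to_arg and the run_loop wrapper are assumptions made for illustration; they are not taken from the PR or its tests. The comments only restate the SSA shape described in this thread.

```cpp
// Hypothetical callee: writes through its pointer argument.
void write_to_arg(int* out) { *out += 1; }

// Hypothetical wrapper around the loop from the comment above.
int run_loop(int iterations) {
  int x = 0;
  // The value of x reaching the loop head comes either from the
  // initialization above or from the previous iteration's call. Per the
  // discussion, dataflow models that merge with a phi node, so the
  // post-update node of &x does not step directly to the argument &x.
  for (int i = 0; i < iterations; ++i) {
    write_to_arg(&x);  // argument: x before the call; post-update: x after it
  }
  return x;
}
```

Calling run_loop(3) invokes write_to_arg three times on the same variable, which is the only shape in which a post-update value can legitimately reach the same call's argument again.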
I'm currently going through all the alert changes. Here are my notes so far:
result = cmd_main(argc, argv);
trace2_cmd_exit(result);
return result;
}
margin_printf(outfile, length ? "/* %s */\n" : "\n", storage); where we exit I'll continue looking at the remaining changes tomorrow. |
I've now gone through most of the lost results, and all of the ones I've looked at have been cases where we re-entered a function we just exited from through a |
This PR fixes the conflation identified in #13182.
Turns out the problem was something we've actually seen before. Consider this code
We got spurious flow from source() to sink(buf). This shouldn't happen since the tainted value is *buf. However, we were getting flow in this case because the post-update node corresponding to the value of buf1 after leaving increment_and_call_sink re-entered increment_and_call_sink and got dereferenced an additional time in the *buf2 += ... operation.
This PR fixes this problem by excluding SSA flow from a PostUpdateNode to another node that's an argument to the same callable as the PostUpdateNode's pre-update node.
Commit-by-commit review recommended. The first commit slightly changes our dataflow tests so that we can distinguish indirect sinks from non-indirect sinks, and the second commit fixes the conflation issue.
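The pointer/pointee distinction described above can be illustrated with a small sketch. This is not the snippet from the PR description. The names source, sink, increment_and_call_sink, buf1 and buf2 are reused from the text, but every signature and body here, along with call_increment and sink_saw_pointer_not_pointee, is an assumption chosen so the example compiles and runs.

```cpp
// Hypothetical stand-ins for the source and sink used in dataflow tests.
static int tainted_storage[16];
static int** last_sink_argument = nullptr;

int* source() { return tainted_storage; }           // the returned int* is "tainted"
void sink(int** buf) { last_sink_argument = buf; }  // records what reached the sink

void increment_and_call_sink(int** buf2) {
  *buf2 += 1;  // reads and writes the pointee *buf2 (the tainted int*)
  sink(buf2);  // passes the pointer buf2 itself, which is not tainted
}

// Hypothetical intermediate caller, so that buf1's post-update node exists.
void call_increment(int** buf1) {
  increment_and_call_sink(buf1);
}

// Returns true when the sink received the address of buf (the pointer),
// while the tainted value was only ever stored in *(&buf) (the pointee).
bool sink_saw_pointer_not_pointee() {
  int* buf = source();
  int* const before = buf;
  call_increment(&buf);
  return last_sink_argument == &buf && buf == before + 1;
}
```

At runtime the sink only ever receives &buf, so an analysis that reports source() flowing to the direct argument of sink is conflating the pointer with the value it points to. That is the class of result the exclusion in this PR is meant to remove.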