-
Notifications
You must be signed in to change notification settings - Fork 1.8k
C++: Use fully converted instructions as the target of modelled functions #14432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C++: Use fully converted instructions as the target of modelled functions #14432
Conversation
@@ -82,7 +83,7 @@ Node callOutput(CallInstruction call, FunctionOutput output, int d) { | |||
// If there isn't an indirect out node for the call with indirection `d` then | |||
// we conflate this with the underlying `CallInstruction`. | |||
not exists(getIndirectReturnOutNode(call, d)) and | |||
n.asInstruction() = result.asInstruction() | |||
n = result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This is a drive-by change I noticed. There's no reason to restrict this to be Instruction
s. I don't think this really impacts anything, though.
@@ -555,7 +555,7 @@ predicate instructionForFullyConvertedCall(Instruction instr, CallInstruction ca | |||
} | |||
|
|||
/** Holds if `node` represents the output node for `call`. */ | |||
private predicate simpleOutNode(Node node, CallInstruction call) { | |||
predicate simpleOutNode(Node node, CallInstruction call) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the correct way to go about this, or should the predicate actually be moved to DataFlowUtil
. Note that I'm not really sure about the difference between the two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DataFlowUtil
is supposed to be "things that you're actually allowed to use in queries" and DataFlowPrivate
is supposed to be "private things that's users are not supposed to know about".
You can see the distinction between the two files in the way they're imported by the dataflow library here: https://github.com/github/codeql/blob/main/cpp/ql/lib/semmle/code/cpp/dataflow/internal/DataFlowImplSpecific.qll#L7. As you can see, DataFlowPrivate
is imported in a module Private
, and DataFlowUtil
is imported in a module Public
. Somewhere (I don't actually know where) that Public
module is publicly imported so that it's available when you import the dataflow library in a query.
So I think this is actually the correct thing to do. Since everything in DataFlowPrivate
is never made public at the query level, removing the private
keyword here doesn't really change what's visible outside the dataflow library. It just means that other dataflow implementations files can access it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should say, it took me a couple of years to realize what I wrote about about the distinction between DataFlowUtil
and DataFlowPrivate
so I thoroughly misplaced many classes and predicates until then 😂.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. Going by that this seems ok.
There's a single change in the internal tests. I've opened an internal PR with those changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We were translating all calls to models of functions to the
CallInstruction
instead of the fully converted function call. This isn't inherently wrong (since there's dataflow from theCallInstruction
to its conversions), but it could give some strange start- and end-points for taint-tracking queries.cc @jketema