-
Notifications
You must be signed in to change notification settings - Fork 125
Teach extractor to distinguish calls with an ellipsis from calls without #281
Conversation
In a call of the form `f(xs...)`, when we say that `f` taints its 0th argument its ambiguous whether that means that it taints the slice `xs` or its 0th element `xs[0]`. In practice, it's usually the latter, but we have no way of expressing that using our current `FunctionOutput` implementation.
d28e8b4
to
c2a26f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this also needs a test or two verifying the dataflow behaviour is as expected?
exists(Expr callee | callee = getChildExpr(0) | not isTypeExprBottomUp(callee)) | ||
or | ||
// only calls can have an ellipsis after their last argument | ||
has_ellipsis(this) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you give an example where the first disjunct does not hold but the second does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could happen with databases for which we have very incomplete name-binding information. In any case it seemed like a cheap heuristic to add.
@@ -259,6 +264,8 @@ private class OutParameter extends FunctionOutput, TOutParameter { | |||
override DataFlow::Node getExitNode(DataFlow::CallNode c) { | |||
exists(DataFlow::Node arg | | |||
arg = getArgument(c, index) and | |||
not (c.hasEllipsis() and index = c.getNumArgument() - 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check: Does this work for the case where the varargs slice is implied (e.g. signature is f(ss ...string)
and the call is f("one", "two", "three")
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, good point, in that case we wouldn't consider "three"
as an exit node. Let me add a condition to restore that...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, hang on, no, that would actually work fine since c.hasEllipsis()
wouldn't hold in that case (note that c
is the call, not the callee).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I get it. In that case suggest adding a comment clarifying that this is more nuanced than just asking whether the callee is varargs
Fair enough, I'll add an end-to-end test. |
@smowton, done. |
This allows queries to reason about the difference between
f(xs)
andf(xs...)
.Semantically they are really quite similar, except that in the former case, the parameter into which
xs
is passed must be a slice, while in the latter case it must be a varargs parameter (which, however, behaves much like a slice).One case where it can yield unexpected results is when dealing with taint propagation through parameters: if we say that
f
taints all of its arguments, then for a callf(x, y)
we would clearly mean thatx
andy
are tainted; however, for a callf(xs...)
we probably would not mean that the slicexs
itself is tainted, but rather that its elements are. This isn't something we can conveniently model, so the second commit simply suppressesFunctionOutput
s that refer to varargs. This fixes some false positives that were surfaced (but not caused) by #274.