Data flow: Introduce `ParameterPosition` and `ArgumentPosition` #7260
    private predicate viableParamNonLambda(DataFlowCall call, ArgumentPosition apos, ParamNode p) {
      exists(ParameterPosition ppos |
        p.isParameterOf(viableCallable(call), ppos) and
        parameterMatch(ppos, apos)
      )
    }
I'm not sure this is the right join-order when `parameterMatch` isn't the identity.
I'd expect the following to do better:

    private predicate argumentPosition(DataFlowCall call, ArgNode arg, ParameterPosition ppos) {
      exists(ArgumentPosition apos |
        arg.argumentOf(call, apos) and
        parameterMatch(ppos, apos)
      )
    }

    private predicate viableParamNonLambda(DataFlowCall call, ParameterPosition ppos, ParamNode p) {
      p.isParameterOf(viableCallable(call), ppos)
    }

    private predicate viableParamArgNonLambda(DataFlowCall call, ParamNode p, ArgNode arg) {
      exists(ParameterPosition ppos |
        viableParamNonLambda(call, ppos, p) and
        argumentPosition(call, arg, ppos)
      )
    }
I agree that this is safer; I have pushed updates.
      override predicate argumentOf(DataFlowCall call, int pos) {
    -   FlowSummaryImpl::Private::summaryArgumentNode(call, this, pos)
    +   exists(ArgumentPosition apos |
    +     FlowSummaryImpl::Private::summaryArgumentNode(call, this, apos) and
    +     apos.getPosition() = pos
    +   )
      }
Should this still be using an `int` for `pos`?
No, this is purely an internal C# implementation predicate.
This feels good, I am looking forward to seeing how much of ArgumentPassing it will be able to replace :-)
Yeah, I was expecting that we should be able to improve that code. One of the key points is that we will no longer need the call graph ( |
A few more inline comments, otherwise LGTM.
Corresponds to github#7260, though some of those changes had already been made.
This PR generalizes argument-to-parameter matching in the shared data-flow library from simple integer position based matching to per-language defined matching.
Why
For static languages, integer-based matching has worked just fine, since each call knows the signature of the callee(s), and adjustments of argument positions can be made accordingly at the call sites. For example, if we have a C# call which targets a `Foo(int fst, int snd)` method, but swaps the order using named arguments, then we know that `x` must have position `1` and `y` must have position `0`. Similarly, for calls to methods with variadic arguments where the signature is `Bar(params int[] args)`, we can treat the call as syntactic sugar for a call that passes a single array; that is, `1..4` are no longer `ArgumentNode`s, but instead there is a synthesized array argument node with position `0`.

However, for dynamic languages we do not know the signature of the callee(s), so we may have a Ruby call that targets both a `foo(a, b, c, d)` function and a `foo(first, *mid, last)` function. Whether `2` and `3` should be wrapped in an array then depends on the callee, and what gets passed into `last` depends on the number of arguments at the call site.

How
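To make the motivating case concrete, here is a minimal runnable sketch. The Ruby call itself is not shown in the description, so the sketch assumes it is `foo(1, 2, 3, 4)`, which is what the arguments `1` to `4` in the position table further down suggest. The class names `FixedArity` and `WithSplat` and the helper `call_site` are hypothetical and exist only for illustration.

```ruby
# Hypothetical receivers: both define `foo`, using the two signatures from the text.
class FixedArity
  def foo(a, b, c, d)
    { a: a, b: b, c: c, d: d }
  end
end

class WithSplat
  def foo(first, *mid, last)
    { first: first, mid: mid, last: last }
  end
end

# A single call site (assumed shape); which `foo` runs is only known at run time.
def call_site(receiver)
  receiver.foo(1, 2, 3, 4)
end

p call_site(FixedArity.new)  # a = 1, b = 2, c = 3, d = 4
p call_site(WithSplat.new)   # first = 1, mid = [2, 3], last = 4

# With two arguments instead of four, `last` receives the second argument.
p WithSplat.new.foo(1, 2)    # first = 1, mid = [], last = 2
```

The same argument list is bound differently by each callee, so no single integer position assigned at the call site can be right for both targets.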
This PR introduces two new classes, `ParameterPosition` and `ArgumentPosition`, and a predicate `parameterMatch(ParameterPosition ppos, ArgumentPosition apos)` to the interface of the shared data-flow library. Each `ParameterNode` has a `ParameterPosition`, each `ArgumentNode` has an `ArgumentPosition`, and an argument is compatible with a parameter as dictated by `parameterMatch`. (Note that `parameterMatch` does not rely on the call graph; we already have `viableCallable` for that.)

Defining `ParameterPosition = ArgumentPosition = int` and `parameterMatch(ParameterPosition pp, ArgumentPosition ap) { pp = ap }` yields the existing behaviour, and that is how all languages are currently implemented (except for C#, where I decided to use different IPA wrappers to catch type errors).

Follow-up work
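As a baseline for what follows, the current behaviour (integer positions with identity matching, using distinct wrapper types in the spirit of the C# variant) can be modelled in a few lines of Ruby. This is only an illustrative sketch: the real types are QL classes, and the `Struct` definitions, `parameter_match`, and `matching_pairs` below are hypothetical stand-ins. The sample data reuses the `Foo(int fst, int snd)` case, where `x` has position `1` and `y` has position `0`.

```ruby
# Hypothetical Ruby stand-ins for the two position types (the real ones are QL classes).
ParameterPosition = Struct.new(:index)
ArgumentPosition  = Struct.new(:index)

# Baseline rule: an argument matches a parameter exactly when the integers agree.
# The type checks mimic what separate wrapper types buy: mixing the two up fails loudly.
def parameter_match(ppos, apos)
  raise TypeError, "expected a ParameterPosition" unless ppos.is_a?(ParameterPosition)
  raise TypeError, "expected an ArgumentPosition" unless apos.is_a?(ArgumentPosition)
  ppos.index == apos.index
end

# For every argument of a call, list the parameters of one callee that it matches.
def matching_pairs(params, args)
  args.flat_map do |arg_name, apos|
    params.select { |_, ppos| parameter_match(ppos, apos) }
          .map { |param_name, _| [arg_name, param_name] }
  end
end

params = { "fst" => ParameterPosition.new(0), "snd" => ParameterPosition.new(1) }
args   = { "y" => ArgumentPosition.new(0), "x" => ArgumentPosition.new(1) }
p matching_pairs(params, args)  # [["y", "fst"], ["x", "snd"]]
```

Note that the matching function never consults the callee; choosing which parameters to consider is left to the caller, which mirrors the split between `parameterMatch` and `viableCallable`.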
For the named argument example, assume we have two Ruby calls both targeting `foo(fst: , snd: )`. We can assign to `fst`, `a`, and `d` the position `Named(fst)`, assign to `snd`, `b`, and `c` the position `Named(snd)`, and have `parameterMatch` be the identity.

For the variadic arguments example, consider the Ruby call
that targets `foo(a, b, c, d)` and `foo(first, *mid, last)`. Let `PositionalArg(x, y)` denote positional argument `x` of `y`, `PositionalParam(x)` denote positional parameter `x`, `SplatParam(x, y)` denote a splat parameter at position `x` skipping the last `y` arguments, `LastParam(x)` denote the `x`th last parameter, and `mid_synth` be a synthesized parameter node. We can then assign the following positions:

| Node | Position |
| --- | --- |
| `1` | `PositionalArg(0, 4)` |
| `2` | `PositionalArg(1, 4)` |
| `3` | `PositionalArg(2, 4)` |
| `4` | `PositionalArg(3, 4)` |
| `a` | `PositionalParam(0)` |
| `b` | `PositionalParam(1)` |
| `c` | `PositionalParam(2)` |
| `d` | `PositionalParam(3)` |
| `first` | `PositionalParam(0)` |
| `mid_synth` | `SplatParam(1, 1)` |
| `last` | `LastParam(0)` |
and define `parameterMatch` over these positions. `mid` will then no longer be a `ParameterNode`; instead we will have an array store-step from `mid_synth` to `mid`. We are effectively moving the implicit array creation from the caller to the callee, where we know exactly which arguments to include.
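The matching rule itself is not shown above, so the following Ruby sketch gives one possible definition that is consistent with the position table. It is an assumption for illustration, not the definition from the PR: the `Struct` types, `parameter_match`, and `bindings` are hypothetical, and the real predicate would be written in QL.

```ruby
# Hypothetical Ruby stand-ins for the position kinds named in the text.
PositionalArg   = Struct.new(:index, :total)      # positional argument `index` of `total`
PositionalParam = Struct.new(:index)              # positional parameter `index`
SplatParam      = Struct.new(:index, :skip_last)  # splat at `index`, skipping the last `skip_last` arguments
LastParam       = Struct.new(:from_end)           # the `from_end`th last parameter

# One possible matching rule (assumed, not taken from the PR).
def parameter_match(ppos, apos)
  case ppos
  when PositionalParam
    ppos.index == apos.index
  when SplatParam
    # Everything from the splat's position up to, but excluding, the skipped tail.
    apos.index >= ppos.index && apos.index < apos.total - ppos.skip_last
  when LastParam
    apos.index == apos.total - 1 - ppos.from_end
  else
    false
  end
end

# For one callee, collect the argument values that match each parameter position.
def bindings(params, arg_values)
  total = arg_values.length
  params.to_h do |name, ppos|
    matched = arg_values.each_with_index.select { |_, i| parameter_match(ppos, PositionalArg.new(i, total)) }
    [name, matched.map(&:first)]
  end
end

fixed = { a: PositionalParam.new(0), b: PositionalParam.new(1),
          c: PositionalParam.new(2), d: PositionalParam.new(3) }
splat = { first: PositionalParam.new(0), mid_synth: SplatParam.new(1, 1), last: LastParam.new(0) }

p bindings(fixed, [1, 2, 3, 4])  # a <- 1, b <- 2, c <- 3, d <- 4
p bindings(splat, [1, 2, 3, 4])  # first <- 1, mid_synth <- 2 and 3, last <- 4
p bindings(splat, [1, 2])        # first <- 1, mid_synth <- nothing, last <- 2
```

Because each argument position carries the total argument count, the rule can be decided per parameter without consulting the call graph. In this sketch the values collected at `mid_synth` are exactly the ones the store-step would place into the `mid` array.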