xiemaisi left a comment
A few initial thoughts, I'll take another look tomorrow.
    FunctionNode getCallback(int i) { result.flowsTo(getArgument(i)) }

    /**
     * Gets a function passed as the `i`th argument of this invocation,
There is no parameter i, and the meaning of param isn't documented. I also think this predicate is complex enough to merit a brief example in the doc comment.
    }

    /**
     * A data flow node that performs a partial function application.
    -  * A data flow node that performs a partial function application.
    +  * A data-flow node that performs a partial function application.
Are you sure? None of the other classes use that convention, but I can update all of them if you like.
Yeah, we're not being particularly consistent about this. I think the version with the hyphen is correct, but feel free to ignore.
    module PartialInvokeNode {
      /**
       * A data flow node that performs a partial function application.
    -  * A data flow node that performs a partial function application.
    +  * A data-flow node that performs a partial function application.
There are 238 occurrences of data flow node. I'll do it in a separate PR to avoid conflicts.
      result = getABoundFunctionReferenceAux(function, boundArgs, t)
    }

    pragma[noopt]
Oh, and it's in a recursive predicate, too! Are you absolutely positively sure this is needed?
As a concrete suggestion, a quick experiment suggests that you can weaken the pragma[noopt] to a pragma[noinline] like this:
      (...)
      or
      exists(DataFlow::StepSummary summary, DataFlow::TypeTracker t2 |
        result = getABoundFunctionReferenceAux(function, boundArgs, t2, summary) and
        t = t2.append(summary)
      )
    }

    pragma[noinline]
    private DataFlow::SourceNode getABoundFunctionReferenceAux(
      DataFlow::FunctionNode function, int boundArgs, DataFlow::TypeTracker t,
      DataFlow::StepSummary summary
    ) {
      exists(DataFlow::SourceNode prev |
        prev = getABoundFunctionReference(function, boundArgs, t) and
        DataFlow::StepSummary::step(prev, result, summary)
      )
    }
Yes I remember it solving a performance problem, but I can run another evaluation with noinline if you like.
That would be great. I'm not opposed to pragmas on principle, but pragma[noopt] is perhaps an overly blunt instrument. In particular, I believe it prevents us from having different orders for different delta sizes, which is generally a good thing. (And yes, it absolutely does solve a performance problem; removing the pragma entirely makes PostMessageStar grind to a halt on brackets, for instance.)
Results on nightly look fine. I'm running another eval on default to be sure, but for now I'll merge in the change.
The worker ran out of disk space during the rerun :-\
xiemaisi left a comment
A few more thoughts. On the whole I'm fine with this PR, though I am a little concerned about the pragma[noopt]. Making InconsistentNew and IllegalInvocation more conservative is fine by me; the former never gave interesting results, and the latter is slowly outliving its usefulness now that classes are natively supported everywhere.
    /**
     * A call to `goog.bind`, as a partial function invocation.
     */
    private class BindCall extends DataFlow::PartialInvokeNode::Range, DataFlow::CallNode {
Perhaps add a test for this and angular.bind below?
      addEventListener = DataFlow::globalVarRef("addEventListener").getACall() and
      addEventListener.getArgument(0).mayHaveStringValue("message") and
    - addEventListener.getArgument(1).analyze().getAValue().(AbstractFunction).getFunction() = this
    + addEventListener.getArgument(1).getABoundFunctionValue(paramIndex).getFunction() = this
Could you add a test demonstrating the new results this allows us to find?
      /** Gets a function value that may reach this node. */
      final FunctionNode getAFunctionValue() {
    -   result = getAFunctionValue(_)
    +   CallGraph::getAFunctionReference(result, 0).flowsTo(this)
Why do we restrict imprecision to zero now? Wasn't it unrestricted before?
This diff is relative to an earlier commit. On master, the imprecision bit wasn't easily available and I don't think it was a deliberate choice to include imprecise values here in the first place.
It seems to me that dealing with the imprecise callees is mainly for queries that rely on seeing all the potential callees, which has become a bit of a niche thing given that data flow queries usually don't do that.
But we should make sure it behaves the same way as InvokeNode.getACallee(), which defaults to any imprecision level. I'd prefer to make the default imprecision level 0 for both 0-argument predicates, so you have to "opt in" to see the imprecise callees. WDYT?
I'm not entirely sure. If we restrict ourselves to zero imprecision, what do we lose? Is it just calls to externs (modelling built-ins) and global functions defined elsewhere?
Pretty much yes.
Pushed a commit that makes imprecision zero the default for getAFunctionValue and for getACallee
04ee483
Reorganizes the call graph computation, motivated by the need for library models to reason about callbacks that escape into the library. For example, in the following we could now detect that `event` refers to an untrusted cross-window event:

This consists of a number of changes that seemed independent, but I had to roll them up into a single change to avoid intermediate regressions or redundant work:
- `CallGraphs.qll`
- `InvokeNode.getACallee()` and `Node.getAFunctionValue()` now include the call edges discovered through type tracking, previously only available internally through `FlowSteps::calls`.
- `AdditionalPartialInvokeNode` has been renamed to `PartialInvokeNode` with a `::Range` companion.
- `PartialInvokeNode`. (To keep the scope of this PR in check, we don't currently type-track function objects in general, just the result of partial invokes.)

Previously, the call graph exposed through `getACallee(0)` had the nice property that if any callees were found, we almost certainly had all of them. This doesn't quite hold anymore. Since it now includes results from type tracking, there will be cases where a proper subset of the callees could be found (where previously nothing was found). This has a few ramifications:
- `analyze().getAValue()` directly, but the results we lose didn't seem valuable.

Also some changes that could in principle be landed separately, but helped me verify that things actually work:
- `goog.bind` and `angular.bind` partial invokes.
- `PostMessageEventHandler` uses the new API (as per the above example)

Evaluations:
- `GlobalAcessPaths::isAssignedInUniqueFile`).
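
As a rough illustration of the kind of code this change targets, the sketch below binds one argument of a handler before registering it for `message` events, so the event ends up as the second parameter rather than the first. The names `handleMessage` and `fakeWindow` are hypothetical, and `fakeWindow` is only a stand-in so that the snippet runs outside a browser; it is not taken from the PR or its tests.

```javascript
// Hypothetical handler: `tag` is bound ahead of time, `event` is supplied by the caller.
function handleMessage(tag, event) {
  return tag + ":" + event.data;
}

// Minimal stand-in for a window object (an assumption, so the sketch runs in Node).
const fakeWindow = {
  listeners: [],
  addEventListener(type, callback) {
    if (type === "message") this.listeners.push(callback);
  },
  dispatch(event) {
    // Call every registered listener with the event as its only argument.
    return this.listeners.map((callback) => callback(event));
  },
};

// Partial application: one argument is bound, so the event arrives at parameter index 1.
fakeWindow.addEventListener("message", handleMessage.bind(null, "widget-1"));

const results = fakeWindow.dispatch({ data: "hello" });
```

An analysis that only inspects the first parameter of the registered function would miss `event` here, which is why the bound-argument count matters when mapping the callback's arguments to the handler's parameters.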