JS: Add data-flow steps for arrays using a pseudo-property #3019
I'm fixing up the tests (also found a bug from that).
Just a few comments after a cursory read through - I'll give it a more thorough look later this week
@@ -598,18 +598,23 @@ class ArrayConstructorInvokeNode extends DataFlow::InvokeNode {
  * new Array('apple', 'orange')
  * Array(16)
  * new Array(16)
+ * Array.from(1,2,3);
That's not how Array.from works. Was this addition needed for something?
No, it was not. That was a mistake from reading the taint steps on arrays a little too quickly.
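For reference, here is a minimal sketch of how the standard `Array.from` and `Array.of` behave, which is why `Array.from(1,2,3)` does not fit next to the array-constructor examples. The variable names are only illustrative.

```javascript
// Array.from(arrayLike, mapFn?, thisArg?) copies an iterable or array-like
// object into a new array; it does not take its elements as separate arguments.
const fromArray = Array.from([1, 2, 3]);            // [1, 2, 3]
const fromString = Array.from("abc");               // ["a", "b", "c"]
const mapped = Array.from([1, 2, 3], (x) => x * 2); // [2, 4, 6]

// Array.of is the variadic counterpart: each argument becomes an element.
const viaOf = Array.of(1, 2, 3);                    // [1, 2, 3]

// Array.from(1, 2, 3) throws a TypeError, because the second argument
// must be a function (or undefined).
let fromNumbersError = null;
try {
  Array.from(1, 2, 3);
} catch (e) {
  fromNumbersError = e;
}
```

So a doc comment listing variadic construction would want `Array.of(1, 2, 3)` rather than `Array.from(1, 2, 3)`.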
@@ -52,7 +52,7 @@ private DataFlow::SourceNode argumentList(SystemCommandExecution sys, DataFlow::
  result = pred.backtrack(t2, t)
  or
  t = t2.continue() and
- TaintTracking::arrayFunctionTaintStep(result, pred, _)
+ ArrayTaintTracking::arrayFunctionTaintStep(result, pred, _)
Hm, I think the API would be nicer if we expose all our taint steps as TaintTracking::xxxStep (we could be better at doing this, but let's not regress from what we already have).
If moving the code into Arrays.qll is mainly for internal organization, it should not be reflected in the public API.
/**
 * Holds if `pred` should be stored in the object `succ` under the property `prop`.
 */
override predicates inherit the qldoc and should rarely have a qldoc comment of their own.
It seems like In an ordinary Also, I think I would need some more refactor, getting more focus on the enumerated object and the accesses on the object, rather than the property-name. lodash.forEach(arr, (e) => sink(e))
Sorry, I was thinking about the
Yes it would, but I would have preferred to support a I'll use the
And that destroyed performance in
A quick evaluation looks good.
A more detailed evaluation shows a less clear picture. Specifically the
Since #3070 landed I guess the performance should be good now? Are you running a final evaluation?
Yes, it should be ready tonight.
I think the But even if we ignore
I have a branch that makes exploratory flow more precise. I've confirmed that merging it with your PR eliminates the overhead you observed in As for the rest of the evaluation, there's obviously a biased noise in the wall clock timings, but it could be hiding a genuine slowdown. Could you pick one of the slowest ones from that report other than
Here is an evaluation of just one of the slowest ones: https://git.semmle.com/erik/dist-compare-reports/tree/profiling-js-esben.northeurope.cloudapp.azure.com_1584809774080 It looks like the wallclock was biased in the previous evaluation.
I moved both the taint steps and the new data-flow steps for arrays into an Arrays.qll file.
This gets us a TP for CVE-2018-3726.
A slightly outdated evaluation shows reasonable performance.
The new result from the evaluation is a TP.
I tried to see how our analysis worked if I removed the taint-steps and only relied on the new data-flow steps.
The result was that we missed a lot of TPs, because many taint sources are themselves arrays, whereas the data-flow steps need a write/read pair.
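A minimal JavaScript sketch of that distinction, under the assumption that `source` and `sink` are hypothetical stand-ins (they are not part of any real API). In the first case the array is written and then read, so an analysis that tracks array elements could connect the two. In the second case the array arrives already tainted and is never written to in the analyzed code.

```javascript
// Hypothetical stand-ins for a taint source and a sink; not part of any real API.
const reachedSink = [];
function source() { return "tainted"; }
function sink(value) { reachedSink.push(value); }

// Case 1: write/read pair on the same array.
// The value is written with push and read back in the forEach callback,
// so a store step followed by a load step could connect source to sink.
function writeThenRead() {
  const arr = [];
  arr.push(source());
  arr.forEach((e) => sink(e));
}

// Case 2: the array itself is the taint source (simulated untrusted input).
// The analyzed code contains no write into the array, so only a taint step
// from the array to its elements would connect it to the sink.
const externalInput = ["tainted"];
function arrayIsTheSource(taintedArray) {
  sink(taintedArray[0]);
}

writeThenRead();
arrayIsTheSource(externalInput);
```

This is only an illustration of why both kinds of steps may be worth keeping; it says nothing about how the QL predicates are implemented.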
Here are examples of the new data-flow edges that are added.