-
Notifications
You must be signed in to change notification settings - Fork 1.8k
JS: Add precise data-flow steps for Promise.all #3478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a bit of polish required, and a change note for "promises" and bluebird
.
The taint step precision can be addressed later - lets continue that discussion in https://github.com/github/codeql-javascript-team/issues/117.
exists(int i | | ||
element = this.getElement(i) and | ||
obj = this and | ||
prop = i.toString() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have the following instead of the ad-hoc conversion with .toString
, and also in the name of hypothetical performance.
string arrayElement(int i) { i < 5 and result = i.toString() or result = arrayElement() }
Co-authored-by: Esben Sparre Andreasen <esbena@github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved. See inline comment.
exists(int i | | ||
element = this.getElement(i) and | ||
obj = this and | ||
prop = i.toString() | ||
prop = arrayElement(i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can eliminate getAnElement()
because we happen to always know the element order in an ArrayCreationNode
?
I think that is forwards compatible in practice, but please indicate with a 👍 that this change was intentional.
Hm, I'm a little sceptical of doing something that only works for pure data-flow, not taint-tracking. I was thinking we could locally recognize tuple steps through predicate promiseAllStep(Node pred, Node succ) { ... }
predicate heapStep(Node pred, Node succ) {
// ...
exists(ArrayExpr array | array = succ.asExpr() |
not promiseAllStep(pred, _)
pred = array.getAnElement().flow()
)
) |
I like the idea, I'll look at that afterwards. |
|
||
var [clean2, tainted2] = await Promise.resolve(Promise.all(["clean", source])); | ||
sink(clean2); // OK - but flagged by taint-tracking | ||
sink(tainted2); // NOT OK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test case where the inputs to Promise.all
are promises?
I can't quite picture how this would work. It seems to me we end up with the sequence of steps:
- Store step into promise with
promise-value
pseudo-property - Store step into array object with array-index property
- Load step from the array object to
Promise.all
withpromise-value
pseudo-property - Load step from
Promise.all
to array access with array-index property
The above sequence is not reducible to plan data flow because the innermost store/load pair have different property names.
The way I see it, Promise.all
converts an "array of promises" to a "promise of an array", which in general requires us to swap the order of two properties. This requires two loads steps followed by two store steps (which we unfortunately can't do).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I see it, Promise.all converts an "array of promises" to a "promise of an array", which in general requires us to swap the order of two properties. This requires two loads steps followed by two store steps (which we unfortunately can't do).
Your analysis is correct, and we end up not tracking data-flow when the inputs to Promise.all
are promises.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's really unfortunate. This PR is a bit of a hard sell with that restriction. Do you mind withholding this PR for a bit while trying out the local heuristic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's really unfortunate. This PR is a bit of a hard sell with that restriction. Do you mind withholding this PR for a bit while trying out the local heuristic?
I think I got it working nicely.
The last test-case is still not flagged by a DataFlow::Configuration
, but the TaintTracking::Configuration
no longer has any FPs in the test.
An evaluation came back with mixed results (redo of the 10 worst). I tried to restrict the precise array-elements to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! LGTM now
@asgerf can I get a re-approve? |
@esbena can I get a re-approve/merge? |
We can now precisely analyze something like the below in a
DataFlow::Configuration
.A
TaintTracking::Configuration
will still mix up the two properties.This is caused both by the heap/directory taint-steps in
TaintTracking.qll
, and by the taint-steps inPromise.qll
.(You have to remove both to get rid of the imprecision).