Skip to content

JS: Add precise data-flow steps for Promise.all #3478

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
May 20, 2020

Conversation

erik-krogh
Copy link
Contributor

We can now precisely analyze something like the below in a DataFlow::Configuration.

var [clean, tainted] = await Promise.all(["clean", source]);
sink(clean); // OK - but flagged by taint-tracking
sink(tainted); // NOT OK

A TaintTracking::Configuration will still mix up the two properties.
This is caused both by the heap/directory taint-steps in TaintTracking.qll, and by the taint-steps in Promise.qll.
(You have to remove both to get rid of the imprecision).

@erik-krogh erik-krogh added JS Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish labels May 14, 2020
@erik-krogh erik-krogh requested a review from a team as a code owner May 14, 2020 17:05
Copy link
Contributor

@esbena esbena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a bit of polish required, and a change note for "promises" and bluebird.

The taint step precision can be addressed later - lets continue that discussion in https://github.com/github/codeql-javascript-team/issues/117.

exists(int i |
element = this.getElement(i) and
obj = this and
prop = i.toString()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have the following instead of the ad-hoc conversion with .toString, and also in the name of hypothetical performance.

string arrayElement(int i) { i < 5 and result = i.toString() or result = arrayElement() }

esbena
esbena previously approved these changes May 15, 2020
Copy link
Contributor

@esbena esbena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. See inline comment.

exists(int i |
element = this.getElement(i) and
obj = this and
prop = i.toString()
prop = arrayElement(i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can eliminate getAnElement() because we happen to always know the element order in an ArrayCreationNode?
I think that is forwards compatible in practice, but please indicate with a 👍 that this change was intentional.

@asgerf
Copy link
Contributor

asgerf commented May 15, 2020

Hm, I'm a little sceptical of doing something that only works for pure data-flow, not taint-tracking.

I was thinking we could locally recognize tuple steps through Promise.all, add those as taint steps, and use them to filter out the usual heap steps. With pseudo-code, something like this:

predicate promiseAllStep(Node pred, Node succ) { ... }

predicate heapStep(Node pred, Node succ) {
  // ...
   exists(ArrayExpr array | array = succ.asExpr() |
     not promiseAllStep(pred, _)
     pred = array.getAnElement().flow()
    )
)

@erik-krogh
Copy link
Contributor Author

I was thinking we could locally recognize tuple steps through Promise.all, add those as taint steps, and use them to filter out the usual heap steps. With pseudo-code, something like this:

I like the idea, I'll look at that afterwards.
For now I think this is good to merge after an evaluation.


var [clean2, tainted2] = await Promise.resolve(Promise.all(["clean", source]));
sink(clean2); // OK - but flagged by taint-tracking
sink(tainted2); // NOT OK
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test case where the inputs to Promise.all are promises?

I can't quite picture how this would work. It seems to me we end up with the sequence of steps:

  • Store step into promise with promise-value pseudo-property
  • Store step into array object with array-index property
  • Load step from the array object to Promise.all with promise-value pseudo-property
  • Load step from Promise.all to array access with array-index property

The above sequence is not reducible to plan data flow because the innermost store/load pair have different property names.

The way I see it, Promise.all converts an "array of promises" to a "promise of an array", which in general requires us to swap the order of two properties. This requires two loads steps followed by two store steps (which we unfortunately can't do).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I see it, Promise.all converts an "array of promises" to a "promise of an array", which in general requires us to swap the order of two properties. This requires two loads steps followed by two store steps (which we unfortunately can't do).

Your analysis is correct, and we end up not tracking data-flow when the inputs to Promise.all are promises.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's really unfortunate. This PR is a bit of a hard sell with that restriction. Do you mind withholding this PR for a bit while trying out the local heuristic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's really unfortunate. This PR is a bit of a hard sell with that restriction. Do you mind withholding this PR for a bit while trying out the local heuristic?

I think I got it working nicely.
The last test-case is still not flagged by a DataFlow::Configuration, but the TaintTracking::Configuration no longer has any FPs in the test.

@erik-krogh
Copy link
Contributor Author

erik-krogh commented May 17, 2020

An evaluation came back with mixed results (redo of the 10 worst).

I tried to restrict the precise array-elements to Promise.all.
Evaluation on that looks good. (After I initially got this weird result).

@erik-krogh erik-krogh removed the Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish label May 17, 2020
asgerf
asgerf previously approved these changes May 18, 2020
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM now

asgerf
asgerf previously approved these changes May 18, 2020
@erik-krogh
Copy link
Contributor Author

@asgerf can I get a re-approve?

@erik-krogh
Copy link
Contributor Author

@esbena can I get a re-approve/merge?
QLucie won't let me merge without your approval.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants