Skip to content

JS: Precise data-flow for returns from async functions#4019

Merged
codeql-ci merged 15 commits intogithub:mainfrom
erik-krogh:asyncCalls
Aug 24, 2020
Merged

JS: Precise data-flow for returns from async functions#4019
codeql-ci merged 15 commits intogithub:mainfrom
erik-krogh:asyncCalls

Conversation

@erik-krogh
Copy link
Contributor

@erik-krogh erik-krogh commented Aug 6, 2020

Previously async function were modeled imprecisely, such that there was flow from foo to bar in the below.

async function f() {
  return foo;
}
let bar = f(); // f returns a promise that contains foo.

With this PR the precision is better.

The pseudo-properties from the Promise model is used the achieve this.

A new FunctionReturnNode is introduced, and this node has to be SourceNode because we need to store properties on it (the pseudo-properties).

@erik-krogh erik-krogh added JS WIP This is a work-in-progress, do not merge yet! labels Aug 6, 2020
@erik-krogh erik-krogh requested a review from a team as a code owner August 6, 2020 13:40
@erik-krogh erik-krogh added the Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish label Aug 7, 2020
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments so far. I gave up on commit-by-commit review after I noticed my comments becoming obsolete by later commits. Let me know when you want another round of reviews.

function.getLocation().hasLocationInfo(filepath, startline, startcolumn, endline, endcolumn)
}

override BasicBlock getBasicBlock() { result = function.(ExprOrStmt).getBasicBlock() }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This basic block belongs to the enclosing function. How about function.getExit().getBasicBlock()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting the basic-block of the enclosing function is consistent with how ExceptionalFunctionReturnNode works.
If we change it, then we should change both.

I think the current behavior is fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof. I still think the current behavior is completely wrong, but I understand if you don't want to fix it in this PR. Opened https://github.com/github/codeql-javascript-team/issues/214.

*/
cached
predicate returnStep(DataFlow::Node pred, DataFlow::Node succ) {
// Note: FlowSteps::CachedSteps::returnStep/2 has copy-paste children
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was quite confusing when doing commit-by-commit review since the function doesn't appear similar to this at all until later in history 😕

I think we can eliminate the duplication by introducing a shared predicate

pragma[inline]
predicate returnStepWithAsyncFlag(DataFlow::Node pred, DataFlow::Node succ, boolean isAsync) {
    // ... bind isAsync to the async-ness of the function being returned from ...
}

and then call that from returnStep/2:

returnStepWithAsyncFlag(pred, succ, false)

and from the Promises.qll library

returnStepWithAsyncFlag(pred, succ, true)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end the copy-paste was with the exception part of the predicate.
I've shared the implementation of that part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still a slight but of almost copy-pasted code between the predicate and DataFlow::localFlowStep, but I think that is OK.

await throwAsync(source());
} catch (e) {
sink(e); // NOT OK
sink(e); // NOT OK - but not flagged
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we miss this now? Are we missing a load step from the await operand to the catch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is because of the mix of calls and property store/read.

The flow is: flow into throwAsync() -> store in pseudo-property -> return from throwAsync() -> read pseudo-property -> local flow to e.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the new code in loadStep/storeStep doesn't catch it, because they only handle ordinary returns and not exceptional returns.

@erik-krogh
Copy link
Contributor Author

erik-krogh commented Aug 7, 2020

A few comments so far. I gave up on commit-by-commit review after I noticed my comments becoming obsolete by later commits.

I thought I had opened a draft PR. Maybe the PR got un-drafted when I accepted the review comments from intrigus-lgtm, or maybe I misclicked 🤷‍♂️.

I had meant to do a rebase before undrafting the PR, sorry about that.

Thanks for the comments.

@erik-krogh erik-krogh removed the WIP This is a work-in-progress, do not merge yet! label Aug 7, 2020
@erik-krogh
Copy link
Contributor Author

A quick smoke test looks good in terms of performance.

@asgerf ready for next round.
I've rebased the entire PR (and included your review comments).
Everything after update expected output is new.

@erik-krogh
Copy link
Contributor Author

Evaluation on nightly and big-apps show good performance, except for gecko-dev that timed out on TargetBlank.ql.

I'll investigate the difference in results.

@erik-krogh
Copy link
Contributor Author

I'll investigate the difference in results.

The new results look like TPs.

The missed result is for the same reason as this.

Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, mostly LGTM now but we'll obviously need to fix gecko.

Also, do you see a way to recover the lost result? That was a really good result IMO and I'd be sad to lose it.

function.getLocation().hasLocationInfo(filepath, startline, startcolumn, endline, endcolumn)
}

override BasicBlock getBasicBlock() { result = function.(ExprOrStmt).getBasicBlock() }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof. I still think the current behavior is completely wrong, but I understand if you don't want to fix it in this PR. Opened https://github.com/github/codeql-javascript-team/issues/214.

@max-schaefer
Copy link
Contributor

except for gecko-dev that timed out on TargetBlank.ql.

FWIW, gecko-dev has been hovering at the edge of timing out for me for a while now. In particular I've seen it time out on master on a few occasions (cf #3980).

@erik-krogh
Copy link
Contributor Author

Also, do you see a way to recover the lost result? That was a really good result IMO and I'd be sad to lose it.

I managed to recover both the missed result from the evaluation and our test-case by adding a summary step for exceptions in immediately awaited async function calls.

@erik-krogh
Copy link
Contributor Author

I managed to recover both the missed result from the evaluation and our test-case by adding a summary step for exceptions in immediately awaited async function calls.

An evaluation still looks good.

@erik-krogh erik-krogh removed the Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish label Aug 11, 2020
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final nit, then LGTM

Co-authored-by: Asger F <asgerf@github.com>
@erik-krogh
Copy link
Contributor Author

I did a final round of evaluations: nightly and big-apps.

gecko-dev took equally long to time out both with and without my changes.

@asgerf can I get a final approve?

@adityasharad adityasharad changed the base branch from master to main August 14, 2020 18:33
Copy link
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this fell off my radar for a bit. LGTM 👍

@codeql-ci codeql-ci merged commit 765c40e into github:main Aug 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants