Skip to content

JS: add Deferred model in js/use-of-returnless-function#2102

Merged
asger-semmle merged 11 commits intogithub:masterfrom
erik-krogh:deferredModel
Nov 6, 2019
Merged

JS: add Deferred model in js/use-of-returnless-function#2102
asger-semmle merged 11 commits intogithub:masterfrom
erik-krogh:deferredModel

Conversation

@erik-krogh
Copy link
Contributor

Motivated by the last FP's in js/use-of-returnless-function

Turns out that Deferred is a tricky size, as it is not a single library, but rather a silent agreement between library developers to implement the same API.
(You can find more implementations by looking through these results: https://lgtm.com/query/4525530373913797062/)

Therefore this new model for Deferred looks for values named "Deferred" that are used like a promise library.

The Deferred library is used somewhat often: https://lgtm.com/query/8585906951513722374/.
Although most uses are from imports of Angular.

I evaluated performance using the js/use-of-returnless-function query: https://git.semmle.com/erik/dist-compare-reports/tree/florian.ti.semmle.com_1570605281526
Looks like there is no performance degradation, and the FP's are no longer reported.

@erik-krogh erik-krogh added the JS label Oct 9, 2019
@erik-krogh erik-krogh requested a review from a team as a code owner October 9, 2019 09:24
Copy link
Contributor

@asger-semmle asger-semmle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some tests in library-tests/TaintTracking to show that taint can propagate through these promises?

For context: If a promise resolves to a tainted value, we just consider the whole promise to be tainted. The abstract classes you're extending should give rise to some new taint steps.

## Changes to QL libraries

* `Expr.getDocumentation()` now handles chain assignments.
* Added `Deferred` as a promise library in Promises.qll
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improvements to library models aren't usually listed in this section. This is mainly for API changes that are relevant for QL writers. There is no need to use the Deferred module directly, and certainly no reason to import Promises.qll directly.

You could add a bullet in the general improvements section, saying that promises derived from a Deferred object are now recognized.

ghost
ghost previously requested changes Oct 9, 2019
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this requires a bit more work to reduce our reliance on heuristics. @asger-semmle WDYT?

module Deferred {
private DataFlow::SourceNode deferred() {
exists(VarAccess var, DataFlow::NewNode instantiation |
var.getName() = "Deferred" and
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be a more semantic definition. We have heuristics in .ql files, and very, very, rarely in .qll files. This separation principle is in place to prevent a slow, catastrophic erosion of precision caused by interplay between multiple heuristics in the library files. I am glad to see the sanity check, but I do not think it is enough.

Some thoughts:

Can you perhaps use globalVarRef("Deferred") instead? We use globalVarRef in a few places if it is common that the object of interest is introduced globally instead of through an imported library.

I note that something like class Deferred is used in a few places in your query console hits. I am inclined to accept small class definitions (perhaps also pseudo class definitions with object literals and/or prototype hackery) with both resolve and reject fields as sources for deferred objects, but it depends on the results. These clases should also have at least one instance with your exists(instantiation.getAMemberCall("resolve")) heuristic, and perhaps also at least one instance with exists(instantiation.getAMemberCall("reject")). It may be necessary to use type tracking to identify such classes properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you perhaps use globalVarRef("Deferred") instead?

Not really. It happens often enough that Deferred is not a global variable (including in the FP that motivated this PR).
I might be able to use globalVarRef("Deferred") together with any(ParameterNode p | p.getName() = "Deferred") to capture most uses.
I'll look into creating something more precise.

I note that something like class Deferred is used in a few places in your query console hits.

But I don't think its used often enough to rely solely on that pattern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I identified the patterns used within the libraries and changed the model accordingly: 4ec825b

I don't think its much better than before, but it more precisely describes the patterns observed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I note that something like class Deferred is used in a few places in your query console hits. I am inclined to accept small class definitions (perhaps also pseudo class definitions with object literals and/or prototype hackery) with both resolve and reject fields as sources for deferred objects, but it depends on the results. These clases should also have at least one instance with your exists(instantiation.getAMemberCall("resolve")) heuristic, and perhaps also at least one instance with exists(instantiation.getAMemberCall("reject")). It may be necessary to use type tracking to identify such classes properly.

Something like this: 31009d9?

I still don't check if the "class"-definition itself has resolve/reject methods. Mostly because I can't do that check when Deferred is a parameter.

@asger-semmle
Copy link
Contributor

Hm, actually the Deferred object is not itself a promise - the underlying promise is usually exposed through a property named promise.

Tainting the entire Deferred object is possibly acceptable in the short term though, but will need an evaluation. Looking at the examples in the query console run, there seems to be a case for using type tracking to track aliases of the Deferred object, but I'll leave it up to you if you want to pursue that or just focus on eliminating FPs from the other query.

@asger-semmle
Copy link
Contributor

After reading @esben-semmle's comment I'm wondering if it's better to install a name-based FP filter in the js/use-of-returnless-function query and then later do a dedicated piece of work to support for the Deferred pattern with a focus on taint tracking.

@erik-krogh erik-krogh added the WIP This is a work-in-progress, do not merge yet! label Oct 22, 2019
@erik-krogh erik-krogh removed the WIP This is a work-in-progress, do not merge yet! label Oct 24, 2019
@erik-krogh
Copy link
Contributor Author

So type tracking has been added.
The current results on our "standard" benchmark of 236 projects is here: https://lgtm.com/query/8048324438418409201/

@esbena I ended up going more in your direction, which ended up simplifying the implementation.
I felt that requiring that reject was called was too restrictive, as I could find multiple Deferred implementations where a call to reject was nowhere to be found.
The heuristic therefore only requires a call to resolve on from something that was constructed from a callsite with calleeName() = "Deferred".

I did try something more complex, that looks for specific variants of how a Deferred class can appear. But that only removed TP's compared to the heuristic I've pushed here (e.g. it didn't catch dojo.Deferred and a Deferred implementation in jQuery mobile).

@esbena
Copy link
Contributor

esbena commented Oct 25, 2019

The implementation looks good now (except for some missing QL doc on DeferredInstance, and a mention about it being heuristic). But we still have a few higher level nits:

Hm, actually the Deferred object is not itself a promise - the underlying promise is usually exposed through a property named promise.

Tainting the entire Deferred object is possibly acceptable in the short term though, but will need an evaluation.

We have heuristics in .ql files, and very, very, rarely in .qll files. This separation principle is in place to prevent a slow, catastrophic erosion of precision caused by interplay between multiple heuristics in the library files.

Ultimately, I think we should split this PR into two:

  1. add the entire new Deferred module in the ql file it would be used to eliminate some FPs for
  2. move the Deferred module to a qll file once we have convinced ourselves that the heuristics and promise-interplay are good enough

@erik-krogh
Copy link
Contributor Author

Hm, actually the Deferred object is not itself a promise - the underlying promise is usually exposed through a property named promise. Tainting the entire Deferred object is possibly acceptable in the short term though, but will need an evaluation.

Yes, but its not always exposed through a .promise property. I want to keep it as is for now.

add the entire new Deferred module in the ql file it would be used to eliminate some FPs for

Yeah, lets do that for now.
I moved the module and the tests to the relevant query.

Copy link
Contributor

@esbena esbena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pull request title is misleading now, and the change note is no longer required. Other than that, I think this is ready to merge.

/**
* A promise object created by a Deferred constructor
*/
private class DeferredPromiseDefinition extends PromiseDefinition, DeferredInstance {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though we are intending to use this as a FP filter, we will effectively also be extending the dataflow reasoning that deals with promises. This may lead to surprising dataflow in this query caused by a bad heuristic one day. I am willing to accept that risk though.

Co-Authored-By: Esben Sparre Andreasen <esbena@github.com>
esbena
esbena previously approved these changes Nov 6, 2019
@erik-krogh erik-krogh changed the title JS: add model for Deferred in Promises.qll JS: add Deferred model in js/use-of-returnless-function Nov 6, 2019
@erik-krogh erik-krogh dismissed ghost ’s stale review November 6, 2019 12:32

esbena approved

module Deferred {
/**
* An instance of a `Deferred` class.
* E.g. the result from `new Deferred()` or `new $.Deferred()`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't generally use abbreviations like "e.g." in comments; see https://wiki.semmle.com/pages/viewpage.action?pageId=17105939 (internal link).

@asger-semmle asger-semmle merged commit d9beb54 into github:master Nov 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants