JS: add Deferred model in js/use-of-returnless-function #2102
asger-semmle merged 11 commits into github:master
asger-semmle
left a comment
Could you add some tests in library-tests/TaintTracking to show that taint can propagate through these promises?
For context: If a promise resolves to a tainted value, we just consider the whole promise to be tainted. The abstract classes you're extending should give rise to some new taint steps.
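A rough sketch of the kind of flow such a test could exercise, in plain JavaScript. The `source` and `sink` functions and the small `Deferred` class are hypothetical stand-ins written for this illustration; they are not taken from the existing test suite or from any particular library.

```javascript
// Hypothetical stand-ins for a taint source and a taint sink.
function source() {
  return "tainted";
}
const observed = [];
function sink(value) {
  observed.push(value);
}

// A minimal Deferred, assumed here for illustration only.
class Deferred {
  constructor() {
    // The Promise executor runs synchronously, so both fields are set
    // before the constructor returns.
    this.promise = new Promise((resolve, reject) => {
      this.resolve = resolve;
      this.reject = reject;
    });
  }
}

const deferred = new Deferred();
// The promise resolves to a tainted value ...
deferred.resolve(source());
// ... so the value received in the `then` callback should be tainted too.
const done = deferred.promise.then((value) => sink(value));
```

If the whole promise is treated as tainted once it resolves to a tainted value, the argument to `sink` is where the taint would be expected to show up.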
    ## Changes to QL libraries

    * `Expr.getDocumentation()` now handles chain assignments.
    * Added `Deferred` as a promise library in Promises.qll
Improvements to library models aren't usually listed in this section. This is mainly for API changes that are relevant for QL writers. There is no need to use the Deferred module directly, and certainly no reason to import Promises.qll directly.
You could add a bullet in the general improvements section, saying that promises derived from a Deferred object are now recognized.
ghost
left a comment
I think this requires a bit more work to reduce our reliance on heuristics. @asger-semmle WDYT?
    module Deferred {
      private DataFlow::SourceNode deferred() {
        exists(VarAccess var, DataFlow::NewNode instantiation |
          var.getName() = "Deferred" and
This has to be a more semantic definition. We have heuristics in .ql files, and very, very rarely in .qll files. This separation principle is in place to prevent a slow, catastrophic erosion of precision caused by interplay between multiple heuristics in the library files. I am glad to see the sanity check, but I do not think it is enough.
Some thoughts:
Can you perhaps use globalVarRef("Deferred") instead? We use globalVarRef in a few places if it is common that the object of interest is introduced globally instead of through an imported library.
I note that something like `class Deferred` is used in a few places in your query console hits. I am inclined to accept small class definitions (perhaps also pseudo class definitions with object literals and/or prototype hackery) with both `resolve` and `reject` fields as sources for deferred objects, but it depends on the results. These classes should also have at least one instance with your `exists(instantiation.getAMemberCall("resolve"))` heuristic, and perhaps also at least one instance with `exists(instantiation.getAMemberCall("reject"))`. It may be necessary to use type tracking to identify such classes properly.
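For illustration, the "pseudo class" shapes mentioned here might look roughly like this in JavaScript. This is only a sketch: `makeDeferred` and the `_resolve`/`_reject` field names are made up for the example and are not taken from the query console results.

```javascript
// Variant 1: a pseudo class built from a constructor function,
// with `resolve` and `reject` defined on the prototype.
function Deferred() {
  var self = this;
  this.promise = new Promise(function (resolve, reject) {
    self._resolve = resolve;
    self._reject = reject;
  });
}
Deferred.prototype.resolve = function (value) {
  this._resolve(value);
};
Deferred.prototype.reject = function (reason) {
  this._reject(reason);
};

// Variant 2: a hypothetical factory returning an object literal
// that carries both `resolve` and `reject` fields.
function makeDeferred() {
  var resolve, reject;
  var promise = new Promise(function (res, rej) {
    resolve = res;
    reject = rej;
  });
  return { promise: promise, resolve: resolve, reject: reject };
}

var fromConstructor = new Deferred();
var fromFactory = makeDeferred();
```

Both values expose `resolve`, `reject` and `promise`, which is the shape a more semantic definition would presumably have to recognize.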
> Can you perhaps use globalVarRef("Deferred") instead?
Not really. It happens often enough that Deferred is not a global variable (including in the FP that motivated this PR).
I might be able to use globalVarRef("Deferred") together with any(ParameterNode p | p.getName() = "Deferred") to capture most uses.
I'll look into creating something more precise.
> I note that something like class Deferred is used in a few places in your query console hits.

But I don't think it's used often enough to rely solely on that pattern.
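A sketch of the non-global case, where `Deferred` only exists as a parameter. The names `SimpleDeferred`, `makeTimer` and `wait` are hypothetical and chosen for this example; they do not come from the FP that motivated the PR.

```javascript
// A minimal Deferred implementation, assumed for illustration.
function SimpleDeferred() {
  var self = this;
  self.promise = new Promise(function (resolve, reject) {
    self.resolve = resolve;
    self.reject = reject;
  });
}

// Hypothetical module factory: `Deferred` is a parameter here,
// so a lookup of a global variable named "Deferred" would not find it.
function makeTimer(Deferred) {
  return function wait(ms) {
    var deferred = new Deferred();
    setTimeout(function () {
      deferred.resolve(ms);
    }, ms);
    return deferred.promise;
  };
}

var wait = makeTimer(SimpleDeferred);
var pending = wait(0);
```

Here the only syntactic hint is the parameter name, which is why a parameter-name check alongside `globalVarRef("Deferred")` would capture this use.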
So I identified the patterns used within the libraries and changed the model accordingly: 4ec825b
I don't think it's much better than before, but it more precisely describes the patterns observed.
> I note that something like class Deferred is used in a few places in your query console hits. I am inclined to accept small class definitions (perhaps also pseudo class definitions with object literals and/or prototype hackery) with both `resolve` and `reject` fields as sources for deferred objects, but it depends on the results. These classes should also have at least one instance with your `exists(instantiation.getAMemberCall("resolve"))` heuristic, and perhaps also at least one instance with `exists(instantiation.getAMemberCall("reject"))`. It may be necessary to use type tracking to identify such classes properly.
Something like this: 31009d9?
I still don't check if the "class"-definition itself has resolve/reject methods. Mostly because I can't do that check when Deferred is a parameter.
Hm, actually the Tainting the entire
After reading @esben-semmle's comment I'm wondering if it's better to install a name-based FP filter in the
So type tracking has been added. @esbena I ended up going more in your direction, which ended up simplifying the implementation. I did try something more complex, that looks for specific variants of how a
The implementation looks good now (except for some missing QL doc on
Ultimately, I think we should split this PR into two:
Yes, but it's not always exposed through a
Yeah, let's do that for now.
esbena
left a comment
The pull request title is misleading now, and the change note is no longer required. Other than that, I think this is ready to merge.
    /**
     * A promise object created by a Deferred constructor
     */
    private class DeferredPromiseDefinition extends PromiseDefinition, DeferredInstance {
Even though we are intending to use this as a FP filter, we will effectively also be extending the dataflow reasoning that deals with promises. This may lead to surprising dataflow in this query caused by a bad heuristic one day. I am willing to accept that risk though.
Co-Authored-By: Esben Sparre Andreasen <esbena@github.com>
    module Deferred {
      /**
       * An instance of a `Deferred` class.
       * E.g. the result from `new Deferred()` or `new $.Deferred()`.
We don't generally use abbreviations like "e.g." in comments; see https://wiki.semmle.com/pages/viewpage.action?pageId=17105939 (internal link).
Motivated by the last FPs in `js/use-of-returnless-function`.

Turns out that `Deferred` is a tricky case, as it is not a single library, but rather a silent agreement between library developers to implement the same API. (You can find more implementations by looking through these results: https://lgtm.com/query/4525530373913797062/)

Therefore this new model for `Deferred` looks for values named "Deferred" that are used like a promise library.

The `Deferred` library is used somewhat often: https://lgtm.com/query/8585906951513722374/. Most uses, however, are from imports of Angular.

I evaluated performance using the `js/use-of-returnless-function` query: https://git.semmle.com/erik/dist-compare-reports/tree/florian.ti.semmle.com_1570605281526

Looks like there is no performance degradation, and the FPs are no longer reported.