JavaScript: Add libraries for forward and backward data-flow exploration. #2236
erik-krogh
left a comment
LGTM.
I played around with both forward/backwards exploration and can confirm that it works as advertised.
This seems like a harmless addition, as no production code will import these anyway.
I'm not quite sure how the isSink/isSource overrides work, as there are multiple configurations overriding the same method.
But I can observe that the isSink/isSource from the forward/backwards exploration is used, and that it therefore works.
Thanks, @erik-krogh!
Indeed. The new definitions of But note that they do so for every configuration, hence the caveat about queries involving multiple configurations.
esbena
left a comment
LGTM.
Have you considered putting this in the semmle.javascript.meta module instead, perhaps even all the way down in semmle.javascript.meta.analysisQuality.dataflow?
Hm, I'm not sure I'd be able to find it myself if I did.
esbena
left a comment
OK. Let's keep it where it is then.
This follows a different approach from #1759. Instead of adding an alternative API with partial path nodes and partial paths, we provide a library that one can import into an existing query and thereby switch on (forward) data-flow exploration. Another difference is that instead of relying on an exploration limit we report all maximal paths starting at source nodes, that is, all paths that begin with a source node and end at a node that has no successor in the path graph.
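To make "maximal paths" concrete, here is a small Python sketch over a toy path graph. It only illustrates the definition above and is not the QL implementation in this PR; the graph, the node names, and the helper `maximal_paths_from` are made up for the example, and the graph is assumed to be acyclic.

```python
from typing import Dict, Iterator, List

# Hypothetical path graph: node -> list of successors (assumed acyclic).
GRAPH: Dict[str, List[str]] = {
    "source": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": [],
}


def maximal_paths_from(graph: Dict[str, List[str]], start: str) -> Iterator[List[str]]:
    """Yield every path that begins at `start` and ends at a node with no successor."""
    stack: List[List[str]] = [[start]]
    while stack:
        path = stack.pop()
        successors = graph.get(path[-1], [])
        if not successors:
            # The last node has no successor, so the path cannot be extended.
            yield path
        for nxt in successors:
            stack.append(path + [nxt])


paths = sorted(maximal_paths_from(GRAPH, "source"))
print(paths)
```

Note that no exploration limit appears anywhere: a path is reported exactly when it cannot be extended further, which is the difference from a limit-based approach described above.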
Once imported, the library will, in fact, switch all configurations to exploration mode. This would yield extremely confusing results for queries using configurations that depend on themselves or each other, but due to the way the exploration library is implemented this will cause a negative-recursion error anyway. Not the best failure mode, but it is documented in the library's header comment. (We have a small number of standard queries that use mutually recursive configurations, so they cannot currently use data-flow exploration. I am working on changing that, but that's for a separate PR.)
Of course, the whole concept can be dualised to backward flow-exploration, looking for paths starting at nodes without a predecessor and ending at a sink. This PR also includes a library for doing that, but with the current implementation of the data-flow library this only scales on very small snapshots (as explained in the library's header comment).
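The dual notion can be sketched the same way. The Python snippet below walks a toy graph backwards from a sink and collects the nodes without a predecessor from which the sink is reachable, i.e. the possible starting points of backward-maximal paths. Again, this is a conceptual illustration under assumed names (`GRAPH`, `backward_start_nodes`), not the QL library, and it says nothing about how the real implementation scales.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical path graph: node -> list of successors.
GRAPH: Dict[str, List[str]] = {
    "x": ["y"],
    "y": ["sink"],
    "z": ["sink"],
    "w": ["v"],  # never reaches the sink
    "v": [],
    "sink": [],
}


def backward_start_nodes(graph: Dict[str, List[str]], sink: str) -> Set[str]:
    """Return the nodes with no predecessor from which `sink` is reachable."""
    # Invert the edges so that we can walk from the sink towards its origins.
    preds: Dict[str, List[str]] = {node: [] for node in graph}
    for node, succs in graph.items():
        for succ in succs:
            preds.setdefault(succ, []).append(node)

    seen: Set[str] = {sink}
    queue = deque([sink])
    starts: Set[str] = set()
    while queue:
        node = queue.popleft()
        if not preds.get(node):
            # No predecessor: a backward path cannot be extended past this node.
            starts.add(node)
        for pred in preds.get(node, []):
            if pred not in seen:
                seen.add(pred)
                queue.append(pred)
    return starts


starts = backward_start_nodes(GRAPH, "sink")
print(sorted(starts))
```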
I'm not entirely sure this is the right approach to data-flow exploration. Past experience suggests that cleverness using abstract classes eventually comes back to bite us, and that may well be the case here. On the bright side, the exploration libraries are entirely orthogonal to the rest of the data-flow library, so if they turn out to be a bad idea we can deprecate and remove them very easily.