Skip to content

JavaScript: Avoid duplicate source/sink nodes due to flow labels.#2201

Merged
semmle-qlci merged 11 commits intogithub:masterfrom
max-schaefer:js/avoid-duplicate-source-and-sink-nodes
Oct 31, 2019
Merged

JavaScript: Avoid duplicate source/sink nodes due to flow labels.#2201
semmle-qlci merged 11 commits intogithub:masterfrom
max-schaefer:js/avoid-duplicate-source-and-sink-nodes

Conversation

@max-schaefer
Copy link
Copy Markdown
Contributor

Flow labels form part of the identity of a PathNode, leading to duplicate sources and/or sinks in queries that use them (though some of our tools have settings to collapse them).

This PR proposes to fix this by introducing synthetic source and sink nodes that do not incorporate information about flow labels, and have the real source/sink as their successor/predecessor. The real source and sink are hidden in the path viewer to avoid having confusing edges between identical-looking nodes.

A preliminary dist-compare on nightly.slugs shows no effects on performance or results, but I'm running a full dist-compare just in case.

Another thing I'd like to point out is that this PR turns paths of length zero into paths of length one: if a source node is also a sink node, we still introduce both a synthetic source and a synthetic sink with an edge between them (skipping over the original source-cum-sink node). This is less nice than before, but I don't think it happens that much in practice, and avoiding it is a bit fiddly. I'm happy to be convinced otherwise, though.

Commit-by-commit review is strongly encouraged.

@max-schaefer max-schaefer added JS WIP This is a work-in-progress, do not merge yet! labels Oct 25, 2019
@max-schaefer max-schaefer requested a review from a team as a code owner October 25, 2019 08:53
or
MkSinkNode(DataFlow::Node nd, DataFlow::Configuration cfg) { mkSinkNode(nd, cfg, _) }

private predicate mkSourceNode(DataFlow::Node nd, DataFlow::Configuration cfg, PathSummary summary) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Qldoc please, it's not really obvious what this does

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also badly named; will fix.

)
}

private predicate mkSinkNode(DataFlow::Node nd, DataFlow::Configuration cfg, PathSummary summary) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

nd = any(AddExpr add).getAnOperand().flow()
or
// Skip mid node immediately following a source node
exists(MkSourceNode(nd, cfg))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be really confusing when a path from one source steps through another source, as the path will skip over the intermediate source. This could happen when using all parameters of a certain name as taint source.

A similar thing can occur with the sink node.

Can't we have a separate mechanism for contracting the edges going out of source nodes, and going into sink nodes?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a very good point. I was going to hide the edges (not the nodes) in the follow-up PR to do with path exploration; I'll just add that commit to this PR instead.

Copy link
Copy Markdown
Contributor

@esbena esbena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Will I now be able to write a query like below?

/**
 * @kind problem
 */

import javascript
import semmle.javascript.security.dataflow.CommandInjection::CommandInjection
import DataFlow::PathGraph

from SinkPathNode n
select n, "There is some flow to this node from a source"

result = getASuccessor(nd)
}

/**
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the diff bad, or did you include two qldoc starters here?

/**
/**

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latter; will fix.

@max-schaefer max-schaefer force-pushed the js/avoid-duplicate-source-and-sink-nodes branch 3 times, most recently from e26024e to 776aa61 Compare October 28, 2019 15:41
@asger-semmle
Copy link
Copy Markdown
Contributor

The new commits LGTM. It's a bit unfortunate that it wasn't part of the evaluation, though. Could you make sure it's exercised in at least one evaluation at some point (possibly together with the next PR as per your original plan).

@max-schaefer
Copy link
Copy Markdown
Contributor Author

I'll redo the whole evaluation. I'd like to get as much validation as possible, considering how many mistakes I made trying to get the right edges suppressed in that last commit 😞

Max Schaefer added 11 commits October 29, 2019 15:03
This makes it more obvious to the evaluator that it is a good predicate to pick as a sentinel, and in practice we mostly just have one configuration in scope anyway.
…AHiddenSuccessor` into top-level predicates.
They should really only be hidden for display purposes.
Instead of skipping over initial and final nodes, we now introduce edges from source and to sink nodes that circumvent these nodes entirely.
@max-schaefer max-schaefer force-pushed the js/avoid-duplicate-source-and-sink-nodes branch from 776aa61 to b42026a Compare October 29, 2019 15:36
@max-schaefer
Copy link
Copy Markdown
Contributor Author

A re-run (internal link) on nightly.slugs shows no impact on results or performance.

@max-schaefer max-schaefer removed the WIP This is a work-in-progress, do not merge yet! label Oct 31, 2019
@max-schaefer
Copy link
Copy Markdown
Contributor Author

Same story for default.slugs, so let's give this a try.

@semmle-qlci semmle-qlci merged commit 2a39802 into github:master Oct 31, 2019
@max-schaefer max-schaefer deleted the js/avoid-duplicate-source-and-sink-nodes branch November 6, 2019 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants