Skip to content

JavaScript: Improve handling of regular expressions in taint tracking. #910

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 12, 2019

Conversation

xiemaisi
Copy link

@xiemaisi xiemaisi commented Feb 8, 2019

Previously, we were a little inconsistent in our handling of regular expressions: while we treated any guard involving a regular expression test as a sanitiser, we would always track taint through regular expression replacements, even reasonably complete sanitisers.

This PR proposes to treat regexp replaces involving HTML metacharacters as sanitisers for the XSS queries (but not for anything else). In particular, this fixes #205.

Of course, we might lose true positives from incomplete manual sanitisation, but as previously discussed that should be flagged by a separate query.

Since I was looking at regexps anyway, I also added a taint step through RegExp.prototype.exec (a conceptually unrelated change).

The evaluation (internal link) shows some performance gains from the XSS library reorganisation in the first commit, and the results confirm that the FP reported in #205 has been fixed. The additional taint tracking through exec gains us a few more results. The one on ace is a false positive, but that's a problem with the tainted-path query, not the new taint tracking.

Max Schaefer added 4 commits February 8, 2019 09:53
…pts.

As its first application, this library makes it possible for `StoredXss` to reuse the `Source` classes of `DomBasedXss` and `ReflectedXss` without having to pull in their libraries (which contain their `Configuration` classes, causing `StoredXss` to recompute all flow information for the other two queries).
@xiemaisi xiemaisi added the JS label Feb 8, 2019
@xiemaisi xiemaisi requested a review from a team as a code owner February 8, 2019 10:03
@xiemaisi
Copy link
Author

Ping @Semmle/js.

@ghost ghost self-assigned this Feb 12, 2019
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM.
I think MetacharEscapeSanitizer restricts too much flow though (see comment), could we make it more permissive initially and make it more restrictive if we find the need?

@@ -0,0 +1,225 @@
/**
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bit confusing that Xss.ql and Xss.qll does not correspond directly to each other, but I guess renaming Xss.ql to DomBasedXss.ql is out of the question for compatibility reasons.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may be able to rename it soon, but I'm not sure whether it's safe to do just yet.

abstract class Sanitizer extends DataFlow::Node { }
}

/** Provides classes and predicates for the DOM-based XSS query. */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the three modules below are mostly cut-pasted, and have not looked into their details.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.

getMethodName() = "replace" and
exists(RegExpConstant c |
c.getLiteral() = getArgument(0).asExpr() and
c.getValue().regexpMatch("['\"&<>]")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we avoid matching on the single and double quotes? I fear they will make too many false XSS sanitizers since I think it is common to strip or replace one quote type with the other. This change motivates a slight renaming of the class to reflect that it does not identify all meta-characters.

Besides, the single quote is not a meta character, unless the browser supports it as an attribute value delimiter for compatibility reasons. I think.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll take out the quotes.

the single quote is not a meta character

https://www.w3.org/TR/2012/WD-html-markup-20120320/syntax.html#syntax-attr-single-quoted

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, no, wait, I can't do that, otherwise it'll start flagging uses of verifyURL from #205 again...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok then.

class MetacharEscapeSanitizer extends Sanitizer, DataFlow::MethodCallNode {
MetacharEscapeSanitizer() {
getMethodName() = "replace" and
exists(RegExpConstant c |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sanitizer does not recognize the string target variant: tainted.replace('<', ...) (which is safe in a loop). Perhaps we should make a library for identifying these sanitizing calls to replace and friends (other PR), that could be useful for other security queries as well (".." for js/path-injection, for example).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, but since we're aiming to make the sanitiser relatively permissive anyway I don't think there is an urgent need to cover that case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

false positives: js/xss
2 participants