JavaScript: Improve handling of regular expressions in taint tracking. #910

xiemaisi · 2019-02-08T10:03:01Z

Previously, we were a little inconsistent in our handling of regular expressions: while we treated any guard involving a regular expression test as a sanitiser, we would always track taint through regular expression replacements, even when the replacement was a reasonably complete sanitiser.
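To make the two cases concrete, here is a minimal JavaScript sketch of the kind of code being analysed. The function names and the `render` sink are hypothetical and only stand in for real sinks such as an `innerHTML` assignment.

```javascript
// Hypothetical sink standing in for something like `element.innerHTML = ...`.
function render(html) {
  return "<div>" + html + "</div>";
}

// A regular expression test used as a guard: inside the `if`, the value is
// known to consist of letters and digits only.
function renderIfAlphanumeric(input) {
  if (/^[a-z0-9]+$/i.test(input)) {
    return render(input);
  }
  return render("");
}

// A regular expression replacement: the result is a new string derived from
// the input, so taint tracking follows it unless the call is recognised as
// a sanitiser.
function renderWithoutTags(input) {
  return render(input.replace(/[<>]/g, ""));
}
```

The first function is the guard pattern that was already treated as a sanitiser; the second is the replacement pattern through which taint was always tracked.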

This PR proposes to treat regexp replaces involving HTML metacharacters as sanitisers for the XSS queries (but not for anything else). In particular, this fixes #205.
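As a rough illustration of which `replace` calls this covers, the sketch below approximates the idea at runtime by checking whether a regular expression mentions one of the characters `'`, `"`, `&`, `<` or `>`. It is an assumption-laden stand-in, not the QL predicate: the helper name `mentionsHtmlMetachar` is hypothetical, and inspecting `regex.source` is cruder than inspecting the regular expression's constants (for example, it would also fire on named groups such as `(?<name>...)`).

```javascript
// Characters treated as HTML metacharacters for the purpose of this sketch.
const HTML_METACHAR = /['"&<>]/;

// Hypothetical helper: does the pattern text mention an HTML metacharacter?
function mentionsHtmlMetachar(regex) {
  return HTML_METACHAR.test(regex.source);
}

// Replacements like these would be candidates for XSS sanitisers.
const escapesAngle = mentionsHtmlMetachar(/</g);
const escapesAll = mentionsHtmlMetachar(/[&<>"']/g);

// A whitespace-normalising replacement would not.
const normalisesSpace = mentionsHtmlMetachar(/\s+/g);
```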

Of course, we might lose true positives from incomplete manual sanitisation, but as previously discussed that should be flagged by a separate query.

Since I was looking at regexps anyway, I also added a taint step through RegExp.prototype.exec (a conceptually unrelated change).
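A small sketch of why a taint step through `exec` makes sense: the array it returns contains substrings of the input, so they are as untrusted as the input itself. The `untrusted` value and the HTML string built from it are hypothetical.

```javascript
// Hypothetical untrusted input, e.g. taken from the query string of a URL.
const untrusted = "?name=<img src=x onerror=alert(1)>";

// `exec` returns an array whose elements are substrings of the input
// (or null if there is no match).
const match = /[?&]name=([^&]*)/.exec(untrusted);
const name = match ? match[1] : "";

// Hypothetical sink: the captured group ends up in markup unescaped.
const html = "<p>Hello, " + name + "</p>";
```

Without the extra taint step, the flow from `untrusted` to `html` would be lost at the `exec` call.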

The evaluation (internal link) shows some performance gains from the XSS library reorganisation in the first commit, and the results confirm that the FP reported in #205 has been fixed. The additional taint tracking through exec gains us a few more results. The one on ace is a false positive, but that's a problem with the tainted-path query, not the new taint tracking.

…pts. As its first application, this library makes it possible for `StoredXss` to reuse the `Source` classes of `DomBasedXss` and `ReflectedXss` without having to pull in their libraries (which contain their `Configuration` classes, causing `StoredXss` to recompute all flow information for the other two queries).

…izers for XSS queries.

xiemaisi · 2019-02-12T08:24:01Z

Ping @Semmle/js.

ghost

Generally LGTM.
I think `MetacharEscapeSanitizer` restricts too much flow, though (see comment). Could we make it more permissive initially and make it more restrictive if we find the need?

ghost · 2019-02-12T11:04:16Z

javascript/ql/src/semmle/javascript/security/dataflow/Xss.qll

@@ -0,0 +1,225 @@
+/**


It is a bit confusing that `Xss.ql` and `Xss.qll` do not correspond directly to each other, but I guess renaming `Xss.ql` to `DomBasedXss.ql` is out of the question for compatibility reasons.

We may be able to rename it soon, but I'm not sure whether it's safe to do just yet.

ghost · 2019-02-12T11:07:07Z

javascript/ql/src/semmle/javascript/security/dataflow/Xss.qll

+  abstract class Sanitizer extends DataFlow::Node { }
+}
+
+/** Provides classes and predicates for the DOM-based XSS query. */


I assume the three modules below are mostly cut-pasted, and have not looked into their details.

ghost · 2019-02-12T11:18:20Z

javascript/ql/src/semmle/javascript/security/dataflow/Xss.qll

+      getMethodName() = "replace" and
+      exists(RegExpConstant c |
+        c.getLiteral() = getArgument(0).asExpr() and
+        c.getValue().regexpMatch("['\"&<>]")


Could we avoid matching on the single and double quotes? I fear they will produce too many false XSS sanitizers, since I think it is common to strip one quote type or replace it with the other. This change would motivate a slight renaming of the class to reflect that it does not identify all meta-characters.

Besides, the single quote is not a meta character, unless the browser supports it as an attribute value delimiter for compatibility reasons. I think.

Sure, I'll take out the quotes.

> the single quote is not a meta character

https://www.w3.org/TR/2012/WD-html-markup-20120320/syntax.html#syntax-attr-single-quoted

Ugh, no, wait, I can't do that, otherwise it'll start flagging uses of verifyURL from #205 again...
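For context on why quotes matter here, the following sketch shows a value placed into a single-quoted attribute. All names are hypothetical; the point is only that escaping the angle brackets and the ampersand does not stop a payload from closing the attribute.

```javascript
// Escapes only the ampersand and the angle brackets, leaving quotes alone.
function escapeWithoutQuotes(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Additionally escapes both quote characters.
function escapeWithQuotes(s) {
  return escapeWithoutQuotes(s).replace(/"/g, "&quot;").replace(/'/g, "&#39;");
}

// Hypothetical template that places a value in a single-quoted attribute.
function link(href) {
  return "<a href='" + href + "'>link</a>";
}

const payload = "x' onmouseover='alert(1)";

// The quote in the payload closes the attribute and adds an event handler.
const unsafe = link(escapeWithoutQuotes(payload));

// With quotes escaped, the payload stays inside the attribute value.
const safe = link(escapeWithQuotes(payload));
```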

ghost · 2019-02-12T11:31:06Z

javascript/ql/src/semmle/javascript/security/dataflow/Xss.qll

+  class MetacharEscapeSanitizer extends Sanitizer, DataFlow::MethodCallNode {
+    MetacharEscapeSanitizer() {
+      getMethodName() = "replace" and
+      exists(RegExpConstant c |


This sanitizer does not recognize the string target variant: `tainted.replace('<', ...)` (which is safe in a loop). Perhaps we should make a library for identifying these sanitizing calls to `replace` and friends (in another PR); that could be useful for other security queries as well (`".."` for `js/path-injection`, for example).

Indeed, but since we're aiming to make the sanitiser relatively permissive anyway I don't think there is an urgent need to cover that case.
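The remark about the string target variant can be illustrated as follows: with a string pattern, `replace` rewrites only the first occurrence, so a single call is incomplete while a loop is not. The helper `replaceAllByLoop` is a hypothetical name used only for this sketch.

```javascript
const input = "<b><i>";

// With a string pattern, only the first occurrence is replaced.
const once = input.replace("<", "&lt;");

// Repeating the call until no occurrence is left replaces them all.
// This terminates only because the replacement does not contain the target.
function replaceAllByLoop(s, target, replacement) {
  let result = s;
  while (result.includes(target)) {
    result = result.replace(target, replacement);
  }
  return result;
}

const looped = replaceAllByLoop(input, "<", "&lt;");

// The regular expression variant with the `g` flag has the same effect.
const viaRegex = input.replace(/</g, "&lt;");
```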

Max Schaefer added 4 commits February 8, 2019 09:53

JavaScript: Treat regexp replacements of HTML metacharacters as sanit…

25d06ad

…izers for XSS queries.

JavaScript: Track taint through RegExp.prototype.replace.

b314c54

JavaScript: Add change note.

6ce77ea

xiemaisi added the JS label Feb 8, 2019

xiemaisi requested a review from a team as a code owner February 8, 2019 10:03

ghost self-assigned this Feb 12, 2019

ghost suggested changes Feb 12, 2019

ghost approved these changes Feb 12, 2019

semmle-qlci merged commit c133362 into github:master Feb 12, 2019

xiemaisi mentioned this pull request Feb 12, 2019

JavaScript: Add Range.prototype.createContextualFragment as an XSS sink. #933

Merged

xiemaisi deleted the js/regexp-taint branch March 13, 2019 15:33

kamarcum unassigned ghost Apr 28, 2020

verdinjoshua1982 mentioned this pull request Oct 14, 2022

[Snyk] Upgrade entities from 1.1.2 to 4.4.0 verdinjoshua1982/codeql#4

Open
