Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 18 additions & 0 deletions java/ql/src/Security/CWE/CWE-400/ReDoS.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
public class ReDoS {
private static String EXPONENTIAL_REGEX = "(a+)+";

public void regexMatchBad(HttpServletRequest request) {
// BAD: User provided input is matched against exponential regex using standard
// Java regular expression engine
java.util.regex.Pattern p = java.util.regex.Pattern.compile(EXPONENTIAL_REGEX);
java.util.regex.Matcher m = p.matcher(request.getParameter("input"));
boolean matches = m.matches();
}

public void regexMatchGood(HttpServletRequest request) {
// GOOD: RE2/J regular expression engine is being used
com.google.re2j.Pattern p = com.google.re2j.Pattern.compile(EXPONENTIAL_REGEX);
com.google.re2j.Matcher m = p.matcher(request.getParameter("input"));
boolean matches = m.matches();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
}

EOL at EOF

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this.
This sample is included in the qhelp. Will this newline at the end look good in the rendered help? Other samples don't have newline at EOF.
@felicitymay, what is your view on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to be considered a valid file on some operating systems, you need a EOL at EOF.

A text file, under unix, consists of a series of lines, each of which ends with a newline character (\n). A file that is not empty and does not end with a newline is therefore not a text file.
- https://unix.stackexchange.com/a/18789

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine with me to change this. But I would like to wait for @felicitymay opinion, because other examples don't have EOL at EOF.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies for the delay in replying. I hadn't responded because I was hoping that someone more knowledgeable would answer you. (I'm a content writer, rather than a developer, and work in Windows where EOL and EOF don't really come up.)

While @JLLeitschuh is right to point out that it's normal behavior in linux to end files with an extra newline, I suspect that @ggolawski is right to suggest that for these examples the extra blank line will look strange when they're rendered. @Semmle/java does that sound right?

52 changes: 52 additions & 0 deletions java/ql/src/Security/CWE/CWE-400/ReDoS.qhelp
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>

<overview>
<p>Matching user input against a regular expression which takes exponential time in the worst case
can allow a malicious user to perform a Denial of Service ("DoS") attack by crafting input that
takes a long time to execute.</p>

<p>Most regular expression engines, including the Java standard library implementation, are
designed to work with an extended regular expression syntax. Although this provides flexibility for
the user, it can prevent the engine from constructing an efficient implementation of the matcher in
all circumstances. In particular, the "worst case time complexity" (see the references) of certain
regular expressions may be "exponential". This would allow a malicious user to provide some input
which causes the regular expression to take a very long time to execute.</p>

<p>Typically, a regular expression is vulnerable to this attack if it applies repetition to a
sub-expression which itself is repeated, or contains overlapping options. For example,
<code>(a+)+</code> is vulnerable to a string such as <code>aaaaaaaaaaaaaaaaaaaaaaaaaaab</code>.
More information about the precise circumstances can be found in the references.</p>
</overview>

<recommendation>
<p>Modify the regular expression to avoid the exponential worst case time. If this is not possible,
then kill the regular expression matching if it takes too long. Alternatively, use a different
regular expression engine, for example RE2/J which runs in time linear in the size of the input.
</p>
</recommendation>

<example>
<p>The following example shows a HTTP request parameter that is matched against a regular
expression which has exponential worst case performance. In the first case, standard Java library
is being used, which can lead to a Denial of Service. In the second case, RE2/J regular expression
engine is being used. It runs in time linear in the size of the input and won't lead to a DoS.</p>

<sample src="ReDoS.java" />
</example>

<references>
<li>
OWASP:
<a href="https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS">Regular expression Denial of Service - ReDoS</a>.
</li>
<li>
Regular-Expressions.info: <a href="https://www.regular-expressions.info/redos.html">Preventing Regular Expression Denial of Service (ReDoS)</a>.
</li>
<li>
<a href="https://github.com/google/re2j">RE2/J: linear time regular expression matching in Java</a>.
</li>
</references>
</qhelp>
26 changes: 26 additions & 0 deletions java/ql/src/Security/CWE/CWE-400/ReDoS.ql
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
/**
* @name Denial of Service from comparison of user input against expensive regex
* @description User input should not be matched against a regular expression that could require
* exponential time on certain input.
* @kind path-problem
* @problem.severity error
* @precision high
* @id java/redos
* @tags security
* external/cwe/cwe-400
*/

import java
import semmle.code.java.dataflow.DataFlow
import Regex

/** A data flow source for exponential regex. */
class ExponentialRegexLiteral extends RegexPatternSource {
ExponentialRegexLiteral() { isExponentialRegex(this.asExpr()) }
}

from DataFlow::PathNode source, DataFlow::PathNode sink, RegexInputFlowConfig conf
where conf.hasFlowPath(source, sink)
select sink.getNode(), source, sink,
"$@ flows to regular expression operation with dangerous regex.", source.getNode(),
"User-provided value"
119 changes: 119 additions & 0 deletions java/ql/src/Security/CWE/CWE-400/Regex.qll
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.dataflow.TaintTracking
import semmle.code.java.dataflow.TaintTracking2

/**
* A taint-tracking configuration for unvalidated user input that is used to match against
* malicious regex.
*/
class RegexInputFlowConfig extends TaintTracking::Configuration {
RegexInputFlowConfig() { this = "RegexInputFlowConfig" }

override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

override predicate isSink(DataFlow::Node sink) { sink instanceof RegexInputSink }

override predicate isSanitizer(DataFlow::Node node) {
node.getType() instanceof PrimitiveType or node.getType() instanceof BoxedType
}
}

/**
* Holds if method call `ma` matches against the regex `regexExpr`. For example, method `matches`
* matches against regex `regex` in `Pattern.matches(regex, input)`.
* The following cases are handled:
* * `Pattern.matches(regex, input)`
* * `input.matches(pattern)` (where `input` is `String`)
* * pattern is created first (`Pattern.compile(regex)`) and then matcher (`p.matcher(input)`)
*
*
* Detecting the last case has some limitations:
* * it doesn't work if pattern is created in one method and matcher in another one, e.g.:
* `void match(Pattern p) { p.matcher(input); } void foo() { match(Pattern.compile(regex)); }`,
* but it works correctly if pattern is a class field and matcher is created in method, e.g.:
* `private Pattern p = Pattern.compile(regex); void foo() { p.matcher(input); }`
* * the issue is reported if `Matcher` object is created with user supplied input on malicious
* pattern. To actually trigger the DoS condition, actual matching (e.g. `matcher.matches()`) is
* required, but usually if matcher is created it will eventually be used.
*/
predicate matchesAgainstRegex(MethodAccess ma, Expr regexExpr) {
// `Pattern.matches(regex, input)`
ma.getMethod() instanceof MethodPatternMatches and
regexExpr.getParent() = ma
or
// `Pattern.compile(regex).matcher(input)` or
// `Pattern p = Pattern.compile(regex); p.matcher(input);` or
// `private Pattern p = Pattern.compile(regex); public void foo() { p.matcher(input); }`
ma.getMethod() instanceof MethodPatternMatcher and
(
DataFlow::localFlow(DataFlow::exprNode(regexExpr.getParent()),
DataFlow::exprNode(ma.getQualifier())) or
ma.getQualifier().(FieldAccess).getField().getInitializer() = regexExpr.getParent()
)
or
// `input.matches(pattern)`
ma.getMethod() instanceof MethodStringMatches and
regexExpr = ma.getAnArgument()
}

/** A data flow sink for unvalidated user input that is used to match against malicious regex. */
class RegexInputSink extends DataFlow::ExprNode {
RegexInputSink() {
exists(MethodAccess ma, RegexPatternFlowConfig cfg, DataFlow::ExprNode regexSink |
ma.getAnArgument() = this.getExpr() or ma.getQualifier() = this.getExpr()
|
cfg.hasFlow(_, regexSink) and
matchesAgainstRegex(ma, regexSink.getExpr()) and
this != regexSink
)
}
}

/** A data flow source for malicious regex. */
abstract class RegexPatternSource extends DataFlow::Node { }

/**
* A taint-tracking configuration for regex pattern that is used to create `Pattern` or is used
* as an argument to `matches` method.
*/
class RegexPatternFlowConfig extends TaintTracking2::Configuration {
RegexPatternFlowConfig() { this = "RegexPatternFlowConfig" }

override predicate isSource(DataFlow::Node source) { source instanceof RegexPatternSource }

override predicate isSink(DataFlow::Node sink) {
exists(MethodAccess ma, int index | ma.getArgument(index) = sink.asExpr() |
ma.getMethod() instanceof MethodPatternCompile
or
ma.getMethod() instanceof MethodPatternMatches and index = 0
or
ma.getMethod() instanceof MethodStringMatches
)
}

override predicate isSanitizer(DataFlow::Node node) {
node.getType() instanceof PrimitiveType or node.getType() instanceof BoxedType
}
}

/**
* An expression that represents a regular expression with potential exponential behavior.
* Couple of variants of a common pattern that leads to exponential blow-up are detected.
*/
predicate isExponentialRegex(StringLiteral s) {
// Example: ([a-z]+)+
s.getValue().regexpMatch(".*\\([^()*+\\]]+\\]?(\\*|\\+)\\)(\\*|\\+).*")
or
// Example: (([a-z])?([a-z]+))+
s.getValue().regexpMatch(".*\\((\\([^()]+\\)\\?)?\\([^()*+\\]]+\\]?(\\*|\\+)\\)\\)(\\*|\\+).*")
or
// Example: (([a-z])+)+
s.getValue().regexpMatch(".*\\(\\([^()*+\\]]+\\]?\\)(\\*|\\+)\\)(\\*|\\+).*")
or
// Example: (a|aa)+
s.getValue().regexpMatch(".*\\(([^()*+\\]]+\\]?)\\|\\1+\\??\\)(\\*|\\+).*")
or
// Example: (.*[a-z]){n} n >= 10
s.getValue().regexpMatch(".*\\(\\.\\*[^()*+\\]]+\\]?\\)\\{[1-9][0-9]+,?[0-9]*\\}.*")
}
15 changes: 15 additions & 0 deletions java/ql/src/Security/CWE/CWE-400/RegexInjection.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
public boolean isUsernameInPassword(HttpServletRequest request) {
String username = request.getParameter("username");
String pasword = request.getParameter("password");

// BAD: User provided input is being used to create regex without escaping
Pattern p = Pattern.compile(username);
Matcher m = p.matcher(password);
return m.matches();

// GOOD: User provided input escaped before being used to create regex
String safeUsername = Pattern.quote(username);
Pattern p = Pattern.compile(safeUsername);
Matcher m = p.matcher(password);
return m.matches();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
}

EOL at EOF

48 changes: 48 additions & 0 deletions java/ql/src/Security/CWE/CWE-400/RegexInjection.qhelp
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>

<overview>
<p>Constructing a regular expression with unsanitized user input is dangerous as a malicious user
may be able to modify the meaning of the expression. In particular, such a user may be able to
provide a regular expression fragment that takes exponential time in the worst case, and use that
to perform a Denial of Service attack.</p>
</overview>

<recommendation>
<p>For user input that is intended to be referenced as a string literal in a regular expression,
use the <code>Pattern.quote</code> method to escape any special characters. If the regular
expression is intended to be configurable by the user, then a timeout should be used to avoid
Denial of Service attacks. Alternatively, a different regular expression engine, for example RE2/J
which runs in time linear in the size of the input can be used.</p>
</recommendation>

<example>
<p>The following example uses regular expression engine to check if the provided username is part
of the provided password.</p>

<p>In the first case, the username which is being used to construct the regex is not escaped. If
a malicious user provides a regex that has exponential worst case performance, then this could
lead to a Denial of Service.</p>

<p>In the second case, the user input is escaped using <code>Pattern.quote</code> before being
included in the regular expression. This ensures that the user cannot insert characters which have
a special meaning in regular expressions.</p>

<sample src="RegexInjection.cs" />
</example>

<references>
<li>
OWASP:
<a href="https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS">Regular expression Denial of Service - ReDoS</a>.
</li>
<li>
Regular-Expressions.info: <a href="https://www.regular-expressions.info/redos.html">Preventing Regular Expression Denial of Service (ReDoS)</a>.
</li>
<li>
<a href="https://github.com/google/re2j">RE2/J: linear time regular expression matching in Java</a>.
</li>
</references>
</qhelp>
28 changes: 28 additions & 0 deletions java/ql/src/Security/CWE/CWE-400/RegexInjection.ql
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
/**
* @name Denial of Service from comparison of user input against regex built from user-controlled
* source
* @description User input should not be matched against a regular expression built from
* user-controlled source, because that could require exponential time on certain
* input.
* @kind path-problem
* @problem.severity error
* @precision high
* @id java/regex-injection
* @tags security
* external/cwe/cwe-400
*/

import java
import semmle.code.java.dataflow.DataFlow
import Regex

/** A data flow source for user-controlled regex. */
class RemoteRegexPatternSource extends RegexPatternSource {
RemoteRegexPatternSource() { this instanceof RemoteFlowSource }
}

from DataFlow::PathNode source, DataFlow::PathNode sink, RegexInputFlowConfig conf
where conf.hasFlowPath(source, sink)
select sink.getNode(), source, sink,
"$@ flows to regular expression operation with user-provided regex.", source.getNode(),
"User-provided value"
45 changes: 45 additions & 0 deletions java/ql/src/semmle/code/java/JDK.qll
Original file line number Diff line number Diff line change
Expand Up @@ -170,6 +170,11 @@ class TypeFile extends Class {
TypeFile() { this.hasQualifiedName("java.io", "File") }
}

/** The class `java.util.regex.Pattern`. */
class TypePattern extends Class {
TypePattern() { this.hasQualifiedName("java.util.regex", "Pattern") }
}

// --- Standard methods ---
/**
* Any of the methods named `command` on class `java.lang.ProcessBuilder`.
Expand Down Expand Up @@ -291,6 +296,46 @@ class MethodMathMax extends Method {
}
}

/**
* Any method named `compile` on class `java.util.regex.Pattern`.
*/
class MethodPatternCompile extends Method {
MethodPatternCompile() {
this.hasName("compile") and
this.getDeclaringType() instanceof TypePattern
}
}

/**
* Any method named `matches` on class `java.util.regex.Pattern`.
*/
class MethodPatternMatches extends Method {
MethodPatternMatches() {
this.hasName("matches") and
this.getDeclaringType() instanceof TypePattern
}
}

/**
* Any method named `matcher` on class `java.util.regex.Pattern`.
*/
class MethodPatternMatcher extends Method {
MethodPatternMatcher() {
this.hasName("matcher") and
this.getDeclaringType() instanceof TypePattern
}
}

/**
* Any method named `matches` on class `java.lang.String`.
*/
class MethodStringMatches extends Method {
MethodStringMatches() {
this.hasName("matches") and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add split, replaceAll, replaceFirst?

this.getDeclaringType() instanceof TypeString
}
}

// --- Standard fields ---
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/** The field `System.in`. */
class SystemIn extends Field {
Expand Down
Loading