-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Python: Add Regular Expression Injection query #5442
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
61 commits
Select commit
Hold shift + click to select a range
7adc3c2
Upload ReDoS query, qhelp and tests
jorgectf bd3d2ec
Update to match consistent naming across languages
jorgectf afc4f51
Remove CWE references
jorgectf 21f8135
Move to experimental folder
jorgectf 6cc7144
Apply suggestions from code review
jorgectf 63f708d
Apply suggestions
jorgectf 5dae920
Edit filenames to match consistent naming
jorgectf f45307f
Apply rebase
jorgectf e4736d0
Typo
jorgectf a1b5cc3
Typo
jorgectf 6d5a0f2
Limit Sanitizer to re.escape(arg)
jorgectf caaf543
Attempt to restructuring ReMethods and RegexExecution's modules
jorgectf 3daec8e
Enclose Sinks and ReMethods in a module
jorgectf b207929
RegexExecution restructuring
jorgectf 249e409
Change query ID
jorgectf b27b77c
Apply suggestions from code review
jorgectf 0f20eeb
Apply suggestions
jorgectf 444a15a
Polish imports
jorgectf 28fdeba
Structure development
jorgectf a1a3c98
Undo main Concepts.qll change
jorgectf d61adcc
Take main Concepts.qll out of the PR
jorgectf b5ea41f
Fix CompiledRegex
jorgectf ce23db2
Move Sanitizer to ReEscapeCall
jorgectf ee1d2b6
Delete DirectRegex and CompiledRegex
jorgectf 30554a1
Format
jorgectf 3d990c5
Get back to ApiGraphs
jorgectf 805f86a
Polish RegexEscape
jorgectf be09ffe
Create RegexEscape Range
jorgectf c127b10
Create re.compile().ReMethod test
jorgectf 35f1c45
Change from Attribute to DataFlow::CallCfgNode in getRegexMethod()
jorgectf 36cc7b5
Fix CompiledRegex
jorgectf 53d61c4
Use custom Sink
jorgectf 18ce257
Move RegexInjectionSink to query config (qll)
jorgectf e78e2ac
Get rid of (get)regexMethod
jorgectf a5850f4
Use getRegexModule to know used lib
jorgectf f751103
Fix Sink utilization in select
jorgectf 66ee67a
Polished select statement
jorgectf c54f08f
Improve qhelp
jorgectf 0e169ba
Format qhelp
jorgectf d49c23f
Improve tests' readability
jorgectf d4a89b2
Fix qhelp typo while converting to python's regex injection
jorgectf b672197
Improve code comments
jorgectf 3655514
Fix ambiguity
jorgectf fc27c6c
Fix RegexExecution ambiguity
jorgectf 03825a6
Add comment to Sink's predicates
jorgectf ec85ee4
Sink's predicate typo
jorgectf d401d18
Add .expected and qlref
jorgectf 81d23c0
Move tests and qlref from /src to /test
jorgectf d968eea
Move expected to /test
jorgectf 6a20a4d
Add newline to qhelp
jorgectf 3fae3fd
Take ApiGraphs out of Concepts.qll
jorgectf 05ee853
Remove wrong comment
jorgectf 12ccd7e
Update .expected
jorgectf c432284
Polish qhelp
jorgectf c0c71c5
Apply suggestions from code review
jorgectf 20b532e
Update to-cast sink's naming
jorgectf 8a80098
Remove unused class variables
jorgectf 21e01b8
Add code example in CompiledRegex
jorgectf 213d011
Edit code example in CompiledRegex
jorgectf bd4b189
Polish documentation consistency
jorgectf 78370cf
Update python/ql/src/experimental/semmle/python/frameworks/Stdlib.qll
yoff File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
45 changes: 45 additions & 0 deletions
45
python/ql/src/experimental/Security/CWE-730/RegexInjection.qhelp
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
<!DOCTYPE qhelp PUBLIC | ||
"-//Semmle//qhelp//EN" | ||
"qhelp.dtd"> | ||
<qhelp> | ||
<overview> | ||
<p> | ||
Constructing a regular expression with unsanitized user input is dangerous as a malicious user may | ||
be able to modify the meaning of the expression. In particular, such a user may be able to provide | ||
a regular expression fragment that takes exponential time in the worst case, and use that to | ||
perform a Denial of Service attack. | ||
</p> | ||
</overview> | ||
|
||
<recommendation> | ||
<p> | ||
Before embedding user input into a regular expression, use a sanitization function such as | ||
<code>re.escape</code> to escape meta-characters that have a special meaning regarding | ||
regular expressions' syntax. | ||
</p> | ||
</recommendation> | ||
|
||
<example> | ||
<p> | ||
The following examples are based on a simple Flask web server environment. | ||
</p> | ||
<p> | ||
The following example shows a HTTP request parameter that is used to construct a regular expression | ||
without sanitizing it first: | ||
</p> | ||
<sample src="re_bad.py" /> | ||
<p> | ||
Instead, the request parameter should be sanitized first, for example using the function | ||
<code>re.escape</code>. This ensures that the user cannot insert characters which have a | ||
special meaning in regular expressions. | ||
</p> | ||
<sample src="re_good.py" /> | ||
</example> | ||
|
||
<references> | ||
<li>OWASP: <a href="https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS">Regular expression Denial of Service - ReDoS</a>.</li> | ||
<li>Wikipedia: <a href="https://en.wikipedia.org/wiki/ReDoS">ReDoS</a>.</li> | ||
<li>Python docs: <a href="https://docs.python.org/3/library/re.html">re</a>.</li> | ||
<li>SonarSource: <a href="https://rules.sonarsource.com/python/type/Vulnerability/RSPEC-2631">RSPEC-2631</a>.</li> | ||
</references> | ||
</qhelp> |
29 changes: 29 additions & 0 deletions
29
python/ql/src/experimental/Security/CWE-730/RegexInjection.ql
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
/** | ||
* @name Regular expression injection | ||
* @description User input should not be used in regular expressions without first being escaped, | ||
* otherwise a malicious user may be able to inject an expression that could require | ||
* exponential time on certain inputs. | ||
* @kind path-problem | ||
* @problem.severity error | ||
* @id py/regex-injection | ||
* @tags security | ||
* external/cwe/cwe-730 | ||
* external/cwe/cwe-400 | ||
*/ | ||
|
||
// determine precision above | ||
import python | ||
import experimental.semmle.python.security.injection.RegexInjection | ||
import DataFlow::PathGraph | ||
|
||
from | ||
RegexInjectionFlowConfig config, DataFlow::PathNode source, DataFlow::PathNode sink, | ||
RegexInjectionSink regexInjectionSink, Attribute methodAttribute | ||
where | ||
config.hasFlowPath(source, sink) and | ||
regexInjectionSink = sink.getNode() and | ||
methodAttribute = regexInjectionSink.getRegexMethod() | ||
select sink.getNode(), source, sink, | ||
"$@ regular expression is constructed from a $@ and executed by $@.", sink.getNode(), "This", | ||
source.getNode(), "user-provided value", methodAttribute, | ||
regexInjectionSink.getRegexModule() + "." + methodAttribute.getName() |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
from flask import request, Flask | ||
import re | ||
|
||
|
||
@app.route("/direct") | ||
def direct(): | ||
unsafe_pattern = request.args["pattern"] | ||
re.search(unsafe_pattern, "") | ||
|
||
|
||
@app.route("/compile") | ||
def compile(): | ||
unsafe_pattern = request.args["pattern"] | ||
compiled_pattern = re.compile(unsafe_pattern) | ||
compiled_pattern.search("") |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
from flask import request, Flask | ||
import re | ||
|
||
|
||
@app.route("/direct") | ||
def direct(): | ||
unsafe_pattern = request.args['pattern'] | ||
safe_pattern = re.escape(unsafe_pattern) | ||
re.search(safe_pattern, "") | ||
|
||
|
||
@app.route("/compile") | ||
def compile(): | ||
unsafe_pattern = request.args['pattern'] | ||
safe_pattern = re.escape(unsafe_pattern) | ||
compiled_pattern = re.compile(safe_pattern) | ||
compiled_pattern.search("") |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
53 changes: 53 additions & 0 deletions
53
python/ql/src/experimental/semmle/python/security/injection/RegexInjection.qll
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
/** | ||
* Provides a taint-tracking configuration for detecting regular expression injection | ||
* vulnerabilities. | ||
*/ | ||
|
||
import python | ||
import experimental.semmle.python.Concepts | ||
import semmle.python.dataflow.new.DataFlow | ||
import semmle.python.dataflow.new.TaintTracking | ||
import semmle.python.dataflow.new.RemoteFlowSources | ||
|
||
/** | ||
* A class to find methods executing regular expressions. | ||
* | ||
* See `RegexExecution` | ||
*/ | ||
class RegexInjectionSink extends DataFlow::Node { | ||
string regexModule; | ||
Attribute regexMethod; | ||
|
||
RegexInjectionSink() { | ||
exists(RegexExecution reExec | | ||
jorgectf marked this conversation as resolved.
Show resolved
Hide resolved
|
||
this = reExec.getRegexNode() and | ||
regexModule = reExec.getRegexModule() and | ||
regexMethod = reExec.(DataFlow::CallCfgNode).getFunction().asExpr().(Attribute) | ||
) | ||
} | ||
|
||
/** | ||
* Gets the argument containing the executed expression. | ||
*/ | ||
string getRegexModule() { result = regexModule } | ||
|
||
/** | ||
* Gets the method used to execute the regular expression. | ||
*/ | ||
Attribute getRegexMethod() { result = regexMethod } | ||
} | ||
|
||
/** | ||
* A taint-tracking configuration for detecting regular expression injections. | ||
*/ | ||
class RegexInjectionFlowConfig extends TaintTracking::Configuration { | ||
RegexInjectionFlowConfig() { this = "RegexInjectionFlowConfig" } | ||
|
||
override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource } | ||
|
||
override predicate isSink(DataFlow::Node sink) { sink instanceof RegexInjectionSink } | ||
|
||
override predicate isSanitizer(DataFlow::Node sanitizer) { | ||
sanitizer = any(RegexEscape reEscape).getRegexNode() | ||
} | ||
} |
27 changes: 27 additions & 0 deletions
27
python/ql/test/experimental/query-tests/Security/CWE-730/RegexInjection.expected
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
edges | ||
| re_bad.py:13:22:13:28 | ControlFlowNode for request | re_bad.py:13:22:13:33 | ControlFlowNode for Attribute | | ||
| re_bad.py:13:22:13:33 | ControlFlowNode for Attribute | re_bad.py:13:22:13:44 | ControlFlowNode for Subscript | | ||
| re_bad.py:13:22:13:44 | ControlFlowNode for Subscript | re_bad.py:14:15:14:28 | ControlFlowNode for unsafe_pattern | | ||
| re_bad.py:24:22:24:28 | ControlFlowNode for request | re_bad.py:24:22:24:33 | ControlFlowNode for Attribute | | ||
| re_bad.py:24:22:24:33 | ControlFlowNode for Attribute | re_bad.py:24:22:24:44 | ControlFlowNode for Subscript | | ||
| re_bad.py:24:22:24:44 | ControlFlowNode for Subscript | re_bad.py:25:35:25:48 | ControlFlowNode for unsafe_pattern | | ||
| re_bad.py:36:22:36:28 | ControlFlowNode for request | re_bad.py:36:22:36:33 | ControlFlowNode for Attribute | | ||
| re_bad.py:36:22:36:33 | ControlFlowNode for Attribute | re_bad.py:36:22:36:44 | ControlFlowNode for Subscript | | ||
| re_bad.py:36:22:36:44 | ControlFlowNode for Subscript | re_bad.py:37:16:37:29 | ControlFlowNode for unsafe_pattern | | ||
nodes | ||
| re_bad.py:13:22:13:28 | ControlFlowNode for request | semmle.label | ControlFlowNode for request | | ||
| re_bad.py:13:22:13:33 | ControlFlowNode for Attribute | semmle.label | ControlFlowNode for Attribute | | ||
| re_bad.py:13:22:13:44 | ControlFlowNode for Subscript | semmle.label | ControlFlowNode for Subscript | | ||
| re_bad.py:14:15:14:28 | ControlFlowNode for unsafe_pattern | semmle.label | ControlFlowNode for unsafe_pattern | | ||
| re_bad.py:24:22:24:28 | ControlFlowNode for request | semmle.label | ControlFlowNode for request | | ||
| re_bad.py:24:22:24:33 | ControlFlowNode for Attribute | semmle.label | ControlFlowNode for Attribute | | ||
| re_bad.py:24:22:24:44 | ControlFlowNode for Subscript | semmle.label | ControlFlowNode for Subscript | | ||
| re_bad.py:25:35:25:48 | ControlFlowNode for unsafe_pattern | semmle.label | ControlFlowNode for unsafe_pattern | | ||
| re_bad.py:36:22:36:28 | ControlFlowNode for request | semmle.label | ControlFlowNode for request | | ||
| re_bad.py:36:22:36:33 | ControlFlowNode for Attribute | semmle.label | ControlFlowNode for Attribute | | ||
| re_bad.py:36:22:36:44 | ControlFlowNode for Subscript | semmle.label | ControlFlowNode for Subscript | | ||
| re_bad.py:37:16:37:29 | ControlFlowNode for unsafe_pattern | semmle.label | ControlFlowNode for unsafe_pattern | | ||
#select | ||
| re_bad.py:14:15:14:28 | ControlFlowNode for unsafe_pattern | re_bad.py:13:22:13:28 | ControlFlowNode for request | re_bad.py:14:15:14:28 | ControlFlowNode for unsafe_pattern | $@ regular expression is constructed from a $@ and executed by $@. | re_bad.py:14:15:14:28 | ControlFlowNode for unsafe_pattern | This | re_bad.py:13:22:13:28 | ControlFlowNode for request | user-provided value | re_bad.py:14:5:14:13 | Attribute | re.search | | ||
| re_bad.py:25:35:25:48 | ControlFlowNode for unsafe_pattern | re_bad.py:24:22:24:28 | ControlFlowNode for request | re_bad.py:25:35:25:48 | ControlFlowNode for unsafe_pattern | $@ regular expression is constructed from a $@ and executed by $@. | re_bad.py:25:35:25:48 | ControlFlowNode for unsafe_pattern | This | re_bad.py:24:22:24:28 | ControlFlowNode for request | user-provided value | re_bad.py:26:5:26:27 | Attribute | re.search | | ||
| re_bad.py:37:16:37:29 | ControlFlowNode for unsafe_pattern | re_bad.py:36:22:36:28 | ControlFlowNode for request | re_bad.py:37:16:37:29 | ControlFlowNode for unsafe_pattern | $@ regular expression is constructed from a $@ and executed by $@. | re_bad.py:37:16:37:29 | ControlFlowNode for unsafe_pattern | This | re_bad.py:36:22:36:28 | ControlFlowNode for request | user-provided value | re_bad.py:37:5:37:37 | Attribute | re.search | |
1 change: 1 addition & 0 deletions
1
python/ql/test/experimental/query-tests/Security/CWE-730/RegexInjection.qlref
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
experimental/Security/CWE-730/RegexInjection.ql |
40 changes: 40 additions & 0 deletions
40
python/ql/test/experimental/query-tests/Security/CWE-730/re_bad.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
from flask import request, Flask | ||
import re | ||
|
||
app = Flask(__name__) | ||
|
||
|
||
@app.route("/direct") | ||
def direct(): | ||
""" | ||
A RemoteFlowSource is used directly as re.search's pattern | ||
""" | ||
|
||
unsafe_pattern = request.args["pattern"] | ||
re.search(unsafe_pattern, "") | ||
|
||
|
||
@app.route("/compile") | ||
def compile(): | ||
""" | ||
A RemoteFlowSource is used directly as re.compile's pattern | ||
which also executes .search() | ||
""" | ||
|
||
unsafe_pattern = request.args["pattern"] | ||
compiled_pattern = re.compile(unsafe_pattern) | ||
compiled_pattern.search("") | ||
|
||
|
||
@app.route("/compile_direct") | ||
def compile_direct(): | ||
""" | ||
A RemoteFlowSource is used directly as re.compile's pattern | ||
which also executes .search() in the same line | ||
""" | ||
|
||
unsafe_pattern = request.args["pattern"] | ||
re.compile(unsafe_pattern).search("") | ||
|
||
# if __name__ == "__main__": | ||
# app.run(debug=True) |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.