-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ruby: IncompleteHostnameRegExp.ql #7917
Conversation
272c173
to
daf0b6a
Compare
daf0b6a
to
156c943
Compare
QHelp previews: ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, pay special attention to the ExampleThe following example code checks that a URL redirection will reach the UNSAFE_REGEX = /(www|beta).example.com\//
SAFE_REGEX = /(www|beta)\.example\.com\//
def unsafe
target = params[:target]
if UNSAFE_REGEX.match(target)
redirect_to target
end
end
def safe
target = params[:target]
if SAFE_REGEX.match(target)
redirect_to target
end
end The The References
|
"(?<!\\\\)[.]" + | ||
// immediately followed by a sequence of subdomains, perhaps with some regex characters mixed in, followed by a known TLD | ||
"([():|?a-z0-9-]+(\\\\)?[.](" + commonTopLevelDomainRegex() + "))" + ".*", 1) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder, is it worth adding a shared HostnameRegExpUtils.qll
file to share this with the python implementation? I notice the JS version has a different implementation. I don't know whether we (the dynamic teams) want to consider consolidating them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of history: The one you copied originated in JS, written before we parsed string literals as RegExps. When we started extracting strings as RegExps, it was rewritten to its current form.
I'd say the new version is better, if you have ASTs for regexps.
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.ql
Outdated
Show resolved
Hide resolved
ruby/ql/src/change-notes/2022-02-10-incomplete-hostname-regexp.md
Outdated
Show resolved
Hide resolved
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelp
Outdated
Show resolved
Hide resolved
ruby/ql/src/change-notes/2022-02-10-incomplete-hostname-regexp.md
Outdated
Show resolved
Hide resolved
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the UNSAFE_REGEX = /(www|beta).example.com\//
SAFE_REGEX = /(www|beta)\.example\.com\//
def unsafe
target = params[:target]
if UNSAFE_REGEX.match(target)
redirect_to target
end
end
def safe
target = params[:target]
if SAFE_REGEX.match(target)
redirect_to target
end
end The The References
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for the Python docs changes.
605a2e0
to
6d3d2d5
Compare
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the UNSAFE_REGEX = /(www|beta).example.com\//
SAFE_REGEX = /(www|beta)\.example\.com\//
def unsafe
target = params[:target]
if UNSAFE_REGEX.match(target)
redirect_to target
end
end
def safe
target = params[:target]
if SAFE_REGEX.match(target)
redirect_to target
end
end The The References
|
6d3d2d5
to
e735678
Compare
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the class AppController < ApplicationController
def index
url = params[:url]
host = URI(url).host;
# BAD: the host of `url` may be controlled by an attacker
regex = /^((www|beta).)?example.com/
if host.match(regex)
redirect_to url
end
end
end The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
|
e735678
to
1df3f19
Compare
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the class AppController < ApplicationController
def index
url = params[:url]
host = URI(url).host;
# BAD: the host of `url` may be controlled by an attacker
regex = /^((www|beta).)?example.com/
if host.match(regex)
redirect_to url
end
end
end The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
|
1df3f19
to
3bd081f
Compare
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the class AppController < ApplicationController
def index
url = params[:url]
host = URI(url).host;
# BAD: the host of `url` may be controlled by an attacker
regex = /^((www|beta).)?example.com/
if host.match(regex)
redirect_to url
end
end
end The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
|
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the class AppController < ApplicationController
def index
url = params[:url]
host = URI(url).host
# BAD: the host of `url` may be controlled by an attacker
regex = /^((www|beta).)?example.com/
if host.match(regex)
redirect_to url
end
end
end The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments and suggestions. I did not review the query in detail. I have no doubt that the original JavaScript query is excellent. 🙂
My only lingering concern here is the fact that the string "\."
in JavaScript gets interpreted as just a string containing a period (the backslash disappears) whereas in Python, it is interpreted as the string "\\."
(i.e. the backslash is interpreted as just a literal backslash, since \.
is not a valid escape). However, I'm guessing that this detail is abstracted away by relying on the regex library anyway, and so it should be fine.
@@ -48,6 +49,19 @@ newtype TRegExpParent = | |||
/** A back reference */ | |||
TRegExpBackRef(Regex re, int start, int end) { re.backreference(start, end) } | |||
|
|||
/** | |||
* Provides regular expression patterns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very general description. So much so that I'm not actually sure what it means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I mindlessly copied it from
codeql/javascript/ql/lib/semmle/javascript/Regexp.qll
Lines 992 to 1003 in df9533f
/** | |
* Provides regular expression patterns. | |
*/ | |
module RegExpPatterns { | |
/** | |
* Gets a pattern that matches common top-level domain names in lower case. | |
*/ | |
string commonTLD() { | |
// according to ranking by http://google.com/search?q=site:.<<TLD>> | |
result = "(?:com|org|edu|gov|uk|net|io)(?![a-z0-9])" | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would Provides utility predicates related to regular expressions.
be any better?
/** | ||
* Gets a pattern that matches common top-level domain names in lower case. | ||
*/ | ||
string commonTLD() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is not really in line with our style guide. At the very least, it should be Tld
not TLD
. Also, since this function returns a result, it should perhaps start with get
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️ 👍
it should perhaps start with get.
And since it has multiple results it should probably start with getA
. So getACommonTld()
.
@@ -751,6 +767,9 @@ class RegExpGroup extends RegExpTerm, TRegExpGroup { | |||
*/ | |||
int getNumber() { result = re.getGroupNumber(start, end) } | |||
|
|||
/** Holds if this is a capture group. */ | |||
predicate isCapture() { not exists(this.getNumber()) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be completely misunderstanding this predicate, but is that not
correct? Is this not expressing "not an unnamed capture group" instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe all capture groups have a number, but only some have a name. Is that assumption incorrect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, the not
makes no sense at all. You're right.
* A node whose value may flow to a position where it is interpreted | ||
* as a part of a regular expression. | ||
*/ | ||
class RegExpPatternSource extends DataFlow::CfgNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this class should be implemented in this file. I would rather that it live somewhere closer to the rest of the data-flow implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Javascript it is in javascript/ql/lib/semmle/javascript/Regexp.qll and this file looks like the Python variant of that. I don't mind moving it to another place, but could you tell me where you'd like me to move it to?
predicate isConstantInvalidInsideOrigin(RegExpConstant term) { | ||
// Look for any of these cases: | ||
// - A character that can't occur in the origin | ||
// - Two dashes in a row |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one surprises me. What about URLs with Punycode such as https://xn--wrdle-vua.dk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@erik-krogh , do you know ^ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know double dashes are valid in the hostname, but in practice it's an indicator that this regexp is not used to check a hostname string, at least not in a security-relevant context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JS changes 👍
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the class AppController < ApplicationController
def index
url = params[:url]
host = URI(url).host
# BAD: the host of `url` may be controlled by an attacker
regex = /^((www|beta).)?example.com/
if host.match(regex)
redirect_to url
end
end
end The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
|
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the class AppController < ApplicationController
def index
url = params[:url]
host = URI(url).host
# BAD: the host of `url` may be controlled by an attacker
regex = /^((www|beta).)?example.com/
if host.match(regex)
redirect_to url
end
end
end The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
|
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the class AppController < ApplicationController
def index
url = params[:url]
host = URI(url).host
# BAD: the host of `url` may be controlled by an attacker
regex = /^((www|beta).)?example.com/
if host.match(regex)
redirect_to url
end
end
end The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
|
7a03951
to
752f5c6
Compare
non-capture groups should not have a group number
752f5c6
to
1a51f0c
Compare
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the class AppController < ApplicationController
def index
url = params[:url]
host = URI(url).host
# BAD: the host of `url` may be controlled by an attacker
regex = /^((www|beta).)?example.com/
if host.match(regex)
redirect_to url
end
end
end The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, LGTM.
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelp
Outdated
Show resolved
Hide resolved
QHelp previews: javascript/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the app.get('/some/path', function(req, res) {
let url = req.param('url'),
host = urlLib.parse(url).host;
// BAD: the host of `url` may be controlled by an attacker
let regex = /^((www|beta).)?example.com/;
if (host.match(regex)) {
res.redirect(url);
}
}); The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
python/ql/src/Security/CWE-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is a common technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the from flask import Flask, request, redirect
import re
app = Flask(__name__)
UNSAFE_REGEX = re.compile("(www|beta).example.com/")
SAFE_REGEX = re.compile(r"(www|beta)\.example\.com/")
@app.route('/some/path/bad')
def unsafe(request):
target = request.args.get('target', '')
if UNSAFE_REGEX.match(target):
return redirect(target)
@app.route('/some/path/good')
def safe(request):
target = request.args.get('target', '')
if SAFE_REGEX.match(target):
return redirect(target) The The References
ruby/ql/src/queries/security/cwe-020/IncompleteHostnameRegExp.qhelpIncomplete regular expression for hostnamesSanitizing untrusted URLs is an important technique for preventing attacks such as request forgeries and malicious redirections. Often, this is done by checking that the host of a URL is in a set of allowed hosts. If a regular expression implements such a check, it is easy to accidentally make the check too permissive by not escaping the RecommendationEscape all meta-characters appropriately when constructing regular expressions for security checks, and pay special attention to the ExampleThe following example code checks that a URL redirection will reach the class AppController < ApplicationController
def index
url = params[:url]
host = URI(url).host
# BAD: the host of `url` may be controlled by an attacker
regex = /^((www|beta).)?example.com/
if host.match(regex)
redirect_to url
end
end
end The check is however easy to bypass because the unescaped Address this vulnerability by escaping References
|
I reverted the Python related changes and make a separate PR for them.
This pull request is a port of JavaScripts IncompleteHostnameRegExp.ql query. Most of the query and documentation are a shameless copy of the original.
The pull request is split into: