Skip to content

Conversation

hmac
Copy link
Contributor

@hmac hmac commented Mar 27, 2022

This is a port of the JS query of the same name. A few adjustments are needed to adapt to our dataflow and regex libraries, so it's not so easy to share the implementation. I've also added a Ruby-specific check for regexes that use ^ and $, which match the start/end of a line rather than the whole string. These regexes should probably use \A and \z instead.

@github-actions
Copy link
Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-020/MissingRegExpAnchor.qhelp

Missing regular expression anchor

Sanitizing untrusted input with regular expressions is a common technique. However, it is error-prone to match untrusted input against regular expressions without anchors such as ^ or $. Malicious input can bypass such security checks by embedding one of the allowed patterns in an unexpected location.

Even if the matching is not done in a security-critical context, it may still cause undesirable behavior when the regular expression accidentally matches.

Recommendation

Use anchors to ensure that regular expressions match at the expected locations.

Example

The following example code checks that a URL redirection will reach the example.com domain, or one of its subdomains, and not some malicious site.

class UsersController < ActionController::Base
    def index
        # BAD: the host of `params[:url]` may be controlled by an attacker
        if params[:url].match? /https?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

The check with the regular expression match is, however, easy to bypass. For example by embedding http://example.com/ in the query string component: http://evil-example.net/?x=http://example.com/. Address these shortcomings by using anchors in the regular expression instead:

class UsersController < ActionController::Base
    def index
        # GOOD: the host of `params[:url]` can not be controlled by an attacker
        if params[:url].match? /\Ahttps?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

A related mistake is to write a regular expression with multiple alternatives, but to only include an anchor for one of the alternatives. As an example, the regular expression /^www\.example\.com|beta\.example\.com/ will match the host evil.beta.example.com because the regular expression is parsed as /(^www\.example\.com)|(beta\.example\.com)/ TODO: implement this part of the query

TODO: describe the danger of using line anchors like ^ or $.

References

1 similar comment
@github-actions
Copy link
Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-020/MissingRegExpAnchor.qhelp

Missing regular expression anchor

Sanitizing untrusted input with regular expressions is a common technique. However, it is error-prone to match untrusted input against regular expressions without anchors such as ^ or $. Malicious input can bypass such security checks by embedding one of the allowed patterns in an unexpected location.

Even if the matching is not done in a security-critical context, it may still cause undesirable behavior when the regular expression accidentally matches.

Recommendation

Use anchors to ensure that regular expressions match at the expected locations.

Example

The following example code checks that a URL redirection will reach the example.com domain, or one of its subdomains, and not some malicious site.

class UsersController < ActionController::Base
    def index
        # BAD: the host of `params[:url]` may be controlled by an attacker
        if params[:url].match? /https?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

The check with the regular expression match is, however, easy to bypass. For example by embedding http://example.com/ in the query string component: http://evil-example.net/?x=http://example.com/. Address these shortcomings by using anchors in the regular expression instead:

class UsersController < ActionController::Base
    def index
        # GOOD: the host of `params[:url]` can not be controlled by an attacker
        if params[:url].match? /\Ahttps?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

A related mistake is to write a regular expression with multiple alternatives, but to only include an anchor for one of the alternatives. As an example, the regular expression /^www\.example\.com|beta\.example\.com/ will match the host evil.beta.example.com because the regular expression is parsed as /(^www\.example\.com)|(beta\.example\.com)/ TODO: implement this part of the query

TODO: describe the danger of using line anchors like ^ or $.

References

@github-actions
Copy link
Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-020/MissingRegExpAnchor.qhelp

Missing regular expression anchor

Sanitizing untrusted input with regular expressions is a common technique. However, it is error-prone to match untrusted input against regular expressions without anchors such as ^ or $. Malicious input can bypass such security checks by embedding one of the allowed patterns in an unexpected location.

Even if the matching is not done in a security-critical context, it may still cause undesirable behavior when the regular expression accidentally matches.

Recommendation

Use anchors to ensure that regular expressions match at the expected locations.

Example

The following example code checks that a URL redirection will reach the example.com domain, or one of its subdomains, and not some malicious site.

class UsersController < ActionController::Base
    def index
        # BAD: the host of `params[:url]` may be controlled by an attacker
        if params[:url].match? /https?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

The check with the regular expression match is, however, easy to bypass. For example by embedding http://example.com/ in the query string component: http://evil-example.net/?x=http://example.com/. Address these shortcomings by using anchors in the regular expression instead:

class UsersController < ActionController::Base
    def index
        # GOOD: the host of `params[:url]` can not be controlled by an attacker
        if params[:url].match? /\Ahttps?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

A related mistake is to write a regular expression with multiple alternatives, but to only include an anchor for one of the alternatives. As an example, the regular expression /^www\.example\.com|beta\.example\.com/ will match the host evil.beta.example.com because the regular expression is parsed as /(^www\.example\.com)|(beta\.example\.com)/

TODO: describe the danger of using line anchors like ^ or $.

References

@hmac hmac force-pushed the hmac/missing-regexp-anchor branch from 5edf926 to 6535c7d Compare March 28, 2022 23:07
@github-actions
Copy link
Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-020/MissingRegExpAnchor.qhelp

Missing regular expression anchor

Sanitizing untrusted input with regular expressions is a common technique. However, it is error-prone to match untrusted input against regular expressions without anchors such as ^ or $. Malicious input can bypass such security checks by embedding one of the allowed patterns in an unexpected location.

Even if the matching is not done in a security-critical context, it may still cause undesirable behavior when the regular expression accidentally matches.

Recommendation

Use anchors to ensure that regular expressions match at the expected locations.

Example

The following example code checks that a URL redirection will reach the example.com domain, or one of its subdomains, and not some malicious site.

class UsersController < ActionController::Base
    def index
        # BAD: the host of `params[:url]` may be controlled by an attacker
        if params[:url].match? /https?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

The check with the regular expression match is, however, easy to bypass. For example by embedding http://example.com/ in the query string component: http://evil-example.net/?x=http://example.com/. Address these shortcomings by using anchors in the regular expression instead:

class UsersController < ActionController::Base
    def index
        # GOOD: the host of `params[:url]` can not be controlled by an attacker
        if params[:url].match? /\Ahttps?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

A related mistake is to write a regular expression with multiple alternatives, but to only include an anchor for one of the alternatives. As an example, the regular expression /^www\.example\.com|beta\.example\.com/ will match the host evil.beta.example.com because the regular expression is parsed as /(^www\.example\.com)|(beta\.example\.com)/

TODO: describe the danger of using line anchors like ^ or $.

References

@hmac hmac force-pushed the hmac/missing-regexp-anchor branch from 6535c7d to b7a5bc4 Compare March 28, 2022 23:28
@github-actions
Copy link
Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-020/MissingRegExpAnchor.qhelp

Missing regular expression anchor

Sanitizing untrusted input with regular expressions is a common technique. However, it is error-prone to match untrusted input against regular expressions without anchors such as ^ or $. Malicious input can bypass such security checks by embedding one of the allowed patterns in an unexpected location.

Even if the matching is not done in a security-critical context, it may still cause undesirable behavior when the regular expression accidentally matches.

Recommendation

Use anchors to ensure that regular expressions match at the expected locations.

Example

The following example code checks that a URL redirection will reach the example.com domain, or one of its subdomains, and not some malicious site.

class UsersController < ActionController::Base
    def index
        # BAD: the host of `params[:url]` may be controlled by an attacker
        if params[:url].match? /https?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

The check with the regular expression match is, however, easy to bypass. For example by embedding http://example.com/ in the query string component: http://evil-example.net/?x=http://example.com/. Address these shortcomings by using anchors in the regular expression instead:

class UsersController < ActionController::Base
    def index
        # GOOD: the host of `params[:url]` can not be controlled by an attacker
        if params[:url].match? /\Ahttps?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

A related mistake is to write a regular expression with multiple alternatives, but to only include an anchor for one of the alternatives. As an example, the regular expression /^www\.example\.com|beta\.example\.com/ will match the host evil.beta.example.com because the regular expression is parsed as /(^www\.example\.com)|(beta\.example\.com)/

TODO: describe the danger of using line anchors like ^ or $.

References

@hmac hmac force-pushed the hmac/missing-regexp-anchor branch from b7a5bc4 to f3a5b41 Compare April 3, 2022 23:28
@github-actions
Copy link
Contributor

github-actions bot commented Apr 3, 2022

QHelp previews:

ruby/ql/src/queries/security/cwe-020/MissingRegExpAnchor.qhelp

Missing regular expression anchor

Sanitizing untrusted input with regular expressions is a common technique. However, it is error-prone to match untrusted input against regular expressions without anchors such as ^ or $. Malicious input can bypass such security checks by embedding one of the allowed patterns in an unexpected location.

Even if the matching is not done in a security-critical context, it may still cause undesirable behavior when the regular expression accidentally matches.

Recommendation

Use anchors to ensure that regular expressions match at the expected locations.

Example

The following example code checks that a URL redirection will reach the example.com domain, or one of its subdomains, and not some malicious site.

class UsersController < ActionController::Base
    def index
        # BAD: the host of `params[:url]` may be controlled by an attacker
        if params[:url].match? /https?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

The check with the regular expression match is, however, easy to bypass. For example by embedding http://example.com/ in the query string component: http://evil-example.net/?x=http://example.com/. Address these shortcomings by using anchors in the regular expression instead:

class UsersController < ActionController::Base
    def index
        # GOOD: the host of `params[:url]` can not be controlled by an attacker
        if params[:url].match? /\Ahttps?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

A related mistake is to write a regular expression with multiple alternatives, but to only include an anchor for one of the alternatives. As an example, the regular expression /^www\.example\.com|beta\.example\.com/ will match the host evil.beta.example.com because the regular expression is parsed as /(^www\.example\.com)|(beta\.example\.com)/

TODO: describe the danger of using line anchors like ^ or $.

References

1 similar comment
@github-actions
Copy link
Contributor

github-actions bot commented Apr 3, 2022

QHelp previews:

ruby/ql/src/queries/security/cwe-020/MissingRegExpAnchor.qhelp

Missing regular expression anchor

Sanitizing untrusted input with regular expressions is a common technique. However, it is error-prone to match untrusted input against regular expressions without anchors such as ^ or $. Malicious input can bypass such security checks by embedding one of the allowed patterns in an unexpected location.

Even if the matching is not done in a security-critical context, it may still cause undesirable behavior when the regular expression accidentally matches.

Recommendation

Use anchors to ensure that regular expressions match at the expected locations.

Example

The following example code checks that a URL redirection will reach the example.com domain, or one of its subdomains, and not some malicious site.

class UsersController < ActionController::Base
    def index
        # BAD: the host of `params[:url]` may be controlled by an attacker
        if params[:url].match? /https?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

The check with the regular expression match is, however, easy to bypass. For example by embedding http://example.com/ in the query string component: http://evil-example.net/?x=http://example.com/. Address these shortcomings by using anchors in the regular expression instead:

class UsersController < ActionController::Base
    def index
        # GOOD: the host of `params[:url]` can not be controlled by an attacker
        if params[:url].match? /\Ahttps?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

A related mistake is to write a regular expression with multiple alternatives, but to only include an anchor for one of the alternatives. As an example, the regular expression /^www\.example\.com|beta\.example\.com/ will match the host evil.beta.example.com because the regular expression is parsed as /(^www\.example\.com)|(beta\.example\.com)/

TODO: describe the danger of using line anchors like ^ or $.

References

@github-actions
Copy link
Contributor

github-actions bot commented Apr 4, 2022

QHelp previews:

ruby/ql/src/queries/security/cwe-020/MissingRegExpAnchor.qhelp

Missing regular expression anchor

Sanitizing untrusted input with regular expressions is a common technique. However, it is error-prone to match untrusted input against regular expressions without anchors such as \A or \z. Malicious input can bypass such security checks by embedding one of the allowed patterns in an unexpected location.

Even if the matching is not done in a security-critical context, it may still cause undesirable behavior when the regular expression accidentally matches.

Recommendation

Use anchors to ensure that regular expressions match at the expected locations.

Example

The following example code checks that a URL redirection will reach the example.com domain, or one of its subdomains, and not some malicious site.

class UsersController < ActionController::Base
    def index
        # BAD: the host of `params[:url]` may be controlled by an attacker
        if params[:url].match? /https?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

The check with the regular expression match is, however, easy to bypass. For example by embedding http://example.com/ in the query string component: http://evil-example.net/?x=http://example.com/. Address these shortcomings by using anchors in the regular expression instead:

class UsersController < ActionController::Base
    def index
        # GOOD: the host of `params[:url]` can not be controlled by an attacker
        if params[:url].match? /\Ahttps?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

A related mistake is to write a regular expression with multiple alternatives, but to only include an anchor for one of the alternatives. As an example, the regular expression /^www\.example\.com|beta\.example\.com/ will match the host evil.beta.example.com because the regular expression is parsed as /(^www\.example\.com)|(beta\.example\.com)/

In Ruby the anchors ^ and $ match the start and end of a line, whereas the anchors \A and \z match the start and end of the entire string. Using line anchors can be dangerous, as this can allow malicious input to be hidden using newlines, leading to vulnerabilities such as HTTP header injection. Unless you specifically need the line-matching behaviour of ^ and $, you should use \A and \z instead.

References

@hmac hmac marked this pull request as ready for review April 14, 2022 03:31
@hmac hmac requested a review from a team as a code owner April 14, 2022 03:31
@github-actions
Copy link
Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-020/MissingRegExpAnchor.qhelp

Missing regular expression anchor

Sanitizing untrusted input with regular expressions is a common technique. However, it is error-prone to match untrusted input against regular expressions without anchors such as \A or \z. Malicious input can bypass such security checks by embedding one of the allowed patterns in an unexpected location.

Even if the matching is not done in a security-critical context, it may still cause undesirable behavior when the regular expression accidentally matches.

Recommendation

Use anchors to ensure that regular expressions match at the expected locations.

Example

The following example code checks that a URL redirection will reach the example.com domain, or one of its subdomains, and not some malicious site.

class UsersController < ActionController::Base
    def index
        # BAD: the host of `params[:url]` may be controlled by an attacker
        if params[:url].match? /https?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

The check with the regular expression match is, however, easy to bypass. For example by embedding http://example.com/ in the query string component: http://evil-example.net/?x=http://example.com/. Address these shortcomings by using anchors in the regular expression instead:

class UsersController < ActionController::Base
    def index
        # GOOD: the host of `params[:url]` can not be controlled by an attacker
        if params[:url].match? /\Ahttps?:\/\/www\.example\.com\//
            redirect_to params[:url]
        end
    end
end

A related mistake is to write a regular expression with multiple alternatives, but to only include an anchor for one of the alternatives. As an example, the regular expression /^www\.example\.com|beta\.example\.com/ will match the host evil.beta.example.com because the regular expression is parsed as /(^www\.example\.com)|(beta\.example\.com)/

In Ruby the anchors ^ and $ match the start and end of a line, whereas the anchors \A and \z match the start and end of the entire string. Using line anchors can be dangerous, as this can allow malicious input to be hidden using newlines, leading to vulnerabilities such as HTTP header injection. Unless you specifically need the line-matching behaviour of ^ and $, you should use \A and \z instead.

References

alexrford
alexrford previously approved these changes Apr 26, 2022
Copy link
Contributor

@alexrford alexrford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I like the addition of checks for ^/$ rather than \A/\z.

There's an unrelated build failure in the DCA run that may be fixed by a rebase.

@hmac hmac force-pushed the hmac/missing-regexp-anchor branch from 5dffc67 to bbc3043 Compare April 26, 2022 22:33
@hmac
Copy link
Contributor Author

hmac commented Apr 27, 2022

There's an unrelated build failure in the DCA run that may be fixed by a rebase.

It finally succeeded! It feels like I've been attempting DCA runs on this PR since the dawn of time...

The vast majority of results are in tests, but I've checked a few of the non-test ones and they look legitimate.

@hmac hmac merged commit f4453f4 into github:main Apr 27, 2022
@hmac hmac deleted the hmac/missing-regexp-anchor branch April 27, 2022 23:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants