Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ReDoS linter + remove complex regexes #3602

Merged
merged 9 commits into from Oct 20, 2021

Conversation

antgamdia
Copy link
Contributor

@antgamdia antgamdia commented Oct 18, 2021

Description of the change

After an open issue, we discovered we are adding some ReDoS vulnerable regexes, meaning some malformed (or even legal) inputs could make Kubeapps crash. Since it is mainly client code, the worst-case scenario is a browser tab crash, but, of course, it is sth we should prevent from happening.

This PR is just to add a linter checking the existence of regex that could result in a problem (aka having quadratic or exponential complexity).

Benefits

We will detect problematic regex at build time.

Possible drawbacks

Perhaps the lining step will take a bit longer, but I think it is worth it.

Applicable issues

Additional information

This PR won't pass the tests since currently, we have some vulnerable regexes. At some point it will, though (hopefully).

Whereas this PR was initially for adding the linter, I take the opportunity to also remove the complex regexes we were using.

Signed-off-by: Antonio Gamez Diaz <agamez@vmware.com>
Signed-off-by: Antonio Gamez Diaz <agamez@vmware.com>
Signed-off-by: Antonio Gamez Diaz <agamez@vmware.com>
@absoludity
Copy link
Contributor

This PR won't pass the tests since currently, we have some vulnerable regexes. At some point it will, though (hopefully).

Can we just update the offending lines in this PR so it's landable? Or is there an issue doing so?

@antgamdia
Copy link
Contributor Author

Can we just update the offending lines in this PR so it's landable? Or is there an issue doing so?

We could ignore the offending lines, but solving them would require some extra effort. In fact, solving it would address this issue #3587 (comment).
The main problem here is coming up with a regex validating a "valid URL, but without including some characters and with a complexity less than O(x^n)". The other one should be easier to fix, though.

Signed-off-by: Antonio Gamez Diaz <agamez@vmware.com>
Signed-off-by: Antonio Gamez Diaz <agamez@vmware.com>
@antgamdia antgamdia changed the title Add ReDoS linter Add ReDoS linter + remove complex regexes Oct 19, 2021
Comment on lines +53 to +72
export function ShouldGenerateLink(url: string): boolean {
// Check if it is a valid URL, delegating the check to the browser API
let builtURL;
try {
// Note this browser API constructor will accept many invalid URLs (not well-encoded, wrong IP octets, etc.)
builtURL = new URL(url);
} catch (_) {
return false;
}
if (!(builtURL.protocol === "http:" || builtURL.protocol === "https:")) {
return false;
}

// Check if any url fragment includes a character likely used in a regex, that is, is not encoded
// c.f. https://datatracker.ietf.org/doc/html/rfc3986#section-2.2
// even if it won't catch every possible case, we avoid having an O(x^n) regex
const regex = /^[^:/?#[\]@!$&'*+,;=]*$/;

// If it is an URL but it includes some "ilegal" characters, then return false
return !builtURL.pathname.split("/")?.some(p => !regex.test(p));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the new proposed way to check if the URL is "renderable" or not.
Instead of using a complex regex, we just delegate the tedious check to the browser API and then just apply an O(n) regex to check if it contains some illegal (aka non-encoded) character.

Some false positives and negatives could arise, but the drawback is just a link being rendered (nor not) in the UI. I mean, the text will always appear; it is just the href target. We better fail in generating the link rather than blocking the UI because of a malformed string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, sounds good.

"base64url.bearer.authorization.k8s.io." + btoa(token).replace(/=*$/g, ""),
// Trimming the b64 padding character ("=") as it is not accepted by k8s
// https://github.com/kubernetes/apiserver/blob/master/pkg/authentication/request/websocket/protocol.go#L38
"base64url.bearer.authorization.k8s.io." + btoa(token).replace("=", ""),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that = is always a padding character in base64, we can safely remove it without any fancy (but O(n²) haha regex)

image
.

@@ -1,9 +1,6 @@
import { IIngressSpec, IIngressTLS, IResource } from "shared/types";
import { IURLItem } from "./IURLItem";

const isURLRegex =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to remove this monolith:

image (1)

Signed-off-by: Antonio Gamez Diaz <agamez@vmware.com>
Copy link
Contributor

@absoludity absoludity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice solution.

} catch (_) {
return false;
}
if (!(builtURL.protocol === "http:" || builtURL.protocol === "https:")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we don't need to worry about data: urls, given they weren't matched by the previous regex.

Comment on lines +53 to +72
export function ShouldGenerateLink(url: string): boolean {
// Check if it is a valid URL, delegating the check to the browser API
let builtURL;
try {
// Note this browser API constructor will accept many invalid URLs (not well-encoded, wrong IP octets, etc.)
builtURL = new URL(url);
} catch (_) {
return false;
}
if (!(builtURL.protocol === "http:" || builtURL.protocol === "https:")) {
return false;
}

// Check if any url fragment includes a character likely used in a regex, that is, is not encoded
// c.f. https://datatracker.ietf.org/doc/html/rfc3986#section-2.2
// even if it won't catch every possible case, we avoid having an O(x^n) regex
const regex = /^[^:/?#[\]@!$&'*+,;=]*$/;

// If it is an URL but it includes some "ilegal" characters, then return false
return !builtURL.pathname.split("/")?.some(p => !regex.test(p));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, sounds good.

@@ -42,7 +42,9 @@ export class Auth {
return [];
}
return [
"base64url.bearer.authorization.k8s.io." + btoa(token).replace(/=*$/g, ""),
// Trimming the b64 padding character ("=") as it is not accepted by k8s
// https://github.com/kubernetes/apiserver/blob/master/pkg/authentication/request/websocket/protocol.go#L38
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since code changes in main branches, best to link to the code in the latest tag imo. (much less likely to change).

"base64url.bearer.authorization.k8s.io." + btoa(token).replace(/=*$/g, ""),
// Trimming the b64 padding character ("=") as it is not accepted by k8s
// https://github.com/kubernetes/apiserver/blob/master/pkg/authentication/request/websocket/protocol.go#L38
"base64url.bearer.authorization.k8s.io." + btoa(token).replaceAll("=", ""),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering why you weren't just removing them from the end, but given that = isn't part of radix-64, they can only be present as padding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it won't appear if the string is a multiple of a 3 character number since b64 encodes each 3 bytes. There is a good answer here: https://stackoverflow.com/questions/6916805/why-does-a-base64-encoded-string-have-an-sign-at-the-end
As you said, "=" can only be padding, so it is safe to be wiped out.

@antgamdia antgamdia merged commit 04e8109 into vmware-tanzu:master Oct 20, 2021
@antgamdia antgamdia deleted the addRedosLinter branch October 20, 2021 23:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UI: InternalError too much recursion when the ns exceeds some length
2 participants