URL detection accepts invalid pchar characters including > (including when used in angle brackets round whole URL) #4475

Open
davidw65 opened this issue Jun 6, 2023 · 0 comments

Comments


davidw65 commented Jun 6, 2023

The > character is not matched by the pchar production for URLs, and therefore cannot occur in, or at the end of, the path, query, or fragment components. For that reason it is commonly used as the closing half of angle brackets around a URL in text, to unambiguously separate the URL from the surrounding text and, in particular, from punctuation characters (and also to prevent newlines ending the URL, although that is not an issue here).

If a > character has to be included in a URL, it must be percent-encoded (as %3E).
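
To make the pchar argument concrete, here is a minimal Python sketch of the literal character set allowed by the RFC 3986 pchar production (unreserved, sub-delims, ":" and "@"; percent-encoded triplets are handled separately). The names `PCHAR_LITERALS` and `is_pchar_literal` are made up for illustration and are not taken from this project's code.

```python
import string
from urllib.parse import quote

# RFC 3986: pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")
PCHAR_LITERALS = UNRESERVED | SUB_DELIMS | set(":@")


def is_pchar_literal(ch: str) -> bool:
    """Return True if ch may appear unencoded where the grammar expects a pchar."""
    return ch in PCHAR_LITERALS


print(is_pchar_literal(">"))  # False: must be percent-encoded
print(is_pchar_literal(","))  # True: "," is a sub-delim
print(quote(">", safe=""))    # %3E
```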

When you enter a URL into an update (original reports not tested) and surround it with angle brackets, the closing > is interpreted as part of the URL. When the update is viewed in a browser, the HTML sent to it contains an href attribute that includes the >, without any percent-encoding, along with the characters that follow it.
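
As a rough sketch of the behaviour I would expect (this is not the project's actual detector, which I have not looked at), a matcher restricted to the characters RFC 3986 allows in a URI stops at the closing > on its own. `URL_RE`, `find_urls`, and the example.com address are hypothetical.

```python
import re

# Characters that may appear literally in an RFC 3986 URI:
# unreserved, gen-delims, sub-delims, plus "%" for percent-encoded triplets.
# Neither "<" nor ">" is in this set, so a match can never run past them.
URL_RE = re.compile(r"https?://[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]+")


def find_urls(text: str) -> list[str]:
    """Return every http(s) URL candidate found in text."""
    return URL_RE.findall(text)


print(find_urls("See <https://example.com/a?b=1> and carry on."))
# ['https://example.com/a?b=1']
```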

[snip reproducing on live site; the issue is clear from the description alone]

Additional context

I initially tried this because I had previously had a problem with a "," after a URL. "," is a sub-delim, and therefore a pchar, but one could argue that heuristics ought to treat it as punctuation if it is followed by linear white space. I think most URL detectors treat a "." at the end as punctuation, although it is an unreserved character and so would be valid.
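
One possible form of that heuristic, as a sketch only: strip trailing punctuation from a URL candidate when white space (or the end of the text) follows it. The function name, its two-argument shape, and the choice of punctuation characters are my assumptions, not anything the project defines.

```python
# Characters treated as sentence punctuation when they end a URL candidate.
# All of them are valid URL characters, so this is a heuristic, not a grammar rule.
TRAILING_PUNCT = ".,;:!?"


def trim_trailing_punctuation(candidate: str, following: str) -> str:
    """Drop trailing punctuation from candidate if white space or end of text follows.

    candidate is the run of characters matched as a URL; following is the
    text immediately after it.
    """
    if following == "" or following[0].isspace():
        return candidate.rstrip(TRAILING_PUNCT)
    return candidate


print(trim_trailing_punctuation("https://example.com/page,", " and more"))
# https://example.com/page
print(trim_trailing_punctuation("https://example.com/a,b", ""))
# https://example.com/a,b
```

A comma in the middle of the URL is left alone; only a run of punctuation at the very end is dropped.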

@davidw65 davidw65 changed the title JR> URL detection accepts invalid pchar characters including > preventing when used as angle brackets round whole URL Jun 6, 2023
@davidw65 davidw65 changed the title URL detection accepts invalid pchar characters including > preventing when used as angle brackets round whole URL URL detection accepts invalid pchar characters including > including when used in angle brackets round whole URL Jun 6, 2023
@davidw65 davidw65 changed the title URL detection accepts invalid pchar characters including > including when used in angle brackets round whole URL URL detection accepts invalid pchar characters including > (including when used in angle brackets round whole URL) Jun 6, 2023