Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate support for UTF16 encoding in URIs #6447

Closed
gregw opened this issue Jun 21, 2021 · 9 comments
Closed

Deprecate support for UTF16 encoding in URIs #6447

gregw opened this issue Jun 21, 2021 · 9 comments

Comments

@gregw
Copy link
Contributor

gregw commented Jun 21, 2021

Target Jetty version(s)
9, 10, 11

Enhancement Description

The support for unicode encoded characters in URIs (eg %u2192 for the →character) have never been standardized and are little used. Unicode (or more precisely UTF-16) encoding was rejected by the W3C.

Deprecate their usage and require a compliance mode.

@gregw gregw changed the title Deprecate support for unicode encodings in URIs Deprecate support for UTF16 encoding in URIs Jun 21, 2021
@gregw gregw added this to To do in Jetty 9.4.43 FROZEN via automation Jun 21, 2021
@gregw gregw added this to To do in Jetty 10.0.6/11.0.6 FROZEN via automation Jun 21, 2021
@gregw gregw moved this from To do to In progress in Jetty 9.4.43 FROZEN Jun 21, 2021
@gregw gregw moved this from To do to In progress in Jetty 10.0.6/11.0.6 FROZEN Jun 21, 2021
lachlan-roberts added a commit that referenced this issue Jun 22, 2021
Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
lachlan-roberts added a commit that referenced this issue Jun 23, 2021
Deprecate support for UTF16 encoding in URIs.
Add compliance mode to allow UTF16 encodings.
Improve testing.
lachlan-roberts added a commit that referenced this issue Jun 24, 2021
- Merge from PR #6457.
- Also brought some other ComplianceModes back to disable ambiguous empty segments, and ambiguous encodings.

Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
@gregw gregw moved this from In progress to Review in progress in Jetty 10.0.6/11.0.6 FROZEN Jun 24, 2021
@gregw gregw moved this from Review in progress to Done in Jetty 10.0.6/11.0.6 FROZEN Jun 24, 2021
@gregw gregw moved this from In progress to Review in progress in Jetty 9.4.43 FROZEN Jun 24, 2021
@gregw gregw moved this from Review in progress to Reviewer approved in Jetty 9.4.43 FROZEN Jun 24, 2021
gregw pushed a commit that referenced this issue Jun 24, 2021
- Merge from PR #6457.
- Also brought some other ComplianceModes back to disable ambiguous empty segments, and ambiguous encodings.

Signed-off-by: Lachlan Roberts <lachlan@webtide.com>
@gregw gregw moved this from Reviewer approved to Done in Jetty 9.4.43 FROZEN Jun 24, 2021
@lachlan-roberts
Copy link
Contributor

UTF16 encoding has been deprecated and is now disabled in 10.0.x/11.0.x by default. It can be re-enabled by using a UriCompliance mode to enable Violation.UTF16_ENCODINGS.

@guusdk
Copy link

guusdk commented May 12, 2023

Apologies for reviving an old issue. I'm struggling with understanding in what versions exactly UTF-16 URL-encoding has been deprecated/disabled. This issue, as well as the changelog suggests that it has been removed in 9.4.43 as well as 10.0.x and 11.0.x. However, the default configuration of a 9.4.43 does appear to retain UTF-16 URL encoding. This matches with @lachlan-roberts 's comment, which mentions deprecating the feature only in 10.0.x and 11.0.x (and does not mention 9.4.43). Was support deprecated in 9.4.43?

@gregw
Copy link
Contributor Author

gregw commented May 12, 2023

UTF-16 is still supported in the latest 9.4. release (9.4.51), and is supported through to jetty-12.
What has changed, is if support is on/off by default or not. If you want UTF-16 support, then it is available with the right compliance mode.

The deprecation was just moving support from being always on to a compliance mode.

@guusdk
Copy link

guusdk commented May 12, 2023

Thanks Greg! I worded it awkwardly, but that is the gist of what I had assumed: Instead of having UTF-16 URL encoding support enabled by default, it now is disabled by default.

What is confusing me is that the changelog suggests that this change is introduced in 9.4.43. However, when using 9.4.43 (without any explicit HttpCompliance configured, in other words, "using the default") we find that Jetty is not rejecting UTF-16 URL encoding (just like in earlier versions, but unlike version 10.0.6 for example).

@gregw
Copy link
Contributor Author

gregw commented May 12, 2023

I believe we left the defaults as there were in jetty-9 (I'm not looking in detail, as jetty-9 is end-of-life for support).

@guusdk
Copy link

guusdk commented May 15, 2023

Thanks Greg - I appreciate the effort of answering questions on an end-of-life release. What's confusing me is that this made it in the Jetty-9 changelog, while there's not an apparent change in behavior.

@lachlan-roberts
Copy link
Contributor

@guusdk this issue was assigned to both 9.4 and 10.0/11.0 projects which means the title of this issue is what is put on the change log.

@guusdk
Copy link

guusdk commented May 15, 2023

Hi Lachlan, I appreciate you helping out. It's not completely clear to me what you mean, but I'm assuming that you mean to say that this issue being assigned to the 9.x version got it in the changelog, not necessarily that any of the related work got in the same branch. That would correspond with what we're observing.

@lachlan-roberts
Copy link
Contributor

#6467 is what was merged to 9.4.x regarding this change. In the comments of this it says

it excludes the new compliance options from the current defaults so it does not change behaviour.

I was just saying that the exact title of this issue is what is put in the change log. And that title doesn't necessarily indicate a change of behaviour for 9.4 anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

No branches or pull requests

4 participants
@gregw @guusdk @lachlan-roberts and others