Decode percent encodings for non-ASCII characters in `iunreserved` category on normalization #19

lo48576 · 2022-04-19T23:59:30Z

IRIs are defined similarly to URIs in [RFC3986], but the class of unreserved characters is extended by adding the characters of the UCS (Universal Character Set, [ISO10646]) beyond U+007F, subject to the limitations given in the syntax rules below and in section 6.1.

--- RFC 3987 section 2.1. Summary of IRI Syntax

These IRIs should be normalized by decoding any percent-encoded octet sequence that corresponds to an unreserved character, as described in section 2.3 of [RFC3986].

--- RFC 3987 section 5.3.2.3. Percent-Encoding Normalization

RFC 3987 says that percent-encoded octet sequences that correspond to unreserved characters in an IRI (including non-ASCII characters in the `iunreserved` category) should be decoded.
However, the current implementation in this crate decodes only ASCII unreserved characters and leaves percent-encoded non-ASCII `iunreserved` characters as they are.
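
To make the expected behavior concrete, here is a minimal sketch of the normalization described above. It is not this crate's implementation: `normalize_pct`, `pct_char`, `pct_octet`, `is_ucschar`, and `is_ascii_unreserved` are hypothetical names used only for illustration. It assumes percent-encoded octets are UTF-8, treats `iunreserved` as ASCII `unreserved` plus the `ucschar` ranges from RFC 3987 section 2.2, and also uppercases the hex digits of any triplet it keeps (the case normalization from RFC 3986 section 6.2.2.1).

```rust
/// ASCII `unreserved` characters (RFC 3986 section 2.3).
fn is_ascii_unreserved(c: char) -> bool {
    c.is_ascii_alphanumeric() || matches!(c, '-' | '.' | '_' | '~')
}

/// `ucschar` characters (RFC 3987 section 2.2).
fn is_ucschar(c: char) -> bool {
    let cp = c as u32;
    match cp {
        0xA0..=0xD7FF | 0xF900..=0xFDCF | 0xFDF0..=0xFFEF => true,
        // Planes 1 to 13, excluding the last two code points of each plane.
        0x1_0000..=0xD_FFFF => (cp & 0xFFFF) <= 0xFFFD,
        0xE_1000..=0xE_FFFD => true,
        _ => false,
    }
}

/// Parses a `%XX` triplet at the start of `s` and returns its octet.
fn pct_octet(s: &[u8]) -> Option<u8> {
    match s {
        [b'%', hi, lo, ..] => {
            let hi = (*hi as char).to_digit(16)?;
            let lo = (*lo as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        }
        _ => None,
    }
}

/// Tries to decode one percent-encoded UTF-8 character at the start of `s`.
/// Returns the character and the number of input bytes it occupied.
fn pct_char(s: &[u8]) -> Option<(char, usize)> {
    let lead = pct_octet(s)?;
    // Expected UTF-8 sequence length, derived from the lead octet.
    let len = match lead {
        0x00..=0x7F => 1,
        0xC2..=0xDF => 2,
        0xE0..=0xEF => 3,
        0xF0..=0xF4 => 4,
        _ => return None,
    };
    let mut buf = [0u8; 4];
    for i in 0..len {
        buf[i] = pct_octet(s.get(i * 3..)?)?;
    }
    // `from_utf8` rejects overlong forms, surrogates, and bad continuations.
    let decoded = std::str::from_utf8(&buf[..len]).ok()?;
    Some((decoded.chars().next()?, len * 3))
}

/// Decodes percent-encoded characters that are unreserved.
/// With `iri == true`, non-ASCII `ucschar` characters are decoded as well.
fn normalize_pct(input: &str, iri: bool) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find('%') {
        out.push_str(&rest[..pos]);
        rest = &rest[pos..];
        match pct_char(rest.as_bytes()) {
            Some((c, used)) if is_ascii_unreserved(c) || (iri && is_ucschar(c)) => {
                out.push(c);
                rest = &rest[used..];
            }
            _ => {
                // Keep the triplet (uppercased); copy a stray '%' unchanged.
                let keep = if pct_octet(rest.as_bytes()).is_some() { 3 } else { 1 };
                out.push_str(&rest[..keep].to_ascii_uppercase());
                rest = &rest[keep..];
            }
        }
    }
    out.push_str(rest);
    out
}
```

With this sketch, `normalize_pct("%e3%81%82", true)` yields `あ` (U+3042 is in `ucschar`), while `normalize_pct("%e3%81%82", false)` yields `%E3%81%82`, which corresponds to the ASCII-only behavior reported in this issue. Reserved characters such as `%2F` and private-use characters such as `%EE%80%80` stay encoded in both modes.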

lo48576 · 2022-04-19T23:59:42Z

This blocks #18.

lo48576 added the bug label Apr 19, 2022

lo48576 self-assigned this Apr 19, 2022

lo48576 changed the title ~~Percent encode non-ASCII characters in iunreserved category on normalization~~ Decode percent encodings for non-ASCII characters in iunreserved category on normalization Apr 22, 2022

lo48576 closed this as completed in 729f557 Apr 23, 2022
