Percent Encoding Asterisk #3488

Closed
caseykulm opened this Issue Jul 31, 2017 · 2 comments

caseykulm commented Jul 31, 2017

Bug Report:

I'm not sure if the expected behavior is described anywhere, but I would expect the HttpUrl class to percent-encode * as %2A. I have implemented a failing test case here.

This was also brought up a while ago in a Stack Overflow post: https://stackoverflow.com/questions/39771140/should-httpurl-builder-addpathsegment-2a-encode-an-asterisk

Member

swankjesse commented Jul 31, 2017

According to some research, major browsers don’t encode it. http://tinyurl.com/url-escaping
And some URLs use it: https://en.wikipedia.org/wiki/C*-algebra
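For comparison outside OkHttp, here is a minimal sketch of how two JDK classes treat the asterisk. It only illustrates that leaving * unencoded is common behavior; it is not OkHttp's implementation. The class name `AsteriskInJdk` and its helper methods are made up for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

class AsteriskInJdk {
    // Builds a URI from components. The multi-argument constructor
    // percent-encodes only characters that are illegal in the path.
    static String viaUri(String path) {
        try {
            return new URI("https", "en.wikipedia.org", path, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Form-style encoding: URLEncoder keeps letters, digits, '.', '-', '*' and '_'.
    static String viaUrlEncoder(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(viaUri("/wiki/C*-algebra"));
        System.out.println(viaUrlEncoder("C*-algebra"));
    }
}
```

Both calls leave the asterisk as a literal character, so a caller who needs %2A on the wire would have to encode it explicitly.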

caseykulm commented Jul 31, 2017

Seems like Section 2.2 of RFC 3986 has the answer, but I'm not quite sure which answer it's indicating.

URI producing applications should percent-encode data octets that
correspond to characters in the reserved set unless these characters
are specifically allowed by the URI scheme to represent data in that
component. If a reserved character is found in a URI component and
no delimiting role is known for that character, then it must be
interpreted as representing the data octet corresponding to that
character's encoding in US-ASCII.

That same section states that * is a reserved character, although I'm not sure how to interpret "unless these characters are specifically allowed by the URI scheme to represent data in that component".
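One possible reading: Section 3.3 of the same RFC defines a path character as `pchar = unreserved / pct-encoded / sub-delims / ":" / "@"`, and * is one of the sub-delims, so the generic syntax permits a literal * in a path segment. The sketch below contrasts that reading with a stricter encoder that escapes everything outside the unreserved set. It is an illustration under that assumption, not how HttpUrl is implemented, and `PathSegmentSketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

class PathSegmentSketch {
    // RFC 3986 unreserved punctuation (letters and digits are checked separately).
    static final String UNRESERVED_PUNCT = "-._~";
    // RFC 3986 sub-delims, which the pchar rule also permits in a path segment.
    static final String SUB_DELIMS = "!$&'()*+,;=";
    static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // keepSubDelims = true follows the pchar grammar and leaves '*' alone;
    // keepSubDelims = false escapes every byte outside the unreserved set.
    static String encode(String segment, boolean keepSubDelims) {
        StringBuilder out = new StringBuilder();
        for (byte b : segment.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xff;
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            boolean literal = alnum
                    || UNRESERVED_PUNCT.indexOf(c) >= 0
                    || (keepSubDelims
                        && (SUB_DELIMS.indexOf(c) >= 0 || c == ':' || c == '@'));
            if (literal) {
                out.append((char) c);
            } else {
                out.append('%').append(HEX[c >> 4]).append(HEX[c & 0x0f]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("C*-algebra", true));
        System.out.println(encode("C*-algebra", false));
    }
}
```

With `keepSubDelims` set to true the segment comes back unchanged, while false produces the %2A form the report expected. Both outputs are valid URI path segments, which may be why the RFC passage reads ambiguously.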

caseykulm closed this Aug 2, 2017
