Unwanted encoding of "$" in URL path #246

Closed
skirino opened this issue Oct 2, 2015 · 10 comments



@skirino

skirino commented Oct 2, 2015

Similar to #176, but this time it is about $.

I'm trying to access a URL whose path contains $, e.g.

/jenkins/$stapler/bound/e7b78ef7-b201-4cc2-bae6-3cf2d41f6ed4/render

hackney converts $ to %24 before sending the request, which results in a 404 Not Found error.

For reference:

@deadtrickster
Contributor

Excerpt from RFC 3986:

URI producing applications should percent-encode data octets that
correspond to characters in the reserved set unless these characters
are specifically allowed by the URI scheme to represent data in that
component.  If a reserved character is found in a URI component and
no delimiting role is known for that character, then it must be
interpreted as representing the data octet corresponding to that
character's encoding in US-ASCII.

I'm not sure $ has any special meaning in an HTTP context. This looks more like a URL parsing/routing failure on the server side.

@skirino
Author

skirino commented Oct 19, 2015

@deadtrickster Thank you for your comment.
I believe that Sean B. Durkin's answer on Stack Overflow is correct, i.e. $ has no special meaning in the URI syntax.
Actually, http_uri:encode/1 does not convert $:

1> http_uri:encode("$").
"$"
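
For illustration only: RFC 3986 (section 3.3) allows the "sub-delims" characters, which include $, to appear unescaped in a path segment, together with ":" and "@". The Python sketch below is not hackney code; `encode_path` and `PCHAR_EXTRA` are hypothetical names introduced here. It uses the standard `urllib.parse.quote` to contrast an encoder that keeps those characters with one that escapes them.

```python
from urllib.parse import quote

# RFC 3986 sub-delims, plus ":" and "@", may appear unescaped in a path segment.
SUB_DELIMS = "!$&'()*+,;="
PCHAR_EXTRA = SUB_DELIMS + ":@"


def encode_path(path: str, keep: str = PCHAR_EXTRA) -> str:
    """Percent-encode a URL path, leaving "/" and the characters in `keep` untouched."""
    return quote(path, safe="/" + keep)


path = "/jenkins/$stapler/bound/e7b78ef7-b201-4cc2-bae6-3cf2d41f6ed4/render"

# Keeping sub-delims: "$" is preserved and the path is unchanged.
print(encode_path(path))

# Escaping everything except "/" and unreserved characters: "$" becomes %24.
print(encode_path(path, keep=""))
```

The second call reproduces the behavior reported in this issue, while the first shows the form the Jenkins URL above requires.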

@benoitc
Owner

benoitc commented Oct 21, 2015

The hackney URI parser is based on the Chromium one, and http_uri does less.

According to RFC 3986, reserved characters should be percent-encoded unless these characters are specifically allowed by the URI scheme to represent data in that component.

Passing a list of reserved chars to the request may be a solution. Thoughts?

@dtykocki

@benoitc -- I have a potential fix to this issue on a local branch. If y'all decide not to go down the reserved chars path, I'll submit a PR.

@skirino
Author

skirino commented Oct 23, 2015

@benoitc
Thank you for your response!
Adding an option to control character encoding sounds great, as it resolves not only this issue but also potential issues when interacting with "exotic" web apps.

@benoitc
Owner

benoitc commented Oct 23, 2015

@dtykocki 👍 Feel free to send it early so we can play with the patch :) I am working on hackney myself today.

@jflatow

jflatow commented Dec 1, 2015

👍 For better or worse, there are APIs that use suspicious characters. For instance, I can't use the LinkedIn API with Hackney because they use "(" and ")" in their URLs: https://developer.linkedin.com/docs/signin-with-linkedin

@benoitc
Owner

benoitc commented Dec 1, 2015

@jflatow you mean they don't decode URLs?

I think the best solution for that would be to offer a mode that lets users encode URLs the way they want, alongside a strict mode as the default. Thoughts?
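
As a rough illustration of that idea (a Python sketch, not hackney's API): the default path encoder is strict, and a caller can pass a custom encoder to override it. The names `strict_encode`, `build_url`, and `path_encoder`, as well as the example.com URL, are all made up for this example.

```python
from typing import Callable, Optional
from urllib.parse import quote, urlsplit, urlunsplit


def strict_encode(path: str) -> str:
    """Default mode: escape everything except "/" and unreserved characters."""
    return quote(path, safe="/")


def build_url(url: str, path_encoder: Optional[Callable[[str], str]] = None) -> str:
    """Re-encode only the path of `url`, using the caller's encoder if one is given."""
    encoder = path_encoder or strict_encode
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=encoder(parts.path)))


url = "https://example.com/v1/people/(id,name)"

# Strict default: "(", "," and ")" are percent-encoded.
print(build_url(url))

# User-controlled mode: the caller decides, here by leaving the path as-is.
print(build_url(url, path_encoder=lambda p: p))
```

Keeping the strict encoder as the default leaves existing callers unaffected, while servers that expect unescaped characters can be handled one request at a time.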

@jflatow

jflatow commented Dec 1, 2015

Right - yes - I agree that would be a good option.

This was referenced Jan 24, 2016
@terinjokes
Contributor

@benoitc I'm also running into a situation where a service outside my control expects certain characters not to be URL-encoded. (In this case, they're using it as a delimiter, which the RFC section quoted above specifically notes is the scenario where percent-encoding the reserved character would produce the wrong response.)


EDIT

The Chromium source linked to from hackney_url marks @ as PASS in the lookup table.

I can send a PR to update the lookup table here.

benoitc added a commit that referenced this issue Mar 21, 2016
This function allows users to bypass hackney's default strict path encoding,
which is similar to the Chromium algorithm, so they can handle servers that
need a specific encoding.

fix #277; #272, #246