Decoding path parameters #93

Closed
ivan-tymoshenko opened this issue May 11, 2022 · 17 comments

Comments

@ivan-tymoshenko

Hi, I have a question about how I should work with encoded path parameters. Here is one example that covers all the trouble spots I have.

I want to match the pattern URL and get param equal to %23.

const pattern = new URLPattern('http://t.t/[]:param')

const inputPath = encodeURI('http://t.t/[]') + encodeURIComponent('%23')
pattern.exec(inputPath) // returns null
@wanderview
Contributor

It appears encodeURI is encoding the [] but the URL standard does not do that. Try instead:

const inputPath = new URL('http://t.t/[]%23');

@ivan-tymoshenko
Author

It's one of the reserved characters and should be encoded before sending.
https://datatracker.ietf.org/doc/html/rfc3986#section-2.2

@wanderview
Contributor

URLPattern operates on URLs, not URIs. URLs only percent-encode a few code points in the path:

https://url.spec.whatwg.org/#path-percent-encode-set

You can test this out on the live URL viewer here:

https://jsdom.github.io/whatwg-url/#url=aHR0cDovL3QudC9bXQ==&base=YWJvdXQ6Ymxhbms=
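
For readers who want to see the difference locally, here is a minimal sketch, assuming a runtime with a spec-conformant `URL` implementation (for example a recent Node.js). It only contrasts `encodeURI` with the URL parser for the path from the original example and does not involve URLPattern.

```javascript
// encodeURI escapes square brackets.
const viaEncodeURI = encodeURI('http://t.t/[]');

// The URL parser leaves square brackets in the path untouched, because they
// are not part of the path percent-encode set.
const viaURL = new URL('http://t.t/[]');

// An existing escape such as %23 is kept as-is by the URL parser.
const withEscape = new URL('http://t.t/[]%23');

console.log(viaEncodeURI);        // http://t.t/%5B%5D
console.log(viaURL.pathname);     // /[]
console.log(withEscape.pathname); // /[]%23
```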

@ivan-tymoshenko
Author

Is there a way I can reconcile the real world, which sends me encoded URIs, with URLPattern?

@ivan-tymoshenko
Author

Encoded and decoded URIs should be equal. (https://datatracker.ietf.org/doc/html/rfc2616#section-3.2.3)
Is it not true for URLs?

@wanderview
Contributor

Is there a way I can reconcile the real world, which sends me encoded URIs, with URLPattern?

I need more context about the use case.

Encoded and decoded URIs should be equal. (https://datatracker.ietf.org/doc/html/rfc2616#section-3.2.3)
Is it not true for URLs?

URL and URLPattern don't do any automatic decoding. Not sure if that is what you are asking or not.

@ivan-tymoshenko
Author

Yes, they don't decode URLs. They "canonicalize" URLs.

new URL('%7E', 'http://h.d/').pathname === '/~'

I understand that the problem is a little outside the scope of the URLPattern implementation. In practice, when we receive a request URI, we need to decode it before matching. It seems that there is no correct way to decode a URI with encoded parameters so that it can be matched.

If I receive an encoded URI, do you know what I should do with it before calling URLPattern.exec? I mean, if URLPattern doesn't do encoding/decoding, then I should do it myself. I'm asking how I should do it.

@wanderview
Contributor

wanderview commented May 11, 2022

Your example is a bug in Chrome and not interoperable across browsers. Per the URL spec the pathname should remain /%7E:

https://jsdom.github.io/whatwg-url/#url=JTdF&base=aHR0cDovL2guZC8=

(Note, both Firefox and Safari correctly produce a pathname of /%7E.)
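
A small sketch for checking this in any given runtime. The expected pathname in the comment is what the URL spec prescribes according to this thread; an engine with the Chrome behavior described above would print /~ instead. The decodeURI call at the end is only there to show that an explicit decode gives the same readable form either way.

```javascript
// Resolve the relative reference '%7E' against the base from the example.
const url = new URL('%7E', 'http://h.d/');

// Per the URL spec the escape is preserved, so this should be '/%7E'.
console.log(url.pathname);

// Decoding explicitly yields '/~' regardless of which form the engine returned.
const readable = decodeURI(url.pathname);
console.log(readable); // /~
```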

Can you not call decodeURI() on the input prior to passing the value to URLPattern.exec()?

@ivan-tymoshenko
Author

First of all, thanks very much for your help.

  1. This is the RFC that describes how HTTP works (https://datatracker.ietf.org/doc/html/rfc2616#section-3.2.3). It says that /%7E is a correct path and is equivalent to /~. Does Chrome not support all valid HTTP URIs? I understand that URLs are a subset of URIs, but in practice the distinction becomes a little vague to me.

  2. If it's just a static path, then yes, decodeURI() works. But URLPattern supports params in the pathname. These params should be decoded by decodeURIComponent(), not by decodeURI(). And here we have a circular problem: I should decode a URL before matching, but to decode it correctly I need to know where the params are, which I can only know after matching.

Example:
pattern = /~:param
input url = /%7E%2523
/%7E - should be decoded by decodeURI function to /~
%2523 - should be decoded by decodeURIComponent to %23
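
The difference between the two decoders in this example can be checked directly. This is only an illustration of the standard decodeURI and decodeURIComponent behavior on the strings above, with no URLPattern involved.

```javascript
const input = '/%7E%2523';

// decodeURI decodes %7E and %25, but leaves escapes of reserved characters
// (such as %23 for '#') alone.
const wholePath = decodeURI(input);
console.log(wholePath);                   // /~%23
console.log(decodeURI('%23'));            // %23

// decodeURIComponent decodes every escape, including reserved characters.
console.log(decodeURIComponent('%23'));   // #
console.log(decodeURIComponent('%2523')); // %23
```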

@ivan-tymoshenko
Author

And why, when I put http://ABC.com/%7Esmith/home.html into Chrome, does it convert it to http://ABC.com/~smith/home.html? Sorry if it's a dumb question.

@wanderview
Contributor

Using your example, this just seems to work for me:

const pattern = new URLPattern({pathname: '/~:param'});
const encoded = '/%7E%2523';
const decoded = decodeURI(encoded);
const result = pattern.exec({pathname: decoded});
result.pathname.groups.param === '%23';

You can of course re-encode the end result if you want it in that form. I'm not sure I quite understand.

@wanderview
Contributor

wanderview commented May 11, 2022

And why, when I put http://ABC.com/%7Esmith/home.html into Chrome, does it convert it to http://ABC.com/~smith/home.html?

Browser URL bars can do extra decoding that APIs like URL() and URLPattern() do not do. Also, Chrome is not conformant at the API layer with other browsers.

@ivan-tymoshenko
Author

ivan-tymoshenko commented May 11, 2022

Using your example, this just seems to work for me:

const pattern = new URLPattern({pathname: '/~:param'});
const encoded = '/%7E%2523';
const decoded = decodeURI(encoded);
const result = pattern.exec({pathname: decoded});
result.pathname.groups.param === '%23';

You can of course re-encode the end result if you want it in that form. I'm not sure I quite understand.

You skipped the decodeURIComponent step at the end.
From /%7E%2523 I want to get param %23.
From /%7E%23 I want to get param #.

/%7E%23 => decodeURI('/%7E%23') == /~%23 => URLPattern.exec('/~%23').param == %23 => decodeURIComponent('%23') == #, that is correct

/%7E%2523 => decodeURI('/%7E%2523') == /~%23 => URLPattern.exec('/~%23').param == %23 => decodeURIComponent('%23') == #, but should be %23 (we double decode it)
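
One possible way around the double decode, which nobody in this thread proposed, is to decode only the escapes of RFC 3986 unreserved characters before matching, and to apply decodeURIComponent exactly once to each captured param afterwards. The sketch below assumes that approach. The names decodeUnreserved, matchParam and extractParam are hypothetical, and matchParam is a plain regular expression standing in for a URLPattern with pathname '/~:param', so that the snippet also runs where URLPattern is unavailable.

```javascript
// Decode only escapes of RFC 3986 unreserved characters (A-Z a-z 0-9 - . _ ~).
// Every other escape, including %25 and %23, stays encoded.
function decodeUnreserved(path) {
  return path.replace(/%([0-9A-Fa-f]{2})/g, (escape, hex) => {
    const char = String.fromCharCode(parseInt(hex, 16));
    return /[A-Za-z0-9\-._~]/.test(char) ? char : escape;
  });
}

// Hypothetical stand-in for new URLPattern({pathname: '/~:param'}).
function matchParam(path) {
  const m = /^\/~([^/]+)$/.exec(path);
  return m ? m[1] : null;
}

function extractParam(encodedPath) {
  const normalized = decodeUnreserved(encodedPath);     // partial decode before matching
  const raw = matchParam(normalized);                   // match on the still-encoded param
  return raw === null ? null : decodeURIComponent(raw); // decode the param exactly once
}

console.log(extractParam('/%7E%2523')); // %23
console.log(extractParam('/%7E%23'));   // #
```

With URLPattern itself, the middle step would presumably become pattern.exec({pathname: normalized}), reading the param from result.pathname.groups. That variant was not tested here.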

@ivan-tymoshenko
Author

And it seems like I can fetch an unsupported URL.

await fetch('https://jsdom.github.io/whatwg-ur%6c/')

https://jsdom.github.io/whatwg-url/#url=aHR0cHM6Ly9qc2RvbS5naXRodWIuaW8vd2hhdHdnLXVyJTZjLw==&base=YWJvdXQ6Ymxhbms=

@SanderElias
Collaborator

@wanderview @kenchris @ivan-tymoshenko From the discussion above I can't decide if this is working according to specs, or if there is some bug that needs fixing.
If it is working according to spec, we should close this issue. Otherwise, we should distill some actionable issues out of this.

@wanderview
Contributor

I think it's working per spec and there is nothing actionable here.

@SanderElias
Collaborator

That was the point I was gravitating to also.
I will close the issue. If it turns out there is some actionable issue in here, please open a new issue referring to this one.
