Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support unicode URLs #15

Closed
wants to merge 1 commit into from

Conversation

blakeembrey
Copy link
Member

From pillarjs/path-to-regexp#42, I implemented it at the framework level. This is mostly for reference since I'm not sure on some behaviour (you probably want to 400 at decode failure?). Either way, I think this would be a really useful addition - especially for Express 5.0. We just need to update the readme and make it clear which normalized string method is used.

@dougwilson
Copy link
Contributor

Hi @blakeembrey I appreciate the effort :) I wanted to let you know I'm not ignoring you, it's just been super crazy this week :(

@blakeembrey
Copy link
Member Author

@dougwilson Once you get around to #13 I'll spend some time thinking about this once 😄

Currently this should not be merged since it won't work (it decodes / incorrectly). I need to think about it, but the only other alternative I can think of is to shift the dependency on path matching implementations (have a note on the README and mentioning encoding inputs to match paths). At the least, it would avoid changing any code on the router that could affect performance.

@wesleytodd
Copy link
Member

Is this PR still valid and should it be added in router@2?

@blakeembrey
Copy link
Member Author

@wesleytodd It's still valid but I don't think it should be added for V2.

One possible approach I added with the latest path-to-regexp is to do something like this: https://github.com/serviejs/servie-route/blob/c911ef96489a7350e3d3a6ca319e04c7e8879ffc/src/index.ts#L37-L41. It makes it easier to support encoded URLs correctly, but it's still not 100% without normalization of the URL at the beginning (for URL this happens automatically, for Express.js it'd need to happen at the beginning of the application or somewhere).

@blakeembrey blakeembrey closed this Jan 6, 2020
@wesleytodd
Copy link
Member

Hm, I have no idea what that linked code is doing (I guess it is some other language I dont know 😛), but I am guessing it is just delegating to the whatwg URL parsing? The effect being something like req.url = new URL(req.url)?

@wesleytodd
Copy link
Member

wesleytodd commented Jan 6, 2020

Ok, in my flippant first read I missed that it is just a call to require('path-to-regex').match which was destructured. So with that you use decodeURIComponent to do the parsing?

@blakeembrey
Copy link
Member Author

I use encodeURI to turn the path characters into percent encoded characters where relevant, and then decodeURIComponent to extract matched parameters without percent encoding. The decoding part actually already happens in the Express-router since it predates this feature. However, the functionality only works because the URL object already encodes the special Unicode characters into percent encoding already - in Express we'd need a step to normalize in some way too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants