Description vs URL parsing #71

Closed
annevk opened this issue Nov 6, 2018 · 13 comments

@annevk

annevk commented Nov 6, 2018

I haven't looked at the standard, but you might want to clarify how import: URLs are parsed [processed]. E.g., import://test.com/../y shouldn't result in being treated identically to import:/y probably.

@domenic
Collaborator

domenic commented Nov 6, 2018

Certainly we'll want to spell this out in more detail in the proper spec. For now I'll note that the first paragraph of https://github.com/domenic/import-maps#import-urls says

import: is a new URL scheme which is reserved for purposes related to JavaScript module resolution. As non-special URLs, import: URLs just consist of two parts: the leading import:, and the following path component. (This is the same as blob: or data: URLs.)

I guess that's inaccurate (or, only accurate for "conforming" import: URLs), per https://jsdom.github.io/whatwg-url/#url=ZGF0YTovL3Rlc3QuY29tLy4uL3k=&base=YWJvdXQ6Ymxhbms= and https://jsdom.github.io/whatwg-url/#url=aW1wb3J0Oi8vdGVzdC5jb20vLi4veQ==&base=YWJvdXQ6Ymxhbms=

So this somewhat reduces to whatwg/url#385, which has unfortunately seen little movement.
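The parser behavior behind the two live-viewer links above can be checked directly with the `URL` class, which implements the WHATWG URL parser in Node.js and browsers. This is only a sketch of what the parser produces for a non-special scheme; the variable names are illustrative, and it says nothing about how an import: processor would or should treat the result.

```javascript
// Parse three import: URLs with the WHATWG URL parser.
// "import" is a non-special scheme, so "//" still introduces an authority.
const withAuthority = new URL('import://test.com/../y');
const pathOnly = new URL('import:/y');
const opaque = new URL('import:y');

// host "test.com", pathname "/y": the ".." is dropped, the host is kept
console.log(withAuthority.host, withAuthority.pathname, withAuthority.href);

// host "", pathname "/y": same pathname as above, but no host
console.log(pathOnly.host, pathOnly.pathname, pathOnly.href);

// host "", pathname "y": no leading slash, so the path is opaque (as with data: or blob:)
console.log(opaque.host, opaque.pathname, opaque.href);
```

The first two URLs share a pathname and differ only in host, so any consumer that looks at the pathname alone would treat them identically.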

@mikesamuel

URL protocols like file: that try to be half-hierarchical suffer some weird corner cases.

For example, according to Std 66 and RFC 8089,

new URL('../bar', 'file:foo/').href === 'file:///bar'

IMO, it's reasonable to assume that resolving two relative paths gives you a relative path, but this is not the case.

This can only lead to subtle bugs and probably enables bypassing of mitigations for path traversal attacks.

If import:relative-path is definitely needed, then maybe URL Resolution Semantics should include a section on what to do when both path segments are relative, and the left URL's net .. segments are greater than or equal to the right's path segments post path-normalization.

I'm happy to provide spec language to identify corner cases like this if needed.
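For comparison, here is a minimal sketch of how the WHATWG URL parser (via the `URL` class) resolves the same relative reference against a file: base and against import: bases. `tryResolve` is a hypothetical helper introduced only for this illustration, and the import: bases are made-up examples rather than anything defined by the proposal.

```javascript
// Hypothetical helper: resolve `relative` against `base`,
// returning null instead of throwing when the parser reports failure.
function tryResolve(relative, base) {
  try {
    return new URL(relative, base).href;
  } catch (e) {
    return null;
  }
}

// file: is special-cased by the parser: "file:foo/" becomes "file:///foo/",
// so the result is an absolute path.
console.log(tryResolve('../bar', 'file:foo/')); // "file:///bar"

// "import:foo/" has an opaque path, so it cannot serve as a base at all.
console.log(tryResolve('../bar', 'import:foo/')); // null

// With a leading slash the import: base is hierarchical and ".." works.
console.log(tryResolve('../bar', 'import:/a/b/')); // "import:/a/bar"

// Excess ".." segments are silently clamped at the root.
console.log(tryResolve('../../../bar', 'import:/a/b/')); // "import:/bar"
```

Under the current parser, then, a relative-looking import: base fails outright rather than being rewritten into an absolute path as with file:, while surplus `..` segments against a hierarchical base are discarded without error.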

@mikesamuel

@annevk There is no ambiguity in the meaning of import://test.com/../y if import is a hierarchical protocol since Std 66 specifies that a path component is

path-absolute = "/" [ segment-nz *( "/" segment ) ]

so an absolute path cannot start with a slash, an empty path component, and another slash.

@annevk
Author

annevk commented Mar 23, 2019

@mikesamuel browsers use https://url.spec.whatwg.org/ not STD 66.

@littledan
Contributor

I don't think we will need anything like relative path support for built-in module specifiers. Any of the options in whatwg/url#385 would avoid this complexity (and any of them seem like they would work decently for this application; we just need a choice documented).

@mikesamuel

@annevk, Fair enough.

I believe https://url.spec.whatwg.org/#path-or-authority-state is the equivalent.

path or authority state

  • If c is U+002F (/), then set state to authority state.
  • Otherwise, set state to path state, and decrease pointer by one.
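The decision quoted above can be sketched as a small classifier. This is a simplification, assuming a non-special, non-file scheme and no base URL; `stateAfterScheme` is a hypothetical name, and the "opaque path" label follows the current URL Standard's terminology for what happens when nothing after the colon starts with a slash.

```javascript
// Hypothetical sketch of which parser state follows "scheme:" for a
// non-special scheme such as import: (base URL handling is ignored).
function stateAfterScheme(input) {
  const colon = input.indexOf(':');
  if (colon === -1) {
    throw new TypeError('input has no scheme');
  }
  const remaining = input.slice(colon + 1);
  // Scheme state: one leading "/" sends the parser to "path or authority".
  if (remaining.startsWith('/')) {
    // Path or authority state: a second "/" means an authority follows.
    return remaining.startsWith('//') ? 'authority' : 'path';
  }
  // No leading slash: the rest is an opaque path (as in data: or blob:).
  return 'opaque path';
}

console.log(stateAfterScheme('import://test.com/../y')); // "authority"
console.log(stateAfterScheme('import:/y'));              // "path"
console.log(stateAfterScheme('import:y'));               // "opaque path"
```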

@mikesamuel

@littledan, Why the "for built-in module specifiers" qualification?

Apologies if I'm operating from the wrong premises, but if an import: URL can have a relative path component as many examples in the README do, and there's a need to resolve a path relative to an import: URL or to resolve one import: URL relative to another, then the file: problem seems likely to come up in practice.

With file: there're some corner cases with Windows drive specifiers like file:C|/ that led to it being specced as quasi-hierarchical, quasi-raw. It seems there're ways to dodge that issue for built-in module specifiers discussed: #94 (comment)

@littledan
Contributor

Right, I was just being confused above; path-like import: URLs might make sense to process with some intelligence.

@annevk
Author

annevk commented Mar 24, 2019

I misstated something in OP. The potential problem is not with parsing, the potential problem is with processing the resulting URL record.

@mikesamuel

@annevk I think inheriting file: oddness could cause many problems down the road and should be easy to avoid with some care but I don't want to hijack your issue. Would you prefer I file a separate issue to track that?

@annevk
Author

annevk commented Mar 26, 2019

The file: oddness I'm aware of is a parser problem, which shouldn't affect import: since it's a new scheme without special handling in the URL parser. There might well be similar issues, but they'd be in the processor for this scheme, which this issue is about.
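To make the parser/processor split concrete, here is a purely hypothetical sketch of a scheme-specific processing step that runs after parsing. Nothing in it comes from the import maps proposal: `processImportURL` and its rejection rule are assumptions chosen only to show where such a check would live.

```javascript
// Hypothetical processor for import: URLs. Parsing is delegated to the
// WHATWG URL parser; the scheme-specific rules are applied afterwards
// to the resulting URL record.
function processImportURL(input) {
  const url = new URL(input); // parser step: throws TypeError on failure
  if (url.protocol !== 'import:') {
    throw new TypeError('not an import: URL');
  }
  // Processor step (assumed rule): reject records with a non-empty host,
  // so import://test.com/../y is not silently treated like import:/y.
  if (url.host !== '') {
    throw new TypeError('import: URLs must not have a host');
  }
  return url.pathname;
}

console.log(processImportURL('import:/y')); // "/y"
console.log(processImportURL('import:y'));  // "y"

try {
  processImportURL('import://test.com/../y');
} catch (e) {
  console.log(e.message); // "import: URLs must not have a host"
}
```

Rejecting is only one possible policy; the point is that both inputs parse successfully, so telling them apart has to happen in this second step.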

@mikesamuel

@annevk, maybe I don't understand what you mean by "parser problem."

I realize, belatedly, that the code sample above is parser related, but the equivalent problem in STD 66 isn't. Perhaps my confusion comes from whatwg/url's differences that follow from for-scheme-state:

  1. If url’s scheme is "file", then:
    1. If remaining does not start with "//", validation error.

If import-maps specs path resolution-like operations on import: URLs' scheme specific parts then I worry about the STD 66 corner cases that would not necessarily be parser related.

@annevk
Author

annevk commented Mar 26, 2019

Okay, as far as I can tell those corner cases are basically OP and whatwg/url#385.

@domenic domenic added this to the Full featured milestone Jul 3, 2019
@domenic domenic modified the milestones: Full featured, import: URLs Sep 23, 2019

4 participants