Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relative non file paths #5814

Merged
merged 2 commits into from Jan 26, 2019
Merged

Conversation

@adobs
Copy link
Contributor

@adobs adobs commented Dec 28, 2018

Addresses #5211

Screenshots
If applicable, add before and after screenshots to help show the issue being fixed, or the appearance of a new feature.

* Relative: ./foo/bar, foo/bar
*/
isAbsolutePath(path: string): boolean {
return path[0] === '/' || path.indexOf('//') !== -1 ? true : false;
Copy link
Contributor

@jasongrout jasongrout Dec 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parsing URLs can get pretty tricky. How about this to make it more robust:

  1. try using the browser url parsing. If it works, then you have an absolute URL: try {new URL(path);} catch(e) {/* not an absolute url, but could be an absolute path in a relative URL*/}

  2. If we don't have an absolute URL, then we may still have an absolute path in a relative URL, so just check to see if the first character is / (which covers both the cases where it starts with / and //).

Copy link
Contributor

@jasongrout jasongrout Dec 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm...even that doesn't cut it. Look at some of the valid URLs in the current URL standard: https://tools.ietf.org/html/rfc3986.html#section-1.1.2.

What about things like tel:+1-816-555-1212? Definitely a valid URL, but not one on which we want to be prepending /files. Perhaps instead of figuring out which URLs we should be excluding from prepending /files, we instead should just be figuring out which special URLs on which we should be prepending /files.

Copy link
Contributor

@jasongrout jasongrout Dec 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, tel:+1-816-555-1212 is considered an absolute URI, so the strategy above might work still.

Reading https://tools.ietf.org/html/rfc3986.html#section-4.2, it does seem that we can use the browser new URL error to determine if we have an absolute or relative URI. If we have a relative URI, we have four possibilities:

  1. "//" authority path-abempty
  2. path-absolute (in this case, the path starts with a /)
  3. path-noscheme (in this case, the path starts with some character other than /
  4. path-empty (in this case, the path is empty, which means that the first characters of the uri could be ? or #, representing the query or hash parts.

I think we want to prepend /files in both case 3 and 4 (case 3 since it represents a file relative to the current file, and case 4 because it represents the current file). So it seems sufficient to do what I mentioned above: use the browser URL parser (new URL()) to determine if we have a relative URI. If we have a relative URI, check to see if the first character is /. If it is, don't mess with the URL. If it isn't, transform the url as we see fit. If we decide that we shouldn't transform case 4 above, we can check for case 4 by seeing if the first character is a # or ?.

Copy link
Contributor Author

@adobs adobs Dec 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re # 4 - Are there any other cases where the first character is something else? EG % @ = &. As is, new URL("#") is successful, which would make it seem as if it's an absolute path, not a relative path, and thus /files would not be prepended.

I can check if the first character is # or ? before the try/catch statement, but want to make sure there aren't any edge cases to content with.

Copy link
Contributor

@jasongrout jasongrout Dec 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get:

> new URL("#")
TypeError: # is not a valid URL.

Copy link
Contributor

@jasongrout jasongrout Dec 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I tried to be very precise in my statements above. There is a difference in the spec between absolute/relative URIs and absolute/relative paths.

Regarding your question about 4, see the spec I linked: https://tools.ietf.org/html/rfc3986.html#section-4.2, namely:

      relative-ref  = relative-part [ "?" query ] [ "#" fragment ]

If relative-part is the path-empty case, it's just a zero-length string, so the relative ref must either start with ? or #.

Copy link
Contributor Author

@adobs adobs Dec 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake -- I was testing in console of browser with new URL
image

Copy link
Contributor

@jasongrout jasongrout Dec 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, very interesting. Firefox does not allow new URL("#"), but it seems like Chrome does.

Copy link
Contributor

@jasongrout jasongrout Jan 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into the Chrome source code, but it wasn't obvious to me where the behavior is coming from. I think that Chrome may be violating the HTML whatwg spec (https://url.spec.whatwg.org/#api).

Regardless, it looks like this one special case can be checked by looking at the scheme and seeing if it is "about"?

Copy link
Contributor

@jasongrout jasongrout Jan 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Technical details for future reference]: to be clear, I think Chrome is not following the standard at https://url.spec.whatwg.org/#no-scheme-state. It should start the parsing of the input URL in the scheme-start-state, where case 2 is true, then should skip to no-scheme-state, where it should return an error in step 1 because base is null because it wasn't given.

@adobs adobs force-pushed the relative-non-file-paths branch 2 times, most recently from aefe09e to 817732e Jan 1, 2019
@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jan 5, 2019

@adobs and I talked a lot about this today, and concluded that we should modify the isLocal function to consider absolute URLs like /path/to/file as not local. We also talked about replacing the logic in isLocal to use the browser URL parsing, instead of the a href trick it uses.

@adobs - I experimented a bit more. Here's another difference between firefox and Chrome: new URL("data:something") is valid in Chrome, but not in Firefox. Apparently Firefox tries to ensure that data: URLs are actually formed according to the RFC 2397 spec, but Chrome is happy to just parse the string, and not worry about if it is actually formatted as an official data URL. This means that just catching an error will not tell you if the URL is relative or not - there may be something else wrong with the URL, other than it being relative.

A way around this is to check both if new URL(myurl) throws an error and new URL(myurl, window.location.href) is valid. But then there's the issue of what to do if the url is misformed - do we do anything?

All this is making me think that maybe we keep the a href trick in isLocal, which doesn't seem to really check the URL is a valid URL, just parses the parts, and just change the check for starting with // to instead just checking for a single /. Perhaps this whole thing really should just be a one-character change, changing '//' to '/'? (and changing the test as well.)

@adobs adobs force-pushed the relative-non-file-paths branch from 817732e to 674238d Jan 6, 2019
@adobs adobs force-pushed the relative-non-file-paths branch from 4dd32a5 to 4141bea Jan 6, 2019
@adobs
Copy link
Contributor Author

@adobs adobs commented Jan 6, 2019

@jasongrout
Interesting that Chrome has so many unstandard differences from Firefox :/

Re your comment

A way around this is to check both if new URL(myurl) throws an error and new URL(myurl, window.location.href) is valid. But then there's the issue of what to do if the url is misformed - do we do anything

It seems that, at least in Firefox, new URL(#installation) throws an error and new URL(#installation, window.location.href) does not (where window.location.href = "https://github.com/jupyterlab/jupyterlab")
Would a user even have access to the second parameter of the new URL constructor? Or how is this check sufficient?


I have two commits, so it'll be easy to pick the version we want

  1. commit 4141bea edited url.ts isLocal with new URL
  • this version has isLocal with new URL and the other changes we discussed for isAbsolutePath
  1. commit 4caf697 domain and user relative non-file paths
  • this version has the one character change in isLocal from // to /
  • of note, tests all pass with this change ... what were you referring to when you said "(and changing the test as well.)"?

@jasongrout
Copy link
Contributor

@jasongrout jasongrout commented Jan 7, 2019

  • of note, tests all pass with this change ... what were you referring to when you said "(and changing the test as well.)"?

You already fixed this test: https://github.com/jupyterlab/jupyterlab/pull/5814/files#diff-fa3c582f48a1eb66103353f4323ddcbfR96, so I think we're good now!

@jasongrout jasongrout changed the title [wip] Relative non file paths Relative non file paths Jan 7, 2019
@jasongrout jasongrout added this to the 1.0 milestone Jan 7, 2019
Copy link
Member

@saulshanabrook saulshanabrook left a comment

Jason explained the change to me and it made sense.

I also tried this example a nbgitpuller triggering link locally and it was treated as an absolute path.

However, it doesn't actually address the original issue, which is asking for the absolute paths to be prefixed with the server prefix, see #5211 (comment) for more details.

@saulshanabrook saulshanabrook merged commit 84359ea into jupyterlab:master Jan 26, 2019
1 of 3 checks passed
@lock lock bot locked as resolved and limited conversation to collaborators Aug 8, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants