Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid file URIs on Windows #105

Closed
felixfbecker opened this issue Oct 11, 2016 · 13 comments
Closed

Invalid file URIs on Windows #105

felixfbecker opened this issue Oct 11, 2016 · 13 comments

Comments

@felixfbecker
Copy link
Contributor

The client %-encodes the colon after the drive letter on Windows, which is not how file URIs are done under Windows: https://en.wikipedia.org/wiki/File_URI_scheme#Windows

Just an exempt from a stack trace I just got:

LanguageServer\Project->loadDocument('file:///c%3A/Us...')
@egamma
Copy link
Member

egamma commented Oct 12, 2016

CC @jrieken

@jrieken
Copy link
Member

jrieken commented Oct 12, 2016

Our URI implementation doesn't perform scheme-dependent serialisation but uses the default percentage encoding/decoding. You can also observe that when serialising an http uri that uses = inside the query or fragment.

Generally, our URI guarantees that parse and toString are inverse operations and its properties (scheme, authority, path, query, fragment) are always decoded.

When calling Uri.toString you can pass true such that it skips most of the encoding - only # and ? are encoded. That's often useful when inserting URI into an html document.

@felixfbecker Is this a problem because you are passing on our serialised URIs to another library and that fails to parse them or is this just an observation?

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Oct 12, 2016

@jrieken It caused the PHP language server (which is implemented in PHP) to crash because the URI implementation there did not not handle this case. It performed a match against the URI to check if it is a well-formed Windows file URI, and if yes, stripped the leading slash. Since this match failed, file paths were left with a trailing backslash, which caused the LS to crash when trying to read the file from disk. I implemented a workaround to handle this case but it is technically not a valid file URI.

I don't know what Uri.toString() you are referring too? If you use encodeURIComponent() you should instead use encodeURI() on the whole URI, it will take care of these cases automatically.

@jrieken
Copy link
Member

jrieken commented Oct 12, 2016

As said encoding is scheme-specific (https://tools.ietf.org/html/rfc3986#section-2.2) and we have not implemented that.

I am aware on encodeURI but it fails with Uris like this: encodeURI("file:///my/c#files/project1.sln") because it will not encode the #-character, making everything after that a fragment, and thereby breaking the semantics of this URI when being parsed again. Since both encode-methods are slightly outdated we use a modified variant (as recommended here) for individual parts.

What I have observed form implements like C# System.URI is that it is up to the user to decode before constructing an URI-object. So, Console.WriteLine(new System.Uri("file:///s%C3%BCper/file" ).AbsolutePath); prints /s%C3%BCper/file and not /süper/file. Maybe something similar should be done for php.

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Oct 12, 2016

Here is the workaround I implemented: felixfbecker/php-language-server@66b5176
As you can see from the tests well-formed URIs worked fine before, just the invalid file URIs coming from VS Code imposed a problem. The function should be a generic functions that converts file URIs to paths so they can be read from disk, but if the input is not a valid file:// URI, it cant work of course.

So imo the mistake is on VS Code's side. Maybe you could at least do a check for file:// URIs (which is what you will use 99% of the time?) and handle windows drive letter colon specifically in there.

@dbaeumer
Copy link
Member

@felixfbecker there is a way how you can address this on the client itself. The LanguageClient allows you to pass in a uriConverter which we call whenever a URI flows from the client to the server and vice versa. See https://github.com/Microsoft/vscode-languageserver-node/blob/master/client/src/main.ts#L359

@felixfbecker
Copy link
Contributor Author

I worked around this with a custom URI converter, by first letting VS Code format without encoding, then parse and format it by NodeJS's url.format(). NodeJS encodes the URI correctly. Which makes me wonder: Why are you not using Node's url module directly? The VS Code Uri class is not even compatible with Node's Url interface.

@dbaeumer
Copy link
Member

dbaeumer commented Oct 25, 2016

I leave the answer up to @jrieken :-)

@jrieken
Copy link
Member

jrieken commented Oct 25, 2016

I think everybody knows the answer - cough standalone editor, single code base cough

@felixfbecker
Copy link
Contributor Author

Sorry, I don't understand? Are you not allowed to use Node native modules? There is a shim for the URI module on NPM for the browser (automatically used by Browserify).

@felixfbecker
Copy link
Contributor Author

FYI, Node 7 has just been released and includes an implementation of the WHATWG URL standard:
nodejs/node#7448
https://url.spec.whatwg.org/

I think that would be great to use as it is available in browsers as well and can be polyfilled if not.

@dbaeumer
Copy link
Member

@felixfbecker I think we are beyond the point changing this now. And with the URI rewiter code that is available on the client you can always adopt this nicely. Any objection to close.

@felixfbecker
Copy link
Contributor Author

No, I can live with my workaround. I still think VS Code / Monaco should sooner or later move away from (inferior) "home-cooked" implementations for basic things like URI parsing, globbing, type checking etc. Just my 2ct 😄

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants