@milesziemer
Contributor

The language server has to convert to and from LSP's URIs and the Smithy model's source location filenames. The filenames used for files in jars are actually URIs in the form jar:file:/foo.jar!/bar.smithy, since there isn't an actual file path to a file within a jar.

Because LSP's URIs and these Jar URIs are URIs, they're percent-encoded. When we convert from a regular file URI -> filename, Path.of(URI).toString() makes sure the filename ends up properly decoded, regardless of how the client encoded the URI. However, when going from jar file URI -> jar file filename (as it appears in the model), we don't want to decode the URI (because the filename is encoded in the model).
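As a small illustration of the regular file case (the paths and the class and method names here are made up for the example, not taken from the language server), two differently encoded URIs for the same file come out as the same decoded filename:

```java
import java.net.URI;
import java.nio.file.Path;

class FileUriDecoding {
    // Hypothetical helper: turns a file: URI string into a decoded filename.
    static String toFilename(String lspUri) {
        // Path.of(URI) decodes percent-escapes while building the path.
        return Path.of(URI.create(lspUri)).toString();
    }

    public static void main(String[] args) {
        // Same file, encoded two ways: one client leaves '!' alone,
        // the other percent-encodes it as %21.
        String minimal = "file:///tmp/Application%20Support/a!b.smithy";
        String aggressive = "file:///tmp/Application%20Support/a%21b.smithy";

        String first = toFilename(minimal);
        String second = toFilename(aggressive);

        System.out.println(first);
        System.out.println(first.equals(second)); // true: both decode to the same path
    }
}
```

The exact string printed depends on the platform's path separator, but both inputs should produce the same value.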

This should be easy, but since clients encode URIs differently, the URI sent by the client might not be encoded the same way as the model filename. In particular, VSCode is quite aggressive in its encoding, and encodes the ! in the jar URI. To handle this, we were decoding the whole LSP URI, but if the URI has other special characters, like spaces, those would also be decoded, so the result no longer matches the (still encoded) model filename. I'm pretty sure this would always be an issue on Windows too, since the : in C: would be encoded.
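To make the mismatch concrete, here is a rough sketch. The URIs are invented, and `URLDecoder` only stands in for "decode the whole thing"; I'm not claiming it is what the old code called:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class JarUriDecodePitfall {
    // Stand-in for "decode the whole client URI".
    static String decodeEverything(String uri) {
        return URLDecoder.decode(uri, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // No special characters in the path: decoding only undoes the %21,
        // so the result matches the model filename.
        String plainModel = "jar:file:/home/user/foo.jar!/bar.smithy";
        String plainClient = "jar:file:/home/user/foo.jar%21/bar.smithy";
        System.out.println(decodeEverything(plainClient).equals(plainModel)); // true

        // A space in the path: the model keeps %20, but decoding turns it
        // into a literal space, so the lookup by filename misses.
        String spacedModel = "jar:file:/home/user/Application%20Support/foo.jar!/bar.smithy";
        String spacedClient = "jar:file:/home/user/Application%20Support/foo.jar%21/bar.smithy";
        System.out.println(decodeEverything(spacedClient).equals(spacedModel)); // false
    }
}
```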

The problem hasn't come up yet, because who puts special characters in file/directory names? However, when trying to autodownload our new standalone installations in the VSCode extension, I found that extensions' storage directories are under /Application Support/ on Mac, which has a space in it. So we need to fix this in order to autodownload the language server.

To fix this, I updated the implementation of a few methods in LspAdapter. Most notable is the new smithyJarUriToJarModelFilename method which takes an LSP jar URI and turns it into a Java URI that can properly toString() into the model filename, or toURL() into a URL that can be used to read the contents of the file. The method has a comment that explains how it works - it's really a hack.
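For readers who want the general shape of such a conversion, one possible approach is to decode the client's URI completely and then let `java.net.URI` re-encode only what it has to. This is a sketch under assumptions, not the code in this PR: the helper name `toModelFilenameUri` is hypothetical, the input is assumed to already use the `jar:` scheme, and the model filename is assumed to encode spaces as `%20` while leaving `!` and `:` literal.

```java
import java.net.URI;
import java.net.URISyntaxException;

class JarUriNormalizer {
    // Hypothetical helper (not the PR's method): normalizes a jar URI so that
    // differently encoded inputs produce the same string.
    static URI toModelFilenameUri(String clientUri) {
        // getSchemeSpecificPart() returns everything after "jar:" with all
        // percent-escapes decoded, e.g. %21 -> '!' and %20 -> ' '.
        String decoded = URI.create(clientUri).getSchemeSpecificPart();
        try {
            // The (scheme, ssp, fragment) constructor quotes characters that are
            // illegal in a URI, such as spaces, and leaves '!' and ':' as they are.
            return new URI("jar", decoded, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a usable jar URI: " + clientUri, e);
        }
    }

    public static void main(String[] args) {
        // Invented client URI with '!' percent-encoded.
        String aggressive = "jar:file:/home/user/Application%20Support/foo.jar%21/bar.smithy";
        // Prints jar:file:/home/user/Application%20Support/foo.jar!/bar.smithy
        System.out.println(toModelFilenameUri(aggressive));
    }
}
```

Whether the output matches the model filename depends entirely on the encoding assumption above; paths with non-ASCII characters would need extra care, since this constructor does not escape them.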

We really shouldn't be using strings to represent URIs and paths at all, but at some point we still need to go back and forth between LSP URI and model filename, so I'm not sure if that would fix the problem, or just move it.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@milesziemer milesziemer requested a review from a team as a code owner August 25, 2025 16:11
@milesziemer milesziemer requested a review from joewyz August 25, 2025 16:11
@kubukoz
Contributor

kubukoz commented Aug 26, 2025

The problem hasn't come up yet, because who puts special characters in file/directory names?

Coursier :)

smithy-lang/smithy#2576

@milesziemer milesziemer merged commit 341ed01 into smithy-lang:main Aug 26, 2025
3 checks passed