Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LSP jar file system #4114

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Arthurm1
Copy link
Contributor

@Arthurm1 Arthurm1 commented Jul 7, 2022

This is yet another way of implementing the FileSystemAPI.

I found with #3750 that it was getting too complicated and there were code paths in the MetalsURLConnection that I was unsure of. There were also performance issues and I think it would have taken a lot of investigation to get the custom URLConnection and custom FileSystem on par with the built in Java ones.

So instead this PR simply maps jar: <-> metalsfs: uris whenever Metals communicates via LSP or via Debug Adaptor. Bugs in this area should be easy to track down/fix as it's probably down to missing mapper code in one of the messages.

This PR shouldn't change any client that doesn't support jar: scheme in virtual documents. As far as I know - that's only VSCode. If there are others then they will need to change from handling jar: to handling metalsfs: scheme. If they want they can also implement the LSP FileSystemAPI but they don't have to as virtual docs is separate from FileSystemAPI support.

By default - on VSCode - the user won't switch to a multi-workspace and display Metals libraries in the File Explorer but they have a command Show libraries folder in file explorer to do that.

I've tested this PR with navigation, debugging, searching jars, Metals Tree View and all seems ok.

This PR also auto-decompiles class files when they are missing source jars and allows navigation within that decompiled class file. It also allows decoding of class and tasty files from within jars. Currently it only does this from the file explorer. You can't run "goto definition" on a symbol that has no source jar. Ideally this would jump to an auto-decompiled class file of the workspace jar. I'm just not sure where this should be done. It doesn't find the non-source jar because SymbolIndexBucket#query0 only calls loadFromSourceJars - it doesn't try a loadFromWorkspaceJars equivalent. Is that where the code should be changed? Or should it be in MetalsLanguageServer#definitionOrReferences I don't want to cause issues (or performance issues) in other codepaths.

I've changed the MetalsDebugAdapter interfaces to use URI (as a String) instead of AbsolutePath. Is this an issue - is it a public interface? I see that there are 2 versions of the interface - do I need to create a 3rd?
The issue is that metalsfs: is not a valid path (which AbsolutePath wraps) and passing it as a URI fixes this.
I wonder if the uri mapping will also help VSCode notebook cells which I think have the scheme vscode-notebook-cell:

I've tried to test this without virtual docs enabled to mimic the behaviour of non-VSCode clients but it would be handy if someone double checked I've not caused any issues for those clients.

I've done nothing with handling CompletionRequest in DebugProxy line:120->146. I'm not sure if this should handle jar files - what does it actually do?

@kpodsiad
Copy link
Member

kpodsiad commented Jul 7, 2022

(I haven't looked at this PR yet so this comment may be absurd)
I was experimenting with Java & vscode and I found that they show dependencies in a separate tree view. It's similar to packages in Metals view). I didn't check how they handle with missing sources but this might a a good lead.

Screenshot Screenshot 2022-07-07 at 15 07 05

@Arthurm1 Arthurm1 force-pushed the jar_file_system_uri_transform branch 4 times, most recently from 7158836 to 6497dab Compare July 14, 2022 18:46
@Arthurm1 Arthurm1 force-pushed the jar_file_system_uri_transform branch from 6497dab to 90d3236 Compare July 15, 2022 13:32
@Arthurm1
Copy link
Contributor Author

@tgodzik This is ready for review whenever you have time.

I think the FileWatcherLspSuite test failure on the Mac is spurious as it happens on virtual docs and the readonly. But I can't test locally as I don't have mac.

@tgodzik
Copy link
Contributor

tgodzik commented Jul 15, 2022

@tgodzik This is ready for review whenever you have time.

I think the FileWatcherLspSuite test failure on the Mac is spurious as it happens on virtual docs and the readonly. But I can't test locally as I don't have mac.

Awesome! I will take a look after the weekend.

@tgodzik
Copy link
Contributor

tgodzik commented Jul 20, 2022

I've done nothing with handling CompletionRequest in DebugProxy line:120->146. I'm not sure if this should handle jar files - what does it actually do?

It handles completions in the debug console, so most likely it doesn't need to be handled here. Taking a look at the PR!

Copy link
Contributor

@tgodzik tgodzik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR! That was a lot of great work. I left a couple of comments and lose thoughts.

Does the PR need some more changes in VS Code extension? I seem to be getting:

image

private def decodeCFRFromClassFile(
def decodeCFRAndWait(path: AbsolutePath): DecoderResponse = {
val result = decodeCFRFromClassFile(path)
Await.result(result, Duration("10min"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Await.result(result, Duration("10min"))
Await.result(result, Duration("1min"))

I would say 10 minutes is a bit too much

@@ -11,7 +11,7 @@ import scala.meta.internal.metals.StatusBar

class StatusBarSuite extends BaseSuite {
val time = new FakeTime
val client = new TestingClient(PathIO.workingDirectory, Buffers())
val client = new TestingClient(PathIO.workingDirectory, () => null, Buffers())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
val client = new TestingClient(PathIO.workingDirectory, () => null, Buffers())
val client = new TestingClient(PathIO.workingDirectory, () => None, Buffers())

could we switch the parameter to Option ?

// cope with this file being a de-compiled class file. Java won't compile a source file with a `.class` ext
val localSource =
sourceRoot.resolve(
s"${source.filename.stripSuffix(".class").stripSuffix(".java")}.java"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
s"${source.filename.stripSuffix(".class").stripSuffix(".java")}.java"
s"${source.filename.stripSuffix(".class")}.java"


val sourceRoot = localSource.parent
val buildTarget = buildTargets
.inferBuildTarget(source)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move the below to inferBuildTarget ? I think that could be useful for more scenarios. Also, do we need to distinguish Scala and Java targets here? The logic seems the same.

Comment on lines +81 to +84
val prints = fingerprints.get(path)
if (path.isJarFileSystem && prints != null && prints.size > 0)
Option(prints.peek()).map(_.text)
else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
val prints = fingerprints.get(path)
if (path.isJarFileSystem && prints != null && prints.size > 0)
Option(prints.peek()).map(_.text)
else {
if (path.isJarFileSystem)
FileIO.slurp(path, charset)
else {

it should be ok to just read for jar and this is only used for synthetics decorator, which caches last document.

Comment on lines +136 to +141
} else if (uri.startsWith("metalsfs:")) {
// shouldn't need encoding. Definitely don't encode using URI("metalsfs", "", path, null) as this will create triple slashes: metalsfs:///somejar
uri
} else
throw new IllegalStateException(s"Why here? $uri")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} else if (uri.startsWith("metalsfs:")) {
// shouldn't need encoding. Definitely don't encode using URI("metalsfs", "", path, null) as this will create triple slashes: metalsfs:///somejar
uri
} else
throw new IllegalStateException(s"Why here? $uri")
}
} else {
// shouldn't need encoding. Definitely don't encode using URI("metalsfs", "", path, null) as this will create triple slashes: metalsfs:///somejar
uri
}
}

fingerprints.add(path, FileIO.slurp(path, charset))
fingerprints.add(
path,
if (path.isJarFileSystem) params.getTextDocument.getText
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need to add fingerprints for jarFileSystem. There are really only useful if the file can change,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree except I think fingerprints are used to get the text of the file in SyntheticsDecorationProvider#enrichWithText but I would like that to change so the text is passed in from the LSP message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So do you save it in the fingerprints to avoid the complexity of reading it from the file system?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I came to look at auto-decompiling class files using CFR and realised the decompile took a few seconds for every time that file is read from disk, I started tracing how often Metals uses FileIO.slurp and it seems to use it a lot and, to me, that seems unnecessary since all the LSP messages send us the file contents. I see that fingerprinting is used to go back to the last successfully compiled version but I don't get why it's used in SyntheticsDecorationProvider#enrichWithText instead of just using the contents passed by LSP and I don't get why it's used in many other places e.g. InteractiveSemanticDB, when the text is readily available. I doubt this has been an issue with any files present on the file system but when you have to decompile the class file then that has to be wired in and it still takes time so why do it at all?
But yes - mostly because of the additional decompile phase needed when slurping a class file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that fingerprinting is used to go back to the last successfully compiled version but I don't get why it's used in

We only have semanticdb for the last compiled state, so whenever we use it we try to check for the differences between the current version and the compiled one to calculate adjustments to positions.

But yes - mostly because of the additional decompile phase needed when slurping a class file.

I think that should be fine then, but for that case ideally we would use buffers, which contain all the currently opened files. Fingerprints are also saved currently in between sessions and there is no need to save them if they don't really change.

error: String,
)

final class LSPFileSystemProvider(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if creating out own FileSystemProvider wouldn't actually cause less work (less changes and no need for a mapper), we could handle those URIs like normal ones and paths should be fine also. We would only mos likely need to map them for DAP

A lot of the methods in https://docs.oracle.com/javase/8/docs/api/java/nio/file/spi/FileSystemProvider.html can be left blank.

I can try and take a look at it if you don't have the time (unless you think it's not a great idea)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tgodzik I have created a working FileSystemProvider in #3750. It's pretty much complete - I just never pushed the last version because it was slow. I guess I could spend some time optimising. My other issue was the URL translation for the custom URLConnection class needed - it's an area that I don't know much about so I'm not sure how many types of URL have to be dealt with for the URLClassLoader to work with the custom FileSystem - I ended up going round in circles a bit there. You're right in that it's less invasive to the rest of the codebase - no mapping required except for the debugger. If you want, I can push the latest version of #3750 and you can decide for yourself if you think it's a better approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the issue with URLClassLoader is complicated then maybe the current approach would work better. Let me think about it for a bit.

}

import scala.collection.mutable
// TODO this is not thread safe - do we care?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can drop the cache, I don't really see much benefit to it.

@Arthurm1
Copy link
Contributor Author

@tgodzik There is possibly a third way of implementing this.

The LSP filesystem could present the exact file system info the jar has.
e.g. jar:file:///home/me/coursier/https/repo1.maven.org/maven2/org/scala-lang/scala-library/2.12.16/scala-library-2.12.16.jar

This would mean no special metalfs: uri scheme or java file system needing to be created. Or any mapping of URIs within message handlers. Or any mapping for the debug adapter.

The downside is that the file explorer would not be flat so the jars would be in deep dirs. Breadcrumbs would work within a jar - which is likely where they are most used.
To solve the file explorer issue - we could add a additional nodes to the Packages Tree view (under Projects and Libraries) which would allow the user to browse the jdk, source jars and workspace jars.

I think this would add a chunk of functionality to the user (breadcrumb navigation and jar browsing) that could be built on by Metals later. Then we could either go down the java file system root or down the mapping root but that could be decided at a later date.

Ideally something like microsoft/vscode#124903 would be implemented so that a flat structure could be represented by the LSPFileSystemAPI but the original jar: URIs would be used for file management

@Arthurm1
Copy link
Contributor Author

Arthurm1 commented Aug 7, 2022

There is possibly a third way of implementing this.

The LSP filesystem could present the exact file system info the jar has.
e.g. jar:file:///home/me/coursier/https/repo1.maven.org/maven2/org/scala-lang/scala-library/2.12.16/scala-library-2.12.16.jar

Just remembered we can't do that as VSCode doesn't like "jar:file:" uris under the FileSystemAPI, only under virtual documents.

@tgodzik
Copy link
Contributor

tgodzik commented Aug 8, 2022

There is possibly a third way of implementing this.
The LSP filesystem could present the exact file system info the jar has.
e.g. jar:file:///home/me/coursier/https/repo1.maven.org/maven2/org/scala-lang/scala-library/2.12.16/scala-library-2.12.16.jar

Just remembered we can't do that as VSCode doesn't like "jar:file:" uris under the FileSystemAPI, only under virtual documents.

Ach ok, I am wondering though if we could do a proxy filesystem that would exactly the same mapping we do here and would work automatically with the new URIs. Would that be possible? I haven't yet looked into it, but I plan to see if we can improve this PR a bit in terms of maintainability

@Arthurm1
Copy link
Contributor Author

@tgodzik

I am wondering though if we could do a proxy filesystem that would exactly the same mapping we do here and would work automatically with the new URIs. Would that be possible? I haven't yet looked into it, but I plan to see if we can improve this PR a bit in terms of maintainability

#3750 is pretty much a proxy filesystem. It fakes the initial path section: metalsfs:/root/source/somejar.jar and then delegates any further path to the jar: filesystem. I can look into why it's slow if you're not keen on this mapping PR.

@tgodzik
Copy link
Contributor

tgodzik commented Aug 30, 2022

@tgodzik

I am wondering though if we could do a proxy filesystem that would exactly the same mapping we do here and would work automatically with the new URIs. Would that be possible? I haven't yet looked into it, but I plan to see if we can improve this PR a bit in terms of maintainability

#3750 is pretty much a proxy filesystem. It fakes the initial path section: metalsfs:/root/source/somejar.jar and then delegates any further path to the jar: filesystem. I can look into why it's slow if you're not keen on this mapping PR.

Sorry for talking so long to look into this! I am just still trying to figure out which is the best way here. I would probably prefer the filesystem as the mapper adds the requirement that we always need to remember to map the URI, which is a bit less maintainable I think. On the other hand if using the file system is slow, I can try and look into that. In which cases was it slow?

I haven't forgotten about your PR and would love to have it merged, just want to minimize the amount of potential issues.

@ckipp01 ckipp01 added the experiment Ticket related to an experiment or experimental features label Jan 2, 2023
@ckipp01
Copy link
Member

ckipp01 commented Jan 2, 2023

Going to mark this as a draft for now. If you dive back into this, please do mark it back as ready to review!

@ckipp01 ckipp01 marked this pull request as draft January 2, 2023 14:02
@tgodzik
Copy link
Contributor

tgodzik commented Jan 2, 2023

Going to mark this as a draft for now. If you dive back into this, please do mark it back as ready to review!

It's really stuck on me, since I was trying to think on how to make it a bit less complex and have so far failed to figure it out :/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
experiment Ticket related to an experiment or experimental features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants