Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Find definition won't work for external dependencies in Clojure #661

Closed
Andre0991 opened this issue Apr 5, 2021 · 55 comments
Closed

Find definition won't work for external dependencies in Clojure #661

Andre0991 opened this issue Apr 5, 2021 · 55 comments
Labels
enhancement lsp insufficiency As far as we know, LSP doesn't have enough power to deal with this

Comments

@Andre0991
Copy link

Andre0991 commented Apr 5, 2021

Hi. I'm experimenting with using eglot with Clojure.

I've had no issues so far except that xref-find-definitions won't work for external dependencies.
When I use it on some symbol that refers to an external lib, Emacs asks me if I want to create the directory:

Directory `/Users/andreperictavares/dev/peric/clj-new/src/clj_new/file:/Users/andreperictavares/.m2/repository/org/clojure/clojure/1.10.1/clojure-1.10.1.jar!/clojure/java/' does not exist! Create it? (y or n) y

The expected behaviour would be to open a buffer with the contents of the file in the jar (and manage it in the same project? Not sure).

Here's the an eglot log sample for this issue:

[client-request] (id:7) Sun Apr  4 21:44:24 2021:
(:jsonrpc "2.0" :id 7 :method "textDocument/definition" :params
          (:textDocument
           (:uri "file:///Users/andreperictavares/dev/nu/luna/src/luna/logic/error.clj")
           :position
           (:line 4 :character 23)))
[server-reply] (id:7) Sun Apr  4 21:44:25 2021:
(:jsonrpc "2.0" :id 7 :result
          (:uri "jar:file:////Users/andreperictavares/.m2/repository/prismatic/schema/1.1.9/schema-1.1.9.jar!/schema/core.clj" :range
                (:start
                 (:line 0 :character 4)
                 :end
                 (:line 0 :character 15))))

Steps to reproduce

  1. Download the clojure binary

  2. Download the native clojure-lsp binary in https://github.com/clojure-lsp/clojure-lsp/releases

  3. Configure eglot to use it

(add-to-list 'eglot-server-programs '(clojure-mode . ("clojure-lsp")))
  1. Clone clj-new and install its dependencies.
git clone https://github.com/seancorfield/clj-new.git
clj
  1. Open the project in Emacs, go to clj-new/src/clj/generate/def.clj and use xref-find-definitions on some external lib like clojure.java.io (line 5 in this file).

Note: I'm using Doom Emacs. I think this does not affect anything in regards to this behaviour. Unfortunately, I don't have a clean Emacs for confirming this easily right now. If you suspect this is Doom specific, I can setup a barebones Emacs in a few days.

@joaotavora
Copy link
Owner

The expected behaviour would be to open a buffer with the contents of the file in the jar (and manage it in the same project? Not sure).

Yeah, this isn't supported. Could it be? Maybe yes. But I'm unsure if it should be in the context of Eglot or simply Xref and Project, which are two packages that Eglot depends on. I lean towards Xref and Project. We can the maintainer of both those projects here, @dgutov, and allow him to give his opinion, or you can open a bug in Emacs' bug tracker, by sending mail to bug-gnu-emacs@gnu.org or using M-x report-emacs-bug.

Once the Xref + jar opening is sorted (supposing it is doable) we can think about whether that file should be associated with the same project. Indeed, as you suspect, the situation isn't linear because the file could be also found by M-. in some other eglot project. I think I've done something practical to that effect recently, but I can't remember how. Perhaps I didn't push it to master 🤔 . It had something to do with project-external-roots which seems like a good concept to apply here.

@dgutov
Copy link
Collaborator

dgutov commented Apr 12, 2021

I don't think it's really Xref or Project's responsibility, but here are some suggestions.

  • Emacs can open files inside .jar files and other archives. But for some reason file-name-handler-alist is only set up to open the archive files themselves differently, but without translation for file names inside. Someone could look into fixing that. Eglot could translate file names returned by the language server into the expected format.
  • Or Eglot could simply provide specialized xref-location classes, where the xref-location-marker location opens a file inside an archive, if needed. This should certainly be doable, whereas the jury is out on Getting eglot to work with php-language-server #1 (could be some pitfalls, as indicated by nobody having done this yet).
  • Yet another twist on the last option: follow in lsp-mode's example and introduce the notion of url-handlers: https://github.com/emacs-lsp/lsp-mode/blob/63b594930f99aa1a81dce5118c28f5fdd838fc29/clients/lsp-clojure.el#L285 (customized per-client). In Clojure's case, the language server provides a method clojure/dependencyContents, to fetch the dependency's contents. Downside: you probably can't edit the file this way (not sure if you often want to). Upside: it's generalizable to other language servers and other resources. jdtls has a similar action called java/classFileContents; not sure if there are others yet.

Approach #2 could be conceivably added to xref.el itself. Approach #3 (which I personally like) doesn't have that option.

Regarding how to track which dependencies belong where, I don't have much better to suggest than some ad-hoc options (perhaps track the list of files/buffers opened this way, associated with a project, or perhaps track lists of directories, if LSP servers report "library paths"). If you go the "virtual buffer" route (#3), it could simply have default-directory value belonging to the project.

As you are both the provider and the consumer of the project information in this case, I'm not sure if following the project.el framework holds much value to you. E.g., you could set project-vc-external-roots-function in Eglot-managed buffers, but would users expect their project-related command to change behavior because they simply started using Eglot? I'm not sure. Using the existing project-find-functions can be handy to avoid choosing such logic by yourself (if the LSP protocol doesn't give any advice on that), but you have no requirement to limit yourself to project.el's structures otherwise.

Here's what I've done in another tool (not LSP-related): for each open project, there is a corresponding connection buffer. That buffer has a buffer-local variable set to the "library path" for that project. Then I do a linear search across such buffers whenever the current file is seemingly not attached to any project directly.

@joaotavora
Copy link
Owner

I said Xref becasue this is a general cross-referencing problem. It could conceivably exist without LSP, which is Eglot's main business, and be useful in other situations.

Eglot could translate file names returned by the language server into the expected format.

Translate LSP formats into Emacs formats is indeed Eglot's business. But where can I get hold of this "expected format"?

If no expected format exists, we should make one. That's number 2 and it should live in Xref (or xref-backends.el or xref-utils.el distributed separately)

In Clojure's case, the language server provides a method clojure/dependencyContents

This is the curious absurdist view of LSP as language-agnostic-but-not-really. These extensions belongs in clojure-mode-.el, obviously. But if emacs-lsp has some generic url-handlers trick to open files inside jars then maybe it could contribute them to Emacs so that more packages can benefit.

Regarding how to track which dependencies belong where [...] Here's what I've done in another tool (not LSP-related):

If the same problem is being solved inside and outside the scope of LSP, this again hints at the need of a shared library.

But first I'd solve the file-finding problem.

@dgutov
Copy link
Collaborator

dgutov commented Apr 12, 2021

I said Xref becasue this is a general cross-referencing problem. It could conceivably exist without LSP, which is Eglot's main business, and be useful in other situations.

It explicitly encourages extensibility to allow for special cases. I'm not sure how "core" finding files inside archives is to its purpose, but if it's implemented sufficiently general, it could be added.

If you go this route, do consider option #1 first. It should take ~ the same amount of code, but then you'd be able to address files inside archives anywhere, not just using Xref.

I also wonder which of the approaches would work better with Tramp. Like, if the LS is on a remote host, and we add an addressing scheme for files inside archive files, would Tramp have to add special handling for files inside archives on remote hosts?

This is the curious absurdist view of LSP as language-agnostic-but-not-really.

The view of current reality, shrug.

These extensions belongs in clojure-mode-.el, obviously. But if emacs-lsp has some generic url-handlers trick to open files inside jars then maybe it could contribute them to Emacs so that more packages can benefit.

These handlers are per-server, not per-language (but good luck convincing clojure-mode maintainers anyway). lsp-clojure and lsp-java also have different handlers, even though the end behavior is very similar, because different servers use different urls and commands.

If the same problem is being solved inside and outside the scope of LSP, this again hints at the need of a shared library.

The implementation is trivial, the real difficulty is choosing which data to use, and where to get it from.

@joaotavora
Copy link
Owner

If you go this route, do consider option #1 first. It should take ~ the same amount of code, but then you'd be able to address files inside archives anywhere, not just using Xref.

That sounds fine. But you said there's a format that Eglot should translate to. Where is that format? Or should I invent it inside Emacs and distribute a package from it? Didn't you say there is code to find files inside jars?

The view of current reality, shrug.

My concrete reality as someone who actually uses LSP is using it without any such trash.

(but good luck convincing clojure-mode maintainers anywa

I don't use clojure, so I'm not going to be doing any convincing. Clearly doesn't belong in Eglot. You said there's a clojure server with some third-party fancy extensions, maybe clojure-mode maintainers or someone else wants to leverage that using Eglot as a library

The implementation is trivial, the real difficulty is choosing which data to use, and where to get it from.

I wasn't referring to the implementation, just pointing out that the problem seemingly reaches out of LSP. I don't fully understand the problem and its quirks yet, though. My naive understanding of the statement is "how to have a file somehow referenced by two projects, and what advantages and limitations does that bring?"

@dgutov
Copy link
Collaborator

dgutov commented Apr 12, 2021

Where is that format? Or should I invent it inside Emacs and distribute a package from it?

Invent, yes. Create a file name handler and distribute it from somewhere. But first I would check why it's still not done yet: after all, archive files themselves already have file name handlers.

Didn't you say there is code to find files inside jars?

It certainly exists, since you can visit a .jar file and even RET into any of its contents. But I haven't looked in detail into where this code resides.

You said there's a clojure server with some third-party fancy extensions, maybe clojure-mode maintainers or someone else wants to leverage that using Eglot as a library

As long as there is a good extensibility story, I suppose interested parties could create an eglot-clojure package. Or indeed incorporate that into a major mode, whatever.

But it will require infrastructure, such as the use said url handlers in the eglot-specific location type's xref-location-marker implementations. Or, if you go the file-name-handlers route, in the translation of uris returned by the server to the file names in the format understood by the file name handler.

I wasn't referring to the implementation, just pointing out that the problem seemingly reaches out of LSP. I don't fully understand the problem and its quirks yet, though.

It's the kind of problem that doesn't easily lend itself into being extracted into a library. project.el could end up solving it for its goals one day, and then documenting it, and the only part you'd really be able to reuse is the documentation, I think.

My naive understanding of the statement is "how to have a file somehow referenced by two projects, and what advantages and limitations does that bring?"

My solution is essentially a hack (just use the most recently used project that has this file in the "externals" set), with the upside of having most features seamlessly working, and the downside of sometimes not guessing the user's intent correctly (using the "wrong" project is more problematic for some commands than others, e.g. it's fine for "find definition" but not for "find references").

@joaotavora
Copy link
Owner

But first I would check why it's still not done yet: after all, archive files themselves already have file name handlers.

How to check that? If you have any suspicions, please share, else I think it's easier just to try it.

Or, if you go the file-name-handlers route, in the translation of uris returned by the server to the file names in the format understood by the file name handler.

In this case, since there is no format and I'm supposed to invent one, I might as well just use the same format as the clojure LSP server is using here, if it's minimally useful. So no translation necessary. Else, if LSP comes up with an official format that's not gratutiously different, translation is Eglot's job.

@dgutov
Copy link
Collaborator

dgutov commented Apr 12, 2021

How to check that? If you have any suspicions, please share, else I think it's easier just to try it.

Try the code, search its history, C-x v g, ask the people who touched it in the past. Michael, probably.

In this case, since there is no format and I'm supposed to invent one, I might as well just use the same format as the clojure LSP server is using here, if it's minimally useful.

IIUC there are some considerations to use for choosing particular addressing scheme. Michael could advise on that as well.

Else, if LSP comes up with an official format that's not gratutiously different, translation is Eglot's job.

Have you compared the formats used by clojure-lsp and jdt.ls?

@joaotavora
Copy link
Owner

Have you compared the formats used by clojure-lsp and jdt.ls?

No, have you? This format looks fine. If someone else (jdt or hopefully LSP itslef) shows another format then Eglot will do some minimal translation. Same thing if Michael or the facts of code show that Clojurels format isn't suitable.

@dgutov
Copy link
Collaborator

dgutov commented Apr 12, 2021

No, have you?

I'd have to install both to really do that. This handler looks pretty involved, though:

https://github.com/emacs-lsp/lsp-java/blob/master/lsp-java.el#L837-L877

@joaotavora
Copy link
Owner

yes, pretty involved. Better stay away from it

@joaotavora
Copy link
Owner

OK, so to sum up, I think this

  • is a low priority feature request for me personally (I don't have time right now), but anyone else can pick it up
  • it should be implemented in Emacs itself, as a attributable package that Eglot can be made to depend on
  • it should in theory be as simple as making that library add an element to file-name-handler-alist, with some supporting function. This is @dgutov 's idea number 1.
  • the format to use shall be the one described here https://docs.oracle.com/javase/7/docs/api/java/net/JarURLConnection.html, unless some good reasons exist to deviate from it, in which case Eglot will translate from that format to Emacs's format

@Andre0991 since you seem to be interested in Lisp languages, you can try to develop this. By scratching your itch, you'll help Eglot, Emacs and many others.

@joaotavora joaotavora added enhancement lsp insufficiency As far as we know, LSP doesn't have enough power to deal with this labels Apr 12, 2021
@dgutov
Copy link
Collaborator

dgutov commented Apr 12, 2021

Sounds like one hell of a first issue: lots of investigation and organizational stuff, not so much tech.

@joaotavora
Copy link
Owner

Yes, it's a bit ambitious as a first issue. One should definitely start just experimenting briefly with file-name-handler-alist ( which as far as I know is documented) and then think about making the library. It was just a suggestion for @Andre0991 , anyway. I'm not saying I'm never going to pick this up myself, just not in the foreseeable future, so I was just registering those bullet points as a summary of what I think should be done after all this analysis.

@Andre0991
Copy link
Author

Olá.

First, I need to amend my issue description.

I found out that there is this config.edn file that dictates some settings to clojure-lsp, including the URI format. Sorry about that - I didn't remember this file existed at all, I just realized that when mentioning the URI path thing to a coworker who is familiar with clojure-lsp.
This config file can exist on a per-project or globally in ~/.lsp/.config.edn. It has the option dependency-scheme, whose default value is zip, but that can be changed to jar, which happens to be my case.

For more details, please see https://clojure-lsp.github.io/clojure-lsp/settings/.

On dependency-scheme, it says:

How the dependencies should be linked, jar will make urls compatible with java's JarURLConnection. You can have the client make an lsp extension request of clojure/dependencyContents with the jar uri and the server will return the jar entry's contents. Similar to java clients.

I see that both Calva and lsp-mode explicitely set this to jar in the initialization options.

If I don't set it to jar, eglot logs this:

[server-reply] (id:14) Tue Apr 13 20:52:11 2021:
(:jsonrpc "2.0" :id 14 :result
          (:uri "zipfile:///Users/andreperictavares/.m2/repository/org/clojure/tools.namespace/0.3.1/tools.namespace-0.3.1.jar::clojure/tools/namespace/find.clj" :range
                (:start
                 (:line 11 :character 2)
                 :end
                 (:line 11 :character 30))))

Also, I have a question, though I'm not sure if it's part of the scope of this issue.
Let's say the URI issue is solved and somehow Emacs knows how to open the thing in the jar.
Suppose I just did that by using xref-find-definitions.
Now, I'm wondering how this would impact eglot and the clojure-lsp server.
To be more specific: Will xref-find-definitions also work in this file, in case I need to navigate to yet another file in another jar?


I don't know if the points above impact anything you have discussed so far.

Finally: I don't know much emacs-lisp, but working on this issue looks very tempting. That said, I don't have much time right now and I see that I need to learn a lot of concepts. I'll try to do some experimenting with file-name-handler-alist on some weekend. That said if someone that is reading this knows how to do that and wants to tackle this issue, by all means, don't wait for me and do it.

@dannyfreeman
Copy link

dannyfreeman commented Oct 21, 2022

I decided to take a look at fixing this problem today and have come up with the following solution, which setups a handler for the file-name-handler-alist that can unzip jar files and dump their contents in to a new buffer.

The buffer gets associated with the current project. When used in conjunction with cider, I can evaluate expressions in the new buffer, modify it, and save it under the current project structure (does NOT update it in the jar itself, but that's fine). It seems like clojure-lsp/eglot continues to provide analysis on the unziped file in some cases.

It currently expects :dependency-scheme in clojure-lsp settings to be set to :jar, although it should be trivial to modify to support :zip.

This has been really useful for me so far. I'm going to use it at work for a couple days and see how I like it, maybe make some tweaks to how the buffer is configured.

I'm posting this solution here for other people to use. Would this code be something that the maintainers like @joaotavora might like to have submitted to this project? If not, maybe it would be suited elsewhere, like clojure-mode, or an independent package. Asking here first since this is where I found the bug ticket. Really interested to hear your thoughts on that.

(defvar clojure-jar-match
  (rx (group "/" (* not-newline) ".jar") ;; match group 1, the jar file location
       "::"
       (group (* not-newline) "." (+ alphanumeric)) ;; match group 2, the file within the archive
       line-end)
  "A regex for matching jar files to be opened by eglot and clojure-lsp")

(defvar clojure-jar-file->buffer
  (make-hash-table :test 'equal))

(defun existing-clojure-jar-buffer (file)
  (when-let ((existing-buffer (gethash file clojure-jar-file->buffer)))
    (if (buffer-live-p existing-buffer)
        existing-buffer
      (remhash file clojure-jar-file->buffer)
      nil)))

(defvar project-source-root "src/")

(defun try-place-buffer-file-in-project-src (source-file fallback)
  "Set `buffer-file-name' to SOURCE-FILE under a project's `project-source-root' directory.
If that can't be done then the buffer-file-name is set to FALLBACK."
  (let* ((root (project-root (project-current)))
         (project-src-dir (expand-file-name project-source-root root)))
    (if (file-exists-p project-src-dir)
        (setq buffer-file-name (expand-file-name source-file project-src-dir))
      (setq buffer-file-name fallback))))

;; Implement save file (save into project root src folder if it exists)
(defun clojure-jar-handler (op &rest args)
  "A `file-name-handler-alist' handler for opening clojure source files located in jars.
For this to work, the `:dependency-scheme' clojure-lsp setting should be set to `:jar',
although this could also work when set to `:zip'."
  (cond
   ((eq op 'get-file-buffer)
    (let* ((file  (car args))
           (match (string-match clojure-jar-match file))
           (jar   (substring file (match-beginning 1) (match-end 1)))
           (source-file (substring file (match-beginning 2))))
      (if-let ((existing-buffer (existing-clojure-jar-buffer file)))
          existing-buffer
        (message "Unzipping %s to show %s" jar source-file)
        (with-current-buffer (create-file-buffer file)
          (puthash file (current-buffer) clojure-jar-file->buffer)
          ;; unzip -p path/to/bidi-2.1.3.jar bidi/bidi.cljc
          (call-process "unzip" nil (current-buffer) nil "-p" jar source-file)
          (try-place-buffer-file-in-project-src source-file file)
          (setq buffer-read-only nil)
          (setq buffer-offer-save nil)
          (set-auto-mode)
          (set-buffer-modified-p nil)
          (goto-char (point-min))
          (current-buffer)))))
   (t (let ((inhibit-file-name-handlers (cons 'clojure-jar-handler
                                              (and (eq inhibit-file-name-operation op)
                                                   inhibit-file-name-handlers)))
            (inhibit-file-name-operation op))
        (apply op args)))))

(defun clojure-jar-init ()
  (add-to-list 'file-name-handler-alist (cons clojure-jar-match #'clojure-jar-handler))
  (remove-hook 'clojure-mode-hook #'clojure-jar-init))

(add-hook 'clojure-mode #'clojure-jar-init)

I am also not very proficient with emacs-lisp, and would love some feedback if anyone has the time.

Thank you!

@joaotavora
Copy link
Owner

I am also not very proficient with emacs-lisp, and would love some feedback if anyone has the time.

Well, then, great effort. I do spot a tiny bit of non-idiomatic Elisp code here and there, but nothing really serious. I guess you know Clojure, Elisp isn't so off.

I don't understand the entry point. How does one find a file in a .jar, does one just C-x C-f to it like if there jar were a directory? Anyway, from my cursory look this does seem suited to clojure-mode.el, for sure. At any rate not Eglot: except for the references to language servers and how to start them, Eglot is completely language-agostic.

@dannyfreeman
Copy link

dannyfreeman commented Oct 21, 2022

Thank you for the quick reply!

I don't understand the entry point. How does one find a file in a .jar, does one just C-x C-f to it like if there jar were a directory?

Not sure I understand this question. There isn't really any time I've C-x C-f navigated into the jar file.

clojure-lsp reports the file location like this /path/to/library.jar::path/inside/jar/to/source.clj. This hybrid path thing is eventually provided to xref-find-definitions to open up, which out of the box just opens up a blank buffer. I'm using that hybrid jar path to get to the jar, and unzipping the contents of the file inside it into a new buffer.

The buffer itself I try to associate with a new file in the current project, and if saved would be found under project-root/src/path/inside/jar/to/source.clj. If that fails then I've set the file to the full path/to.jar::path/in/jar which doesn't work for saving, but works for editing and evaluation at least.

Either way, I am happy to see what the clojure-mode.el maintainers think, and will look into publishing a new package if they don't want this.

@joaotavora
Copy link
Owner

Not sure I understand this question.

You ended up answering it. I think that, with your code loaded, if you give /path/to/library.jar::path/inside/jar/to/source.clj to C-x C-f it will open source.clj inside the jar. And the same will happen when xref-find-definitions tries to find it.

So it looks pretty good. Though on second thought, jars aren't really Clojure-specific are they? So submitting to clojure-mode may not make much sense. Also, jars are just zip files, aren't they? So do check if Emacs doesn't have some utilities similar to this one for zips that may also work for jars. Else, maybe a new package jar-utils in GNU ELPA looks good.

@dannyfreeman
Copy link

dannyfreeman commented Oct 21, 2022

Jars are just zip files and can be used to package arbitrary source files. So java, kotlin, scala, clojure, etc. I believe this might also work for .java files as is, but would have to find a way to test that.

I will look into other emacs utilities for handling zip files. shelling out to unzip might not be very portable! From there I'll find a way to test another jvm language. I appreciate the guidance.

Edit:

. I think that, with your code loaded, if you give /path/to/library.jar::path/inside/jar/to/source.clj to C-x C-f it will open source.clj inside the jar.

C-x C-f and my minibuffer completion setup struggles with that file path, but calling (find-file "/path/to/library.jar::path/inside/jar/to/source.clj") does seem to work. Very cool!

@mainej
Copy link

mainej commented Oct 21, 2022

@dannyfreeman Emacs already knows how to handle jars and their files. Try opening a jar file in your .m2 directory. When I do that, I end up in a file that looks like /path/to/.m2/repository/rewrite-clj/rewrite-clj/0.6.1/rewrite-clj-0.6.1.jar:rewrite_clj/parser/core.clj. Note the single colon separating the jar name and the file name. I wonder if clojure-lsp could be modified to return URIs of that format. (I've been helping maintain clojure-lsp lately, so I might be able to help with that ;))

@joaotavora
Copy link
Owner

Or maybe emacs facilities for handling jars could just learn to grok the double ::? Wouldn't that solve the problem as well?

@mainej
Copy link

mainej commented Oct 21, 2022

To follow up on this … it appears that arc-mode.el provides archive-mode, which is a dired style view of an archive. Because jar files follow the zip file format, archive-find-type classifies the jar as a zip. This gets translated in the mode line into “Zip-Archive”. From the dired view, you can navigate to a file.

So what I wrote above is slightly wrong... you can't just open a file within a jar. You have to open the jar first and then find a file within it.

That’s as far as I’ve gotten. I’m not sure yet how archive-mode unzips the file, or whether that’s something tools like eglot could hook into.

Or maybe emacs facilities for handling jars could just learn to grok the double ::? Wouldn't that solve the problem as well?

@joaotavora that sounds ideal, although we're way over my head already.

@dannyfreeman
Copy link

dannyfreeman commented Oct 21, 2022

I was about to post a response, but you summed it up @mainej, we still can't navigate straight to files within jars with just one command like find-file.

I can go through archive mode by replacing the call-process function in my code above with (archive-zip-extract jar source-file)

Edit: I'm going to try to find some time this weekend to search through cider's source and see what they are doing. Perhaps they will have some hints for me. I don't think they use xref-* though, so there might be a bunch of custom stuff for handling these situations. jk there is a lot of custom xref behavior it seems.

@mainej
Copy link

mainej commented Oct 22, 2022

(archive-zip-extract jar source-file)

@dannyfreeman that's exactly what cider does in cider-find-file. Thanks to @ericdallo for pointing that out.

So I'd say all signs are pointing to using archive-zip-extract from arc-mode.el to unzip jar files into temp buffers. Exactly how, where, and when to do that are open for debate. It'd be amazing to have it built into Emacs in find-file. That would have the broadest reach. But having it in eglot would be nice too, since all JVM languages would benefit from being able to open JAR files. Or perhaps it belongs in a package outside of eglot so that lsp-mode and cider can use it too.

@mainej
Copy link

mainej commented Oct 22, 2022

A note on clojure-lsp...

As you've discussed there's a setting called :dependency-scheme which controls how clojure-lsp renders URIs.

There are two valid values "zipfile" and "jar". For historical reasons "zipfile" is the default, although the recommended value is "jar" and many lsp clients set that as their own default.

Note that you might find documentation that suggests :jar (a keyword) and other invalid things. These are all equivalent to setting "zipfile", which is admittedly horribly confusing. We've planned to fix those docs. Also note that we might change the default at some point. We may even remove "zipfile" entirely, since it's infrequently used and causes confusion. Both are reasons to set it to "jar" explicitly.

The URIs for "zipfile" look like

zipfile://<path-to-jar>::<path-to-entry>

An example is zipfile:///some/path/some.jar::some/file.clj. Note that the path-to-jar begins with a slash (so 3 slashes at the beginning) and that the separator is ::.

The URIs for "jar" look like

jar:<uri-to-jar>!/<path-to-entry>

Here uri-to-jar is itself a uri, usually with the format file://<path-to-jar>. So, a full example is jar:file:///some/path/some.jar!/some/file.clj. Note that again we have 3 slashes, but the separator is !/.

You may be dismayed to see a URI in a URI. I agree, but direct your complaints to Java who defined this style. 🙄 And actually, since this is Java's preferred way of referencing its own jar files, ideally Emacs would understand this style too.

@dannyfreeman you're setting :jar but getting back "zipfile" style URIs. The fact that your regex searches for ::, not !/ confirms this. I recommend that you (and eglot) use "jar" so I'd suggest you update your setting and your regex.

@dannyfreeman
Copy link

@mainej This is great information, thank you for sharing it. There is no reason this file-name-handler-alist strategy couldn't support both zipfile and jar strategies. I'll turn on the real "jar" setting and get that working too.

I think the path forward for the time being is to get this working and published somewhere as an independent package. That will allow me to work through all the kinks, see if other language servers can benefit from it. From there we can look to see if emacs might be a better home for this code. I will post here, and in clojure lsp slack channel when I get a repo set up.

@dannyfreeman
Copy link

dannyfreeman commented Oct 23, 2022

The nested jar URIs that are provided when using the :dependency-scheme "jar" setting need some help from eglot to be parsed into something I can work with in emacs.

Relevant code:

eglot/eglot.el

Line 1503 in e501275

(retval (url-unhex-string (url-filename (url-generic-parse-url uri))))

When eglot gets a filepath like what the jar option provides jar:file:///home/user/.m2/repository/org/clojure/clojure/1.10.3/clojure-1.10.3.jar!clojure/core.clj, it parses it as a uri, which results in the jar: part being stripped off, but not the file://.
This file:// part should also be removed so that emacs itself only has to deal with the local file path + jar delimiter like this /home/user/.m2/repository/org/clojure/clojure/1.10.3/clojure-1.10.3.jar!clojure/core.clj.
If the file:// prefix is not removed, I end up going down the path of having to implement nearly every op in file-name-handler-alist in order to deal with the file:// suffix,
which I'm not confident I can do correctly. Examples on the internet are sparse.

Right now I have eglot locally parsing the uri twice, with the line linked above replaced with

(retval (eglot--parse-uri uri))

and it's definition:

(defun eglot--parse-uri (uri)
  (url-unhex-string
   (url-filename
    (let ((url (url-generic-parse-url uri)))
      (if (string= "jar" (url-type url))
          ;; jar: URIs can contain a nested URI, so we need to parse twice.
          ;; For example, `jar:file:///home/user/some.jar!/path/in/jar'
          (url-generic-parse-url (url-filename url))
        url)))))

Once that is wired up I can make emacs work with the jar setting by changing my regex a little

(defvar clojure-jar-match
  (rx (group "/" (* not-newline) ".jar") ;; match group 1, the jar file location
       (or "::" "!") "/" ;; this part is what's new
       (group (* not-newline) "." (+ alphanumeric)) ;; match group 2, the file within the archive
       line-end))

Parsing the URI twice might be something that should be placed in eglot.
I'm not sure if other JVM language servers can provide similar URIs.
I imagine they do if this is a common format.

Would eglot maintainers be interested in a patch for double parsing the URIs with a jar scheme? If not, could providing the :dependency-scheme "zipfile" to clojure-lsp when starting the process be considered instead?

Baring that, I think I could still make this work outside of eglot by adding some advice to eglot--uri-to-path or something to that effect.

@dannyfreeman
Copy link

Either patch will not be enough. This only makes it so that extra code like what I've got above capable of dealing with the jar style paths.

I'll start to prepare the patches and try to have them sent today. Congrats on getting this code merged into emacs!

@joaotavora
Copy link
Owner

This only makes it so that #661 (comment) capable of dealing with the jar style paths.

But hadn't we established earlier that there's already some code in Emacs itself which does more or less the same?

Regardless of your answer to the previous question, and imagining one of those patches in in place in lisp/progmodes/eglot.el, what is the full set of things must the user do/install to get find definitions for external dependencies? Don't be afraid to state the obvious, like "install Emacs version XYZ", or "install upcoming Eglot ELPA package version XYZ2". I just want to get the overall picture.

@joaotavora
Copy link
Owner

Fell free to answer the questions/continue discussion in the bug report where you propose the patches.

@dannyfreeman
Copy link

dannyfreeman commented Oct 23, 2022

But hadn't we established earlier that there's already some code in Emacs itself which does more or less the same?

No, there is still nothing in emacs to handle this exact situation. There is code in emacs, via arc-mode that can first open jar files, and allow users to navigate to a file within the jar using a dired like interface and open it. But nothing for navigating directly to a file within a jar via things like find-file and xref-find-defintions.

I've started creating a package for this here: https://git.sr.ht/~dannyfreeman/jarchive

imagining one of those patches in in place in lisp/progmodes/eglot.el, what is the full set of things must the user do/install to get find definitions for external dependencies? Don't be afraid to state the obvious, like "install Emacs version XYZ", or "install upcoming Eglot ELPA package version XYZ2".

Assuming eglot has one of the patches in place, the user needs to do the following

  1. Make sure your on at least emacs 28 (what I'm testing with)
  2. install clojure-lsp
  3. setup eglot according to README
  4. load this code into emacs: https://git.sr.ht/~dannyfreeman/jarchive/tree/main/item/jarchive.el
  5. call (jarchive-setup) somewhere
  6. start eglot in a clojure buffer
  7. use xref-find-definitions over a symbol defined in another project

@abcdw
Copy link

abcdw commented Oct 25, 2022

@dannyfreeman Could you provide a link to the thread with patches, please?

@dannyfreeman
Copy link

dannyfreeman commented Oct 25, 2022

Yeah, when I get them sent out I will link them here. It's taken me a little longer than I had hoped, with life getting in the way. I'll try to do it later today though.

Edit: here is the two patches zipped up for emacs. I still need to send it to the mailing list. I'm taking the time to read through the contribute guidelines, trying to get it right the first time.
patches.zip

@sw1nn
Copy link

sw1nn commented Oct 25, 2022

@dannyfreeman fyi, I tried your patch to eglot + jarchive and it works for me to xref-find-definitions from source outside jar to source inside the jar, but if I then try to xref-find-definitions for some fn defined in the 'source in the jar', I don't find definitions, but rather a text match within the file. I realise this is all getting a bit inception, I guess this might require a bit more co-ordination between xref and eglot.

@dannyfreeman
Copy link

dannyfreeman commented Oct 25, 2022

That is because I'm setting the directory of the opened 'source in the jar' to the jar's directory. eglot starts a new lsp server there, and clojure-lsp doesn't really know what to do probably because it's not in a real project directory and the actual file does not exist.

I don't know of a great way to solve this right now. One way is to trick clojure-lsp by placing the extracted file in the source path of the current project.
Here is an example of how to do that: https://git.sr.ht/~dannyfreeman/jarchive/commit/clojure-lsp-hack (notice the hard coded "/src" directory being used). Eglot is also tricked because it looks like the file is managed under the current project.

I'm not very crazy about that approach though because it makes too many assumptions. Especially that source file will be managed in a project under the src/ directory (which may not be on the classpath at all).

A work around that requires no code changes at all is using M-x write-file to relocate the opened 'sourcein jar' buffer to the correct source directory of a project. I can confirm that using jarchive to navigate to clojure/core.clj, I can make lsp work correctly in clojure/core.clj by moving it to my/project/src/clojure/core.clj then calling M-x revert-buffer.

I'm not sure what will be a good solution. Going to have to give it some thought.

@dannyfreeman
Copy link

@dannyfreeman Could you provide a link to the thread with patches, please?

@abcdw here is the thread started on gnu emacs bug list
https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-10/msg02118.html

@mainej
Copy link

mainej commented Oct 26, 2022

if I then try to xref-find-definitions for some fn defined in the 'source in the jar', I don't find definitions

@ericdallo perhaps you can help @dannyfreeman with this. I know that lsp-mode is able to do the two-level navigation project file -> dep file 1-> dep file 2 so perhaps you have some insight about how the temp buffers need to be managed to support this.

@joaotavora
Copy link
Owner

joaotavora commented Oct 26, 2022 via email

@dannyfreeman
Copy link

dannyfreeman commented Oct 26, 2022

Setting xref-extend-to-xref to t does not work, but that is because I'm changing some of the buffer-local vars like buffer-file-name and default-directory. buffer-file-name is different that what eglot uses to track in eglot--servers-by-xrefed-file so that fails. Setting default-directory also causes (eglot--current-project) to return a transient project, which I don't want. I think I can make it work though. I just have to find the right values to set.

@dannyfreeman
Copy link

dannyfreeman commented Oct 26, 2022

https://git.sr.ht/~dannyfreeman/jarchive/commit/a5ab89db3f7bf248f3f0ff0ad97767f2c9fe01e5

This change makes the file managed by eglot by simply not setting the default-directory, which uses the value from the buffer that the user xref-find-defintions from, causing project-current to pick it up.

It's still not possible to navigate around again with xref-find-defintions and friends. I think because the file itself is not saved so clojure-lsp cannot find it. This seems pretty typical.

lsp-mode seems to be doing something special with clojure by creating a temp directory, writing the extracted files to that temp directory, and adding those to the lsp workspace so that clojure-lsp can find them. Maybe I need to be doing something along those lines, I don't know.

Does eglot (or maybe project.el) have a mechanism for adding an external directory to the current project? I think that would be what I need to do this automatically.

Right now I get around this with a function I wrote called jarchive-move-to-visiting-project that just saves the extracted file in the current project.

@joaotavora
Copy link
Owner

Does eglot (or maybe project.el) have a mechanism for adding an external directory to the current project? I think that would be what I need to do this automatically.

I think project.el does. But then again this seems like a completely different approach and seems to negate the need for the patches you sent to Eglot.

In my mind, project.el should support adding jars as collections of source files just like it supports adding directories as collections of source files. Many years ago, Eclipse did this. You could add a jar as a library or a file as a library: it's just a different implementation detail.

Please respond in Emacs's bug tracker.

@abcdw
Copy link

abcdw commented Oct 26, 2022

@dannyfreeman Thank you!

BTW, I packaged jarchive for guix:
https://git.sr.ht/~abcdw/rde/tree/90af100a4d70d7016261d39b91b6748768ac374b/rde/packages/emacs-xyz.scm#L330

@dgutov
Copy link
Collaborator

dgutov commented Oct 27, 2022

In my mind, project.el should support adding jars as collections of source files just like it supports adding directories as collections of source files. Many years ago, Eclipse did this. You could add a jar as a library or a file as a library: it's just a different implementation detail.

Sounds like you want to advise project-vc-external-roots-function. Or change its whole value.

Or create an Eglot-specific project backend.

@joaotavora
Copy link
Owner

I'm going to answer this in Emacs's bug tracker https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-10/msg02118.html. It's akward and unpractical to keep commenting in both places.

@dgutov
Copy link
Collaborator

dgutov commented Oct 27, 2022

FWIW, I don't think anybody pinged me there, and I currently have 1939 unread messages in my Debbugs folder.

@joaotavora
Copy link
Owner

I was just registering the reason for closing the issue. Don't worry, I'll CC you the next email.

@andreyorst
Copy link

sorry, what's the status of this? I see this issue is closed as completed, but the mailing-list thread suggests that the feature is not yet supported?

I'm considering migrating from lsp-mode to Eglot once this feature is implemented, but the status is a bit unclear.

@dannyfreeman
Copy link

@andreyorst this issue is closed because discussion moved over to the mailing list. Work on it is still ongoing. I can post back here when its wrapped up for people who don't want to follow the mailing list discussion.

@dannyfreeman
Copy link

Posting a final resolution here. Installing my jarchive package resolves this issue: https://git.sr.ht/~dannyfreeman/jarchive
It's available on ELPA. The eglot-extend-to-xref setting works with it as well.

@abcdw you may want to update your guix recipe with a newer commit. Thank you for sharing that here.

@Andre0991
Copy link
Author

Hi @dannyfreeman.

I see that there are no new messages in the Emacs bug tracker 1 – will Emacs get a patch too or is jarchive a permanent solution?

@dannyfreeman
Copy link

I see that there are no new messages in the Emacs bug tracker 1 – will Emacs get a patch too or is jarchive a permanent solution?

There is still discussion on that ticket in the 2022-11 archives. I think this is the latest message at the time of me writing this. The way the mailing list archives are displayed can be confusing.

The way I see it, jarchive is the solution for now, and it is on ELPA: https://elpa.gnu.org/packages/jarchive.html which is the most available place for it to be next to emacs core IMO.

Maybe down the line jarchive or something like it will make its way into Emacs. I think that would be nice and I'll keep discussing it. Until then ELPA is a nice proving ground for this functionality.

@Andre0991
Copy link
Author

Oh, indeed, I missed the newest messages.

Thank you so much for tackling this – I just installed jarchive and LSP+Clojure is 100% supported by Emacs now.

@pesterhazy
Copy link

FWIW, I just tried jarchive, and it doesn't fix the issue for me. I get an error like this:

Error in post-command-hook (#[0 "\303\304\301\242\305#\210\300\306�!\205��r\211q\210
?\205��\307\310\311 \")\207" [#<buffer fs.cljc> (#0) eglot--managed-mode remove-hook post-command-hook nil buffer-live-p apply eglot--connect eglot--guess-contact] 4]): (file-missing "Setting current directory" "No such file or directory" "/Users/user/.m2/repository/babashka/fs/0.1.2/fs-0.1.2.jar::babashka/")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement lsp insufficiency As far as we know, LSP doesn't have enough power to deal with this
Projects
None yet
Development

No branches or pull requests

9 participants