lsp/eglot configuration should be documented in the readme #71
Comments
One thing worth mentioning, that is natural to me but represents a real difference using corfu, pertains to its (relatively) long retention of the table function. Corfu basically expects backends to do good caching, invalidate their own cache, and re-compute completions, as needed. For backends that do lots of caching during one call to their CAPF (most I've encountered), corfu adds its own filtering via completion styles on top. Looking at LSP-mode's CAPF, you see that a single table function instance does in fact re-request completions from the server, but only if the starting position has changed or the original prefix no longer matches. So by default lsp-mode does a fair bit of caching, but recomputes candidates "when needed". This is very good behavior IMO. Many backends, it would appear, inappropriately cache their candidates, even when those candidates are no longer relevant. They expect the front-end to notice when a cache should not be relied upon, and take corrective action by dumping the completion table and getting a new one. To deal with this ambiguity, company, as discussed, aggressively dumps the completion table on new key events. The net effect for lsp-mode is querying the server on each key event (which for slow servers probably represents a small corfu advantage). LSP-mode uses a fuzzy sorting algorithm to order candidates. With corfu, additional key events do not dump the table, which means non-trivial completion styles (like orderless) can act, but only on the filtered/sorted cached results from the original call to the CAPF. This represents a kind of double-filtering. In fact, corfu relies on backend caching [1]. If a server-attached backend attempted to update candidates in real time based on what is typed in the buffer/at the prompt, completion styles would be defeated. Some people I think find this bizarre, since with corfu, the backend controls candidate selection up to the point of caching, and then corfu + completion styles controls sorting/selection while the cache is active. And there is no indication when/if the backend has dumped its cache. The additional cognitive difficulty comes from the fact that "one corfu popup session" is not equivalent to "one caching session" necessarily. And this means that even with a simple completion style, you'll get different results from, e.g. [1] Or its extreme form: "having the full candidate data-set available", as in the obarray for elisp-mode. |
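For concreteness, a schematic sketch (not lsp-mode's actual code) of the invalidation pattern just described: the table closure keeps its cached candidates while the original prefix still matches, and re-requests otherwise. `my-request-candidates` is a hypothetical stand-in for a server query.

```elisp
;; Schematic sketch of the "cache, but re-compute when needed" pattern
;; described above.  Not lsp-mode's actual code; `my-request-candidates'
;; is a hypothetical function that asks a server for candidates.
(defun my-server-capf ()
  (let* ((beg (save-excursion (skip-syntax-backward "w_") (point)))
         (end (point))
         (start (copy-marker beg))
         (prefix (buffer-substring-no-properties beg end))
         (cache (my-request-candidates prefix)))
    (list beg end
          (lambda (str pred action)
            (let ((input (buffer-substring-no-properties start (point))))
              ;; Keep the cache while the original prefix still matches;
              ;; otherwise re-request from the server.
              (unless (string-prefix-p prefix input)
                (setq prefix input
                      cache (my-request-candidates input))))
            (complete-with-action action cache str pred))
          :exclusive 'no)))
```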
Most of the points you describe are a consequence of the capf/completion API design. Corfu queries the Capfs and gets a completion table and an initial prefix as a result. This completion table is kept alive and the backend has to recompute candidates when needed. I think there is not much doubt that this is valid behavior as long as the prefix itself is unaltered. The edge case is when the prefix is shortened by backspace, since then the candidates of the completion table are suddenly less restricted than before. However there is already a misconception here, which I would like to point out: the input prefix is not necessarily a prefix. It is up to the completion style to decide how to interpret the input string. Of course this can create issues if backends are not fully compatible with all completion styles, in particular the ones which perform substring filtering instead of prefix filtering (substring, flex, orderless). Fortunately you can combine multiple completion styles and always add
We discussed this a few times already. I am still convinced that the current Corfu approach makes sense.
For these reasons I don't want to add such a special rule to request the completion table again from the backend. |
I would argue: a consequence of the API design and of Corfu's design. Which differs from other (popular) choices in some small but important ways — ways which I would not want to change, but which may surprise new users accustomed to the other approach. My main suggestion is that a short note describing this might be helpful to them. Happy to help with that if you agree.
Some experienced and knowledgeable people disagree on this point, and the docs are conflicting/ambiguous. And the most common in-buffer 3rd party UI doesn't even give the backend a chance to recompute. So, whether correct or not, this isn't (yet) a common point of view. But I agree the "prefix shortened edge case" is perhaps the most important gray zone. And some backends do already adhere relatively well to this principle.
Indeed. Yet it is "the thing being completed" no matter how it is used (prefix, flex, etc.), and if it is altered, it's perhaps reasonable to give up and request a new table. In actual practice, it is not even up to the completion style to interpret the prefix, since the server or process the backend is communicating with to provide the completions may have a different point of view about relevance. For "in-process" completion like elisp the style will always dominate, but that's a rather special case. More and more completion data will be provided from "out of process", I suspect. The simple-minded/antiquated CAPF notion of "just give me everything possible and I'll filter and sort it reasonably" is not well-matched to the problem. This fact, by the way, is why I think a two-stage approach, where i) the backend suggests candidates and ii) the front-end filters them, makes such good sense, as is already the case in practice with, e.g., corfu + lsp-mode. No amount of picking completion styles will affect what an lsp-server delivers. But I can certainly filter those candidates according to my chosen style!
That's fair and as you say we've gone over it already. Perhaps the right thing to do is try to educate backend authors on the need to a) cache when possible, and b) invalidate the cache when sensible. It was good to see that LSP-mode actually does this pretty reasonably (given constraints). There does need to be some contract regarding when a UI will give up. As an extreme boundary condition, certainly if I delete the entire buffer text, a completion function shouldn't be expected to provide sensible completions any longer (corfu already deals with this). So we already know the time-horizon for a completion function is not infinite ;). |
The design is guided by the given APIs.
As of now I don't see any surprised users. It is good to discuss this, of course. On the other hand I don't see new arguments. I appreciate that you want to help - but what would be needed most is an example configuration for the different lsp clients and other configuration tips which help new users get onboarded. I don't consider documenting capf internals helpful for new users.
My arguments still stand. The docs are basically non-existent. This means I have to rely on the construction of the API - and the API leads to the Corfu design as a consequence. Furthermore I can rely on the existence of reference implementations, for example the elisp backend, which works well without dropping the table. If you don't believe me about this, implement a patch which performs completion table dropping under various scenarios. Then we can continue the discussion based on hard facts. (EDIT: After looking into a possible implementation - the cleanest solution I came up with is the
This is the business of the backend. Either the backend leaves everything to the frontend, including the interpretation of the input (is it a prefix, a substring, a regexp?), which means that the entire filtering is the business of the frontend. Or the backend has its own opinion about the input and performs pre-filtering. If that is the case, then it should also be the backend's decision when to keep or drop the cached completion candidates. In the first case no dropping should happen. In the second case dropping should only happen when the backend is broken for some reason: it performs pre-filtering but does not handle a change of input correctly.
I agree. However there is nothing preventing the backends from implementing this correctly. As I understand it, for lsp-mode it works well already? I would really appreciate it if you posted your configuration somewhere - this could be helpful for other users.
Currently the contract is pretty simple. If you set |
The documentation of the `completion-at-point-functions` hook says: |
"they should ideally only choose which kind of completion table to use, and not pre-filter it based on the current text between START and END"
It says ideally - so do whatever you want. But I am repeating the exact same argument I made a while ago. If the capf follows this guideline we will get the ideal behavior. If it does not we will get something non-ideal - but it may still work well enough in most scenarios, except for the edge case when the prefix is deleted. |
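As a point of reference, a minimal Capf written against that guideline might look like the following sketch: it only picks the completion region and a full candidate list, and leaves all filtering to the completion style. The candidate list here is a placeholder.

```elisp
;; Minimal sketch of a Capf that follows the guideline quoted above:
;; it only chooses the completion region and the table, and does not
;; pre-filter based on the text between START and END.
(defvar my-keywords '("alpha" "beta" "gamma")) ; placeholder candidates

(defun my-keyword-capf ()
  (let ((bounds (bounds-of-thing-at-point 'symbol)))
    (when bounds
      (list (car bounds) (cdr bounds) my-keywords :exclusive 'no))))

(add-hook 'completion-at-point-functions #'my-keyword-capf nil t)
```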
Here I present to you the `cape-capf-buster`:

```elisp
(defun cape-capf-buster (capf)
  (lambda ()
    (pcase (funcall capf)
      (`(,beg ,end ,_table . ,plist)
       `(,beg ,end
         ;; Re-query the Capf on every call so candidates stay fresh,
         ;; instead of reusing the initially returned table.
         ,(lambda (str pred action)
            (complete-with-action action
                                  (pcase (funcall capf)
                                    (`(,_beg ,_end ,table . ,_plist) table)
                                    (_ nil))
                                  str pred))
         ,@plist)))))

(setq-local completion-at-point-functions (list (cape-capf-buster #'cape-file-capf)))
```

EDIT: We may want to refine the cache buster to only bust the cache when the first word (before a space) of the input changes. This way we don't bust the cache and still allow Orderless filtering. |
Reworked cache buster with Orderless support:

```elisp
(defun cape-capf-buster (capf)
  (lambda ()
    (pcase (funcall capf)
      (`(,beg ,end ,table . ,plist)
       (let* ((start (copy-marker beg))
              (input (buffer-substring-no-properties start (point))))
         `(,beg ,end
           ,(lambda (str pred action)
              (let ((new-input (buffer-substring-no-properties start (point))))
                (unless (or (equal new-input input) (string-match-p "\\s-" new-input))
                  (pcase (funcall capf)
                    (`(,_beg ,_end ,new-table . ,_plist)
                     (message "BUST")
                     (setq table new-table input new-input)))))
              (complete-with-action action table str pred))
           ,@plist))))))

(setq-local completion-at-point-functions (list (cape-capf-buster #'cape-file-capf)))
```
|
@jdtsmith I added the capf buster to the Cape package. See https://github.com/minad/cape#capf-buster---cache-busting. This approach gives us a lot of flexibility. We can add all different kinds of Capf transformers to the Cape package; we will probably come up with more in the future. Please let me know what you think. The reiteration of this discussion led me to rework some of the Capfs of the Cape package to behave better with regard to caching. |
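For readers arriving here from the issue title, a hedged configuration sketch of the wrapping just mentioned, applied to lsp-mode. It assumes lsp-mode exposes its Capf as `lsp-completion-at-point` and that `cape-capf-buster` from the Cape package linked above is loaded; hook name and function names should be checked against your versions.

```elisp
;; Hedged sketch: bust the cache of lsp-mode's Capf via Cape.
;; Assumes `lsp-completion-at-point' and `cape-capf-buster' exist as named.
(defun my/lsp-corfu-setup ()
  (setq-local completion-at-point-functions
              (list (cape-capf-buster #'lsp-completion-at-point))))

(add-hook 'lsp-completion-mode-hook #'my/lsp-corfu-setup)
```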
As a total aside, I think the code in the comment linked above is not quite right anymore: LSP now starts company unconditionally, if the […]. I'm not sure how this is supposed to fit together, but I added […] |
@antifuchs I asked and that is the recommended approach. Confusing name choice. |
Oof, yeah - at least it's consistent in its nonsensical-ness (: |
Maybe the lsp-mode maintainers will reconsider in the future and rename the relevant variables in order to make this less confusing. In the longer term one could even ask if Corfu could be detected or auto-activated. But of course this requires a more significant adoption. Unfortunately there are other issues like emacs-lsp/lsp-mode#3225. Hopefully this issue is resolved soon. @antifuchs Did you observe other issues? You talked about another bug above, or was this purely related to the company auto-activation? @jdtsmith Btw, your issue emacs-lsp/lsp-mode#3201 reminded me again of the possibility to add another Capf-transformer (https://github.com/minad/cape#other-capf-transformers). We could add a transformer which takes the docbuffer and uses the first line as docsig. It would be better to implement this directly in the backend of course, since the backend knows best how to implement this efficiently and correctly (similar argument as with the caching). But we could offer a transformer as a stop gap measure. Same thing with docsig->annotation-function or even file extension->kind-function (file-image, file-text, file-binary, file-video, ...), assuming that one could define such special file kinds. |
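A rough, hypothetical sketch (not part of Cape) of the stop-gap transformer described above: derive `:company-docsig` from the first line of the buffer returned by the Capf's `:company-doc-buffer` function.

```elisp
;; Hypothetical sketch of the stop-gap transformer discussed above.
;; Not Cape's actual code: it adds a :company-docsig derived from the
;; first line of the :company-doc-buffer result, if the Capf provides one.
(defun my-capf-docsig-from-docbuffer (capf)
  (lambda ()
    (pcase (funcall capf)
      (`(,beg ,end ,table . ,plist)
       (let ((docbuf (plist-get plist :company-doc-buffer)))
         `(,beg ,end ,table
           ;; Our :company-docsig comes first, so it takes precedence.
           ,@(when docbuf
               (list :company-docsig
                     (lambda (cand)
                       (let ((buf (funcall docbuf cand)))
                         (when buf
                           (with-current-buffer buf
                             (save-excursion
                               (goto-char (point-min))
                               (buffer-substring-no-properties
                                (point) (line-end-position)))))))))
           ,@plist))))))
```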
Yeah that's a good thought and better than the affixation hack approach I used: wrap the table. I'm hopeful they can extract more usable information from the LSP data stream, but this is a good workaround. |
BTW, the capf-buster is a killer idea. It allows you to bust the cache of individual misbehaving/differently-behaving capfs as a workaround, then consult the authors to correct the behavior upstream. It's a targeted workaround, and therefore better IMO than constant across-the-board table-dumping. The latter encourages (bad) backends to be sloppy with their caches, and gives good backends no performance reward for the correct caching they implement. It also makes it possible to use capfs whose authors fundamentally disagree on the division of responsibility between the UI and the completion table in terms of keeping candidates up to date. I haven't tried it yet, but the eglot capf is in this category, and would probably respond well to cache-buster wrapping. |
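Untried, as noted above, but the wrapping itself would presumably look like this (assuming eglot's Capf is `eglot-completion-at-point` and the hook name matches your eglot version):

```elisp
;; Untried sketch, per the comment above: bust the cache of eglot's Capf.
(defun my/eglot-corfu-setup ()
  (setq-local completion-at-point-functions
              (list (cape-capf-buster #'eglot-completion-at-point))))

(add-hook 'eglot-managed-mode-hook #'my/eglot-corfu-setup)
```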
Thanks! Yes, and it keeps the current structure of Corfu intact - the cache busting is done by the Capf/table. Note that Corfu basically only cares about the completion table for its entire process, not about the Capf. Capfs are only the provider of the completion table. Furthermore, as we discussed, there are many problematic scenarios. The benign case is a static completion table (obarray), where the entire filtering happens in the completion style. Issues are kind of expected for more complicated scenarios and combinations with various completion styles. They can be worked around on a case-by-case basis. |
In fact oddly after testing I've now come to realize that lsp-mode does not correctly dump its cache when the prefix is altered, despite having code which very much appears to be doing this. Perhaps a bug, and likely hard to detect given company is the default. I'll give the buster a try with it and let you know how it works. |
Maybe this should be improved on the side of lsp-mode, but I would like to point out one specific problem: If you use Orderless the cache dumping should happen when you alter the prefix (in both directions!), but only as long as the prefix does not contain spaces. The dumping should also happen when you make the prefix longer since I've heard that lsp servers limit the amount of returned results. If you use a longer prefix, you may get different results. Therefore you have to use CMP=equal for the capf buster.
Now the problem with space is specific to Orderless - the cache busting depends on the completion style the user has configured! This is a strong reason to keep the cache buster separate and neither hard-coded in the UI nor hard-coded in the backend! |
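Since the right busting rule depends on the completion style, one way to keep it configurable is to give the buster from the earlier comment an explicit validity predicate. The sketch below is one possible shape, not necessarily Cape's final API; the use of `lsp-completion-at-point` in the usage line is an assumption.

```elisp
;; Sketch of a buster parameterized by VALID (a predicate taking OLD-INPUT
;; and NEW-INPUT, returning non-nil if the cached table may be kept).
;; Not necessarily Cape's actual API; shown to illustrate the point above.
(defun my-capf-buster (capf valid)
  (lambda ()
    (pcase (funcall capf)
      (`(,beg ,end ,table . ,plist)
       (let* ((start (copy-marker beg))
              (input (buffer-substring-no-properties start (point))))
         `(,beg ,end
           ,(lambda (str pred action)
              (let ((new-input (buffer-substring-no-properties start (point))))
                (unless (funcall valid input new-input)
                  (pcase (funcall capf)
                    (`(,_beg ,_end ,new-table . ,_plist)
                     (setq table new-table input new-input)))))
              (complete-with-action action table str pred))
           ,@plist))))))

;; For servers that limit or re-rank results, compare with `equal':
(setq-local completion-at-point-functions
            (list (my-capf-buster #'lsp-completion-at-point #'equal)))
```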
Yeah. And for "pure prefix completers" (like the iPython mode I'm working on) you'd want/need to dump only the last two. In fact that's what I prefer even for things like lsp-mode, for which the following will never hold true:
That is a very "completion data are in-process" viewpoint. We will never get something like LSP to give us all possible completions at a point, independent of the text in the buffer! Too expensive. So I prefer the following: if the in-buffer prefix text is unchanged, keep the same cached candidates and filter them using subsequent input in the buffer. I've tried the capf-buster with lsp-mode and it works great. Still not sure why its own cache invalidation is failing; I may open a report. |
Looking forward to this. I fiddled quite a bit with cape but I think I've found a satisfactory design now. It is all not that obvious.
|
Don't be too excited, it will likely not pass your smell-test, as it requires some deep iPython hackery to get completions out quickly in usable form. I just recently figured out how to work around the fact that accept-process-output sometimes exits non-locally for no good reason. It was most readily revealed by dialing `corfu-auto-delay' down to very low values like 0.05s. I'm not sure if you are an interactive python user (I've only recently started), but I have put up one small piece of the system already — python-mls. Working on this does make clear how much conceptually simpler "prefix-only" completion systems are. But then again I often wish I had orderless in zsh...
Haven't checked the new stuff, but how are you handling the pre-existing `completion-table-with-cache'? Superseding it? |
I use it from time to time, but I am not a heavy user.
Indeed.
I would say
For |
Good thoughts, the in-buffer comparison does seem smarter. Another leftover from the "completion is prefix-completion" era. I had forgotten about that issue. As an aside, my current cache invalidation (in the table function) looks like:

```elisp
(when (and last-probe (not (string-empty-p probe))    ;; should be good
           (not (string-prefix-p last-probe probe)))  ;; but alas
  (ipython-better--completion-data-recalculate)       ;; invalidate cache
  (setq last-probe probe))
```

(probe = the STR argument passed to the table). This is prefix-only, but it's possible I shouldn't be considering the probe text at all. I wish there were documentation somewhere for what probe-strings different completion styles will pass. |
Yeah, but it is actually a side effect, since the completion table accesses its outer context. But there is no alternative: the tables which are affected by this are already impure to begin with. The only pure tables are ones like the obarray, or tables produced by capfs which precompute the candidates early, then capture them in the table closure and retain them there forever without ever checking the buffer string again.
I cannot judge from the small snippet if this will do it. However, intuitively I think it is insufficient, since you rely on the STR argument and not the in-buffer text. As we discussed, this STR argument is only meaningful for […]. I really recommend you look in detail into […]. You may also consider implementing your table "naively" and then relying on external combinators like […]
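For reference, minimal sketches of the two built-in table helpers discussed earlier in the thread; `my-fetch-candidates` is a hypothetical function that returns a list of candidates for a given string.

```elisp
;; `my-fetch-candidates' is a hypothetical function (string -> list of strings).

;; Calls the function each time the table is queried (no caching):
(defvar my-dynamic-table
  (completion-table-dynamic #'my-fetch-candidates))

;; Like the above, but caches the last result so repeated queries with the
;; same input don't re-run the function:
(defvar my-cached-table
  (completion-table-with-cache #'my-fetch-candidates))

;; Either value can be returned from a Capf as the TABLE element.
```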
Thanks for these thoughts. Most tables are indeed impure. The key is that I do not rely on STR for the generation of candidates, but instead provide the entire line to iPython and ask it to work it out. So I'm not giving STR double-duty. But I haven't explored all possible completion styles. Orderless works great (for filtering), but it is sort of a special case given its heavy use of "out of band" filtering using `completion-regexp-list' — in some ways it's more forgiving of "impurity". I did dig around with -dynamic and -cache before coming up with the design. As a result of this, I have real sympathy for the authors of eglot/lsp-mode/etc., where, when you start completion, you do not yet even know what your server will consider the completable prefix! I solve this by "conservatively guessing", then using bounds to "touch up" the completing region after hearing back from iPython. But definitely a frontend-backend mismatch. I've also been thinking about the need for naive CAPFs for this to be an approachable combinatorial problem. The real complexity is that there are CAPFs which:
|
And necessarily impure!
Both these cases work well with the capf buster. The capf buster comes before the existing cache so it will overrule it.
I disagree with this approach for various technical reasons, in particular since it introduces global state and a memory leak, but it is not that big of a deal. Company at least does a bit of caching itself, see […]. I would like to note: if you use the […] |
Yes. Here's a puzzler and an example of a "sort of working" cache: lsp-mode's cache invalidation failing when typing slowly, but working when you type "fast"... (screen recording attached: Screen.Recording.2021-11-26.at.10.54.58.AM.mov). No idea. I've noticed lots of other small inconsistencies too. With the capf-buster (default prefix-valid mode), it seems to "just work", probably because it's closer to (well-tested) company behavior. Not clear whether to pursue fixing these upstream or just work around like this. |
Oh crap, yes this one is very problematic. Corfu treats completion tables as interruptible (see line 602 at commit e11468e).
This is the only way we can implement a responsive interface in the presence of slow backends. Unfortunately this breaks backends where the internal state is manipulated in a non-interruptible way. For my cape capfs I make sure that they are written in an interruptible way. Now I should probably add a |
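A hypothetical sketch of such a non-interruptible wrapper (not Cape's actual code): binding `throw-on-input` to nil while the table runs keeps the frontend's `while-no-input` from aborting it mid-computation.

```elisp
;; Hypothetical sketch of a non-interruptible Capf wrapper; not Cape's
;; actual code.  Binding `throw-on-input' to nil prevents the frontend's
;; `while-no-input' from aborting the table while it computes.
(defun my-noninterruptible-capf (capf)
  (lambda ()
    (pcase (funcall capf)
      (`(,beg ,end ,table . ,plist)
       `(,beg ,end
         ,(lambda (str pred action)
            (let ((throw-on-input nil)) ; disable input-triggered aborts
              (complete-with-action action table str pred)))
         ,@plist)))))
```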
Also not clear to me. Let's just write down the requirements of Capfs for now:
|
Both lsp-mode and eglot drop the request if user input arrives, so they don't need […]. Needless to say, neither this behavior, nor being able to handle […] |
@dgutov Says who? Icomplete uses while-no-input and has for a long time, so I say it is part of the contract. But there are a few tables which don't handle it well, so to say, notably Tramp. I special-case Tramp in Vertico for this reason. But besides that I've not observed big issues. On the other hand maybe I just don't type fast enough. Anyways, in order to mitigate issues I am playing around with Capf combinators and Capf transformers, see https://github.com/minad/cape#other-capf-transformers. The relevant one for this particular issue is |
The doc. Which doesn't mention any of the limitations on completion tables' working which are needed for this to function reliably. Icomplete doesn't define the contract, it's just one of the clients. Which has historically worked only with particular completion tables. |
Yes, it should be mentioned in the docs. However the docs are basically non-existent, so I am deducing the intended contract from the bits I found here and there. Icomplete does not define the contract, but it is an old client and since it is continuously updated, I consider it a reference implementation for similar clients. So by putting together the information, this is the contract I came up with in #71 (comment). It is perfectly fine if there is disagreement here and there, since the contract has never been clearly defined. It is no wonder that there are all kinds of different tables in the wild. Certainly the interruptibility of completion tables is the most problematic part of the points I mentioned above. |
Btw, Dmitry, from a formal point of view there are only two possibilities regarding interruptibility of completion tables. Either a completion table is interruptible from the outside or it cannot be interrupted at all. It is formally excluded for a completion table to use |
But Corfu isn't really similar, is it? Icomplete hasn't had to deal with in-buffer completion.
Yeah, tell me about it.
Or introduce an asynchronous interface with e.g. interruptible futures. Letting the UI/frontend itself call their |
Icomplete also has an in-buffer mode. It is or was broken. You certainly have a point, that in-buffer completion is more difficult and problematic, since these capf completion tables often maintain state. This is different from minibuffer completing-read where the tables are mostly stateless and pure and as such are interruptible. Technically Corfu is similar to in-buffer Icomplete, except that it works mostly well. ;)
Was there a debate about this?
Yes. See also minad/cape#7. Not sure if you already saw this? |
Just catching up here. So @minad, you conjecture corfu's interrupting call to lsp-mode is what is causing the cache to be (apparently) correctly dropped when typing fast? So it behaves like company in terms of table dropping when it gets interrupted? But that still leaves a bug in lsp-mode's internal cache-dropping logic, yes? |
Yeah, sorry. Been meaning to respond, will do so now. |
@jdtsmith No, I think Corfu interrupting the lsp-mode completion table may corrupt the lsp-mode completion table. But maybe it is not the case, this was speculation. |
I just wrapped lsp-completion-at-point in cape-noninterruptable-capf, and... no change in behavior. That's not surprising, because the incorrect cache-retaining behavior was with slow-typing, where interruption is less likely. I think for lsp's, interruption serves a real purpose since the server can be told "forget about it". And I think in this case it may be providing that service "itself" based on @dgutov's report.... i.e. I am triggering lsp-mode's fast-input-cache-dump behavior. Best behavior comes with a cape-capf-buster wrap, which comes closer to company's skepticism regarding the internal caching habits of backends... |
Oh, then I misunderstood you above. When I read what you wrote, I somehow swapped fast and slow in my mind. |
Amusingly in this case corfu's interruption helped matters. |
The lsp-mode configuration is documented well in the wiki now. |
See comment in #41 (comment).
Try multiple possible configurations for each server:
Needs investigation. PR welcome!