x/tools/gopls: completions taking 20x as long since https://github.com/golang/tools/commit/41e4e5654988 #62665
What version of Go are you using (`go version`)?
| |Cold start, no daemon|Same completion, repeated after cold start|Fresh session with daemon running from previous session|
|---|---|---|---|
|golang/tools@e2393aba883a (current tip)|2172ms|241ms|2213ms|
Log files for each result from the first column of timings:
I have bisected this slowdown to golang/tools@41e4e5654988.
This appears to highlight a couple of things to me:
- Type checking has slowed down at least in this use case (assuming that is the main factor influencing completion speed).
- Something possibly isn't working as expected with a pre-existing daemon instance, which is more easily seen given the previous point.
Thanks for the report. Possibly related: golang/vscode-go#2889.
The CL you bisected to wouldn't have slowed down type checking. However, it would have resulted in many more completion candidates being considered. Before that CL, it was possible for gopls to miss some struct fields of the type currently being selected (!). After that CL we should have no such omissions, but may also pull in additional candidates. The 20x slowdown you observe is likely due to the parsing cost of building this candidate list; that cost is mitigated the second time around by the parse cache.
This was on my radar due to golang/vscode-go#2889, and is among our highest priorities to address.
Based on my current understanding, you should only experience this in certain packages, and should only experience the 20x slowdown at startup, after which subsequent completions should be faster (though perhaps not as fast as before). Can you confirm if that is the case?
Sorry, I mean subsequent completions for other things, not just that one specific source location; during editing do you generally experience a 20x slowdown in completion, or 2.5x slowdown? The table suggests 2.5x, but I wanted to be sure.
Not a bug, but an oversight: the parse cache is per-session, not per-instance. The parse cache is meant to be a minor optimization, not the massive optimization experienced here. This is actually strong evidence that the parse cache is involved.
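To illustrate the distinction (a simplified sketch, not gopls's actual implementation; `Session` and its `parseCache` field are hypothetical names here): because the cache hangs off the session rather than the long-lived daemon process, a fresh editor session starts with a cold cache even when the daemon is warm.

```go
package main

import "fmt"

// parsedFile stands in for a parsed syntax tree.
type parsedFile struct{ name string }

// Session models an editor session. The parse cache lives here,
// so it dies with the session even if the daemon process survives.
type Session struct {
	parseCache map[string]*parsedFile
	parses     int // number of real (uncached) parses
}

func NewSession() *Session {
	return &Session{parseCache: make(map[string]*parsedFile)}
}

// parse returns a cached result if present, otherwise "parses" the file.
func (s *Session) parse(name string) *parsedFile {
	if f, ok := s.parseCache[name]; ok {
		return f
	}
	s.parses++
	f := &parsedFile{name: name}
	s.parseCache[name] = f
	return f
}

func main() {
	s1 := NewSession()
	s1.parse("a.go")
	s1.parse("a.go") // cache hit within the same session
	fmt.Println("session 1 parses:", s1.parses)

	// A fresh session against the same daemon starts with an empty cache.
	s2 := NewSession()
	s2.parse("a.go")
	fmt.Println("session 2 parses:", s2.parses)
}
```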
Interestingly, the bug does not immediately reproduce for me in neovim or vs code (I see completions around 100ms), though it does reproduce for Alan. We're investigating why.
Truth be told I had to revert to the fast version because I was getting too frustrated :)
So unfortunately I don't have any evidence or examples beyond the analysis presented here.
So, we have new evidence, but we lost our reproducer and could use your help.
On Alan's machine, we started gopls with
This is the unimportedCompletion logic scraping the module cache. Unfortunately, we deleted our backup of the huge mod cache, and can no longer reproduce 🤦.
Could you do the following, using gopls@tip (or at the culprit)?
Sorry to enlist you, but absent a reproducer this is hard to track down. The steps above will help confirm our theory.
Great, happy to help.
All of my response is based on analysis using golang/tools@866a6b0ff32a.
This is what I get. Although for some reason I don't have the
Setting to "1ms" did not influence the completion time:
This feels like a separate bug?
Bingo (note I reverted the
Which triggers an interesting thought for me.
I would absolutely not have expected the edit and completion that I'm performing in this situation to have triggered unimported completions. Because they are so expensive, I would only have expected them to trigger in the following scenario:
with the cursor at the position
Otherwise, a trivial completion attempt in a "hello world" example:
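(Something along these lines; this is a reconstruction rather than the exact file I used, with completion attempted inside `main` after typing `fmt.`:)

```go
package main

import "fmt"

func main() {
	// Completion attempted here, e.g. after typing "fmt.P".
	fmt.Println("hello, world")
}
```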
takes ~2000ms from cold and ~200ms warm (same session, ignoring the daemon point for now).
Re the questions about the VM setup: last time I checked, my disk performance through VMware Fusion was faster than native macOS. Incidentally, if unimported completions are dominating as we think they are, you might find that macOS (and Windows?) users have been disproportionately reporting speed problems, especially users with large module caches. It might also be worth asking people to report the size of their module cache for reference, unless you could lazily compute and report it in log files.
Here is mine:
(because as I understand it the
In summary from my perspective:
Thanks for your response. The bug is understood now. Because of the change to how completion budget is implemented, a check for context cancellation that would have prevented unimported completion from running was eliminated.
The unfortunate/problematic thing is that this existed for multiple months without being caught. We do have several benchmarks for completion, but they were insensitive to this change because they don't capture the initial cost and because they operate in a fresh module cache. We've seen a few reports of slow completion in our survey results and issues, and I had it on my list to revisit before the next release, but it would have been hard to home in on this without your clear reproducer. Thank you again.
I'll note that this is a perfect example of where telemetry would have highlighted the regression -- we would have seen the regression in the release, and may even have root-caused it quickly by observing that it affected darwin more than linux.
A few comments:
You can expect that this will be fixed, and soon. In addition to simply fixing the cancellation logic, there appears to be a lot of low-hanging fruit that can make completion significantly faster.
The complicated part is not the fix itself, but rather the process that led to it.
Add instrumentation for the telemetry described in golang/go#63129, along with a test. Also isolate the existing telemetry regtest.

Updates golang/go#63129
Updates golang/go#62665
Reviewed-on: https://go-review.googlesource.com/c/tools/+/530215
Reviewed-by: Hyang-Ah Hana Kim
Our completion benchmarks were too narrowly focused on type-checking speed, and did not provide adequate coverage of unimported completion or the completion budget. Run the CompletionFollowingEdit benchmarks with unimported completion enabled, and with a budget of 100ms. The former still seems to have little impact on performance, likely due to my environment, whereas the latter demonstrates a stark regression: with a pre-regression gopls, completion timing is bang-on 100ms, whereas with gopls@tip timing is largely unaffected by the budget. Also add a completion test in identifier context, as this exercises different code paths and further highlights the regression in the budget implementation.

For golang/go#62665
Reviewed-on: https://go-review.googlesource.com/c/tools/+/530015
Reviewed-by: Alan Donovan
Partially revert CL 503016, to go back to starting the completion timer before type checking. As discussed in Slack and golang/go#62665, even if this leads to inconsistent behavior across various LSP clients (due to the order of requests), we should still do our best to interpret the completion budget from the user's perspective.

For golang/go#62665
Reviewed-on: https://go-review.googlesource.com/c/tools/+/530016
Reviewed-by: Alan Donovan
Address a few considerations that were overlooked in CL 503016:
- don't run goimports callbacks if the budget has expired
- don't use a min depth for the second call to deepSearch

For golang/go#62665
Reviewed-on: https://go-review.googlesource.com/c/tools/+/530017
Reviewed-by: Alan Donovan
@myitcv I believe gopls@master should now have similar latency to what you observed in April. Please let me know if it doesn't.
However, I'm not closing this issue, because there appears to be a lot of low-hanging fruit for further significant optimization. For example, I analyzed one of our completion benchmarks on Kubernetes and observed that type checking took only 65ms while completion took 170ms. I think we can get completion latency much closer to type-checking latency.
Coincidentally, the potential optimizations I see are almost identical to what we did for workspace/symbol requests:
Thanks, just took a look.
Still seeing similar numbers to before (golang/tools@e2393aba883a), actually: ~20x slowdown for the initial completion, and ~2.5x slowdown for "warm" completions:
Do I need specific settings to make this "work"?
(I tried setting
One of the big pain points here is the ~2s at startup, with or without a daemon already running. Vim is just one of those tools that you open and close all the time, so the daemon-based setup is critical.
That said, the ~230ms completion is very noticeable vs ~100ms too.
Did my suggestions in #62665 (comment) also make sense with regard to optimisations? Whilst a different kind of optimisation, it feels like we want to avoid unimported completions as much as possible, and when "forced" to include them, ensure that we have warm caches in the daemon (as you previously observed).
Previously we avoided using context cancellation to implement the completion budget, because it can result in accidentally omitted results (e.g. because construction of completion items was canceled). But once we have a minimum required set of results, it is OK to use context cancellation to enforce the completion budget more strictly.

For golang/go#62665
Reviewed-on: https://go-review.googlesource.com/c/tools/+/530599
Reviewed-by: Peter Weinberger
@myitcv can you try again at tip? Sorry to make you keep trying, but I can't reproduce and want to make sure we've successfully rolled back the regression before including additional optimizations.
You mean the suggestion not to run unimported completions in various contexts where it doesn't make sense? That's definitely a good idea, but I've concluded that it's not at the heart of the problem you observe (see below).
Upon staring at the code and experimenting, I've drawn the following conclusions:
Therefore, I think we should close this bug once we've verified a fix for the regression, and open new bugs for various completion logic improvements, which would constitute a larger project.
Good news! Tested against golang/tools@4b34fbf5f10d and we appear to be back in business!
Thanks very much for digging into this, @findleyr.
I'll leave you to close the issue or keep open as you see fit.