fix(lsp): request only changed portions of the buffer in changetracker #16277

dmitmel · 2021-11-10T13:21:11Z

This PR is an optimization which fixes the issue I have mentioned in the #neovim-dev room. In short: some plugins, most notably vim-commentary and tcomment.vim, generate a lot of on_lines events when applied to a long range. In my particular case, if I comment out a range with either of the plugins I have mentioned, they both generate one on_lines per each commented line. The issue here is that changetracking.prepare requests the entire buffer contents with nvim_buf_get_lines(bufnr, 0, -1, true), which, while in itself is pretty fast, adds up quickly on hundreds of on_lines events.

I fixed the problem by only requesting the changed parts of the buffer since, obviously, we do get the change ranges. The unchanged lines are taken from the table with previous lines. A function such as Array#splice from JavaScript would've helped here for modifying the lines table in-place, but alas, the best we get in Lua is table.insert/table.remove, which can result in easily result in hidden execution time complexity (case in point: hrsh7th/cmp-buffer#18), and a proper implementation of this splice function can get pretty long (see the cmp-buffer PR, or an ECMAScript-compliant implementation here), so I decided against in-place modification.

Not all is lost, however, as I have found that double-buffering the lines tables also helps. This improves performance by reducing the load on the garbage collector because very long tables are not allocated on every on_lines. The way it is done is described in a comment in the code, but basically, when the incremental_changes function is done with curr_lines and prev_lines, curr_lines is saved as it normally would be, but prev_lines is cleared and also saved because clearing the contents of the table doesn't actually de-allocate the internal storage of the table, the keys with nil values are deleted only on the next garbage collection cycle. As such, a table with a fully-allocated internal storage is kept, which will be re-used on the next on_lines event as curr_lines.

Does the logic in the loops which construct curr_lines need commenting? Perhaps this code should go into a separate function, or a method on changetracking?

runtime/lua/vim/lsp.lua

mjlbach · 2021-11-10T16:13:15Z

Perhaps this code should go into a separate function, or a method on changetracking?

We can elevate this into utils

mjlbach

LGTM pending our discussion on discussion on matrix/the couple comments I left.

include the benchmark for pre/post 'nil-ing' the prev_lines table
include the benchmark for incrementally update curr_lines
check to what extent this mitigates lsp: synchronization bottleneck due to buf_get_lines (caused by line-by-line changes) #11867 and Saving large files is extremely slow when using nvim-lsp and rustfmt #14205. In my short tests it doesn't (which is ok/not a blocker, just want to make sure we are documenting things correctly)

Thanks for the PR!

dmitmel · 2021-11-13T16:23:39Z

include the benchmark for pre/post 'nil-ing' the prev_lines table

include the benchmark for incrementally update curr_lines

Done

mjlbach · 2021-11-13T16:26:56Z

This code measures the time taken to comment-out contents of the whole
buffer and then undo the change. tpope's vim-commentary is a pretty good
candidate for this test because it is very popular and suffers from the
problem of generating a text change for every commented-out line. All
text changes will be recorded within a single undo point, and so both
the comment-out and undo commands will emit a ton of on_lines events.
Granted, most likely the need for commenting out thousands of lines
doesn't occur every day, but I believe this is a good substitute for an
action generating a lot of text editing events.

I will be performing tests on files in the neovim repository at commit
2ecf0a4 and neovim itself built with
CMAKE_BUILD_TYPE=Release. As the execution time improvements seem to be
exponential, I am also going to include results for files with line
count of different orders of magnitude. The baseline is the
aforementioned commit 2ecf0a4, the
value after comment= is the AVERAGE number of seconds it took to
comment-out the entire file, and after undo= tells how long, on AVERAGE,
undoing that took.

Is a bit colloquial (first person) and long, can you make it a bit more dry/short with bullet points? Anyways otherwise LGTM, thanks for the changes. Tagging mathias for review as a formality

mjlbach · 2021-11-13T16:28:15Z

Also you didn't end up including

handle buffer line endings in buf_get_full_text, continuing the work in fix(lsp): rewrite incremental sync #16252

AFAIK

dmitmel · 2021-11-13T16:36:28Z

Also you didn't end up including

~~I moved it to #16286, remember? Should I move that fix back here?~~

Anyway, I've decided that it fits here better

mjlbach · 2021-11-13T16:41:13Z

Anyway, I've decided that it fits here better

No, I meant that your commit message is not accurate :)

dmitmel · 2021-11-13T16:47:44Z

No, I meant that your commit message is not accurate :)

Crap. Well, I have already moved the fix back. Do you mind it?

mjlbach · 2021-11-13T16:48:25Z

Can you just unmove it back? It doesn't make sense to group the commits this way

dmitmel · 2021-11-13T16:58:58Z

check to what extent this mitigates #11867 and #14205

Gonna try to check this

mjlbach · 2021-11-13T16:59:34Z

I didn't notice a difference with my repro

mjlbach · 2021-11-13T17:17:50Z

The following code measures the time required to perform a buffer edit to
that operates individually on each line, common to plugins such as vim-commentary.

<insert code here>

All text changes will be recorded within a single undo operations. Both the 
comment operation itself and the undo operation will generate an on_line event
for each changed line.

Using the neovim codebase as an example (commit 2ecf04) with  neovim itself built 
at 2ecf04 with CMAKE_BUILD_TYPE=Release shows the following
performance improvement for double buffering <insert explanation here>

This performance improvements is due to avoiding expensive lua string reallocations on each operation, and instead
<insert explanation here>

dmitmel · 2021-11-13T17:37:13Z

Can confirm that this fixes #11867 (~26s -> ~0.4s). setline seems to cause a lot of on_lines events, even the following code to trigger it is enough:

call setline(1, getline(1, '$'))

Judging by the nature of that issue (an on_lines being fired for every line in the buffer), I believe that in the end my tests are realistic.

mjlbach · 2021-11-13T17:39:06Z

Cool, when you update the commit message and mathias rubber stamps (or requests changes) we will merge.

dmitmel · 2021-11-13T17:40:44Z

Please wait, checking the other issue.

dmitmel · 2021-11-13T18:09:24Z

Fixes #14205 too due to rust.vim's usage of setline().

dmitmel · 2021-11-14T11:55:13Z

Edited the commit message

mjlbach · 2021-11-15T19:34:04Z

I'm going to give the fixed incremental sync implementation a few days ~= week for people to report bugs @mfussenegger

Otherwise this PR is failing lint.

dmitmel · 2021-11-15T19:42:28Z

Otherwise this PR is failing lint.

The failure is in unrelated C code, does this happen to other PRs too?

mjlbach · 2021-11-21T17:03:22Z

Mathias will merge this PR when he has reviewed. It will likely be in 0.6

dmitmel · 2021-11-21T17:13:09Z

Nice. Thank you.

dmitmel · 2021-11-22T10:05:02Z

I should also note that I have found that this PR may break on_reload logic. I'll investigate this.

dmitmel · 2021-11-24T16:29:09Z

False alarm, this was a bug in a specific Language Server, I've fixed it: LuaLS/lua-language-server#821

dmitmel · 2021-11-24T17:05:21Z

I have discovered a different bug related to on_reload. In short, the on_reload handler must ~~flush~~ forget all debounced didChange messages before sending didClose and didOpen. I'll fix this in a separate PR. Also, I'm going to rebase this one because it has conflicts with the master branch.

The following code measures the time required to perform a buffer edit to that operates individually on each line, common to plugins such as vim-commentary. set rtp+=~/.config/nvim/plugged/nvim-lspconfig set rtp+=~/.config/nvim/plugged/vim-commentary lua require('lspconfig')['ccls'].setup({}) function! Benchmark(tries) abort let results_comment = [] let results_undo = [] for i in range(a:tries) echo printf('run %d', i+1) let begin = reltime() normal gggcG call add(results_comment, reltimefloat(reltime(begin))) let begin = reltime() silent! undo call add(results_undo, reltimefloat(reltime(begin))) redraw endfor let avg_comment = 0.0 let avg_undo = 0.0 for i in range(a:tries) echomsg printf('run %3d: comment=%fs undo=%fs', i+1, results_comment[i], results_undo[i]) let avg_comment += results_comment[i] let avg_undo += results_undo[i] endfor echomsg printf('average: comment=%fs undo=%fs', avg_comment / a:tries, avg_undo / a:tries) endfunction command! -bar Benchmark call Benchmark(10) All text changes will be recorded within a single undo operation. Both the comment operation itself and the undo operation will generate an on_lines event for each changed line. Formatter plugins using setline() have also been found to exhibit the same problem (neoformat, :RustFmt in rust.vim), as this function too generates an on_lines event for every line it changes. Using the neovim codebase as an example (commit 2ecf0a4) with neovim itself built at 2ecf0a4 with CMAKE_BUILD_TYPE=Release shows the following performance improvement: src/nvim/lua/executor.c, 1432 lines: baseline, no optimizations: comment=0.540587s undo=0.440249s without double-buffering optimization: comment=0.183314s undo=0.060663s all optimizations in this commit: comment=0.174850s undo=0.052789s src/nvim/search.c, 5467 lines: baseline, no optimizations: comment=7.420446s undo=7.656624s without double-buffering optimization: comment=0.889048s undo=0.486026s all optimizations in this commit: comment=0.662899s undo=0.243628s src/nvim/eval.c, 11355 lines: baseline, no optimizations: comment=41.775695s undo=44.583374s without double-buffering optimization: comment=3.643933s undo=2.817158s all optimizations in this commit: comment=1.510886s undo=0.707928s This performance improvement is due to avoiding expensive lua string reallocations on each operation, and instead requesting only the changed chunk of the buffer, as reported by firstline and new_lastline parameters of on_lines, plus re-using already allocated tables for storing the resulting lines to reduce the load on the garbage collector.

dmitmel · 2021-11-27T15:49:40Z

Is there anything left to be done with regards to this PR?

mjlbach · 2021-11-27T15:50:49Z

No, I passed review/merging on a section of my core responsibilities over to @mfussenegger

mjlbach · 2022-01-16T21:54:56Z

@dmitmel When I tried benchmarking this I didn't get the speedup I expected, using your reproduction from #11867 (comment)

I outlined the situation (which you are probably already familiar with here): #11867 (comment).

I think an alternative to this PR would involve bypassing get_lines completely and caching firstline, lastline, newlastline, etc. on each on_lines call, and only calling buf_get_lines at the beginning and end of the debounce window (without any of the intermediate caching as implemented in this PR).

The alternative, which I am also ok with, is to tell users to switch to plugins which apply better atomic buffer changes (which IMO is what they should be doing anyways).

What do you think?

dmitmel · 2022-01-16T22:03:24Z

I think an alternative to this PR would involve bypassing get_lines completely and caching firstline, lastline, newlastline, etc. on each on_lines call, and only calling buf_get_lines at the beginning and end of the debounce window (without any of the intermediate caching as implemented in this PR).

Honestly, at that point you might as well stop using firstline/lastline and find the modified range using the procedure I have implemented in #16437. As we have previously established in the Matrix chat, comparing even tens of thousands of strings is very very cheap, and that would considerably simplify the code by bypassing the logic necessary for combining all of the intermediate firstline/lastline ranges into a single one.

mjlbach · 2022-01-16T22:08:20Z

considerably simplify the code by bypassing the logic necessary for combining all of the intermediate firstline/lastline ranges into a single one

I'm pretty sure that's like 3-4 lines of code...

Anyways, if you have time can you verify this doesn't actually address the performance issues or point out how I can verify that it does? I used basically config from #11867 (comment)

dmitmel · 2022-01-16T22:34:06Z

No idea. For me it does work, and the improvement is significant (22s -> 0.5s). All I did was compile Nvim on this branch and run the command I specified. Can you record a screencast or something like that?

mjlbach · 2022-01-16T22:38:57Z

@dmitmel if you want the actual authorship of the commit rather than coauthor you can just apply the changes from https://github.com/neovim/neovim/pull/17118/files directly to this PR which updates it to address the rebase conflicts.

mjlbach · 2022-01-17T16:20:06Z

I went ahead and merged this in #17118.

Thanks for your contribution and for addressing all of my questions :)

github-actions bot added lsp lua stdlib and removed lua stdlib labels Nov 10, 2021

mjlbach reviewed Nov 10, 2021

View reviewed changes

runtime/lua/vim/lsp.lua Outdated Show resolved Hide resolved

mjlbach suggested changes Nov 10, 2021

View reviewed changes

mjlbach mentioned this pull request Nov 10, 2021

Buffer content in LSP notifications / requests is always split with \n, regardless of file format #16231

Closed

dmitmel mentioned this pull request Nov 11, 2021

feat(lsp): add lsp.util.split_lines and lsp.util.split_lines_iter #16286

Closed

dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch 3 times, most recently from 12b8f1f to eee1825 Compare November 13, 2021 16:22

mjlbach approved these changes Nov 13, 2021

View reviewed changes

mjlbach requested a review from mfussenegger November 13, 2021 16:27

dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from eee1825 to 5fe2de0 Compare November 13, 2021 16:40

dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from 5fe2de0 to 7fbba2e Compare November 13, 2021 16:50

dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from 7fbba2e to 50b9e81 Compare November 14, 2021 11:55

dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from 50b9e81 to aa407da Compare November 22, 2021 09:59

dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from aa407da to a070aad Compare November 24, 2021 17:10

dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from a070aad to d324dec Compare November 26, 2021 19:45

mjlbach added this to the 0.6 milestone Nov 27, 2021

clason modified the milestones: 0.6, 0.7 Nov 29, 2021

mjlbach mentioned this pull request Jan 16, 2022

lsp: synchronization bottleneck due to buf_get_lines (caused by line-by-line changes) #11867

Closed

mjlbach mentioned this pull request Jan 16, 2022

perf(lsp): request only changed portions of the buffer in changetracker #17118

Merged

mjlbach closed this Jan 17, 2022

dmitmel deleted the lsp-sync-faster-buf-get-lines branch January 19, 2022 01:35

zeertzjq modified the milestones: 0.8, 0.7 Mar 12, 2022

dundargoc removed the request for review from mfussenegger October 29, 2022 14:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(lsp): request only changed portions of the buffer in changetracker #16277

fix(lsp): request only changed portions of the buffer in changetracker #16277

dmitmel commented Nov 10, 2021 •

edited by mjlbach

mjlbach commented Nov 10, 2021

mjlbach left a comment •

edited

dmitmel commented Nov 13, 2021

mjlbach commented Nov 13, 2021 •

edited

mjlbach commented Nov 13, 2021

dmitmel commented Nov 13, 2021 •

edited

mjlbach commented Nov 13, 2021

dmitmel commented Nov 13, 2021

mjlbach commented Nov 13, 2021

dmitmel commented Nov 13, 2021

mjlbach commented Nov 13, 2021

mjlbach commented Nov 13, 2021

dmitmel commented Nov 13, 2021 •

edited

mjlbach commented Nov 13, 2021

dmitmel commented Nov 13, 2021

dmitmel commented Nov 13, 2021

dmitmel commented Nov 14, 2021

mjlbach commented Nov 15, 2021 •

edited

dmitmel commented Nov 15, 2021

mjlbach commented Nov 21, 2021

dmitmel commented Nov 21, 2021

dmitmel commented Nov 22, 2021

dmitmel commented Nov 24, 2021

dmitmel commented Nov 24, 2021 •

edited

dmitmel commented Nov 27, 2021

mjlbach commented Nov 27, 2021

mjlbach commented Jan 16, 2022 •

edited

dmitmel commented Jan 16, 2022

mjlbach commented Jan 16, 2022 •

edited

dmitmel commented Jan 16, 2022 •

edited

mjlbach commented Jan 16, 2022 •

edited

mjlbach commented Jan 17, 2022

fix(lsp): request only changed portions of the buffer in changetracker #16277

fix(lsp): request only changed portions of the buffer in changetracker #16277

Conversation

dmitmel commented Nov 10, 2021 • edited by mjlbach

mjlbach commented Nov 10, 2021

mjlbach left a comment • edited

Choose a reason for hiding this comment

dmitmel commented Nov 13, 2021

mjlbach commented Nov 13, 2021 • edited

mjlbach commented Nov 13, 2021

dmitmel commented Nov 13, 2021 • edited

mjlbach commented Nov 13, 2021

dmitmel commented Nov 13, 2021

mjlbach commented Nov 13, 2021

dmitmel commented Nov 13, 2021

mjlbach commented Nov 13, 2021

mjlbach commented Nov 13, 2021

dmitmel commented Nov 13, 2021 • edited

mjlbach commented Nov 13, 2021

dmitmel commented Nov 13, 2021

dmitmel commented Nov 13, 2021

dmitmel commented Nov 14, 2021

mjlbach commented Nov 15, 2021 • edited

dmitmel commented Nov 15, 2021

mjlbach commented Nov 21, 2021

dmitmel commented Nov 21, 2021

dmitmel commented Nov 22, 2021

dmitmel commented Nov 24, 2021

dmitmel commented Nov 24, 2021 • edited

dmitmel commented Nov 27, 2021

mjlbach commented Nov 27, 2021

mjlbach commented Jan 16, 2022 • edited

dmitmel commented Jan 16, 2022

mjlbach commented Jan 16, 2022 • edited

dmitmel commented Jan 16, 2022 • edited

mjlbach commented Jan 16, 2022 • edited

mjlbach commented Jan 17, 2022

dmitmel commented Nov 10, 2021 •

edited by mjlbach

mjlbach left a comment •

edited

mjlbach commented Nov 13, 2021 •

edited

dmitmel commented Nov 13, 2021 •

edited

dmitmel commented Nov 13, 2021 •

edited

mjlbach commented Nov 15, 2021 •

edited

dmitmel commented Nov 24, 2021 •

edited

mjlbach commented Jan 16, 2022 •

edited

mjlbach commented Jan 16, 2022 •

edited

dmitmel commented Jan 16, 2022 •

edited

mjlbach commented Jan 16, 2022 •

edited