Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(lsp): request only changed portions of the buffer in changetracker #16277

Closed
wants to merge 1 commit into from

Conversation

dmitmel
Copy link
Contributor

@dmitmel dmitmel commented Nov 10, 2021

Closes #11867

This PR is an optimization which fixes the issue I have mentioned in the #neovim-dev room. In short: some plugins, most notably vim-commentary and tcomment.vim, generate a lot of on_lines events when applied to a long range. In my particular case, if I comment out a range with either of the plugins I have mentioned, they both generate one on_lines per each commented line. The issue here is that changetracking.prepare requests the entire buffer contents with nvim_buf_get_lines(bufnr, 0, -1, true), which, while in itself is pretty fast, adds up quickly on hundreds of on_lines events.

I fixed the problem by only requesting the changed parts of the buffer since, obviously, we do get the change ranges. The unchanged lines are taken from the table with previous lines. A function such as Array#splice from JavaScript would've helped here for modifying the lines table in-place, but alas, the best we get in Lua is table.insert/table.remove, which can result in easily result in hidden execution time complexity (case in point: hrsh7th/cmp-buffer#18), and a proper implementation of this splice function can get pretty long (see the cmp-buffer PR, or an ECMAScript-compliant implementation here), so I decided against in-place modification.

Not all is lost, however, as I have found that double-buffering the lines tables also helps. This improves performance by reducing the load on the garbage collector because very long tables are not allocated on every on_lines. The way it is done is described in a comment in the code, but basically, when the incremental_changes function is done with curr_lines and prev_lines, curr_lines is saved as it normally would be, but prev_lines is cleared and also saved because clearing the contents of the table doesn't actually de-allocate the internal storage of the table, the keys with nil values are deleted only on the next garbage collection cycle. As such, a table with a fully-allocated internal storage is kept, which will be re-used on the next on_lines event as curr_lines.

Does the logic in the loops which construct curr_lines need commenting? Perhaps this code should go into a separate function, or a method on changetracking?

@github-actions github-actions bot added lsp lua stdlib and removed lua stdlib labels Nov 10, 2021
runtime/lua/vim/lsp.lua Outdated Show resolved Hide resolved
@mjlbach
Copy link
Contributor

mjlbach commented Nov 10, 2021

Perhaps this code should go into a separate function, or a method on changetracking?

We can elevate this into utils

Copy link
Contributor

@mjlbach mjlbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending our discussion on discussion on matrix/the couple comments I left.

Thanks for the PR!

@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 13, 2021

  • include the benchmark for pre/post 'nil-ing' the prev_lines table

  • include the benchmark for incrementally update curr_lines

Done

@mjlbach
Copy link
Contributor

mjlbach commented Nov 13, 2021

This code measures the time taken to comment-out contents of the whole
buffer and then undo the change. tpope's vim-commentary is a pretty good
candidate for this test because it is very popular and suffers from the
problem of generating a text change for every commented-out line. All
text changes will be recorded within a single undo point, and so both
the comment-out and undo commands will emit a ton of on_lines events.
Granted, most likely the need for commenting out thousands of lines
doesn't occur every day, but I believe this is a good substitute for an
action generating a lot of text editing events.

I will be performing tests on files in the neovim repository at commit
2ecf0a4 and neovim itself built with
CMAKE_BUILD_TYPE=Release. As the execution time improvements seem to be
exponential, I am also going to include results for files with line
count of different orders of magnitude. The baseline is the
aforementioned commit 2ecf0a4, the
value after comment= is the AVERAGE number of seconds it took to
comment-out the entire file, and after undo= tells how long, on AVERAGE,
undoing that took.

Is a bit colloquial (first person) and long, can you make it a bit more dry/short with bullet points? Anyways otherwise LGTM, thanks for the changes. Tagging mathias for review as a formality

@mjlbach
Copy link
Contributor

mjlbach commented Nov 13, 2021

Also you didn't end up including

AFAIK

@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 13, 2021

Also you didn't end up including

I moved it to #16286, remember? Should I move that fix back here?

Anyway, I've decided that it fits here better

@dmitmel dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from eee1825 to 5fe2de0 Compare November 13, 2021 16:40
@mjlbach
Copy link
Contributor

mjlbach commented Nov 13, 2021

Anyway, I've decided that it fits here better

No, I meant that your commit message is not accurate :)

@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 13, 2021

No, I meant that your commit message is not accurate :)

Crap. Well, I have already moved the fix back. Do you mind it?

@mjlbach
Copy link
Contributor

mjlbach commented Nov 13, 2021

Can you just unmove it back? It doesn't make sense to group the commits this way

@dmitmel dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from 5fe2de0 to 7fbba2e Compare November 13, 2021 16:50
@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 13, 2021

check to what extent this mitigates #11867 and #14205

Gonna try to check this

@mjlbach
Copy link
Contributor

mjlbach commented Nov 13, 2021

I didn't notice a difference with my repro

@mjlbach
Copy link
Contributor

mjlbach commented Nov 13, 2021

The following code measures the time required to perform a buffer edit to
that operates individually on each line, common to plugins such as vim-commentary.

<insert code here>

All text changes will be recorded within a single undo operations. Both the 
comment operation itself and the undo operation will generate an on_line event
for each changed line.

Using the neovim codebase as an example (commit 2ecf04) with  neovim itself built 
at 2ecf04 with CMAKE_BUILD_TYPE=Release shows the following
performance improvement for double buffering <insert explanation here>

This performance improvements is due to avoiding expensive lua string reallocations on each operation, and instead
<insert explanation here>

@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 13, 2021

Can confirm that this fixes #11867 (~26s -> ~0.4s). setline seems to cause a lot of on_lines events, even the following code to trigger it is enough:

call setline(1, getline(1, '$'))

Judging by the nature of that issue (an on_lines being fired for every line in the buffer), I believe that in the end my tests are realistic.

@mjlbach
Copy link
Contributor

mjlbach commented Nov 13, 2021

Cool, when you update the commit message and mathias rubber stamps (or requests changes) we will merge.

@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 13, 2021

Please wait, checking the other issue.

@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 13, 2021

Fixes #14205 too due to rust.vim's usage of setline().

@dmitmel dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from 7fbba2e to 50b9e81 Compare November 14, 2021 11:55
@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 14, 2021

Edited the commit message

@mjlbach
Copy link
Contributor

mjlbach commented Nov 15, 2021

I'm going to give the fixed incremental sync implementation a few days ~= week for people to report bugs @mfussenegger

Otherwise this PR is failing lint.

@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 15, 2021

Otherwise this PR is failing lint.

The failure is in unrelated C code, does this happen to other PRs too?

@mjlbach
Copy link
Contributor

mjlbach commented Nov 21, 2021

Mathias will merge this PR when he has reviewed. It will likely be in 0.6

@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 21, 2021

Nice. Thank you.

@dmitmel dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from 50b9e81 to aa407da Compare November 22, 2021 09:59
@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 22, 2021

I should also note that I have found that this PR may break on_reload logic. I'll investigate this.

@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 24, 2021

False alarm, this was a bug in a specific Language Server, I've fixed it: LuaLS/lua-language-server#821

@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 24, 2021

I have discovered a different bug related to on_reload. In short, the on_reload handler must flush forget all debounced didChange messages before sending didClose and didOpen. I'll fix this in a separate PR. Also, I'm going to rebase this one because it has conflicts with the master branch.

@dmitmel dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from aa407da to a070aad Compare November 24, 2021 17:10
The following code measures the time required to perform a buffer edit to
that operates individually on each line, common to plugins such as vim-commentary.

    set rtp+=~/.config/nvim/plugged/nvim-lspconfig
    set rtp+=~/.config/nvim/plugged/vim-commentary
    lua require('lspconfig')['ccls'].setup({})
    function! Benchmark(tries) abort
        let results_comment = []
        let results_undo = []
        for i in range(a:tries)
            echo printf('run %d', i+1)
            let begin = reltime()
            normal gggcG
            call add(results_comment, reltimefloat(reltime(begin)))
            let begin = reltime()
            silent! undo
            call add(results_undo, reltimefloat(reltime(begin)))
            redraw
        endfor
        let avg_comment = 0.0
        let avg_undo = 0.0
        for i in range(a:tries)
            echomsg printf('run %3d: comment=%fs undo=%fs', i+1, results_comment[i], results_undo[i])
            let avg_comment += results_comment[i]
            let avg_undo += results_undo[i]
        endfor
        echomsg printf('average: comment=%fs undo=%fs', avg_comment / a:tries, avg_undo / a:tries)
    endfunction
    command! -bar Benchmark call Benchmark(10)

All text changes will be recorded within a single undo operation. Both the
comment operation itself and the undo operation will generate an on_lines event
for each changed line. Formatter plugins using setline() have also been found
to exhibit the same problem (neoformat, :RustFmt in rust.vim), as this function
too generates an on_lines event for every line it changes.

Using the neovim codebase as an example (commit 2ecf0a4)
with neovim itself built at 2ecf0a4 with
CMAKE_BUILD_TYPE=Release shows the following performance improvement:

src/nvim/lua/executor.c, 1432 lines:
  baseline, no optimizations:             comment=0.540587s undo=0.440249s
  without double-buffering optimization:  comment=0.183314s undo=0.060663s
  all optimizations in this commit:       comment=0.174850s undo=0.052789s

src/nvim/search.c, 5467 lines:
  baseline, no optimizations:             comment=7.420446s undo=7.656624s
  without double-buffering optimization:  comment=0.889048s undo=0.486026s
  all optimizations in this commit:       comment=0.662899s undo=0.243628s

src/nvim/eval.c, 11355 lines:
  baseline, no optimizations:             comment=41.775695s undo=44.583374s
  without double-buffering optimization:  comment=3.643933s undo=2.817158s
  all optimizations in this commit:       comment=1.510886s undo=0.707928s

This performance improvement is due to avoiding expensive lua string reallocations
on each operation, and instead requesting only the changed chunk of the buffer,
as reported by firstline and new_lastline parameters of on_lines, plus re-using
already allocated tables for storing the resulting lines to reduce the load on
the garbage collector.
@dmitmel dmitmel force-pushed the lsp-sync-faster-buf-get-lines branch from a070aad to d324dec Compare November 26, 2021 19:45
@dmitmel
Copy link
Contributor Author

dmitmel commented Nov 27, 2021

Is there anything left to be done with regards to this PR?

@mjlbach mjlbach added this to the 0.6 milestone Nov 27, 2021
@mjlbach
Copy link
Contributor

mjlbach commented Nov 27, 2021

No, I passed review/merging on a section of my core responsibilities over to @mfussenegger

@mjlbach
Copy link
Contributor

mjlbach commented Jan 16, 2022

@dmitmel When I tried benchmarking this I didn't get the speedup I expected, using your reproduction from #11867 (comment)

I outlined the situation (which you are probably already familiar with here): #11867 (comment).

I think an alternative to this PR would involve bypassing get_lines completely and caching firstline, lastline, newlastline, etc. on each on_lines call, and only calling buf_get_lines at the beginning and end of the debounce window (without any of the intermediate caching as implemented in this PR).

The alternative, which I am also ok with, is to tell users to switch to plugins which apply better atomic buffer changes (which IMO is what they should be doing anyways).

What do you think?

@dmitmel
Copy link
Contributor Author

dmitmel commented Jan 16, 2022

I think an alternative to this PR would involve bypassing get_lines completely and caching firstline, lastline, newlastline, etc. on each on_lines call, and only calling buf_get_lines at the beginning and end of the debounce window (without any of the intermediate caching as implemented in this PR).

Honestly, at that point you might as well stop using firstline/lastline and find the modified range using the procedure I have implemented in #16437. As we have previously established in the Matrix chat, comparing even tens of thousands of strings is very very cheap, and that would considerably simplify the code by bypassing the logic necessary for combining all of the intermediate firstline/lastline ranges into a single one.

@mjlbach
Copy link
Contributor

mjlbach commented Jan 16, 2022

considerably simplify the code by bypassing the logic necessary for combining all of the intermediate firstline/lastline ranges into a single one

I'm pretty sure that's like 3-4 lines of code...

Anyways, if you have time can you verify this doesn't actually address the performance issues or point out how I can verify that it does? I used basically config from #11867 (comment)

@dmitmel
Copy link
Contributor Author

dmitmel commented Jan 16, 2022

No idea. For me it does work, and the improvement is significant (22s -> 0.5s). All I did was compile Nvim on this branch and run the command I specified. Can you record a screencast or something like that?

@mjlbach
Copy link
Contributor

mjlbach commented Jan 16, 2022

@dmitmel if you want the actual authorship of the commit rather than coauthor you can just apply the changes from https://github.com/neovim/neovim/pull/17118/files directly to this PR which updates it to address the rebase conflicts.

@mjlbach
Copy link
Contributor

mjlbach commented Jan 17, 2022

I went ahead and merged this in #17118.

Thanks for your contribution and for addressing all of my questions :)

@mjlbach mjlbach closed this Jan 17, 2022
@dmitmel dmitmel deleted the lsp-sync-faster-buf-get-lines branch January 19, 2022 01:35
@zeertzjq zeertzjq modified the milestones: 0.8, 0.7 Mar 12, 2022
@dundargoc dundargoc removed the request for review from mfussenegger October 29, 2022 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

lsp: synchronization bottleneck due to buf_get_lines (caused by line-by-line changes)
5 participants