neovim hang at 100% CPU when interrupted with <C-c> #20726

Open
ibhagwan opened this issue Oct 18, 2022 · 17 comments
Labels
bug issues reporting wrong behavior input job-control OS processes, spawn terminal built-in :terminal or :shell

Comments

@ibhagwan

ibhagwan commented Oct 18, 2022

Neovim version (nvim -v)

NVIM v0.8.0

Vim (not Nvim) behaves the same?

not using vim

Operating system/version

Void Linux kernel 5.10.147

Terminal name/version

alacritty 0.10.1

$TERM environment variable

screen-256color

Installation

Void Linux xbps package manager

How to reproduce the issue

I am able to consistently get neovim to hang at 100% CPU when using fzf-lua, opening any interface and quickly pressing <C-c>.

Note that to reproduce this consistently, this commit, which maps <C-c> to <Esc> in terminal mode, needs to be reverted. Alternatively, you can test with the trace branch, which also contains trace logging to a file.

Behind the scenes, fzf-lua opens a new window, runs termopen with the fzf command, and then waits for the job's on_exit callback.
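The flow described above can be sketched roughly as follows. This is only an illustration and not fzf-lua's actual code: the floating-window geometry and the notification message are assumptions made for the example, and it assumes `fzf` is available on `$PATH`. It can be run inside Neovim with `:luafile`.

```lua
-- Hypothetical sketch of the termopen + on_exit flow (not fzf-lua's real implementation).

-- Create an unlisted scratch buffer and show it in a floating window.
local buf = vim.api.nvim_create_buf(false, true)
local win = vim.api.nvim_open_win(buf, true, {
  relative = 'editor',
  width = math.floor(vim.o.columns * 0.8),
  height = math.floor(vim.o.lines * 0.6),
  row = 2,
  col = 4,
  style = 'minimal',
})

-- Run fzf in a terminal attached to the current (scratch) buffer.
-- on_exit fires once the fzf process has terminated.
local job_id = vim.fn.termopen('fzf', {
  on_exit = function(_, exit_code, _)
    vim.notify('fzf exited with code ' .. exit_code, vim.log.levels.INFO)
    if vim.api.nvim_win_is_valid(win) then
      vim.api.nvim_win_close(win, true)
    end
  end,
})

if job_id <= 0 then
  -- termopen returns 0 for invalid arguments and -1 if the command is not executable.
  vim.notify('failed to start fzf', vim.log.levels.ERROR)
else
  -- Enter terminal mode so keystrokes (including <C-c>) go to fzf.
  vim.cmd('startinsert')
end
```

The hang reported in this issue occurs somewhere between the `termopen` call and the `on_exit` callback, which is why the trace logging covers exactly that window.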

While it was stuck, I ran strace -s 99 -ffp <pid> on the neovim pid. It seems to be polling a thread that no longer exists (I'm not sure about that, but pid 27433 in the output below no longer exists). The loop below continues endlessly:

[pid 27433] <... epoll_wait resumed>[{events=EPOLLIN, data={u32=18, u64=18}}], 1024, -1) = 1
[pid 27432] <... write resumed>)        = 8
[pid 27433] read(18, "\1\0\0\0\0\0\0\0", 1024) = 8
[pid 27432] epoll_wait(9,  <unfinished ...>
[pid 27433] epoll_wait(15,  <unfinished ...>
[pid 27432] <... epoll_wait resumed>[], 1024, 0) = 0
[pid 27432] epoll_wait(9, [], 1024, 0)  = 0
[pid 27432] epoll_wait(9, [], 1024, 0)  = 0
[pid 27432] epoll_wait(9, [], 1024, 0)  = 0
[pid 27432] write(32, "\3", 1)          = 1
[pid 27432] write(18, "\1\0\0\0\0\0\0\0", 8 <unfinished ...>
[pid 27433] <... epoll_wait resumed>[{events=EPOLLIN, data={u32=18, u64=18}}], 1024, -1) = 1
[pid 27432] <... write resumed>)        = 8
[pid 27433] read(18, "\1\0\0\0\0\0\0\0", 1024) = 8
[pid 27432] epoll_wait(9,  <unfinished ...>
[pid 27433] epoll_wait(15,  <unfinished ...>
[pid 27432] <... epoll_wait resumed>[], 1024, 0) = 0
[pid 27432] epoll_wait(9, [], 1024, 0)  = 0
[pid 27432] epoll_wait(9, [], 1024, 0)  = 0
[pid 27432] epoll_wait(9, [], 1024, 0)  = 0
[pid 27432] write(32, "\3", 1)          = 1
[pid 27432] write(18, "\1\0\0\0\0\0\0\0", 8 <unfinished ...>

In an effort to understand where this issue comes from, I added trace logging to my plugin, covering the moment termopen succeeds until on_exit is called. You can enable logging to a file (only with the trace branch) with:

require'fzf-lua'.setup({ debug_tracelog = "~/neovim.log" })

With the above setup, the file gets overwritten each time the fzf-lua interface is opened. The results indicate neovim is stuck in an endless loop calling the vim._on_key(char) function at runtime/lua/vim/_editor.lua:560. The relevant part of the log, right before it gets stuck:

2022-10-18T13:26:32 @/home/bhagwan/.local/share/nvim/site/pack/packer/opt/fzf-lua/lua/fzf-lua/fzf.lua:102 nil
2022-10-18T13:26:32 @/home/bhagwan/.local/share/nvim/site/pack/packer/opt/fzf-lua/lua/fzf-lua/fzf.lua:79 finish
2022-10-18T13:26:32 @/home/bhagwan/.local/share/nvim/site/pack/packer/opt/fzf-lua/lua/fzf-lua/fzf.lua:81 finish
2022-10-18T13:26:32 @/home/bhagwan/.local/share/nvim/site/pack/packer/opt/fzf-lua/lua/fzf-lua/fzf.lua:82 finish
2022-10-18T13:26:32 @/home/bhagwan/.local/share/nvim/site/pack/packer/opt/fzf-lua/lua/fzf-lua/fzf.lua:83 finish
2022-10-18T13:26:32 @/home/bhagwan/.local/share/nvim/site/pack/packer/opt/fzf-lua/lua/fzf-lua/fzf.lua:85 finish
2022-10-18T13:26:32 @/home/bhagwan/.local/share/nvim/site/pack/packer/opt/fzf-lua/lua/fzf-lua/fzf.lua:104 nil
2022-10-18T13:26:32 @/home/bhagwan/.local/share/nvim/site/pack/packer/opt/gitsigns.nvim/lua/gitsigns/manager.lua:449 nil
2022-10-18T13:26:32 @/home/bhagwan/.local/share/nvim/site/pack/packer/opt/gitsigns.nvim/lua/gitsigns/manager.lua:450 nil
2022-10-18T13:26:32 @/home/bhagwan/.local/share/nvim/site/pack/packer/opt/gitsigns.nvim/lua/gitsigns/manager.lua:451 nil
2022-10-18T13:26:32 @/usr/share/nvim/runtime/lua/vim/treesitter/highlighter.lua:258 nil
2022-10-18T13:26:32 @/usr/share/nvim/runtime/lua/vim/treesitter/highlighter.lua:259 nil
2022-10-18T13:26:32 @/usr/share/nvim/runtime/lua/vim/treesitter/highlighter.lua:260 nil
2022-10-18T13:26:32 @vim/_editor.lua:561 nil  <= first line of vim._on_key
2022-10-18T13:26:32 @vim/_editor.lua:562 nil
2022-10-18T13:26:32 @vim/_editor.lua:563 nil
2022-10-18T13:26:32 @vim/_editor.lua:572 nil
2022-10-18T13:26:32 @vim/_editor.lua:581 nil  <= last line of vim._on_key
2022-10-18T13:26:32 @vim/_editor.lua:561 nil  <= first line of vim._on_key
2022-10-18T13:26:32 @vim/_editor.lua:562 nil
2022-10-18T13:26:32 @vim/_editor.lua:563 nil
2022-10-18T13:26:32 @vim/_editor.lua:572 nil
2022-10-18T13:26:32 @vim/_editor.lua:581 nil  <= last line of vim._on_key
2022-10-18T13:26:32 @vim/_editor.lua:561 nil  <= first line of vim._on_key
2022-10-18T13:26:32 @vim/_editor.lua:562 nil
2022-10-18T13:26:32 @vim/_editor.lua:563 nil
2022-10-18T13:26:32 @vim/_editor.lua:572 nil
2022-10-18T13:26:32 @vim/_editor.lua:581 nil  <= last line of vim._on_key
2022-10-18T13:26:32 @vim/_editor.lua:561 nil  <= first line of vim._on_key
2022-10-18T13:26:32 @vim/_editor.lua:562 nil
2022-10-18T13:26:32 @vim/_editor.lua:563 nil
2022-10-18T13:26:32 @vim/_editor.lua:572 nil
2022-10-18T13:26:32 @vim/_editor.lua:581 nil  <= last line of vim._on_key
2022-10-18T13:26:32 @vim/_editor.lua:561 nil  <= first line of vim._on_key

I think this might be related to neovim not yet interrupting lua code as explained in #6800?

@famiu, @ZyX-I, @justinmk, if my analysis is correct, do you have any idea why this happens? What part of the code causes this? Any workarounds I can apply before #19096 is merged (aside from remapping <C-c>)?

Expected behavior

Neovim should be able to handle <C-c>.

Actual behavior

Neovim will occasionally become unresponsive when pressing <C-c>.

@ibhagwan ibhagwan added the bug issues reporting wrong behavior label Oct 18, 2022
@zeertzjq zeertzjq added job-control OS processes, spawn input lua stdlib and removed job-control OS processes, spawn labels Oct 18, 2022
@vhoyer

vhoyer commented Oct 21, 2022

vim-airline/vim-airline#2588 I can reproduce this freezing behavior with vim-airline

here is a reproduction repo: https://github.com/vhoyer-bug-reproductions/airline-freeze-on-C-c

@vhoyer

vhoyer commented Oct 21, 2022

also this is happening with NVIM v0.7.2

@vhoyer

vhoyer commented Oct 24, 2022

btw, vim-airline doesn't use lua, so I don't know if the labels are correctly applied, or maybe it's a different problem with the same behavior, should I open a new issue?

@zeertzjq zeertzjq added job-control OS processes, spawn terminal built-in :terminal or :shell and removed lua stdlib labels Oct 24, 2022
@jrahm

jrahm commented Jan 30, 2023

Not sure if this is the same bug or not, but it sounds similar, and I believe I found fairly minimal steps to reproduce:

  1. Open neovim with nvim -u NONE
  2. Open a terminal
  3. Source a file with the following contents:
local vim = assert(vim)
local loop = vim.loop

local i = 0

local function do_cmd()
  i = i + 1
  print ("spawn: " .. i)
  loop.spawn('echo', {
    args = {'hi'}
  }, function () end)
end

local function start_polling()
  vim.defer_fn(function ()
    local j = 0
    while j < 100 do
      do_cmd()
      j = j + 1
    end
    start_polling()
  end, 100)
end

start_polling()
  4. In the terminal (in INSERT mode) hold Ctrl-C. (In other words, just send a bunch of Ctrl-C's to the terminal. For reference, I have my keyboard rate set to xset r rate 300 100, so it repeats pretty fast.)

After several seconds the terminal should hang and Neovim will lock up at 100% CPU.

Obviously this example is pretty contrived, but the freeze happens often enough in my dev workflow to be annoying. (I have a plugin that periodically polls information by launching a bunch of commands.) It appears to be a race condition: if Ctrl-C is pressed at exactly the wrong time during the job life cycle, Neovim gets stuck in a loop.

Poking around in GDB indicates that Neovim thinks it's getting a constant stream of Ctrl-C's, so it appears that maybe got_int is not being cleared somewhere. Using gdb to manually set got_int = 0 will break the loop.

To further add to the mystery, I replicated this on 3 different machines (All Linux-based, Arch and Debian), however, I could not get the bug to replicate on a Gentoo server I run. All versions of Neovim were built from source though, so it seems strange to have different behavior. Maybe the server hardware is beefy enough that it masks the issue.

I was also able to replicate this issue on the packaged release of Neovim in Arch Linux:

$ /usr/bin/nvim --version

NVIM v0.8.2
Build type: Release
LuaJIT 2.1.0-beta3
Compiled by builduser

Features: +acl +iconv +tui
See ":help feature-compile"

   system vimrc file: "$VIM/sysinit.vim"
  fall-back for $VIM: "/usr/share/nvim"

Run :checkhealth for more info

@neovim neovim deleted a comment from masaeedu Mar 12, 2023
@neovim neovim deleted a comment from vhoyer Mar 12, 2023
@neovim neovim deleted a comment from masaeedu Mar 12, 2023
@AugustoDeveloper

This comment was marked as duplicate.

@Kraust

This comment was marked as duplicate.

@dispensable

This comment was marked as duplicate.

@justinmk
Member

justinmk commented Jun 4, 2023

We have clear repro steps given above. So I've hidden redundant comments (though the hints are appreciated).

@xbreak

This comment was marked as off-topic.

@Hippo0o

Hippo0o commented Sep 21, 2023

I was able to stop triggering this bug every day by adding this to my alacritty config:

key_bindings:
  - { key: C, mods: Control,      mode: Alt,  action: None  } # fix neovim freezes
  - { key: C, mods: Alt|Control,  mode: ~Alt, chars: "\x03" }

It basically disables C-c in every TUI and maps C-A-c to C-c instead.

@mattpallissard

mattpallissard commented Oct 11, 2023

I've had the issue of ^c intermittently hosing terminal mode when a command writes a lot of data to std{out,err}.

:split +term
~  base64 < /dev/urandom
^c

While debugging this I noticed the following in the logs:

line 1: ^Ireturn luaeval(printf('require"nvim-treesitter.fold".get_fold_indic(%d)', v:lnum))
nvim_treesitter#foldexpr returning #0

calling nvim_treesitter#foldexpr()

I had the following settings.

set foldmethod=expr
set foldexpr=nvim_treesitter#foldexpr()

So I added this autocmd as a workaround.

autocmd TermOpen * set foldexpr&

This smells like a race condition that is exacerbated by the foldmethod, but I haven't been able to reproduce since modifying the foldexpr.

@mattpallissard

> Poking around in GDB indicates that Neovim thinks it's getting a constant stream of Ctrl-C's, so it appears that maybe got_int is not being cleared somewhere. Using gdb to manually set got_int = 0 will break the loop.

I'll also point out that I saw the same thing as @jrahm listed above. I spent a while stepping through with GDB before I stumbled upon the workaround in my previous comment.

@wookayin
Member

wookayin commented Oct 12, 2023

@mattpallissard That's a great finding! I can also confirm that setting foldexpr=0 for the terminal window prevents neovim from freezing, very consistently. A great repro. I also used the cat /dev/urandom | base64 example to flood terminal outputs.

In your workaround you globally set foldexpr&, but it'd be even better if one can set this locally to the terminal windows: setlocal foldexpr& or setlocal foldexpr=0 so that foldexpr can work for other buffers. Or simply setlocal foldmethod=manual (for the terminal buffer) also works by disabling the execution of foldexpr, hence another workaround:

  autocmd TermOpen * setlocal foldmethod=manual
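
For configurations written in Lua, the same `setlocal foldmethod=manual` workaround could be expressed roughly as below. This is a sketch rather than anything proposed in the thread: the augroup name `TermNoFoldExpr` is an arbitrary choice for the example, and it assumes a Neovim version that provides `nvim_create_autocmd` (0.7 or later).

```lua
-- Workaround sketch: stop foldexpr from being evaluated in terminal windows.
-- The augroup name is hypothetical; any unique name works.
local group = vim.api.nvim_create_augroup('TermNoFoldExpr', { clear = true })

vim.api.nvim_create_autocmd('TermOpen', {
  group = group,
  pattern = '*',
  callback = function()
    -- Local to the terminal's window, so foldexpr keeps working elsewhere.
    vim.opt_local.foldmethod = 'manual'
  end,
})
```

Using an augroup with `clear = true` means re-sourcing the config replaces the autocmd instead of stacking duplicates.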

I've made a more refined and clean version of the repro with minimal dependencies based on @mattpallissard's observation:

-- Run with: nvim --clean -u repro.lua, and hit <Ctrl-C>
vim.o.foldmethod = 'expr'
vim.o.foldexpr = 'MyFold()'

vim.cmd [[
  function! MyFold() abort
    " Do some expensive(?) computation inside foldexpr, to increase the chance of neovim freezing
    " Note: this function gets called **VERY OFTEN**, every single time terminal draws with a new character
    let i = 0 | while i <= 20 | let i += 1 | endwhile
    return 0
  endfunction
]]

vim.cmd [[ term base64 < /dev/urandom ]]
-- vim.cmd [[ setlocal foldmethod=manual ]] -- a workaround to prevent hanging (#20726)
vim.cmd [[ startinsert ]]
Some additional notes regarding the above repro

The following (lua function instead of vimscript foldexpr function) doesn't result in hanging. So it's probably related to vimscript execution?

vim.o.foldexpr = 'v:lua.MyFold()'
function _G.MyFold() return 0 end
The infinite loop happens here

With this repro and a bit of playing around with gdb, I also found that the infinite loop happens around:

@mattpallissard

I don't know the codebase well, but it seems like there are likely two problems

  1. There is a context switch, from terminal to foldexpr, where an interrupt handler does the wrong thing.

  2. The infinite loop after the interrupt is handled incorrectly.

wookayin added a commit to wookayin/dotfiles that referenced this issue Oct 12, 2023
Neovim often freezes when hitting `<Ctrl-C>` on a terminal. As a
workaround for this bug, we disable folding (evaluation of `&foldexpr`)
on terminal buffers because we don't really need folding in a terminal.

See neovim/neovim#20726
@justinmk
Member

justinmk commented Oct 12, 2023

> While debugging this I noticed the following in the logs
>
> line 1: ^Ireturn luaeval(printf('require"nvim-treesitter.fold".get_fold_indic(%d)', v:lnum))

Related?

@chanha-park

I'm not sure if it's related or not, but I also have a foldexpr-related issue. When I use a command like yes inside nvim's built-in terminal, nvim just gets stuck.

After executing the yes command, I cannot do anything (none of <ESC>, <C-c>, <C-\>, <C-\><C-n> works).

Tested with minimal.lua below, with nvim --clean -u minimal.lua. Nvim version 0.9.4 on Linux Mint 21.

-- minimal.lua
for name, url in pairs({
    -- ADD PLUGINS _NECESSARY_ TO REPRODUCE THE ISSUE, e.g:
    -- some_plugin = 'https://github.com/author/plugin.nvim'
    ['nvim-treesitter'] = 'https://github.com/nvim-treesitter/nvim-treesitter',
}) do
    local install_path = vim.fn.fnamemodify('nvim_issue/' .. name, ':p')
    if vim.fn.isdirectory(install_path) == 0 then
        vim.fn.system({ 'git', 'clone', '--depth=1', url, install_path, })
    end
    vim.opt.runtimepath:append(install_path)
end

vim.opt.foldmethod = 'expr'
vim.opt.foldexpr = 'nvim_treesitter#foldexpr()'

hjdivad added a commit to hjdivad/dotfiles that referenced this issue Mar 14, 2024
There's an open issue where there appears to be race conditions when
hitting Ctrl-C during lua deferred functions.

Lots of people were hitting this when using treesitter folding in
terminals.  I was using indent, which may or may not have a similar
problem, but either way I'm happy to disable folding in terminal.

References:

- issue: neovim/neovim#20726
- workaround: neovim/neovim#20726 (comment)
@ibhagwan
Author

ibhagwan commented Apr 1, 2024

This may or may not be related: while attempting to debug this issue I've been using fzf-lua and spamming FzfLua + <C-c>, and when done enough times I would sometimes be able to hang neovim. However, with gdb attached I am unable to get to the got_int loop described by both @jrahm in #20726 (comment) and @wookayin in #20726 (comment). I do see the output below in the gdb console every single time I am able to hang neovim.

A few examples below, each depicting a would-be got_int hang?

Thread 1 "nvim" received signal SIGPIPE, Broken pipe.
0x00007fa66ba9d60f in __GI___libc_write (nbytes=4093, buf=0x557165339f40, fd=20) at ../sysdeps/unix/sysv/linux/write.c:26
26	in ../sysdeps/unix/sysv/linux/write.c

Thread 1 "nvim" received signal SIGPIPE, Broken pipe.
0x00007fa66ba9d60f in __GI___libc_write (nbytes=2174, buf=0x55716533af50, fd=20) at ../sysdeps/unix/sysv/linux/write.c:26
26	in ../sysdeps/unix/sysv/linux/write.c

Thread 1 "nvim" received signal SIGPIPE, Broken pipe.
0x00007fa66ba9d60f in __GI___libc_write (nbytes=164, buf=0x5571654bcf40, fd=21) at ../sysdeps/unix/sysv/linux/write.c:26
26	in ../sysdeps/unix/sysv/linux/write.c

Is it possible that receiving the SIGPIPE somehow causes got_int to never get cleared, so that neovim enters the infinite loop?

@justinmk, is it possible this issue is not related to lua code interruption or folds?

TheLeoP added a commit to TheLeoP/nvim-config that referenced this issue Apr 23, 2024