Orphaned fish @ 100% cpu after 3.0.0 upgrade #5528

Closed
devsnek opened this issue Jan 14, 2019 · 33 comments
Labels
bug Something that's not working as intended
Comments

@devsnek

devsnek commented Jan 14, 2019

version is 3.0.0
macos 10.14.2

Every other day or so, I notice my laptop is running hot and slow, and when I open Activity Monitor I see fish processes pinned at around 100% CPU.

These processes are seemingly left over, as my terminal isn't usually even open in these cases.

It started happening after I upgraded to 3.0.0.

@mqudsi
Contributor

mqudsi commented Jan 14, 2019

Can you please reinstall with brew install fish --HEAD and see if it still reproduces?

@devsnek
Author

devsnek commented Jan 14, 2019

@mqudsi sure. might take a few days to know if it reproduces though.

@mqudsi
Contributor

mqudsi commented Jan 14, 2019

Sounds good, thanks.

@devsnek
Author

devsnek commented Jan 14, 2019

@mqudsi

==> cmake . -DCMAKE_C_FLAGS_RELEASE=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE=-DNDEBUG -DCMAKE_INSTALL_PREFIX=/usr/local/Cellar/fish/HEAD-0
==> make install
Last 15 lines from /Users/gus/Library/Logs/Homebrew/fish/02.make:
  "_g_profiling_active", referenced from:
      _main in fish.cpp.o
  "_is_interactive_session", referenced from:
      _main in fish.cpp.o
  "_is_login", referenced from:
      _main in fish.cpp.o
  "_no_exec", referenced from:
      _main in fish.cpp.o
  "_program_name", referenced from:
      _main in fish.cpp.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [fish] Error 1
make[1]: *** [CMakeFiles/fish.dir/all] Error 2
make: *** [all] Error 2

@raichoo
Contributor

raichoo commented Jan 14, 2019

I recall seeing this from time to time when running fish inside of neovim and then closing neovim. I'm not sure how to reproduce this properly but I believe (if this is in fact the same issue) that this has been present for quite a while.

@Gonzih
Contributor

Gonzih commented Jan 14, 2019

I have the same issue on Arch Linux

$ fish --version
fish, version 3.0.0

$ uname -a
Linux 4.20.1-arch1-1-ARCH #1 SMP PREEMPT Wed Jan 9 20:25:43 UTC 2019 x86_64 GNU/Linux

Trying to figure out what is causing this.

@zanchey
Member

zanchey commented Jan 14, 2019

@raichoo, yes, around 2.4.0 there were ongoing problems with orphaned processes. I never really got to the bottom of it.

@devsnek, sometimes the build fails like that if you have GNU binutils installed (see #5296). You could try unlinking or removing it first (brew unlink binutils).

@Gonzih could you try getting a backtrace from a spinning process? gdb -p FISH_PID /usr/bin/fish --ex 'thread apply all bt' should do the job.

@Gonzih
Contributor

Gonzih commented Jan 14, 2019

@zanchey

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/fish...(no debugging symbols found)...done.
Attaching to program: /usr/bin/fish, process 29578
Reading symbols from /usr/lib/libdl.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libncursesw.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Reading symbols from /usr/lib/librt.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpcre2-32.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libstdc++.so.6...done.
Reading symbols from /usr/lib/libm.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libc.so.6...(no debugging symbols found)...done.
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgcc_s.so.1...done.
0x00007faa5b4cb693 in _int_free () from /usr/lib/libc.so.6

Thread 1 (Thread 0x7faa5b42e240 (LWP 29578)):
#0  0x00007faa5b4cb693 in _int_free () from /usr/lib/libc.so.6
#1  0x000055715db75c92 in input_get_bind_mode[abi:cxx11]() ()
#2  0x000055715db7619b in input_readch(bool) ()
#3  0x000055715dbb650b in reader_readline(int) ()
#4  0x000055715dbb95fc in reader_read(int, io_chain_t const&) ()
#5  0x000055715dadc7fd in main ()

@Gonzih
Contributor

Gonzih commented Jan 14, 2019

Huh I get a different backtrace every time

Thread 1 (Thread 0x7faa5b42e240 (LWP 29578)):
#0  0x00007faa5b4cea32 in malloc () from /usr/lib/libc.so.6
#1  0x00007faa5b8245fd in operator new (sz=512) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/new_op.cc:50
#2  0x000055715db81383 in input_common_next_ch(unsigned int) ()
#3  0x000055715db736f8 in ?? ()
#4  0x000055715db76288 in input_readch(bool) ()
#5  0x000055715dbb650b in reader_readline(int) ()
#6  0x000055715dbb95fc in reader_read(int, io_chain_t const&) ()
#7  0x000055715dadc7fd in main ()

Thread 1 (Thread 0x7faa5b42e240 (LWP 29578)):
#0  0x000055715db80e47 in input_common_readch(int) ()
#1  0x000055715db736e3 in ?? ()
#2  0x000055715db762c8 in input_readch(bool) ()
#3  0x000055715dbb650b in reader_readline(int) ()
#4  0x000055715dbb95fc in reader_read(int, io_chain_t const&) ()
#5  0x000055715dadc7fd in main ()

But it looks like this is all related to the readline call.

Reproducing is very simple:

  • open a fish shell and type in a bunch of commands, including a nonexistent command to get a non-zero return code
  • focus out of the terminal
  • focus back in to terminal and kill the terminal emulator

This should end up with the orphaned fish shell process stuck in a loop at 100% CPU usage.

@Gonzih
Contributor

Gonzih commented Jan 14, 2019

Is this the case when the shell instance loses STDIN because the terminal emulator was closed or something? Just speculating based on the behavior I experienced.

@faho
Member

faho commented Jan 15, 2019

Is this the case when the shell instance loses STDIN because the terminal emulator was closed or something?

@Gonzih: That should be basically it, yeah. It appears we're not checking an error somewhere.
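
As a rough illustration of the kind of check being discussed, here is a minimal C++ sketch of a byte reader that tells end-of-input apart from real data. The names `read_status`, `read_one_byte`, and `drain` are hypothetical and are not taken from fish's source; only the POSIX `read()` behavior (returning 0 at end of file and -1 on error) is relied on.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <unistd.h>

// Hypothetical result type for this sketch; not fish's actual API.
enum class read_status { got_char, eof, error };

// Read one byte from fd, retrying if a signal interrupts the call.
// read() returning 0 means end of input (for a shell, the terminal is gone).
read_status read_one_byte(int fd, char &out) {
    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, &out, 1);
        if (n == 1) return read_status::got_char;
        if (n == 0) return read_status::eof;
        if (errno == EINTR) continue;  // interrupted, try again
        return read_status::error;
    }
}

// Collect bytes until EOF or an error. The loop stops on either status;
// a caller that ignored them and kept retrying would spin at full CPU.
std::string drain(int fd) {
    std::string result;
    char c;
    while (read_one_byte(fd, c) == read_status::got_char) {
        result.push_back(c);
    }
    return result;
}
```

The point of the sketch is only that both the EOF and the error status have to end the read loop; where exactly fish was missing such a check is what the rest of this thread works out.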


@floam floam added the bug Something that's not working as intended label Jan 15, 2019

@mqudsi
Contributor

mqudsi commented Jan 16, 2019

This issue has gotten very messy. Please do not comment unless you are building from master and not the 3.0 release.

Thanks!

@devsnek
Author

devsnek commented Jan 18, 2019

Happened again on the HEAD install @mqudsi

Thread 1 (Thread 0xf03 of process 27578):
#0  0x0000000100a5cfc3 in std::__1::deque<wchar_t, std::__1::allocator<wchar_t> >::push_front(wchar_t&&) ()
#1  0x0000000100a5c8a4 in input_common_next_ch(int) ()
#2  0x0000000100a5806e in input_mapping_is_match(input_mapping_t const&) ()
#3  0x0000000100a567b3 in input_readch(bool) ()
#4  0x0000000100a829e9 in reader_readline(int) ()
#5  0x0000000100a87e4f in reader_read(int, io_chain_t const&) ()
#6  0x00000001009e965d in main ()

@mqudsi
Contributor

mqudsi commented Jan 18, 2019

@devsnek with 100% cpu?

@devsnek
Author

devsnek commented Jan 18, 2019

@mqudsi bouncing between 94ish and 100%

@zanchey zanchey added this to the fish-future milestone Jan 19, 2019
@exrok

exrok commented Jan 20, 2019

On commit c66b312, I also found fish stuck at a solid 100% CPU usage.

#0  0x00007fb107b33e61 in __wmemcmp_sse4_1 () from /usr/lib/libc.so.6
#1  0x00005576f35de248 in input_readch(bool) ()
#2  0x00005576f361e50b in reader_readline(int) ()
#3  0x00005576f36215fc in reader_read(int, io_chain_t const&) ()
#4  0x00005576f35447fd in main ()

@devsnek
Author

devsnek commented Jan 22, 2019

happened again, same backtrace

Thread 1 (Thread 0x1003 of process 40307):
#0  0x00000001048e1fe0 in std::__1::deque<wchar_t, std::__1::allocator<wchar_t> >::push_front(wchar_t&&) ()
#1  0x00000001048e18a4 in input_common_next_ch(int) ()
#2  0x00000001048dd06e in input_mapping_is_match(input_mapping_t const&) ()
#3  0x00000001048db7b3 in input_readch(bool) ()
#4  0x00000001049079e9 in reader_readline(int) ()
#5  0x000000010490ce4f in reader_read(int, io_chain_t const&) ()
#6  0x000000010486e65d in main ()

@ridiculousfish ridiculousfish modified the milestones: fish 3.1.0, fish 3.0.1 Jan 25, 2019
@ridiculousfish
Member

What should happen here is that the self-insert binding is always available as a fallback, which will always eat one character from the lookahead queue - so it shouldn't be possible to get into such a loop.

Unfortunately I think we need steps to reproduce - any hints on how to trigger this @devsnek or @exrok? Does it happen after pasting...? What OS, terminal, etc?

@devsnek
Author

devsnek commented Jan 27, 2019

@ridiculousfish macos 10.14.3 beta, but i don't have any reproduction steps. i've tried a few times and can't find anything that triggers it.

@exrok

exrok commented Jan 27, 2019

@ridiculousfish OS: Arch Linux, urxvt terminal, does not seem to happen after pasting. I only ever notice it because of the high CPU usage and then find it in htop. Even after closing all terminals and tmux sessions the process will still exist and has to be killed directly. I have tried to reproduce it to no avail. It has happened over 10 times combined between my desktop and laptop.

@faho
Member

faho commented Jan 27, 2019

@exrok, @devsnek, @Gonzih, @raichoo: Are all of you using the vi bindings?

In which case:

What should happen here is that the self-insert binding is always available as a fallback

That's not true in normal-mode.
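
To make the difference concrete, here is a small C++ sketch of why a self-insert fallback guarantees progress while a mode without one does not. Everything in it (`bind_mode`, `consume_one`, the single `i` binding in normal mode) is a simplified assumption for illustration, not fish's real binding machinery; explicit insert-mode bindings are left out entirely.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical mode flag for this sketch.
enum class bind_mode { insert, normal };

// Try to consume input from the lookahead queue.
// Returns true if a character was removed from the queue.
bool consume_one(std::deque<wchar_t> &lookahead, bind_mode mode, std::wstring &line) {
    if (lookahead.empty()) return false;
    wchar_t c = lookahead.front();
    if (mode == bind_mode::insert) {
        // Fallback: self-insert always eats exactly one character,
        // so the queue shrinks on every call.
        lookahead.pop_front();
        line.push_back(c);
        return true;
    }
    // Normal mode in this sketch binds only 'i' and has no catch-all.
    if (c == L'i') {
        lookahead.pop_front();
        return true;
    }
    // Nothing matched: the character stays queued and no progress is made.
    return false;
}
```

In insert mode every call shortens the queue, so a caller that loops until the queue is empty terminates. In normal mode an unbound character leaves the queue untouched, which is the precondition for the busy loop reported above.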

@devsnek
Author

devsnek commented Jan 27, 2019

yes, I do use vi mode

@kbrah

kbrah commented Jan 27, 2019

This happens to me too. I use vi key bindings and I am able to reproduce by killing a terminal when it is in normal mode.

@faho
Member

faho commented Jan 27, 2019

@kbrah: Please try bind -m default "" false, and then kill the terminal in normal mode.

@exrok

exrok commented Jan 27, 2019

@faho I am using vi mode as well, bind -m default "" false and then killing the terminal in normal mode causes this bug for me.

@faho
Member

faho commented Jan 27, 2019

So, I think I almost understand this.

To get it to trigger, set up vi-mode, then enter normal mode and immediately quit the terminal - I think it has to be before the escape timeout has elapsed.

That causes us, for reasons not yet clear to me, to have R_NULL in the input queue, followed by (I think) R_EOF. We read R_NULL, and since we can't do anything with it, put it back. So we never get to the R_EOF, so we never quit.

But I don't quite get what R_NULL is for. Is it a separator between mappings? So you only ever read one at the end of a mapping?

@Gonzih
Contributor

Gonzih commented Jan 27, 2019

Yes, I also use vi mode in my shell. Since updating to 3.0.0 I noticed that exiting back to normal mode "felt" a bit off. bind -m default "" false reproduced the issue on my machine as well.

@faho
Member

faho commented Jan 28, 2019

Okay, I think I got it. There are two different issues. One is with bind "" true, because that has "allow_commands == false", so it'll put an R_NULL and loop on R_NULL. That one remains, but isn't an issue in practice because nobody is messing with generic bindings, so it's not critical for 3.0.1.

The other is actually looping on R_EOF (the "symbol" we assign to getting an EOF on stdin)! Because there's no generic binding to read an R_EOF, in this case it'll read R_EOF, try to execute a mapping on it, find nothing, put it back, and repeat, without ever returning. So the solution is simply to do an early return - R_EOF can only ever signal that we need to quit anyway.
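
The read, fail to match, put back cycle and the early return can be sketched in C++ roughly as follows. This is a simplified model under stated assumptions, not the actual patch: `kEof` stands in for R_EOF with an arbitrary value, `normal_mode_has_binding` is a hypothetical stand-in for the mapping lookup, and `max_spins` exists only so the sketch terminates in the buggy configuration.

```cpp
#include <cassert>
#include <deque>

// Hypothetical sentinel standing in for R_EOF; the real value differs.
constexpr wchar_t kEof = 0xF8FF;

// Hypothetical mapping lookup: vi normal mode here binds only 'i'
// and has no generic binding that could match the EOF sentinel.
bool normal_mode_has_binding(wchar_t c) { return c == L'i'; }

enum class outcome { handled, eof, gave_up };

// Read the next character and try to match a binding for it.
// max_spins bounds the loop for the sketch; the real loop had no bound.
outcome read_char(std::deque<wchar_t> &queue, bool early_return_on_eof, int max_spins) {
    for (int spin = 0; spin < max_spins; ++spin) {
        if (queue.empty()) return outcome::gave_up;
        wchar_t c = queue.front();
        queue.pop_front();
        // The fix: EOF can only mean "quit", so return before matching.
        if (early_return_on_eof && c == kEof) return outcome::eof;
        if (normal_mode_has_binding(c)) return outcome::handled;
        // No binding matched: put the character back and try again.
        queue.push_front(c);
    }
    return outcome::gave_up;
}
```

With `early_return_on_eof` set to false and a queue holding only `kEof`, the function burns through all of its spins with the sentinel still queued, which mirrors the 100% CPU loop. With it set to true, the sentinel is consumed on the first pass and the caller can exit.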

faho added a commit that referenced this issue Jan 28, 2019
If we read an R_EOF, we'd try to match mappings to it.

In emacs mode, that's not an issue because the generic binding was
always available, but in vi-normal mode there is no generic binding,
so we'd endlessly loop, waiting for another character.

Fixes #5528.
@faho
Member

faho commented Jan 28, 2019

And cherry-picked to Integration_3.0.1 as 8feabae.

@vcfvct

vcfvct commented Jun 4, 2019

Was annoyed by this issue. Thanks a lot @devsnek for reporting this.
Upgraded to 3.0.x and it seems to be working fine now. 👍

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 17, 2020