Orphaned fish @ 100% cpu after 3.0.0 upgrade #5528

Closed
devsnek opened this Issue Jan 14, 2019 · 32 comments

devsnek commented Jan 14, 2019

version is 3.0.0
macos 10.14.2

Every other day or so, i notice my laptop is running hot and slow, and i open up activity monitor and notice this:

These processes are seemingly left over, as my terminal isn't usually even open in these cases.

It started happening after i upgraded to 3.0.0

Contributor

mqudsi commented Jan 14, 2019

Can you please reinstall with brew install fish --HEAD and see if it still reproduces?

Author

devsnek commented Jan 14, 2019

@mqudsi sure. might take a few days to know if it reproduces though.

Contributor

mqudsi commented Jan 14, 2019

Sounds good, thanks.

Author

devsnek commented Jan 14, 2019

@mqudsi

==> cmake . -DCMAKE_C_FLAGS_RELEASE=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE=-DNDEBUG -DCMAKE_INSTALL_PREFIX=/usr/local/Cellar/fish/HEAD-0
==> make install
Last 15 lines from /Users/gus/Library/Logs/Homebrew/fish/02.make:
  "_g_profiling_active", referenced from:
      _main in fish.cpp.o
  "_is_interactive_session", referenced from:
      _main in fish.cpp.o
  "_is_login", referenced from:
      _main in fish.cpp.o
  "_no_exec", referenced from:
      _main in fish.cpp.o
  "_program_name", referenced from:
      _main in fish.cpp.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [fish] Error 1
make[1]: *** [CMakeFiles/fish.dir/all] Error 2
make: *** [all] Error 2

Contributor

raichoo commented Jan 14, 2019

I recall seeing this from time to time when running fish inside of neovim and then closing neovim. I'm not sure how to reproduce this properly but I believe (if this is in fact the same issue) that this has been present for quite a while.

Contributor

Gonzih commented Jan 14, 2019

I have the same issue on Arch Linux

$ fish --version
fish, version 3.0.0

$ uname -a
Linux 4.20.1-arch1-1-ARCH #1 SMP PREEMPT Wed Jan 9 20:25:43 UTC 2019 x86_64 GNU/Linux

Trying to figure out what is causing this.

Member

zanchey commented Jan 14, 2019

@raichoo, yes, around 2.4.0 there were ongoing problems with orphaned processes. I never really got to the bottom of it.

@devsnek, sometimes the build fails like that if you have GNU binutils installed (see #5296). You could try unlinking or removing it first (brew unlink binutils).

@Gonzih could you try getting a backtrace from a spinning process? gdb -p FISH_PID /usr/bin/fish --ex 'thread apply all bt' should do the job.

Contributor

Gonzih commented Jan 14, 2019

@zanchey

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/fish...(no debugging symbols found)...done.
Attaching to program: /usr/bin/fish, process 29578
Reading symbols from /usr/lib/libdl.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libncursesw.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Reading symbols from /usr/lib/librt.so.1...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libpcre2-32.so.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libstdc++.so.6...done.
Reading symbols from /usr/lib/libm.so.6...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libc.so.6...(no debugging symbols found)...done.
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Reading symbols from /usr/lib/libgcc_s.so.1...done.
0x00007faa5b4cb693 in _int_free () from /usr/lib/libc.so.6

Thread 1 (Thread 0x7faa5b42e240 (LWP 29578)):
#0  0x00007faa5b4cb693 in _int_free () from /usr/lib/libc.so.6
#1  0x000055715db75c92 in input_get_bind_mode[abi:cxx11]() ()
#2  0x000055715db7619b in input_readch(bool) ()
#3  0x000055715dbb650b in reader_readline(int) ()
#4  0x000055715dbb95fc in reader_read(int, io_chain_t const&) ()
#5  0x000055715dadc7fd in main ()

Contributor

Gonzih commented Jan 14, 2019

Huh I get a different backtrace every time

Thread 1 (Thread 0x7faa5b42e240 (LWP 29578)):
#0  0x00007faa5b4cea32 in malloc () from /usr/lib/libc.so.6
#1  0x00007faa5b8245fd in operator new (sz=512) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/new_op.cc:50
#2  0x000055715db81383 in input_common_next_ch(unsigned int) ()
#3  0x000055715db736f8 in ?? ()
#4  0x000055715db76288 in input_readch(bool) ()
#5  0x000055715dbb650b in reader_readline(int) ()
#6  0x000055715dbb95fc in reader_read(int, io_chain_t const&) ()
#7  0x000055715dadc7fd in main ()

Thread 1 (Thread 0x7faa5b42e240 (LWP 29578)):
#0  0x000055715db80e47 in input_common_readch(int) ()
#1  0x000055715db736e3 in ?? ()
#2  0x000055715db762c8 in input_readch(bool) ()
#3  0x000055715dbb650b in reader_readline(int) ()
#4  0x000055715dbb95fc in reader_read(int, io_chain_t const&) ()
#5  0x000055715dadc7fd in main ()

But it looks like this is all related to the readline call.

Reproducing is very simple:

  • open a fish shell and type in a bunch of commands, including a nonexistent command to get a non-zero return code
  • focus out of the terminal
  • focus back in to the terminal and kill the terminal emulator

This should end up with the fish shell process stuck in a loop at 100% CPU usage.

Contributor

Gonzih commented Jan 14, 2019

Is this the case when the shell instance loses STDIN because the terminal emulator was closed or something? Just speculating based on the behavior I experienced.

Member

faho commented Jan 15, 2019

> Is this the case when the shell instance loses STDIN because the terminal emulator was closed or something?

@Gonzih: That should be basically it, yeah. It appears we're not checking an error somewhere.
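To illustrate the kind of unchecked condition being discussed, here is a minimal sketch in Python. It is not fish's code: it uses a pipe as a stand-in for the terminal (an assumption; a real terminal hangup may surface as EOF or as an error such as EIO), and the helper names `read_until_eof` and `demo` are made up for this example. The point it shows is that once the writing side is gone, every read returns immediately with an empty result, so a loop that does not treat that as terminal will spin instead of blocking.

```python
import os

def read_until_eof(fd, max_reads=1000):
    """Read from fd until EOF; return (data, number_of_read_calls)."""
    chunks = []
    reads = 0
    while reads < max_reads:
        chunk = os.read(fd, 4096)
        reads += 1
        if chunk == b"":
            # EOF: the writer is gone and nothing more will ever arrive.
            # Without this check the loop would keep calling read() and
            # burn CPU, because read() no longer blocks.
            break
        chunks.append(chunk)
    return b"".join(chunks), reads

def demo():
    r, w = os.pipe()
    os.write(w, b"ls\n")
    os.close(w)  # stands in for the terminal emulator being killed
    data, reads = read_until_eof(r)
    # After EOF, further reads return b"" right away instead of blocking.
    repeated = [os.read(r, 4096) for _ in range(3)]
    os.close(r)
    return data, reads, repeated
```

Calling `demo()` reads the buffered input once, sees EOF on the next read, and then shows that repeated reads keep returning an empty result without blocking.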

This comment was marked as off-topic.

xpac27 commented Jan 15, 2019

The same problem happens when using entr with the -s option together with tup. In addition to fish using 100% CPU, the program no longer responds to CTRL+C, and the fish process must be killed manually.

find . | entr -s 'tup'

floam added the bug label Jan 15, 2019

This comment was marked as off-topic.

Member

faho commented Jan 15, 2019

@xpac27: Not the same problem, because it's not about the terminal hanging up.

That one is most likely #5438, which was already fixed and will be included in 3.0.1.

This comment was marked as outdated.

finnito commented Jan 15, 2019

I'm also seeing this on macOS.

fish --version
fish, version 3.0.0

uname -a
Darwin Finns-MacBook-Pro.local 18.2.0 Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 2018; root:xnu-4903.231.4~2/RELEASE_X86_64 x86_64

sudo gdb -p "7051" /usr/bin/fish --ex 'thread apply all bt'
GNU gdb (GDB) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin18.2.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
/usr/bin/fish: No such file or directory.
Attaching to process 7051
[New Thread 0xf03 of process 7051]
Error calling thread_get_state for GP registers for thread 0xf03

warning: Mach error at "i386-darwin-nat.c:132" in function "virtual void i386_darwin_nat_target::fetch_registers(struct regcache *, int)": (os/kern) invalid argument (0x4)
Reading symbols from /usr/local/Cellar/fish/3.0.0/bin/fish...(no debugging symbols found)...done.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/bsd.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/darwin_vers.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/dirstat.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/dirstat_collection.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/err.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/exception.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/init.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/mach.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/stdio.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/stdlib.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/string.o': can't open to read symbols: No such file or directory.

warning: `/BuildRoot/Library/Caches/com.apple.xbs/Binaries/Libc_darwin/install/TempContent/Objects/Libc.build/libsystem_darwin.dylib.build/Objects-normal/x86_64/variant.o': can't open to read symbols: No such file or directory.
<signal handler called>

Thread 1 (Thread 0xf03 of process 7051):
#0  <signal handler called>
#1  0x0000000102fac6f1 in vtable for std::__1::__shared_ptr_emplace<parsed_source_t, std::__1::allocator<parsed_source_t> > ()
#2  0x00007fae1d500468 in ?? ()
#3  0x00007fae1d500468 in ?? ()
#4  0x0000000000000000 in ?? ()

If I can provide anything else that might help, let me know! The processes I observe getting orphaned run at around 30% CPU and keep my Macbook whirring along nicely, haha.

Contributor

mqudsi commented Jan 16, 2019

This issue has gotten very messy. Please do not comment unless you are building from master and not the 3.0 release.

Thanks!

Author

devsnek commented Jan 18, 2019

Happened again on the HEAD install @mqudsi

Thread 1 (Thread 0xf03 of process 27578):
#0  0x0000000100a5cfc3 in std::__1::deque<wchar_t, std::__1::allocator<wchar_t> >::push_front(wchar_t&&) ()
#1  0x0000000100a5c8a4 in input_common_next_ch(int) ()
#2  0x0000000100a5806e in input_mapping_is_match(input_mapping_t const&) ()
#3  0x0000000100a567b3 in input_readch(bool) ()
#4  0x0000000100a829e9 in reader_readline(int) ()
#5  0x0000000100a87e4f in reader_read(int, io_chain_t const&) ()
#6  0x00000001009e965d in main ()

Contributor

mqudsi commented Jan 18, 2019

@devsnek with 100% cpu?

Author

devsnek commented Jan 18, 2019

@mqudsi bouncing between 94ish and 100%

zanchey added this to the fish-future milestone Jan 19, 2019

exrok commented Jan 20, 2019

On commit c66b312, I also found fish stuck at a solid 100% CPU usage.

#0  0x00007fb107b33e61 in __wmemcmp_sse4_1 () from /usr/lib/libc.so.6
#1  0x00005576f35de248 in input_readch(bool) ()
#2  0x00005576f361e50b in reader_readline(int) ()
#3  0x00005576f36215fc in reader_read(int, io_chain_t const&) ()
#4  0x00005576f35447fd in main ()

Author

devsnek commented Jan 22, 2019

happened again, same backtrace

Thread 1 (Thread 0x1003 of process 40307):
#0  0x00000001048e1fe0 in std::__1::deque<wchar_t, std::__1::allocator<wchar_t> >::push_front(wchar_t&&) ()
#1  0x00000001048e18a4 in input_common_next_ch(int) ()
#2  0x00000001048dd06e in input_mapping_is_match(input_mapping_t const&) ()
#3  0x00000001048db7b3 in input_readch(bool) ()
#4  0x00000001049079e9 in reader_readline(int) ()
#5  0x000000010490ce4f in reader_read(int, io_chain_t const&) ()
#6  0x000000010486e65d in main ()

ridiculousfish modified the milestones: fish 3.1.0, fish 3.0.1 Jan 25, 2019

Member

ridiculousfish commented Jan 27, 2019

What should happen here is that the self-insert binding is always available as a fallback, which will always eat one character from the lookahead queue - so it shouldn't be possible to get into such a loop.

Unfortunately I think we need steps to reproduce - any hints on how to trigger this @devsnek or @exrok? Does it happen after pasting...? What OS, terminal, etc?

Author

devsnek commented Jan 27, 2019

@ridiculousfish macos 10.14.3 beta, but i don't have any reproduction steps. i've tried a few times and can't find anything that triggers it.

exrok commented Jan 27, 2019

@ridiculousfish os: Arch Linux, urxvt terminal, does not seem to happen after pasting. I only ever notice it because of high CPU usage and then find it in htop. Even after closing all terminals and tmux sessions the process will still exist and has to be killed directly. I have tried to reproduce it to no avail. It has happened over 10 times combined between my desktop and laptop.

Member

faho commented Jan 27, 2019

@exrok, @devsnek, @Gonzih, @raichoo: Are all of you using the vi bindings?

In which case:

> What should happen here is that the self-insert binding is always available as a fallback

That's not true in normal-mode.

Author

devsnek commented Jan 27, 2019

yes, I do use vi mode

kbrah commented Jan 27, 2019

This happens to me too. I use vi key bindings and I am able to reproduce by killing a terminal when it is in normal mode.

Member

faho commented Jan 27, 2019

@kbrah: Please try bind -m default "" false, and then kill the terminal in normal mode.

exrok commented Jan 27, 2019

@faho I am using vi mode as well, bind -m default "" false and then killing the terminal in normal mode causes this bug for me.

Member

faho commented Jan 27, 2019

So, I think I almost understand this.

To get it to trigger, set up vi-mode, then enter normal mode and immediately quit the terminal - I think it has to be before the escape timeout has elapsed.

That causes us, for reasons not yet clear to me, to have R_NULL in the input queue, followed by (I think) R_EOF. We read R_NULL, and since we can't do anything with it, put it back. So we never get to the R_EOF, so we never quit.

But I don't quite get what R_NULL is for. Is it a separator between mappings? So you only ever read one at the end of a mapping?

Contributor

Gonzih commented Jan 27, 2019

Yes, I also use vi mode in my shell. Since the update to 3.0.0 I noticed that exiting back to normal mode "felt" a bit off. bind -m default "" false reproduced the issue on my machine as well.

Member

faho commented Jan 28, 2019

Okay, I think I got it. There's two different issues. One is with bind "" true, because that has "allow_commands == false", so it'll put an R_NULL and loop on R_NULL. That one remains, but isn't an issue in practice because nobody is messing with generic bindings, so it's not critical for 3.0.1.

The other is actually looping on R_EOF (the "symbol" we assign to getting an EOF on stdin)! Because there's no generic binding to read an R_EOF, in this case it'll read R_EOF, try to execute a mapping on it, find nothing, put it back, and repeat, without ever returning. So the solution is simply to do an early return - R_EOF can only ever signal that we need to quit anyway.
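The R_EOF loop and the early-return fix described above can be sketched with a toy model. This is a simplified illustration in Python, not fish's actual C++ code: only the name R_EOF comes from the thread, while `next_event`, its parameters, and the `max_spins` guard (which stands in for "loops forever") are hypothetical.

```python
from collections import deque

R_EOF = object()  # stand-in for the marker fish queues when stdin hits EOF

def next_event(queue, mappings, generic_fallback, eof_early_return, max_spins=1000):
    """Toy model of the mapping-match loop; returns (result, spins)."""
    spins = 0
    while spins < max_spins:
        spins += 1
        ch = queue.popleft()
        if eof_early_return and ch is R_EOF:
            # The fix: R_EOF can only mean "quit", so return right away.
            return "quit", spins
        if ch in mappings:
            return mappings[ch], spins
        if generic_fallback:
            # Emacs-style modes: the generic binding consumes anything.
            return "self-insert", spins
        # Nothing matched and no fallback exists: put the item back and
        # retry. With R_EOF at the front this never makes progress.
        queue.appendleft(ch)
    return "spinning", spins

vi_normal = {"i": "enter-insert-mode"}  # hypothetical binding table

stuck = next_event(deque([R_EOF]), vi_normal, False, False)
fixed = next_event(deque([R_EOF]), vi_normal, False, True)
emacs = next_event(deque([R_EOF]), {}, True, False)
```

In this model `stuck` exhausts the spin budget, `fixed` returns on the first iteration, and `emacs` also returns on the first iteration because the generic fallback consumes the item, which matches the observation that only vi normal mode was affected.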

faho added a commit that referenced this issue Jan 28, 2019

Quit immediately with R_EOF
If we read an R_EOF, we'd try to match mappings to it.

In emacs mode, that's not an issue because the generic binding was
always available, but in vi-normal mode there is no generic binding,
so we'd endlessly loop, waiting for another character.

Fixes #5528.

Member

faho commented Jan 28, 2019

And cherry-picked to Integration_3.0.1 as 8feabae.
