mpd-git hangs up and becomes unresponsive #151

Closed
mistepien opened this issue Nov 14, 2017 · 16 comments
@mistepien

mistepien commented Nov 14, 2017

There was commit ae3aeca.

Maybe this is the same problem?

The playlist contains a lot of heavy files (e.g. 192/24, DSD64 in WavPack) mixed with 44.1/16 files, while the available output is 96/24.
A script with "sleep 0.3" randomly switches tracks using the "play" protocol command; after 20 seconds mpd stalls and stays permanently unresponsive. In normal usage it will hang as well; the only question is "when". Using mpd involves changing tracks, and the stress test just makes the hang happen much sooner.

This is the output of strace:

trace2.txt
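
For anyone trying to reproduce the hang, the stress test described above could look roughly like the following. This is only a sketch under assumptions: the original script was not posted, the function names random_position and stress_mpd are made up for illustration, and it assumes the mpc client is installed and talks to the affected MPD instance.

```shell
#!/bin/sh
# Hypothetical stress test: jump to a random playlist position every 0.3 s.

# Print a random integer in 1..$1, read from /dev/urandom so that the
# script does not depend on the non-POSIX $RANDOM variable.
random_position() {
    r=$(od -An -N2 -tu2 /dev/urandom | tr -d ' ')
    echo $(( r % $1 + 1 ))
}

# stress_mpd ITERATIONS: send that many "play" commands with a short pause.
# Stops early if mpc reports an error (for example a connection timeout).
stress_mpd() {
    count=$(( $(mpc playlist | wc -l) ))
    [ "$count" -gt 0 ] || { echo "playlist is empty" >&2; return 1; }
    i=0
    while [ "$i" -lt "$1" ]; do
        mpc play "$(random_position "$count")" >/dev/null || return 1
        sleep 0.3
        i=$(( i + 1 ))
    done
}
```

Calling stress_mpd 1000 would then issue up to 1000 random track changes; a non-zero return value suggests that mpc could no longer reach MPD.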

@MaxKellermann
Member

There's nothing "heavy" about such files; decoding them is a piece of cake for contemporary CPUs, even for a Raspberry Pi.
If MPD freezes completely, try strace with the -f option to capture all threads; that may give me a hint where the problem is.
Or better: after it has frozen completely, attach with gdb using gdb --pid=$(pidof mpd), type thread apply all bt, and paste the output here. That requires an unstripped MPD binary (which you likely have because you compiled it).
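
The gdb steps above can also be run non-interactively, which avoids pager prompts in the captured output. A minimal sketch, assuming a single running mpd process; the function name dump_mpd_threads and the output file name are illustrative only.

```shell
#!/bin/sh
# dump_mpd_threads OUTFILE: attach gdb to the running mpd, write a
# backtrace of every thread to OUTFILE, then detach and exit.
dump_mpd_threads() {
    gdb --pid="$(pidof mpd)" -batch \
        -ex "set pagination off" \
        -ex "thread apply all bt" > "$1" 2>&1
}
```

After MPD has frozen, dump_mpd_threads gdb.txt would leave the traces in gdb.txt, ready to attach to the issue.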

@mistepien
Author

mistepien commented Nov 14, 2017

Ok, here it goes.

root@shadow:/tmp/root# file mpd
mpd: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-i386.so.1, with debug_info, not stripped

strace:

root@shadow:/tmp/root# strace -tttfTp $(pidof mpd) -o trace3.txt

trace3.txt

gdb:
gdb1.txt

Both strace and gdb were started once mpd had gone AWOL (exactly when the stress test script hung and mpc timed out). strace was stopped with ^C, and then gdb was started.

@MaxKellermann
Member

Your gdb file is useless because there are no stack traces; probably because your MPD (and/or Musl) binary is stripped. It shows no information at all.
In your trace file, I can see that all threads except the I/O thread are deadlocked, but I don't see why. Please start the strace while MPD is still operational, and then bring it into the deadlocked state while strace is running.
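
One way to follow that order of operations is to attach strace in the background first and only then start the stress test. A rough sketch, reusing the strace flags from the earlier command; the function name trace_mpd and the variable STRACE_PID are made up for illustration.

```shell
#!/bin/sh
# trace_mpd OUTFILE: attach strace to the running mpd (all threads, with
# timestamps and syscall durations) and leave it running in the background.
# The PID of the background strace is stored in STRACE_PID.
trace_mpd() {
    strace -f -ttt -T -p "$(pidof mpd)" -o "$1" &
    STRACE_PID=$!
}
```

A session would then be: trace_mpd trace.txt, run the stress test until mpd hangs, and finally kill "$STRACE_PID" so that strace detaches and finishes writing the file.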

@mistepien
Author

gdb:

GNU gdb (GDB) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i586-buildroot-linux-musl".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 5316
[New LWP 5320]
[New LWP 5321]
[New LWP 5322]
[New LWP 5323]
[New LWP 5520]
0xb73d2533 in __cp_end () from /lib/ld-musl-i386.so.1
(gdb) thread apply all bt

Thread 6 (LWP 5520):
#0 0xb5fdeb13 in ?? () from /usr/lib/libgomp.so.1
#1 0xb5fdebfe in ?? () from /usr/lib/libgomp.so.1
#2 0xb5fdec1c in ?? () from /usr/lib/libgomp.so.1
#3 0xb5fdcf96 in ?? () from /usr/lib/libgomp.so.1
#4 0xb73d38d4 in start () from /lib/ld-musl-i386.so.1
#5 0x00000000 in ?? ()

Thread 5 (LWP 5323):
#0 0xb73d2533 in __cp_end () from /lib/ld-musl-i386.so.1

Thread 4 (LWP 5322):
#0 0xb73d2533 in __cp_end () from /lib/ld-musl-i386.so.1

Thread 3 (LWP 5321):
#0 0xb73d2533 in __cp_end () from /lib/ld-musl-i386.so.1

Thread 2 (LWP 5320):
#0 0xb7384ab9 in __kernel_vsyscall ()
#1 0xb73a1b15 in __vsyscall () from /lib/ld-musl-i386.so.1
#2 0x00000010 in ?? ()
#3 0xb73a1b4c in __vsyscall6 () from /lib/ld-musl-i386.so.1
---Type <return> to continue, or q to quit---
#4 0xb5f2ebc8 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Thread 1 (LWP 5316):
#0 0xb73d2533 in __cp_end () from /lib/ld-musl-i386.so.1
(gdb)

@MaxKellermann
Member

There is still no stack trace.

@mistepien
Author

mistepien commented Nov 14, 2017

Could you tell me what to do?
I'm not familiar with gdb.

By the way, is it possible that "strace -tttfTp" interferes with how mpd works? If strace is started while mpd is working fine, the stress test does not hang mpd.

@mistepien
Author

I also have gdbserver.

@MaxKellermann
Member

Yes, strace can affect the timing, and your problem is probably some kind of race condition, where small timing differences can decide whether MPD deadlocks.
gdbserver doesn't help at all with broken stack traces. What you need first is a Musl binary with debug symbols, and unstripped. I don't know how you installed Musl - if you compiled it, there may be an option to enable debugging. If you downloaded the binary, you may be able to get debug binaries there as well, but I don't know.

@mistepien
Author

mistepien commented Nov 14, 2017

but libc.so is unstripped now:

file /lib/libc.so
/lib/libc.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, with debug_info, not stripped

I have put the unstripped version of libc.so on the filesystem. I use Buildroot, and all binaries are stripped by default, but the build directory contains unstripped files, too.
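
The same file check can be applied to several binaries at once. A small sketch, assuming the file utility is available on the target; the function name report_stripped is hypothetical. Note that "not stripped" alone does not guarantee usable backtraces, as the rest of this thread shows.

```shell
#!/bin/sh
# report_stripped FILE...: for each file, print whether file(1) reports it
# as "not stripped". Symlinks are followed with -L.
report_stripped() {
    for f in "$@"; do
        if file -L "$f" | grep -q 'not stripped'; then
            echo "$f: not stripped"
        else
            echo "$f: stripped (or not an ELF file)"
        fi
    done
}
```

For example, report_stripped /lib/libc.so /tmp/root/mpd checks the two binaries mentioned in this thread.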

@mistepien
Author

mistepien commented Nov 14, 2017

So, libc.so is unstripped, mpd is unstripped, and I have gdb. The system has been restarted, too.

@MaxKellermann
Member

How did you compile Musl? With -g, --enable-debug or other flags? What about -fomit-frame-pointer, or does your toolchain choose this option automatically? Can you recompile with -fno-omit-frame-pointer?
These are just random guesses, because I don't know why gdb is unable to emit useful stack traces. I don't know about your Musl builds, and actually I've never used Musl ...

@mistepien
Author

OK, it will take me some time, but I will do my best.

@mistepien
Author

ok, and now?

gdb2.txt

@mistepien
Author

And now I have compiled mpd-git with the latest patches.
gdb3.txt

@MaxKellermann
Member

Yay, these stack traces are good! Will check them out (probably not today).

@mistepien
Author

Works like a charm. Stress test goes just fine. Thanks a lot!
