GTK2+ applications don't start #41

Closed
cryptomilk opened this Issue Jan 11, 2014 · 48 comments

cryptomilk commented Jan 11, 2014

Hi,

I have problems starting up easytag if qtcurve is in use. I have 1.8.17 here, it works if I use 1.8.16 with the runCommand patch and start it in strace.

Author

cryptomilk commented Jan 11, 2014

strace

(gdb) bt
#0  0x00007ffff4ba60d2 in wait () from /lib64/libc.so.6
#1  0x000000000043674b in ?? ()
#2  <signal handler called>
#3  0x00007ffff4b5a3ac in _IO_proc_close@@GLIBC_2.2.5 () from /lib64/libc.so.6
#4  0x00007ffff4b63b20 in __GI__IO_file_close_it () from /lib64/libc.so.6
#5  0x00007ffff4b58688 in fclose@@GLIBC_2.2.5 () from /lib64/libc.so.6
#6  0x00007fffeaf9ee9a in runCommand () from /usr/lib64/gtk-2.0/2.10.0/engines/libqtcurve.so
#7  0x00007fffeaf9eed4 in ?? () from /usr/lib64/gtk-2.0/2.10.0/engines/libqtcurve.so
#8  0x00007fffeaf9efbe in ?? () from /usr/lib64/gtk-2.0/2.10.0/engines/libqtcurve.so
#9  0x00007fffeaf9f502 in qtSettingsInit () from /usr/lib64/gtk-2.0/2.10.0/engines/libqtcurve.so
#10 0x00007fffeaf9afb3 in ?? () from /usr/lib64/gtk-2.0/2.10.0/engines/libqtcurve.so
#11 0x00007ffff6af5dab in g_type_create_instance () from /usr/lib64/libgobject-2.0.so.0
#12 0x00007ffff6ada275 in ?? () from /usr/lib64/libgobject-2.0.so.0
#13 0x00007ffff6adc06d in g_object_newv () from /usr/lib64/libgobject-2.0.so.0
#14 0x00007ffff6adc81c in g_object_new () from /usr/lib64/libgobject-2.0.so.0
#15 0x00007fffeaf9b78c in theme_create_rc_style () from /usr/lib64/gtk-2.0/2.10.0/engines/libqtcurve.so
#16 0x00007ffff791d906 in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#17 0x00007ffff791e1d5 in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#18 0x00007ffff791e36f in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#19 0x00007ffff791cb68 in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#20 0x00007ffff791e1d5 in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#21 0x00007ffff791e36f in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#22 0x00007ffff791ebdf in gtk_rc_reparse_all_for_settings () from /usr/lib64/libgtk-x11-2.0.so.0
#23 0x00007ffff793ab12 in gtk_settings_get_for_screen () from /usr/lib64/libgtk-x11-2.0.so.0
#24 0x00007fffeb7fae66 in ?? () from /usr/lib64/gtk-2.0/modules/libcanberra-gtk-module.so
#25 0x00007fffeb7fb128 in gtk_module_init () from /usr/lib64/gtk-2.0/modules/libcanberra-gtk-module.so
#26 0x00007ffff78eb441 in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#27 0x00007ffff6ad5318 in g_closure_invoke () from /usr/lib64/libgobject-2.0.so.0
#28 0x00007ffff6ae6cad in ?? () from /usr/lib64/libgobject-2.0.so.0
#29 0x00007ffff6aee9b9 in g_signal_emit_valist () from /usr/lib64/libgobject-2.0.so.0
#30 0x00007ffff6aeec72 in g_signal_emit () from /usr/lib64/libgobject-2.0.so.0
#31 0x00007ffff6ad9695 in ?? () from /usr/lib64/libgobject-2.0.so.0
#32 0x00007ffff6adbc4b in g_object_notify () from /usr/lib64/libgobject-2.0.so.0
#33 0x00007ffff75097b5 in gdk_display_open_default_libgtk_only () from /usr/lib64/libgdk-x11-2.0.so.0
#34 0x00007ffff78d34b4 in gtk_init_check () from /usr/lib64/libgtk-x11-2.0.so.0
#35 0x00007ffff78d34d9 in gtk_init () from /usr/lib64/libgtk-x11-2.0.so.0
#36 0x000000000041140d in ?? ()
#37 0x00007ffff4b0fbe5 in __libc_start_main () from /lib64/libc.so.6
#38 0x0000000000414eed in ?? ()

It hangs in the pclose() call in runCommand().

Member

yuyichao commented Jan 11, 2014

Did you get any signal with gdb?
In #2 of your bt it says <signal handler called>.
Also, what's your distribution and glibc/easytag version?

cryptomilk commented Jan 11, 2014

That's me pressing Ctrl+C:

^C
Program received signal SIGINT, Interrupt.
0x00007ffff4ba60d2 in wait () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff4ba60d2 in wait () from /lib64/libc.so.6
#1  0x000000000043674b in ?? ()
#2  <signal handler called>
#3  0x00007ffff4b5a3ac in _IO_proc_close@@GLIBC_2.2.5 () from /lib64/libc.so.6
#4  0x00007ffff4b63b20 in __GI__IO_file_close_it () from /lib64/libc.so.6
#5  0x00007ffff4b58688 in fclose@@GLIBC_2.2.5 () from /lib64/libc.so.6
#6  0x00007fffeaf9ee9a in runCommand (cmd=<optimized out>, result=0x7fffeb1d7be8 <kdeHome.53930>) at /usr/src/debug/qtcurve-1.8.17/gtk2/style/qt_settings.c:2950
#7  0x00007fffeaf9eed4 in getKdeHome () at /usr/src/debug/qtcurve-1.8.17/gtk2/style/qt_settings.c:67
#8  0x00007fffeaf9efbe in kdeFile (f=f@entry=0x7fffeafca8ca "kdeglobals") at /usr/src/debug/qtcurve-1.8.17/gtk2/style/qt_settings.c:101
#9  0x00007fffeaf9f502 in kdeGlobals () at /usr/src/debug/qtcurve-1.8.17/gtk2/style/qt_settings.c:111
#10 qtSettingsInit () at /usr/src/debug/qtcurve-1.8.17/gtk2/style/qt_settings.c:1975
#11 0x00007fffeaf9afb3 in qtcurve_rc_style_init (qtcurve_rc=<optimized out>) at /usr/src/debug/qtcurve-1.8.17/gtk2/style/qtcurve.c:2952
#12 0x00007ffff6af5dab in g_type_create_instance () from /usr/lib64/libgobject-2.0.so.0
#13 0x00007ffff6ada275 in ?? () from /usr/lib64/libgobject-2.0.so.0
#14 0x00007ffff6adc06d in g_object_newv () from /usr/lib64/libgobject-2.0.so.0
#15 0x00007ffff6adc81c in g_object_new () from /usr/lib64/libgobject-2.0.so.0
#16 0x00007fffeaf9b78c in theme_create_rc_style () at /usr/src/debug/qtcurve-1.8.17/gtk2/style/qtcurve.c:3024
#17 0x00007ffff791d906 in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#18 0x00007ffff791e1d5 in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#19 0x00007ffff791e36f in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#20 0x00007ffff791cb68 in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#21 0x00007ffff791e1d5 in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#22 0x00007ffff791e36f in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#23 0x00007ffff791ebdf in gtk_rc_reparse_all_for_settings () from /usr/lib64/libgtk-x11-2.0.so.0
#24 0x00007ffff793ab12 in gtk_settings_get_for_screen () from /usr/lib64/libgtk-x11-2.0.so.0
#25 0x00007fffeb7fae66 in ?? () from /usr/lib64/gtk-2.0/modules/libcanberra-gtk-module.so
#26 0x00007fffeb7fb128 in gtk_module_init () from /usr/lib64/gtk-2.0/modules/libcanberra-gtk-module.so
#27 0x00007ffff78eb441 in ?? () from /usr/lib64/libgtk-x11-2.0.so.0
#28 0x00007ffff6ad5318 in g_closure_invoke () from /usr/lib64/libgobject-2.0.so.0
#29 0x00007ffff6ae6cad in ?? () from /usr/lib64/libgobject-2.0.so.0
#30 0x00007ffff6aee9b9 in g_signal_emit_valist () from /usr/lib64/libgobject-2.0.so.0
#31 0x00007ffff6aeec72 in g_signal_emit () from /usr/lib64/libgobject-2.0.so.0
#32 0x00007ffff6ad9695 in ?? () from /usr/lib64/libgobject-2.0.so.0
#33 0x00007ffff6adbc4b in g_object_notify () from /usr/lib64/libgobject-2.0.so.0
#34 0x00007ffff75097b5 in gdk_display_open_default_libgtk_only () from /usr/lib64/libgdk-x11-2.0.so.0
#35 0x00007ffff78d34b4 in gtk_init_check () from /usr/lib64/libgtk-x11-2.0.so.0
#36 0x00007ffff78d34d9 in gtk_init () from /usr/lib64/libgtk-x11-2.0.so.0
#37 0x000000000041140d in ?? ()
#38 0x00007ffff4b0fbe5 in __libc_start_main () from /lib64/libc.so.6
#39 0x0000000000414eed in ?? ()

glibc-2.18-4.11.1.x86_64
openSUSE 13.1

It looks like it happens the second time getKdeHome() is entered.

yuyichao commented Jan 11, 2014

Nope, it shouldn't be.

This is what it should look like when gdb receives a signal:

^C
Program received signal SIGINT, Interrupt.
0x00007ffff533391d in poll () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff533391d in poll () from /usr/lib/libc.so.6
#1  0x00007ffff6f665d8 in g_main_context_poll (priority=0, context=<optimized out>, 
timeout=<optimized out>, fds=<optimized out>, n_fds=<optimized out>)
at gmain.c:4006
#2  g_main_context_iterate (context=<optimized out>, block=<optimized out>, 
dispatch=<optimized out>, self=<optimized out>) at gmain.c:3707
#3  0x00007ffff6f668ff in g_main_loop_run (loop=loop@entry=0xdc7730) at gmain.c:3906
#4  0x00007ffff78d4cd7 in gtk_main () at gtkmain.c:1257
#5  0x00000000004139a6 in ?? ()
#6  0x00007ffff5278b05 in __libc_start_main () from /usr/lib/libc.so.6
#7  0x0000000000413bd9 in ?? ()

The signal handler (if any) should be called after gdb captures it. (And in any case it shouldn't call another function before gdb).

yuyichao commented Jan 11, 2014

P.S. when pasting the output of some command, please quote it as code.

yuyichao commented Jan 11, 2014

info signals should show you how gdb handles each signal,
and handle should allow you to change that.

Judging from the list of signals that gdb does not stop on or print by default, I guess it's a SIGCHLD. You can check that with handle all print.

yuyichao commented Jan 11, 2014

Although I still don't understand why it could hang on wait() ....

cryptomilk commented Jan 11, 2014

Ok, popen() executes kde-config which quits with a SIGCHLD. Then we read what it printed and then pclose hangs waiting for the process to terminate, but it already terminated...

See the manpage for pclose():
The pclose() function waits for the associated process to terminate and returns the exit status of the command as returned by wait4(2).
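
To make the man page wording concrete, here is a minimal sketch that runs a command through popen(), drains its output, and decodes the wait status that pclose() hands back. The helper name run_and_get_exit_code is made up for this illustration; it is not QtCurve's runCommand().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper (not QtCurve's runCommand()): run `cmd` through
 * the shell, discard its stdout, and return its exit code, or -1 on
 * error or abnormal termination. */
static int
run_and_get_exit_code(const char *cmd)
{
    assert(cmd != NULL);
    FILE *fp = popen(cmd, "r");
    if (!fp)
        return -1;
    char buf[256];
    /* Drain the pipe so the child never blocks on a full pipe. */
    while (fgets(buf, sizeof(buf), fp)) {
    }
    /* pclose() waits for the child and returns its wait status. */
    int status = pclose(fp);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling run_and_get_exit_code("exit 3") in a process without a custom SIGCHLD handler should return 3. The hang discussed in this issue only shows up once the host application installs its own handler.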

cryptomilk commented Jan 11, 2014

I think pclose()/fclose() sets up the signal handler to detect when the application goes away. The process has already finished, so waitpid() is called on a process which no longer exists. So it sits there and waits forever ...

cryptomilk commented Jan 11, 2014

I've tested it outside of qtcurve and it works just fine. I've also set the SA_RESETHAND flag before invoking popen(), but it still doesn't work.

cryptomilk commented Jan 11, 2014

I got it working, it is the SIGCHLD handler ...

yuyichao commented Jan 15, 2014

I still don't think this is the right explanation.
The blocking wait() is in the signal handler of SIGCHLD and this should never block (at least the first one).
Also, if your first patch has any effect at all, the signal handler is definitely not set up by popen()/pclose().

cryptomilk commented Jan 15, 2014

runCommand() worked just fine when used in a simple program, but doesn't work with some GTK applications. You didn't like that I reset the signal handler, so I implemented it using posix_spawn(). What's wrong with posix_spawn() now?

yuyichao commented Jan 15, 2014

I would like to find the real reason for the race, and I don't think the explanation you gave is convincing (e.g. who sets up the signal handler, and why is it blocked on a wait() in the signal handler of SIGCHLD?).

cryptomilk commented Jan 15, 2014

Install "easytag" and debug yourself :)

yuyichao commented Jan 15, 2014

I do. And I cannot reproduce it.

cryptomilk commented Jan 15, 2014

Maybe this is a race condition between setting up the SIGCHLD handler and catching the signal. I have a very fast machine. It is not the first time that I have needed to fix a race condition on this machine.

cryptomilk commented Jan 15, 2014

However, calling a command to find out the directories isn't a very nice approach. Why don't you link against libkdecore and use:

(void)KGlobal::dirs(); // trigger the creation
(void)KGlobal::config();

KGlobal::dirs()->localkdedir()

yuyichao commented Jan 15, 2014

Linking against Qt or kdelibs in a Gtk application is asking for trouble.

yuyichao commented Jan 15, 2014

OK, I think I've finally managed to reproduce this.

The popen() version looks like this:

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

static void
signalHandler(int sig)
{
    int status;
    printf("%s, start wait\n", __func__);
    int pid = waitpid(-1, &status, 0);
    printf("%s, sig: %d, pid: %d, status: %d\n", __func__, sig, pid, status);
}

int
main(int argc, char **argv)
{
    if (argc > 1) {
        int wait = atoi(argv[1]);
        sleep(wait);
        fprintf(stderr, "%s, pid: %d, wait: %d\n", __func__, getpid(), wait);
        return 0;
    }
    if (!fork()) {
        sleep(20);
        printf("%s, long run child pid: %d\n", __func__, getpid());
        return 0;
    }
    struct sigaction sig_act;
    memset(&sig_act, 0, sizeof(sig_act));
    sig_act.sa_handler = signalHandler;
    sigaction(SIGCHLD, &sig_act, NULL);

    char *cmdline;
    asprintf(&cmdline, "%s %d", argv[0], 4);
    printf("%s, run %s\n", __func__, cmdline);
    FILE *fp = popen(cmdline, "r");
    sleep(1);
    printf("%s, close %s\n", __func__, cmdline);
    pclose(fp);
    printf("%s, close %s done\n", __func__, cmdline);
    free(cmdline);

    return 0;
}

And the version with a manual fork() and wait() looks like this:

#include <signal.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>

static void
signalHandler(int sig)
{
    printf("%s: %d\n", __func__, sig);
    int status;
    int ret = waitpid(-1, &status, 0);
    printf("%s, ret: %d, status: %d\n", __func__, ret, status);
}

int
main()
{
    struct sigaction sig_act;
    memset(&sig_act, 0, sizeof(sig_act));
    sig_act.sa_handler = signalHandler;
    sigaction(SIGCHLD, &sig_act, NULL);

    if (!fork()) {
        sleep(20);
        printf("%s: long run child %d exit.\n", __func__, getpid());
        _exit(0);
    }

    pid_t pid = fork();
    if (!pid) {
        sleep(4);
        printf("%s: %d exit.\n", __func__, getpid());
        _exit(0);
    }
    printf("%s: %d -> %d\n", __func__, getpid(), pid);

    sleep(1);
    printf("%s, start waiting for %d\n", __func__, pid);
    int status;
    int ret = waitpid(pid, &status, 0);
    printf("%s, ret: %d, pid: %d, status: %d\n", __func__, ret, pid, status);

    return 0;
}

Therefore, the hang only happens if there are other child processes (otherwise wait() will never hang) and when the process ends while the parent is inside wait() (somehow waitpid() clears the state before the signal handler is triggered).

This is pretty much the opposite of what you said, and I feel like your second patch should only make it worse.

yuyichao commented Jan 15, 2014

Also there is another unlikely race condition for piping to child process which can only be solved by manually doing fork() and using socketpair() instead of pipe().
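
The comment does not spell out which race is meant, so the following is only a sketch of the mechanism it names: a manual fork() with the child's stdout attached to one end of a socketpair() instead of a pipe(). The helper name spawn_read_stdout is hypothetical and is not taken from QtCurve.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stdout connected to one end
 * of a socketpair and copy what it prints into buf (NUL-terminated).
 * Returns the number of bytes read, or -1 on error. */
static ssize_t
spawn_read_stdout(char *const argv[], char *buf, size_t size)
{
    assert(argv && argv[0] && buf && size > 0);
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    pid_t pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: stdout becomes the socket, then exec the command. */
        close(sv[0]);
        dup2(sv[1], STDOUT_FILENO);
        close(sv[1]);
        execvp(argv[0], argv);
        _exit(127);
    }
    /* Parent: close the child's end so read() sees EOF when it exits. */
    close(sv[1]);
    size_t total = 0;
    ssize_t n;
    while (total < size - 1 &&
           (n = read(sv[0], buf + total, size - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(sv[0]);
    int status;
    waitpid(pid, &status, 0); /* wait for this specific child only */
    return (ssize_t)total;
}
```

Note that this sketch does nothing about the SIGCHLD handler problem discussed above; it only replaces the pipe.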

cryptomilk commented Jan 15, 2014

I would say the problem is waitpid(-1, ...). I don't see a problem in the posix_spawn code.

yuyichao commented Jan 15, 2014

On Wed, Jan 15, 2014 at 8:15 AM, Andreas Schneider wrote:

> I would say the problem is waitpid(-1, ...). I don't see a problem in the posix_spawn code.

The waitpid(-1) is in the signal handler, and it is not registered by glibc or us.

yuyichao commented Jan 15, 2014

An alternative way to do this is to use a dbus service. This can get rid of a lot of things, including string parsing, and also saves a lot of pids since the dbus daemon can take care of spawning the process. However, this adds some complexity during updates and I don't want to implement it right now.

yuyichao commented Jan 15, 2014

The problem with the posix_spawn code is that if the process ends while you are calling waitpid(), the signal handler will hang.

yuyichao commented Jan 15, 2014

And the version with posix_spawn is here:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>
#include <spawn.h>

static void
signalHandler(int sig)
{
    int status;
    printf("%s, start wait\n", __func__);
    int pid = waitpid(-1, &status, 0);
    printf("%s, sig: %d, pid: %d, status: %d\n", __func__, sig, pid, status);
}

int
main(int argc, char **argv)
{
    if (argc > 1) {
        int wait = atoi(argv[1]);
        sleep(wait);
        fprintf(stderr, "%s, pid: %d, wait: %d\n", __func__, getpid(), wait);
        return 0;
    }
    if (!fork()) {
        sleep(20);
        printf("%s, long run child pid: %d\n", __func__, getpid());
        return 0;
    }
    struct sigaction sig_act;
    memset(&sig_act, 0, sizeof(sig_act));
    sig_act.sa_handler = signalHandler;
    sigaction(SIGCHLD, &sig_act, NULL);

    pid_t pid;
    posix_spawn(&pid, argv[0], NULL, NULL, (char*[]){argv[0], "2", NULL}, NULL);
    int status;
    printf("%s, start wait.\n", __func__);
    int ret = waitpid(pid, &status, 0);
    printf("%s, ret: %d, pid: %d, status: %d\n", __func__, ret, pid, status);

    return 0;
}

Again, the wait() (or waitpid(-1, ...) here) in the signal handler is registered by the application and is not something I can control. It might be OK to override it for this application, but I don't think it is a good idea to do it in general (there are applications that do all kinds of funny things with signals....).
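
For reference, overriding the handler could look roughly like the sketch below: swap the application's SIGCHLD disposition for SIG_DFL while the helper command runs, then put it back. This is an illustration of the idea only, not necessarily what any QtCurve patch does, and the name run_with_default_sigchld is invented here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical helper: run `cmd` via popen() with the application's
 * SIGCHLD handler temporarily replaced by SIG_DFL, then restore it.
 * Returns the exit code of `cmd`, or -1 on error. */
static int
run_with_default_sigchld(const char *cmd)
{
    assert(cmd != NULL);
    struct sigaction dfl, old;
    memset(&dfl, 0, sizeof(dfl));
    dfl.sa_handler = SIG_DFL;
    sigemptyset(&dfl.sa_mask);
    if (sigaction(SIGCHLD, &dfl, &old) == -1)
        return -1;

    int code = -1;
    FILE *fp = popen(cmd, "r");
    if (fp) {
        char buf[256];
        /* Drain the output so the child can finish. */
        while (fgets(buf, sizeof(buf), fp)) {
        }
        int status = pclose(fp);
        if (status != -1 && WIFEXITED(status))
            code = WEXITSTATUS(status);
    }
    /* Put the application's handler back. A SIGCHLD from any other
     * child that exited in the meantime was discarded, so that child
     * may stay a zombie until something else reaps it. */
    sigaction(SIGCHLD, &old, NULL);
    return code;
}
```

The trade-off is the one raised in this thread: the application's handler misses any SIGCHLD delivered during the window, and changing a process-wide disposition from inside a theme engine can surprise applications that rely on their own signal setup.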

yuyichao commented Jan 19, 2014

Can you test the current master?

cryptomilk commented Jan 19, 2014

I will try to find some time tomorrow ...

cryptomilk commented Jan 20, 2014

easytag doesn't start up, but works if I run it with strace or gdb. So we have a race condition here. I guess the culprit is libglib and not your code.

yuyichao commented Jan 20, 2014

Can you run with gdb -p ?

cryptomilk commented Jan 21, 2014

(gdb) bt
#0  0x00007f34b27740d2 in wait () from /lib64/libc.so.6
#1  0x000000000043674b in ?? ()
#2  <signal handler called>
#3  0x00007f34b277417c in waitpid () from /lib64/libc.so.6
#4  0x00007f34a894d956 in qtcForkBackground () from /usr/lib64/libqtcurve-utils.so.1
#5  0x00007f34a894d9f2 in qtcSpawn () from /usr/lib64/libqtcurve-utils.so.1
#6  0x00007f34a894daca in qtcPopen () from /usr/lib64/libqtcurve-utils.so.1
#7  0x00007f34a894dd19 in qtcPopenBuff () from /usr/lib64/libqtcurve-utils.so.1
#8  0x00007f34a8b6d0a1 in ?? () from /usr/lib64/gtk-2.0/2.10.0/engines/libqtcurve.so
#9  0x00007f34a8b6d17e in ?? () from /usr/lib64/gtk-2.0/2.10.0/engines/libqtcurve.so
#10 0x00007f34a8b6df4f in qtSettingsInit () from /usr/lib64/gtk-2.0/2.10.0/engines/libqtcurve.so

cryptomilk commented Jan 21, 2014

(gdb) bt
#0  0x00007f22376290d2 in wait () from /lib64/libc.so.6
#1  0x000000000043674b in ?? ()
#2  <signal handler called>
#3  0x00007f223762917c in waitpid () from /lib64/libc.so.6
#4  0x00007f222d802956 in qtcForkBackground (cb=cb@entry=0x7f222d802780 <qtcSpawnCb>, data=data@entry=0x7fff1bad9620, fail_cb=fail_cb@entry=0x7f222d802760 <qtcSpawnFailCb>)
    at /usr/src/debug/qtcurve-ae3f0846a226047184f6eefea94f26570747d1c9/lib/utils/process.c:67
#5  0x00007f222d8029f2 in qtcSpawn (file=file@entry=0x7f222da4e922 "kde4-config", argv=argv@entry=0x7fff1bad98d0, cb=cb@entry=0x7f222d8027c0 <qtcPopenCb>, 
    cb_data=cb_data@entry=0x7fff1bad9670, fail_cb=fail_cb@entry=0x7f222d8027a0 <qtcPopenFailCb>)
    at /usr/src/debug/qtcurve-ae3f0846a226047184f6eefea94f26570747d1c9/lib/utils/process.c:100
#6  0x00007f222d802aca in qtcPopen (file=file@entry=0x7f222da4e922 "kde4-config", argv=0x7fff1bad98d0, fd_num=fd_num@entry=1, fds=fds@entry=0x7fff1bad97b0)
    at /usr/src/debug/qtcurve-ae3f0846a226047184f6eefea94f26570747d1c9/lib/utils/process.c:179
#7  0x00007f222d802d19 in qtcPopenBuff (file=file@entry=0x7f222da4e922 "kde4-config", argv=argv@entry=0x7fff1bad98d0, buff_num=buff_num@entry=1, buffs=buffs@entry=0x7fff1bad98b0, 
    timeout=timeout@entry=300) at /usr/src/debug/qtcurve-ae3f0846a226047184f6eefea94f26570747d1c9/lib/utils/process.c:276
#8  0x00007f222da220a1 in qtcPopenStdout (len=<synthetic pointer>, timeout=300, argv=0x7fff1bad98d0, file=0x7f222da4e922 "kde4-config")
    at /usr/src/debug/qtcurve-ae3f0846a226047184f6eefea94f26570747d1c9/build/.cmake_utils_base/cmake_c_macros/include_fix/qtcurve-utils/process.h:62
#9  getKdeHome () at /usr/src/debug/qtcurve-ae3f0846a226047184f6eefea94f26570747d1c9/gtk2/style/qt_settings.c:58
#10 0x00007f222da2217e in kdeFile (f=f@entry=0x7f222da4ea66 "kdeglobals") at /usr/src/debug/qtcurve-ae3f0846a226047184f6eefea94f26570747d1c9/gtk2/style/qt_settings.c:88
#11 0x00007f222da22f4f in kdeGlobals () at /usr/src/debug/qtcurve-ae3f0846a226047184f6eefea94f26570747d1c9/gtk2/style/qt_settings.c:101
#12 qtSettingsInit () at /usr/src/debug/qtcurve-ae3f0846a226047184f6eefea94f26570747d1c9/gtk2/style/qt_settings.c:1579
#13 0x00007f222da19fa3 in qtcurve_rc_style_init (qtcurve_rc=<optimized out>) at /usr/src/debug/qtcurve-ae3f0846a226047184f6eefea94f26570747d1c9/gtk2/style/qtcurve.c:3177

yuyichao commented Jan 21, 2014

Obviously, the signal handler is set up by easytag (I guess you could install the debug symbols for it to see the name of the handler). I can now reproduce the problem with the testing code. It seems that the timing from vfork() is not good enough (waitpid is still called before SIGCHLD is received, although after the child process exits). I guess your problem is not that your machine is too fast, but rather that easytag (or a plugin) spawns long-running background processes.

I have pushed another workaround to master. I don't totally like this solution, but I don't think there is another solution other than resetting the signal handler (and anyway, having 2-3 zombie processes is a lot better than a frozen application). Could you try it again?

cryptomilk commented Jan 21, 2014

easytag forks and starts rccexternal for cddb queries. If I send the TERM signal to rccexternal easytag starts up.

https://bugzilla.gnome.org/show_bug.cgi?id=721943

cryptomilk commented Jan 21, 2014

I guess it is an easytag issue too.

yuyichao commented Jan 21, 2014

EasyTAG's fault is that it calls wait(NULL) in its signal handler rather than the non-blocking version (wait3/wait4 if I remember correctly) or waiting only on the pid which triggers the signal with waitpid.

It would be nice if they improved the signal handling, but I think the current solution can fix the problem without resetting the handler. Have you tested it yet?

yuyichao commented Jan 21, 2014

Sorry, the non-blocking version is also waitpid....
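
A non-blocking handler of that kind might look like the sketch below, which loops on waitpid() with WNOHANG so that it returns immediately when other children are still running. The function names and the reaped_children counter are made up for this example and are not EasyTAG's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/wait.h>

/* Counts reaped children; only here for the illustration. */
static volatile sig_atomic_t reaped_children = 0;

/* SIGCHLD handler that never blocks: with WNOHANG, waitpid() returns 0
 * when children exist but none has exited yet, and -1 when there are
 * no children at all. */
static void
reap_children(int sig)
{
    (void)sig;
    int saved_errno = errno; /* do not clobber the interrupted code's errno */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_children++;
    errno = saved_errno;
}

/* Hypothetical setup function an application could call at startup. */
static int
install_nonblocking_sigchld(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

A handler like this can still reap a child that other code is about to waitpid() for, in which case that waitpid() fails with ECHILD instead of hanging, so callers need to be prepared for that.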

yuyichao commented Jan 21, 2014

Or wait4 ...

yuyichao commented Jan 21, 2014

And BTW, if the only purpose of the signal handler is getting rid of zombies, on Linux you can also just ignore SIGCHLD.
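
As a rough sketch of that option: setting the SIGCHLD disposition to SIG_IGN makes Linux reap exited children automatically. The helper names below are invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ask the kernel to reap exited children automatically. */
static int
ignore_sigchld(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Illustration: fork a child that exits at once and check that there
 * is nothing left to collect. Returns 1 if waitpid() failed with
 * ECHILD, 0 if it returned a status, -1 if fork() failed. */
static int
child_is_auto_reaped(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(0);
    /* With SIGCHLD ignored, waitpid() blocks until the child is gone
     * and then fails with ECHILD instead of returning a status. */
    int status;
    if (waitpid(pid, &status, 0) == -1 && errno == ECHILD)
        return 1;
    return 0;
}
```

The flip side is that exit statuses are no longer available anywhere in the process, so code that relies on the result of pclose() or waitpid() stops getting useful values.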

cryptomilk commented Jan 21, 2014

I guess the culprit is librcc (Russian Charset Conversion Library) which forks rccexternal.

http://rusxmms.sourceforge.net/

cryptomilk commented Jan 21, 2014

--- Comment #7 from David King 2014-01-21 20:13:12 UTC ---
I removed the signal handlers in commit
b698dad4c31ffc702121cc06200ff1a0e1e89864 on master. In the two following
commits I switched the two uses of fork() in EasyTAG to instead use
g_spawn_async() and g_child_add_watch() followed by g_child_watch_add() and
g_spawn_close_pid().

https://bugzilla.gnome.org/show_bug.cgi?id=721943

yuyichao commented Jan 21, 2014

...... And have you tested the current QtCurve master yet?

cryptomilk commented Jan 21, 2014

No, I will do that now

cryptomilk commented Jan 21, 2014

master works

yuyichao commented Jan 21, 2014

Good, closing......
I'm still looking for a better solution and I have asked a question on Stack Overflow. Not sure if there is one though...

yuyichao closed this Jan 21, 2014

cryptomilk commented Jan 21, 2014

Thanks, could you release a new version with this fix?

yuyichao commented Jan 21, 2014

I don't think I will make a release "for" this fix, but I am planning to do a release soon (by the end of this month), and in fact this is the most important blocking bug...

Most of the short-term TODOs are done now and I think there are enough changes/fixes for a new minor version. I still need to do something about the QtQuick2 dependency (because it needs an update from the Qt side to be fully functional anyway) and also make sure there isn't any major thing that I'm missing.

If things go well, it can probably happen by the end of this week.
