New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spawn: fix propagating exit code #10

Merged
merged 1 commit into from Jun 8, 2018

Conversation

Projects
None yet
2 participants
@wjt
Contributor

wjt commented Jun 5, 2018

The exit status in the HostCommandExited/SpawnExited signal is as returned by waitpid(), which is a combination of flags indicating the way in which the process exited (or was SIGSTOPed or SIGCONTed) and the exit code or signal number. The exact layout is platform-specific, but on Linux the gist is:

  • on exit(x), the exit status is ((x & 0xff) << 8); the low byte is 0
  • on being killed by signal y, the exit status is in the low byte, with its high bit (0x80) set if the process dumped core

The number passed to exit() is specified to be masked with 0xff to fall in the range [0, 255].

These are different numbers, so passing one where the other is expected does not work:

  • A process that calls exit(1) has exit status 256 (0x0100). Since (0x0100 & 0xff) == 0, passing this to exit() means flatpak-spawn will exit with success, despite its child failing.
  • A process killed by SIGTERM 15 (0x0f) without dumping core has exit status 15 (0x000f). Passing this to exit() means flatpak-spawn will exit with status 15, which is at least non-success but gives no indication that it was terminated.
  • A process which segfaults (signal 11, 0x0b) and dumps core has exit status (0x0b | 0x80) == 0x8b (139). This happens to matches what the shell reports for a process killed by SIGSEGV, since the shell sets $? to (128 + signal number), but only because the "dumped core" flag is set.

So, we need to use the macros from sys/wait.h to inspect the exit status and recover the exit code or signal. For normal termination, we can pass the exit code to exit(). For signals, our options are:

  • Reset all signal() handlers to SIG_DFL, then send the signal to ourselves and hope we die
  • Follow the shell convention and exit(128 + signal number)

The latter is implemented here. It means that (for example) g_spawn_check_exit_status() reports "g-spawn-exit-error-quark: Child process exited with code 139 (139)" rather than "g-exec-error-quark: Child process killed by signal 11 (19)", but this is just the old behaviour, give or take the 0x80 bit always being set.

* code specified neither WUNTRACED nor WIFSIGNALED, then exactly one
* of WIFEXITED() or WIFSIGNALED() will be true.
*/
g_warning ("exit status %s is neither WIFEXITED() nor WIFSIGNALED()",

This comment has been minimized.

@wjt

wjt Jun 5, 2018

Contributor

Oops, should be %d, will fix tomorrow

spawn: fix propagating exit code
The exit status in the HostCommandExited/SpawnExited signal is as
returned by waitpid(), which is a combination of flags indicating the
way in which the process exited (or was SIGSTOPed or SIGCONTed) and the
exit code or signal number. The exact layout is platform-specific, but
on Linux the gist is:

* on exit(x), the exit status is ((x & 0xff) << 8); the low byte is 0
* on being killed by signal y, the exit status is in the low byte, with
  its high bit (0x80) set if the process dumped core

The number passed to exit() is specified to be masked with 0xff to fall
in the range [0, 255].

These are different numbers, so passing one where the other is expected
does not work:

* A process that calls exit(1) has exit status 256 (0x0100). Since
  (0x0100 & 0xff) == 0, passing this to exit() means flatpak-spawn will
  exit with success, despite its child failing.
* A process killed by SIGTERM 15 (0x0f) without dumping core has exit
  status 15 (0x000f).  Passing this to exit() means flatpak-spawn will
  exit with status 15, which is at least non-success but gives no
  indication that it was terminated.
* A process which segfaults (signal 11, 0x0b) and dumps core has exit
  status (0x0b | 0x80) == 0x8b (139). This happens to matches what the
  shell reports for a process killed by SIGSEGV, since the shell sets $?
  to (128 + signal number), but only because the "dumped core" flag is
  set.

So, we need to use the macros from sys/wait.h to inspect the exit
status and recover the exit code or signal. For normal termination, we
can pass the exit code to exit(). For signals, our options are:

* Reset all signal() handlers to SIG_DFL, then send the signal to
  ourselves and hope we die
* Follow the shell convention and exit(128 + signal number)

The latter is implemented here. It means that (for example)
g_spawn_check_exit_status() reports "g-spawn-exit-error-quark: Child
process exited with code 139 (139)" rather than "g-exec-error-quark:
Child process killed by signal 11 (19)", but this is just the old
behaviour, give or take the 0x80 bit always being set.

@wjt wjt force-pushed the wjt:fix-spawn-exit-code branch from 045234d to 54c5800 Jun 6, 2018

@alexlarsson alexlarsson merged commit a9afa99 into flatpak:master Jun 8, 2018

alexlarsson added a commit to flatpak/freedesktop-sdk-images that referenced this pull request Jun 8, 2018

@wjt wjt deleted the wjt:fix-spawn-exit-code branch Jun 8, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment