-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
beef up /var/tmp and /tmp handling; set $SERVICE_RESULT/$EXIT_CODE/$EXIT_STATUS on ExecStop= and make sure root/nobody are always resolvable #3818
Conversation
|
||
*ret = "/tmp"; | ||
return 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this asks for a helper function which takes either "/tmp" or "/var/tmp" as parameter, and tmp_dir() and var_tmp_dir() should be thin wrappers around that.
Looks nice. Only a few minor comments and questions. |
d0a83f0
to
52901a0
Compare
Force pushed a new version now that makes all suggested changes, except the "dumped" rename, due to the mentioned reasons. Please have another look and merge! |
is one of <literal>exited</literal>, <literal>killed</literal>, | ||
<literal>dumped</literal>. <varname>$EXIT_STATUS</varname> contains the numeric exit code formatted as string | ||
if <varname>$EXIT_CODE</varname> is <literal>exited</literal>, and the signal name in all other cases. Note | ||
that these environment variables are only set if the service manager succeeded to start and identify the main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only set if the service manager succeeded to start
I think we should mention (explicitly) that this doesn't work for Type=oneshot
https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStop=
Note that the commands specified in ExecStop= are only executed when the service started successfully first.
-bash-4.3# systemctl cat self-abort --no-pager
# /etc/systemd/system/self-abort.service
[Service]
Type=oneshot
ExecStart=/bin/sh -x -c 'kill -ABRT $$$$'
ExecStop=/bin/sh -x -c 'set'
-bash-4.3# systemctl start self-abort
Job for self-abort.service failed because a fatal signal was delivered causing the control process to dump core.
See "systemctl status self-abort.service" and "journalctl -xe" for details.
-bash-4.3# journalctl -u self-abort --no-pager -o cat
self-abort.service: Collecting.
self-abort.service: Collecting.
self-abort.service: Collecting.
self-abort.service: Collecting.
self-abort.service: Trying to enqueue job self-abort.service/start/replace
self-abort.service: Installed new job self-abort.service/start as 380
self-abort.service: Enqueued job self-abort.service/start as 380
self-abort.service: About to execute: /bin/sh -x -c 'kill -ABRT $$$$'
self-abort.service: Forked /bin/sh as 298
self-abort.service: Changed dead -> start
Starting self-abort.service...
+ kill -ABRT 298
self-abort.service: Executing: /bin/sh -x -c 'kill -ABRT $$'
self-abort.service: Child 298 belongs to self-abort.service
self-abort.service: Main process exited, code=dumped, status=6/ABRT
self-abort.service: Changed start -> failed
self-abort.service: Job self-abort.service/start finished, result=failed
Failed to start self-abort.service.
self-abort.service: Unit entered failed state.
self-abort.service: Failed with result 'core-dump'.
self-abort.service: cgroup is empty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EXIT_CODE/EXIT_STATUS do work for ExecStop= as well as ExecStopPost=, but of course by default we don't invoke ExecStop= if ExecStart= failed to start if Type=oneshot is used, except if people used SuccessExitStatus= to mark the a failure as a non-failure...
But yeah, this deserves documentaiton.
Beef up the existing var_tmp() call, rename it to var_tmp_dir() and add a matching tmp_dir() call (the former looks for the place for /var/tmp, the latter for /tmp). Both calls check $TMPDIR, $TEMP, $TMP, following the algorithm Python3 uses. All dirs are validated before use. secure_getenv() is used in order to limite exposure in suid binaries. This also ports a couple of users over to these new APIs. The var_tmp() return parameter is changed from an allocated buffer the caller will own to a const string either pointing into environ[], or into a static const buffer. Given that environ[] is mostly considered constant (and this is exposed in the very well-known getenv() call), this should be OK behaviour and allows us to avoid memory allocations in most cases. Note that $TMPDIR and friends override both /var/tmp and /tmp usage if set.
The ExecParameters structure contains a number of bit-flags, that were so far exposed as bool:1, change this to a proper, single binary bit flag field. This makes things a bit more expressive, and is helpful as we add more flags, since these booleans are passed around in various callers, for example service_spawn(), whose signature can be made much shorter now. Not all bit booleans from ExecParameters are moved into the flags field for now, but this can be added later.
Let's fix up the flags fields in service_spawn() rather than its callers, in order to simplify things a bit.
…xecStopPost= commands This should simplify monitoring tools for services, by passing the most basic information about service result/exit information via environment variables, thus making it unnecessary to retrieve them explicitly via the bus.
Let's extend nss-systemd to also synthesize user/group entries for the UIDs/GIDs 0 and 65534 which have special kernel meaning. Given that nss-systemd is listed in /etc/nsswitch.conf only very late any explicit listing in /etc/passwd or /etc/group takes precedence. This functionality is useful in minimal container-like setups that lack /etc/passwd files (or only have incompletely populated ones).
Previously, the result value of a unit was overriden with each failure that took place, so that the result always reported the last failure that took place. With this commit this is changed, so that the first failure taking place is stored instead. This should normally not matter much as multiple failures are sufficiently uncommon. However, it improves one behaviour: if we send SIGABRT to a service due to a watchdog timeout, then this currently would be reported as "coredump" failure, rather than the "watchodg" failure it really is. Hence, in order to report information about the type of the failure, and not about the effect of it, let's change this from all unit type to store the first, not the last failure. This addresses the issue pointed out here: systemd#3818 (comment)
52901a0
to
b08af3b
Compare
I force pushed a new version now, that addresses @evverx's points. Specifically I added one patch that makes sure the first failure is reported as service result, and no longer the last failure. This has the benefit that a watchdog timeout will result in a correctly reported "watchdog" result code, instead of a "core-dump". I also updated the man page, and dropped the "start-limit" reference entirely, as this can never actually be set for ExecStopPost= processes. I also added some wording clarifying that the result is primarily useful in ExecStopPost=, and not so much in ExecStop=, as the former is called for services that failed during start or runtime, and the latter only for those which failed during runtime. This distinction is relevant, as Type=oneshot/RemainAfterExit=no units have no runtime, and hence ExecStop= is useless, and hence not appropriate for checking $SERVICE_RESULT. Note that I don't explicitly explain the Type=oneshot/RemainAfterExit=no case, as that appeared to specific to me. Instead it says now that ExecStopPost= is more useful because it covers both runtime and startup errors, without going into details that Type=oneshot have no runtime |
@evverx and thanks again for the thorough review! Much appreciated! |
@poettering , thanks! I'll take a look tomorrow. |
Looks all good. I tested the status settings in a few different ways, and the result is always as expected. |
A follow-up for systemd#3818 (992e8f2)
This beefs up three areas: