Skip to content
Browse files
core: fix securebits setting
Desired functionality:
Set securebits for services started as non-root user.

The starting of the service fails if no ambient capability shall be
... systemd[217941]: ...: Failed to set process secure bits: Operation
not permitted
... systemd[217941]: ...: Failed at step SECUREBITS spawning
/usr/bin/abc.service: Operation not permitted
... systemd[1]: abc.service: Failed with result 'exit-code'.

For setting securebits the capability CAP_SETPCAP is required. However
the securebits (if no ambient capability shall be raised) are set after
When setresuid is invoked all capabilities are dropped from the
permitted, effective and ambient capability set. If the securebit
SECBIT_KEEP_CAPS is set the permitted capability set is retained, but
the effective and the ambient set are cleared.

If ambient capabilities shall be set, the securebit SECBIT_KEEP_CAPS is
added to the securebits configured in the service file and set together
with the securebits from the service file before setresuid is executed
(in enforce_user).
Before setresuid is executed the capabilities are the same as for pid1.
This means that all capabilities in the effective, permitted and
bounding set are set. Thus the capability CAP_SETPCAP is in the
effective set and the prctl(PR_SET_SECUREBITS, ...) succeeds.
However, if the secure bits aren't set before setresuid is invoked they
shall be set shortly after the uid change in enforce_user.
This fails as SECBIT_KEEP_CAPS wasn't set before setresuid and in
consequence the effective and permitted set was cleared, hence
CAP_SETPCAP is not set in the effective set (and cannot be raised any
longer) and prctl(PR_SET_SECUREBITS, ...) failes with EPERM.

Proposed solution:
The proposed solution consists of three parts
1. Check in enforce_user, if securebits are configured in the service
   file. If securebits are configured, set SECBIT_KEEP_CAPS
   before invoking setresuid.
2. Don't set any other securebits than SECBIT_KEEP_CAPS in enforce_user,
   but set all requested ones after enforce_user.
   This has the advantage that securebits are set at the same place for
   root and non-root services.
3. Raise CAP_SETPCAP to the effective set (if not already set) before
   setting the securebits to avoid EPERM during the prctl syscall.

For gaining CAP_SETPCAP the function capability_bounding_set_drop is
splitted into two functions:
- The first one raises CAP_SETPCAP (required for dropping bounding
- The second drops the bounding capabilities

Why are ambient capabilities not affected by this change?
Ambient capabilities get cleared during setresuid, no matter if
SECBIT_KEEP_CAPS is set or not.
For raising ambient capabilities for a user different to root, the
requested capability has to be raised in the inheritable set first. Then
the SECBIT_KEEP_CAPS securebit needs to be set before setresuid is
invoked. Afterwards the ambient capability can be raised, because it is
in the inheritable and permitted set.

Security considerations:
Although the manpage is ambiguous SECBIT_KEEP_CAPS is cleared during
execve no matter if SECBIT_KEEP_CAPS_LOCKED is set or not. If both are
set only SECBIT_KEEP_CAPS_LOCKED is set after execve.
Setting SECBIT_KEEP_CAPS in enforce_user for being able to set
securebits is no security risk, as the effective and permitted set are
set to the value of the ambient set during execve (if the executed file
has no file capabilities. For details check man 7 capabilities).

In capability-util.c is a comment complaining about the missing
capability CAP_SETPCAP in the effective set, after the kernel executed
/sbin/init. Thus it is checked there if this capability has to be raised
in the effective set before dropping capabilities from the bounding set.
If this were true all the time, ambient capabilities couldn't be set
without dropping at least one capability from the bounding set, as the
capability CAP_SETPCAP would miss and setting SECBIT_KEEP_CAPS would
fail with EPERM.

(cherry picked from commit dbdc409)
(cherry picked from commit ab6fcd9)
  • Loading branch information
Tobias Kaufmann authored and keszybz committed Sep 21, 2020
1 parent b2b3828 commit 711ca814c9f2e81d3d25ebbed0b837b7d4fbbeda
Showing 1 changed file with 40 additions and 9 deletions.
@@ -1080,26 +1080,42 @@ static int enforce_groups(gid_t gid, const gid_t *supplementary_gids, int ngids)
return 0;

static int set_securebits(int bits, int mask) {
int current, applied;
current = prctl(PR_GET_SECUREBITS);
if (current < 0)
return -errno;
/* Clear all securebits defined in mask and set bits */
applied = (current & ~mask) | bits;
if (current == applied)
return 0;
if (prctl(PR_SET_SECUREBITS, applied) < 0)
return -errno;
return 1;

static int enforce_user(const ExecContext *context, uid_t uid) {
int r;

if (!uid_is_valid(uid))
return 0;

/* Sets (but doesn't look up) the uid and make sure we keep the
* capabilities while doing so. */
* capabilities while doing so. For setting secure bits the capability CAP_SETPCAP is
* required, so we also need keep-caps in this case.

if (context->capability_ambient_set != 0) {
if (context->capability_ambient_set != 0 || context->secure_bits != 0) {

/* First step: If we need to keep capabilities but
* drop privileges we need to make sure we keep our
* caps, while we drop privileges. */
if (uid != 0) {
int sb = context->secure_bits | 1<<SECURE_KEEP_CAPS;

if (prctl(PR_GET_SECUREBITS) != sb)
if (prctl(PR_SET_SECUREBITS, sb) < 0)
return -errno;
/* Add KEEP_CAPS to the securebits */
r = set_securebits(1<<SECURE_KEEP_CAPS, 0);
if (r < 0)
return r;

@@ -3852,12 +3868,27 @@ static int exec_child(

/* PR_GET_SECUREBITS is not privileged, while PR_SET_SECUREBITS is. So to suppress potential EPERMs
* we'll try not to call PR_SET_SECUREBITS unless necessary. */
if (prctl(PR_GET_SECUREBITS) != secure_bits)
* we'll try not to call PR_SET_SECUREBITS unless necessary. Setting securebits requires
if (prctl(PR_GET_SECUREBITS) != secure_bits) {
/* CAP_SETPCAP is required to set securebits. This capabilitiy is raised into the
* effective set here.
* The effective set is overwritten during execve with the following values:
* - ambient set (for non-root processes)
* - (inheritable | bounding) set for root processes)
* Hence there is no security impact to raise it in the effective set before execve
r = capability_gain_cap_setpcap(NULL);
if (r < 0) {
*exit_status = EXIT_CAPABILITIES;
return log_unit_error_errno(unit, r, "Failed to gain CAP_SETPCAP for setting secure bits");
if (prctl(PR_SET_SECUREBITS, secure_bits) < 0) {
*exit_status = EXIT_SECUREBITS;
return log_unit_error_errno(unit, errno, "Failed to set process secure bits: %m");

if (context_has_no_new_privileges(context))
if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) < 0) {

0 comments on commit 711ca81

Please sign in to comment.