snap-confine, snap-seccomp: utilize new seccomp logging features #3998

Open
wants to merge 11 commits into
from

Conversation

Projects
None yet
7 participants
Contributor

tyhicks commented Oct 3, 2017

Important: This PR is a work in progress. The snapd tests need updating since this patch changes the default seccomp confinement from "kill" to "return an errno". @mvo5 asked me to go ahead and propose the merge and that the Snap folks could help see it to completion.

Two new seccomp logging features, introduced in Linux 4.14, allow for better snap confinement:

  1. SECCOMP_FILTER_FLAG_LOG allows for all actions taken by a filter, except for SECCOMP_RET_ALLOW, to be logged. Previously, only SECCOMP_RET_KILL actions were logged. This kept snap confinement from using SECCOMP_RET_ERRNO because applications would experience errors returned from illegal system calls but users and application developers would not have visibility into the decision making of when system calls should be denied. This new filter flag allows us to use the SECCOMP_RET_ERRNO action to return -1 with errno set to EPERM rather than killing snap processes that make illegal system calls.
  1. SECCOMP_RET_LOG is a new action that logs the syscall and then allows it. This was the missing piece that prevented would-be seccomp denials from being logged in devmode. Previously, enabling devmode simply meant that a seccomp filter would not be loaded due to the lack of this feature. Now we can load a filter that defaults to SECCOMP_RET_LOG for any system calls that aren't white listed.

The corresponding upstream libseccomp pull request has been merged. The corresponding upstream libseccomp-golang pull request is still pending final review. Upstream libseccomp and libseccomp-golang requested more complex, ABI friendly features but we can move forward with minimal changes to libseccomp and libseccomp-golang and then take advantage of the more robust "API level" checks in the future once the new versions of libseccomp and libseccomp-golang are released.

I've SRUed a minimal libseccomp change in Ubuntu Artful (2.3.1-2.1ubuntu2) and Ubuntu Zesty (2.3.1-2.1ubuntu2~17.04.1). The Ubuntu Xenial libseccomp SRU is still pending in xenial-proposed (2.3.1-2.1ubuntu2~16.04.1). I've proposed a minimal libseccomp-golang change to @mvo5's fork of libseccomp-golang, as that's what snapd is using.

To be clear, you can apply the patches in this PR today and test out the new seccomp features in Ubuntu Artful or Zesty.

Known issues that need to be addressed before this can be merged:

  • The tests need to be updated to take into account the new default seccomp action of SECCOMP_RET_ERRNO
  • The seccomp BPF files, for each snap installed in devmode, should be recompiled if the user boots into an older kernel that doesn't support the new SECCOMP_RET_LOG action. The kernel treats unknown actions as SECCOMP_RET_KILL which is appropriate from a security standpoint but is far from ideal considering that many devmode snaps would no longer work. The snap-seccomp change falls back to SECCOMP_RET_ALLOW when SECCOMP_RET_LOG is not supported by the currently running kernel.
  • libseccomp 2.3.1-2.1ubuntu2~16.04.1 needs to migrate from xenial-proposed to xenial-updates

tyhicks added some commits Jul 27, 2017

snap-confine, snap-seccomp: Default to SECCOMP_RET_ERRNO
The seccomp policy has historically used SECCOMP_RET_KILL to forcefully
kill a snap process that bumps into the walls of the sandbox. However,
killing the snap is not very user friendly. Changing the policy to use
SECCOMP_RET_ERRNO to return -1 with errno set to EPERM has been desired
but the kernel would not log those denials which could leave users and
developers confused about why their applications were experiencing
errors.

The 4.14 Linux kernel contains new seccomp logging controls which allows
snapd to request SECCOMP_RET_ERRNO to be logged. This patch makes use of
the new logging controls and switches the default action of the seccomp
policy to SECCOMP_RET_ERRNO so that snaps aren't killed when the perform
an illegal system call.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
cmd/snap-seccomp: Use seccomp's SECCOMP_RET_LOG mode for devmode
No seccomp policy was loaded for snaps installed in devmode since
seccomp didn't have the concept of "allow but log". The 4.14 Linux
kernel now has that functionality with the SECCOMP_RET_LOG action. This
patch changes the seccomp policy for devmode to log and allow any system
calls that would not be allowed if the snap was installed in jailmode.

Fixes: https://launchpad.net/bugs/1567597
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Member

chipaca commented Oct 3, 2017

squee!

zyga approved these changes Oct 3, 2017

Everything looks good. Note that we have a convention to use curly braces around if / else statements to ensure there's no chance of mistaking the nesting.

While not strictly could we also take the change to kill the entire process rather than just the offending thread of execution? This would help with "mysterious" problems in multi-threaded applications. AFAIK the kernel has this feature now.

This PR is following the agreements between the snapd and security teams and looks good on the whole. Thanks! Please see comments inline though.

I'm going to leave this as 'Request changes' for now so I can review the final PR.

cmd/snap-confine/seccomp-support.c
+ debug("kernel doesn't support the seccomp(2) syscall");
+ else if (errno == EINVAL)
+ debug("kernel may not support the SECCOMP_FILTER_FLAG_LOG flag");
+
@jdstrand

jdstrand Oct 3, 2017

Contributor

This seems to be where the intended future check could be better (ie, using 'may'). What I was hoping for was that if SECCOMP_FILTER_FLAG_LOG is not available, then snapd would log that seccomp is mediating with EPERM but without logging so people understand the behavior change. Before, commands run under strict mode confinement with access violations would be killed with logging, but now on systems without SECCOMP_FILTER_FLAG_LOG support, they will be returned EPERM and continue to run with no logging, which may be confusing since the new denied behavior looks more like the previous allowed behavior than the previous denied behavior.

All that said, logging the fallback behavior IMO should happen at the snapd level once per snapd invocation since logging in snap-seccomp or in snap-confine would be way too chatty. @mvo5 - I suspect this would be more for whoever picks this up than for Tyler.

@tyhicks

tyhicks Oct 3, 2017

Contributor

That's a good point that logging the fallback shouldn't happen in snap-seccomp or snap-confine as it would be spammy in the logs.

libseccomp upstream was against exporting a function specifically for checking the validity of a filter flag or action. Applications linking against libseccomp will have the option to check the validity of the SECCOMP_FILTER_FLAG_LOG flag indirectly but using an "API level" based check that is currently being implemented. However, that check lumps together multiple features and, in this case, a single API level will be used to represent SECCOMP_FILTER_FLAG_LOG, SECCOMP_RET_LOG, and the unrelated SECCOMP_RET_KILL_PROCESS. That means that the kernel, libseccomp, and libseccomp-golang will all have to be in sync for libseccomp to report that the API level is supported.

Therefore, I think snapd would want to directly use the seccomp(2) syscall for checking if a filter flag is supported. Snapd could just use the technique that libseccomp is going to be using behind the scenes. I added a kernel selftest for the technique so that we can be sure that the kernel doesn't violate the agreement with userspace about how to check if a filter flag is supported. You can see the kernel selftest change here. Calling seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_LOG, NULL) should return -1 with errno set to EFAULT if the flag is supported by the kernel. Other errno values mean other things such as EINVAL meaning that the flag is not supported and ENOSYS meaning that the seccomp(2) syscall isn't even supported (so there's no way to even specify a filter flag).

cmd/snap-seccomp/main.go
+ } else if line == "@complain" {
+ var fallback bool = false
+
+ secFilter, err = seccomp.NewFilter(seccomp.ActLog)
@jdstrand

jdstrand Oct 3, 2017

Contributor

While it would be a bug in the policy to have '@complain' in it more than once, the previous code stopped processing rules as soon as it found '@complain', but this code will add a NewFilter for every '@complain'. I think we should simply skip (perhaps logging with debug) additional '@complain' lines.

@tyhicks

tyhicks Oct 3, 2017

Contributor

It is no different than if one or more white listed system calls are placed above the first @complain directive. In that case, a new filter would be allocated when @complain was encountered and all the previously white listed system calls would be forgotten.

If we're going to handle multiple @complain directives, we should handle the case that I described, too. Unfortunately, libseccomp doesn't allow us to change the default action of an existing filter via seccomp_attr_set(..., SCMP_FLTATR_ACT_DEFAULT, SCMP_ACT_LOG) so an entirely new filter must be allocated.

What is your preferred solution here?

  1. Don't worry about the case that I mentioned. Address your concern by skipping NewFilter() when libseccomp-golang's GetDefaultAction() returns ActLog.
  2. Treat both situations as an error. (I don't know what it means to error out of the compile() function.)
@jdstrand

jdstrand Oct 4, 2017

Contributor

What snap-confine would do before the move to snap-seccomp is it would preprocess the file, looking for @unrestricted or @complain. The rewrite took a shortcut and simply says "forget anything I did before if I find either of these". That was fine before when they were equivalent. Profile authors will sometimes put these directives anywhere in the file, so I think we need to account for that.

Instead of the two listed options, I suggest the previous behavior: preprocess the file, if @unrestricted found 1 or more times, store that off, if @complain found 1 or more times, store that off. Eg something along the lines of:

// Preprocess to allow @unrestricted and @complain to appear anywhere in the profile
unrestricted, complain := preprocess()
if unrestricted {
    seccomp.NewFilter(seccomp.ActAllow)
} else if complain {
    seccomp.NewFilter(seccomp.ActLog)
} else {
    seccomp.NewFilter(seccomp.ActErrno.SetReturnCode(C.EPERM))
}

addSecondaryArches(secFilter)
if not unrestricted {
    for scanner.Scan() {
        line := strings.TrimSpace(scanner.Text())
        parseLine(line, secFilter)
        ...
    }
}
...

This has the added benefit of adding the secondary arch in one spot.

@mvo5 - what do you think about preprocessing?

UPDATE: I adjusted the logic to be only 'if not unrestricted' after noticing the error

cmd/snap-seccomp/main.go
+ return err
+ }
+
+ if fallback {
@jdstrand

jdstrand Oct 3, 2017

Contributor

Can you add a comment here stating that this fallback is resorting to the old pre-logging behavior of just allowing everything?

@tyhicks

tyhicks Oct 3, 2017

Contributor

Yes, good idea.

Contributor

tyhicks commented Oct 3, 2017

While not strictly could we also take the change to kill the entire process rather than just the offending thread of execution? This would help with "mysterious" problems in multi-threaded applications. AFAIK the kernel has this feature now.

@zyga I agree that the new SECCOMP_RET_KILL_PROCESS action would have been nice when the default action was to kill instead of returning an errno but I don't see the need for it at this point since we're changing the default action to SECCOMP_RET_ERRNO. It would require quite a bit of work to backport the kernel changes and then implement and backport the libseccomp and libseccomp-golang changes (nobody has started work on the userspace changes) so I'd like to better understand how we'd use SECCOMP_RET_KILL_PROCESS.

Member

kyrofa commented Oct 3, 2017

\o/ thanks @tyhicks, very pleased to see this!

tyhicks added some commits Oct 3, 2017

snap-confine: Add curly braces to single line conditional blocks
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
snap-seccomp: Add comment describing seccomp fallback behavior
It isn't immediately clear why seccomp rule compilation must end
immediately after falling back to ActAllow as the default action for
devmode. Leave a comment for easier code reading.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Collaborator

mvo5 commented Oct 5, 2017

I started looking into updating the tests. This will require some serious work because right now the tests are a bit naive. They rely on the fact that the kernel kills a process that violates the confinement. With the change to errno the syscallRunner and the tests itself need to be smater at identifying why a syscall has an error result (i.e. was it bogus input data which we give right now in most tests or was it a permission denied).

mvo5 added some commits Oct 5, 2017

snap-seccomp: update tests for new ActErrno behaviour
The pervious tests simply checked if the process got killed when
doing an invalid syscall. The new seccomp behaviour is to use
EPERM as the errno for a denied syscall. This is now reflected
in the tests as well.
snap-seccomp: simplify tests, remove use of main.SeccompRetKill
We no longer use main.SeccompRetKill in the seccomp code and the
name in the tests is misleading. Instead use {Allow,Deny}.
Collaborator

mvo5 commented Oct 6, 2017

This is now failing only because fedora uses the official libseccomp-golang branch - so once the ActLog bits are in the upstream libseccomp-golang fedora should be unblocked.

@pedronis pedronis changed the title from snap-confine, snap-seccomp: Utilize new seccomp logging features to snap-confine, snap-seccomp: utilize new seccomp logging features Oct 25, 2017

Contributor

niemeyer commented Nov 10, 2017

@tyhicks This is lingering for quite a long time. Should we close it until the work is ready to be reviewed and merged?

Contributor

niemeyer commented Nov 10, 2017

@mvo What do you think? (just now noticed you've pushed changes too)

Contributor

tyhicks commented Nov 10, 2017

@niemeyer @mvo5 I was told that that the Snappy team would take this PR over from here so I don't have plans to do any further work on it.

I would hate for the PR to be closed since the final dependency piece (libseccomp SRU in Ubuntu 16.04) should land next week (it is currently in xenial-proposed if anyone would like to test with it).

@zyga zyga added the Decaying label Nov 29, 2017

Contributor

tyhicks commented Jan 2, 2018

@mvo5 is there anything else that you need me to do for this pull request?

From an Ubuntu distro standpoint, the libseccomp SRU in Xenial still needs to go through. There's been renewed progress in verifying that SRU and it should happen very soon.

I don't believe that there's anything else that I need to do in this pull request but I'd like to confirm that.

Contributor

tyhicks commented Jan 2, 2018

libseccomp 2.3.1-2.1ubuntu2~16.04.1 has now migrated to xenial-updates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment