seccomp: various updates #421
3e7e589 to c5e5295
Several syscalls were enabled globally (SCMP_ACT_ALLOW without any
conditions for all containers), but also had conditional rules later in
the profile (likely inherited from Docker). The following syscalls do
not need special casing because they were globally enabled:

* clone, unshare, mount, umount, umount2 all had special CAP_SYS_ADMIN
  restrictions but those don't make sense since they were also enabled
  for all containers.
* reboot was permitted for CAP_SYS_BOOT and all containers.
* name_to_handle_at was permitted for CAP_SYS_ADMIN, CAP_SYS_NICE(?),
  and all containers.

And certain syscalls had globally-enabled rules when they shouldn't
have:

* socket has special rules for CAP_AUDIT_WRITE but it also had a global
  "allow unconditionally" rule. It turns out that libseccomp will
  override unconditional rules with conditional ones but this is
  somewhat of an implementation detail and it's much safer to remove
  the rule and use the existing cases.

Now the only syscalls remaining with complicated rules (meaning they
appear more than once in the profile) are:

* sync_file_range2 which is architecture specific (though in principle
  we could move it to enabled-without-rules because runc ignores
  unknown syscalls).

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
The generate.go script used to fill the default seccomp profile file is quite important as otherwise distributions will end up having outdated seccomp filters even after a podman update. This script comes from the Docker repo. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This mirrors the Docker and containerd changes, with the caveat that because mount(2) is permitted under podman for all containers we therefore add all of the v2 mount API syscalls as available to all containers. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
c5e5295 to 5aaf1d5
giuseppe left a comment
LGTM
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: cyphar, giuseppe
Thanks @cyphar
Several syscalls were enabled globally (SCMP_ACT_ALLOW without any
conditions for all containers), but also had conditional rules later in
the profile (likely inherited from Docker). The following syscalls do
not need special casing because they were globally enabled:
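* clone, unshare, mount, umount, umount2 all had special CAP_SYS_ADMIN
  restrictions but those don't make sense since they were also enabled
  for all containers.
* reboot was permitted for CAP_SYS_BOOT and all containers.
* name_to_handle_at was permitted for CAP_SYS_ADMIN, CAP_SYS_NICE(?),
  and all containers.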
And certain syscalls had globally-enabled rules when they shouldn't
have:
* socket has special rules for CAP_AUDIT_WRITE but it also had a global
  "allow unconditionally" rule. This actually rendered the conditional
  rules ineffective (because libseccomp ignores conditional rules if
  there is already an unconditional rule).
  This means that socket(AF_NETLINK, NETLINK_AUDIT) was not actually
  blocked by seccomp, though luckily there are ordinary kernel checks
  which block this from working.
Now the only syscalls remaining with complicated rules (meaning they
appear more than once in the profile) are:
* sync_file_range2 which is architecture specific (though in principle
  we could move it to enabled-without-rules because runc ignores
  unknown syscalls).
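As a rough illustration of the audit described above, the following Go sketch flags syscalls that appear in more than one rule and, among those, the ones that also have a global unconditional allow. The `Rule` type and `Audit` function are hypothetical simplifications, not the types used by the real profile or by generate.go, and the sample rules are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Rule is a simplified, hypothetical stand-in for one profile entry.
// It is NOT the real containers/common type.
type Rule struct {
	Names       []string // syscalls covered by this rule
	Conditional bool     // true if the rule carries argument filters
	Caps        []string // required capabilities; empty means all containers
}

// unconditional reports whether the rule applies to every container with
// no argument filters and no capability requirement.
func (r Rule) unconditional() bool {
	return !r.Conditional && len(r.Caps) == 0
}

// Audit returns the syscalls named by more than one rule (repeated) and the
// subset of those that also have a global unconditional rule (shadowed).
// It assumes a name is listed at most once within a single rule.
func Audit(rules []Rule) (repeated, shadowed []string) {
	count := map[string]int{}
	global := map[string]bool{}
	for _, r := range rules {
		for _, n := range r.Names {
			count[n]++
			if r.unconditional() {
				global[n] = true
			}
		}
	}
	for n, c := range count {
		if c > 1 {
			repeated = append(repeated, n)
			if global[n] {
				shadowed = append(shadowed, n)
			}
		}
	}
	sort.Strings(repeated)
	sort.Strings(shadowed)
	return repeated, shadowed
}

func main() {
	// Made-up sample rules, loosely following the cases in the description.
	rules := []Rule{
		{Names: []string{"read", "write", "socket", "mount"}},
		{Names: []string{"socket"}, Conditional: true, Caps: []string{"CAP_AUDIT_WRITE"}},
		{Names: []string{"mount"}, Caps: []string{"CAP_SYS_ADMIN"}},
		{Names: []string{"personality"}, Conditional: true},
		{Names: []string{"personality"}, Conditional: true},
	}
	repeated, shadowed := Audit(rules)
	fmt.Println("appear more than once:", repeated)
	fmt.Println("also globally allowed:", shadowed)
}
```

With these sample rules, socket and mount are reported in both lists, while personality is only reported as repeated because it has no unconditional rule.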
In addition, update the syscall list to Linux 5.11. This mirrors the
Docker and containerd changes, with the caveat that because mount(2) is
permitted under podman for all containers we therefore add all of the v2
mount API syscalls as available to all containers.
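For illustration only, an unconditional allow rule covering the v2 mount API might be generated as in the Go sketch below. The `Syscall` struct is a hypothetical, trimmed-down rule shape, and the list assumes the six syscalls that make up the new mount API as of Linux 5.2 (fsopen, fsconfig, fsmount, fspick, move_mount, open_tree); the exact set added by this PR is not shown in the text above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Syscall is a hypothetical, simplified rule shape for this example only.
type Syscall struct {
	Names  []string `json:"names"`
	Action string   `json:"action"`
}

// newMountAPISyscalls is an assumed list: the file-descriptor-based mount
// API syscalls introduced in Linux 5.2. The PR's actual list may differ.
var newMountAPISyscalls = []string{
	"fsconfig",
	"fsmount",
	"fsopen",
	"fspick",
	"move_mount",
	"open_tree",
}

// unconditionalAllow builds a rule with no argument filters and no
// capability requirement, i.e. one that applies to all containers.
func unconditionalAllow(names []string) Syscall {
	return Syscall{Names: names, Action: "SCMP_ACT_ALLOW"}
}

func main() {
	out, err := json.MarshalIndent(unconditionalAllow(newMountAPISyscalls), "", "\t")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```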
Fixes #419
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>