feat: mount /sys/kernel/security into kubelet #8641

Merged
1 commit merged into siderolabs:main on May 7, 2024

Conversation

@andrewrynhard
Member
This allows the kubelet to enforce AppArmor.

@@ -120,6 +120,7 @@ func (k *Kubelet) Runner(r runtime.Runtime) (runner.Runner, error) {
	mounts := []specs.Mount{
		{Type: "bind", Destination: "/dev", Source: "/dev", Options: []string{"rbind", "rshared", "rw"}},
		{Type: "sysfs", Destination: "/sys", Source: "/sys", Options: []string{"bind", "ro"}},
		{Type: "securityfs", Destination: "/sys/kernel/security", Source: "/sys/kernel/security", Options: []string{"bind", "ro"}},
Member
Should it be just "ro", so that the kubelet only looks into it but never configures anything?

Do we want this backported, or should it be part of some future work?

Member Author

bind should be needed because it propagates changes from the host down into the container (though I'm unsure whether anything in there changes over time).

Member Author

I don't think it needs to be backported, but if it wouldn't hurt, it could be useful in 1.7. I can check with the user.


Only ro is technically needed for AppArmor, since the kubelet checks whether AppArmor is enabled via the opencontainers apparmor lib: https://github.com/opencontainers/runc/blob/main/libcontainer/apparmor/apparmor_linux.go#L18

That check verifies that the directory /sys/kernel/security/apparmor exists and that a read of /sys/module/apparmor/parameters/enabled returns Y.
Both should have their final value set by the time the kubelet starts.
The part I'm not sure bind handles correctly is that /sys/kernel/security is one of those virtual filesystems: securityfs is a sort of interface to the LSM modules in the kernel.
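
For reference, here is a minimal sketch of that check (illustrative only, not the library's exact code): AppArmor counts as enabled when the securityfs directory exists and the kernel module parameter reads back Y.

package main

import (
	"bytes"
	"fmt"
	"os"
)

// apparmorEnabled mirrors the check described above: the securityfs directory
// must exist and the kernel module parameter must read back "Y".
func apparmorEnabled() bool {
	fi, err := os.Stat("/sys/kernel/security/apparmor")
	if err != nil || !fi.IsDir() {
		return false
	}

	data, err := os.ReadFile("/sys/module/apparmor/parameters/enabled")
	if err != nil {
		return false
	}

	return bytes.HasPrefix(bytes.TrimSpace(data), []byte("Y"))
}

func main() {
	fmt.Println("AppArmor enabled:", apparmorEnabled())
}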

Member

I think bind should be fine; it just exposes the filesystem at another mountpoint, whereas without bind we do a completely fresh mount.

Member Author

So do we want it?

Member

Yep, I think bind is better, but we also need to handle the case where the filesystem is not mounted on the host (when running in container mode). I believe in container mode we don't mount it explicitly (???) and rely on the container runtime to provide it or not... just to make sure we don't break Talos in containers if securityfs is not mounted on the container runtime host. I don't know if that's even possible, just as a precaution :)

Member

yep, we don't mount securityfs in container mode:

default:
	phases = phases.Append(
		"systemRequirements",
		EnforceKSPPRequirements,
		SetupSystemDirectory,
		MountBPFFS,
		MountCgroups,
		MountPseudoFilesystems,
		SetRLimit,
	).Append(

But as we do a bind mount, I believe our code will just mount an empty directory into the kubelet if the host doesn't provide securityfs, so bind >> re-mount :)
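
To make the bind-versus-fresh-mount distinction concrete, here is a minimal sketch using golang.org/x/sys/unix (the target path is hypothetical and this is not Talos' actual mount code):

package main

import (
	"log"

	"golang.org/x/sys/unix"
)

// mountSecurityfs shows the two approaches discussed above.
func mountSecurityfs(target string, bind bool) error {
	if bind {
		// Bind re-exposes the host's /sys/kernel/security at target; if the
		// host never mounted securityfs, this is just an empty directory.
		if err := unix.Mount("/sys/kernel/security", target, "", unix.MS_BIND, ""); err != nil {
			return err
		}
		// Making a bind mount read-only requires a remount with MS_RDONLY.
		return unix.Mount("", target, "", unix.MS_REMOUNT|unix.MS_BIND|unix.MS_RDONLY, "")
	}
	// Without bind: a completely fresh securityfs instance at target,
	// independent of the host's mount table.
	return unix.Mount("securityfs", target, "securityfs", unix.MS_RDONLY, "")
}

func main() {
	// Hypothetical target; requires root (and Linux) to actually run.
	if err := mountSecurityfs("/mnt/securityfs", true); err != nil {
		log.Fatal(err)
	}
}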

@andrewrynhard
Member Author

Since AppArmor landed with the 1.7 kernel, it probably makes sense to backport.

@konrader

Since AppArmor landed with the 1.7 kernel, it probably makes sense to backport.

I think getting this into v1.7 fits nicely with the bump to Kubernetes 1.30, which includes the move of AppArmor config to Pod SecurityContext fields. kubernetes/kubernetes#123435

@konrader

I tried to validate whether this fixes the problem where a Pod with AppArmor configured gets stuck in Pending with the message Cannot enforce AppArmor: AppArmor is not enabled on the host.

I mounted the securityfs in kubelet using machine config:

machine:
    kubelet:
        extraMounts:
            - destination: /sys/kernel/security
              type: bind
              source: /sys/kernel/security
              options:
                - bind
                - ro

Starting a Pod with an AppArmor profile specified (which I loaded into the node kernel beforehand):

spec:
  containers:
  - name: test
  ...
    securityContext:
      appArmorProfile:
        type: Localhost
        localhostProfile: myprofile

I no longer get the ...AppArmor is not enabled on the host error, but a new type of error: Error: failed to get container spec opts: failed to generate apparmor spec opts: apparmor is not supported

Feels like there is something else that also needs to be fixed around the kubelet.

@smira
Member

smira commented Apr 24, 2024

@konrader it might be easier if you could provide us with a full set of steps to launch a pod with an AppArmor profile, which we could include in Talos as an integration test. This way we can develop a fix and a test.

@konrader

I will write down the steps needed. It does include loading an AppArmor profile (with the apparmor_parser command-line tool) via a (somewhat) privileged container.

Where are the integration tests located today?

@smira
Member

smira commented Apr 24, 2024

I will write down the steps needed. It does include loading an AppArmor profile (with the apparmor_parser command-line tool) via a (somewhat) privileged container.

Where are the integration tests located today?

In internal/integration, but if it only needs some Kubernetes interactions, that is perfectly fine as well.

@konrader

Here is a complete script that tests both loading an AppArmor profile into the kernel and running a Pod with the profile enabled.

#!/bin/sh

NS=apparmor

kubectl create namespace $NS

kubectl label ns $NS pod-security.kubernetes.io/enforce=privileged

kubectl apply -n $NS -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: apparmor
spec:
  terminationGracePeriodSeconds: 3
  containers:
  - name: apparmor
    image: ghcr.io/bifrostsec/apparmor-image/apparmor:22.04
    # Just spin & wait forever
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
    securityContext:
      privileged: true
      capabilities:
        add:
          - MAC_ADMIN
    volumeMounts:
      - name: sysfs
        mountPath: /host/sys
        readOnly: false
      - name: securityfs
        mountPath: /sys/kernel/security
        readOnly: false
  volumes:
    - name: sysfs
      hostPath:
        path: /sys
    - name: securityfs
      hostPath:
        path: /sys/kernel/security
EOF
kubectl -n $NS wait --for=condition=Ready pod/apparmor

# Copy over simple AppArmor audit profile
kubectl -n $NS cp audit.profile apparmor:tmp/audit.profile
# Load it into host Kernel
kubectl -n $NS exec apparmor -it -- apparmor_parser /tmp/audit.profile

kubectl apply -n $NS -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  terminationGracePeriodSeconds: 3
  containers:
  - name: test
    image: alpine:3.19.1
    # Just spin & wait forever
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
    securityContext:
      appArmorProfile:
        type: Localhost
        localhostProfile: audit-all
EOF

This depends on the AppArmor profile being in a separate file named audit.profile with this content:

abi <abi/3.0>,

profile audit-all flags=(attach_disconnected,mediate_deleted,complain,audit) {
}

@konrader

I've traced the error failed to generate apparmor spec opts: apparmor is not supported to containerd: https://github.com/containerd/containerd/blob/release/1.7/pkg/cri/server/container_create_linux.go#L505

which at its core relies on containerd's own check, apparmor.HostSupports(), to see if the host supports AppArmor.
https://github.com/containerd/containerd/blob/release/1.7/pkg/apparmor/apparmor_linux.go#L34

This check does more than the kubelet's; notably, it also checks that /sbin/apparmor_parser is present, which it is not on a Talos node.

On Talos (or other Linux platforms) where AppArmor profile loading is done either by another userland tool or from inside a container, this check is not necessary, since containerd requires the profile to be loaded into the kernel before you reference it for a container anyway.
There seems to be a bit of controversy around this, with the extra check being removed and then reverted: containerd/containerd#8087

This PR is still a step in the right direction, since it gets rid of the problem of the kubelet not detecting whether AppArmor is enabled on the node.
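
A simplified sketch of that stricter check as described above (not containerd's exact code; the parser check is the part that fails on a stock Talos node):

package main

import (
	"bytes"
	"fmt"
	"os"
)

// hostSupportsAppArmor approximates the containerd-side check: the same
// kernel-side conditions the kubelet verifies, plus a requirement that the
// userland parser binary is present on the host.
func hostSupportsAppArmor() bool {
	if _, err := os.Stat("/sys/kernel/security/apparmor"); err != nil {
		return false
	}

	data, err := os.ReadFile("/sys/module/apparmor/parameters/enabled")
	if err != nil || !bytes.HasPrefix(bytes.TrimSpace(data), []byte("Y")) {
		return false
	}

	// The extra condition: /sbin/apparmor_parser must exist.
	_, err = os.Stat("/sbin/apparmor_parser")
	return err == nil
}

func main() {
	fmt.Println("containerd-style AppArmor support:", hostSupportsAppArmor())
}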

@smira
Member

smira commented Apr 29, 2024

@konrader we can put an empty /sbin/apparmor_parser into the rootfs if that helps, but I don't fully understand how it broke Moby. Would it break some workloads on Talos (ones not previously using AppArmor)?

@konrader

I was actually thinking about adding an empty /sbin/apparmor_parser to the Talos rootfs to try, but didn't have my build instance up.
Not sure either what the connection is between containerd and Moby with regard to AppArmor. Moby says on their website that they use containerd as the default container runtime, but it's still weird that they won't accept referencing AppArmor profiles (that are already loaded into the kernel) for containers unless apparmor_parser is present (indicating that the userland AppArmor tools are installed). It's a bit like checking whether your car repair shop exists before you can start your car.

@konrader

I tried to add an empty /sbin/apparmor_parser via machine config, but that wasn't allowed.

machine:
    files:
        - content: ''
          permissions: 0o555
          path: /sbin/apparmor_parser
          op: create

Got this message during boot:
create operation not allowed outside of /var: "/sbin/apparmor_parser"

What is the easiest way to add the file to the rootfs to try out whether this would work?

@smira
Member

smira commented Apr 30, 2024

What is the easiest way to add the file to the rootfs to try out whether this would work?

The only way is to rebuild Talos with changes to the Dockerfile to touch that file in the rootfs.

I can get to it with the steps you provided to build a test for AppArmor.

@konrader

konrader commented Apr 30, 2024

@konrader we can put an empty /sbin/apparmor_parser into the rootfs if that helps, but I don't fully understand how it broke Moby. Would it break some workloads on Talos (ones not previously using AppArmor)?

I'm looking into why Moby/Docker wants containerd to have the extra check for /sbin/apparmor_parser.
It seems to be related to Moby applying a default AppArmor profile, docker-default, to started containers (unless overridden by the security-opt option): https://docs.docker.com/engine/security/apparmor/#understand-the-policies

Moby removed their own AppArmor-enabled check and started relying on containerd's check: moby/moby#42276
Moby uses the result of this check to decide whether they need to parse and load the docker-default profile, which requires apparmor_parser: https://github.com/moby/moby/blob/master/profiles/apparmor/apparmor.go#L59
So some of their users had problems starting containers when AppArmor was enabled in the kernel but the AppArmor userland tools weren't installed.

I do see that the default-profile loading is under /contrib in containerd, and the missing apparmor_parser could possibly be a problem if someone deploys a workload on Talos that asks for the runtime's default profile, like this:

securityContext:
  appArmorProfile:
    type: RuntimeDefault

Docs: https://kubernetes.io/docs/tutorials/security/apparmor/#securing-a-pod

I will try to verify this.

@konrader

I could reproduce the scenario with Kubernetes 1.30 running on Ubuntu 22.04 when I uninstalled the AppArmor userland tools (apt remove apparmor plus a reboot, since containerd caches the result of its AppArmor-enabled check).

With no /sbin/apparmor_parser I could not start a Pod with an AppArmor securityContext; I got the error failed to generate apparmor spec opts: apparmor is not supported (same as on Talos after mounting /sys/kernel/security into the kubelet).

I then created an empty /sbin/apparmor_parser, and after rebooting, various Kubernetes pods fail to start with this error:
kubelet[1706]: E0430 16:35:26.865258 1706 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"kube-apiserver\" with CreateContainerError: \"failed to create containerd container: load apparmor profile /tmp/cri-containerd.apparmor.d3636700873: parser error(\\\"\\\"): exec: \\\"apparmor_parser\\\": executable file not found in $PATH\""

So it seems like the CRI implementation in containerd does try to apply a default AppArmor profile to all containers that don't have a specific AppArmor profile set in their security options.

If we do want AppArmor support in Talos, I see the following options:

  1. Include the real apparmor_parser in the rootfs
  2. Disable applying the default profile to containers in the containerd CRI

@konrader

konrader commented May 2, 2024

It does not seem possible to disable the automatic application of a default AppArmor profile (called cri-containerd.apparmor.d) in containerd for containers started via CRI (the interface Kubernetes uses) once it has deemed AppArmor enabled on the host.
I assume this is to make all container runtimes have the same default behaviour: https://github.com/containerd/containerd/blob/main/internal/cri/server/container_create_linux.go#L202

If we included apparmor_parser (a 1.5M binary) in the rootfs for everyone, the impact would be people suddenly having their workloads run with a default profile without knowing it.
The default profile is quite open, but it does deny access to certain things that people might not be prepared for; see the profile template here: https://github.com/containerd/containerd/blob/main/contrib/apparmor/template.go#L42

@smira
Member

smira commented May 2, 2024

@konrader thanks for your research here. I think having AppArmor (better security) is in the best spirit of Talos. We can think about whether we want apparmor_parser as an extension, for example, but I guess this becomes a bigger change, so it would likely land in the 1.8 timeframe.

This allows the kubelet to detect AppArmor.

Signed-off-by: Andrew Rynhard <andrew@rynhard.io>
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
@smira
Member

smira commented May 7, 2024

I created issue #8715 to track the actual AppArmor work; this PR makes sense on its own, so it should be good to merge.

@smira
Member

smira commented May 7, 2024

/m

@talos-bot talos-bot merged commit 4aac5b4 into siderolabs:main May 7, 2024
17 checks passed