
Attacher and provisioner may need privileged too #32

Closed
wongma7 opened this issue Mar 29, 2019 · 21 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@wongma7
Contributor

wongma7 commented Mar 29, 2019

It seems that on SELinux-enabled systems, the provisioner & attacher pod can't access the socket created by the privileged plugin pod. Previously: kubernetes/kubernetes#69215. I am not sure of the exact reason, hopefully @bertinatto can explain :))

@wongma7
Contributor Author

wongma7 commented Mar 29, 2019

I think it is because /var/lib/kubelet/plugins/csi-hostpath needs to be relabelled to allow random containers to read it. The hostpath plugin will not relabel it for us https://github.com/kubernetes/kubernetes/blob/7f0e04a089125901dce18c7d96507f2b60560e18/pkg/volume/host_path/host_path.go#L213

@bertinatto
Contributor

I think it is because /var/lib/kubelet/plugins/csi-hostpath needs to be relabelled to allow random containers to read it

@wongma7, that's correct! On SELinux-enabled systems, --privileged containers add certain labels to the files they create. If I'm not mistaken, those labels vary from namespace to namespace. Because of that, when a non-privileged container tries to access the socket file created by the privileged one, it gets a permission error due to mismatched labels.

Recently, I found a better way to handle this. We can assign a specific SELinux label to csi-hostpath, like this:

SecurityContext: &v1.SecurityContext{
        Privileged: true,
        SELinuxOptions: &v1.SELinuxOptions{
                Level: "s0:c0,c1",
        },
},

With that, all files created by this container will have the label above. Then we can assign the same labels to the attacher container:

SecurityContext: &v1.SecurityContext{
        SELinuxOptions: &v1.SELinuxOptions{
                Level: "s0:c0,c1",
        },
},

This should prevent the mismatch and thus the permission error.
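For reference, the same idea expressed directly in the pod manifests would look roughly like this (my own sketch; the level value is only an example):

# privileged plugin container: files it creates get this level
securityContext:
  privileged: true
  seLinuxOptions:
    level: "s0:c0,c1"

# non-privileged attacher container: same level so it can read the socket
securityContext:
  seLinuxOptions:
    level: "s0:c0,c1"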

@pohly
Contributor

pohly commented Apr 1, 2019

What does s0:c0,c1 mean? Are these the right values for all SELinux-enabled systems or just for the one you tried this on?

@bertinatto
Contributor

s0:c0,c1 was just a random value that I tried.

The file /etc/selinux/targeted/setrans.conf contains a translation for these values. Note that it may vary from system to system as these values can be configured by the admin through the semanage tool.

@pohly
Contributor

pohly commented Apr 1, 2019

s0:c0,c1 was just a random value that I tried. ... Note that it may vary from system to system

That's what I feared. This is a problem for the example deployment and for E2E testing, because we cannot simply put the sections above in our .yaml files.

Is there a certain set of commands that can be used to look up these values? Or is this simply something that a cluster admin needs to know and provide as parameter to the deployment script?

@wongma7
Contributor Author

wongma7 commented Apr 1, 2019

I think it is up to the cluster admin to know and assign the meaning of categories (c0, c1) on their machines, then provide them. They would reserve categories for their CSI driver deployment and then enforce that other pods use other categories via PodSecurityPolicies.
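As a rough sketch of how an admin could pin ordinary workloads to a different level (the level value below is just a placeholder), a PodSecurityPolicy along these lines would do it:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: workload-selinux-level
spec:
  seLinux:
    rule: MustRunAs
    seLinuxOptions:
      level: "s0:c100,c200"   # placeholder; not the level reserved for the CSI driver
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny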

That covers SELinux levels (MLS/multi-level security), but what about SELinux types? On my system /var/lib/kubelet/plugins is system_u:object_r:var_lib_t:s0, so containers with system_u:object_r:container_t:s0 can't read anything there. So in this case the cluster admin must also relabel the socket file?

cc @jsafrane

@bertinatto
Contributor

That covers SELinux levels (MLS/multi-level security), but what about SELinux types? On my system /var/lib/kubelet/plugins is system_u:object_r:var_lib_t:s0, so containers with system_u:object_r:container_t:s0 can't read anything there.

I missed the fact that the driver creates the socket under /var/lib/kubelet. By default, new files inherit the type of their parent directories (even if the file was created by a --privileged container), so my solution above wouldn't work for this case.

So in this case the cluster admin must also relabel the socket file?

That's a possible solution, but it can be cumbersome, especially on worker nodes.

@bertinatto
Contributor

To summarize:

Regarding the provisioner and attacher, I believe they don't need to be privileged, because the driver shipped in the same pod (which should implement the Controller Service set of RPC calls) doesn't need to be privileged. (In kubernetes/kubernetes#69215 I also made the provisioner and attacher privileged because the same (privileged) driver object was used with all sidecars. To prevent that we should have different driver objects, each tailored to one of the 3 services/sidecars.)

As for the Node Service, the driver does need to run as privileged because it formats and mounts volumes. As a result, the socket file it exposes to the registrar has a SELinux context that's not accessible by non-privileged containers.

@bertinatto
Contributor

So my question is: do we really want to allow a non-privileged container to access the socket file created by the driver (Node Service)?

If we do, we might have a security problem, because:

  1. I could start a non-privileged container that accesses the socket file exposed by the (privileged) driver
  2. Connect to the RPC server
  3. And format a volume on the host

Whatever solution we find, we must make sure that we don't allow this to happen.

@jsafrane
Contributor

jsafrane commented Apr 2, 2019

I triple-checked on a SELinux machine.

  1. Attacher and provisioner need to be privileged to access the driver's socket. We can either:

    • Run attacher and provisioner as privileged. HostPath is the only driver that needs it and it's experimental anyway.
    • Run attacher and provisioner with the same SELinux context as privileged containers. The containers themselves are still not privileged: they can't mount() or do other privileged stuff; however, SELinux won't block access to /var/lib/kubelet. On RHEL/CentOS/Fedora, spc_t is the right context.
      kind: StatefulSet
      spec:
        template:
          spec:
            securityContext:
              seLinuxOptions:
                type: spc_t
            serviceAccountName: csi-provisioner
            containers:
      ...
      

    Since SELinux context names are distro specific, I'd prefer to just run all the containers as privileged (HostPath driver only!); a sketch of that option follows at the end of this comment.

  2. (Non-privileged) node registrar can access socket created by privileged driver in the same pod. They both run with system_u:system_r:spc_t:s0 context.

    • Beware, with SELinuxOptions: &v1.SELinuxOptions{ Level: "s0:c0,c1", }, the registrar runs as container_t:s0:c0,c1, while the driver runs as spc_t:s0, and the registrar can't read the driver's socket! So any SELinuxOptions is actually harmful in this case! I spent half a day debugging that.

Edit: tested with cri-o as container runtime.
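A minimal sketch of the first option for the hostpath deployment, simply marking the sidecar container privileged (provisioner shown; the attacher would be analogous):

kind: StatefulSet
spec:
  template:
    spec:
      serviceAccountName: csi-provisioner
      containers:
      - name: csi-provisioner
        securityContext:
          privileged: true
...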

@bertinatto
Contributor

bertinatto commented Apr 2, 2019

(Non-privileged) node registrar can access socket created by privileged driver in the same pod. They both run with system_u:system_r:spc_t:s0 context.

For the record, I just tested with:

  • Centos 7 host (SELinux enabled)
  • EBS CSI driver
    • privileged driver container
    • non-privileged node registrar container
  • docker/containerd runtime

The registrar was able to connect to the driver without problems. The non-privileged registrar used to be a problem, but I suppose the SELinux context of containers in the same pod was fixed at some point recently.
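For reference, a node-side pod along the lines of that test would look roughly like this (my sketch, not the exact manifest; images, args and the registration-dir volume are omitted, and the socket path is driver specific):

kind: DaemonSet
spec:
  template:
    spec:
      containers:
      - name: ebs-plugin                # CSI driver, privileged (formats and mounts)
        securityContext:
          privileged: true
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      - name: node-driver-registrar     # not privileged, reads the same socket
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      volumes:
      - name: socket-dir
        hostPath:
          path: /var/lib/kubelet/plugins/ebs.csi.aws.com/
          type: DirectoryOrCreate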

@pohly
Contributor

pohly commented Apr 2, 2019

Run attacher and provisioner as privileged. HostPath is the only driver that needs it and it's experimental anyway.

Why does only the hostpath driver need this? Because only this deployment runs the driver in a separate pod? There are other deployments which might do the same, for whatever reasons.

@jsafrane
Contributor

jsafrane commented Apr 2, 2019

Why does only the hostpath driver need this? Because only this deployment runs the driver in a separate pod? There are other deployments which might do the same, for whatever reasons.

Attacher + provisioner need to access a socket in /var/lib/kubelet/plugins/* on the host instead of EmptyDir. Any CSI driver that's going to use HostPath instead of EmptyDir will face the same issue.

Basically, SELinux does not like any HostPath volumes: we don't want processes that escaped their container messing up the host, even if they run as root. So either the admin (or a package) labels special directories as allowed to be used by containers, or the cluster admin runs these special pods with a special policy.

This special policy will be either distro specific or even cluster specific and is then hard to configure from an e2e test.

@pohly
Contributor

pohly commented Apr 2, 2019

So "HostPath is the only driver" isn't about the csi-driver-host-path? You meant the builtin hostpath storage driver?

@jsafrane
Contributor

jsafrane commented Apr 3, 2019

It's so confusing. It affects only csi-driver-host-path, because that's the only one that uses in-tree HostPath volume in attacher/provisioner to get to driver socket created by a privileged container in another pod.

@pohly
Contributor

pohly commented Apr 3, 2019

It affects only csi-driver-host-path, because that's the only one that uses in-tree HostPath volume in attacher/provisioner to get to driver socket created by a privileged container in another pod.

In other words, this:

volumes:
- hostPath:
    path: /var/lib/kubelet/plugins/csi-hostpath
    type: DirectoryOrCreate
  name: socket-dir

I can see how that is a bit special. Other CSI driver deployments probably have attacher/provisioner/driver all bundled up in a single pod.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 2, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 1, 2019
@msau42
Collaborator

msau42 commented Aug 2, 2019

Can we put all the sidecars in the same pod as the driver? Bundling them all together is our recommended way, and the fact that our sample driver is not doing that is confusing to driver devs.
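A sketch of what that bundled layout could look like for the controller-side sidecars, sharing the socket over an emptyDir so no hostPath volume is needed (images and args omitted; for the Node Service the socket still has to live under /var/lib/kubelet/plugins/ so kubelet can reach it):

kind: StatefulSet
spec:
  template:
    spec:
      containers:
      - name: csi-provisioner
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      - name: csi-attacher
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      - name: hostpath-plugin
        securityContext:
          privileged: true
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      volumes:
      - name: socket-dir
        emptyDir: {}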

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
