Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WiP] Set propagation to rslave for hostPath volume mounts #41683

Closed
wants to merge 2 commits into from

Conversation

ivan4th
Copy link
Contributor

@ivan4th ivan4th commented Feb 18, 2017

Based on #31504, but supports only rslave, adds support for CRI /
dockershim and displays warnings of Docker doesn't support mount
propagation options.

Not to be merged till kubernetes/community#193 discussion is resolved.
TODO: e2e test.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 18, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

The following people have approved this PR: ivan4th

Needs approval from an approver in each of these OWNERS Files:

We suggest the following people:
cc @vishh @jsafrane
You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. release-note-label-needed labels Feb 18, 2017
@k8s-reviewable
Copy link

This change is Reviewable

// is a comma-separated list of the following strings:
// 'ro', if the path is read only
// 'Z', if the volume requires SELinux relabeling
// propagation mode such as 'rshared'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be you can mention rslave here, as that's what this patch adds.

// is a comma-separated list of the following strings:
// 'ro', if the path is read only
// 'Z', if the volume requires SELinux relabeling
// propagation mode such as 'rshared'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@ivan4th
Copy link
Contributor Author

ivan4th commented Feb 18, 2017

Current e2e failures are caused by this:

E0218 03:15:56.780689    3290 kuberuntime_manager.go:715] container start failed: rpc error: code = 2 desc = failed to start container "cb3622a4e409f66d78e5e06a9d5b296fd41b3ccebfd7e2612f4399e00b517b5c": Error response from daemon: linux mounts: Path /var/log is mounted on / but it is not a shared or slave mount.: Start Container Failed
E0218 03:15:56.781042    3290 pod_workers.go:182] Error syncing pod f76dc21cc44d2ff38117b64cc043dd01 ("kube-proxy-gke-e2e-gke-agent-pr-83--default-pool-3a973fae-3z86_kube-system(f76dc21cc44d2ff38117b64cc043dd01)"), skipping: failed to "StartContainer" for "kube-proxy" with rpc error: code = 2 desc = failed to start container "cb3622a4e409f66d78e5e06a9d5b296fd41b3ccebfd7e2612f4399e00b517b5c": Error response from daemon: linux mounts: Path /var/log is mounted on / but it is not a shared or slave mount.: "Start Container Failed"

Apparently newer dockers (like 1.13.x) check whether the source path is a mount and bindmount it on itself if it isn't. Looks like older dockers don't do this. This can be worked around by doing the checks on kubelet/dockershim side, but the problem is (already discussed to some extent but I don't recall any definite solution being found) that docker daemon / containerd may reside in its own mount namespace and thus checking the mounts on the hosts may still mean that they're not there from docker's PoV. A possible smoke test for this is checking current mounts and mount propagation to the Docker using e.g. busybox container at kubelet/dockershim startup but this is ... hacky. Any thoughts on a good way to deal with this? Sorry if it was already mentioned in some of discussions but I missed it.

@ivan4th
Copy link
Contributor Author

ivan4th commented Feb 18, 2017

@gurvindersingh thanks for pointing out the problems in comments, fixed them

@gurvindersingh
Copy link
Contributor

@euank @majewsky @lucab @dchen1107 wondering if you guys can go over the current work from @ivan4th in this PR and review it, so we can fix it with the comments and hopefully make it ready for merge before 1.6 code freeze.

Thanks!!

@euank
Copy link
Contributor

euank commented Feb 21, 2017

@ivan4th if the right way is to do a dryrun/smoketest container, the pause container would be a better option than busybox.

There could also be kubelet code to find dockerd's pid, break into its mount namespace, and do the test from its perspective.

The bigger question is whether we fix easily fixable problems. For example, the test failure in e2e is because / is not a slave mount, but if the kubelet made it one (at least for docker's mountns) it would probably work... and maybe that's the right thing to do for the user.

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 1, 2017
@jsafrane
Copy link
Member

the test failure in e2e is because / is not a slave mount, but if the kubelet made it one (at least for docker's mountns) it would probably work... and maybe that's the right thing to do for the user.

That sounds dangerous. Somebody (system admin, distribution author) has decided that Docker should run in its own mount namespace with specific propagation mode and it's not up to Kubernetes to try to fix that.

IMO we should print some error and refuse to do slave or shared mounts instead of trying to hack the system. It would be better to check and report this misconfiguration during node startup and not when the first pod with HostPath is created, but it's IMO not a hard requirement.

@jsafrane
Copy link
Member

I checked major distros, always with the distribution docker (Fedora, CentOS, CoreOS) or the latest from docker.io (Debian, Ubuntu):

  • CoreOS Container Linux 1353.1.0, Fedora 25, Debian stable (jessie), Ubuntu 16.04 LTS run with root mount as shared, which is propagated into docker (i.e. docker does not have its own mount namespace nor remounts the root). This PR should work there without any special config.
  • RHEL7/CentOS7 runs with root as shared, however docker runs in its own mount namespace and sees root as slave. This could be fixed in a future update (no promises though).
  • Debian wheezy (old stable) runs with root as private. That results in error in GCE tests we see here.

So, do we need to support wheezy? Can we change default private root to shared in GCE test infrastructure? Can we upgrade to jessie?

I'll check docker 1.13 and cryptic bind-mounts, but I did not notice anything bad in ubuntu 16.04 with docker community edition that reports itself as version 17.03.0-ce...

@jsafrane
Copy link
Member

Checked with Container Linux 1353.1 which ships with docker 1.13.1, I did not notice any bad bind mounts that would prevent docker run -v xxx:yyy:rslave from working. @ivan4th, on what distro do you see these bind mounts?

@jsafrane
Copy link
Member

@euank @ivan4th, I've been thinking, what kind of autodetection could we do on kubelet startup? There may be many (bin)mounts on the system, either on the host or in docker mount namespace, which one should we check if it's shared? We know only at a HostPath pod creation which path mount(s) need to be shared/slave and which can stay private. Does check of / in docker mount namespace tell us anything?

@vishh
Copy link
Contributor

vishh commented Mar 24, 2017

It is not OK for any arbitrary pod to be able to introduce mounts into the host's mount namespace. I see the need for this feature. At the least it has to be gated by a Pod Security Policy field and only exposed to certain trusted pods.

cc @saad-ali @euank

@euank
Copy link
Contributor

euank commented Mar 25, 2017

@vishh this is about rslave, not rshared.

This change does not allow a pod to change the host's mount namespace anymore than it already can.

@vishh
Copy link
Contributor

vishh commented Mar 27, 2017

@euank sorry, I was focusing mainly on the overall use case. Given what you said, I'm curious what the use case is for this PR?

@euank
Copy link
Contributor

euank commented Mar 27, 2017

@vishh A couple use-cases were already mentioned in the original issue, such as #31504 (comment)

@gurvindersingh
Copy link
Contributor

gurvindersingh commented Apr 10, 2017

@ivan4th any update on this one, thanks.

@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 12, 2017
@ivan4th
Copy link
Contributor Author

ivan4th commented Apr 12, 2017

I've updated the PR for current master. Still, I don't know how to solve the current Docker problem. Indeed it sounds like tweaking docker daemon's mount ns without user's consent is scary. Failing loudly is not quite an option either because we're changing the default behavior here, so we may break k8s for some users. Also, the CI env has still this / mount problem.

@kfox1111
Copy link

Is anyone working on shared mounts anymore? how do we ensure shared mounts are in 1.7.

@gurvindersingh
Copy link
Contributor

@ivan4th may be we can gate this setting using experimental flags and mentioned the docker mount settings required to make this option work. For CI issue, @jsafrane created #44389 which can possibly make the tests passing for this PR.

@ivan4th
Copy link
Contributor Author

ivan4th commented Apr 12, 2017

@gurvindersingh yes, this sounds like a plan. I'll add the flag to kubelet in this PR, and let's wait for CI fix to land. Just in case -- here's why some installations specify MountFlags for docker.service: http://blog.oddbit.com/2015/01/18/docker-vs-privatetmp/

Based on kubernetes#31504, but supports only rslave, adds support for CRI /
dockershim and displays warnings of Docker doesn't support mount
propagation options.
@k8s-github-robot k8s-github-robot added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Apr 21, 2017
@ivan4th
Copy link
Contributor Author

ivan4th commented Apr 21, 2017

Added --experimental-enable-host-path-mount-propagation kubelet option, defaults to false.

The option is --experimental-enable-host-path-mount-propagation,
defaults to false.
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Apr 21, 2017

@ivan4th: The following test(s) failed:

Test name Commit Details Rerun command
Jenkins GCI GKE smoke e2e 513a342c361d92def3ae2ee9d91c52ec02e82a70 link @k8s-bot gci gke e2e test this
Jenkins GKE smoke e2e 513a342c361d92def3ae2ee9d91c52ec02e82a70 link @k8s-bot cvm gke e2e test this
Jenkins Bazel Build 2e6aa99210348a7d9f69c6d9c486c22026f765e3 link @k8s-bot bazel test this
Jenkins unit/integration 2e6aa99210348a7d9f69c6d9c486c22026f765e3 link @k8s-bot unit test this
Jenkins verification 2e6aa99210348a7d9f69c6d9c486c22026f765e3 link @k8s-bot verify test this
Jenkins non-CRI GCE Node e2e 2e6aa99210348a7d9f69c6d9c486c22026f765e3 link @k8s-bot non-cri node e2e test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot
Copy link

@ivan4th PR needs rebase

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 28, 2017
jsafrane added a commit to jsafrane/kubernetes that referenced this pull request May 17, 2017
This is heavily adjusted version of kubernetes#41683

- rebase was needed
- removed experimental-enable-host-path-mount-propagation cmdline option
  (too lazy to rebase...)
- added rshared support for privileged pods
@jsafrane
Copy link
Member

yet another implementation attempt: #46444

@k8s-github-robot
Copy link

This PR hasn't been active in 90 days. Closing this PR. Please reopen if you would like to work towards merging this change, if/when the PR is ready for the next round of review.

cc @euank @ivan4th

You can add 'keep-open' label to prevent this from happening again, or add a comment to keep it open another 90 days

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants