[WiP] Set propagation to rslave for hostPath volume mounts #41683
[APPROVALNOTIFIER] This PR is NOT APPROVED. The following people have approved this PR: ivan4th. Needs approval from an approver in each of these OWNERS Files: We suggest the following people:
cb4b3e5 to 76614dc
pkg/kubelet/dockershim/helpers.go
Outdated
// is a comma-separated list of the following strings:
// 'ro', if the path is read only
// 'Z', if the volume requires SELinux relabeling
// propagation mode such as 'rshared'
Maybe you can mention rslave here, as that's what this patch adds.
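For readers following the review, here is a minimal sketch of how such an option list could be assembled and attached to a Docker-style bind string. The helper names `bindOptions` and `bindString` are hypothetical and are not taken from the patch; only the option values ('ro', 'Z', and a propagation mode such as 'rslave') come from the comment under review.

```go
package main

import (
	"fmt"
	"strings"
)

// bindOptions is a hypothetical helper (not the PR's actual code). It builds
// the comma-separated option list described in the reviewed comment:
// 'ro', 'Z', and an optional propagation mode such as "rslave".
func bindOptions(readOnly, selinuxRelabel bool, propagation string) string {
	var opts []string
	if readOnly {
		opts = append(opts, "ro")
	}
	if selinuxRelabel {
		opts = append(opts, "Z")
	}
	if propagation != "" {
		opts = append(opts, propagation)
	}
	return strings.Join(opts, ",")
}

// bindString formats a bind as <host path>:<container path>[:<options>].
func bindString(hostPath, containerPath, opts string) string {
	bind := hostPath + ":" + containerPath
	if opts != "" {
		bind += ":" + opts
	}
	return bind
}

func main() {
	// prints /var/lib/data:/data:ro,rslave
	fmt.Println(bindString("/var/lib/data", "/data", bindOptions(true, false, "rslave")))
}
```

Leaving the propagation argument empty yields the same bind string as before the change, which is one way to keep the existing default behavior.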
// is a comma-separated list of the following strings:
// 'ro', if the path is read only
// 'Z', if the volume requires SELinux relabeling
// propagation mode such as 'rshared'
same as above
76614dc to 513a342
Current e2e failures are caused by this:
Apparently newer Docker versions (such as 1.13.x) check whether the source path is a mount and bind-mount it onto itself if it isn't. Older versions don't seem to do this. This can be worked around by doing the checks on the kubelet/dockershim side, but the problem (already discussed to some extent, though I don't recall any definite solution being found) is that the Docker daemon / containerd may reside in its own mount namespace, so mounts that exist on the host may still be missing from Docker's point of view. A possible smoke test is to check the current mounts and mount propagation as seen by Docker, e.g. using a busybox container at kubelet/dockershim startup, but this is ... hacky. Any thoughts on a good way to deal with this? Sorry if this was already mentioned in one of the discussions and I missed it.
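To make the check being discussed concrete, here is a rough sketch of reading a mount's propagation mode from text in the Linux `mountinfo` format (as found in `/proc/self/mountinfo`, or in `/proc/<pid>/mountinfo` for another process's mount namespace). The function name `propagationOf` and the sample data are made up for illustration; this is not code from the PR, and it ignores details such as octal-escaped characters in mount points.

```go
package main

import (
	"fmt"
	"strings"
)

// sample is invented mountinfo-style data used only for this illustration.
const sample = `22 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
40 22 8:2 / /mnt/data rw,relatime master:1 - ext4 /dev/sda2 rw
41 22 8:3 / /mnt/private rw,relatime - ext4 /dev/sda3 rw`

// propagationOf is a hypothetical helper. It scans mountinfo-formatted text
// for mountPoint and reports "shared", "slave", "unbindable" or "private".
// ok is false when the path is not a mount point in the given text.
// If several entries match (stacked mounts), the last one wins.
func propagationOf(mountinfo, mountPoint string) (mode string, ok bool) {
	for _, line := range strings.Split(mountinfo, "\n") {
		fields := strings.Fields(line)
		// Field 5 (index 4) is the mount point; optional fields start at index 6.
		if len(fields) < 7 || fields[4] != mountPoint {
			continue
		}
		mode, ok = "private", true
		for _, f := range fields[6:] {
			if f == "-" { // separator that ends the optional fields
				break
			}
			switch {
			case strings.HasPrefix(f, "shared:"):
				mode = "shared"
			case strings.HasPrefix(f, "master:"):
				if mode != "shared" {
					mode = "slave"
				}
			case f == "unbindable":
				mode = "unbindable"
			}
		}
	}
	return mode, ok
}

func main() {
	for _, p := range []string{"/", "/mnt/data", "/mnt/private", "/not/a/mount"} {
		mode, ok := propagationOf(sample, p)
		fmt.Printf("%s: mode=%q mounted=%v\n", p, mode, ok)
	}
}
```

Feeding the function the `mountinfo` of the Docker daemon's process rather than the kubelet's own would give the view from Docker's mount namespace, which is the distinction raised above. Whether the kubelet should do that at all is a separate question.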
@gurvindersingh thanks for pointing out the problems in the comments, fixed them.
@euank @majewsky @lucab @dchen1107 could you go over the current work from @ivan4th in this PR and review it, so we can address the comments and hopefully make it ready for merge before the 1.6 code freeze? Thanks!!
@ivan4th if the right way is to do a dryrun/smoketest container, the pause container would be a better option than busybox. There could also be kubelet code to find dockerd's pid, break into its mount namespace, and do the test from its perspective. The bigger question is whether we fix easily fixable problems. For example, the test failure in e2e is because |
That sounds dangerous. Somebody (system admin, distribution author) has decided that Docker should run in its own mount namespace with a specific propagation mode, and it's not up to Kubernetes to try to fix that. IMO we should print an error and refuse to do slave or shared mounts instead of trying to hack the system. It would be better to check and report this misconfiguration during node startup rather than when the first pod with HostPath is created, but that's not a hard requirement.
I checked major distros, always with the distribution docker (Fedora, CentOS, CoreOS) or the latest from docker.io (Debian, Ubuntu):
So, do we need to support wheezy? Can we change the default private root to shared in the GCE test infrastructure? Can we upgrade to jessie? I'll check docker 1.13 and the cryptic bind-mounts, but I did not notice anything bad in Ubuntu 16.04 with Docker community edition, which reports itself as version 17.03.0-ce...
Checked with Container Linux 1353.1 which ships with docker 1.13.1, I did not notice any bad bind mounts that would prevent |
@euank @ivan4th, I've been thinking: what kind of autodetection could we do on kubelet startup? There may be many (bind) mounts on the system, either on the host or in Docker's mount namespace, so which one should we check to see if it's shared? We know only at HostPath pod creation which path mount(s) need to be shared/slave and which can stay private. Does check of
@vishh this is about rslave, not rshared. This change does not allow a pod to change the host's mount namespace any more than it already can.
@euank sorry, I was focusing mainly on the overall use case. Given what you said, I'm curious what the use case is for this PR?
@vishh A couple of use cases were already mentioned in the original issue, such as #31504 (comment)
@ivan4th any update on this one? Thanks.
I've updated the PR for current master. Still, I don't know how to solve the current Docker problem. Indeed, tweaking the docker daemon's mount ns without the user's consent sounds scary. Failing loudly is not quite an option either, because we're changing the default behavior here, so we may break k8s for some users. Also, the CI env still has this / mount problem.
Is anyone still working on shared mounts? How do we ensure shared mounts are in 1.7?
@gurvindersingh yes, this sounds like a plan. I'll add the flag to kubelet in this PR, and let's wait for CI fix to land. Just in case -- here's why some installations specify |
Based on kubernetes#31504, but supports only rslave, adds support for CRI / dockershim and displays warnings if Docker doesn't support mount propagation options.
Added |
The option is --experimental-enable-host-path-mount-propagation, defaults to false.
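As an illustration of what an opt-in gate like this might look like, here is a small sketch using Go's standard `flag` package. Only the flag name and its `false` default come from this PR. The function names, and the assumption that enabling the flag results in `rslave` being added to hostPath mounts, are mine; the kubelet's real flag wiring may differ.

```go
package main

import (
	"flag"
	"fmt"
)

// Flag name as stated in this PR.
const flagName = "experimental-enable-host-path-mount-propagation"

// parsePropagationFlag is a hypothetical stand-in for kubelet flag parsing.
// It uses the standard library flag package purely for illustration.
func parsePropagationFlag(args []string) (bool, error) {
	fs := flag.NewFlagSet("kubelet-sketch", flag.ContinueOnError)
	enabled := fs.Bool(flagName, false, "enable mount propagation for hostPath volumes (sketch)")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *enabled, nil
}

// hostPathPropagation returns the propagation option to add to hostPath
// mounts. Assumption: "rslave" when the flag is on, nothing otherwise.
func hostPathPropagation(enabled bool) string {
	if enabled {
		return "rslave"
	}
	return ""
}

func main() {
	enabled, err := parsePropagationFlag([]string{"--" + flagName})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Printf("enabled=%v propagation=%q\n", enabled, hostPathPropagation(enabled))
}
```

With the default of `false`, existing installations keep their current mount behavior unless they opt in.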
@ivan4th: The following test(s) failed:
@ivan4th PR needs rebase
This is a heavily adjusted version of kubernetes#41683:
- rebase was needed
- removed experimental-enable-host-path-mount-propagation cmdline option (too lazy to rebase...)
- added rshared support for privileged pods
Yet another implementation attempt: #46444
This PR hasn't been active in 90 days. Closing this PR. Please reopen if you would like to work towards merging this change, if/when the PR is ready for the next round of review. You can add the 'keep-open' label to prevent this from happening again, or add a comment to keep it open another 90 days.
Based on #31504, but supports only rslave, adds support for CRI /
dockershim and displays warnings if Docker doesn't support mount
propagation options.
Not to be merged till kubernetes/community#193 discussion is resolved.
TODO: e2e test.