Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply _netdev mount option in bind mount if available #68626

Merged
merged 1 commit into from
Sep 26, 2018

Conversation

gnufied
Copy link
Member

@gnufied gnufied commented Sep 13, 2018

_netdev mount option is a userspace mount option and
isn't copied over when bind mount is created and remount
also does not copies it over and hence must be explicitly
use with bind mount

Without _netdev mount option, a bind mount might not get unmounted before network is stopped on a node and hence soft-shutting down the node will not work.

Fixes #68692

/sig storage

cc @jingxu97 @msau42

Apply _netdev mount option on bind mount

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Sep 13, 2018
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Sep 13, 2018
_netdev mount option is a userspace mount option and
isn't copied over when bind mount is created and remount
also does not copies it over and hence must be explicitly
used with bind mount
@gnufied
Copy link
Member Author

gnufied commented Sep 14, 2018

/test pull-kubernetes-integration

@gnufied
Copy link
Member Author

gnufied commented Sep 14, 2018

/assign @saad-ali

@@ -286,7 +286,7 @@ func IsNotMountPoint(mounter Interface, file string) (bool, error) {
// use in case of bind mount, due to the fact that bind mount doesn't respect mount options.
// The list equals:
// options - 'bind' + 'remount' (no duplicate)
func isBind(options []string) (bool, []string) {
func isBind(options []string) (bool, []string, []string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this being special cased in the bind mount? Shouldn't it be a mount option for the global mount? That way the bind mounts can be simple?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is - when we bind mount we are dropping ALL mount options supplied by global mount. That is because traditionally Kernel drops most mount options when bind mounting something. So even if global mount did use _netdev, the bind mount will not use it.

So, this fix makes sure that - if global mount had _netdev, then we must carry that option when creating bind mount. We ourselves don't supply _netdev but we expect that mount option to come from global mount (either via storageClass mount option or plugin uses _netdev as one of the default mount options).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just apply all mount options we applied to global mount to the bind mount too? Then we don't have to special case netdev

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just apply all mount options we applied to global mount to the bind mount too? Then we don't have to special case netdev

We should start with just special casing. Suddenly having all mount options apply to bind may have unexpected results.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit confusing because it looks like for bindRemountOpts we are applying all the mount options from the global mount.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is - remount also does not copy _netdev mount option over, if original bind mount did not had it. I am somewhat skeptical about using remount to apply mount options to be honest. We know that - mount options that Kernel knows about are automatically copied over from original mount point when bind mount is created. We don't need remount for that.

And userspace mount options aren't copied over on remount.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the "_netdev" mount option coming from since it's not from the global mount?

I see that:

  1. bind options will return "bind" and "_netdev" if it's set
  2. bind remount options will return "bind", "remount", and all other options (including "_netdev")

Do we really need to special case the "_netdev" in the first bind mount if the remount is going to already include it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to special case the "_netdev" in the first bind mount if the remount is going to already include it?

Yeah we do need to. If first bind mount did not include _netdev mount option then _netdev mount option on remount has no effect and it isn't applied.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying. This looks fine then. Another issue I found is that all other plugins besides rbd are not using storageclass mount options in SetUp, so this fix will only work for rbd. We can handle the other plugins separately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - I logged #68797 to fix that as a follow up item.


func checkForNetDev(options []string) bool {
for _, option := range options {
if option == "_netdev" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any issues with specifying this option on a non-systemd host?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thing is - this PR does not add _netdev mount option by default, it just says if user or plugin author has specified _netdev mount option, I will try to honor it while creating bind mount.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it!

if option == "_netdev" {
return true
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NFS has a similar issue, where we can't unmount if share (server is disabled or network issues). Can we apply this to NFS as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be. But systemd at least already knows that NFS mounts are using network. This problem is more severe in case of block storage volume types, which appear as regular block device to systemd but actually are networked and hence must be unmounted before network is toredown during shutdown sequence.

The premise of the PR isn't for Kubernetes to apply _netdev whenever it can, but it is to honor that mount option, if user or plugin author has chosen to use it while creating bind mounts.

@saad-ali
Copy link
Member

/approve

@msau42 for LGTM

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 18, 2018
@msau42
Copy link
Member

msau42 commented Sep 18, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 18, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied, msau42, saad-ali

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nikhita
Copy link
Member

nikhita commented Sep 25, 2018

/kind bug

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Sep 25, 2018
@k8s-ci-robot k8s-ci-robot merged commit c7a67b3 into kubernetes:master Sep 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants