New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically set `may_detach_mounts=1` on startup #34886

Merged
merged 1 commit into from Sep 20, 2017

Conversation

Projects
None yet
7 participants
@cpuguy83
Contributor

cpuguy83 commented Sep 18, 2017

This is kernel config available in RHEL7.4 based kernels that enables
mountpoint removal where the mountpoint exists in other namespaces.
In particular this is important for making this pattern work:

umount -l /some/path
rm -r /some/path

Where /some/path exists in another mount namespace.
Setting this value will prevent device or resource busy errors when
attempting to the removal of /some/path in the example.

This setting is the default, and non-configurable, on upstream kernels
since 3.15.

@runcom

This comment has been minimized.

Show comment
Hide comment
@runcom

runcom Sep 18, 2017

Member

Looks good, ping @rhvgoyal

Member

runcom commented Sep 18, 2017

Looks good, ping @rhvgoyal

@rhvgoyal

This comment has been minimized.

Show comment
Hide comment
@rhvgoyal

rhvgoyal Sep 20, 2017

Contributor

LGTM.

We already do this with the help of sysctl interface. So over a boot, all the sysctl settings are applied and this is already enabled by the time docker runs.

Contributor

rhvgoyal commented Sep 20, 2017

LGTM.

We already do this with the help of sysctl interface. So over a boot, all the sysctl settings are applied and this is already enabled by the time docker runs.

@thaJeztah

minor nits, but LGTM otherwise :)

Show outdated Hide outdated daemon/daemon_unix.go Outdated
// unprivileged container. Ignore the error, but log
// it if we appear not to be in that situation.
if !rsystem.RunningInUserNS() {
logrus.Debugf("Permission denied writing %q to /proc/sys/fs/may_detach_mounts", "1")

This comment has been minimized.

@thaJeztah

thaJeztah Sep 20, 2017

Member

Looks like there's no need to use %q here, because "1" is hardcoded

@thaJeztah

thaJeztah Sep 20, 2017

Member

Looks like there's no need to use %q here, because "1" is hardcoded

This comment has been minimized.

@cpuguy83

cpuguy83 Sep 20, 2017

Contributor

Well, it quotes it, which is pretty ugly to do otherwise.

@cpuguy83

cpuguy83 Sep 20, 2017

Contributor

Well, it quotes it, which is pretty ugly to do otherwise.

This comment has been minimized.

@thaJeztah

thaJeztah Sep 20, 2017

Member

backticks for the string?

`Permission denied writing "1" to ....`
@thaJeztah

thaJeztah Sep 20, 2017

Member

backticks for the string?

`Permission denied writing "1" to ....`
Automatically set `may_detach_mounts=1` on startup
This is kernel config available in RHEL7.4 based kernels that enables
mountpoint removal where the mountpoint exists in other namespaces.
In particular this is important for making this pattern work:

```
umount -l /some/path
rm -r /some/path
```

Where `/some/path` exists in another mount namespace.
Setting this value will prevent `device or resource busy` errors when
attempting to the removal of `/some/path` in the example.

This setting is the default, and non-configurable, on upstream kernels
since 3.15.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
@runcom

runcom approved these changes Sep 20, 2017

@thaJeztah

LGTM

@yongtang yongtang merged commit 7d70d0f into moby:master Sep 20, 2017

6 checks passed

dco-signed All commits are signed
experimental Jenkins build Docker-PRs-experimental 36955 has succeeded
Details
janky Jenkins build Docker-PRs 45622 has succeeded
Details
powerpc Jenkins build Docker-PRs-powerpc 6001 has succeeded
Details
windowsRS1 Jenkins build Docker-PRs-WoW-RS1 17202 has succeeded
Details
z Jenkins build Docker-PRs-s390x 5804 has succeeded
Details

@cpuguy83 cpuguy83 deleted the cpuguy83:may_detach_mount branch Sep 21, 2017

@kolyshkin

This comment has been minimized.

Show comment
Hide comment
@kolyshkin

kolyshkin Sep 21, 2017

Contributor

Wonderful! I was just going to work on a similar PR, but for the devmapper graph driver only (as its "deferred deletion" feature only works in case fs.may_detach_mounts=1). I did not realize it might be helpful in non-devmapper case, too. This patch will also help in case of autodetection whether deferred deletion is working or not (see mbentley/docker-devicemapper-setup#4).

The alternative approach is to add a /usr/lib/sysctl.d/90-docker.conf file to docker rpm spec, containing fs.may_detach_mounts=1, and adding a kludge to the .spec so that the setting is also applied upon rpm installation. This would be cleaner (and this is what Red Hat's runc rpm does) -- the only downside is anyone who's not installing from an rpm would not get it.

Yet another alternative is something like:

// -q: not display the value set, -e: ignore non-existent key
err := exec.Command("sysctl", "-q", "-e", "fs.may_detach_mounts=1").Run()
...

While exec sn ugly in general, this is only done once upon daemon (re)start so it's OK.

Contributor

kolyshkin commented Sep 21, 2017

Wonderful! I was just going to work on a similar PR, but for the devmapper graph driver only (as its "deferred deletion" feature only works in case fs.may_detach_mounts=1). I did not realize it might be helpful in non-devmapper case, too. This patch will also help in case of autodetection whether deferred deletion is working or not (see mbentley/docker-devicemapper-setup#4).

The alternative approach is to add a /usr/lib/sysctl.d/90-docker.conf file to docker rpm spec, containing fs.may_detach_mounts=1, and adding a kludge to the .spec so that the setting is also applied upon rpm installation. This would be cleaner (and this is what Red Hat's runc rpm does) -- the only downside is anyone who's not installing from an rpm would not get it.

Yet another alternative is something like:

// -q: not display the value set, -e: ignore non-existent key
err := exec.Command("sysctl", "-q", "-e", "fs.may_detach_mounts=1").Run()
...

While exec sn ugly in general, this is only done once upon daemon (re)start so it's OK.

@thaJeztah

This comment has been minimized.

Show comment
Hide comment
@thaJeztah

thaJeztah Sep 21, 2017

Member

Yet another alternative is something like:

I do like the alternative approach (cleaner); would the permissions denied error still be properly returned with that code?

Member

thaJeztah commented Sep 21, 2017

Yet another alternative is something like:

I do like the alternative approach (cleaner); would the permissions denied error still be properly returned with that code?

@kolyshkin

This comment has been minimized.

Show comment
Hide comment
@kolyshkin

kolyshkin Sep 21, 2017

Contributor

I do like the alternative approach (cleaner); would the permissions denied error
still be properly returned with that code?

Well, what we'll get is an exit code (most probably a generic 1) and a text from stderr (which we can parse for "permission denied" but it's a slippery slope as the locale might change that). I suggest treating any error, not just EPERM, in the same way, like:

s := "fs.may_detach_mounts=1"
// -q: not display the value set, -e: ignore non-existent key
err := exec.Command("sysctl", "-q", "-e", s).Run()
// ignore the error if the daemon is inside userns
if err != nil && !rsystem.RunningInUserNS() {
	logrus.Warnf("Error setting %s: %v", s, err)
}

Or maybe even not try setting this at all if we're in userns:

// do not try to set sysctls from inside userns
if !rsystem.RunningInUserNS() {
	s := "fs.may_detach_mounts=1"
	// -q: not display the value set, -e: ignore non-existent key
	if err := exec.Command("sysctl", "-q", "-e", s).Run(); err != nil {
		logrus.Warnf("Error setting %s sysctl: %v", s, err)
}
Contributor

kolyshkin commented Sep 21, 2017

I do like the alternative approach (cleaner); would the permissions denied error
still be properly returned with that code?

Well, what we'll get is an exit code (most probably a generic 1) and a text from stderr (which we can parse for "permission denied" but it's a slippery slope as the locale might change that). I suggest treating any error, not just EPERM, in the same way, like:

s := "fs.may_detach_mounts=1"
// -q: not display the value set, -e: ignore non-existent key
err := exec.Command("sysctl", "-q", "-e", s).Run()
// ignore the error if the daemon is inside userns
if err != nil && !rsystem.RunningInUserNS() {
	logrus.Warnf("Error setting %s: %v", s, err)
}

Or maybe even not try setting this at all if we're in userns:

// do not try to set sysctls from inside userns
if !rsystem.RunningInUserNS() {
	s := "fs.may_detach_mounts=1"
	// -q: not display the value set, -e: ignore non-existent key
	if err := exec.Command("sysctl", "-q", "-e", s).Run(); err != nil {
		logrus.Warnf("Error setting %s sysctl: %v", s, err)
}
@thaJeztah

This comment has been minimized.

Show comment
Hide comment
@thaJeztah

thaJeztah Sep 21, 2017

Member

(If @cpuguy83 agrees) feel free to open a PR with that change

Member

thaJeztah commented Sep 21, 2017

(If @cpuguy83 agrees) feel free to open a PR with that change

@cpuguy83

This comment has been minimized.

Show comment
Hide comment
@cpuguy83

cpuguy83 Sep 21, 2017

Contributor

Why use exec here?

Contributor

cpuguy83 commented Sep 21, 2017

Why use exec here?

@cpuguy83

This comment has been minimized.

Show comment
Hide comment
@cpuguy83

cpuguy83 Sep 21, 2017

Contributor

I'd be fine with setting this in the systemd conf, however that's really a packaging issue which we don't really deal with in this repo (though there are some init scripts in contrib).

Contributor

cpuguy83 commented Sep 21, 2017

I'd be fine with setting this in the systemd conf, however that's really a packaging issue which we don't really deal with in this repo (though there are some init scripts in contrib).

@kolyshkin

This comment has been minimized.

Show comment
Hide comment
@kolyshkin

kolyshkin Sep 21, 2017

Contributor

Why use exec here?

As I said earlier, While exec sn ugly in general, this is only done once upon daemon (re)start so it's OK.

The benefits I see are:

  • using more standard interface
  • less code
Contributor

kolyshkin commented Sep 21, 2017

Why use exec here?

As I said earlier, While exec sn ugly in general, this is only done once upon daemon (re)start so it's OK.

The benefits I see are:

  • using more standard interface
  • less code

opera443399 added a commit to opera443399/ops that referenced this pull request Sep 26, 2017

@kolyshkin kolyshkin referenced this pull request Sep 29, 2017

Open

Multiple run of `docker run --rm` hangs in a host #115

2 of 3 tasks complete

@cpuguy83 cpuguy83 restored the cpuguy83:may_detach_mount branch Sep 30, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment