Add support for setting sysctls #19265

Merged
vdemeester merged 1 commit into moby:master from rhatdan:netsysctl on Apr 13, 2016

Conversation

rhatdan commented Jan 12, 2016

Contributor

This patch will allow users to specify namespace specific "kernel parameters"
for running inside of a container.

Signed-off-by: Dan Walsh <dwalsh@redhat.com>

rhatdan commented Jan 12, 2016

Contributor Author

This pull request replaces #16632

rhatdan commented Jan 12, 2016

Contributor Author

Opened pull request for engine-api docker-archive-public/docker.engine-api#38

Comment thread opts/opts.go Outdated

Member

should we say "not valid" here, or something that gives the user a clue that it's not supported / whitelisted?

Contributor Author

How about
sysctl %s is not a namespaced kernel parameter

Member

could work; is that true for all kernel versions though? Perhaps we should mention that we don't support it instead? Open to better suggestions though

Contributor Author

Well, then we get a bug report telling us it is a namespaced kernel parameter. If we say unsupported, the user might ignore it.

Member

IMO mentioning that it's "not whitelisted" makes it more clear to the user that they can come make the case for why it should be added 😇

Contributor Author

Well if you guys come to consensus I will change the message. :^)

Member

I'm good with "not whitelisted"

and good news; this has been moved to code review!

Comment thread opts/opts.go

Member

Are all these actually namespaced from 3.10+?

Contributor Author

man namespaces
...
   IPC namespaces (CLONE_NEWIPC)
...

       *  The System V IPC interfaces in /proc/sys/kernel, namely: msgmax,
          msgmnb, msgmni, sem, shmall, shmmax, shmmni, and shm_rmid_forced.
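
A minimal sketch, in Python rather than the Go of opts/opts.go, of the kind of whitelist check being reviewed here. It is not the code from this pull request. The IPC names are the ones quoted from the man page above with a "kernel." prefix added; the "net." prefix rule, the function name validate_sysctl and the key=value error text are assumptions made for illustration. Only the "not whitelisted" wording comes from the thread.

```python
# Illustrative whitelist: the System V IPC names quoted from `man namespaces`,
# each prefixed with "kernel.".
WHITELISTED_SYSCTLS = frozenset(
    "kernel." + name
    for name in (
        "msgmax", "msgmnb", "msgmni", "sem",
        "shmall", "shmmax", "shmmni", "shm_rmid_forced",
    )
)

# Assumption for this sketch: anything under "net." is accepted.
WHITELISTED_PREFIXES = ("net.",)


def validate_sysctl(option):
    """Split a "key=value" option and check the key against the whitelist.

    Returns (key, value) on success and raises ValueError otherwise.
    """
    key, sep, value = option.partition("=")
    key = key.strip()
    if not sep or not key or not value:
        raise ValueError("sysctl '%s' must be in key=value form" % option)
    if key in WHITELISTED_SYSCTLS or key.startswith(WHITELISTED_PREFIXES):
        return key, value
    # Wording settled on in the review thread above.
    raise ValueError("sysctl '%s' is not whitelisted" % key)
```

With this sketch, validate_sysctl("kernel.shmmax=1024") returns the parsed pair, while validate_sysctl("kernel.kptr_restrict=1") raises a ValueError whose message says the key is not whitelisted.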

@mysterlune

would you consider adding kernel.perf_event_paranoid and kernel.kptr_restrict to the whitelist?

as a use case: i'm working on a proof of concept around flame graphs, visualizing perf event data in a containerized node.js app following along the work here https://gist.github.com/trevnorris/9616784.

have had a devil of a time sorting out getting these kernel parameters to stick in a running container. came upon your PR, and am stoked to see a --sysctl flag in the works.

Contributor Author

@mysterlune Do you know if these sysctls are namespaced? If you set them inside of the container, do the settings show up outside of the container? If not, then we would add them, if yes then we will not add them.

You could test this by running a --privileged container.

@mysterlune

hey @rhatdan,

so i ran the container with the --privileged flag as such (as you can see, i'm on a mac; unsure if this matters to this experiment):

docker run -d -p 8080:8080 --name foobar -v /Users/rlune/gitproj/observable_node_poc:/src -v /Users/rlune/gitproj/observable_node_poc:/tmp --privileged observable-node-poc

... and got:

d137a82d7731ab095c484d0e70dd96081c275cf43c50100128ad5ea44e40cb01

... then, exec'd a bash session:

docker exec -it d137a82d7731ab095c484d0e70dd96081c275cf43c50100128ad5ea44e40cb01 bash

... checked the host machine for the kernel flags next...:

16:24 rlune@oakm111430f01:~/gitproj/observable_node_poc $ sysctl -a | grep kernel.perf_event_paranoid
16:25 rlune@oakm111430f01:~/gitproj/observable_node_poc $ sysctl -a | grep kernel.kptr_restrict

... received no output for each lookup. then, set the flags on the instance running in the container:

[root@d137a82d7731 app]# sysctl -w kernel.kptr_restrict=1
kernel.kptr_restrict = 1
[root@d137a82d7731 app]# sysctl -w kernel.perf_event_paranoid=1
kernel.perf_event_paranoid = 1

... read the keys on the image:

[root@d137a82d7731 app]# sysctl -a | grep kernel.kptr_restrict      
kernel.kptr_restrict = 1
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.eth0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"

[and got the same sort of output for the `kernel.perf_event_paranoid` also]

... then checked the host machine again:

16:24 rlune@oakm111430f01:~/gitproj/observable_node_poc $ sysctl -a | grep kernel.perf_event_paranoid
16:25 rlune@oakm111430f01:~/gitproj/observable_node_poc $ sysctl -a | grep kernel.kptr_restrict

... and did not see any lookup output for those keys.

is this test sufficient to determine that the kernel.perf_event_paranoid and kernel.kptr_restrict keys are namespaced in a way that will allow their addition to this PR?

Contributor Author

Not sure if it is enough, but it certainly looks good. I guess it would help if you ran another container and made sure the values were different.

@jeremyeder PTAL

@mysterlune

dang it... it doesn't look like those flags are namespaced if i run two containers and set the flags within one of them... they show up as such in the other...

so, how does a flag get "namespaced" so that this collision doesn't happen?

Contributor Author

You write a kernel patch. :^(
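
The test described in this thread (write a value in one container, then read the same key in a second container) can be summarised as a small decision rule. The sketch below is only that rule written out in Python; the function name and the sample values are hypothetical, and it does not run Docker itself.

```python
def looks_namespaced(written, other_before, other_after):
    """Interpret the two-container experiment from the thread.

    written      -- value set with `sysctl -w` inside container A
    other_before -- value container B reported before the write
    other_after  -- value container B reported after the write
    """
    if other_before == written:
        # B already had the value being written, so the experiment
        # cannot distinguish a shared key from a per-container one.
        raise ValueError("inconclusive: write a value that differs from B's")
    # If B is unchanged, the write stayed inside A's namespace.
    return other_after == other_before
```

For the outcome reported above, where the value set in one container showed up in the other, a call such as looks_namespaced("1", "0", "1") returns False.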

rhatdan commented Feb 9, 2016

Contributor Author

Now that we are in docker-1.11, could we get this pull request moving? @calavera @cpuguy83

thaJeztah added the status/needs-attention label Feb 15, 2016

@thaJeztah

Member

We're okay with this, moving to code review, unless @crosbymichael has major concerns

thaJeztah added status/2-code-review and removed status/1-design-review, status/needs-attention labels Feb 18, 2016
rhatdan force-pushed the netsysctl branch 3 times, most recently from 7269711 to ab2c413, February 19, 2016 15:25
icecrime added the status/failing-ci label Feb 29, 2016

@thaJeztah

Member

ping @calavera @cpuguy83 PTAL

@cpuguy83

Member

Just need to update the engine-api vendor, otherwise code LGTM.

@thaJeztah

Member

Looks like this needs another rebase @rhatdan 😢

rhatdan commented Mar 2, 2016

Contributor Author

Done.

rhatdan force-pushed the netsysctl branch 2 times, most recently from d6a278d to 1a5db4b, March 2, 2016 22:12
thaJeztah mentioned this pull request Mar 8, 2016

@thaJeztah

Member

ping @vdemeester ptal

@vdemeester

Member

LGTM 🐰
gccgo is SUCCESS and windowsTP5 fails with a known flakey test.. merging 😉

vdemeester merged commit 988508a into moby:master Apr 13, 2016

@thaJeztah

Member

🎉 thanks @rhatdan!

@brunoborges

Has anybody tested --sysctl to tweak net.ipv4.tcp_syn_retries to reduce global timeout for TCP connections inside the container? Does it work?
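
For anyone who wants to try this, a minimal Python sketch of how such an invocation could be assembled is shown here. The --sysctl flag is the one added by this pull request; the helper name docker_run_args, the image and the value 3 are illustrative assumptions. The sketch only builds the argument list and says nothing about whether the kernel honours the setting inside the container.

```python
def docker_run_args(image, sysctls, command=()):
    """Build a `docker run` argument list with one --sysctl per entry.

    sysctls is a mapping of key to value; keys are emitted in sorted
    order so the resulting command line is deterministic.
    """
    args = ["docker", "run"]
    for key in sorted(sysctls):
        args += ["--sysctl", "%s=%s" % (key, sysctls[key])]
    args.append(image)
    args.extend(command)
    return args


# Hypothetical usage: set the key, then read it back inside the container.
example = docker_run_args(
    "ubuntu:16.04",
    {"net.ipv4.tcp_syn_retries": 3},
    ["sysctl", "net.ipv4.tcp_syn_retries"],
)
```

The resulting list can be passed to subprocess.run on a host with Docker installed; reading the key back in the same command, as in the example, is one way to confirm that the value was applied.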

runcom pushed a commit to runcom/docker that referenced this pull request Jun 8, 2016
Upstream reference: moby#19265

This patch will allow users to specify namespace specific "kernel parameters"
for running inside of a container.

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
vingrad commented Jun 9, 2016

How can I use the sysctl option with a compose file?

@thaJeztah

Member

@vingrad this feature is not released yet (it will be in docker 1.12), also, that's a better question for the docker compose issue tracker; https://github.com/docker/compose/issues

liusdu pushed a commit to liusdu/moby that referenced this pull request Oct 30, 2017
This patch will allow users to specify namespace specific "kernel parameters"
for running inside of a container.

cherry-pick from: moby#19265

conflicts:
        contrib/completion/bash/docker
        docs/reference/commandline/create.md
        docs/reference/commandline/run.md
        man/docker-create.1.md
        man/docker-run.1.md
        runconfig/opts/parse.go

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Lei Jitang <leijitang@huawei.com>
liusdu pushed a commit to liusdu/moby that referenced this pull request Oct 30, 2017
Add support for setting sysctls

This patch will allow users to specify namespace specific "kernel parameters"
for running inside of a container.

cherry-pick from: moby#19265
conflicts:
        contrib/completion/bash/docker
        docs/reference/commandline/create.md
        docs/reference/commandline/run.md
        man/docker-create.1.md
        man/docker-run.1.md
        runconfig/opts/parse.go

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Lei Jitang <leijitang@huawei.com>

This is a feature request from euleros.


See merge request docker/docker!385
@merryisfriend

Can we now use the sysctl option in a Dockerfile? I have some parameters to set in the container. @rhatdan

cpuguy83 commented Dec 6, 2017

Member

No, you can't persist sysctls in an image.

vinujan59 commented Oct 10, 2019

ubuntu@ip:~$ docker --version
Docker version 19.03.2, build 6a30dfc

ubuntu@ip:~$ uname -r
4.15.0-1051-aws

ubuntu@ip:~$ sysctl net.core.rmem_default
net.core.rmem_default = 212992

ubuntu@ip:~$ docker run  --privileged  -it ubuntu:16.04 uname -r
4.15.0-1051-aws

ubuntu@ip:~$ docker run  --privileged  -it ubuntu:16.04  sysctl net.core.rmem_default
sysctl: cannot stat /proc/sys/net/core/rmem_default: No such file or directory

ubuntu@ip:~$ docker run  --privileged --sysctl net.core.rmem_default=524288  -it ubuntu:16.04  /bin/bash
docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"write sysctl key net.core.rmem_default: open /proc/sys/net/core/rmem_default: no such file or directory\"": unknown.

ubuntu@ip:~$ docker run  --privileged  --network="host" -it ubuntu:16.04  sysctl net.core.rmem_default
net.core.rmem_default = 212992
  • I have the latest Docker version
  • the host has the parameter net.core.rmem_default
  • the same kernel is used by the running container
  • the container doesn't have this parameter, and it cannot be set either
  • but with host network mode it can be read in the container, and the value is shared
    -- implies that net.core.rmem_default is namespaced

Does Docker not support the net.core.rmem_default parameter?
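
The error message above shows that the key net.core.rmem_default is looked up as the file /proc/sys/net/core/rmem_default. A minimal Python sketch of that mapping follows, which can be run inside a container to see whether a key is visible there before passing it to --sysctl. The function names are hypothetical, and the simple dot-to-slash translation is an assumption that does not handle keys whose components themselves contain dots, such as some interface names.

```python
import os
import posixpath


def sysctl_path(key, root="/proc/sys"):
    """Translate a dotted sysctl key into its file path under root."""
    return posixpath.join(root, *key.split("."))


def sysctl_visible(key, root="/proc/sys"):
    """Return True if the file backing the key exists under root."""
    return os.path.exists(sysctl_path(key, root))
```

In the session above, such a check would report the key as visible with --network="host" and missing in the container's own network namespace, matching the "No such file or directory" output.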

cpuguy83 commented Oct 10, 2019 via email

Member

@vinujan59

Hi Brian,

Thank you for your reply.

I did not understand the part about the "non-root network namespace".

Can you please elaborate?

Thanks
