New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for setting sysctls #19265
Conversation
This pull request replaces #16632 |
Opened pull request for engine-api docker/engine-api#38 |
} | ||
arr := strings.Split(val, "=") | ||
if len(arr) < 2 { | ||
return "", fmt.Errorf("sysctl '%s' is not valid ", val) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we say "not valid" here, or something that gives the user a clue that it's not supported / whitelisted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
sysctl %s is not a namespaced kernel parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could work; is that true for all kernel versions though? Perhaps we should mention that we don't support it in stead? Open to better suggestions though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well then we get a bug report telling us it is a namespaced kernel parameter. If we say unsupported, user might ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO mentioning that it's "not whitelisted" makes it more clear to the user that they can come make the case for why it should be added 😇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well if you guys come to consensus I will change the message. :^)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with "not whitelisted"
and good news; this has been moved to code review!
// ValidateSysctl validates an sysctl and returns it. | ||
func ValidateSysctl(val string) (string, error) { | ||
validSysctlMap := map[string]bool{ | ||
"kernel.msgmax": true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all these actually namespaced from 3.10+?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
man namespaces
...
IPC namespaces (CLONE_NEWIPC)
...
* The System V IPC interfaces in /proc/sys/kernel, namely: msgmax,
msgmnb, msgmni, sem, shmall, shmmax, shmmni, and shm_rmid_forced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would you consider adding kernel.perf_event_paranoid
and kernel.kptr_restrict
to the whitelist?
as a use case: i'm working on a proof of concept around flame graphs, visualizing perf event data in a containerized node.js app following along the work here https://gist.github.com/trevnorris/9616784.
have had a devil of a time sorting out getting these kernel parameters to stick in a running container. came upon your PR, and am stoked to see a --sysctl
flag in the works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mysterlune Do you know if these sysctls are namespaced? If you set them inside of the container, do the settings show up outside of the container? If not, then we would add them, if yes then we will not add them.
You could test this by running a --privileged container.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey @rhatdan,
so i ran the container with the --privileged
flag as such (as you can see, i'm on a mac; unsure if this matters to this experiment):
docker run -d -p 8080:8080 --name foobar -v /Users/rlune/gitproj/observable_node_poc:/src -v /Users/rlune/gitproj/observable_node_poc:/tmp --privileged observable-node-poc
... and got:
d137a82d7731ab095c484d0e70dd96081c275cf43c50100128ad5ea44e40cb01
... then, exec
'd a bash
session:
docker exec -it d137a82d7731ab095c484d0e70dd96081c275cf43c50100128ad5ea44e40cb01 bash
... checked the host machine for the kernel flags next...:
16:24 rlune@oakm111430f01:~/gitproj/observable_node_poc $ sysctl -a | grep kernel.perf_event_paranoid
16:25 rlune@oakm111430f01:~/gitproj/observable_node_poc $ sysctl -a | grep kernel.kptr_restrict
... received no output for each lookup. then, set the flags on the instance running in the container:
[root@d137a82d7731 app]# sysctl -w kernel.kptr_restrict=1
kernel.kptr_restrict = 1
[root@d137a82d7731 app]# sysctl -w kernel.perf_event_paranoid=1
kernel.perf_event_paranoid = 1
... read the keys on the image:
[root@d137a82d7731 app]# sysctl -a | grep kernel.kptr_restrict
kernel.kptr_restrict = 1
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.eth0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
[and got the same sort of output for the `kernel.perf_event_paranoid` also]
... then checked the host machine again:
16:24 rlune@oakm111430f01:~/gitproj/observable_node_poc $ sysctl -a | grep kernel.perf_event_paranoid
16:25 rlune@oakm111430f01:~/gitproj/observable_node_poc $ sysctl -a | grep kernel.kptr_restrict
... and did not see any lookup output for those keys.
is this test sufficient to determine that the kernel.perf_event_paranoid
and kernel.kptr_restrict
keys are namespaced in a way that will allow their addition to this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it is enough, but it certainly looks good. I guess if you ran another container and made sure they were different.
@jeremyeder PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dang it... it doesn't look like those flags are namespaced if i run two containers and set the flags within one of them... they show up as such in the other...
so, how does a flag get "namespaced" so that this collision doesn't happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You write a kernel patch. :^(
We're okay with this, moving to code review, unless @crosbymichael has major concerns |
7269711
to
ab2c413
Compare
Just need to update the engine-api vendor, otherwise code LGTM. |
Looks like this needs another rebase @rhatdan 😢 |
Done. |
d6a278d
to
1a5db4b
Compare
ping @vdemeester ptal |
LGTM 🐰 |
🎉 thanks @rhatdan! |
Has anybody tested --sysctl to tweak |
Upstream reference: moby#19265 This patch will allow users to specify namespace specific "kernel parameters" for running inside of a container. Signed-off-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Antonio Murdaca <runcom@redhat.com>
How can I use sysctl option with a compose file? |
@vingrad this feature is not released yet (it will be in docker 1.12), also, that's a better question for the docker compose issue tracker; https://github.com/docker/compose/issues |
This patch will allow users to specify namespace specific "kernel parameters" for running inside of a container. cherry-pick from: moby#19265 conflicts: contrib/completion/bash/docker docs/reference/commandline/create.md docs/reference/commandline/run.md man/docker-create.1.md man/docker-run.1.md runconfig/opts/parse.go Signed-off-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Lei Jitang <leijitang@huawei.com>
Add support for setting sysctls This patch will allow users to specify namespace specific "kernel parameters" for running inside of a container. cherry-pick from: moby#19265 conflicts: contrib/completion/bash/docker docs/reference/commandline/create.md docs/reference/commandline/run.md man/docker-create.1.md man/docker-run.1.md runconfig/opts/parse.go Signed-off-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Lei Jitang <leijitang@huawei.com> This is a feature request from euleros. See merge request docker/docker!385
Now can we use sysctl option with a dockerfile? I have some of parameters to set in container. @rhatdan |
No, you can't persist sysctls in an image. |
is docker doesn't support net.core.rmem_default parameter? |
The kernel does not support setting that value in a non-root (that is
"root" as in the base, not a user) network namespace.
|
Hi Brian, Thank you for your reply. I could not get the part "non-root network namespace". Can you please ellaborate Thanks |
This patch will allow users to specify namespace specific "kernel parameters"
for running inside of a container.
Dan Walsh dwalsh@redhat.com
Signed-off-by: Dan Walsh dwalsh@redhat.com