Document per namespace sysctl and how to set them in pods #29572

Closed

bprashanth opened this issue Jul 25, 2016 · 11 comments
Labels
kind/documentation Categorizes issue or PR as related to documentation. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@bprashanth
Contributor

Source of major confusion in webserver tuning, but I'm sure that's just one victim.

@kubernetes/sig-node

@bprashanth added the kind/documentation, sig/node and documentation/confusing labels Jul 25, 2016
@bprashanth added this to the next-candidate milestone Jul 25, 2016
@ncdc
Member

ncdc commented Jul 25, 2016

@bprashanth see #26057

@bprashanth
Contributor Author

So that says:

network namespace: net.*

which is pretty much what I could figure out from the link to the docker source above. But experimentation shows that /proc/sys/net/netfilter/nf_conntrack_max is not per ns, and I think I found a couple more. What am I missing?

Without reading through the whole proposal: does setting it on a container pipe through to the host, or does someone still need the daemon set?
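One way to settle which of these are per-netns empirically is to read the values from inside a pod and compare them with the same paths on the node. A minimal sketch (the pod name and busybox image are placeholders, and it assumes the nf_conntrack module is loaded on the node):

apiVersion: v1
kind: Pod
metadata:
  name: sysctl-check   # hypothetical name, for illustration only
spec:
  restartPolicy: Never
  containers:
  - name: check
    image: busybox
    # Print the values as seen from this pod's network namespace;
    # compare against the same paths read directly on the host.
    command: ["sh", "-c", "cat /proc/sys/net/core/somaxconn /proc/sys/net/netfilter/nf_conntrack_max"]

If nf_conntrack_max were namespaced, the pod would see its own copy; in practice it reflects the host-wide value, which matches the observation above.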

@vishh
Contributor

vishh commented Jul 25, 2016

The sysctl proposal mentions maintaining a file that will list the sysctls that are namespace-aware. The general strategy is to let the kubelet configure sysctls and not have pods update sysctls themselves.

One of the open questions on that proposal is knowing which sysctls can safely be set to a higher value by default globally... Any input in that regard will help!

@bprashanth
Contributor Author

bprashanth commented Jul 25, 2016

I can write down the common ones for webserver tuning when I have some time. Maybe @aledbf or @PiotrSikora know off the top of their heads the ones we use in https://github.com/kubernetes/contrib/tree/master/ingress/controllers/nginx or commonly with nginx.

@aledbf
Member

aledbf commented Jul 25, 2016

net.core.somaxconn and net.ipv4.ip_local_port_range are the ones used in the sysctl example (https://github.com/kubernetes/contrib/blob/master/ingress/controllers/nginx/examples/sysctl/change-proc-values-rc.yaml#L80). I've been unable to find a list of the values that can be changed in a network namespace.
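The pattern in that example boils down to a privileged helper container that writes the values with sysctl -w at pod startup; since both sysctls are network-namespaced, the writes only affect the pod. A rough sketch of the idea (container names and values here are illustrative, not copied from the actual manifest):

apiVersion: v1
kind: Pod
metadata:
  name: nginx-tuned   # illustrative name
spec:
  containers:
  - name: sysctl-buddy
    image: busybox
    # Privileged so writes to /proc/sys are allowed; both containers
    # share the pod's network namespace, so nginx sees the new values.
    securityContext:
      privileged: true
    command: ["sh", "-c"]
    args:
    - sysctl -w net.core.somaxconn=32768 &&
      sysctl -w net.ipv4.ip_local_port_range='1024 65535' &&
      sleep 3600
  - name: nginx
    image: nginx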

@vishh
Contributor

vishh commented Jul 25, 2016

A list with safe defaults will be helpful.

@sttts
Contributor

sttts commented Jul 29, 2016

Here is a good starting point in the kernel source to see which namespaced (per netns) sysctls exist under net.*:

https://github.com/torvalds/linux/search?utf8=%E2%9C%93&q=register_net_sysctl

If I understand it correctly, a number of sysctls only exist in init_net, the initial network namespace. And network namespaces are not nested, but flat. So we have a good chance that we really only get those sysctls which are safe.

It's not that hard to double check: lines like

table[0].data = &net->xfrm.sysctl_aevent_etime;

set a sysctl value destination in the given network namespace object net.

For example, here are the netns-specific tables for ipv4 and core:

https://github.com/torvalds/linux/blob/a7fd20d1c476af4563e66865213474a2f9f473a4/net/ipv4/sysctl_net_ipv4.c#L668
https://github.com/torvalds/linux/blob/a7fd20d1c476af4563e66865213474a2f9f473a4/net/core/sysctl_net_core.c#L417

@sttts
Contributor

sttts commented Jul 29, 2016

Of course, the philosophical question remains: do we want to offer all technically possible sysctls (i.e. all those that are namespaced), or only the most common ones like the two above?

@fejta-bot

Issues go stale after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Dec 19, 2017
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jan 18, 2018
@yujuhong
Contributor

I think we can close this issue now that there is documentation on sysctl and how to use it: https://kubernetes.io/docs/concepts/cluster-administration/sysctl-cluster/
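For reference, that page documents setting namespaced sysctls directly in the pod's securityContext, along these lines (pod name and value are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: sysctl-example   # illustrative name
spec:
  securityContext:
    # "Safe" sysctls work out of the box; "unsafe" ones must first be
    # enabled per node via the kubelet's --allowed-unsafe-sysctls flag.
    sysctls:
    - name: net.core.somaxconn
      value: "1024"
  containers:
  - name: nginx
    image: nginx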
