Increase nf_conntrack_tcp_timeout_close_wait #4276

Closed
grampelberg opened this issue Apr 20, 2020 · 4 comments · Fixed by #4409

@grampelberg
Contributor

What problem are you trying to solve?

By default, nf_conntrack_tcp_timeout_close_wait is set to 60 seconds. Any request that takes longer than that is lost by the kernel, which surfaces to applications as 502 gateway errors. kube-proxy bumped this up to 3600 seconds for its rules (see kubernetes/kubernetes#32551).

How should the problem be solved?

It is not possible to modify this via the pod spec, as it is marked "unsafe" in most k8s installations. However, as it is part of the network namespace, you can set it in an init container and have it persist for the life of the session. We should do this inside the proxy-init container and as part of setup for CNI.

@adleong
Member

adleong commented Apr 27, 2020

In a quick spot check, I saw that this is already configured to 3600 on GKE and 60 on AKS.
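A spot check like this could be scripted roughly as follows. This is a minimal sketch, not something taken from the thread: the function name `check_close_wait` and the `CLOSE_WAIT_PATH` override are hypothetical, and it assumes the procfs entry behind the sysctl is readable from inside the pod's network namespace.

```shell
#!/bin/sh
# Hypothetical helper: compare the conntrack CLOSE_WAIT timeout against a minimum.
# CLOSE_WAIT_PATH is an illustrative override; the default is the procfs file
# that backs net.netfilter.nf_conntrack_tcp_timeout_close_wait.
CLOSE_WAIT_PATH="${CLOSE_WAIT_PATH:-/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait}"

check_close_wait() {
  # $1: minimum acceptable timeout in seconds
  want="$1"
  if [ ! -r "$CLOSE_WAIT_PATH" ]; then
    echo "unknown: $CLOSE_WAIT_PATH not readable"
    return 2
  fi
  have="$(cat "$CLOSE_WAIT_PATH")"
  if [ "$have" -ge "$want" ]; then
    echo "ok: $have"
  else
    echo "low: $have (want >= $want)"
    return 1
  fi
}
```

Running `check_close_wait 3600` inside a container of the pod in question would then report whether the value is at least the one kube-proxy uses.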

@adleong
Member

adleong commented Apr 29, 2020

It seems that in order to do this, we need to set allowPrivilegeEscalation and privileged to true in the init container's security context. I'm not really sure what the implications of that are. Is this a tradeoff we're okay with?

@derrickburns

derrickburns commented Apr 29, 2020

I deployed this today into a test environment using an initContainer per the suggestion of @grampelberg. I will have feedback in the next few days if this worked to address the issues that we are seeing.

initContainers:
- args:
  - -c
  - sysctl -w net.netfilter.nf_conntrack_tcp_timeout_close_wait=3600
  command:
  - /bin/sh
  image: busybox:1.29
  imagePullPolicy: IfNotPresent
  name: sysctl-buddy
  resources:
    requests:
      cpu: 1m
      memory: 1Mi
  securityContext:
    privileged: true
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
  volumeMounts:
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: gateway-proxy-token-28bpm
    readOnly: true

I shelled into a pod with this and did this:

/modsecurity/data # sysctl -a | grep  net.netfilter.nf_conntrack_tcp_timeout_close_wait
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 3600

@ericsuhong

For us, we did not have to increase nf_conntrack_tcp_timeout_close_wait for all containers.

We only needed to increase the nf_conntrack_tcp_timeout_close_wait value for the Nginx ingress controller to fix the issue, so making this configurable via an annotation might limit the security concern.

@grampelberg grampelberg added priority/P1 Planned for Release and removed priority/P0 Release Blocker labels May 4, 2020
adleong added a commit that referenced this issue May 21, 2020
Depends on linkerd/linkerd2-proxy-init#10

Fixes #4276 

We add a `--close-wait-timeout` inject flag which configures the proxy-init container to run with `privileged: true` and to set `nf_conntrack_tcp_timeout_close_wait`. 

Signed-off-by: Alex Leong <alex@buoyant.io>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 17, 2021
4 participants