
Unclosed connections to kubelet #67659

Closed
denis256 opened this issue Aug 21, 2018 · 12 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@denis256

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

What happened:
Noticed that pods on a specific worker become unavailable after ~1 day; kubectl exec / kubectl logs return the following error:

$ kubectl exec -it pod1 bash
Error from server: error dialing backend: dial tcp 10.128.2.183:10250: connect: cannot assign requested address

Further investigation shows that the number of connections to the kubelet keeps increasing and that the connections are never closed.
Commands like kubectl exec pod1 ps are executed against pod1 every 5 seconds from another pod, and after this periodic exec was introduced, the number of open connections on the worker started to grow.

Important points:

  • If I execute kubectl exec pod1 ps from the master, the number of open connections does not grow;
  • If I execute kubectl exec pod1 ps from a pod, the number of open connections increases;
  • Restarting the kubelet service drops the connections and the pods become available again;
  • Restarting the pod which periodically runs kubectl exec does not help; the open connections are not reset;
  • Weave is used as the CNI plugin.

How to reproduce it (as minimally and precisely as possible):
On a running K8S cluster:

  • deploy a pod that kubectl commands will be executed against (pod1)
  • deploy another pod with kubectl installed and with permissions to run kubectl exec (pod2)
  • log in to the worker node which runs pod1, check the number of open file descriptors for kubelet (ls -1 /proc/<kubelet-pid>/fd | wc -l), and check the netstat output for kubelet
  • execute kubectl commands from pod2 against pod1, e.g. kubectl exec pod1 ps
  • check the number of open descriptors and connections again; for me both the descriptor count and the number of connections keep increasing (see the monitoring sketch below)
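
A minimal sketch of the check from the last step, assuming shell access on the worker node and that pgrep and netstat are available (names and intervals are illustrative):

# watch kubelet file descriptor and connection counts on the worker node
KUBELET_PID=$(pgrep -o -x kubelet)
while true; do
  FDS=$(ls -1 /proc/${KUBELET_PID}/fd | wc -l)
  CONNS=$(netstat -plant 2>/dev/null | grep -c "${KUBELET_PID}/kubelet")
  echo "$(date -u) fds=${FDS} kubelet_connections=${CONNS}"
  sleep 5
done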

Anything else we need to know?:

Example output of netstat on worker:

$ netstat -plant
...
tcp6       0      0 10.128.2.183:10250      10.128.2.179:48439      ESTABLISHED 8376/kubelet        
tcp6       0      0 10.128.2.183:10250      10.128.2.179:33398      ESTABLISHED 8376/kubelet        
tcp6       0      0 10.128.2.183:10250      10.128.2.179:43883      ESTABLISHED 8376/kubelet        
tcp6       0      0 10.128.2.183:10250      10.128.2.179:59231      ESTABLISHED 8376/kubelet        
tcp6       0      0 10.128.2.183:10250      10.128.2.179:58303      ESTABLISHED 8376/kubelet        
tcp6       0      0 10.128.2.183:10250      10.128.2.179:35768      ESTABLISHED 8376/kubelet        
tcp6       0      0 10.128.2.183:10250      10.128.2.179:59308      ESTABLISHED 8376/kubelet        
tcp6       0      0 10.128.2.183:10250      10.128.2.179:35270      ESTABLISHED 8376/kubelet        
tcp6       0      0 10.128.2.183:10250      10.128.2.179:34221      ESTABLISHED 8376/kubelet        
tcp6       0      0 10.128.2.183:10250      10.128.2.179:33154      ESTABLISHED 8376/kubelet        
tcp6       0      0 10.128.2.183:10250      10.128.2.179:54436      ESTABLISHED 8376/kubelet     
...

Connection number:

$ netstat  -plant | grep "kubelet" | wc -l
56270

pods behind 10.128.2.179:

[user@k8s-qa2-master-1 ~]$ kubectl get all --all-namespaces -owide| grep "10.128.2.179"
kube-system     pod/etcd-k8s-qa2-master-1                                             1/1       Running     0          23h       10.128.2.179   k8s-qa2-master-1
kube-system     pod/kube-apiserver-k8s-qa2-master-1                                   1/1       Running     0          23h       10.128.2.179   k8s-qa2-master-1
kube-system     pod/kube-controller-manager-k8s-qa2-master-1                          1/1       Running     0          23h       10.128.2.179   k8s-qa2-master-1
kube-system     pod/kube-proxy-47r8z                                                  1/1       Running     0          4d        10.128.2.179   k8s-qa2-master-1
kube-system     pod/kube-scheduler-k8s-qa2-master-1                                   1/1       Running     0          4d        10.128.2.179   k8s-qa2-master-1
kube-system     pod/weave-net-x97wl                                                   2/2       Running     0          4d        10.128.2.179   k8s-qa2-master-1


Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:08:34Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}

  • Cloud provider or hardware configuration: k8s deployed on gcp vm instances

  • OS (e.g. from /etc/os-release):
    NAME="CentOS Linux"
    VERSION="7 (Core)"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="7"
    PRETTY_NAME="CentOS Linux 7 (Core)"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:centos:centos:7"
    HOME_URL="https://www.centos.org/"
    BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

  • Kernel (e.g. uname -a):
    Linux k8s-qa2-master-1 3.10.0-862.9.1.el7.x86_64 #1 SMP Mon Jul 16 16:29:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • Others:
    kubelet version v1.11.0
    docker Version: 17.03.2-ce
    weave-kube:2.2.0
    weave-npc:2.2.0
    weaveexec:2.2.1
    kube-apiserver-amd64:v1.11.0
    kube-controller-manager-amd64:v1.11.0
    kube-scheduler-amd64:v1.11.0
    kube-proxy-amd64:v1.11.0
    etcd-amd64:3.2.18
    k8s-dns-kube-dns-amd64:1.14.10
    k8s-dns-sidecar-amd64:1.14.10
    k8s-dns-dnsmasq-nanny-amd64:1.14.10

@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. kind/bug Categorizes issue or PR as related to a bug. labels Aug 21, 2018
@denis256
Author

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 21, 2018
@jmcmeek
Contributor

jmcmeek commented Aug 27, 2018

I was about to open an issue for this from the apiserver perspective:

On a 1.11.2 IBM Cloud Kubernetes Service cluster (and since reproduced in a simple kubeadm cluster), we found apiserver memory usage growing. Further investigation showed a large number of long-running exec requests reported in the apiserver /metrics endpoint. There was a similar number of connections open to the kubelet but only a handful of connections open to the apiserver.

We were able to reproduce this in both environments by running a simple exec request in a loop:
for i in $(seq 1 $COUNT); do kubectl exec $POD -- sh -c "echo hi" > /dev/null; sleep 1; done

After multiple iterations of the loop spread over an hour, metrics showed that roughly 40% of the exec sessions had "leaked":

apiserver_longrunning_gauge{resource="pods",scope="namespace",subresource="exec",verb="CREATE"} 267
apiserver_request_count{client="kubectl/v1.11.2 (linux/amd64) kubernetes/bb9ffb1",code="0",contentType="",resource="pods",scope="namespace",subresource="exec",verb="CONNECT"} 433

with 267 connections open to kubelet.
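
For reference, a quick sketch of how to pull the two metrics above from the apiserver (assumes kubectl access to the /metrics endpoint):

# compare total exec requests against exec sessions still counted as long-running
kubectl get --raw /metrics | grep 'subresource="exec"' | grep -E 'apiserver_longrunning_gauge|apiserver_request_count'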

We see this to a lesser extent in Kubernetes 1.10.5 - perhaps one percent of requests, and we have only reproduced a handful of occurrences by killing the kubectl exec. The orders of magnitude difference in behavior suggests a different problem - I'll open a separate issue for 1.10.5.

@dlamotte

I believe this is related directly to #67913. I've posted a traceback of the issue in kubelet here #67913 (comment)

@jmcmeek
Contributor

jmcmeek commented Aug 31, 2018

I think what I see in 1.11 as an easily reproducible problem is different from what was reported in #67913.

In my 1.11 single-node kubeadm cluster I changed apiserver and kubelet logging to --v=10. I had a single pod running throughout the exercise. It should not have had issues like trying to exec into a container that was still starting, and I do not see any evidence of panics.

This is the first request that "leaked":

ubuntu@master-1:$ date -u; kubectl exec mypod -- sh -c "echo hi"; date -u
Fri Aug 31 13:16:36 UTC 2018
hi
Fri Aug 31 13:16:36 UTC 2018
ubuntu@master-1:$ kubectl get --raw /metrics | grep "longrunning.*exec"
apiserver_longrunning_gauge{resource="pods",scope="namespace",subresource="exec",verb="CREATE"} 1

In apiserver log I see:

{"log":"I0831 13:16:36.858650       1 handler.go:153] kube-aggregator: POST \"/api/v1/namespaces/default/pods/mypod/exec\" satisfied by nonGoRestful\n","stream":"stderr","time":"2018-08-31T13:16:36.859275835Z"}
{"log":"I0831 13:16:36.858683       1 pathrecorder.go:247] kube-aggregator: \"/api/v1/namespaces/default/pods/mypod/exec\" satisfied by prefix /api/\n","stream":"stderr","time":"2018-08-31T13:16:36.859304294Z"}
{"log":"I0831 13:16:36.858693       1 handler.go:143] kube-apiserver: POST \"/api/v1/namespaces/default/pods/mypod/exec\" satisfied by gorestful with webservice /api/v1\n","stream":"stderr","time":"2018-08-31T13:16:36.859308199Z"}
{"log":"I0831 13:16:36.860675       1 upgradeaware.go:259] Connecting to backend proxy (intercepting redirects) https://10.0.2.15:10250/exec/default/mypod/nginx?command=sh\u0026command=-c\u0026command=echo+hi\u0026error=1\u0026output=1\n","stream":"stderr","time":"2018-08-31T13:16:36.860835405Z"}
{"log":"  Headers: map[User-Agent:[kubectl/v1.11.2 (linux/amd64) kubernetes/bb9ffb1] Upgrade:[SPDY/3.1] Content-Length:[0] Connection:[Upgrade] X-Stream-Protocol-Version:[v4.channel.k8s.io v3.channel.k8s.io v2.channel.k8s.io channel.k8s.io] X-Forwarded-For:[10.0.2.15]]\n","stream":"stderr","time":"2018-08-31T13:16:36.860854258Z"}
{"log":"I0831 13:16:36.873080       1 upgradeaware.go:291] Writing 159 bytes to hijacked connection\n","stream":"stderr","time":"2018-08-31T13:16:36.873436979Z"}

The next attempt did not leak - the gauge is unchanged.

ubuntu@master-1:$ date -u; kubectl exec mypod -- sh -c "echo hi"; date -u
Fri Aug 31 13:16:46 UTC 2018
hi
Fri Aug 31 13:16:47 UTC 2018
ubuntu@master-1:$ kubectl get --raw /metrics | grep "longrunning.*exec"
apiserver_longrunning_gauge{resource="pods",scope="namespace",subresource="exec",verb="CREATE"} 1

In apiserver log I see:

{"log":"I0831 13:16:47.059990       1 handler.go:153] kube-aggregator: POST \"/api/v1/namespaces/default/pods/mypod/exec\" satisfied by nonGoRestful\n","stream":"stderr","time":"2018-08-31T13:16:47.060784191Z"}
{"log":"I0831 13:16:47.060023       1 pathrecorder.go:247] kube-aggregator: \"/api/v1/namespaces/default/pods/mypod/exec\" satisfied by prefix /api/\n","stream":"stderr","time":"2018-08-31T13:16:47.060817212Z"}
{"log":"I0831 13:16:47.060033       1 handler.go:143] kube-apiserver: POST \"/api/v1/namespaces/default/pods/mypod/exec\" satisfied by gorestful with webservice /api/v1\n","stream":"stderr","time":"2018-08-31T13:16:47.060830203Z"}
{"log":"I0831 13:16:47.061992       1 upgradeaware.go:259] Connecting to backend proxy (intercepting redirects) https://10.0.2.15:10250/exec/default/mypod/nginx?command=sh\u0026command=-c\u0026command=echo+hi\u0026error=1\u0026output=1\n","stream":"stderr","time":"2018-08-31T13:16:47.062184914Z"}
{"log":"  Headers: map[Content-Length:[0] Connection:[Upgrade] Upgrade:[SPDY/3.1] User-Agent:[kubectl/v1.11.2 (linux/amd64) kubernetes/bb9ffb1] X-Stream-Protocol-Version:[v4.channel.k8s.io v3.channel.k8s.io v2.channel.k8s.io channel.k8s.io] X-Forwarded-For:[10.0.2.15]]\n","stream":"stderr","time":"2018-08-31T13:16:47.062201941Z"}
{"log":"I0831 13:16:47.072543       1 upgradeaware.go:291] Writing 159 bytes to hijacked connection\n","stream":"stderr","time":"2018-08-31T13:16:47.072806553Z"}
{"log":"I0831 13:16:47.098223       1 handler.go:153] kube-aggregator: GET \"/api/v1/namespaces/kube-system/endpoints/kube-controller-manager\" satisfied by nonGoRestful\n","stream":"stderr","time":"2018-08-31T13:16:47.099195211Z"}
{"log":"I0831 13:16:47.098245       1 pathrecorder.go:247] kube-aggregator: \"/api/v1/namespaces/kube-system/endpoints/kube-controller-manager\" satisfied by prefix /api/\n","stream":"stderr","time":"2018-08-31T13:16:47.099226729Z"}
{"log":"I0831 13:16:47.098255       1 handler.go:143] kube-apiserver: GET \"/api/v1/namespaces/kube-system/endpoints/kube-controller-manager\" satisfied by gorestful with webservice /api/v1\n","stream":"stderr","time":"2018-08-31T13:16:47.099230712Z"}
{"log":"I0831 13:16:47.100452       1 wrap.go:42] GET /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (2.628166ms) 200 [[kube-controller-manager/v1.11.2 (linux/amd64) kubernetes/bb9ffb1/leader-election] 10.0.2.15:46906]\n","stream":"stderr","time":"2018-08-31T13:16:47.100724662Z"}
{"log":"I0831 13:16:47.101325       1 handler.go:153] kube-aggregator: PUT \"/api/v1/namespaces/kube-system/endpoints/kube-controller-manager\" satisfied by nonGoRestful\n","stream":"stderr","time":"2018-08-31T13:16:47.101533933Z"}
{"log":"I0831 13:16:47.101366       1 pathrecorder.go:247] kube-aggregator: \"/api/v1/namespaces/kube-system/endpoints/kube-controller-manager\" satisfied by prefix /api/\n","stream":"stderr","time":"2018-08-31T13:16:47.101544082Z"}
{"log":"I0831 13:16:47.101385       1 handler.go:143] kube-apiserver: PUT \"/api/v1/namespaces/kube-system/endpoints/kube-controller-manager\" satisfied by gorestful with webservice /api/v1\n","stream":"stderr","time":"2018-08-31T13:16:47.101546749Z"}
{"log":"I0831 13:16:47.103997       1 wrap.go:42] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager: (2.909748ms) 200 [[kube-controller-manager/v1.11.2 (linux/amd64) kubernetes/bb9ffb1/leader-election] 10.0.2.15:46906]\n","stream":"stderr","time":"2018-08-31T13:16:47.104322501Z"}
{"log":"E0831 13:16:47.126710       1 upgradeaware.go:310] Error proxying data from client to backend: write tcp 10.0.2.15:48540-\u003e10.0.2.15:10250: write: broken pipe\n","stream":"stderr","time":"2018-08-31T13:16:47.127479248Z"}
{"log":"I0831 13:16:47.127020       1 wrap.go:42] POST /api/v1/namespaces/default/pods/mypod/exec?command=sh\u0026command=-c\u0026command=echo+hi\u0026container=nginx\u0026container=nginx\u0026stderr=true\u0026stdout=true: (67.210314ms) hijacked [[kubectl/v1.11.2 (linux/amd64) kubernetes/bb9ffb1] 10.0.2.15:48200]\n","stream":"stderr","time":"2018-08-31T13:16:47.127543949Z"}

I went back and double-checked that "Error proxying data from client to backend" only shows for "non-leaking" exec requests. I assume it should be there and its absence is part of the puzzle.

In kubelet.log (covering both requests) I see:

Aug 31 13:16:27 master-1 kubelet[17582]: E0831 13:16:27.432827   17582 upgradeaware.go:310] Error proxying data from client to backend: readfrom tcp 127.0.0.1:39730->127.0.0.1:46134: write tcp 127.0.0.1:39730->127.0.0.1:46134: write: broken pipe
Aug 31 13:16:47 master-1 kubelet[17582]: E0831 13:16:47.126534   17582 upgradeaware.go:310] Error proxying data from client to backend: readfrom tcp 127.0.0.1:39870->127.0.0.1:46134: write tcp 127.0.0.1:39870->127.0.0.1:46134: write: broken pipe

The entry at 13:16:27 corresponds to an earlier exec request that did not leak.
The entry at 13:16:47 corresponds to the last exec request I made - again, did not leak.
There is nothing at the time of the "leaked" request.

These were the last lines of the kubelet log, so there is no chance that I am omitting kubelet log entries.
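
A rough way to check this correlation on a node, assuming kubelet logs to the systemd journal (a sketch, not part of the original report):

# count proxy-teardown errors in the kubelet log over the test window and
# compare against the number of exec requests issued; the shortfall roughly
# matches the "leaked" sessions
journalctl -u kubelet --since "1 hour ago" | grep -c "Error proxying data from client to backend"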

@dlamotte

@jmcmeek Yeah, agreed. There might be two issues here (or this issue and #67913 are indeed different). I have seen those write: broken pipe messages as well. We're cleaning up our code to stop polling with kubectl exec before the container starts, and we'll see whether the count continues to rise.

In our cluster we are seeing it consistently, rising to around 40k fds open in kubelet daily (to mitigate the issue, we now automatically restart kubelet when it reaches that many). It might slow down as we remove the polling execs, but we'll know soon.
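
For anyone needing the same mitigation, a minimal sketch of such a watchdog, assuming a systemd-managed kubelet and that the script runs periodically (e.g. from cron); the 40000 threshold is just the figure mentioned above:

# restart kubelet if its open file descriptor count crosses a threshold
THRESHOLD=40000
KUBELET_PID=$(pgrep -o -x kubelet)
FDS=$(ls -1 /proc/${KUBELET_PID}/fd | wc -l)
if [ "${FDS}" -gt "${THRESHOLD}" ]; then
  echo "$(date -u) kubelet has ${FDS} open fds, restarting"
  systemctl restart kubelet
fi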

Not sure what additional information would help move this issue forward at this point, but I'd be happy to provide it.

@wwsean08

wwsean08 commented Sep 6, 2018

I am also seeing this issue on 1.11.x (not sure what x is at the moment). If there are any logs or debugging information that would help determine the root cause, please let me know.

@jmcmeek
Contributor

jmcmeek commented Sep 15, 2018

What I was seeing in 1.11 - frequent "leaked" connections for exec sessions to running containers - seems to be fixed in 1.12.0-beta.1 by #67288.

@P1ng-W1n

Can confirm - This issue seems to be fixed in 1.12

@veeramarni

veeramarni commented Oct 24, 2018

I'm using kubectl client version 1.12.1 on a cluster running version 1.10.7, and I still see the kubectl connection leaks.

@P1ng-W1n

I'm using kubectl client version 1.12.1 on a cluster running version 1.10.7, and I still see the kubectl connection leaks.

Right - those connections are kept alive by kubelet, not by kubectl; that's why you're still affected. In my case upgrading the whole cluster (including kubelet) to 1.12 was the solution (my cluster is used for CI purposes, and we had roughly ~50k "orphan" connections after one or two weeks; since the upgrade the number of connections stays below 150-200 even after a few weeks).
In case you can't upgrade your cluster just yet, my experience is that restarting the kubelet daemon on the k8s workers works as a workaround, though only temporarily.

@MHBauer
Contributor

MHBauer commented Dec 22, 2018

Fix is to upgrade to something containing #67288, which should now be in all releases 1.10+, or to periodically restart the kubelet.
/close

@k8s-ci-robot
Contributor

@MHBauer: Closing this issue.

In response to this:

Fix is to upgrade to something containing #67288, which should now be in all releases 1.10+, or to periodically restart the kubelet.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
