
Idle connections are stuck after establishing connection. They never send PacketType_DATA #311

Closed
ipochi opened this issue Dec 15, 2021 · 10 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@ipochi
Contributor

ipochi commented Dec 15, 2021

While exploring issue #276 further, I noticed that some idle connections get stuck in a strange state: the connection is established, i.e. the kube-apiserver receives a DIAL_RSP from the proxy-server, but thereafter there is complete inactivity from the apiserver, i.e. no PacketType_DATA packets are ever sent.
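
For context, this is the per-connection packet lifecycle we expect to see, shown as a minimal Go sketch (illustration only, not konnectivity-client code; the strings mirror the PacketType_* proto constants used in the diff below). The stuck connections stop right after DIAL_RSP:

package main

import "fmt"

// isStuck reports whether a connection saw a DIAL_RSP but then never any DATA,
// which is the pattern described above.
func isStuck(packets []string) bool {
	sawDialRsp, sawData := false, false
	for _, p := range packets {
		switch p {
		case "DIAL_RSP":
			sawDialRsp = true
		case "DATA":
			sawData = true
		}
	}
	return sawDialRsp && !sawData
}

func main() {
	healthy := []string{"DIAL_REQ", "DIAL_RSP", "DATA", "CLOSE_REQ", "CLOSE_RSP"}
	stuck := []string{"DIAL_REQ", "DIAL_RSP"} // e.g. connectionID=505 in the proxy-server logs below
	fmt.Println(isStuck(healthy), isStuck(stuck)) // false true
}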

To reproduce this, I added some debug statements to the konnectivity-client library vendored in Kubernetes and built a local cluster.

This is the diff against the kubernetes repo used to build the local cluster:

diff --git a/hack/local-up-cluster.sh b/hack/local-up-cluster.sh
index abe85bb29fc..49b526ef080 100755
--- a/hack/local-up-cluster.sh
+++ b/hack/local-up-cluster.sh
@@ -553,7 +553,20 @@ kind: EgressSelectorConfiguration
 egressSelections:
 - name: cluster
   connection:
-    proxyProtocol: Direct
+    # This controls the protocol between the API Server and the Konnectivity
+    # server. Supported values are "GRPC" and "HTTPConnect". There is no
+    # end user visible difference between the two modes. You need to set the
+    # Konnectivity server to work in the same mode.
+    proxyProtocol: GRPC
+    transport:
+      # This controls what transport the API Server uses to communicate with the
+      # Konnectivity server. UDS is recommended if the Konnectivity server
+      # locates on the same machine as the API Server. You need to configure the
+      # Konnectivity server to listen on the same UDS socket.
+      # The other supported transport is "tcp". You will need to set up TLS
+      # config to secure the TCP transport.
+      uds:
+        udsName: /tmp/uds-proxy
 - name: controlplane
   connection:
     proxyProtocol: Direct
@@ -617,6 +630,7 @@ EOF
       --requestheader-allowed-names=system:auth-proxy \
       --proxy-client-cert-file="${CERT_DIR}/client-auth-proxy.crt" \
       --proxy-client-key-file="${CERT_DIR}/client-auth-proxy.key" \
+      --api-audiences=https://kubernetes.default.svc,system:konnectivity-server \
       --cors-allowed-origins="${API_CORS_ALLOWED_ORIGINS}" >"${APISERVER_LOG}" 2>&1 &
     APISERVER_PID=$!

diff --git a/vendor/sigs.k8s.io/apiserver-network-proxy/konnectivity-client/pkg/client/client.go b/vendor/sigs.k8s.io/apiserver-network-proxy/konnectivity-client/pkg/client/client.go
index 31bf9e11fe5..82880b2f997 100644
--- a/vendor/sigs.k8s.io/apiserver-network-proxy/konnectivity-client/pkg/client/client.go
+++ b/vendor/sigs.k8s.io/apiserver-network-proxy/konnectivity-client/pkg/client/client.go
@@ -109,6 +109,7 @@ func (t *grpcTunnel) serve(c clientConn) {
 		switch pkt.Type {
 		case client.PacketType_DIAL_RSP:
 			resp := pkt.GetDialResponse()
+			fmt.Println("DEBUG - Receive PacketType_DIAL_RSP for connection ID=", resp.ConnectID)
 			t.pendingDialLock.RLock()
 			ch, ok := t.pendingDial[resp.Random]
 			t.pendingDialLock.RUnlock()
@@ -120,7 +121,7 @@ func (t *grpcTunnel) serve(c clientConn) {
 					err:    resp.Error,
 					connid: resp.ConnectID,
 				}
-				select  {
+				select {
 				case ch <- result:
 				default:
 					klog.ErrorS(fmt.Errorf("blocked pending channel"), "Received second dial response for connection request", "connectionID", resp.ConnectID, "dialID", resp.Random)
@@ -136,6 +137,7 @@ func (t *grpcTunnel) serve(c clientConn) {

 		case client.PacketType_DATA:
 			resp := pkt.GetData()
+			fmt.Println("DEBUG - Receive PacketType_DATA for connection ID=", resp.ConnectID)
 			// TODO: flow control
 			t.connsLock.RLock()
 			conn, ok := t.conns[resp.ConnectID]
@@ -155,6 +157,7 @@ func (t *grpcTunnel) serve(c clientConn) {
 			}
 		case client.PacketType_CLOSE_RSP:
 			resp := pkt.GetCloseResponse()
+			fmt.Println("DEBUG - Receive PacketType_CLOSE_RSP for connection ID=", resp.ConnectID)
 			t.connsLock.RLock()
 			conn, ok := t.conns[resp.ConnectID]
 			t.connsLock.RUnlock()
@@ -211,7 +214,7 @@ func (t *grpcTunnel) DialContext(ctx context.Context, protocol, address string)
 	klog.V(5).Infoln("DIAL_REQ sent to proxy server")

 	c := &conn{stream: t.stream, random: random}
-
+	fmt.Println("DEBUG - Send PacketType_DIAL_REQ")
 	select {
 	case res := <-resCh:
 		if res.err != "" {
diff --git a/vendor/sigs.k8s.io/apiserver-network-proxy/konnectivity-client/pkg/client/conn.go b/vendor/sigs.k8s.io/apiserver-network-proxy/konnectivity-client/pkg/client/conn.go
index cc6e66be237..acf0ee7ce61 100644
--- a/vendor/sigs.k8s.io/apiserver-network-proxy/konnectivity-client/pkg/client/conn.go
+++ b/vendor/sigs.k8s.io/apiserver-network-proxy/konnectivity-client/pkg/client/conn.go
@@ -18,6 +18,7 @@ package client

 import (
 	"errors"
+	"fmt"
 	"io"
 	"net"
 	"time"
@@ -61,6 +62,7 @@ func (c *conn) Write(data []byte) (n int, err error) {
 	if err != nil {
 		return 0, err
 	}
+	fmt.Println("DEBUG - Send PacketType_DATA for connection ID=", c.connID)
 	return len(data), err
 }

@@ -116,6 +118,7 @@ func (c *conn) Close() error {
 	klog.V(4).Infoln("closing connection")
 	var req *client.Packet
 	if c.connID != 0 {
+		fmt.Println("DEBUG - Send PacketType_CLOSE_REQ for connection ID=", c.connID)
 		req = &client.Packet{
 			Type: client.PacketType_CLOSE_REQ,
 			Payload: &client.Packet_CloseRequest{

Once the local cluster is up with ./hack/local-up-cluster.sh, you can start the proxy-server and agent as local processes in separate terminals with:

make clean && make certs && make gen && make build

./bin/proxy-server --server-port=0 --uds-name=/tmp/uds-proxy --cluster-ca-cert=certs/agent/issued/ca.crt --cluster-cert=certs/agent/issued/proxy-frontend.crt --cluster-key=certs/agent/private/proxy-frontend.key --warn-on-channel-limit --v 5 --keepalive-time 1m --frontend-keepalive-time 1m  --enable-profiling 2>&1 | tee server.log

./bin/proxy-agent --ca-cert=certs/agent/issued/ca.crt --agent-cert=certs/agent/issued/proxy-agent.crt --agent-key=certs/agent/private/proxy-agent.key

Stress the cluster with concurrent kubectl logs requests.

Create some workloads:

kubectl create deployment nginx --image nginx --replicas 20

get-logs.sh script:

#!/bin/bash

# Fetch the logs of three pseudo-randomly chosen pods, discarding the output.
NAMESPACE=${NAMESPACE:-"default"}
pods=$(kubectl get pods -n ${NAMESPACE} --no-headers | awk '{print $1}')
FLOOR=1
arr=(${pods})
CEILING=${#arr[@]}

for i in {1..3}
do
   RANGE=$((CEILING - FLOOR))
   random=$RANDOM
   let "random %= $RANGE"
   random=$((random + FLOOR))
   kubectl logs -n ${NAMESPACE} ${arr[${random}]} > /dev/null 2> /dev/null
done

run.sh script:

#!/bin/bash

set -euo pipefail

for i in {1..500}
do
    ./get-logs.sh &
done

Execute the run.sh script after saving and giving execute permissions:

./run.sh

To verify the above hypothesis, I've created a small script that prints the connection IDs which are idle, i.e. for which the api-server has never sent a CLOSE_REQ and which, as a result, have never received a CLOSE_RSP from the proxy-server.

check.sh script:

#!/bin/bash

set -euo pipefail

# The script assumes connection IDs are assigned sequentially starting at 1,
# so the number of DIAL_REQ debug lines is used as the highest ID to check.
total_connection_ids=$(rg "DEBUG - Send PacketType_DIAL_REQ" /tmp/kube-apiserver.log | wc -l)
for connection_id in $(seq 1 $total_connection_ids)
do
  close_req="DEBUG - Send PacketType_CLOSE_REQ for connection ID= ${connection_id}"
  close_rsp="DEBUG - Receive PacketType_CLOSE_RSP for connection ID= ${connection_id}"

  # A non-zero status means the corresponding debug line never appeared in the log.
  req_status=0
  set +e
  rg "$close_req\$" /tmp/kube-apiserver.log > /dev/null || req_status=1

  res_status=0
  rg "$close_rsp\$" /tmp/kube-apiserver.log > /dev/null || res_status=1
  set -e

  if [[ ( $req_status == 1 && $res_status == 1 ) ]]; then
    echo "Both missing for $connection_id"
  fi

  if [[ ( $req_status == 0 && $res_status == 1 ) ]]; then
    echo "CloseResponse missing for $connection_id"
  fi

  if [[ ( $req_status == 1 && $res_status == 0 ) ]]; then
    echo "Interesting case for $connection_id"
  fi
done

We are interested in the "Both missing for $connection_id" case. All of the connection IDs printed for this case, when cross-checked against the proxy-server logs, have only the following log snippet; for example, connectionID=505:

I1215 16:36:06.122100   15067 server.go:714] "Received DIAL_RSP" dialID=4406796995779167332 agentID="f902ce27-b9b6-44a6-aef7-cfad5aff3611" connectionID=505
I1215 16:36:06.122132   15067 server.go:269] "Register frontend for agent" frontend=&{Mode:grpc Grpc:0xc000ee5f80 HTTP:<nil> CloseHTTP:<nil> connected:0xc000c58840 connectID:505 agentID:f902ce27-b9b6-44a6-aef7-cfad5aff3611 start:{wall:13863927826982877159 ext:149658304691 loc:0x212d6e0} backend:0xc000616108} agentID="f902ce27-b9b6-44a6-aef7-cfad5aff3611" connectionID=505

You can also verify whether PacketType_DATA was sent by the apiserver for the same connection ID:

grep "DEBUG - Send PacketType_DATA for connection ID=" /tmp/kube-apiserver.log | grep 505

The problematic aspect of this is that, in order to close/terminate these idle connections, we need the connectionID, which is only obtained once the DIAL_RSP sent by the agent is received by the proxy-server and once PacketType_DATA is sent by the apiserver to the proxy-server.
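
For reference, the close path is keyed entirely by that connection ID. Here is a sketch of the full CLOSE_REQ packet, completing the truncated conn.Close() snippet from the diff above (field names as in the konnectivity-client proto):

	req = &client.Packet{
		Type: client.PacketType_CLOSE_REQ,
		Payload: &client.Packet_CloseRequest{
			CloseRequest: &client.CloseRequest{
				ConnectID: c.connID, // without a valid connID there is nothing to address the close to
			},
		},
	}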

/cc @iaguis @rata

ipochi added a commit to kinvolk/apiserver-network-proxy that referenced this issue Dec 15, 2021
This commit builds on top of the previous commit.

It is noticed that the memory consumption of the Konnectivity server increases when idle connections pile up. The issue is described in more detail in kubernetes-sigs#311.

In order to mitigate the issue, we create a separate goroutine that tracks all the connections per stream, and if there is no activity in the stream after a specified timeout (configured via --grpc-max-idle-time), we close the stream, thereby releasing the resources.

The idea behind this approach is that in a successful connection lifecycle the apiserver sends a CLOSE_REQ and the proxy-server responds with a CLOSE_RSP, but for idle connections this never happens.

In order to simulate the behaviour of the apiserver sending the CLOSE_REQ, the proxy-server fires up a goroutine that tracks inactive connections and, upon finding one, itself sends a CLOSE_REQ to the same channel on which it listens for packets from the api-server.

Upon receiving the simulated CLOSE_REQ, the proxy-server follows the code path of the successful lifecycle of the packet, i.e. it forwards the CLOSE_REQ to the agent, the agent responds with a CLOSE_RSP, and the proxy-server forwards the CLOSE_RSP back to the apiserver.

There are also connections which are stuck because they never receive PacketType_DATA; for those connections, we close the stream once the timeout occurs.

This also fixes the issue of leaking file descriptors in kubernetes-sigs#276

Signed-off-by: Imran Pochi <imran@kinvolk.io>
ipochi added a commit to kinvolk/apiserver-network-proxy that referenced this issue Dec 23, 2021
It is noticed that the memory consumption of the Konnectivity server increases when idle connections pile up. The issue is described in more detail in kubernetes-sigs#311.

In order to mitigate the issue, we create a separate goroutine that tracks all the connections per stream, and if there is no activity in the stream after a specified timeout (configured via --grpc-max-idle-time), we close the stream, thereby releasing the resources.

The idea behind this approach is that in a successful connection lifecycle the apiserver sends a CLOSE_REQ and the proxy-server responds with a CLOSE_RSP, but for idle connections this never happens.

In order to simulate the behaviour of the apiserver sending the CLOSE_REQ, the proxy-server fires up a goroutine that tracks inactive connections and, upon finding one, itself sends a CLOSE_REQ to the same channel on which it listens for packets from the api-server.

There are also connections which are stuck because they never receive PacketType_DATA; for those connections, we close the stream once the timeout occurs.

This also fixes the issue of leaking file descriptors in kubernetes-sigs#276

Signed-off-by: Imran Pochi <imran@kinvolk.io>
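
A rough sketch of the approach described in the commit messages above (hypothetical names and structure, not the actual PR code), using the konnectivity-client proto types; maxIdle corresponds to the --grpc-max-idle-time flag mentioned above:

package proxyserver // hypothetical placement, for illustration only

import (
	"sync"
	"time"

	"sigs.k8s.io/apiserver-network-proxy/konnectivity-client/proto/client"
)

// reapIdleConnections periodically scans lastActive (connection ID -> time the
// last packet was seen) and, for anything idle longer than maxIdle, injects a
// synthetic CLOSE_REQ into the channel the proxy-server already reads frontend
// packets from, so it follows the normal CLOSE_REQ -> agent -> CLOSE_RSP path.
func reapIdleConnections(maxIdle time.Duration, mu *sync.Mutex,
	lastActive map[int64]time.Time, frontendPackets chan<- *client.Packet) {
	ticker := time.NewTicker(maxIdle / 2)
	defer ticker.Stop()
	for range ticker.C {
		mu.Lock()
		for connID, seen := range lastActive {
			if time.Since(seen) > maxIdle {
				frontendPackets <- &client.Packet{
					Type: client.PacketType_CLOSE_REQ,
					Payload: &client.Packet_CloseRequest{
						CloseRequest: &client.CloseRequest{ConnectID: connID},
					},
				}
				delete(lastActive, connID)
			}
		}
		mu.Unlock()
	}
}

A real implementation would avoid sending on the channel while holding the lock and would tie the reaper's lifetime to its stream, but the flow is the one described: the proxy-server fakes the CLOSE_REQ that the apiserver never sent.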
@rata
Contributor

rata commented Jan 18, 2022

@ipochi is it possible that you tested this without #290? I think that might be the case, and that is why it seems the DIAL_RSP was received even though it was an invalid one. Let me elaborate.

The diff of the k8s code you show doesn't include the return added in PR #290 and, if you look carefully at this part of the diff you pasted:

 		case client.PacketType_DIAL_RSP:
 			resp := pkt.GetDialResponse()
+			fmt.Println("DEBUG - Receive PacketType_DIAL_RSP for connection ID=", resp.ConnectID)
 			t.pendingDialLock.RLock()
 			ch, ok := t.pendingDial[resp.Random]
 			t.pendingDialLock.RUnlock()

You are printing that you received the response before checking whether it is a valid response. See the rest of the context here: https://github.com/kubernetes-sigs/apiserver-network-proxy/pull/290/files#diff-a37bb159e5baec4208de32e78a3905ecef6282ceaf095115842e4e3738d96310L109-R118. Just after that line, we check whether the dialResp is recognized.

That makes it print that we received a dialResp even though it was not recognized and was ignored. Therefore, no PacketType_DATA will be sent and, going only by those prints, it will appear that the DialResp was received even though it was not valid.

So, if my analysis is correct, it is entirely possible for you to see this without patch #290. If you apply it, you should probably not see this issue.
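
In other words, the debug print only means something if it runs after the pendingDial lookup succeeds. A fragment in the style of the snippet above (sketch only; see #290 for the actual change, which returns when the response is not recognized):

		case client.PacketType_DIAL_RSP:
			resp := pkt.GetDialResponse()
			t.pendingDialLock.RLock()
			ch, ok := t.pendingDial[resp.Random]
			t.pendingDialLock.RUnlock()
			if !ok {
				// Unrecognized dialID: the response is dropped, so counting it
				// as "received" overstates how many dials actually succeeded.
				return
			}
			fmt.Println("DEBUG - Receive PacketType_DIAL_RSP for connection ID=", resp.ConnectID)
			// ... deliver the dial result on ch as in the original code ...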

@ipochi can you retest with #290 applied?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 18, 2022
@andrewsykim
Member

@ipochi any chance we can test this again with the recent fixes that merged?

@ipochi
Contributor Author

ipochi commented May 5, 2022

Thanks @andrewsykim for the fixes. I'll be looking into it sometime next week and circle back on results.

@ipochi
Contributor Author

ipochi commented May 5, 2022

UPDATE:

I still see many idle connections stuck after establishing a connection.

I followed the reproduction steps above on the v0.0.30 tag.

In all, 1412 dial requests were sent, as identified by:

rg "DEBUG - Send PacketType_DIAL_REQ" /tmp/kube-apiserver.log | wc
1412

Of those, 90 were missing close responses, and none of those 90 connections had sent PacketType_DATA:

./check.sh
Both missing for 42
Both missing for 45
Both missing for 46
Both missing for 95
Both missing for 96
Both missing for 99
Both missing for 103
Both missing for 132
Both missing for 154
Both missing for 160
Both missing for 243
Both missing for 269
Both missing for 273
Both missing for 275
Both missing for 281
Both missing for 283
Both missing for 284
Both missing for 285
Both missing for 287
Both missing for 288
Both missing for 290
Both missing for 292
Both missing for 294
Both missing for 295
Both missing for 297
Both missing for 298
Both missing for 299
Both missing for 300
Both missing for 301
Both missing for 302
Both missing for 305
Both missing for 307
Both missing for 308
Both missing for 309
Both missing for 312
Both missing for 320
Both missing for 321
Both missing for 323
Both missing for 349
Both missing for 350
Both missing for 351
Both missing for 358
Both missing for 360
Both missing for 362
Both missing for 363
Both missing for 365
Both missing for 368
Both missing for 369
Both missing for 373
Both missing for 374
Both missing for 380
Both missing for 398
Both missing for 400
Both missing for 469
Both missing for 470
Both missing for 494
Both missing for 517
Both missing for 519
Both missing for 525
Both missing for 529
Both missing for 558
Both missing for 581
Both missing for 598
Both missing for 621
Both missing for 623
Both missing for 718
Both missing for 777
Both missing for 788
Both missing for 802
Both missing for 809
Both missing for 828
Both missing for 836
Both missing for 841
Both missing for 845
Both missing for 849
Both missing for 856
Both missing for 863
Both missing for 865
Both missing for 866
Both missing for 867
Both missing for 869
Both missing for 870
Both missing for 873
Both missing for 877
Both missing for 878
Both missing for 880
Both missing for 881
Both missing for 1188
Both missing for 1330
Both missing for 1331
Both missing for 1390
Both missing for 1391
./check.sh | wc -l
90

@rata
Contributor

rata commented May 5, 2022

@ipochi sorry, but which Kubernetes version are you using? That is a key thing, as the bug fixes are in the konnectivity-client files vendored into k8s. It would be good to make sure the k8s version you are using contains all the ANP fixes too.

@ipochi
Contributor Author

ipochi commented May 5, 2022

@rata latest master that has the v0.0.30 vendored konnectivity-client.

k8s version - master branch which includes the v0.0.30 konnectivity-client
apiserver-network-proxy version - v0.0.30 tag

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 4, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
