Client waits forever if kube-apiserver restarts in an SLB environment #107266
Comments
/sig api-machinery
This has been fixed in recent versions, which use HTTP/2. By default the connection has an idle timeout of 30s that detects a stale connection and restarts it, which solves this problem.
I think it went in with 1.19: #87615
@aojea Thank you for the reply. But I cannot find the log "use of closed network connection" on the client. In my case the client didn't receive any events, including EOF or
@aojea I tried updating golang from 1.14 to 1.16.12 and built a new version, but the issue still exists. The update does not fix this problem.
It is not a golang version update; you need Kubernetes 1.19 or greater.
@aojea This is my own controller, and it uses client-go. I updated golang to 1.16.12 and client-go to the newest version, and I am using the newest "golang.org/x/net/http2" package, but it still does not work. It looks like HTTP/2 and the health check have been enabled.
How can I reproduce it? Please be specific :)
@aojea This could be a special condition. In our environment we use an SLB (gateway) between kube-apiserver and the client. I am not sure about the specific technical details of the SLB, but it produces two TCP connections: kube-apiserver <-> slb and slb <-> client. From what I can see, the SLB holds the long-lived connection on the client side open when the kube-apiserver pod is deleted. The client-side connection stays active and healthy, but the controller cannot watch any events because the server has moved to another IP. We are trying to add arguments to the SLB so that it automatically closes the client connection.
Adding a client idle timeout on the SLB resolved this. The client now receives EOF and rebuilds the watcher.
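For readers unfamiliar with the pattern, the following is a minimal sketch of "rebuild the watcher on EOF", using plain channels from the standard library. It is not the poster's log or controller code. `dialFunc` and `watchLoop` are hypothetical names; a closed channel stands in for the EOF the client sees once the SLB closes the idle connection (in client-go, the analogous signal is the channel returned by `watch.Interface.ResultChan()` being closed).

```go
package main

import "fmt"

// dialFunc is a hypothetical stand-in for "open a new watch". The returned
// channel is closed when the underlying connection ends (the EOF case).
type dialFunc func() (<-chan string, error)

// watchLoop opens a watch, hands every event to handle, and opens a new
// watch each time the current one ends. maxDials bounds the loop so the
// sketch terminates; a real controller would loop until its context is
// cancelled and would back off between attempts.
func watchLoop(dial dialFunc, handle func(string), maxDials int) error {
	for i := 0; i < maxDials; i++ {
		events, err := dial()
		if err != nil {
			return fmt.Errorf("dial attempt %d: %w", i+1, err)
		}
		for ev := range events {
			handle(ev)
		}
		// The channel was closed: the stream ended, so rebuild the watch.
	}
	return nil
}

func main() {
	// Two simulated connections: the first ends after one event,
	// the second is what the rebuilt watcher receives.
	streams := [][]string{
		{"added pod-a"},
		{"modified pod-a", "deleted pod-a"},
	}
	next := 0
	dial := func() (<-chan string, error) {
		ch := make(chan string, len(streams[next]))
		for _, ev := range streams[next] {
			ch <- ev
		}
		close(ch) // simulate EOF after the buffered events are delivered
		next++
		return ch, nil
	}

	if err := watchLoop(dial, func(ev string) { fmt.Println("event:", ev) }, len(streams)); err != nil {
		fmt.Println("watch failed:", err)
	}
}
```

The loop only helps if the client is actually told that the stream ended. In the situation reported here the SLB kept the client-side connection open, so the channel never closed until the idle timeout was added.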
Yeah, that should be added by default: kubernetes/staging/src/k8s.io/apimachinery/pkg/util/net/http.go, lines 180 to 189 in 2af53e9
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle rotten
/close
@k8s-triage-robot: Closing this issue.
What happened?
We found that when an SLB sits between kube-apiserver and the client and kube-apiserver restarts, the client's watch connection is not closed and the client waits for an event forever. In this setup there are two connections: kube-apiserver <-> slb and slb <-> client. Closing the first connection does not guarantee that the second one is also closed if the SLB does not handle this long-lived connection case.
What did you expect to happen?
The client rebuilds the watcher and receives the new events.
How can we reproduce it (as minimally and precisely as possible)?
Restart the apiserver, patch a new resource, and watch the client log: nothing appears.
Anything else we need to know?
No response
Kubernetes version
v1.18.8
Cloud provider
no
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)