Watch does not fail on complete loss of connectivity #8980
Comments
This is already working (since v3.2.10+, master branch, pre-v3.3.0). You are running a single node, and the watch client is expected to stay stuck on that single node, retrying, even with keepalive settings. As of today (pre-v3.3.0, gRPC v1.7.4), the retry logic is all manual: just try again on transient failure. This was partly because gRPC keepalive does not expose connection state (seems resolved; related grpc/grpc-go#1444). In the future, we may improve this for the single-node case.
Sorry, but it's not clear to me what you're saying. Is it intended behaviour that the broken watch just sits there doing nothing? I downloaded v3.2.11, which exhibits the same problematic behaviour with my command and does not support the keepalive flags in your command. I also do not understand what you mean by that. To be clear, the behaviour I expect to see at some point is: given the correct connection timeout options, a watch that gets disconnected will fail and return to the caller.
A temporary connection issue should not fail the watcher unless you explicitly specify a timeout (but the timeout will also fail long-running watchers, so probably nobody would want to do that) or a keepalive mechanism to kill it.
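For illustration, a watch with an explicit context timeout might look roughly like this. This is a minimal sketch, not an endorsed pattern: the endpoint, key, and durations are placeholders, and the import path assumes a recent etcd client.

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"}, // placeholder endpoint
		DialTimeout: 2 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// The watch ends (its channel closes) when this deadline expires,
	// whether or not the connection is healthy.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	for wresp := range cli.Watch(ctx, "foo", clientv3.WithPrefix()) {
		if err := wresp.Err(); err != nil {
			fmt.Println("watch error:", err)
			return
		}
		for _, ev := range wresp.Events {
			fmt.Printf("%s %q -> %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
	fmt.Println("watch channel closed (deadline exceeded)")
}
```

As the comment above notes, the deadline fires whether or not the connection is healthy, which is why this is rarely the right tool for long-lived watches.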
You should give the etcd client all available endpoints, and the client is expected to reconnect by itself. If you want to control reconnection yourself, you basically need to build your own balancer, which is a non-trivial amount of effort to get right.
This is exactly what I'm looking for, and can't find. Trying your etcdctl watch command with all the timeouts, the command does not terminate when I SIGSTOP etcd. Edit: this is with etcd/etcdctl built from master.
I do not want to control reconnection. I want my own service's requests to fail when the service is completely unable to reach etcd, as the service is unable to operate correctly.
Hi - any updates? As far as I can tell this is still unresolved, despite the rather premature closing of the issue.
@devnev Sorry! I was unclear. And we need better docs around this.
It's retrying the single endpoint, since your command only runs a single-node cluster. Again, if you want balancer failover, provide the other endpoints in the client's endpoint list.
This is expected. The watch command won't exit on connection loss or server shutdowns. Keepalive in watch was added to detect connection failures and trigger endpoint switches, not to error out. We will document this clearly in the v3.3 release.
On the client side, there's no way to tell whether a failure is transient or permanent. That's why the client retries among the available endpoints, in case the server comes back. When the etcd server goes down, the gRPC client only sees a transient error. Again, watch is not really meant for detecting disconnects. If you want to detect connection loss, I would try a session (lease keepalive) instead.
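A rough sketch of the session-based workaround mentioned here: the session's lease keep-alive stops when the client loses connectivity, and its Done() channel closes, so the caller can tear down the watch instead of hanging. The TTL and package import paths are assumptions (recent client versions; older releases use the github.com/coreos/etcd paths).

```go
package watchutil

import (
	"context"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

// watchWithSession runs a prefix watch alongside a lease-backed session.
// When the client loses connectivity, the session's keep-alive stops and
// Done() closes, so the function returns an error instead of blocking forever.
func watchWithSession(cli *clientv3.Client, prefix string) error {
	sess, err := concurrency.NewSession(cli, concurrency.WithTTL(5)) // 5s TTL, placeholder
	if err != nil {
		return err
	}
	defer sess.Close()

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	wch := cli.Watch(clientv3.WithRequireLeader(ctx), prefix, clientv3.WithPrefix())
	for {
		select {
		case <-sess.Done():
			return fmt.Errorf("lost connectivity to etcd: session expired")
		case wresp, ok := <-wch:
			if !ok {
				return fmt.Errorf("watch channel closed")
			}
			if err := wresp.Err(); err != nil {
				return err
			}
			for _, ev := range wresp.Events {
				fmt.Printf("%s %q -> %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
			}
		}
	}
}
```

The session TTL (5 seconds here) bounds how long a total connection loss can go unnoticed, at the cost of extra keep-alive traffic per watcher.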
Thank you for the clarification and the workaround. As a final note, even with multiple etcds, there can be total connection loss to the etcd cluster (in our case this was caused by an as-yet-undiagnosed routing issue affecting only specific Kubernetes pods). One of the benefits of watches I was expecting was to receive updates quickly, making the system more reactive. If a watch gets completely disconnected, the system loses this property. I am glad to hear there's a workaround using sessions, but I am now going to have to wrap every single one of my watches in a session to ensure the reactivity of the system.
Agree. We will address this in the next release cycle (hopefully v3.4). |
For an etcd Watch with context, as long as the context hasn't been canceled or timed out, the Watch will retry on recoverable errors forever until connected. This includes Watches during significant etcd connection loss, which can result in missed events. To upper bound the effects of such missed events, use a periodic full refresh of the full tree to interrupt every Watch, preventing any long-lived Watch. Currently the periodic full tree refresh is 10 minutes. Since this affects all Watches, we'll have to keep an eye on etcd performance and configure the period accordingly. See etcd-io/etcd#8980. Also: make sure to return all etcd errors, and update the dep lock.
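One possible shape of the periodic-full-refresh strategy described in that commit message, as a hedged sketch: the prefix, interval, and callback names are made up for illustration, and the import paths assume a recent etcd client.

```go
package watchrefresh

import (
	"context"
	"time"

	"go.etcd.io/etcd/api/v3/mvccpb"
	clientv3 "go.etcd.io/etcd/client/v3"
)

// watchWithRefresh interrupts every watch after refreshPeriod, re-reads the
// full tree, and resumes watching from the revision of that read, so no
// single watch lives longer than refreshPeriod.
func watchWithRefresh(cli *clientv3.Client, prefix string, refreshPeriod time.Duration,
	onFull func(kvs []*mvccpb.KeyValue), onEvent func(ev *clientv3.Event)) error {
	for {
		// Full re-read of the tree; its header revision is the resume point.
		resp, err := cli.Get(context.Background(), prefix, clientv3.WithPrefix())
		if err != nil {
			return err
		}
		onFull(resp.Kvs)

		// Watch from just after that revision, but never longer than refreshPeriod.
		ctx, cancel := context.WithTimeout(context.Background(), refreshPeriod)
		for wresp := range cli.Watch(ctx, prefix, clientv3.WithPrefix(),
			clientv3.WithRev(resp.Header.Revision+1)) {
			if err := wresp.Err(); err != nil {
				cancel()
				return err
			}
			for _, ev := range wresp.Events {
				onEvent(ev)
			}
		}
		cancel() // deadline hit; loop around for another full refresh
	}
}
```

A 10-minute refreshPeriod matches the interval mentioned above; shortening it reduces how long missed events can go unnoticed, at the cost of more full reads against the cluster.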
Thank you for this useful discussion here. I am seeing the same behavior when simulating partitions (e.g. cut the [virtual] network connection or use [...]).

My observation is only correct when the client itself is partitioned from the majority, i.e. it cannot switch (balance) to another endpoint. This could happen when the client runs on the same etcd instance/node or in the same failure domain/zone. This is in line with what is discussed here and an open issue according to this TODO in the client:

```go
// TODO(v3.4): configure watch retry policy, limit maximum retry number
// (see https://github.com/coreos/etcd/issues/8980)
```

I see the same behavior with the following client setup:

```go
cli, err := clientv3.New(clientv3.Config{
	Endpoints:            []string{"etcd1:2379", "etcd2:2379", "etcd3:2379"},
	DialTimeout:          2 * time.Second,
	DialKeepAliveTime:    5 * time.Second,
	DialKeepAliveTimeout: 2 * time.Second,
	DialOptions:          []grpc.DialOption{grpc.WithBlock()},
})
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
leaderCtx := clientv3.WithRequireLeader(ctx)
rch := cli.Watch(leaderCtx, "", clientv3.WithPrefix())
[...]
```

It would be great to update the docs for this behavior. For more details please also see kubernetes/kubernetes#89488
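For completeness, a hedged sketch of how the watch channel (`rch`) from the snippet above might be consumed so that quorum loss surfaces to the caller. The rpctypes import path and the exact error comparison are assumptions and may differ between client versions.

```go
package watchutil

import (
	"errors"
	"fmt"

	"go.etcd.io/etcd/api/v3/v3rpc/rpctypes" // assumed path; older releases ship it elsewhere
	clientv3 "go.etcd.io/etcd/client/v3"
)

// consumeWatch drains a WatchChan created with clientv3.WithRequireLeader
// (like rch above) and returns as soon as the watch is cancelled,
// e.g. with "etcdserver: no leader" when the cluster loses quorum.
func consumeWatch(rch clientv3.WatchChan) error {
	for wresp := range rch {
		if err := wresp.Err(); err != nil {
			// The sentinel comparison may need adjusting depending on client version.
			if errors.Is(err, rpctypes.ErrNoLeader) {
				return fmt.Errorf("watch cancelled, no leader: %w", err)
			}
			return fmt.Errorf("watch error: %w", err)
		}
		for _, ev := range wresp.Events {
			fmt.Printf("%s %q -> %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
	return fmt.Errorf("watch channel closed")
}
```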
Watches against etcd in the API server can hang forever if the etcd cluster loses quorum, e.g. the majority of nodes crashes. This fix improves responsiveness (detection and reaction time) of API server watches against etcd in some rare (but still possible) edge cases so that watches are terminated with `"etcdserver: no leader" (ErrNoLeader)`. Implementation behavior described by jingyih:

```
The etcd server waits until it cannot find a leader for 3 election timeouts to
cancel existing streams. 3 is currently a hard-coded constant. The election
timeout defaults to 1000ms. If the cluster is healthy, when the leader is
stopped, the leadership transfer should be smooth (the leader transfers its
leadership before stopping). If the leader is hard-killed, the other servers
will take an election timeout to realize the leader is lost and start a campaign.
```

For further details, discussion and validation see kubernetes/kubernetes#89488 (comment) and etcd-io/etcd#8980.

Closes: kubernetes/kubernetes#89488

Signed-off-by: Michael Gasch <mgasch@vmware.com>

Kubernetes-commit: 70c9f770d7aa2194bfd3f58fe01756a7d200b866
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
Somehow I was excavating this old issue; the symptom described by @embano1 was not reproducible, at least on the main branch. I have tested both manually and e2e by injecting iptables rules across three etcd hosts, with a client running both remotely and locally (pinned to the server node the watch connects to). Not to mention etcd has TestWatchWithRequireLeader and TestBalancerUnderNetworkPartitionWatchFollower to ensure the watch will be cancelled with `etcdserver: no leader`. Implementation: etcd/server/etcdserver/api/v3rpc/interceptor.go, lines 336 to 346 at 9e1abba.
However, the original issue is not resolved. If a single node is abruptly stopped, the client watch will hang forever.
It looks promising and optional for users to enable.
Steps to reproduce:

1. Start a single etcd node: `etcd`
2. Start a watch: `ETCDCTL_API=3 etcdctl watch --prefix ""`
3. Cut connectivity to the node completely (e.g. stop or SIGSTOP the etcd process).
Expected outcome:
After some timeout, the watch will fail. This would allow the caller to propagate an error further up the stack, telling clients to retry elsewhere where there might be a healthy connection.
Observed outcome:
The watch blocks forever.
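One way a caller could approximate the expected behavior above today, sketched under assumptions (the probe key, intervals, and function name are made up for illustration): pair the watch with a periodic liveness probe against the cluster and cancel the watch context when the probe fails.

```go
package watchutil

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// watchOrFail watches a prefix but aborts when a periodic probe read against
// etcd stops succeeding, so total connectivity loss surfaces as an error.
func watchOrFail(cli *clientv3.Client, prefix string) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	probeErr := make(chan error, 1)
	go func() {
		ticker := time.NewTicker(5 * time.Second)
		defer ticker.Stop()
		for range ticker.C {
			// Cheap read with a short deadline; it fails when etcd is unreachable.
			pctx, pcancel := context.WithTimeout(ctx, 2*time.Second)
			_, err := cli.Get(pctx, "connectivity-probe") // placeholder key
			pcancel()
			if err != nil {
				probeErr <- err
				cancel() // tears down the watch below
				return
			}
		}
	}()

	for wresp := range cli.Watch(ctx, prefix, clientv3.WithPrefix()) {
		if err := wresp.Err(); err != nil {
			return err
		}
		// handle wresp.Events ...
	}
	select {
	case err := <-probeErr:
		return fmt.Errorf("watch aborted, etcd unreachable: %w", err)
	default:
		return ctx.Err()
	}
}
```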