
Watcher.Watch() hangs when some endpoints are not available #7247

Closed
cw9 opened this issue Jan 28, 2017 · 10 comments

@cw9

cw9 commented Jan 28, 2017

Hi, I've experienced a few times that Watcher.Watch() hangs forever. The hang happens on the following code:

watcher.Watch(
	context.Background(),
	key,
	clientv3.WithProgressNotify(),
	clientv3.WithCreatedNotify(),
)

I'm using the release-3.1 version of the client. Is this behavior expected, and what is the reason for the hang? The key being watched already exists, but I don't think that matters?

My current workaround plan is to add a timeout and retry to this call, roughly as sketched below; let me know if you have any concerns with this approach.
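
To be concrete, this is the kind of helper I have in mind (openWatch and the 10-second timeout are illustrative only, not part of clientv3):

import (
	"context"
	"time"

	"github.com/coreos/etcd/clientv3"
)

// openWatch retries the Watch() call until it returns a channel within a
// timeout, cancelling and abandoning any call that hangs.
func openWatch(cli *clientv3.Client, key string) (clientv3.WatchChan, context.CancelFunc) {
	for {
		ctx, cancel := context.WithCancel(context.Background())
		done := make(chan clientv3.WatchChan, 1)
		go func() {
			// With WithCreatedNotify, this call can block until the watch
			// is actually established on the server.
			done <- cli.Watch(ctx, key,
				clientv3.WithProgressNotify(),
				clientv3.WithCreatedNotify())
		}()
		select {
		case wch := <-done:
			return wch, cancel
		case <-time.After(10 * time.Second):
			cancel() // give up on the hung call and retry
		}
	}
}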

@heyitsanthony
Contributor

It will block until the key is modified or until 10 minutes have elapsed for a progress notification.
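
For reference, consuming the returned channel looks roughly like this (wch is assumed to be the channel returned by the Watch call above, and handleEvent is a hypothetical handler):

for wresp := range wch {
	if wresp.IsProgressNotify() {
		// periodic progress notification, not a key modification
		continue
	}
	for _, ev := range wresp.Events {
		handleEvent(ev) // modifications to the watched key arrive here
	}
}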

@cw9
Author

cw9 commented Jan 28, 2017

Sorry, to be more clear: it's not the returned channel that is blocking; it's this Watch() call that never returns a watch channel and just hangs here, because I used context.Background().

I dug around a bit more and seem to have found something related. I was running a cluster of 5 etcd nodes, etcd[1-5], but my etcd1 was unhealthy:

etcdctl cluster-health
cluster may be unhealthy: failed to list members
Error:  client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout
; error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused

error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout
error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused

on the rest of the cluster:

etcdctl cluster-health
member 1ccd4d9efffe92ee is healthy: got healthy result from etcd2:2379
member 42b142598e45c3df is healthy: got healthy result from etcd3:2379
member c41764c24eb810c9 is healthy: got healthy result from etcd4:2379
member e5472c006de57d1f is healthy: got healthy result from etcd5:2379
cluster is healthy

So when I create the etcd client with etcd[2-5], Watch() works just fine, but when I create the client with only etcd1 in the endpoint list, Watch() just hangs.

When I create the client with endpoints etcd[1-5], the client randomly talks to one of the endpoints, so when it connects to etcd[2-5] it is fine, but when it connects to etcd1 it hangs. My question is: why does the simpleBalancer in the etcd client not try to contact the other instances? Does this mean the watch connection is sticky?

@cw9 cw9 changed the title Is Watcher.Watch() Blocking? Watcher.Watch() hangs when some endpoints are not available Jan 28, 2017
@heyitsanthony
Contributor

@cw9 OK, something may be wrong with the etcd1 member. Should the client endpoint be 127.0.0.1?

The balancer will pin an address if it can open a connection. If requests time out, it won't know about that; it will keep issuing requests on that endpoint so long as the connection is up. This isn't necessarily a problem isolated to watches. It seems like 127.0.0.1:2379 is accepting connections, then doing nothing. For example, this "hangs" in a similar manner:

$ nc -l -p 2379 &
$ ETCDCTL_API=3 etcdctl watch abc

The fix would probably involve some kind of endpoint poisoning in the balancer so the client can abandon malfunctioning nodes.

@xiang90
Contributor

xiang90 commented Jan 28, 2017

@cw9 @heyitsanthony

If you give etcd clientv3 a blackhole endpoint, it will hang on Watch, Put, or any other request that has no timeout. Watch is especially important here, since in most cases you do not really want to put a timeout on it.

An off-channel endpoint health-checking mechanism is required to break out of the RPC waiting, I assume. See https://github.com/grpc/grpc/blob/master/doc/health-checking.md.

Not sure if gRPC-go already supports this or not.
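
Roughly, the kind of out-of-band probe I have in mind (assuming the endpoint exposes the standard grpc.health.v1.Health service, which may not be the case today; the dial options and timeout are illustrative):

import (
	"context"
	"time"

	"google.golang.org/grpc"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

// endpointHealthy probes a single endpoint out of band so a client could
// skip blackholed members instead of blocking on RPCs to them.
func endpointHealthy(endpoint string) bool {
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	conn, err := grpc.DialContext(ctx, endpoint, grpc.WithInsecure(), grpc.WithBlock())
	if err != nil {
		return false
	}
	defer conn.Close()

	resp, err := healthpb.NewHealthClient(conn).Check(ctx,
		&healthpb.HealthCheckRequest{Service: ""})
	return err == nil && resp.Status == healthpb.HealthCheckResponse_SERVING
}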

@cw9
Author

cw9 commented Jan 30, 2017

Thanks for the explanation. I'll probably do something on my end to prevent this sort of blackholing from happening. Besides that, I'd like to confirm two things:

1. When there is a network partition, let's say the 5 nodes split into 3+2: if the client can still talk to any of the etcd boxes, will it be able to get watch updates that happened in the bigger partition if the original watch channel was connected to an etcd instance that is now in the smaller partition? I assume Get() calls will still succeed if they get sent to one of the boxes in the bigger partition?

2. If the client is connected to proxy etcd instances and one of the proxy instances loses its connection to the main etcd cluster, will the watch channel established with this box catch up / retry other proxy boxes?

@xiang90
Contributor

xiang90 commented Jan 30, 2017

When there is a network partition, let's say the 5 nodes split into 3+2: if the client can still talk to any of the etcd boxes, will it be able to get watch updates that happened in the bigger partition if the original watch channel was connected to an etcd instance that is now in the smaller partition?
I assume Get() calls will still succeed if they get sent to one of the boxes in the bigger partition?

By default, watchers connected to the minority will hang there until the network partition recovers. However, you can provide the requireLeader option (https://github.com/coreos/etcd/blob/master/clientv3/client.go#L306) to the watchers to ensure they are aborted, and then switched to the majority, when a partition happens.

You probably want to try this feature yourself to understand how it works. Let us know if you see any issues with it.
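
A minimal sketch of using it with a watch (assuming an existing *clientv3.Client named cli; handleEvent is a hypothetical handler and error handling is simplified):

import (
	"context"

	"github.com/coreos/etcd/clientv3"
)

func watchWithLeader(cli *clientv3.Client, key string) {
	// WithRequireLeader makes the watch fail instead of hanging when the
	// connected member loses its leader (e.g. it sits in the minority partition).
	ctx := clientv3.WithRequireLeader(context.Background())
	for wresp := range cli.Watch(ctx, key) {
		if err := wresp.Err(); err != nil {
			// e.g. "etcdserver: no leader"; recreate the watch so the
			// client can switch to a healthy member.
			return
		}
		for _, ev := range wresp.Events {
			handleEvent(ev)
		}
	}
}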

I assume Get() calls will still succeed if they get sent to one of the boxes in the bigger partition?

Correct. If you enable serializable reads, local reads are allowed.
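
For example (cli is an assumed *clientv3.Client):

// A serializable read is served locally by the connected member instead of
// going through quorum, at the cost of possibly stale data.
resp, err := cli.Get(context.TODO(), "foo", clientv3.WithSerializable())
if err == nil && len(resp.Kvs) > 0 {
	value := resp.Kvs[0].Value // latest value seen by this member
	_ = value
}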

will the watch channel established with this box catch up / retry other proxy boxes?

It should. But the gRPC proxy is an alpha feature; I have not tried it personally.

@heyitsanthony heyitsanthony added this to the unplanned milestone Jan 31, 2017
@cw9
Author

cw9 commented Jan 31, 2017

However, you can provide the requireLeader option (https://github.com/coreos/etcd/blob/master/clientv3/client.go#L306) to the watchers to ensure they are aborted, and then switched to the majority, when a partition happens.

Cool, this looks interesting, I'll definitely try it out.

It should. But the gRPC proxy is an alpha feature, I have not tried it personally.

Got it. I'm not using the proxy feature either; I'll make sure to check before onboarding to that.

@xiang90
Contributor

xiang90 commented Oct 4, 2017

@gyuho is this fixed on master?

@gyuho
Contributor

gyuho commented Oct 4, 2017

Closing via #8545.

@gyuho gyuho closed this as completed Oct 4, 2017
@xiang90
Contributor

xiang90 commented Oct 4, 2017

@cw9 please give it a try with current master + the current master client. Thank you!

adityadani added a commit to portworx/kvdb that referenced this issue Nov 24, 2018
1. Send an ErrWatchStopped to the caller only once.
- Currently ErrWatchStopped gets sent to the caller multiple times, causing a resubscribing watch to fail as well.

2. Use context with leader requirement for Watch API.

- By default, etcd watchers will hang in case of a network partition if they are connected to the minority.
- As mentioned here - etcd-io/etcd#7247 (comment), setting the leader requirement for watchers allows them to switch to the majority partition.
adityadani added a commit to portworx/kvdb that referenced this issue Nov 26, 2018