Watch endpoint should have a timeout option #2468
Hi, I am new to etcd and am currently studying it closely.
It may be possible to add a pub/sub interface, but that would be a new feature and should probably be discussed separately.
We just built the timeout into our Java etcd client instead. When we watch, we only watch for 20 seconds max, then time out and retry, so we avoid this. Once we get events, we can remove this sort of thing.
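The client-side workaround described above (bounded watch plus retry) can be sketched as follows. This is illustrative Python, not the Java client mentioned; `do_watch` and `WatchTimeout` are hypothetical stand-ins for whatever watch call and timeout error your client library exposes:

```python
class WatchTimeout(Exception):
    """Raised when a single watch attempt elapses without an event."""


def watch_with_retry(do_watch, attempt_timeout=20.0, max_attempts=None):
    """Repeatedly issue a bounded watch until an event arrives.

    `do_watch(timeout)` stands in for the client library's watch call; it
    must either return an event or raise WatchTimeout when `timeout`
    seconds pass with no change. Because each attempt resumes from the
    same index, timing out and retrying cannot miss events.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            return do_watch(attempt_timeout)
        except WatchTimeout:
            continue  # nothing changed; reissue the watch
    raise WatchTimeout("no event after %d attempts" % attempts)
```

The key design point is that the timeout lives entirely in the client: the server still sees an ordinary watch request, and the retry loop is what bounds how long a dead connection can go unnoticed.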
We have some discussion about this internally:
But these are for the v3 API.
I think options 2 and 3 do not really solve the problem, because the etcd index is intentionally time-independent, but TCP connections are explicitly time-dependent. So in an idle cluster you may not receive any index changes for a period of time, and still observe timeouts. Why does this need to be a v3 feature? At least the simple version (adding a timeout query parameter) is not a breaking change, so it would be nice to consider it for a minor release rather than waiting all the way for v3.
They are related things. The client needs to detect the server's status, and the server also needs to detect the client's status. It might not help with your specific problem, but a lot of people want to control the timeout of watchers themselves, and they need to know when and where to restart a watch.
This is a client-side thing. It should not be handled at the server side. The etcd server detects dead clients via TCP keepalive.
cc @hchoudh
I'd like to see a timeout parameter and a dedicated "nothing happened" response such as a 204 at the end of it. The current behaviour interacts very poorly with HTTP connection pooling because the server never completes the HTTP transaction if no events occur. Since HTTP 1.x is request-response based, there's no way to time out a request without also tearing down the underlying socket. Doing that is very expensive; it wastes file descriptors at both ends and adds an additional TCP or even a TLS handshake in order to reconnect.
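A client for the proposed scheme could be sketched like this. Note the 204 convention is the commenter's proposal, not current etcd behavior, and `classify_watch_response` is a hypothetical helper:

```python
def classify_watch_response(status, body):
    """Interpret a hypothetical bounded-watch response.

    Under the proposal above, a 204 means "the timeout elapsed with no
    events": the HTTP transaction completed normally, so the pooled
    connection stays reusable and the client simply reissues the watch.
    A 200 carries an actual event.
    """
    if status == 204:
        return ("retry", None)      # nothing happened; poll again
    if status == 200:
        return ("event", body)      # a change occurred
    return ("error", body)          # anything else needs real handling
```

The point of the 204 is precisely that the connection is returned to the pool in a clean state, avoiding the teardown-and-reconnect cost described above.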
This is a good point. We are actually considering switching to stream-based communication.
Streaming is nice, but it's hard to write a correct client for it. I'd recommend doing something simple to allow connection reuse, and then adding streaming later for advanced users, via `format=stream` for example.
I've just come across another corner case that is hard to handle without a response. I want my code to be able to detect if the cluster has been replaced wholesale by a backup, for example. In that case, I'll poll against the backup server but with an index from the pre-backup server; that index will be far in the future for the new server, so I'll never get a response. Without a response, I can't check the X-Etcd-Cluster-Id header and spot that I polled against the wrong one.
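The detection logic described above could look something like the following sketch. The `X-Etcd-Cluster-Id` and `X-Etcd-Index` header names are from the etcd v2 HTTP API; the `next_wait_index` helper and its `state` dict are hypothetical client-side bookkeeping:

```python
def next_wait_index(response_headers, state):
    """Decide which waitIndex to use for the next poll.

    `state` holds client bookkeeping: {'cluster_id': ..., 'wait_index': ...}.
    If the X-Etcd-Cluster-Id in a response differs from the one we recorded,
    our saved index belongs to a different cluster (e.g. after a restore
    from backup), so we re-sync from the server's current X-Etcd-Index
    instead of polling with a stale, possibly far-future index.
    """
    cid = response_headers.get("X-Etcd-Cluster-Id")
    if state.get("cluster_id") is not None and state["cluster_id"] != cid:
        state["wait_index"] = int(response_headers["X-Etcd-Index"]) + 1
    state["cluster_id"] = cid
    return state["wait_index"]
```

This only works if the server sends *some* response, which is exactly the commenter's point: with an indefinitely hanging watch, the headers never arrive and the mismatch is undetectable.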
The current etcd implementation does not yet support a timeout option when `wait=true`: etcd-io/etcd#2468. Originally I implemented the `watch` method for the `Etcd` class in the following manner: if the leader key was updated just because the master needed to update its TTL and the watch timeout had not yet expired, I recalculated the timeout and started the `watch` call again. Usually after a "restart" we were getting urllib3.exceptions.TimeoutError. The only possible way to recover after such an exception is to close the socket and establish a new connection. With plain HTTP that's relatively cheap, but with HTTPS and some kind of authorization on the etcd side it would become rather expensive and should be avoided.
We also just ran into the "backup restore" edge case -- we had a storage outage and restored from backup, and all clients that were watching ended up in an indefinite hang, as the cluster index had moved far into the past.
@stevenschlansker Can you file a new bug? I am confused about what you mean by the "backup restore" edge case.
This blog post reports some odd behavior: http://www.projectclearwater.org/adventures-in-debugging-etcd-http-pipelining-and-file-descriptor-leaks/
|
For clients that know the max amount of time they want to wait for a watch response, they can specify `waitTimeout` in nanoseconds.

```
$ time curl 'http://127.0.0.1:2379/v2/keys/foo?wait=true&waitTimeout=2000000000'
real	0m2.008s
user	0m0.003s
sys	0m0.004s
```

Potential fix to etcd-io#2468
Adding a […]. Also, in the case described on projectclearwater.org, the connection was shut down cleanly, so using […].
I agree. So I am not convinced that we should support it now.
We tried. But a timeout at the server side breaks the connections, and the Go client library seems to have a bug and cannot handle it well; see golang/go#8946. We set TCP-level keepalive and also use […].
@bdarnell Yicheng reminded me that you might suggest setting the timeout on the context, not on the connection, so the server will finish the HTTP response. That would probably work.
Sorry, I thought you weren't using […]. I don't follow how a timeout on the server side breaks things. When the timeout fires, you should just send back an empty response, or whatever you want to use to signify to the client that the timeout elapsed without any news.
I was wrong. Sorry. See #2468 (comment).
Just a note: keep in mind that TCP keepalive timeout parameters are configured system-wide (at least on Linux), and are generally set such that the timeout happens after 10+ minutes. It's a nice thing to have as a fallback, but it may not meet the need of detecting dead connections within a reasonable time for some etcd use cases.
It can be configured per connection. (See section 4.3 of http://www.tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/)
You are right, thanks. That will work for the server to detect unreachable clients, but I doubt the client will be able to set it properly through gRPC, where the socket is quite hidden. I don't see any support in https://github.com/grpc/grpc; grpc-java has an issue open for it here: grpc/grpc-java#1648
You can turn TCP keepalives on and off per socket, but the duration of the timeout is system-wide (and usually very large -- hours in some cases). If you need to detect dropped connections faster than that, you have to use your own timer and send some data over the connection. This is tricky to do cleanly with HTTP/1; it's easier in HTTP/2 or gRPC.
@bdarnell that was my understanding as well, but on careful re-reading of that doc, there are setsockopt options to override the system-wide parameters: TCP_KEEPCNT, TCP_KEEPIDLE, TCP_KEEPINTVL.
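For reference, the per-socket override mentioned above looks like this in Python (these option names are Linux-specific, and the timer values are illustrative, not recommendations):

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive for this socket and override the system-wide
    timers for it only (Linux-specific setsockopt options).

    Probes start after `idle` seconds of silence, repeat every `interval`
    seconds, and the kernel drops the connection after `count` unanswered
    probes -- so a dead peer is detected in roughly
    idle + interval * count seconds instead of the system default.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock
```

As the thread notes, this helps a server detect unreachable clients, but a client talking through a library such as gRPC may not be able to reach the underlying socket to apply it.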
You can tune that too, although it is not very portable and is usually at minute granularity. For etcd, we do application-level keepalive on peer communication. On the client side, transport is the responsibility of gRPC. We do application-level keepalive on long-lived streams.
This is irrelevant to the v3 API, which uses a stream to support watch; clients can request periodic keepalive notifications from the etcd layer. We do not have plans to add a timeout feature as a workaround into v2 either.
Closing. v3 cancellation works well so far without a timeout option.
@xiang90 so what's the solution? Could you provide some examples of using the v3 watcher to distinguish between "no events" and a crashed server that is unable to reply?
|
Currently, the `?wait=true` option to GET (to create a watch) will wait indefinitely for the key to change. This interacts poorly with socket timeouts.

If you have any non-infinite socket timeout, a socket timeout exception can occur simply because no change happened before the timeout elapsed. You can't differentiate between this state and the server disappearing and being unable to reply.

If you have an infinite socket timeout and the remote server crashes, you may wait forever on a connection that is long dead.
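The ambiguity described above can be demonstrated with a small self-contained sketch (plain Python sockets standing in for an etcd watch; `silent_server` and `wait_for_reply` are illustrative names, not etcd APIs):

```python
import socket
import threading


def silent_server(ports, ready):
    """Accept one connection and then stay silent, like a watch on a key
    that never changes. From the client's side of the wire, this looks
    exactly the same as a server that has hung."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    ports.append(srv.getsockname()[1])
    ready.set()
    conn, _ = srv.accept()
    threading.Event().wait(5)  # hold the connection open, send nothing
    conn.close()
    srv.close()


def wait_for_reply(port, timeout=0.3):
    """Poll with a finite socket timeout. A return of 'timeout' can mean
    either 'healthy server, nothing changed yet' or 'server is dead' --
    the client cannot tell which. That is the problem this issue is about."""
    conn = socket.create_connection(("127.0.0.1", port), timeout=timeout)
    try:
        data = conn.recv(1)
        return "reply" if data else "closed"
    except socket.timeout:
        return "timeout"
    finally:
        conn.close()
```

A watch timeout parameter with a definite "nothing happened" response would let the server turn the first case into an explicit reply, leaving `socket.timeout` to mean only the second.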
Thoughts on how to improve this: