New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorrect SessionExpiredError when ZooKeeper leader process down #374
Comments
|
Said change seems ok although I'm not sure on the history of why it checks against both. |
|
Thank you for take a look! |
|
I tried to remove checking KeeperState.CONNECTIONG, but tests failed because the below test in test_client.py assumes that. Hmm. By the way, when I also remove the assertion of case of KeeperState.CONNECTING, the all of rest tests are passed. |
|
@bbangert Can you please comment on this? |
|
Offhand I can't recall why its returning expired when it isn't expired. |
|
Was there any consensus about fixing the issue? |
…n-zk#374) As discussed in python-zk#570 (comment): * Requests issued in 'CONNECTING' state get queued, and a "token" added to the pipe, as in the 'CONNECTED' state; * All test pass; * TODO: This patch does not try to limit the maximum queue length. How should that be controlled? A new 'KazooClient' parameter? If so, what should it default to?
…n-zk#374) As discussed in python-zk#570 (comment): With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a (misleading) 'SessionExpiredError'. * [TODO] This patch does not prevent requests which have been queued but not emitted from being rejected with 'ConnectionLoss'. It should also get rid of the second part of '_notify_pending', shouldn't it? * [TODO] This patch does not try to limit the maximum queue length, which should probably be controlled via a new 'KazooClient' parameter. (How about 'max_queue_length'? What should it default to?)
…n-zk#374) As discussed in python-zk#570 (comment): With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a (misleading) 'SessionExpiredError'. * [TODO] This patch does not prevent requests which have been queued but not emitted from being rejected with 'ConnectionLoss'. It should also get rid of the second part of '_notify_pending', shouldn't it? * [TODO] This patch does not try to limit the maximum queue length, which should probably be controlled via a new 'KazooClient' parameter. (How about 'max_queue_length'? What should it default to?)
…#374) As discussed in python-zk#570 (comment): With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a (misleading) 'SessionExpiredError'.
…thon-zk#374) If the connection is lost, but the state is 'CONNECTING', the client is trying to revalidate an existing session. When that happens, the requests which have been dequeued but are still pending are interrupted with a 'ConnectionLoss' exception--as we don't know if the packet reached the server, and the ACKs will never come back anyway. Without this patch, the requests which are still queued, and thus haven't been emitted are also cancelled with the same exception (see bottom half of '_notify_pending'). It seems that there is no reason to cancel such requests when the new state is 'CONNECTING', as the client is trying to validate an existing session; these "pristine" requests could very well be submitted in-order over the new connection if the session is recovered. This also seems more consistent with the first patch for issue python-zk#374: it does not matter whether a request was queued just before or during the "outage"--as long as it was not emitted. This patch implements that feature. ATTN: The patch is marked WIP because it should most probably *NOT* be merged--as it turns out that both the Java and the C client cancel such requests when the connection is lost. Java: ClientCnxn socket error -> cleanAndNotifyState -> cleanup -> conLossPacket/remove C: Socket error -> handle_socket_error_msg -> handle_error -> cleanup -> cleanup_bufs -> free_buffers/free_completions
…#374) With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a misleading 'SessionExpiredError'. This fixes python-zk#374, and brings Kazoo more in line with the Java and C clients. See the 'kazoo.client.KazooClient.state' documentation as well as these discussions for more details: python-zk#570 (comment) python-zk#583 (comment)
…#374) With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a misleading 'SessionExpiredError'. This fixes python-zk#374, and brings Kazoo more in line with the Java and C clients. See the 'kazoo.client.KazooClient.state' documentation as well as these discussions for more details: python-zk#570 (comment) python-zk#583 (comment)
With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a misleading 'SessionExpiredError'. This fixes #374, and brings Kazoo more in line with the Java and C clients. See the 'kazoo.client.KazooClient.state' documentation as well as these discussions for more details: #570 (comment) #583 (comment)
bwtakacy commentedDec 28, 2015
I encountered SessionExpiredError when ZooKeeper leader process down with below stack trace, but there was no session expiration at that time in ZooKeeper server logs:
, where zha.py is a tools using kazoo.
After debugging, I have found that _call() of KazooClient raised this error when KeeperState became CONNECTING and this is what happened in my environment.
Is there some historical reason about this?
Is it OK to change the condition will catch only the KeeperState.EXPIRED_SESSION ?
The text was updated successfully, but these errors were encountered: