Connection is not re-established if a socket connection error occurs #47

Closed
diranged opened this Issue · 2 comments

2 participants

@diranged

We seem to have a reproducible problem where a 'Broken pipe' error (caused by stunnel failing its connection to our ZooKeeper servers) breaks Kazoo, and it never tries to reconnect:

Jan  7 06:52:09.568578 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,kazoo.protocol.connection,_send_request]: (ERROR) ('socket connection error: %s', 'Broken pipe')
Jan  7 06:52:11.542409 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,kazoo.client,_session_callback]: (INFO) Zookeeper session lost, state: CLOSED
Jan  7 06:52:11.542409 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,nd_service_registry.KazooServiceRegistry,_state_listener]: (WARNING) Zookeeper connection state changed: LOST
Jan  7 06:52:11.542409 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,kazoo.handlers.threading,thread_worker]: (WARNING) Exception in worker queue thread
Jan  7 06:52:11.569085 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,kazoo.handlers.threading,thread_worker]: (ERROR) Connection has been closed
Jan  7 06:52:16.097672 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,WatcherDaemon.ServiceWatcher.frontend,_update]: (WARNING) [frontend] could not update path /services/production/uswest1/frontend/ec2-184-72-14-165.us-west-1.compute.amazonaws.com:80 with state True: Service is down. Try again later.
Jan  7 06:53:16.862000 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,WatcherDaemon.ServiceWatcher.frontend,_update]: (WARNING) [frontend] could not update path /services/production/uswest1/frontend/ec2-184-72-14-165.us-west-1.compute.amazonaws.com:80 with state True: Service is down. Try again later.
Jan  7 06:54:17.434420 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,WatcherDaemon.ServiceWatcher.frontend,_update]: (WARNING) [frontend] could not update path /services/production/uswest1/frontend/ec2-184-72-14-165.us-west-1.compute.amazonaws.com:80 with state True: Service is down. Try again later.

I'll look into possible fixes, but I wanted to get this issue opened up so you guys know about it.
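
While the cause is being investigated, one possible application-side workaround is to note the LOST state and rebuild the client from the daemon's own periodic loop. The following is only a minimal sketch under stated assumptions: `ReconnectingWatcher`, `connect`, `on_state_change`, and `maybe_reconnect` are hypothetical names and not part of Kazoo, the state is compared as a plain string so the sketch runs without Kazoo installed, and `connect` stands in for whatever creates and starts a fresh client.

```python
import time

# State name as it appears in the log lines above. A plain string is used here
# (an assumption) so the sketch does not depend on Kazoo being installed.
LOST = "LOST"


class ReconnectingWatcher:
    """Hypothetical helper: rebuilds a client after its session is reported LOST."""

    def __init__(self, connect, max_tries=5, base_delay=1.0, sleep=time.sleep):
        self._connect = connect        # hypothetical callable returning a started client
        self._max_tries = max_tries
        self._base_delay = base_delay
        self._sleep = sleep            # injectable so the backoff can be tested
        self.client = None
        self.needs_reconnect = False

    def on_state_change(self, state):
        # Only record the event here; the slow reconnect work happens in
        # maybe_reconnect(), called from the application's own loop.
        if state == LOST:
            self.needs_reconnect = True

    def maybe_reconnect(self):
        """Return True if a usable client is believed to exist afterwards."""
        if not self.needs_reconnect:
            return True
        delay = self._base_delay
        for attempt in range(1, self._max_tries + 1):
            try:
                self.client = self._connect()
            except OSError:            # 'Broken pipe' is an OSError subclass
                if attempt == self._max_tries:
                    return False
                self._sleep(delay)
                delay *= 2             # exponential backoff between attempts
            else:
                self.needs_reconnect = False
                return True
        return False
```

In a daemon like the one in the logs, `maybe_reconnect()` could be called at the top of each periodic update before touching any path, so a lost session is retried on the next cycle rather than failing forever.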

@hannosch
Owner

This sounds a lot like issues #39 / #41, which we fixed and released in 0.9.

@diranged

You may be right ... unfortunately reproducing this error is extremely hard. I'm fine closing this for now and we'll re-open if I see it again (we've now upgraded to 0.9 everywhere).

@diranged diranged closed this