Connection is not re-established if a socket connection error occurs #47

Closed
diranged opened this Issue Jan 7, 2013 · 2 comments

Contributor

diranged commented Jan 7, 2013

We seem to have a reproducible problem where a 'Broken pipe' error (caused by stunnel failing its connection to our ZooKeeper servers) breaks Kazoo, and it never tries to reconnect:

Jan  7 06:52:09.568578 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,kazoo.protocol.connection,_send_request]: (ERROR) ('socket connection error: %s', 'Broken pipe')
Jan  7 06:52:11.542409 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,kazoo.client,_session_callback]: (INFO) Zookeeper session lost, state: CLOSED
Jan  7 06:52:11.542409 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,nd_service_registry.KazooServiceRegistry,_state_listener]: (WARNING) Zookeeper connection state changed: LOST
Jan  7 06:52:11.542409 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,kazoo.handlers.threading,thread_worker]: (WARNING) Exception in worker queue thread
Jan  7 06:52:11.569085 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,kazoo.handlers.threading,thread_worker]: (ERROR) Connection has been closed
Jan  7 06:52:16.097672 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,WatcherDaemon.ServiceWatcher.frontend,_update]: (WARNING) [frontend] could not update path /services/production/uswest1/frontend/ec2-184-72-14-165.us-west-1.compute.amazonaws.com:80 with state True: Service is down. Try again later.
Jan  7 06:53:16.862000 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,WatcherDaemon.ServiceWatcher.frontend,_update]: (WARNING) [frontend] could not update path /services/production/uswest1/frontend/ec2-184-72-14-165.us-west-1.compute.amazonaws.com:80 with state True: Service is down. Try again later.
Jan  7 06:54:17.434420 prod-fe-uswest1-39-i-7400912d zk_watcher[30841,WatcherDaemon.ServiceWatcher.frontend,_update]: (WARNING) [frontend] could not update path /services/production/uswest1/frontend/ec2-184-72-14-165.us-west-1.compute.amazonaws.com:80 with state True: Service is down. Try again later.

I'll look into possible fixes, but I wanted to get this issue opened so you guys know about it.
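
As a stopgap for the symptom in the log above (the state listener sees LOST and nothing recovers), an application could watch for a session that stays LOST and rebuild its client itself. The following is only a minimal sketch under assumptions: `ReconnectWatchdog`, `restart_client`, and `grace_seconds` are hypothetical names, the state strings are assumed to match the values Kazoo passes to state listeners, and this is a workaround idea rather than a fix inside Kazoo.

```python
import threading

# State names as they appear in the log above. Kazoo exposes equivalents as
# KazooState constants; plain strings are used here so the sketch has no
# third-party dependency (an assumption, not Kazoo's own code).
LOST = "LOST"
CONNECTED = "CONNECTED"


class ReconnectWatchdog:
    """Hypothetical helper: calls restart_client if the session stays LOST."""

    def __init__(self, restart_client, grace_seconds=30.0):
        self._restart_client = restart_client  # caller-supplied callable
        self._grace_seconds = grace_seconds    # how long LOST is tolerated
        self._timer = None
        self._lock = threading.Lock()

    def state_listener(self, state):
        # Keep the callback cheap: it only schedules or cancels a timer.
        with self._lock:
            if state == LOST and self._timer is None:
                self._timer = threading.Timer(self._grace_seconds, self._fire)
                self._timer.daemon = True
                self._timer.start()
            elif state == CONNECTED and self._timer is not None:
                self._timer.cancel()
                self._timer = None

    def _fire(self):
        # The grace period expired without a CONNECTED event in between.
        with self._lock:
            self._timer = None
        self._restart_client()
```

With Kazoo, a callback like `state_listener` would be registered through `KazooClient.add_listener`, and `restart_client` would stop the old client and start a fresh one. Whether that is safe depends on how the application recreates its ephemeral nodes and watches.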

Member

hannosch commented Jan 8, 2013

This sounds a lot like issues #39 / #41, which we fixed and released in 0.9.

Contributor

diranged commented Jan 8, 2013

You may be right. Unfortunately, reproducing this error is extremely hard. I'm fine closing this for now, and we'll re-open it if I see it again (we've now upgraded to 0.9 everywhere).

@diranged diranged closed this May 15, 2013
