Hangs on stop() #582

Open
efimkin opened this issue Jan 15, 2020 · 1 comment
efimkin commented Jan 15, 2020

Hi!
Sometimes our application hangs in stop().
We use kazoo 2.6.1 with timeout=1.
Here is the gdb backtrace of the hung thread:

(gdb) py-bt
Traceback (most recent call first):
  <built-in method acquire of _thread.lock object at remote 0x7fac03095df0>
  File "/usr/lib/python3.6/threading.py", line 1072, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
  File "/usr/lib/python3.6/threading.py", line 1056, in join
    self._wait_for_tstate_lock()
  File "/opt/yandex/pgsync/lib/python3.6/site-packages/kazoo/protocol/connection.py", line 193, in stop
    self._connection_routine.join()
  File "/opt/yandex/pgsync/lib/python3.6/site-packages/kazoo/client.py", line 528, in _safe_close
    if not self._connection.stop(timeout):
  File "/opt/yandex/pgsync/lib/python3.6/site-packages/kazoo/client.py", line 635, in stop
    self._safe_close()
  File "/opt/yandex/pgsync/lib/python3.6/site-packages/pgsync/zk.py", line 165, in __del__
    self._zk.stop()
azat added a commit to azat-archive/kazoo that referenced this issue Dec 28, 2022
If AUTH_FAILED occurs in the zk-loop thread, it calls
client._session_callback, which resets the queue.

However, another thread can add a CloseInstance event to this queue, and
if _session_callback() is called after CloseInstance was added to the
queue, then stop() will never return (and zk-loop will spin
endlessly).

Here is what it looks like with additional logging:

    39: [ Thread-3 (zk_loop) ] INFO: client.py:568: _session_callback: Zookeeper session closed, state: AUTH_FAILED
    39: [ MainThread ] Level 5: client.py:721: stop: Sending CloseInstance
    39: [ Thread-3 (zk_loop) ] Level 5: client.py:403: _reset: Reseting the client
    39: [ Thread-3 (zk_loop) ] Level 5: connection.py:625: _connect_attempt: Connecting
    39: [ Thread-3 (zk_loop) ] Level 5: connection.py:625: _connect_attempt: Connecting

You can find details in this gist [1].

  [1]: https://gist.github.com/azat/bc7aaea1c32a4f1ea75ad646d26280e9

Fixes: python-zk#582

azat commented Dec 28, 2022

I also came across this issue. In my case the problem was a race after an AUTH_FAILED error. Here is a fix: #688

And here is a gist with details - https://gist.github.com/azat/bc7aaea1c32a4f1ea75ad646d26280e9
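
For readers who want to see the shape of the race without kazoo, here is a minimal toy sketch in plain Python. It is not kazoo's code: `ToyClient`, `CLOSE`, `zk_loop` and `run` are hypothetical names invented for illustration, the queue reset is performed from the main thread to make the ordering deterministic (in the real case it happens inside the zk-loop thread), and the loop is bounded so the example terminates instead of spinning forever.

```python
import queue
import threading

# Hypothetical stand-in for the close sentinel (not kazoo's CloseInstance).
CLOSE = object()


class ToyClient:
    """Toy model: a request queue that a session callback may replace."""

    def __init__(self):
        self.requests = queue.Queue()

    def reset_queue(self):
        # Models the session callback resetting the queue on AUTH_FAILED.
        # Anything already queued, including CLOSE, is dropped.
        self.requests = queue.Queue()

    def request_close(self):
        # Models stop() asking the loop thread to shut down.
        self.requests.put(CLOSE)


def zk_loop(client, stopped, attempts):
    """Poll the queue; exit only when the CLOSE sentinel is seen."""
    for _ in range(attempts):  # bounded here; an unbounded loop would spin
        try:
            item = client.requests.get(timeout=0.01)
        except queue.Empty:
            continue
        if item is CLOSE:
            stopped.set()
            return


def run(reset_after_close):
    """Return True if the loop observed CLOSE, False if it was lost."""
    client = ToyClient()
    stopped = threading.Event()
    client.request_close()
    if reset_after_close:
        client.reset_queue()  # the losing order: reset after CLOSE was queued
    worker = threading.Thread(target=zk_loop, args=(client, stopped, 5))
    worker.start()
    worker.join()
    return stopped.is_set()
```

Calling `run(False)` models the normal order, where the loop sees the sentinel and stops. Calling `run(True)` models the losing order, where the sentinel is discarded by the reset, so a caller that joins the loop thread without a timeout would block indefinitely.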
