Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module imports aren't thread safe #559

Closed
scottmac opened this issue Aug 21, 2020 · 6 comments
Closed

module imports aren't thread safe #559

scottmac opened this issue Aug 21, 2020 · 6 comments

Comments

@scottmac
Copy link
Contributor

scottmac commented Aug 21, 2020

See, #559 (comment) module imports in filters aren't thread safe

Ubuntu 20.04
Python 3.8.2

  File "/usr/lib/python3.8/threading.py", line 302, in wait
    waiter.acquire()
  File "/usr/lib/python3.8/threading.py", line 558, in wait
    signaled = self._cond.wait(timeout)
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 1268, in as_completed
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/worker.py", line 45, in run_parallel
    for future in concurrent.futures.as_completed(executor.submit(func, item) for item in items):
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/worker.py", line 571, in run_jobs
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/main.py", line 96, in run_jobs
    run_jobs(self)
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/command.py", line 297, in run
    self.urlwatcher.run_jobs()
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/cli.py", line 368, in main
  File "/usr/local/bin/urlwatch", line 8, in <module>
    sys.exit(main())
   #10 Frame 0x2b497c0, for file /usr/lib/python3.8/threading.py, line 302, in wait (self=<Condition(_lock=<_thread.lock at remote 0x7f005027c5d0>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f005027c5d0>, release=<built-in method release of _thread.lock object at remote 0x7f005027c5d0>, _waiters=<collections.deque at remote 0x7f0050271760>) at remote 0x7f005027c610>, timeout=None, waiter=<_thread.lock at remote 0x7f0051821300>, saved_state=None, gotit=False)
    waiter.acquire()
#16 Frame 0x7f0050ac0420, for file /usr/lib/python3.8/threading.py, line 558, in wait (self=<Event(_cond=<Condition(_lock=<_thread.lock at remote 0x7f005027c5d0>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f005027c5d0>, release=<built-in method release of _thread.lock object at remote 0x7f005027c5d0>, _waiters=<collections.deque at remote 0x7f0050271760>) at remote 0x7f005027c610>, _flag=False) at remote 0x7f005027c5b0>, timeout=None, signaled=False)
    signaled = self._cond.wait(timeout)
#22 Frame 0x2b3b390, for file /usr/lib/python3.8/concurrent/futures/_base.py, line 1268, in as_completed (fs={<Future(_condition=<Condition(_lock=<_thread.RLock at remote 0x7f0050b0f690>, acquire=<built-in method acquire of _thread.RLock object at remote 0x7f0050b0f690>, release=<built-in method release of _thread.RLock object at remote 0x7f0050b0f690>, _release_save=<built-in method _release_save of _thread.RLock object at remote 0x7f0050b0f690>, _acquire_restore=<built-in method _acquire_restore of _thread.RLock object at remote 0x7f0050b0f690>, _is_owned=<built-in method _is_owned of _thread.RLock object at remote 0x7f0050b0f690>, _waiters=<collections.deque at remote 0x7f0050abd940>) at remote 0x7f0050b0f7f0>, _state='RUNNING', _result=None, _exception=None, _waiters=[<_AsCompletedWaiter(event=<Event(_cond=<Condition(_lock=<_thread.lock at remote 0x7f005027c5d0>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f005027c5d0>, release=<built-in method release of _thread.lock object at remote...(truncated)
#26 Frame 0x7f0050aa29a0, for file /usr/local/lib/python3.8/dist-packages/urlwatch/worker.py, line 45, in run_parallel (items=<generator at remote 0x7f0050b58cf0>, future=<Future(_condition=<Condition(_lock=<_thread.RLock at remote 0x7f0050ab5f00>, acquire=<built-in method acquire of _thread.RLock object at remote 0x7f0050ab5f00>, release=<built-in method release of _thread.RLock object at remote 0x7f0050ab5f00>, _release_save=<built-in method _release_save of _thread.RLock object at remote 0x7f0050ab5f00>, _acquire_restore=<built-in method _acquire_restore of _thread.RLock object at remote 0x7f0050ab5f00>, _is_owned=<built-in method _is_owned of _thread.RLock object at remote 0x7f0050ab5f00>, _waiters=<collections.deque at remote 0x7f005024d5e0>) at remote 0x7f0050ab5c40>, _state='FINISHED', _result=<JobState(cache_storage=<CacheRedisStorage(filename='redis://localhost:6379/0', db=<Redis(connection_pool=<ConnectionPool(connection_class=<type at remote 0x2b1dac0>, connection_kwargs={'host': 'localhost', 'port': 6...(truncated)
    for future in concurrent.futures.as_completed(executor.submit(func, item) for item in items):
@scottmac
Copy link
Contributor Author

scottmac commented Sep 4, 2020

Finally found another, got more of a backtrace this time.

--

#10 Frame 0xfa45a0, for file /usr/lib/python3.8/threading.py, line 302, in wait (self=<Condition(_lock=<_thread.lock at remote 0x7f3c2f445c60>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f3c2f445c60>, release=<built-in method release of _thread.lock object at remote 0x7f3c2f445c60>, _waiters=<collections.deque at remote 0x7f3c2f45b520>) at remote 0x7f3c2f445af0>, timeout=None, waiter=<_thread.lock at remote 0x7f3c301be270>, saved_state=None, gotit=False)
waiter.acquire()
(gdb) py-locals
self = <Condition(_lock=<_thread.lock at remote 0x7f3c2f445c60>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f3c2f445c60>, release=<built-in method release of _thread.lock object at remote 0x7f3c2f445c60>, _waiters=<collections.deque at remote 0x7f3c2f45b520>) at remote 0x7f3c2f445af0>
timeout = None
waiter = <_thread.lock at remote 0x7f3c301be270>
saved_state = None
gotit = False

--

#16 Frame 0x7f3c2f45d040, for file /usr/lib/python3.8/threading.py, line 558, in wait (self=<Event(_cond=<Condition(_lock=<_thread.lock at remote 0x7f3c2f445c60>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f3c2f445c60>, release=<built-in method release of _thread.lock object at remote 0x7f3c2f445c60>, _waiters=<collections.deque at remote 0x7f3c2f45b520>) at remote 0x7f3c2f445af0>, _flag=False) at remote 0x7f3c2f445f10>, timeout=None, signaled=False)
signaled = self._cond.wait(timeout)
(gdb) py-locals
self = <Event(_cond=<Condition(_lock=<_thread.lock at remote 0x7f3c2f445c60>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f3c2f445c60>, release=<built-in method release of _thread.lock object at remote 0x7f3c2f445c60>, _waiters=<collections.deque at remote 0x7f3c2f45b520>) at remote 0x7f3c2f445af0>, _flag=False) at remote 0x7f3c2f445f10>
timeout = None
signaled = False

--

#22 Frame 0xf994e0, for file /usr/lib/python3.8/concurrent/futures/_base.py, line 1268, in as_completed (fs={<Future(_condition=<Condition(_lock=<_thread.RLock at remote 0x7f3c2f445270>, acquire=<built-in method acquire of _thread.RLock object at remote 0x7f3c2f445270>, release=<built-in method release of _thread.RLock object at remote 0x7f3c2f445270>, _release_save=<built-in method _release_save of _thread.RLock object at remote 0x7f3c2f445270>, _acquire_restore=<built-in method _acquire_restore of _thread.RLock object at remote 0x7f3c2f445270>, _is_owned=<built-in method _is_owned of _thread.RLock object at remote 0x7f3c2f445270>, _waiters=<collections.deque at remote 0x7f3c2f45b100>) at remote 0x7f3c2f445580>, _state='RUNNING', _result=None, _exception=None, _waiters=[<_AsCompletedWaiter(event=<Event(_cond=<Condition(_lock=<_thread.lock at remote 0x7f3c2f445c60>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f3c2f445c60>, release=<built-in method release of _thread.lock object at remote ...(truncated)
(gdb) py-locals
fs = {<Future(_condition=<Condition(_lock=<_thread.RLock at remote 0x7f3c2f445270>, acquire=<built-in method acquire of _thread.RLock object at remote 0x7f3c2f445270>, release=<built-in method release of _thread.RLock object at remote 0x7f3c2f445270>, _release_save=<built-in method _release_save of _thread.RLock object at remote 0x7f3c2f445270>, _acquire_restore=<built-in method _acquire_restore of _thread.RLock object at remote 0x7f3c2f445270>, _is_owned=<built-in method _is_owned of _thread.RLock object at remote 0x7f3c2f445270>, _waiters=<collections.deque at remote 0x7f3c2f45b100>) at remote 0x7f3c2f445580>, _state='RUNNING', _result=None, _exception=None, _waiters=[<_AsCompletedWaiter(event=<Event(_cond=<Condition(_lock=<_thread.lock at remote 0x7f3c2f445c60>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f3c2f445c60>, release=<built-in method release of _thread.lock object at remote 0x7f3c2f445c60>, _waiters=<collections.deque at remote 0x7f3c2f45b520>) at remote 0x7f3c2f445af0>, _flag...(truncated)
timeout = None
total_futures = 2
finished = []
pending = {<Future(_condition=<Condition(_lock=<_thread.RLock at remote 0x7f3c2f445270>, acquire=<built-in method acquire of _thread.RLock object at remote 0x7f3c2f445270>, release=<built-in method release of _thread.RLock object at remote 0x7f3c2f445270>, _release_save=<built-in method _release_save of _thread.RLock object at remote 0x7f3c2f445270>, _acquire_restore=<built-in method _acquire_restore of _thread.RLock object at remote 0x7f3c2f445270>, _is_owned=<built-in method _is_owned of _thread.RLock object at remote 0x7f3c2f445270>, _waiters=<collections.deque at remote 0x7f3c2f45b100>) at remote 0x7f3c2f445580>, _state='RUNNING', _result=None, _exception=None, _waiters=[<_AsCompletedWaiter(event=<Event(_cond=<Condition(_lock=<_thread.lock at remote 0x7f3c2f445c60>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f3c2f445c60>, release=<built-in method release of _thread.lock object at remote 0x7f3c2f445c60>, _waiters=<collections.deque at remote 0x7f3c2f45b520>) at remote 0x7f3c2f445af0>, _flag...(truncated)
waiter = <_AsCompletedWaiter(event=<Event(_cond=<Condition(_lock=<_thread.lock at remote 0x7f3c2f445c60>, acquire=<built-in method acquire of _thread.lock object at remote 0x7f3c2f445c60>, release=<built-in method release of _thread.lock object at remote 0x7f3c2f445c60>, _waiters=<collections.deque at remote 0x7f3c2f45b520>) at remote 0x7f3c2f445af0>, _flag=False) at remote 0x7f3c2f445f10>, finished_futures=[], lock=<_thread.lock at remote 0x7f3c2f445e10>) at remote 0x7f3c2f445df0>
wait_timeout = None

--

#26 Frame 0x7f3c2f457220, for file /usr/local/lib/python3.8/dist-packages/urlwatch/worker.py, line 45, in run_parallel (items=<generator at remote 0x7f3c2f4e7f20>)
for future in concurrent.futures.as_completed(executor.submit(func, item) for item in items)
(gdb) py-locals
items = <generator at remote 0x7f3c2f4e7f20>

--

#30 Frame 0x7f3c2f456440, for file /usr/local/lib/python3.8/dist-packages/urlwatch/worker.py, line 571, in run_jobs (jobs=[<UrlJob(cookies=None, data=None, method='GET', ssl_no_verify=None, ignore_cached=None, http_proxy=None, https_proxy=None, headers=None, encoding=None, timeout=None, ignore_timeout_errors=None, ignore_too_many_redirects=None, max_tries=None, diff_tool=None, compared_versions=None, url='https://...., ignore_connection_errors=True, ignore_http_error_codes='500, 502, 504', name='...', filter=[{'css': 'ul#scn'}, {'html2text': {'ul_item_mark': '-', 'strong_mark': ''}}]) at remote 0x7f3c2f481eb0>, <UrlJob(cookies=None, data=None, method='GET', ssl_no_verify=None, ignore_cached=None, http_proxy=None, https_proxy=None, headers=None, encoding=None, timeout=None, ignore_timeout_errors=None, ignore_too_many_redirects=None, max_tries=None, diff_tool=None, compared_versions=None, url='https://...', ignore_connection_errors=Tru...(truncated)
(gdb) py-locals
jobs = [<UrlJob(cookies=None, data=None, method='GET', ssl_no_verify=None, ignore_cached=None, http_proxy=None, https_proxy=None, headers=None, encoding=None, timeout=None, ignore_timeout_errors=None, ignore_too_many_redirects=None, max_tries=None, diff_tool=None, compared_versions=None, url='https://...', ignore_connection_errors=True, ignore_http_error_codes='500, 502, 504', name='...', filter=[{'css': 'ul#scn'}, {'html2text': {'ul_item_mark': '-', 'strong_mark': '
'}}]) at remote 0x7f3c2f481eb0>, <UrlJob(cookies=None, data=None, method='GET', ssl_no_verify=None, ignore_cached=None, http_proxy=None, https_proxy=None, headers=None, encoding=None, timeout=None, ignore_timeout_errors=None, ignore_too_many_redirects=None, max_tries=None, diff_tool=None, compared_versions=None, url='https://...', ignore_connection_errors=True, ignore_http_error_codes='500, 502, 504', name='...', filter=[{'css': 'ul#of'}, {'html2text': {'ul_item_mar...(truncated)
report = <Report(config={'display': {'new': True, 'error': True, 'unchanged': False}, 'report': {'text': {'line_length': 75, 'details': True, 'footer': False, 'minimal': False}, 'markdown': {'details': True, 'footer': True, 'minimal': False}, 'html': {'diff': 'unified'}, 'stdout': {'enabled': True, 'color': True}, 'email': {'enabled': True, 'html': True, 'to': '..., 'from': '...', 'subject': '{count} changes: {jobs}', 'method': 'smtp', 'smtp': {'host': '...', 'user': '...', 'port': 587, 'starttls': True, 'auth': True, 'insecure_password': ''}, 'sendmail': {'path': 'sendmail'}}, 'pushover': {'enabled': False, 'app': '', 'device': None, 'sound': 'spacealarm', 'user': '', 'priority': 'normal'}, 'pushbullet': {'enabled': False, 'api_key': ''}, 'telegram': {'enabled': False, 'bot_token': '', 'chat_id': ''}, 'slack': {'enabled': True, 'webhook_url': 'https://.../slack.php'}, 'matrix': {'enabled': False, 'homeserver': '', 'access_t...(truncated)

@thp
Copy link
Owner

thp commented Sep 6, 2020

Can you post the backtraces of each thread? Chances are it's just a network request that doesn't have a timeout set or has a timeout that has not expired.

@scottmac
Copy link
Contributor Author

scottmac commented Sep 6, 2020

I forgot to get a thread dump, happens once a week so I'll grab the next.

I've not explicitly set a timeout so assumed it was the default 60s. All of my requests are to localhost which then fetches and re-formats some external content.

@scottmac
Copy link
Contributor Author

Got another, looks like both threads are trying to import html2text

(gdb) info threads
  Id   Target Id                                     Frame 
* 1    Thread 0x7f7f1d98b740 (LWP 804466) "urlwatch" futex_abstimed_wait_cancelable (private=0, abstime=0x0, clockid=0, expected=0, 
    futex_word=0x1d51640) at ../sysdeps/nptl/futex-internal.h:320
  2    Thread 0x7f7f1aa90700 (LWP 804472) "urlwatch" 0x000000000056c201 in _Py_DECREF (filename=<synthetic pointer>, lineno=2979, op=140183884859136)
    at ../Include/object.h:470
  3    Thread 0x7f7f1a27b700 (LWP 804474) "urlwatch" futex_abstimed_wait_cancelable (private=0, abstime=0x0, clockid=0, expected=0, 
    futex_word=0x7f7f0c095550) at ../sysdeps/nptl/futex-internal.h:320
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f7f1aa90700 (LWP 804472))]
(gdb) py-bt
Traceback (most recent call first):
  File "<frozen importlib._bootstrap>", line 75, in has_deadlock
  File "<frozen importlib._bootstrap>", line 93, in acquire
  File "<frozen importlib._bootstrap>", line 202, in _lock_unlock_module
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/html2txt.py", line 62, in html2text
    import html2text
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/filters.py", line 536, in filter
    def filter(self, data, subfilter):
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/filters.py", line 145, in process
    return filtercls(state.job, state).filter(data, subfilter)
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/handler.py", line 92, in process
    data = FilterBase.process(filter_kind, subfilter, self, data)
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/worker.py", line 59, in <lambda>
    for job_state in run_parallel(lambda job_state: job_state.process(),
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 80, in _worker
    work_item.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f7f1a27b700 (LWP 804474))]
(gdb) py-bt
Traceback (most recent call first):
  <built-in method __enter__ of _thread.lock object at remote 0x7f7f1ab08720>
  File "<frozen importlib._bootstrap>", line 88, in acquire
  File "<frozen importlib._bootstrap>", line 202, in _lock_unlock_module
  <built-in method __import__ of module object at remote 0x7f7f1d3aa0e0>
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1298, in _handle_fromlist
  File "/home/scott/.local/lib/python3.8/site-packages/html2text/__init__.py", line 10, in <module>
    from . import config
  <built-in method exec of module object at remote 0x7f7f1d3aa0e0>
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 1039, in exec_module
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/html2txt.py", line 62, in html2text
    import html2text
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/filters.py", line 536, in filter
    def filter(self, data, subfilter):
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/filters.py", line 145, in process
    return filtercls(state.job, state).filter(data, subfilter)
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/handler.py", line 92, in process
    data = FilterBase.process(filter_kind, subfilter, self, data)
  File "/usr/local/lib/python3.8/dist-packages/urlwatch/worker.py", line 59, in <lambda>
    for job_state in run_parallel(lambda job_state: job_state.process(),
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 80, in _worker
    work_item.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()

@scottmac scottmac changed the title urlwatch sporadically hanging module imports aren't thread safe Sep 11, 2020
@scottmac
Copy link
Contributor Author

The deadlock in the deadlock detection itself is a bug in python, see python/cpython#17518

I'm not sure what the behaviour we can expect once this is fixed though.

@thp
Copy link
Owner

thp commented Dec 11, 2020

@scottmac Can you check if Git master branch fixes this for you? Imports should now happen at startup instead of on-demand when the module would first be needed (in a worker thread).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants