Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Deadlock due to __del__() and ThreadLogObserver queue #25

Closed
rmorehead opened this Issue · 7 comments

2 participants

Rod Morehead Itamar Turner-Trauring
Rod Morehead

Crochet deadlock can happen when overridden __del__() method logs while current thread already holds the ThreadLogObserver queue lock.

Analysis: __del__() methods can be called from thread at any time due to garbage collection timing. If the del method attempts to acquire a non-recursive lock that the current thread already has locked, deadlock will occur. In Twisted classes like Deferred override __del__() and can log from the __del__() method.

Key part of Backtrace is as follows:

File: '/home/rmore/work/rmorehea-rcdn-testtng/virtualenv/local/lib/python2.7/site-packages/twisted/python/threadable.py', line 53, in sync
  return function(self, *args, **kwargs)
File: '/home/rmore/work/rmorehea-rcdn-testtng/virtualenv/local/lib/python2.7/site-packages/twisted/python/log.py', line 191, in msg
  self.observers[i](actualEventDict)
File: '/home/rmore/work/rmorehea-rcdn-testtng/virtualenv/local/lib/python2.7/site-packages/crochet/_eventloop.py', line 288, in __call__
  self._queue.put(msg)
File: '/usr/lib/python2.7/Queue.py', line 138, in put
  self.not_empty.notify()
File: '/usr/lib/python2.7/threading.py', line 392, in notify
  for waiter in waiters:
File: '/home/rmore/work/rmorehea-rcdn-testtng/virtualenv/local/lib/python2.7/site-packages/twisted/internet/defer.py', line 683, in __del__
  log.msg("Unhandled error in Deferred:", isError=True)
File: '/home/rmore/work/rmorehea-rcdn-testtng/virtualenv/local/lib/python2.7/site-packages/twisted/python/threadable.py', line 53, in sync
  return function(self, *args, **kwargs)
File: '/home/rmore/work/rmorehea-rcdn-testtng/virtualenv/local/lib/python2.7/site-packages/twisted/python/log.py', line 191, in msg
  self.observers[i](actualEventDict)
File: '/home/rmore/work/rmorehea-rcdn-testtng/virtualenv/local/lib/python2.7/site-packages/crochet/_eventloop.py', line 288, in __call__
  self._queue.put(msg)
File: '/usr/lib/python2.7/Queue.py', line 118, in put
  self.not_full.acquire()
Itamar Turner-Trauring
Owner

Thanks for the bug report.

My first thought is "version of Queue that uses RLock".

Rod Morehead

I initially thought that would be enough, but the problem is that the del is called from the same thread, while another Queue operation could be in flight. The RLock acquire would succeed (because the thread is the same) but it could result in out of sync Queue fields depending on ordering (counts for example).

Itamar Turner-Trauring
Owner

http://bugs.python.org/issue13697 suggests RLock won't help.

Itamar Turner-Trauring
Owner

The issue with Queue is reported to Python at http://bugs.python.org/issue14976

Itamar Turner-Trauring
Owner

A completely different solution is to use the self-pipe trick. The easiest way to do that is to run another reactor in the writing thread, and then use its reactor.callFromThread method to schedule stuff from the main reactor thread. This assumes the relevant code paths have been sufficiently weaned off the use of global reactor; given Twisted has many unit tests at this point that create new reactors for each test that is probably not going to be a problem.

Rod Morehead

The code I submitted only uses RLock() so that it can be determined if the Queue() is already locked by the current thread.

I also now don't think it is GC, because GC won't dispose of object's with __del__() method, so I think it is reference counting only, combined with a misleading stack trace due to the trace indicating the last line executed, not what stack context is being pop'd that is resulting in the current decref.

I do agree that there is a fundamental issue, but I'm not 100% sure I understand the exact scenario/sequence of events.

Itamar Turner-Trauring
Owner

GC can dispose of stuff with __del__. E.g. A points at itself, so circular reference, and A points at B which has __del__. GC can also do it due to weakrefs. And I've reproduced it with signal handling.

Itamar Turner-Trauring itamarst closed this in 1b3fda7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.