Fix PseudoFixtureDef reference cycle. #3249
Hi there! Apologies in advance for the long-winded description!
I was working on a talk for a local Python meetup on reference cycles in Python and noticed that a reference cycle was generated from within pytest which was causing test failures in my example code.
Here's the fixture I was running:
    import pytest
    import gc


    @pytest.fixture(autouse=True)
    def garbage_detect(request):
        # Clean up all objects
        gc.set_debug(0)
        gc.collect()

        # Save garbage and run test
        gc.set_debug(gc.DEBUG_SAVEALL)
        yield

        # Invoke the garbage collector
        gc.collect()

        # Check that there is no garbage
        garbage_expected = request.node.get_marker('garbage')
        if not garbage_expected:
            assert len(gc.garbage) == 0
The fixture is checking that garbage is not generated by the code under test. I was hitting test failures due to a reference cycle generated by the insertion of the
This PR updates the implementation to avoid the situation where
This isn't really a problem or bug per se, since the garbage collector will take care of it, but I figured I'd open a PR to fix the issue anyway.
Python types have reference cycles to themselves when they are created. This is partially caused by descriptors which get / set values from the __dict__ attribute for getattr / setattr on classes. This is not normally an issue, since types tend to remain referenced for the lifetime of the Python process (and thus never become garbage). However, in the case of PseudoFixtureDef, the class is generated in _get_active_fixturedef and later discarded when pytest_fixture_setup returns. As a result, the generated PseudoFixtureDef type becomes garbage.
This is not really a performance issue, but it can cause problems when writing tests that make assertions about garbage while running under pytest. The garbage can be avoided by returning a namedtuple instance, which is functionally the same. In the modified code, the namedtuple is allocated / deallocated using reference counting rather than having to use the garbage collector.
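To make the difference concrete, here is a minimal standalone sketch (not the actual pytest code) comparing the two approaches. The field names `cached_result` and `scope`, the helper names, and the placeholder values are assumptions chosen purely for illustration. It relies on CPython's `gc.collect()` returning the number of unreachable objects it found.

```python
import gc
from collections import namedtuple

# Illustrative stand-in: the namedtuple *type* is created once at module
# level, so only lightweight instances are created and discarded later.
# The field names here are assumptions, not taken from the pytest source.
PseudoFixtureDef = namedtuple("PseudoFixtureDef", ["cached_result", "scope"])


def make_class_based():
    # A brand-new type is created on every call. Types participate in
    # reference cycles (for example via their __mro__ and descriptors),
    # so a discarded type can only be reclaimed by the cycle collector.
    class LocalPseudoFixtureDef:
        cached_result = (None, 0, None)
        scope = "function"

    return LocalPseudoFixtureDef


def make_tuple_based():
    # A plain instance with no cycles; reference counting frees it
    # as soon as the last reference goes away.
    return PseudoFixtureDef(cached_result=(None, 0, None), scope="function")


def count_garbage(factory):
    """Return how many unreachable objects remain after discarding factory()."""
    gc.collect()   # start from a clean slate
    gc.disable()   # avoid an automatic collection during the measurement
    try:
        obj = factory()
        del obj
        return gc.collect()
    finally:
        gc.enable()


class_garbage = count_garbage(make_class_based)
tuple_garbage = count_garbage(make_tuple_based)
print("class-based garbage:", class_garbage)
print("namedtuple-based garbage:", tuple_garbage)
```

On CPython the class-based variant should report a nonzero count, while the namedtuple-based variant should leave nothing for the collector to find.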
I believe #2798 is related to the lifetime of tracebacks. In CPython, tracebacks hold references to their frames so if a frame (at any level in the traceback) stores the traceback in a variable, it creates a reference cycle. Python also stores references to the last traceback in
Here's some example code illustrating this issue:
    import sys
    import traceback
    import weakref


    class Obj:
        pass


    ref = None


    def test1():
        obj = Obj()
        global ref
        ref = weakref.ref(obj)
        assert 0


    def test2():
        assert ref() is None, ref


    def run():
        TESTS = (test1, test2)
        for test in TESTS:
            try:
                test()
            except:
                print("FAIL: %s" % str(test))

                # A reference cycle is created here because the run frame holds a
                # reference to the traceback which holds a reference to run
                # (through test1)
                _, _, tb = sys.exc_info()
                traceback.print_tb(tb)


    if __name__ == '__main__':
        run()
In the example code above, both tests will fail because of the reference to
    import sys
    import traceback
    import weakref


    class Obj:
        pass


    ref = None


    def test1():
        obj = Obj()
        global ref
        ref = weakref.ref(obj)
        assert 0


    def test2():
        assert ref() is None, ref


    def run():
        TESTS = (test1, test2)
        for test in TESTS:
            try:
                test()
            except:
                print("FAIL: %s" % str(test))

                # A reference cycle is created here because the run frame holds a
                # reference to the traceback which holds a reference to run
                # (through test1)
                _, _, tb = sys.exc_info()
                traceback.print_tb(tb)

                # Remove the frame reference cycle
                tb = None

                # This removes the test1 traceback from sys.last_traceback
                try:
                    raise Exception
                except:
                    pass


    if __name__ == '__main__':
        run()
I'd be happy to investigate and hopefully root-cause the issue sometime this weekend! I'm happy to contribute in any way I can. I use pytest a lot (it's awesome!) and I'm more than happy to devote some time to this issue!
Great @a-feld, thanks a lot for taking the time to investigate this!
That would be great, we really appreciate the help!