New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix PseudoFixtureDef reference cycle. #3249

Merged
merged 5 commits into from Feb 22, 2018

Conversation

Projects
None yet
4 participants
@a-feld
Contributor

a-feld commented Feb 22, 2018

Hi there! Apologies in advance for the long-winded description!

I was working on a talk for a local Python meetup on reference cycles in Python and noticed that a reference cycle was generated from within pytest which was causing test failures in my example code.

Here's the fixture I was running:

import pytest
import gc


@pytest.fixture(autouse=True)
def garbage_detect(request):
    # Clean up all objects
    gc.set_debug(0)
    gc.collect()

    # Save garbage and run test
    gc.set_debug(gc.DEBUG_SAVEALL)
    yield

    # Invoke the garbage collector
    gc.collect()

    # Check that there is no garbage
    garbage_expected = request.node.get_marker('garbage')

    if not garbage_expected:
        assert len(gc.garbage) == 0

The fixture is checking that garbage is not generated by the code under test. I was hitting test failures due to a reference cycle generated by the insertion of the request fixture into garbage_detect.

Here's an overview of the cycle
image

The PseudoFixtureDef class becomes garbage when the reference to the generated class is discarded in pytest_fixture_setup (details in the commit message).

This PR updates the implementation to avoid the situation where PseudoFixtureDef becomes garbage by using a class instance. I've also included a test which exposes the leaked PseudoFixtureDef types.

This isn't really a problem/bug per-se since the garbage collector will take care of this but I figured I'd open a PR to fix the issue anyway.

Fix reference cycle caused by PseudoFixtureDef.
Python types have reference cycles to themselves when they are created. This is
partially caused by descriptors which get / set values from the __dict__
attribute for getattr / setattr on classes.

This is not normally an issue since types tend to remain referenced for the
lifetime of the Python process (and thus never become garbage).

However, in the case of PseudoFixtureDef, the class is generated in
_get_active_fixturedef and later discarded when pytest_fixture_setup returns.
As a result, the generated PseudoFixtureDef type becomes garbage.

This is not really a performance issue but it can lead to some problems when
making tests and assertions about garbage when using pytest.

This garbage creation problem can be rectified by returning a namedtuple
instance which is functionally the same. In the modified code, the namedtuple
is allocated / deallocated using reference counting rather than having to use
the garbage collector.

@a-feld a-feld changed the title from Fix reference cycle caused by PseudoFixtureDef class creation. to Fix PseudoFixtureDef reference cycle. Feb 22, 2018

@coveralls

This comment has been minimized.

coveralls commented Feb 22, 2018

Coverage Status

Coverage increased (+0.09%) to 92.593% when pulling 4854876 on a-feld:request-fixture-reference-cycle into dd97c94 on pytest-dev:master.

@RonnyPfannschmidt

overall great job in finding this and providing a working solution

@@ -23,6 +23,8 @@
)
from _pytest.outcomes import fail, TEST_OUTCOME
PseudoFixtureDef = namedtuple('PseudoFixtureDef', ('cached_result', 'scope'))

This comment has been minimized.

@RonnyPfannschmidt

RonnyPfannschmidt Feb 22, 2018

Member

please use a frozen attrs class here,

This comment has been minimized.

@a-feld

a-feld Feb 22, 2018

Contributor

Hi @RonnyPfannschmidt , thanks for the feedback! Can you elaborate on the use of a frozen attr class? Do you mean using a class with __slots__ attribute set?

@@ -24,6 +24,10 @@
from _pytest.outcomes import fail, TEST_OUTCOME
class PseudoFixtureDef(object):

This comment has been minimized.

@RonnyPfannschmidt

RonnyPfannschmidt Feb 22, 2018

Member

im sorry for not being precise enough, by frozen attrs class i meant http://www.attrs.org/en/stable/api.html#influencing-initialization - the frozen=True example

This comment has been minimized.

@a-feld

a-feld Feb 22, 2018

Contributor

Ah, ok. Will do! Thanks for the clarification!

@a-feld

This comment has been minimized.

Contributor

a-feld commented Feb 22, 2018

@RonnyPfannschmidt Updated to use a frozen attr class! Thanks again for that pointer!

@RonnyPfannschmidt

This comment has been minimized.

Member

RonnyPfannschmidt commented Feb 22, 2018

@a-feld thanks for quickly and swiftly following up with my comments, its always a extra pleasure to observe a quick and precise followup

@nicoddemus

This comment has been minimized.

Member

nicoddemus commented Feb 22, 2018

@a-feld thanks for the PR! I didn't get a chance to look at it in detail, but I wonder if this fixes #2798?

@a-feld

This comment has been minimized.

Contributor

a-feld commented Feb 22, 2018

Hi @nicoddemus , I took a look at #2798 and I don't believe this issue is related; however, I think I have a pretty good theory for what's going on in #2798 !

I believe #2798 is related to the lifetime of tracebacks. In CPython, tracebacks hold references to their frames so if a frame (at any level in the traceback) stores the traceback in a variable, it creates a reference cycle. Python also stores references to the last traceback in sys.last_traceback which can cause the frame's lifetime to extend past what would be expected.

Here's some example code illustrating this issue:

import sys
import traceback
import weakref

class Obj:
    pass

ref = None

def test1():
    obj = Obj()
    global ref
    ref = weakref.ref(obj)
    assert 0

def test2():
    assert ref() is None, ref


def run():
    TESTS = (test1, test2)

    for test in TESTS:
        try:
            test()
        except:
            print("FAIL: %s" % str(test))
            # A reference cycle is created here because the run frame holds a
            # reference to the traceback which holds a reference to run
            # (through test1)
            _, _, tb = sys.exc_info()
            traceback.print_tb(tb)


if __name__ == '__main__':
    run()

In the example code above, both tests will fail because of the reference to tb inside of run. To fix this issue, the tb variable must be explicitly reassigned to None from within the run frame prior to executing test2. The traceback is also implicitly held in sys.last_traceback so that can be cleared out by raising and catching a dummy exception (there has to be a better way for this, but it works for now)

Fixed code:

import sys
import traceback
import weakref

class Obj:
    pass

ref = None

def test1():
    obj = Obj()
    global ref
    ref = weakref.ref(obj)
    assert 0

def test2():
    assert ref() is None, ref


def run():
    TESTS = (test1, test2)

    for test in TESTS:
        try:
            test()
        except:
            print("FAIL: %s" % str(test))
            # A reference cycle is created here because the run frame holds a
            # reference to the traceback which holds a reference to run
            # (through test1)
            _, _, tb = sys.exc_info()
            traceback.print_tb(tb)

            # Remove the frame reference cycle
            tb = None
        
        # This removes the test1 traceback from sys.last_traceback
        try:
            raise Exception
        except:
            pass


if __name__ == '__main__':
    run()

I'd be happy to investigate and hopefully root cause the issue sometime this weekend! I'm happy to contribute in any way I can. I use PyTest a lot (it's awesome!) and I'm more than happy to devote some time to this issue!

@nicoddemus

This comment has been minimized.

Member

nicoddemus commented Feb 22, 2018

Great @a-feld, thanks a lot for taking the time to investigate this!

The traceback is also implicitly held in sys.last_traceback so that can be cleared out by raising and catching a dummy exception (there has to be a better way for this, but it works for now)

It seems sys.last_traceback is writable, perhaps just sys.last_traceback = None will do the trick?

I'd be happy to investigate and hopefully root cause the issue sometime this weekend! I'm happy to contribute in any way I can. I use PyTest a lot (it's awesome!) and I'm more than happy to devote some time to this issue!

That would be great, we really appreciate the help!

@nicoddemus nicoddemus merged commit bedceaa into pytest-dev:master Feb 22, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@a-feld a-feld deleted the a-feld:request-fixture-reference-cycle branch Feb 22, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment