New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: reference cycle in np.vectorize #11977

Merged
merged 5 commits into from Jan 9, 2019

Conversation

Projects
None yet
4 participants
@mattip
Copy link
Member

mattip commented Sep 18, 2018

Issue #11867 demonstrated a leak when using np.vectorize on a bound class method in CPython 3. The test demonstrates the issue.

Still WIP since I have not found the root cause of the leak.

Edit: adding GC_Track to ufunc allows the garbage collector to break the cycle

@eric-wieser

This comment has been minimized.

Copy link
Member

eric-wieser commented Sep 18, 2018

My guess would be that the problem is np.ufunc not implementing a traverse method to tell the GC that it holds a cycle

@mattip mattip changed the title WIP, BUG: reference cycle in np.vectorize BUG: reference cycle in np.vectorize Sep 18, 2018

@mattip mattip force-pushed the mattip:vectorize-refcount branch from df298f9 to 7a57924 Sep 22, 2018

@mattip

This comment has been minimized.

Copy link
Member Author

mattip commented Sep 25, 2018

ping

def test_leaks(self):
# gh-11867
# np.frompyfunc(<class method>) creates a reference cycle
# that requires a gc collection cycle to break (on CPython 3)

This comment has been minimized.

@eric-wieser

eric-wieser Sep 25, 2018

Member

Have we worked out why this is the case? Am I right in thinking that this patch changes it from being unbreakable to just requiring a gc to break?

This comment has been minimized.

@mattip

mattip Sep 25, 2018

Author Member

Correct. I do not really understand the root cause and why only in this specific case

@mattip mattip force-pushed the mattip:vectorize-refcount branch from 7a57924 to f055831 Sep 25, 2018

@mattip

This comment has been minimized.

Copy link
Member Author

mattip commented Sep 25, 2018

Squashed to a single commit.

@eric-wieser

This comment has been minimized.

Copy link
Member

eric-wieser commented Sep 26, 2018

I think the commit message is wrong - you didn't break the reference cycle, you simply allowed the GC to find it (and break it on the python side)

@eric-wieser

This comment has been minimized.

Copy link
Member

eric-wieser commented Sep 26, 2018

I wonder if you can use assert_no_gc_cycles here, which prints out the cycle upon failure, if possible

@mattip mattip force-pushed the mattip:vectorize-refcount branch 2 times, most recently from a358d0d to 8f75986 Oct 10, 2018

@mattip

This comment has been minimized.

Copy link
Member Author

mattip commented Oct 10, 2018

fixed first commit message, reworked test to focus on class.atrribute = frompyfunc(class.method) where the reference cycle is actually created

@mattip mattip force-pushed the mattip:vectorize-refcount branch from 8f75986 to 0794da8 Oct 10, 2018

@eric-wieser

This comment has been minimized.

Copy link
Member

eric-wieser commented Nov 10, 2018

Would be good to finish this up

@eric-wieser

This comment has been minimized.

Copy link
Member

eric-wieser commented Nov 10, 2018

tp_traverse will need minor changes to accommodate the extra reference added in #8955 - I suppose the PR of the two that doesn't go in first will need updating.

Fix will be as simple as:

if (ufunc->identity == PyUFunc_IdentityValue) {
    Py_VISIT(ufunc->identity_value);
}
@mattip

This comment has been minimized.

Copy link
Member Author

mattip commented Nov 10, 2018

merging with master to resolve conflicts

@mattip

This comment has been minimized.

Copy link
Member Author

mattip commented Nov 11, 2018

i will wait for #8955 and then rework this PR

@mattip mattip force-pushed the mattip:vectorize-refcount branch 2 times, most recently from d1279ce to b2c7901 Dec 4, 2018

@mattip mattip force-pushed the mattip:vectorize-refcount branch from b2c7901 to b91a047 Dec 4, 2018

@mattip mattip force-pushed the mattip:vectorize-refcount branch from b91a047 to f73d96d Dec 18, 2018

@mattip

This comment has been minimized.

Copy link
Member Author

mattip commented Dec 18, 2018

rebased at git merge-base master maintenance/1.16.x. Would it be possible to get this into 1.16?

return np.int(0)

def npy_bound(self, a):
return types.MethodType(np.frompyfunc(self.bound, 1, 1), a)

This comment has been minimized.

@eric-wieser

eric-wieser Dec 18, 2018

Member

I'm very confused by what exactly we're trying to test here.

This comment has been minimized.

@mattip

mattip Dec 18, 2018

Author Member

self.bound is part of a reference cycle that requires a gc collection cycle to break. I added more clarification in the test docstring. Improvements welcome

This comment has been minimized.

@seberg

seberg Jan 7, 2019

Member

The PR looks good to me, if nobody complains will merge in a bit probably. The test couldbe a bit more obvious to read, or maybe add a trivial one:

def func(a):
    return 0
func.vectorized = np.frompyfunc(func)

giving a minimal cycle. The other part would probably be simpler by making it a realistic use case:

class A(object):
    def __init__(self):
        # Wrap bound_method into a ufunc with `np.frompyfunc`. This creates
        # a reference cycle:
        #     self.method -> self   # since it is a bound method (knows "self")
        #     self -> self.wrapped_method  # Set here
        #     self.wrapped_method -> self.method  # Stores self.method to call it
        self.wrapped_method = np.frompyfunc(self.method)

    def method(self):
        return 0

This comment has been minimized.

@seberg

seberg Jan 9, 2019

Member

Ah, I guess that breaks the parameterization. Although, I do not think you need the types.MethodType if the just do the self.bound_vectorized = np.frompyfunc(self.bound, 1, 1) before returning it?

This comment has been minimized.

@mattip

mattip Jan 9, 2019

Author Member

I'm not following. Where would I add that line?

This comment has been minimized.

@seberg

seberg Jan 9, 2019

Member

Sorry, I think what confuses me is that I believe you do not need this at all. You can simply remove the npy_bound and npy_unbound logic and just make it a.f = np.frompyfunc(func, 1, 1).

This comment has been minimized.

@mattip

mattip Jan 9, 2019

Author Member

Adopted

@mattip

This comment has been minimized.

Copy link
Member Author

mattip commented Dec 18, 2018

should be squash-merged or I can squash and force-push

@charris

This comment has been minimized.

Copy link
Member

charris commented Dec 18, 2018

If you want it in 1.16, use the 1.16.1 milestone.

@mattip

This comment has been minimized.

Copy link
Member Author

mattip commented Dec 31, 2018

@seberg

seberg approved these changes Jan 9, 2019

Show resolved Hide resolved numpy/core/src/umath/ufunc_object.c Outdated
return np.int(0)

def npy_bound(self, a):
return types.MethodType(np.frompyfunc(self.bound, 1, 1), a)

This comment has been minimized.

@seberg

seberg Jan 9, 2019

Member

Ah, I guess that breaks the parameterization. Although, I do not think you need the types.MethodType if the just do the self.bound_vectorized = np.frompyfunc(self.bound, 1, 1) before returning it?

mattip added some commits Jan 9, 2019

@seberg

This comment has been minimized.

Copy link
Member

seberg commented Jan 9, 2019

Thanks @mattip lets put it in before we find something else to nitpick about ;).

@eric-wieser
Copy link
Member

eric-wieser left a comment

Test now makes sense to me too - thanks for persisting

@seberg

This comment has been minimized.

Copy link
Member

seberg commented Jan 9, 2019

Cool, mac OS job is expected to fail. Next thing will be the object arrays ;).

@seberg seberg merged commit f5b6850 into numpy:master Jan 9, 2019

8 checks passed

LGTM analysis: C/C++ No alert changes
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python No alert changes
Details
Shippable Run 2278 status is SUCCESS.
Details
ci/circleci Your tests passed on CircleCI!
Details
codecov/patch 85.71% of diff hit (target 84.95%)
Details
codecov/project 85.72% (+0.76%) compared to 982f12b
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment