Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CallbackRegistry fix #4118

Merged
merged 6 commits into from Mar 12, 2015
Merged

Conversation

OceanWolf
Copy link
Contributor

As weakref.ref() does not accept bound or unbound functions, this means that the current if func in self._func_cid_map[s]: fails in these cases, and allows (s, func) pairs to get added many times. _BoundMethodProxy fixed this problem for the CallbackRegistry.callbacks dict, but not for the CallbackRegistry._func_cid_map, here we correct this.

While researching the bug (and how to fix it), I discovered that weakref.ref() also supports a callback for when the weakref dies. This seems like a much better way to do garbage collection then at present as CallbackRegistry only does GC only on the specific signal as it gets processed, so potentially quite late, this I fixed in the second commit.

@OceanWolf
Copy link
Contributor Author

As a quick test to see what goes on, try this code:

from matplotlib.cbook import CallbackRegistry

s = 'say_hello'

class Foo(object):
    def __init__(self):
        self._callbacks = CallbackRegistry()

    def connect(self, s, func):
        return self._callbacks.connect(s, func)

class Bar(object):
    def add_cbk(self, foo):
        cid = foo.connect(s, self.talk)
        print('Callback id:', cid)

    def talk(self):
        print('Hello World')

a, b = Foo(), Bar()

# Test CallbackRegistry.connect()
b.add_cbk(a) # Should print 1
b.add_cbk(a) # Should *also* print 1
print(a._callbacks.callbacks) # Should only contain the one item

# Test Garbage Collection
del b
print(a._callbacks._func_cid_map[s].keys()) # Should give an empty list
print(a._callbacks.callbacks) # Should contain no callbacks

print('Done')

@tacaswell tacaswell added this to the next point release milestone Feb 18, 2015
@tacaswell
Copy link
Member

On thing that seems sort of funny to me is that the destruction call backs can only ever be triggered in one of three cases, but can be added in any of the cases.

attn @mdboom

@tacaswell
Copy link
Member

Sorry, one of two cases

del self.callbacks[s][cid]
else:
proxy(*args, **kwargs)
proxy(*args, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the exceptions raised above, I would still wrap this in a

try ReferenceException:
    proxy(*args, **kwargs)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at where the ReferenceError in _BoundMethodProxy.__call__ gets raised, it only occurs if the method if the weakref has died. The improved GC auto deletes all traces of the proxy from CallbackRegistry when this happens. and thus it should never raise such an error here, unless you think a thread could spark a race-condition (I have very limited knowledge of threads and thread safety).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Err, that should have read

try:
    proxy(*args, **kwargs)
except ReferenceException:
    self.remove_proxy(proxy)

Anything involving an event loop makes me twitchy. The cost of the try is minimal and may prevent inscrutable impossible to reproduce errors later. In this case I would rather be safe than sorry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cough ReferenceError.

And I think I should make remove_proxy private, a la _remove_proxy as we only every use the concept of proxies internally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making it _remove_proxy sounds like a good idea to me.

@tacaswell
Copy link
Member

Left a few comments trying to understand this code.

Thanks for tackling a rather hairy bit of the code base!

@OceanWolf
Copy link
Contributor Author

My pleasure, not so hairy once one gets to grips the low level python. I especially enjoyed coding in the callbacks into _BoundMethodProxy. As I described to a friend "a callback wrapper containing a callback wrapper to tell the outer callback wrapper that the callback wrapper no longer exists".

@tacaswell what do you mean by "two cases to to add, but only one to destroy"? I only see one case for registering a callback i.e. with cid = (signal, function).

try:
try:
self.inst = ref(cb.im_self)
self.inst = ref(cb.im_self, self._destroy)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the life cycle rules on the reference to this object that the ref assigned to self.inst holds?

[edited for clarity]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not crystal clear, I think you ask about the life-cycle of cb.im_self?

In which case, the same life-cycle as if it weren't here, from the definition of a weakref, it doesn't affect the life-cycle of the object it holds, i.e. as soon as the object looses all non-weakref-references it will get destroyed and GC'ed.

from weakref import ref

class Foo(object):
    pass

def make_ref(var):
    print var # 1st print
    r = ref(var)
    print r # 2nd print
    return r

print make_ref(Foo()) # 3rd print

prints

<__main__.Foo object at 0x7f2600f06650>
<weakref at 0x7f2600ef2e68; to 'Foo' at 0x7f2600f06650>
<weakref at 0x7f2600ef2e68; dead>

because var looses scope as soon as the function ends and thus gets destroyed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even fuller example:

from weakref import ref

class Foo(object):
    pass

def say_goodbye(wr):
    print 'Goodbye!', wr

def make_ref(var):
    print var
    r = ref(var, say_goodbye)
    print r
    return r

print make_ref(Foo())
print 'Done'

prints

<__main__.Foo object at 0x7ffe2aef0610>
<weakref at 0x7ffe2aedce68; to 'Foo' at 0x7ffe2aef0610>
Goodbye! <weakref at 0x7ffe2aedce68; dead>
<weakref at 0x7ffe2aedce68; dead>
Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that I am just being dense here. I am worried about the circular reference that the Proxy objects holds a (hard) ref to the weakref and the weakref holds a (I assume) hard ref back to the object. But I think that is ok as if the weakref dies the call-back will be be triggered and presumably released. However in the case where we lose all refs to the Proxy object, there is the 3 element cycle (proxy.inst -> ref -> proxy._destroy). But, again, I think that is the correct behaviour as if you register a call-back function you expect it to always run and keeping the proxy object around in probably the right move.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand you correctly, by object you refer to CallbackRegistry. Proxy holds no hardrefs back to CallbackRegistry, you see in _BoundMethodProxy.add_callback() I used a _BoundMethodProxy, aka weakref back to CallbackRegistry, only level one recursion used here (as we don't add a callback to the callback's callback), so no worries.

In the case where we lose all hardrefs to the object held in the Proxy object, the 3 element-cycle will run, and callback to all the callbacks still there (I added the try except here to deal with that).

In the case where we loose all hardrefs to the proxy object itself, then nothing happens because nothing needs to happen, it has no hardrefs to it so it just gets GC'ed like any other object.

@tacaswell
Copy link
Member

I think I have convinced my self that this is ok

@mdboom ?

for callback in self._callbacks:
try:
callback(self)
except ReferenceError: pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put pass on a newline here?

@mdboom
Copy link
Member

mdboom commented Feb 19, 2015

This all looks good and makes sense to me on reading, aside from my minor nits.

What I'm struggling with is a good suggested test strategy. It would be nice to add some unit tests for this (just at the CallbackRegistry level). Aside from that, I think we should confirm that all of the event_handling examples work now as they did before.

@OceanWolf
Copy link
Contributor Author

As for test strategy, would a modified version of the code above in comment 2 help? Do we do automated testing on all the examples as well?

@WeatherGod
Copy link
Member

I was just reviewing the section on the callback registry in my book when I
came across an important note I made to myself about a subtle intended
behavior. I haven't had time to read through the changes, so I will just
state it here in case it wasn't obvious in the original code. Each unique
combination of callback function and event name gets a unique cid, not each
unique callback function. You should be able to connect the same callback
to multiple events and get different ids for them.

Hopefully, these changes preserved that behavior.

On Thu, Feb 19, 2015 at 6:06 PM, OceanWolf notifications@github.com wrote:

As for test strategy, would a modified version of the code above in
comment 2 help? Do we do automated testing on all the examples as well?


Reply to this email directly or view it on GitHub
#4118 (comment)
.

@OceanWolf
Copy link
Contributor Author

@WeatherGod hehe, yes, I presumed that, I even said so above in reply to your comment ;).

@WeatherGod
Copy link
Member

heh, indeed. Carry on, then!

On Thu, Feb 19, 2015 at 10:45 PM, OceanWolf notifications@github.com
wrote:

@WeatherGod https://github.com/WeatherGod hehe, yes, I presumed that, I
even said so above in reply to your comment ;).


Reply to this email directly or view it on GitHub
#4118 (comment)
.

@OceanWolf
Copy link
Contributor Author

Okay, I would like to get this merged, what do I need to do?

I have started writing some unit tests. Should I also ensure that when a signal no longer has any callbacks attached to it, then the signal gets deleted as well? I.e. in _remove_proxy()

if len(callbacks[s] == 0):
    del  self._func_cid_map[s]
    del self.callbacks[s]

@tacaswell
Copy link
Member

If you are writing tests, might as well add them in this PR as well.

@OceanWolf
Copy link
Contributor Author

Okay I have added tests that now pass, so I think this can get merged. What do others think?

On an aside, it turns out that for python3, besides the lazy GC it worked fine, without the need for proxies, so this can get done away with when matplotlib no longer supports python2.

@tacaswell
Copy link
Member

@OceanWolf So, at the rate of python3 adoption I am seeing in about 10 years? 😈

@tacaswell
Copy link
Member

I am provisionally 👍 on this. I don't think we use class-based test fixtures anywhere else in the tests and a bit concerned about introducing them here.

@OceanWolf
Copy link
Contributor Author

Lol, something like that, ain't that long, only around 2, 2.5 terms of a government, country dependent. Though I would hope a bit sooner, I see the packages that I once saw as non-python3 and releasing a python3 version with six.

Anyway, with the class based test-features, do you mean tests that have classes in them? Because I see them everywhere in mpl, I see around 100 of them:

cd lib/matplotlib/tests
grep "class " * | wc

@WeatherGod
Copy link
Member

I think he is referring to class-based tests for the nosetests. Most of
them are function-based, but we have started adopting class-based ones, and
that is perfectly fine (they are really useful for easy setups and
teardowns and logical grouping of tests).

On Thu, Feb 26, 2015 at 10:49 AM, OceanWolf notifications@github.com
wrote:

Lol, something like that, ain't that long, only around 2, 2.5 terms of a
government, country dependent. Though I would hope a bit sooner, I see the
packages that I once saw as non-python3 and releasing a python3 version
with six.

Anyway, with the class based test-features, do you mean tests that have
classes in them? Because I see them everywhere in mpl, I see around 100 of
them:

cd lib/matplotlib/tests
grep "class " * | wc


Reply to this email directly or view it on GitHub
#4118 (comment)
.

@tacaswell
Copy link
Member

Most of our tests are functions that nose finds, rather than test classes which have test methods hanging off of them. It looks like we have 4 of them:

✔ ~/other_source/matplotlib/lib/matplotlib/tests [master L|✔] 
10:51 $ ack-grep -i 'class test'
test_legend.py
171:class TestLegendFunction(object):

test_transforms.py
359:class TestTransformPlotInterface(unittest.TestCase):

test_cbook.py
46:class Test_delete_masked_points(object):
98:class Test_boxplot_stats(object):

It looks like test_mlab uses the unit-test style tests (which I didn't even know we had).

tacaswell added a commit that referenced this pull request Mar 12, 2015
BUG : CallbackRegistry fix

Correctly proxy instance methods.
@tacaswell tacaswell merged commit c1a749a into matplotlib:master Mar 12, 2015
@tacaswell
Copy link
Member

Thank you!

@OceanWolf OceanWolf deleted the callback-registry-fix branch July 25, 2015 20:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants