Make provided/implementedBy and adapter registries respect super(). #181

jamadden · 2020-03-05T16:29:03Z

The query functions now start by looking at the next class in the MRO (interfaces directly provided by the underlying object are not found).

Adapter registries automatically pick up providedBy change to start finding the correct implementations of adapters, but to make that really useful they needed to change to unpack super() arguments and pass __self__ to the factory.

Fixes #11

Unfortunately, this makes PyPy unable to build the C extensions.

icemac · 2020-03-06T15:43:20Z

@d-maurer I added you as reviewer as #11 was originally written by you.

d-maurer

I am not sure about the __self__ dereferencing before the objects are passed to the adapter factory. Personally, I would be more comfortable when the objects were passed unchanged.

I am quite sure that the mro should be determined from ob.__self_class__ not from ob.__thisclass__.

CHANGES.rst

src/zope/interface/_zope_interface_coptimizations.c

src/zope/interface/declarations.py

d-maurer · 2020-03-07T10:26:46Z

Jason Madden wrote at 2020-3-7 01:44 -0800:

jamadden commented on this pull request. > + mro = PyObject_CallMethodObjArgs(class_that_invoked_super, strmro, NULL); + _NSCHECK(mro, NULL); + + index = PySequence_Index(mro, class_that_invoked_super); I'm afraid you have that backwards. In `super(Class, self)` `__self_class__` is always the same as `type(self)`. It never changes and is the most derived class in the MRO (index 0). It's `__thisclass__` that changes to reflect where in the MRO the new starting search point actually is.

I am aware of this. That's why I am (quite) sure that what you need is mro = PyObject_CallMethodObjArgs(__self_class__, strmro, NULL); _NSCHECK(mro, NULL); index = PySequence_Index(mro, class_that_invoked_super); With your code, `index` will always be 0 as for any class it is the 0's element of its "mro".

d-maurer · 2020-03-07T10:35:31Z

Jason Madden wrote at 2020-3-7 02:05 -0800:

... It's not particularly useful because the `super` object cannot access instance variables of `self` (only class descriptors/methods), nor (as a consequence) can it discover any interfaces that are `directlyProvidedBy` the `self` object. It has no MRO or type and `super` doesn't work on it, so you'd only be able to try this trick for exactly one level of hierarchy. The receiving factory would need to unwrap `__self__` in short order to do any of those things��and at that point, you're writing an adapter that's customized to know it's working with `super`, which destroys reusability: given `class Base` and `class Derived(Base)` the same adapter can't work for `super(Derived, self)` *and* `Base()`.

I am convinced.

jamadden · 2020-03-08T15:16:36Z

CPython 3.5 is crashing on Travis/Linux, apparently hanging on Travis/Mac, and crashing locally for me on my Mac.

The local crash occurs in test_getMultiAdapter_hit_super (zope.interface.tests.test_registry.ComponentsTests) and looks like an issue computing a hash code (?) when getting a key in a dictionary (compiled with -g -O0:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x00007fff67191774 libsystem_malloc.dylib`large_malloc + 89
    frame #1: 0x00007fff67187357 libsystem_malloc.dylib`szone_malloc_should_clear + 235
    frame #2: 0x00007fff6718606a libsystem_malloc.dylib`malloc_zone_malloc + 104
    frame #3: 0x00007fff67185fe5 libsystem_malloc.dylib`malloc + 21
    frame #4: 0x000000010012f012 .Python`_PyObject_Alloc + 881
    frame #5: 0x0000000100120030 .Python`PyLong_FromDouble + 176
    frame #6: 0x00000001026f1f23 _zope_interface_coptimizations.cpython-35m-darwin.so`_lookup(self=0x000000010631cea8, required=0x0000000106322b08, provided=0x00000001060ff9a0, name=0x0000000100553ab0, default_=0x0000000000000000) at _zope_interface_coptimizations.c:1015:12
    frame #7: 0x00000001026f1b59 _zope_interface_coptimizations.cpython-35m-darwin.so`lookup_lookup(self=0x000000010631cea8, args=0x00000001063195e8, kwds=0x0000000000000000) at _zope_interface_coptimizations.c:1060:10
    frame #8: 0x000000010012c1f7 .Python`PyCFunction_Call + 107

where _zope_interface_coptimizations.c:1015 is result = PyDict_GetItem(cache, key);

jamadden · 2020-03-08T16:34:51Z

The crash in 3.5 turned out to be a memory bug in tuple(map(func, it)) when func could return temporary objects. The fix is to replace map(func, it) with [func(i) for i in it]. This turns out to be faster too. [EDIT: That was only a partial fix; see below.]

The temporary objects in question are the objects created for providedBy(super(C, c)). Unlike providedBy(c), those aren't currently cached anywhere else (in an instance or class dict). But they'll wind up cached somewhat invisibly in adapter registries. They'll be found again (hashcodes and eq match up) so we're just doing some extra work computing the RO each time. I'm wondering if we should cache that in the class dict to avoid that.

Right now, I'm thinking leave it simple and revisit if it becomes a bottleneck. [EDIT: It was simple to cache, so I tried that.]

d-maurer

Something is still inconsistent. I set up the following test:

>>> from zope.interface import implementer, Interface, providedBy
>>> 
>>> class IB(Interface): pass
... 
>>> class IM1(Interface): pass
... 
>>> class IM2(Interface): pass
... 
>>> class ID(IB): pass
... 
>>> @implementer(IB)
... class B: pass
... 
>>> @implementer(IM1)
... class M1(B): pass
... 
>>> @implementer(IM2)
... class M2(B): pass
... 
>>> @implementer(ID)
... class D(M1, M2): pass
... 
>>> d = D()
>>> tuple(providedBy(d))
(<InterfaceClass __main__.ID>, <InterfaceClass __main__.IM1>, <InterfaceClass __main__.IB>, <InterfaceClass __main__.IM2>)
>>> tuple(providedBy(super(D, d)))
(<InterfaceClass __main__.IM1>, <InterfaceClass __main__.IB>, <InterfaceClass __main__.IM2>)
>>> tuple(providedBy(super(M1, d)))
(<InterfaceClass __main__.IM2>, <InterfaceClass __main__.IB>)
>>> tuple(providedBy(super(M2, d)))
()
>>> tuple(providedBy(super(B, d)))
()
>>> D.mro()
[<class '__main__.D'>, <class '__main__.M1'>, <class '__main__.M2'>, <class '__main__.B'>, <class 'object'>]

Either tuple(providedBy(super(M1, d))) should not contain IB or tuple(providedBy(super(M2, d))) should contain IB (because M1 and M2 are symmetric with respect to B). Very likely, the latter should be the case, as super(M2, d) is "d viewed as B object" and B implements IB.

I am also worried that the use of tuple(map(func, it)) (over [func(i) for i in it]) could cause a crash. In my view, this indicates that something is deeply wrong and may hit people without the capabilities to find out why and how to work around it. Did the problem only affected Python 3.5 or other Python 3 versions as well (in other works, were the bug in Python 3.5 or somewhere in zope.interface).

jamadden · 2020-03-09T15:01:18Z

I am also worried that the use of tuple(map(func, it)) (over [func(i) for i in it]) could cause a crash. In my view, this indicates that something is deeply wrong and may hit people without the capabilities to find out why and how to work around it. Did the problem only affected Python 3.5 or other Python 3 versions as well (in other works, were the bug in Python 3.5 or somewhere in zope.interface).

Yeah, I was concerned about that too and kept looking into it. The crash only happened in CPython 3.5, and debugging showed that the call to tuplefy(required) resulted in the dictionary stored in the cache C variable getting deallocated. Since tuplefy was iterating a lazy mapping that's calling arbitrary Python code, arbitrary side-effects are possible. In this case, it turns out that one such side-effect was eventually calling components.changed(_), which drops the caches. Since cache contained a borrowed reference, its refcount went to 0 and it got deallocated. (This happened on every version of Python, but for some reason having to do with the dict implementation or memory management, it only crashed on CPython 3.5.)

The change to non-lazily iterate before calling into C is a sufficient fix for this code (although I can and will also eliminate the invoking of this side-effect in the first place). But this is a bug waiting to happen and the C code has been incorrect for a long time. The remainder of the fix is to not get the cache until we can be (more) sure there will be no more arbitrary side effects.

The query functions now start by looking at the next class in the MRO (interfaces directly provided by the underlying object are not found). Adapter registries automatically pick up providedBy change to start finding the correct implementations of adapters, but to make that really useful they needed to change to unpack super() arguments and pass __self__ to the factory. Fixes #11 Unfortunately, this makes PyPy unable to build the C extensions. Additional crash-safety for adapter lookup. Make the C functions get the cache only after resolving the ``required`` into a tuple, in case of side-effects like...clearing the cache. This could lead to the ``cache`` object being deallocated before we used it. Drop the ``tuplefy`` function in favor of a direct call to ``PySequence_Tuple``. It's what the ``tuple`` constructor would do anyway and saves a few steps. Make sure that getting ``providedBy(super())`` and ``implementedBy(super())`` have no side effects.

jamadden · 2020-03-09T17:52:43Z

Something is still inconsistent….Either tuple(providedBy(super(M1, d))) should not contain IB or tuple(providedBy(super(M2, d))) should contain IB (because M1 and M2 are symmetric with respect to B). Very likely, the latter should be the case, as super(M2, d) is "d viewed as B object" and B implements IB.

You're absolutely right. I've added that as a test case and corrected it.

d-maurer

Okay - maybe with cosmetic cleanup (removal of unused variables in C).

Should we have a test "adapter_hook hits super" to verify that the "self" dereferencing works in this case, too? Maybe, also for multi adapters?

src/zope/interface/_zope_interface_coptimizations.c

d-maurer · 2020-03-10T07:00:13Z

src/zope/interface/adapter.py

+# ``tuple`` and ``list`` cooperate so that ``tuple([some list])``
+# directly allocates and iterates at the C level without using a
+# Python iterator. That's not the case for
+# ``tuple(generator_expression)`` or ``tuple(map(func, it))``.
+##
+# 3.8
+# ``tuple([t for t in range(10)])``      -> 610ns
+# ``tuple(t for t in range(10))``        -> 696ns
+# ``tuple(map(lambda t: t, range(10)))`` -> 881ns
+##
+# 2.7
+# ``tuple([t fon t in range(10)])``      -> 625ns
+# ``tuple(t for t in range(10))``        -> 665ns
+# ``tuple(map(lambda t: t, range(10)))`` -> 958ns
+#
+# All three have substantial variance.


The comment looks somewhat unmotivated at this place (though I know from the conversation in this PR why it is somewhere).

jamadden · 2020-03-10T09:51:26Z

Thanks for the detailed reviews and attention to detail!

Should we have a test "adapter_hook hits super" to verify that the "self" dereferencing works in this case, too? Maybe, also for multi adapters?

Both of those things are covered in the tests, but I will look at making them more explicit.

The DEFINE_STRING macro prevents the linter from seeing them as unused so I temporarily redefined it to find all such variables.

icemac requested a review from d-maurer March 6, 2020 15:42

d-maurer requested changes Mar 7, 2020

View reviewed changes

CHANGES.rst Show resolved Hide resolved

src/zope/interface/_zope_interface_coptimizations.c Outdated Show resolved Hide resolved

src/zope/interface/declarations.py Outdated Show resolved Hide resolved

jensens requested a review from d-maurer March 8, 2020 16:48

d-maurer requested changes Mar 9, 2020

View reviewed changes

jamadden force-pushed the issue11 branch from 0fc4148 to 1e720c3 Compare March 9, 2020 17:51

jamadden requested a review from d-maurer March 9, 2020 18:02

d-maurer approved these changes Mar 10, 2020

View reviewed changes

jamadden mentioned this pull request Mar 10, 2020

A plan for 5.0.0 release #185

Closed

jamadden added 2 commits March 10, 2020 05:56

Remove unused str variables

364fd0c

The DEFINE_STRING macro prevents the linter from seeing them as unused so I temporarily redefined it to find all such variables.

Add additional low-level tests for super() unwrapping.

fb8180d

jamadden merged commit 354facc into master Mar 10, 2020

jamadden deleted the issue11 branch March 10, 2020 11:16

jamadden added this to the 5.0 milestone Mar 18, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make provided/implementedBy and adapter registries respect super(). #181

Make provided/implementedBy and adapter registries respect super(). #181

jamadden commented Mar 5, 2020 •

edited by icemac

icemac commented Mar 6, 2020

d-maurer left a comment

d-maurer commented Mar 7, 2020 via email

d-maurer commented Mar 7, 2020 via email

jamadden commented Mar 8, 2020

jamadden commented Mar 8, 2020 •

edited

d-maurer left a comment

jamadden commented Mar 9, 2020

jamadden commented Mar 9, 2020

d-maurer left a comment

d-maurer Mar 10, 2020

jamadden commented Mar 10, 2020

Make provided/implementedBy and adapter registries respect super(). #181

Make provided/implementedBy and adapter registries respect super(). #181

Conversation

jamadden commented Mar 5, 2020 • edited by icemac

icemac commented Mar 6, 2020

d-maurer left a comment

Choose a reason for hiding this comment

d-maurer commented Mar 7, 2020 via email

d-maurer commented Mar 7, 2020 via email

jamadden commented Mar 8, 2020

jamadden commented Mar 8, 2020 • edited

d-maurer left a comment

Choose a reason for hiding this comment

jamadden commented Mar 9, 2020

jamadden commented Mar 9, 2020

d-maurer left a comment

Choose a reason for hiding this comment

d-maurer Mar 10, 2020

Choose a reason for hiding this comment

jamadden commented Mar 10, 2020

jamadden commented Mar 5, 2020 •

edited by icemac

jamadden commented Mar 8, 2020 •

edited