Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make provided/implementedBy and adapter registries respect super(). #181

Merged
merged 3 commits into from Mar 10, 2020

Conversation

jamadden
Copy link
Member

@jamadden jamadden commented Mar 5, 2020

The query functions now start by looking at the next class in the MRO (interfaces directly provided by the underlying object are not found).

Adapter registries automatically pick up providedBy change to start finding the correct implementations of adapters, but to make that really useful they needed to change to unpack super() arguments and pass __self__ to the factory.

Fixes #11

Unfortunately, this makes PyPy unable to build the C extensions.

@icemac icemac requested a review from d-maurer March 6, 2020 15:42
@icemac
Copy link
Member

icemac commented Mar 6, 2020

@d-maurer I added you as reviewer as #11 was originally written by you.

Copy link
Contributor

@d-maurer d-maurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about the __self__ dereferencing before the objects are passed to the adapter factory. Personally, I would be more comfortable when the objects were passed unchanged.

I am quite sure that the mro should be determined from ob.__self_class__ not from ob.__thisclass__.

CHANGES.rst Show resolved Hide resolved
src/zope/interface/_zope_interface_coptimizations.c Outdated Show resolved Hide resolved
src/zope/interface/declarations.py Outdated Show resolved Hide resolved
@d-maurer
Copy link
Contributor

d-maurer commented Mar 7, 2020 via email

@d-maurer
Copy link
Contributor

d-maurer commented Mar 7, 2020 via email

@jamadden
Copy link
Member Author

jamadden commented Mar 8, 2020

CPython 3.5 is crashing on Travis/Linux, apparently hanging on Travis/Mac, and crashing locally for me on my Mac.

The local crash occurs in test_getMultiAdapter_hit_super (zope.interface.tests.test_registry.ComponentsTests) and looks like an issue computing a hash code (?) when getting a key in a dictionary (compiled with -g -O0:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x00007fff67191774 libsystem_malloc.dylib`large_malloc + 89
    frame #1: 0x00007fff67187357 libsystem_malloc.dylib`szone_malloc_should_clear + 235
    frame #2: 0x00007fff6718606a libsystem_malloc.dylib`malloc_zone_malloc + 104
    frame #3: 0x00007fff67185fe5 libsystem_malloc.dylib`malloc + 21
    frame #4: 0x000000010012f012 .Python`_PyObject_Alloc + 881
    frame #5: 0x0000000100120030 .Python`PyLong_FromDouble + 176
    frame #6: 0x00000001026f1f23 _zope_interface_coptimizations.cpython-35m-darwin.so`_lookup(self=0x000000010631cea8, required=0x0000000106322b08, provided=0x00000001060ff9a0, name=0x0000000100553ab0, default_=0x0000000000000000) at _zope_interface_coptimizations.c:1015:12
    frame #7: 0x00000001026f1b59 _zope_interface_coptimizations.cpython-35m-darwin.so`lookup_lookup(self=0x000000010631cea8, args=0x00000001063195e8, kwds=0x0000000000000000) at _zope_interface_coptimizations.c:1060:10
    frame #8: 0x000000010012c1f7 .Python`PyCFunction_Call + 107

where _zope_interface_coptimizations.c:1015 is result = PyDict_GetItem(cache, key);

@jamadden
Copy link
Member Author

jamadden commented Mar 8, 2020

The crash in 3.5 turned out to be a memory bug in tuple(map(func, it)) when func could return temporary objects. The fix is to replace map(func, it) with [func(i) for i in it]. This turns out to be faster too. [EDIT: That was only a partial fix; see below.]

The temporary objects in question are the objects created for providedBy(super(C, c)). Unlike providedBy(c), those aren't currently cached anywhere else (in an instance or class dict). But they'll wind up cached somewhat invisibly in adapter registries. They'll be found again (hashcodes and eq match up) so we're just doing some extra work computing the RO each time. I'm wondering if we should cache that in the class dict to avoid that.

Right now, I'm thinking leave it simple and revisit if it becomes a bottleneck. [EDIT: It was simple to cache, so I tried that.]

@jensens jensens requested a review from d-maurer March 8, 2020 16:48
Copy link
Contributor

@d-maurer d-maurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is still inconsistent. I set up the following test:

>>> from zope.interface import implementer, Interface, providedBy
>>> 
>>> class IB(Interface): pass
... 
>>> class IM1(Interface): pass
... 
>>> class IM2(Interface): pass
... 
>>> class ID(IB): pass
... 
>>> @implementer(IB)
... class B: pass
... 
>>> @implementer(IM1)
... class M1(B): pass
... 
>>> @implementer(IM2)
... class M2(B): pass
... 
>>> @implementer(ID)
... class D(M1, M2): pass
... 
>>> d = D()
>>> tuple(providedBy(d))
(<InterfaceClass __main__.ID>, <InterfaceClass __main__.IM1>, <InterfaceClass __main__.IB>, <InterfaceClass __main__.IM2>)
>>> tuple(providedBy(super(D, d)))
(<InterfaceClass __main__.IM1>, <InterfaceClass __main__.IB>, <InterfaceClass __main__.IM2>)
>>> tuple(providedBy(super(M1, d)))
(<InterfaceClass __main__.IM2>, <InterfaceClass __main__.IB>)
>>> tuple(providedBy(super(M2, d)))
()
>>> tuple(providedBy(super(B, d)))
()
>>> D.mro()
[<class '__main__.D'>, <class '__main__.M1'>, <class '__main__.M2'>, <class '__main__.B'>, <class 'object'>]

Either tuple(providedBy(super(M1, d))) should not contain IB or tuple(providedBy(super(M2, d))) should contain IB (because M1 and M2 are symmetric with respect to B). Very likely, the latter should be the case, as super(M2, d) is "d viewed as B object" and B implements IB.

I am also worried that the use of tuple(map(func, it)) (over [func(i) for i in it]) could cause a crash. In my view, this indicates that something is deeply wrong and may hit people without the capabilities to find out why and how to work around it. Did the problem only affected Python 3.5 or other Python 3 versions as well (in other works, were the bug in Python 3.5 or somewhere in zope.interface).

@jamadden
Copy link
Member Author

jamadden commented Mar 9, 2020

I am also worried that the use of tuple(map(func, it)) (over [func(i) for i in it]) could cause a crash. In my view, this indicates that something is deeply wrong and may hit people without the capabilities to find out why and how to work around it. Did the problem only affected Python 3.5 or other Python 3 versions as well (in other works, were the bug in Python 3.5 or somewhere in zope.interface).

Yeah, I was concerned about that too and kept looking into it. The crash only happened in CPython 3.5, and debugging showed that the call to tuplefy(required) resulted in the dictionary stored in the cache C variable getting deallocated. Since tuplefy was iterating a lazy mapping that's calling arbitrary Python code, arbitrary side-effects are possible. In this case, it turns out that one such side-effect was eventually calling components.changed(_), which drops the caches. Since cache contained a borrowed reference, its refcount went to 0 and it got deallocated. (This happened on every version of Python, but for some reason having to do with the dict implementation or memory management, it only crashed on CPython 3.5.)

The change to non-lazily iterate before calling into C is a sufficient fix for this code (although I can and will also eliminate the invoking of this side-effect in the first place). But this is a bug waiting to happen and the C code has been incorrect for a long time. The remainder of the fix is to not get the cache until we can be (more) sure there will be no more arbitrary side effects.

The query functions now start by looking at the next class in the MRO (interfaces directly provided by the underlying object are not found).

Adapter registries automatically pick up providedBy change to start finding the correct implementations of adapters, but to make that really useful they needed to change to unpack super() arguments and pass __self__ to the factory.

Fixes #11

Unfortunately, this makes PyPy unable to build the C extensions.

Additional crash-safety for adapter lookup.

Make the C functions get the cache only after resolving the
``required`` into a tuple, in case of side-effects like...clearing the
cache. This could lead to the ``cache`` object being deallocated
before we used it.

Drop the ``tuplefy`` function in favor of a direct call to
``PySequence_Tuple``. It's what the ``tuple`` constructor would do
anyway and saves a few steps.

Make sure that getting ``providedBy(super())`` and
``implementedBy(super())`` have no side effects.
@jamadden
Copy link
Member Author

jamadden commented Mar 9, 2020

Something is still inconsistent….Either tuple(providedBy(super(M1, d))) should not contain IB or tuple(providedBy(super(M2, d))) should contain IB (because M1 and M2 are symmetric with respect to B). Very likely, the latter should be the case, as super(M2, d) is "d viewed as B object" and B implements IB.

You're absolutely right. I've added that as a test case and corrected it.

@jamadden jamadden requested a review from d-maurer March 9, 2020 18:02
Copy link
Contributor

@d-maurer d-maurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay - maybe with cosmetic cleanup (removal of unused variables in C).

Should we have a test "adapter_hook hits super" to verify that the "self" dereferencing works in this case, too? Maybe, also for multi adapters?

src/zope/interface/_zope_interface_coptimizations.c Outdated Show resolved Hide resolved
src/zope/interface/_zope_interface_coptimizations.c Outdated Show resolved Hide resolved
Comment on lines +33 to +48
# ``tuple`` and ``list`` cooperate so that ``tuple([some list])``
# directly allocates and iterates at the C level without using a
# Python iterator. That's not the case for
# ``tuple(generator_expression)`` or ``tuple(map(func, it))``.
##
# 3.8
# ``tuple([t for t in range(10)])`` -> 610ns
# ``tuple(t for t in range(10))`` -> 696ns
# ``tuple(map(lambda t: t, range(10)))`` -> 881ns
##
# 2.7
# ``tuple([t fon t in range(10)])`` -> 625ns
# ``tuple(t for t in range(10))`` -> 665ns
# ``tuple(map(lambda t: t, range(10)))`` -> 958ns
#
# All three have substantial variance.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment looks somewhat unmotivated at this place (though I know from the conversation in this PR why it is somewhere).

@jamadden
Copy link
Member Author

Thanks for the detailed reviews and attention to detail!

Should we have a test "adapter_hook hits super" to verify that the "self" dereferencing works in this case, too? Maybe, also for multi adapters?

Both of those things are covered in the tests, but I will look at making them more explicit.

The DEFINE_STRING macro prevents the linter from seeing them as unused
so I temporarily redefined it to find all such variables.
@jamadden jamadden merged commit 354facc into master Mar 10, 2020
@jamadden jamadden deleted the issue11 branch March 10, 2020 11:16
@jamadden jamadden added this to the 5.0 milestone Mar 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

providedBy does not respect super
3 participants