Move Interface hashing and comparison to C; 2.5 to 15x speedup in micro benchmarks #183

Merged: jamadden merged 12 commits into master from faster-eq-hash-comparison on Mar 19, 2020

Member

@jamadden jamadden commented Mar 7, 2020

Included benchmark numbers:

Current master, Python 3.8:

.....................
contains (empty dict): Mean +- std dev: 198 ns +- 5 ns
.....................
contains (populated dict): Mean +- std dev: 197 ns +- 6 ns
.....................
contains (populated list): Mean +- std dev: 53.1 us +- 1.2 us

This code:

.....................
contains (empty dict): Mean +- std dev: 77.9 ns +- 2.3 ns
.....................
contains (populated dict): Mean +- std dev: 78.4 ns +- 3.1 ns
.....................
contains (populated list): Mean +- std dev: 3.69 us +- 0.08 us

So anywhere from 2.5 to 15x faster. Not sure how that will translate to larger (macro) benchmarks, but I'm hopeful. I also need to do some sorting benchmarks (e.g., using interfaces as keys in BTrees).
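
As a quick check on the "2.5 to 15x" summary, the ratios can be recomputed from the means quoted above. This is only arithmetic on the posted numbers (standard deviations are ignored), and the variable names are made up for illustration:

```python
# Mean timings copied from the benchmark output above, in seconds.
master = {
    "contains (empty dict)": 198e-9,
    "contains (populated dict)": 197e-9,
    "contains (populated list)": 53.1e-6,
}
branch = {
    "contains (empty dict)": 77.9e-9,
    "contains (populated dict)": 78.4e-9,
    "contains (populated list)": 3.69e-6,
}

# Speedup = old time / new time.
speedups = {name: master[name] / branch[name] for name in master}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x faster")
```

The dict cases come out at roughly 2.5x and the list case at roughly 14.4x.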

To make this work, InterfaceBase had to extend SpecificationBase to get a consistent class layout, and that's where the attributes for __name__ and __module__ moved. Neither of those is a public class, I think.

It turns out that messing with __module__ is nasty, tricky business, especially when you do it from C. Every time you define a new subclass, the descriptors that you set get overridden by the type machinery (PyType_Ready), and there are differences between Python 2 and 3. I'm using a data descriptor and a metaclass right now to avoid that, but I'm not super happy with that and would like to find a better way. (At least, maybe the data part of the descriptor isn't necessary?) It may be necessary to move more code into C, because I don't want a slowdown accessing __module__ either; copying around the standard PyGetSet or PyMember descriptors isn't enough because they don't work on the class object (so classImplements(InterfaceClass, IInterface) fails).
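
A pure-Python analogue of the shadowing problem may help here. It is only a sketch, not the C code in this PR, and `Base` and `Child` are made-up names. Every class statement stores a plain `__module__` string in the new class's namespace, so a `__module__` descriptor defined on a base class is hidden in each subclass:

```python
class Base:
    # A descriptor that is meant to compute __module__ for instances.
    @property
    def __module__(self):
        return "from-descriptor"


class Child(Base):
    pass


# On Base itself, the property is found and used for instances.
base_value = Base().__module__

# Defining Child stored a plain string under '__module__' in Child's own
# namespace, which comes first in the MRO and hides Base's property.
child_entry = Child.__dict__["__module__"]
child_value = Child().__module__

print(base_value)                  # from-descriptor
print(type(child_entry).__name__)  # str
print(child_value == child_entry)  # True
```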

There's currently one example doctest failure due to a changed class name.

@jamadden jamadden marked this pull request as ready for review March 11, 2020 23:42
@jamadden
Member Author

jamadden commented Mar 11, 2020

I think this is ready for review. I experimented with several different approaches (visible in the commit history), and ultimately a metaclass was best from both a simplicity and a performance standpoint.

Speaking of performance: Equality and hashing are back on a par with identity-based versions. Improvements overall look pretty nice.

| Benchmark                                       | 38-master-full | 38-faster-meta                |
|-------------------------------------------------|----------------|-------------------------------|
| read __module__                                 | 41.8 ns        | 40.9 ns: 1.02x faster (-2%)   |
| read __name__                                   | 41.8 ns        | 39.9 ns: 1.05x faster (-5%)   |
| read providedBy                                 | 56.9 ns        | 58.4 ns: 1.03x slower (+3%)   |
| query adapter (no registrations)                | 3.85 ms        | 2.95 ms: 1.31x faster (-24%)  |
| query adapter (all trivial registrations)       | 4.62 ms        | 3.63 ms: 1.27x faster (-21%)  |
| query adapter (registrations, wide inheritance) | 51.8 us        | 42.2 us: 1.23x faster (-19%)  |
| query adapter (registrations, deep inheritance) | 52.0 us        | 41.7 us: 1.25x faster (-20%)  |
| sort interfaces                                 | 234 us         | 29.9 us: 7.84x faster (-87%)  |
| sort mixed                                      | 569 us         | 340 us: 1.67x faster (-40%)   |
| contains (empty dict)                           | 135 ns         | 55.2 ns: 2.44x faster (-59%)  |
| contains (populated dict: interfaces)           | 137 ns         | 56.1 ns: 2.45x faster (-59%)  |
| contains (populated list: interfaces)           | 39.7 us        | 2.96 us: 13.42x faster (-93%) |
| contains (populated dict: implementedBy)        | 137 ns         | 55.2 ns: 2.48x faster (-60%)  |
| contains (populated list: implementedBy)        | 40.6 us        | 24.1 us: 1.68x faster (-41%)  |

Not significant (2): read __doc__; sort implementedBy

Sorting implementedBy objects didn't change; that makes sense, that part is still in Python.

I do see that I need to add some tests for the NotImplemented branch of comparisons.

@jamadden
Member Author

When running Plone's buildout.coredev test suite, I've had run times in the range of 22:40 to 26:11, across at least 5 different runs. With this branch, the run time was 19:06. (This is not a benchmark, and not repeatable, it's just an anecdote. But an encouraging one.)

There is one failure:

File "//plone.autoform-1.8.2-py2.7.egg/plone/autoform/tests/../autoform.rst", line 428, in autoform.rst
Failed example:
    IAnotherAnonymousSchema.__module__ = 'different.module'
Exception raised:
    Traceback (most recent call last):
      File "//lib/python2.7/doctest.py", line 1315, in __run
        compileflags, 1) in test.globs
      File "<doctest autoform.rst[75]>", line 1, in <module>
        IAnotherAnonymousSchema.__module__ = 'different.module'
    TypeError: readonly attribute

That test is explicitly checking for an older bug:

It is possible to have interfaces/schema that have an empty __name__
attribute, specifically in some cases where a schema is dynamically
created.  In such cases, it is possible to have a subclass of
AutoExtensibleForm implement a getPrefix() function as a sufficient
condition for group naming when autoGroups is True.

    Define some unnamed schema:

    >>> class IUnknownName(Interface):
    ...     this = schema.TextLine()
    ...
    >>> IUnknownName.__name__ = ''  # dynamic schema, empty __name__

    >>> class IAnotherAnonymousSchema(Interface):
    ...     that = schema.TextLine()
    ...
    >>> IAnotherAnonymousSchema.__name__ = ''

Fix for https://github.com/zopefoundation/zope.interface/issues/31

    >>> IAnotherAnonymousSchema.__module__ = 'different.module'

Because equality and hashing are both based on __name__ and __module__, and because various things derived from those two attributes are cached (the repr and hash code), reassigning them has long been a dangerous footgun. Since the test is talking about dynamic schemas, I think it could be changed to actually use dynamic schemas and not run into this problem:

IAnotherAnonymousSchema = InterfaceClass(
    '',
    (Interface,),
    {'that': schema.TextLine(), '__module__': 'different.module'})
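
The footgun described above (identity derived from name and module, with the hash cached) can be sketched without zope.interface at all. `NamedSpec` below is a hypothetical stand-in, not `InterfaceClass`, and it only illustrates the general hazard:

```python
class NamedSpec:
    """Hypothetical stand-in; not zope.interface's InterfaceClass."""

    def __init__(self, name, module):
        self.name = name
        self.module = module
        self._cached_hash = None

    def __eq__(self, other):
        if not isinstance(other, NamedSpec):
            return NotImplemented
        return (self.name, self.module) == (other.name, other.module)

    def __hash__(self):
        # Computed once from the identifying attributes, then reused.
        if self._cached_hash is None:
            self._cached_hash = hash((self.name, self.module))
        return self._cached_hash


spec = NamedSpec("", "original.module")
registry = {spec: "registered"}
hash_before = hash(spec)

# Reassigning an identifying attribute after the hash has been cached...
spec.module = "different.module"
twin = NamedSpec("", "different.module")

# ...leaves an object that compares equal to its twin but keeps the stale hash.
equal_to_twin = spec == twin
hash_unchanged = hash(spec) == hash_before
```

Because dictionaries check the hash before comparing for equality, `twin in registry` will almost certainly be false here even though `spec == twin` is true.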

Member

@jensens jensens left a comment

I finally found time to take a deeper look at these changes. Wow! It looks like it was quite an effort to get to this well-founded result. And I love it.

The problem in plone.autoform was fixed as suggested. I am not even sure whether what is tested there is a good/valid use case, but the case is still supported.

I took a look at the code and cannot find any problems so far. I would go for a merge from my POV.
Given the complexity of this PR, another reviewer would probably be a good idea.

@jamadden
Member Author

Thanks @jensens!

I'm happy to wait for more reviews. I agree it's a big PR, plus it touches C code, so more eyes are definitely better. I'd prefer not to wait more than a couple days, though. I'd really like to get these last few PRs merged and make a release before the end of the week.

@jamadden jamadden added this to the 5.0 milestone Mar 18, 2020
…ro benchmarks

Current results (this branch vs master, 354facc):

| Benchmark                                 | 38-master | 38-faster                     |
|-------------------------------------------|-----------|-------------------------------|
| query adapter (no registrations)          | 3.81 ms   | 3.03 ms: 1.26x faster (-20%)  |
| query adapter (all trivial registrations) | 4.65 ms   | 3.90 ms: 1.19x faster (-16%)  |
| contains (empty dict)                     | 163 ns    | 76.1 ns: 2.14x faster (-53%)  |
| contains (populated dict)                 | 162 ns    | 76.9 ns: 2.11x faster (-53%)  |
| contains (populated list)                 | 40.3 us   | 3.09 us: 13.04x faster (-92%) |

Also need benchmarks using inheritance. The 'implied' data structures
are also hash/equality based.
This is pretty, but it slows down all attribute access to interfaces
by up to 25%. I'm not sure that's acceptable for things like
Interface.providedBy.

+-------------------------------------------+------------+-------------------------------+
| Benchmark                                 | 38-master3 | 38-faster3                    |
+===========================================+============+===============================+
| read __module__                           | 41.1 ns    | 44.3 ns: 1.08x slower (+8%)   |
+-------------------------------------------+------------+-------------------------------+
| read __name__                             | 41.3 ns    | 51.6 ns: 1.25x slower (+25%)  |
+-------------------------------------------+------------+-------------------------------+
| read __doc__                              | 41.8 ns    | 53.3 ns: 1.28x slower (+28%)  |
+-------------------------------------------+------------+-------------------------------+
| read providedBy                           | 56.7 ns    | 71.6 ns: 1.26x slower (+26%)  |
+-------------------------------------------+------------+-------------------------------+
| query adapter (no registrations)          | 3.85 ms    | 2.95 ms: 1.31x faster (-23%)  |
+-------------------------------------------+------------+-------------------------------+
| query adapter (all trivial registrations) | 4.59 ms    | 3.65 ms: 1.26x faster (-20%)  |
+-------------------------------------------+------------+-------------------------------+
| contains (empty dict)                     | 136 ns     | 55.4 ns: 2.45x faster (-59%)  |
+-------------------------------------------+------------+-------------------------------+
| contains (populated dict)                 | 137 ns     | 55.0 ns: 2.49x faster (-60%)  |
+-------------------------------------------+------------+-------------------------------+
| contains (populated list)                 | 40.2 us    | 2.95 us: 13.62x faster (-93%) |
+-------------------------------------------+------------+-------------------------------+
This makes the rest of the attribute access fast again, but slows down
__module__.

+-------------------------------------------+------------+-------------------------------+
| Benchmark                                 | 38-master3 | 38-faster-descr               |
+===========================================+============+===============================+
| read __module__                           | 41.1 ns    | 123 ns: 2.99x slower (+199%)  |
+-------------------------------------------+------------+-------------------------------+
| read __name__                             | 41.3 ns    | 39.9 ns: 1.04x faster (-3%)   |
+-------------------------------------------+------------+-------------------------------+
| read __doc__                              | 41.8 ns    | 42.4 ns: 1.01x slower (+1%)   |
+-------------------------------------------+------------+-------------------------------+
| query adapter (no registrations)          | 3.85 ms    | 2.95 ms: 1.30x faster (-23%)  |
+-------------------------------------------+------------+-------------------------------+
| query adapter (all trivial registrations) | 4.59 ms    | 3.67 ms: 1.25x faster (-20%)  |
+-------------------------------------------+------------+-------------------------------+
| contains (empty dict)                     | 136 ns     | 54.8 ns: 2.48x faster (-60%)  |
+-------------------------------------------+------------+-------------------------------+
| contains (populated dict)                 | 137 ns     | 55.7 ns: 2.46x faster (-59%)  |
+-------------------------------------------+------------+-------------------------------+
| contains (populated list)                 | 40.2 us    | 2.86 us: 14.03x faster (-93%) |
+-------------------------------------------+------------+-------------------------------+

Not significant (1): read providedBy
This offers the absolute best performance at what seems like reasonable complexity.

+-------------------------------------------------------------+----------------+-------------------------------+
| Benchmark                                                   | 38-master-full | 38-faster-meta                |
+=============================================================+================+===============================+
| read __module__                                             | 41.8 ns        | 40.9 ns: 1.02x faster (-2%)   |
+-------------------------------------------------------------+----------------+-------------------------------+
| read __name__                                               | 41.8 ns        | 39.9 ns: 1.05x faster (-5%)   |
+-------------------------------------------------------------+----------------+-------------------------------+
| read providedBy                                             | 56.9 ns        | 58.4 ns: 1.03x slower (+3%)   |
+-------------------------------------------------------------+----------------+-------------------------------+
| query adapter (no registrations)                            | 3.85 ms        | 2.95 ms: 1.31x faster (-24%)  |
+-------------------------------------------------------------+----------------+-------------------------------+
| query adapter (all trivial registrations)                   | 4.62 ms        | 3.63 ms: 1.27x faster (-21%)  |
+-------------------------------------------------------------+----------------+-------------------------------+
| query adapter (all trivial registrations, wide inheritance) | 51.8 us        | 42.2 us: 1.23x faster (-19%)  |
+-------------------------------------------------------------+----------------+-------------------------------+
| query adapter (all trivial registrations, deep inheritance) | 52.0 us        | 41.7 us: 1.25x faster (-20%)  |
+-------------------------------------------------------------+----------------+-------------------------------+
| sort interfaces                                             | 234 us         | 29.9 us: 7.84x faster (-87%)  |
+-------------------------------------------------------------+----------------+-------------------------------+
| sort mixed                                                  | 569 us         | 340 us: 1.67x faster (-40%)   |
+-------------------------------------------------------------+----------------+-------------------------------+
| contains (empty dict)                                       | 135 ns         | 55.2 ns: 2.44x faster (-59%)  |
+-------------------------------------------------------------+----------------+-------------------------------+
| contains (populated dict: interfaces)                       | 137 ns         | 56.1 ns: 2.45x faster (-59%)  |
+-------------------------------------------------------------+----------------+-------------------------------+
| contains (populated list: interfaces)                       | 39.7 us        | 2.96 us: 13.42x faster (-93%) |
+-------------------------------------------------------------+----------------+-------------------------------+
| contains (populated dict: implementedBy)                    | 137 ns         | 55.2 ns: 2.48x faster (-60%)  |
+-------------------------------------------------------------+----------------+-------------------------------+
| contains (populated list: implementedBy)                    | 40.6 us        | 24.1 us: 1.68x faster (-41%)  |
+-------------------------------------------------------------+----------------+-------------------------------+

Not significant (2): read __doc__; sort implementedBy
Several places needed to, essentially, call super.
…ithout the required attributes.

And fix the C handling of this case.
@jensens
Member

jensens commented Mar 18, 2020

> I'd prefer not to wait more than a couple days, though. I'd really like to get these last few PRs merged and make a release before the end of the week.

+1

Member

@mgedmin mgedmin left a comment

Well, this is a PR. Very interesting reading. It probably works fine.

@@ -325,6 +330,9 @@ Spec_clear(Spec* self)
static void
Spec_dealloc(Spec* self)
{
    /* PyType_GenericAlloc that you get when you don't
       specify a tp_alloc always tracks the object. */
    PyObject_GC_UnTrack((PyObject *)self);
Member

Is this a bugfix for code in master? Or something that's now necessary because of other changes in this PR?

Member

k, now I think this is option B: changes in this PR (specifically, specifying a custom tp_alloc instead of 0)?

Member Author

When I added C fields to InterfaceBase, it was necessary to untrack it. As it now extends SpecificationBase I was going up the method hierarchy and realized it was necessary also to untrack SpecificationBase, since it also has fields that can contain arbitrary objects. So it's a bit of both? But really a bug in #155

CPB_clear(self);
Py_TYPE(self)->tp_free(OBJECT(self));
Spec_dealloc((Spec*)self);
Member

This is now calling PyObject_GC_UnTrack(self) twice, once here directly and once through Spec_dealloc()?

I've no idea what PyObject_GC_UnTrack() does, so just wanted to bring this to your attention in case the double call is unintentional.

Member Author

@jamadden jamadden Mar 18, 2020

Thanks. I double-checked, calling that API twice is fine.

# goes, we'll never have equal name and module to those, so we're still consistent there.
# Instances of this class are essentially intended to be unique and are
# heavily cached (note how our __reduce__ handles this) so having identity
# based hash and eq should also work.
Member

I like the 'should', especially when talking about correctness.

Member Author

@jamadden jamadden Mar 18, 2020

Hey, I just moved this comment from place to place. This PR didn't add it. It was added way back in #44 by some familiar names 😄

except:
# XXX: Do we really want to catch BaseException? Shouldn't
# things like MemoryError, KeyboardInterrupt, etc, get through?
except: # pylint:disable=bare-except
Member

yeah, this should be except Exception:

Member Author

That'll take some changes in the C code too. There are several cases like that; some I fixed already. And when I say fixed, I mean "made the Python code do except: like C" 😢

Member

If I get it right: Technically there is a difference in C between except Exception and a bare except?
But is there a difference semantically?

Member

Compatibility with C code is a good argument for not touching this right now. (But it should be mentioned in the comment, next to the XXX -- something like "if you fix it here, please also do that in the_right_function() in filename.c".)

Member Author

> Technically there is a difference in C between except Exception and a bare except?

Yes, much as in Python (where "bare except:" is a shortcut for except BaseException:). In C, a "bare except" is often not even explicitly written. If some function returns NULL (the universal indicator of failure), and that NULL result is ignored and the error is thrown away with PyErr_Clear(), that's basically like writing except: pass in Python. That looks like this:

  conform = PyObject_GetAttr(obj, str__conform__);
  if (conform != NULL) {
      return PyObject_CallMethodObjArgs(self, str_call_conform,
                                           conform, NULL);
  }
  else { // except:
    PyErr_Clear(); // pass
  }
  adapter = __adapt__(self, obj);

Catching specific exceptions is a bit more difficult and looks something like:

 othername = PyObject_GetAttrString(other, "__name__");
 if (!othername) {
     if (PyErr_Occurred() 
          && PyErr_ExceptionMatches(PyExc_AttributeError)) {
         PyErr_Clear();
     }
 }

Which is the rough equivalent of

try:
    othername = other.__name__
except AttributeError:
    pass
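
On the Python side, the practical difference between the two spellings shows up with exceptions that derive from BaseException but not from Exception, such as KeyboardInterrupt. A small sketch (the helper names are invented for illustration):

```python
def call_with_bare_except(func):
    try:
        return func()
    except:  # noqa: E722 - bare except, same as "except BaseException:"
        return "swallowed"


def call_with_except_exception(func):
    try:
        return func()
    except Exception:
        return "swallowed"


def interrupt():
    raise KeyboardInterrupt


# The bare except swallows even KeyboardInterrupt.
result_bare = call_with_bare_except(interrupt)

# "except Exception" lets it propagate to the caller.
try:
    call_with_except_exception(interrupt)
    propagated = False
except KeyboardInterrupt:
    propagated = True
```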

# That leaves us with a metaclass. Here we can have our cake and
# eat it too: no extra storage, and C-speed access to the
# underlying storage. The only cost is that metaclasses tend to
# make people's heads hurt. (But still less than the descriptor-is-string, I think.)
Member

My head hurts.

Member Author

Hopefully not too bad?

# a class, but of course it does it in C. :-/
__module__ = sys._getframe(1).f_globals['__name__']
except (AttributeError, KeyError): # pragma: no cover
pass
Member

So if this exception is caught and silently ignored...

moduledescr = InterfaceBase.__dict__['__module_property__']
attrs['__module__'] = moduledescr
kind = type.__new__(cls, name, bases, attrs)
kind.__module = __module__
Member

... then we fail with an UnboundLocalError here?

Why not let the original exception go through?

Member

Question is, when and why does sys._getframe(1).f_globals['__name__'] fail?

Member Author

Great points. I just cargo-culted that code from the __init__ of InterfaceClass. There, it's kinda prepared to deal with those errors (in that it won't fail immediately, but later on things like the repr and sorting are going to be messed up). Here, you're right, we're not prepared at all.
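
The failure mode the reviewers point out can be reproduced in isolation. This is a reduced sketch with an invented function name, not the code from the PR: if the lookup fails and the exception is swallowed, the name is never bound, and the later use raises UnboundLocalError instead of the original error.

```python
def pick_module(frame_globals):
    # Mirrors the shape of the reviewed snippet: the lookup may fail,
    # and the failure is silently ignored.
    try:
        module_name = frame_globals["__name__"]
    except (AttributeError, KeyError):
        pass
    # If the lookup failed, module_name was never bound.
    return module_name


found = pick_module({"__name__": "some.module"})

try:
    pick_module({})
except UnboundLocalError as exc:
    error_type = type(exc).__name__
else:
    error_type = None
```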

Reviewers weren't sure how it could be raised.
@jamadden jamadden merged commit dc719e2 into master Mar 19, 2020
@jamadden jamadden deleted the faster-eq-hash-comparison branch March 19, 2020 12:41
admin-turris pushed a commit to CZ-NIC/turris-os-packages that referenced this pull request Jul 10, 2020
5.1.0 (2020-04-08)
==================

- Make ``@implementer(*iface)`` and ``classImplements(cls, *iface)``
  ignore redundant interfaces. If the class already implements an
  interface through inheritance, it is no longer redeclared
  specifically for *cls*. This solves many instances of inconsistent
  resolution orders, while still allowing the interface to be declared
  for readability and maintenance purposes. See `issue 199
  <https://github.com/zopefoundation/zope.interface/issues/199>`_.

- Remove all bare ``except:`` statements. Previously, when accessing
  special attributes such as ``__provides__``, ``__providedBy__``,
  ``__class__`` and ``__conform__``, this package wrapped such access
  in a bare ``except:`` statement, meaning that many errors could pass
  silently; typically this would result in a fallback path being taken
  and sometimes (like with ``providedBy()``) the result would be
  nonsensical. This is especially true when those attributes are
  implemented with descriptors. Now, only ``AttributeError`` is
  caught. This makes errors more obvious.

  Obviously, this means that some exceptions will be propagated
  differently than before. In particular, ``RuntimeError`` raised by
  Acquisition in the case of circular containment will now be
  propagated. Previously, when adapting such a broken object, a
  ``TypeError`` would be the common result, but now it will be a more
  informative ``RuntimeError``.

  In addition, ZODB errors like ``POSKeyError`` could now be
  propagated where previously they would be ignored by this package.

  See `issue 200 <https://github.com/zopefoundation/zope.interface/issues/200>`_.

- Require that the second argument (*bases*) to ``InterfaceClass`` is
  a tuple. This only matters when directly using ``InterfaceClass`` to
  create new interfaces dynamically. Previously, an individual
  interface was allowed, but did not work correctly. Now it is
  consistent with ``type`` and requires a tuple.

- Let interfaces define custom ``__adapt__`` methods. This implements
  the other side of the :pep:`246` adaptation protocol: objects being
  adapted could already implement ``__conform__`` if they know about
  the interface, and now interfaces can implement ``__adapt__`` if
  they know about particular objects. There is no performance penalty
  for interfaces that do not supply custom ``__adapt__`` methods.

  This includes the ability to add new methods, or override existing
  interface methods using the new ``@interfacemethod`` decorator.

  See `issue 3 <https://github.com/zopefoundation/zope.interface/issues/3>`_.
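
The two sides of the adaptation protocol described in this entry can be sketched in plain Python. The `adapt` helper and both classes below are hypothetical and only illustrate the order of the hooks (object's ``__conform__`` first, then the protocol's ``__adapt__``); they are not zope.interface's API:

```python
def adapt(obj, protocol):
    """Hypothetical helper illustrating the two hooks; not zope.interface's API."""
    # 1. Ask the object first: it may know about the protocol.
    conform = getattr(obj, "__conform__", None)
    if conform is not None:
        result = conform(protocol)
        if result is not None:
            return result
    # 2. Then ask the protocol: it may know about the object.
    adapt_hook = getattr(protocol, "__adapt__", None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    raise TypeError(f"cannot adapt {obj!r} to {protocol!r}")


class Text:
    """Made-up protocol that knows how to adapt integers."""

    @classmethod
    def __adapt__(cls, obj):
        return str(obj) if isinstance(obj, int) else None


class Temperature:
    """Made-up object that knows about the Text protocol."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __conform__(self, protocol):
        return f"{self.degrees} degrees" if protocol is Text else None


via_conform = adapt(Temperature(21), Text)
via_adapt = adapt(42, Text)
```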

- Make the internal singleton object returned by APIs like
  ``implementedBy`` and ``directlyProvidedBy`` for objects that
  implement or provide no interfaces more immutable. Previously an
  internal cache could be mutated. See `issue 204
  <https://github.com/zopefoundation/zope.interface/issues/204>`_.

5.0.2 (2020-03-30)
==================

- Ensure that objects that implement no interfaces (such as direct
  subclasses of ``object``) still include ``Interface`` itself in
  their ``__iro__`` and ``__sro__``. This fixes adapter registry
  lookups for such objects when the adapter is registered for
  ``Interface``. See `issue 197
  <https://github.com/zopefoundation/zope.interface/issues/197>`_.

5.0.1 (2020-03-21)
==================

- Ensure the resolution order for ``InterfaceClass`` is consistent.
  See `issue 192 <https://github.com/zopefoundation/zope.interface/issues/192>`_.

- Ensure the resolution order for ``collections.OrderedDict`` is
  consistent on CPython 2. (It was already consistent on Python 3 and PyPy).

- Fix the handling of the ``ZOPE_INTERFACE_STRICT_IRO`` environment
  variable. Previously, ``ZOPE_INTERFACE_STRICT_RO`` was read, in
  contrast with the documentation. See `issue 194
  <https://github.com/zopefoundation/zope.interface/issues/194>`_.

5.0.0 (2020-03-19)
==================

- Make an internal singleton object returned by APIs like
  ``implementedBy`` and ``directlyProvidedBy`` immutable. Previously,
  it was fully mutable and allowed changing its ``__bases__``. That
  could potentially lead to wrong results in pathological corner
  cases. See `issue 158
  <https://github.com/zopefoundation/zope.interface/issues/158>`_.

- Support the ``PURE_PYTHON`` environment variable at runtime instead
  of just at wheel build time. A value of 0 forces the C extensions to
  be used (even on PyPy) failing if they aren't present. Any other
  value forces the Python implementation to be used, ignoring the C
  extensions. See `PR 151 <https://github.com/zopefoundation/zope.interface/pull/151>`_.

- Cache the result of ``__hash__`` method in ``InterfaceClass`` as a
  speed optimization. The method is called very often (i.e., several
  hundred thousand times during Plone 5.2 startup). Because the hash value never
  changes it can be cached. This improves test performance from 0.614s
  down to 0.575s (1.07x faster). In a real world Plone case a reindex
  index came down from 402s to 320s (1.26x faster). See `PR 156
  <https://github.com/zopefoundation/zope.interface/pull/156>`_.

- Change the C classes ``SpecificationBase`` and its subclass
  ``ClassProvidesBase`` to store implementation attributes in their structures
  instead of their instance dictionaries. This eliminates the use of
  an undocumented private C API function, and helps make some
  instances require less memory. See `PR 154 <https://github.com/zopefoundation/zope.interface/pull/154>`_.

- Reduce memory usage in other ways based on observations of usage
  patterns in Zope (3) and Plone code bases.

  - Specifications with no dependents are common (more than 50%) so
    avoid allocating a ``WeakKeyDictionary`` unless we need it.
  - Likewise, tagged values are relatively rare, so don't allocate a
    dictionary to hold them until they are used.
  - Use ``__slots__`` or the C equivalent ``tp_members`` in more
    common places. Note that this removes the ability to set arbitrary
    instance variables on certain objects.
    See `PR 155 <https://github.com/zopefoundation/zope.interface/pull/155>`_.

  The changes in this release resulted in a 7% memory reduction after
  loading about 6,000 modules that define about 2,200 interfaces.

  .. caution::

     Details of many private attributes have changed, and external use
     of those private attributes may break. In particular, the
     lifetime and default value of ``_v_attrs`` has changed.
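
The two memory techniques named in this entry, ``__slots__`` and allocating a dictionary only on first use, can be sketched together. ``LazySpec`` is a made-up class for illustration and does not reflect the package's actual attribute names:

```python
class LazySpec:
    """Made-up class; shows __slots__ plus a lazily created dictionary."""

    __slots__ = ("_tagged_values",)

    def __init__(self):
        # No dictionary is allocated until a tagged value is actually set.
        self._tagged_values = None

    def set_tagged_value(self, tag, value):
        if self._tagged_values is None:
            self._tagged_values = {}
        self._tagged_values[tag] = value

    def get_tagged_value(self, tag, default=None):
        if self._tagged_values is None:
            return default
        return self._tagged_values.get(tag, default)


spec = LazySpec()
allocated_before = spec._tagged_values is not None
spec.set_tagged_value("color", "blue")
allocated_after = spec._tagged_values is not None

# With __slots__ and no __dict__, arbitrary attributes are rejected.
try:
    spec.anything_else = 1
except AttributeError:
    rejected = True
else:
    rejected = False
```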

- Remove support for hashing uninitialized interfaces. This could only
  be done by subclassing ``InterfaceClass``. This has generated a
  warning since it was first added in 2011 (3.6.5). Please call the
  ``InterfaceClass`` constructor or otherwise set the appropriate
  fields in your subclass before attempting to hash or sort it. See
  `issue 157 <https://github.com/zopefoundation/zope.interface/issues/157>`_.

- Remove unneeded override of the ``__hash__`` method from
  ``zope.interface.declarations.Implements``. Profiling a reindex index
  process in ZCatalog with py-spy showed that, after 10k samples, the time for
  ``.adapter._lookup`` was reduced from 27.5s to 18.8s (~1.5x faster).
  Overall reindex index time shrank from 369s to 293s (1.26x faster).
  See `PR 161
  <https://github.com/zopefoundation/zope.interface/pull/161>`_.

- Make the Python implementation closer to the C implementation by
  ignoring all exceptions, not just ``AttributeError``, during (parts
  of) interface adaptation. See `issue 163
  <https://github.com/zopefoundation/zope.interface/issues/163>`_.

- Micro-optimization in ``.adapter._lookup``, ``.adapter._lookupAll``
  and ``.adapter._subscriptions``: By loading ``components.get`` into
  a local variable before entering the loop, a bytecode "LOAD_FAST 0
  (components)" in the loop can be eliminated. In Plone, while running
  all tests, average speedup of the "owntime" of ``_lookup`` is ~5x.
  See `PR 167
  <https://github.com/zopefoundation/zope.interface/pull/167>`_.
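
The hoisting technique described above looks roughly like this. The two functions are hypothetical simplifications written for illustration, not the actual ``_lookup`` code; they only show the bound method being resolved once instead of on every iteration.

```python
def lookup_attr_each_time(components, keys):
    found = []
    for key in keys:
        # Each iteration loads `components` and looks up its `get` attribute.
        value = components.get(key)
        if value is not None:
            found.append(value)
    return found


def lookup_hoisted(components, keys):
    get = components.get  # bound method resolved once, before the loop
    found = []
    for key in keys:
        value = get(key)
        if value is not None:
            found.append(value)
    return found


sample = {"a": 1, "b": 2}
plain_result = lookup_attr_each_time(sample, ["a", "missing", "b"])
hoisted_result = lookup_hoisted(sample, ["a", "missing", "b"])
```

Both functions return the same result; any speed difference depends on the interpreter and workload, so it is worth measuring before relying on it.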

- Add ``__all__`` declarations to all modules. This helps tools that
  do auto-completion and documentation and results in less cluttered
  results. Wildcard ("*") imports are not recommended and may be affected. See
  `issue 153
  <https://github.com/zopefoundation/zope.interface/issues/153>`_.
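
To show why wildcard imports may be affected, here is a small standard-library sketch. The module name ``demo_all_module`` and its functions are hypothetical; the module is built in memory purely for illustration.

```python
import sys
import types

# Build a throwaway module in memory (the name is hypothetical).
module = types.ModuleType("demo_all_module")
exec(
    "__all__ = ['public_function']\n"
    "def public_function():\n"
    "    return 'public'\n"
    "def helper():\n"
    "    return 'helper'\n",
    module.__dict__,
)
sys.modules[module.__name__] = module

# A wildcard import copies only the names listed in __all__.
namespace = {}
exec("from demo_all_module import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))

# Names left out of __all__ remain reachable through explicit access.
explicit = module.helper()
```

Code that relied on a wildcard import to pick up a name that is now omitted from ``__all__`` would need an explicit import instead.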

- Fix ``verifyClass`` and ``verifyObject`` for builtin types like
  ``dict`` that have methods taking an optional, unnamed argument with
  no default value like ``dict.pop``. On PyPy3, the verification is
  strict, but on PyPy2 (as on all versions of CPython) those methods
  cannot be verified and are ignored. See `issue 118
  <https://github.com/zopefoundation/zope.interface/issues/118>`_.

- Update the common interfaces ``IEnumerableMapping``,
  ``IExtendedReadMapping``, ``IExtendedWriteMapping``,
  ``IReadSequence`` and ``IUniqueMemberWriteSequence`` so that, on
  Python 3, they no longer require methods that were removed from
  Python 3, such as ``__setslice__``. Now, ``dict``, ``list`` and
  ``tuple`` properly verify as ``IFullMapping``, ``ISequence`` and
  ``IReadSequence``, respectively, on all versions of Python.

- Add human-readable ``__str__`` and ``__repr__`` to ``Attribute``
  and ``Method``. These contain the name of the defining interface
  and the attribute. For methods, it also includes the signature.

- Change the error strings raised by ``verifyObject`` and
  ``verifyClass``. They now include more human-readable information
  and exclude extraneous lines and spaces. See `issue 170
  <https://github.com/zopefoundation/zope.interface/issues/170>`_.

  .. caution:: This will break consumers (such as doctests) that
               depended on the exact error messages.

- Make ``verifyObject`` and ``verifyClass`` report all errors, if the
  candidate object has multiple detectable violations. Previously they
  reported only the first error. See `issue 171
  <https://github.com/zopefoundation/zope.interface/issues/171>`_.

  Like the above, this will break consumers depending on the exact
  output of error messages if more than one error is present.

- Add ``zope.interface.common.collections``,
  ``zope.interface.common.numbers``, and ``zope.interface.common.io``.
  These modules define interfaces based on the ABCs defined in the
  standard library ``collections.abc``, ``numbers`` and ``io``
  modules, respectively. Importing these modules will make the
  standard library concrete classes that are registered with those
  ABCs declare the appropriate interface. See `issue 138
  <https://github.com/zopefoundation/zope.interface/issues/138>`_.

- Add ``zope.interface.common.builtins``. This module defines
  interfaces of common builtin types, such as ``ITextString`` and
  ``IByteString``, ``IDict``, etc. These interfaces extend the
  appropriate interfaces from ``collections`` and ``numbers``, and the
  standard library classes implement them after importing this module.
  This is intended as a replacement for third-party packages like
  `dolmen.builtins <https://pypi.org/project/dolmen.builtins/>`_.
  See `issue 138 <https://github.com/zopefoundation/zope.interface/issues/138>`_.

- Make ``providedBy()`` and ``implementedBy()`` respect ``super``
  objects. For instance, if class ``Derived`` implements ``IDerived``
  and extends ``Base`` which in turn implements ``IBase``, then
  ``providedBy(super(Derived, derived))`` will return ``[IBase]``.
  Previously it would have returned ``[IDerived]`` (in general, it
  would previously have returned whatever would have been returned
  without ``super``).

  Along with this change, adapter registries will unpack ``super``
  objects into their ``__self__`` before passing it to the factory.
  Together, this means that ``component.getAdapter(super(Derived,
  self), ITarget)`` is now meaningful.

  See `issue 11 <https://github.com/zopefoundation/zope.interface/issues/11>`_.
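
The standard-library pieces this change builds on can be inspected directly. The sketch below uses only built-in ``super`` attributes and hypothetical classes; it does not call ``zope.interface`` and does not claim to mirror how the library uses these attributes internally.

```python
class Base:
    pass


class Derived(Base):
    pass


derived = Derived()
proxy = super(Derived, derived)

# The instance the super object was created for; this is what an
# adapter factory would want to receive instead of the proxy itself.
instance = proxy.__self__

# The class after which attribute lookup starts in the MRO.
start_after = proxy.__thisclass__
mro = type(instance).__mro__
remaining = mro[mro.index(start_after) + 1:]
```

Here ``remaining`` contains ``Base`` and ``object`` but not ``Derived``, which is the same "skip the current class" view that the ``providedBy(super(Derived, derived))`` example describes.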

- Fix a potential interpreter crash in the low-level adapter
  registry lookup functions. See issue 11.

- Adopt Python's standard `C3 resolution order
  <https://www.python.org/download/releases/2.3/mro/>`_ to compute the
  ``__iro__`` and ``__sro__`` of interfaces, with tweaks to support
  additional cases that are common in interfaces but disallowed for
  Python classes. Previously, an ad-hoc ordering that made no
  particular guarantees was used.

  This has many beneficial properties, including the fact that base
  interface and base classes tend to appear near the end of the
  resolution order instead of the beginning. The resolution order in
  general should be more predictable and consistent.

  .. caution::
     In some cases, especially with complex interface inheritance
     trees or when manually providing or implementing interfaces, the
     resulting IRO may be quite different. This may affect adapter
     lookup.

  The C3 order enforces some constraints in order to be able to
  guarantee a sensible ordering. Older versions of zope.interface did
  not impose similar constraints, so it was possible to create
  interfaces and declarations that are inconsistent with the C3
  constraints. In that event, zope.interface will still produce a
  resolution order equal to the old order, but it won't be guaranteed
  to be fully C3 compliant. In the future, strict enforcement of C3
  order may be the default.

  A set of environment variables and module constants allows
  controlling several aspects of this new behaviour. It is possible to
  request warnings about inconsistent resolution orders encountered,
  and even to forbid them. Differences between the C3 resolution order
  and the previous order can be logged, and, in extreme cases, the
  previous order can still be used (this ability will be removed in
  the future). For details, see the documentation for
  ``zope.interface.ro``.
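
Python's own class machinery gives a feel for the C3 behaviour described above. This sketch uses ordinary classes with hypothetical names, not interfaces, and it does not exercise ``zope.interface.ro``. It shows base classes landing near the end of the order, and a hierarchy that strict C3 rejects for Python classes.

```python
class Root:
    pass


class A(Root):
    pass


class B(Root):
    pass


class C(A, B):
    pass


# Shared bases sort after everything that derives from them.
order = [cls.__name__ for cls in C.__mro__]

# Listing a base before its own subclass has no consistent C3 order,
# so Python refuses to create the class.
try:
    type("Inconsistent", (Root, A), {})
    rejected = False
except TypeError:
    rejected = True
```

According to the entry above, interfaces tolerate some declarations of this inconsistent kind by falling back to an order that is not guaranteed to be fully C3 compliant, whereas Python classes raise an error.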

- Make inherited tagged values in interfaces respect the resolution
  order (``__iro__``), as method and attribute lookup does. Previously
  tagged values could give inconsistent results. See `issue 190
  <https://github.com/zopefoundation/zope.interface/issues/190>`_.

- Add ``getDirectTaggedValue`` (and related methods) to interfaces to
  allow accessing tagged values irrespective of inheritance. See
  `issue 190
  <https://github.com/zopefoundation/zope.interface/issues/190>`_.

- Ensure that ``Interface`` is always the last item in the ``__iro__``
  and ``__sro__``. This is usually the case, but if classes that do
  not implement any interfaces are part of a class inheritance
  hierarchy, ``Interface`` could be assigned too high a priority.
  See `issue 8 <https://github.com/zopefoundation/zope.interface/issues/8>`_.

- Implement sorting, equality, and hashing in C for ``Interface``
  objects. In micro benchmarks, this makes those operations 40% to 80%
  faster. This translates to a 20% speed up in querying adapters.

  Note that this changes certain implementation details. In
  particular, ``InterfaceClass`` now has a non-default metaclass, and
  it is enforced that ``__module__`` in instances of
  ``InterfaceClass`` is read-only.

  See `PR 183 <https://github.com/zopefoundation/zope.interface/pull/183>`_.