Improve InterfaceClass __hash__ performance #156

Merged 2 commits on Jan 27, 2020

Member

@jensens jensens commented Jan 24, 2020

Performance optimization of __hash__ method on InterfaceClass.

The method is called very often (e.g. several hundred thousand times during a Plone 5.2 startup).
Even though a single hash is very fast, the calls add up to a significant share of processing time.
Because the hash value, calculated from the name and __module__, never changes (afaik), it can be cached.

I used Py-Spy to analyze the performance.
When reindexing an index in ZCatalog, __hash__ took ~9 seconds after 10000 samples before the change.
With a cached hash as in this implementation, __hash__ takes ~4.7 seconds.
The whole reindexing process came down from 402s to 320s (1.26x faster).
The test run (python setup.py test) goes from 0.614s down to 0.575s (1.07x faster).
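
The caching idea described in this comment can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the class `NamedThing` and the attribute name `_cached_hash` are made up for the example, and the sketch assumes the hash is derived from the name and module as stated above.

```python
class NamedThing:
    """Hypothetical stand-in for an interface object (not zope.interface code)."""

    def __init__(self, name, module):
        self.__name__ = name
        self.__module__ = module

    def __eq__(self, other):
        if not isinstance(other, NamedThing):
            return NotImplemented
        return (self.__name__, self.__module__) == (other.__name__, other.__module__)

    def __hash__(self):
        # Fast path: return the stored value if it was computed before.
        try:
            return self._cached_hash
        except AttributeError:
            # Slow path, taken once per object: compute and remember the hash.
            self._cached_hash = hash((self.__name__, self.__module__))
            return self._cached_hash


thing = NamedThing("IExample", "example.interfaces")
first = hash(thing)   # computes and stores the value
second = hash(thing)  # served from the cached attribute
```

The trade-off is that the cached value goes stale if the name or module is changed after the first call to `hash()`, which is the concern raised later in this conversation.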

@fschulze
Contributor

Looks good to me. If someone does anything which breaks this, they can be expected to adapt to the change.

@ale-rt
Member

ale-rt commented Jan 24, 2020

I am testing this.
I will let you know but looks sweet!

@jaroel

jaroel commented Jan 24, 2020

I'm pondering if it could be possible that an Interface instance is persisted, and then moved to a different module. How would that affect the hash?
In other words, would it be better/safer to use self._v_cached_hash?

@ctheune ctheune left a comment

I think this is fine. I guess the fall-through path has become a tiny bit more expensive but the regular cache access obviously is much faster than before. Someone questioned whether name or module can change ... it's Python so people could manipulate this but I don't think it practically will.

@ale-rt
Member

ale-rt commented Jan 25, 2020

I added this checkout to Plone 5.2 and found what could be a problem in some (not all) browser doctests (which are functional tests):

...
  File "/var/lib/jenkins/.buildout/eggs/plone.autoform-1.8.1-py3.7.egg/plone/autoform/utils.py", line 287, in processFields
    _process_fieldsets(form, schema, groups, all_fields, prefix, defaultGroup)
  File "/var/lib/jenkins/.buildout/eggs/plone.autoform-1.8.1-py3.7.egg/plone/autoform/utils.py", line 170, in _process_fieldsets
    group.fields += new_fields
  File "/var/lib/jenkins/.buildout/eggs/z3c.form-3.7.0-py3.7.egg/z3c/form/util.py", line 298, in __add__
    return self.__class__(self, other)
  File "/var/lib/jenkins/.buildout/eggs/z3c.form-3.7.0-py3.7.egg/z3c/form/field.py", line 136, in __init__
    raise ValueError("Duplicate name", name)
ValueError: ('Duplicate name', 'IDublinCore.creators')

Robot tests (which are also functional) do not suffer from this.

Just mentioning this as a note: it might be that plone.autoform needs an update, and Plone 5.2 is not really meant to work with zope.interface master.

Anyway, for me it is good to go :)

Member

@mgedmin mgedmin left a comment

lgtm

@jensens jensens merged commit eeaacb6 into master Jan 27, 2020
@jensens jensens deleted the hash_performance branch January 27, 2020 10:09
@jamadden
Member

> I'm pondering if it could be possible that an Interface instance is persisted, and then moved to a different module. How would that affect the hash?
> In other words, would it be better/safer to use self._v_cached_hash?

InterfaceClass defines __reduce__ which is used in preference to the default pickle behaviour of saving an instance's __dict__. __reduce__ simply returns the name of the interface (which is automatically combined with its __module__ by the pickle machinery) so unpickling turns into a global lookup of the object with the same name and module. Also, _v_ attributes are only special to subclasses of persistent.Persistent, which InterfaceClass isn't.

So this would matter only to a subclass of InterfaceClass that overrides __reduce__, potentially by subclassing Persistent.
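
The pickling-by-name behaviour described above can be reproduced with the standard library alone. The sketch below is only an illustration under stated assumptions: `ByNameObject`, the module name `demo_ifaces` and the attribute `_cached_hash` are hypothetical and do not come from zope.interface. It relies on the documented pickle rule that a string returned from `__reduce__` is treated as the name of a global.

```python
import pickle
import sys
import types


class ByNameObject:
    """Hypothetical object that pickles as a global name, not by its state."""

    def __init__(self, name, module):
        self.__name__ = name
        self.__module__ = module
        self._cached_hash = hash((name, module))  # instance state, never pickled

    def __reduce__(self):
        # Returning a string tells pickle to store a reference to a global;
        # pickle resolves it through the object's __module__.
        return self.__name__


# Register a throwaway module so the global lookup can succeed.
demo_module = types.ModuleType("demo_ifaces")
sys.modules["demo_ifaces"] = demo_module
demo_module.IDemo = ByNameObject("IDemo", "demo_ifaces")

data = pickle.dumps(demo_module.IDemo)
restored = pickle.loads(data)  # a global lookup, so the very same object
```

Because only the module and name are written to the pickle, nothing stored in the instance `__dict__` (such as a cached hash) is persisted or restored in this setup.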

On current master, it's not possible to subclass both Persistent and InterfaceClass:

>>> from zope import interface
>>> from persistent import Persistent
>>> class PersistentInterfaceClass(Persistent,
...           interface.interface.InterfaceClass):
...     pass
...
TypeError: multiple bases have instance lay-out conflict

On current released versions, it's not possible to pickle such an object because there are things in __dict__ that can't be pickled:

>>> from zope import interface
>>> from persistent import Persistent
>>> class PersistentInterfaceClass(Persistent,
...            interface.interface.InterfaceClass):
...     pass
...
>>> class IP(metaclass=PersistentInterfaceClass): pass
...
>>> IP
<__main__.PersistentInterfaceClass object at 0x10f5c59d0>
>>> type(IP)
<class '__main__.PersistentInterfaceClass'>
>>> import pickle
>>> pickle.dumps(IP)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: Can't pickle local object 'WeakKeyDictionary.__init__.<locals>.remove'

If you switch the order of inheritance around you can pickle it, but it goes back to pickling by name using __reduce__ and ignoring the persistence machinery. ZODB fails to pickle IP and also fails to pickle the reversed order of inheritance because there's no __getstate__ defined.

So it would take subclassing Persistent and InterfaceClass in the right order, and also defining __getstate__ in this subclass to attempt to pickle an interface in this way. Is anyone aware of something that does that, or that has a use case to do that?

Member

@jamadden jamadden left a comment

LGTM.

@jamadden jamadden added this to the 5.0 milestone Mar 18, 2020
admin-turris pushed a commit to CZ-NIC/turris-os-packages that referenced this pull request Jul 10, 2020
5.1.0 (2020-04-08)
==================

- Make ``@implementer(*iface)`` and ``classImplements(cls, *iface)``
  ignore redundant interfaces. If the class already implements an
  interface through inheritance, it is no longer redeclared
  specifically for *cls*. This solves many instances of inconsistent
  resolution orders, while still allowing the interface to be declared
  for readability and maintenance purposes. See `issue 199
  <https://github.com/zopefoundation/zope.interface/issues/199>`_.

- Remove all bare ``except:`` statements. Previously, when accessing
  special attributes such as ``__provides__``, ``__providedBy__``,
  ``__class__`` and ``__conform__``, this package wrapped such access
  in a bare ``except:`` statement, meaning that many errors could pass
  silently; typically this would result in a fallback path being taken
  and sometimes (like with ``providedBy()``) the result would be
  non-sensical. This is especially true when those attributes are
  implemented with descriptors. Now, only ``AttributeError`` is
  caught. This makes errors more obvious.

  Obviously, this means that some exceptions will be propagated
  differently than before. In particular, ``RuntimeError`` raised by
  Acquisition in the case of circular containment will now be
  propagated. Previously, when adapting such a broken object, a
  ``TypeError`` would be the common result, but now it will be a more
  informative ``RuntimeError``.

  In addition, ZODB errors like ``POSKeyError`` could now be
  propagated where previously they would be ignored by this package.

  See `issue 200 <https://github.com/zopefoundation/zope.interface/issues/200>`_.
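
The difference between a bare ``except:`` and catching only ``AttributeError`` can be sketched without zope.interface. The class `Broken` and the two lookup helpers are hypothetical names made up for this illustration; they only mimic the pattern of reading a special attribute with a fallback.

```python
class Broken:
    """Hypothetical object whose special attribute fails with a non-AttributeError."""

    @property
    def __provides__(self):
        raise RuntimeError("circular containment")


def lookup_bare(obj):
    try:
        return obj.__provides__
    except:  # noqa: E722 - swallows every error, including RuntimeError
        return "fallback"


def lookup_narrow(obj):
    try:
        return obj.__provides__
    except AttributeError:  # only a missing attribute triggers the fallback
        return "fallback"
```

With `lookup_bare`, the `RuntimeError` silently turns into the fallback value; with `lookup_narrow`, it reaches the caller, while an object that simply lacks the attribute still takes the fallback path.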

- Require that the second argument (*bases*) to ``InterfaceClass`` is
  a tuple. This only matters when directly using ``InterfaceClass`` to
  create new interfaces dynamically. Previously, an individual
  interface was allowed, but did not work correctly. Now it is
  consistent with ``type`` and requires a tuple.
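
The behaviour of the builtin ``type`` that this entry refers to can be checked directly. The sketch uses only ``type`` itself and does not call ``InterfaceClass``; that ``InterfaceClass`` rejects a non-tuple in a similar way is taken from the entry above, not demonstrated here.

```python
# type() requires the bases argument to be a tuple.
Good = type("Good", (object,), {})

try:
    type("Bad", object, {})  # a single class instead of a tuple of classes
except TypeError as exc:
    error = exc
else:
    error = None
```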

- Let interfaces define custom ``__adapt__`` methods. This implements
  the other side of the :pep:`246` adaptation protocol: objects being
  adapted could already implement ``__conform__`` if they know about
  the interface, and now interfaces can implement ``__adapt__`` if
  they know about particular objects. There is no performance penalty
  for interfaces that do not supply custom ``__adapt__`` methods.

  This includes the ability to add new methods, or override existing
  interface methods using the new ``@interfacemethod`` decorator.

  See `issue 3 <https://github.com/zopefoundation/zope.interface/issues/3>`_.

- Make the internal singleton object returned by APIs like
  ``implementedBy`` and ``directlyProvidedBy`` for objects that
  implement or provide no interfaces more immutable. Previously an
  internal cache could be mutated. See `issue 204
  <https://github.com/zopefoundation/zope.interface/issues/204>`_.

5.0.2 (2020-03-30)
==================

- Ensure that objects that implement no interfaces (such as direct
  subclasses of ``object``) still include ``Interface`` itself in
  their ``__iro__`` and ``__sro__``. This fixes adapter registry
  lookups for such objects when the adapter is registered for
  ``Interface``. See `issue 197
  <https://github.com/zopefoundation/zope.interface/issues/197>`_.

5.0.1 (2020-03-21)
==================

- Ensure the resolution order for ``InterfaceClass`` is consistent.
  See `issue 192 <https://github.com/zopefoundation/zope.interface/issues/192>`_.

- Ensure the resolution order for ``collections.OrderedDict`` is
  consistent on CPython 2. (It was already consistent on Python 3 and PyPy).

- Fix the handling of the ``ZOPE_INTERFACE_STRICT_IRO`` environment
  variable. Previously, ``ZOPE_INTERFACE_STRICT_RO`` was read, in
  contrast with the documentation. See `issue 194
  <https://github.com/zopefoundation/zope.interface/issues/194>`_.

5.0.0 (2020-03-19)
==================

- Make an internal singleton object returned by APIs like
  ``implementedBy`` and ``directlyProvidedBy`` immutable. Previously,
  it was fully mutable and allowed changing its ``__bases__``. That
  could potentially lead to wrong results in pathological corner
  cases. See `issue 158
  <https://github.com/zopefoundation/zope.interface/issues/158>`_.

- Support the ``PURE_PYTHON`` environment variable at runtime instead
  of just at wheel build time. A value of 0 forces the C extensions to
  be used (even on PyPy) failing if they aren't present. Any other
  value forces the Python implementation to be used, ignoring the C
  extensions. See `PR 151 <https://github.com/zopefoundation/zope.interface/pull/151>`_.

- Cache the result of the ``__hash__`` method in ``InterfaceClass`` as a
  speed optimization. The method is called very often (e.g., several
  hundred thousand times during Plone 5.2 startup). Because the hash value never
  changes it can be cached. This improves test performance from 0.614s
  down to 0.575s (1.07x faster). In a real-world Plone case, reindexing
  an index came down from 402s to 320s (1.26x faster). See `PR 156
  <https://github.com/zopefoundation/zope.interface/pull/156>`_.

- Change the C classes ``SpecificationBase`` and its subclass
  ``ClassProvidesBase`` to store implementation attributes in their structures
  instead of their instance dictionaries. This eliminates the use of
  an undocumented private C API function, and helps make some
  instances require less memory. See `PR 154 <https://github.com/zopefoundation/zope.interface/pull/154>`_.

- Reduce memory usage in other ways based on observations of usage
  patterns in Zope (3) and Plone code bases.

  - Specifications with no dependents are common (more than 50%) so
    avoid allocating a ``WeakKeyDictionary`` unless we need it.
  - Likewise, tagged values are relatively rare, so don't allocate a
    dictionary to hold them until they are used.
  - Use ``__slots__`` or the C equivalent ``tp_members`` in more
    common places. Note that this removes the ability to set arbitrary
    instance variables on certain objects.
    See `PR 155 <https://github.com/zopefoundation/zope.interface/pull/155>`_.

  The changes in this release resulted in a 7% memory reduction after
  loading about 6,000 modules that define about 2,200 interfaces.

  .. caution::

     Details of many private attributes have changed, and external use
     of those private attributes may break. In particular, the
     lifetime and default value of ``_v_attrs`` has changed.
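
Two of the memory-saving techniques listed in this entry, lazy allocation of rarely used dictionaries and ``__slots__``, can be sketched in plain Python. The class `Spec` and its method names are hypothetical and are not the zope.interface implementation.

```python
class Spec:
    """Hypothetical specification-like object; not the zope.interface class."""

    __slots__ = ("name", "_tagged")

    def __init__(self, name):
        self.name = name
        self._tagged = None  # no dict is allocated until a value is stored

    def set_tagged_value(self, key, value):
        if self._tagged is None:
            self._tagged = {}
        self._tagged[key] = value

    def get_tagged_value(self, key, default=None):
        if self._tagged is None:
            return default
        return self._tagged.get(key, default)


spec = Spec("IExample")
try:
    spec.extra = 1  # __slots__ leaves no instance __dict__ to hold this
except AttributeError:
    arbitrary_attributes_allowed = False
else:
    arbitrary_attributes_allowed = True
```

The failed assignment at the end shows the cost mentioned in the entry: objects defined with ``__slots__`` no longer accept arbitrary instance variables.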

- Remove support for hashing uninitialized interfaces. This could only
  be done by subclassing ``InterfaceClass``. This has generated a
  warning since it was first added in 2011 (3.6.5). Please call the
  ``InterfaceClass`` constructor or otherwise set the appropriate
  fields in your subclass before attempting to hash or sort it. See
  `issue 157 <https://github.com/zopefoundation/zope.interface/issues/157>`_.

- Remove unneeded override of the ``__hash__`` method from
  ``zope.interface.declarations.Implements``. Profiling a ZCatalog
  reindex of an index with Py-Spy (10k samples) showed the time for
  ``.adapter._lookup`` reduced from 27.5s to 18.8s (~1.5x faster).
  Overall reindexing time shrank from 369s to 293s (1.26x faster).
  See `PR 161
  <https://github.com/zopefoundation/zope.interface/pull/161>`_.

- Make the Python implementation closer to the C implementation by
  ignoring all exceptions, not just ``AttributeError``, during (parts
  of) interface adaptation. See `issue 163
  <https://github.com/zopefoundation/zope.interface/issues/163>`_.

- Micro-optimization in ``.adapter._lookup`` , ``.adapter._lookupAll``
  and ``.adapter._subscriptions``: By loading ``components.get`` into
  a local variable before entering the loop, a bytecode "LOAD_FAST 0
  (components)" in the loop can be eliminated. In Plone, while running
  all tests, average speedup of the "owntime" of ``_lookup`` is ~5x.
  See `PR 167
  <https://github.com/zopefoundation/zope.interface/pull/167>`_.
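
The kind of micro-optimization described in this entry can be sketched as follows. The two functions are simplified, hypothetical stand-ins and not the actual ``_lookup`` code; they only show the pattern of binding ``components.get`` to a local name before the loop.

```python
def lookup_plain(components, specs):
    found = []
    for spec in specs:
        # `components` is loaded and `.get` resolved on every iteration.
        value = components.get(spec)
        if value is not None:
            found.append(value)
    return found


def lookup_hoisted(components, specs):
    components_get = components.get  # bound method looked up once
    found = []
    for spec in specs:
        value = components_get(spec)
        if value is not None:
            found.append(value)
    return found
```

Both functions return the same result; the standard library ``dis`` module can be used to compare the bytecode of the two loop bodies on a given Python version.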

- Add ``__all__`` declarations to all modules. This helps tools that
  do auto-completion and documentation and results in less cluttered
  results. Wildcard ("*") imports are not recommended and may be affected. See
  `issue 153
  <https://github.com/zopefoundation/zope.interface/issues/153>`_.

- Fix ``verifyClass`` and ``verifyObject`` for builtin types like
  ``dict`` that have methods taking an optional, unnamed argument with
  no default value like ``dict.pop``. On PyPy3, the verification is
  strict, but on PyPy2 (as on all versions of CPython) those methods
  cannot be verified and are ignored. See `issue 118
  <https://github.com/zopefoundation/zope.interface/issues/118>`_.

- Update the common interfaces ``IEnumerableMapping``,
  ``IExtendedReadMapping``, ``IExtendedWriteMapping``,
  ``IReadSequence`` and ``IUniqueMemberWriteSequence`` so that, on
  Python 3, they no longer require methods that were removed from
  Python 3, such as ``__setslice__``. Now, ``dict``, ``list`` and
  ``tuple`` properly verify as ``IFullMapping``, ``ISequence`` and
  ``IReadSequence``, respectively, on all versions of Python.

- Add human-readable ``__str__`` and ``__repr__`` to ``Attribute``
  and ``Method``. These contain the name of the defining interface
  and the attribute. For methods, it also includes the signature.

- Change the error strings raised by ``verifyObject`` and
  ``verifyClass``. They now include more human-readable information
  and exclude extraneous lines and spaces. See `issue 170
  <https://github.com/zopefoundation/zope.interface/issues/170>`_.

  .. caution:: This will break consumers (such as doctests) that
               depended on the exact error messages.

- Make ``verifyObject`` and ``verifyClass`` report all errors, if the
  candidate object has multiple detectable violations. Previously they
  reported only the first error. See `issue 171
  <https://github.com/zopefoundation/zope.interface/issues/171>`_.

  Like the above, this will break consumers depending on the exact
  output of error messages if more than one error is present.

- Add ``zope.interface.common.collections``,
  ``zope.interface.common.numbers``, and ``zope.interface.common.io``.
  These modules define interfaces based on the ABCs defined in the
  standard library ``collections.abc``, ``numbers`` and ``io``
  modules, respectively. Importing these modules will make the
  standard library concrete classes that are registered with those
  ABCs declare the appropriate interface. See `issue 138
  <https://github.com/zopefoundation/zope.interface/issues/138>`_.

- Add ``zope.interface.common.builtins``. This module defines
  interfaces of common builtin types, such as ``ITextString`` and
  ``IByteString``, ``IDict``, etc. These interfaces extend the
  appropriate interfaces from ``collections`` and ``numbers``, and the
  standard library classes implement them after importing this module.
  This is intended as a replacement for third-party packages like
  `dolmen.builtins <https://pypi.org/project/dolmen.builtins/>`_.
  See `issue 138 <https://github.com/zopefoundation/zope.interface/issues/138>`_.

- Make ``providedBy()`` and ``implementedBy()`` respect ``super``
  objects. For instance, if class ``Derived`` implements ``IDerived``
  and extends ``Base`` which in turn implements ``IBase``, then
  ``providedBy(super(Derived, derived))`` will return ``[IBase]``.
  Previously it would have returned ``[IDerived]`` (in general, it
  would previously have returned whatever would have been returned
  without ``super``).

  Along with this change, adapter registries will unpack ``super``
  objects into their ``__self__`` before passing it to the factory.
  Together, this means that ``component.getAdapter(super(Derived,
  self), ITarget)`` is now meaningful.

  See `issue 11 <https://github.com/zopefoundation/zope.interface/issues/11>`_.
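
The attributes of ``super`` objects that make this kind of unpacking possible are part of plain Python. The sketch below uses hypothetical classes `Base` and `Derived` and does not call zope.interface; it only shows what a ``super`` object exposes.

```python
class Base:
    pass


class Derived(Base):
    pass


derived = Derived()
proxy = super(Derived, derived)

wrapped_instance = proxy.__self__   # the original `derived` object
start_class = proxy.__thisclass__   # Derived: lookup starts after this class
mro = Derived.__mro__
remaining_mro = mro[mro.index(start_class) + 1:]  # classes still consulted
```

Here `remaining_mro` contains `Base` and `object` but not `Derived`, which mirrors why only the interfaces of the base classes are reported for a ``super`` object.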

- Fix a potential interpreter crash in the low-level adapter
  registry lookup functions. See issue 11.

- Adopt Python's standard `C3 resolution order
  <https://www.python.org/download/releases/2.3/mro/>`_ to compute the
  ``__iro__`` and ``__sro__`` of interfaces, with tweaks to support
  additional cases that are common in interfaces but disallowed for
  Python classes. Previously, an ad-hoc ordering that made no
  particular guarantees was used.

  This has many beneficial properties, including the fact that base
  interface and base classes tend to appear near the end of the
  resolution order instead of the beginning. The resolution order in
  general should be more predictable and consistent.

  .. caution::
     In some cases, especially with complex interface inheritance
     trees or when manually providing or implementing interfaces, the
     resulting IRO may be quite different. This may affect adapter
     lookup.

  The C3 order enforces some constraints in order to be able to
  guarantee a sensible ordering. Older versions of zope.interface did
  not impose similar constraints, so it was possible to create
  interfaces and declarations that are inconsistent with the C3
  constraints. In that event, zope.interface will still produce a
  resolution order equal to the old order, but it won't be guaranteed
  to be fully C3 compliant. In the future, strict enforcement of C3
  order may be the default.

  A set of environment variables and module constants allows
  controlling several aspects of this new behaviour. It is possible to
  request warnings about inconsistent resolution orders encountered,
  and even to forbid them. Differences between the C3 resolution order
  and the previous order can be logged, and, in extreme cases, the
  previous order can still be used (this ability will be removed in
  the future). For details, see the documentation for
  ``zope.interface.ro``.
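
For readers unfamiliar with C3, its constraints can be observed on ordinary Python classes. The sketch below uses hypothetical class names and only Python's own MRO; unlike Python, zope.interface by default still produces an order for inconsistent declarations, as described above.

```python
class Root:
    pass


class Left(Root):
    pass


class Right(Root):
    pass


class Bottom(Left, Right):
    pass


# The shared base appears once, after both of its subclasses.
diamond_order = [cls.__name__ for cls in Bottom.__mro__]

try:
    # Root is listed before its own subclass Left: no consistent C3 order.
    type("Inconsistent", (Root, Left), {})
except TypeError:
    python_rejects_it = True
else:
    python_rejects_it = False
```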

- Make inherited tagged values in interfaces respect the resolution
  order (``__iro__``), as method and attribute lookup does. Previously
  tagged values could give inconsistent results. See `issue 190
  <https://github.com/zopefoundation/zope.interface/issues/190>`_.

- Add ``getDirectTaggedValue`` (and related methods) to interfaces to
  allow accessing tagged values irrespective of inheritance. See
  `issue 190
  <https://github.com/zopefoundation/zope.interface/issues/190>`_.

- Ensure that ``Interface`` is always the last item in the ``__iro__``
  and ``__sro__``. This is usually the case, but if classes that do
  not implement any interfaces are part of a class inheritance
  hierarchy, ``Interface`` could be assigned too high a priority.
  See `issue 8 <https://github.com/zopefoundation/zope.interface/issues/8>`_.

- Implement sorting, equality, and hashing in C for ``Interface``
  objects. In micro benchmarks, this makes those operations 40% to 80%
  faster. This translates to a 20% speed up in querying adapters.

  Note that this changes certain implementation details. In
  particular, ``InterfaceClass`` now has a non-default metaclass, and
  it is enforced that ``__module__`` in instances of
  ``InterfaceClass`` is read-only.

  See `PR 183 <https://github.com/zopefoundation/zope.interface/pull/183>`_.