Non-self-cached type hint caching x 1.
This commit is the first in a commit chain internally caching
**non-self-cached type hints** (i.e., hints that do *not* internally
cache themselves somewhere like PEP 563- or 585-compliant type hints)
and coercing semantically equal non-self-cached type hints into
syntactically equal `@beartype`-cached type hints, dramatically
improving both the space and time efficiency of such hints.
Specifically, this commit defines a new private `beartype._decor._cache`
subpackage, shifts the existing `beartype._decor._typistry` submodule
into that subpackage as `beartype._decor._cache.cachetype`, and defines
a new `beartype._decor._cache.cachehint` submodule with which to cache
non-self-cached type hints in a subsequent commit. (*Impressive ponderousness!*)
leycec committed Feb 12, 2021
1 parent 76b0d6c commit 871733e
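The self-caching asymmetry this commit message describes is directly observable in the stdlib (Python >= 3.9 for PEP 585 subscription syntax):

```python
import typing

# PEP 585 hints are *not* self-cached: each subscription of a builtin
# container type creates a new, distinct "types.GenericAlias" object...
pep585_a = list[int]
pep585_b = list[int]
assert pep585_a is not pep585_b

# ...although semantically equal hints share a machine-readable repr(),
# which is exactly what a repr()-keyed cache can exploit.
assert repr(pep585_a) == repr(pep585_b) == 'list[int]'

# "typing" hints, by contrast, cache themselves on subscription.
assert typing.List[int] is typing.List[int]
```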
Showing 14 changed files with 305 additions and 89 deletions.
20 changes: 11 additions & 9 deletions README.rst
Original file line number Diff line number Diff line change
@@ -326,28 +326,30 @@ integration_>`__. If you have any time, money, or motivation left, `annotate
callables with PEP-compliant type hints <Compliance_>`__ and `decorate those
callables with the @beartype.beartype decorator <Usage_>`__.

Prefer ``beartype`` over other runtime type checkers whenever you lack control
over the objects passed to or returned from your callables – *especially*
whenever you cannot limit the size of those objects. This includes common
developer scenarios like:
Prefer ``beartype`` over other runtime and static type checkers whenever you
lack control over the objects passed to or returned from your callables –
*especially* whenever you cannot limit the size of those objects. This includes
common developer scenarios like:

* You are the author of an **open-source library** intended to be reused by a
general audience.
* You are the author of a **public app** accepting as input or generating as
output sufficiently large data internally passed to or returned from app
callables.

Prefer ``beartype`` over static type checkers whenever:
If none of the above apply, prefer ``beartype`` over static type checkers
whenever:

* You want to `check types decidable only at runtime <Versus Static Type
Checkers_>`__.
* You want to JIT_ your code with PyPy_, :superscript:`...which you should`,
which most static type checkers remain incompatible with.

Even where none of the above apply, still use ``beartype``. It's `free as in
beer and speech <gratis versus libre_>`__ and `cost-free at installation- and
runtime <Overview_>`__. Leverage ``beartype`` until you find something that
suits you better, because ``beartype`` is *always* better than nothing.
Even if none of the above apply, still use ``beartype``. It's `free
as in beer and speech <gratis versus libre_>`__, `cost-free at installation-
and runtime <Overview_>`__, and transparently stacks with existing type
checking solutions. Leverage ``beartype`` until you find something that suits
you better, because ``beartype`` is *always* better than nothing.

Why should I use beartype?
--------------------------
Empty file.
115 changes: 115 additions & 0 deletions beartype/_decor/_cache/cachehint.py
@@ -0,0 +1,115 @@
#!/usr/bin/env python3
# --------------------( LICENSE )--------------------
# Copyright (c) 2014-2021 Cecil Curry.
# See "LICENSE" for further details.

'''
**Type hint cache** (i.e., singleton dictionary mapping from the
machine-readable representations of all non-self-cached type hints to those
hints).

This private submodule is *not* intended for importation by downstream callers.
'''

# ....................{ IMPORTS }....................
# See the "beartype.__init__" submodule for further commentary.
__all__ = ['STAR_IMPORTS_CONSIDERED_HARMFUL']

# ....................{ CACHES }....................
HINT_REPR_TO_HINT = {}
'''
**Type hint cache** (i.e., singleton dictionary mapping from the
machine-readable representations of all non-self-cached type hints to those
hints).

This dictionary caches:

* `PEP 585`_-compliant type hints, which do *not* cache themselves.
* `PEP 563`_-compliant **deferred type hints** (i.e., type hints persisted as
  evaluatable strings rather than actual type hints), enabled if the active
  Python interpreter targets either:

  * Python 3.7.0 *and* the module declaring this callable explicitly enables
    `PEP 563`_ support with a leading dunder importation of the form
    ``from __future__ import annotations``.
  * Python 4.0.0, where `PEP 563`_ is expected to be mandatory.

This dictionary does *not* cache:

* Type hints declared by the :mod:`typing` module, which implicitly cache
  themselves on subscription thanks to inscrutable metaclass magic.

Design
------
This dictionary does *not* bother caching **self-cached type hints** (i.e.,
type hints that externally cache themselves), as these hints are already cached
elsewhere. Self-cached type hints include most `PEP 484`_-compliant type hints
declared by the :mod:`typing` module, which means that subscripting type hints
declared by the :mod:`typing` module with the same child type hints reuses the
exact same internally cached objects rather than creating new uncached objects:
e.g.,

.. code-block:: python

   >>> import typing
   >>> typing.List[int] is typing.List[int]
   True

Equivalently, this dictionary *only* caches **non-self-cached type hints**
(i.e., type hints that do *not* externally cache themselves), as these hints
are *not* already cached elsewhere. Non-self-cached type hints include *all*
`PEP 585`_-compliant type hints produced by subscripting builtin container
types, which means that subscripting builtin container types with the same
child type hints creates new uncached objects rather than reusing the same
internally cached objects: e.g.,

.. code-block:: python

   >>> list[int] is list[int]
   False

Implementation
--------------
This dictionary is intentionally designed as a naive dictionary rather than a
robust LRU cache, for the same reasons that callables accepting hints are
memoized by the :func:`beartype._util.cache.utilcachecall.callable_cached`
decorator rather than the :func:`functools.lru_cache` decorator. Why? Because:

* The number of different type hints instantiated across even worst-case
  codebases is negligible in comparison to the space consumed by those hints.
* The :attr:`sys.modules` dictionary persists strong references to all
  callables declared by previously imported modules. In turn, the
  ``func.__annotations__`` dunder dictionary of each such callable persists
  strong references to all type hints annotating that callable. Together,
  these two facts imply that type hints are *never* garbage collected but
  instead persist for the lifetime of the active Python process. Ergo,
  temporarily caching hints in an LRU cache is pointless, as there are *no*
  space savings in dropping stale references to unused hints.

Motivation
----------
This dictionary enables callers to coerce non-self-cached type hints into
:mod:`beartype`-cached type hints. :mod:`beartype` effectively requires *all*
type hints to be cached somewhere! :mod:`beartype` does *not* care who, what,
or how is caching those type hints -- only that they are cached before being
passed to utility functions in the :mod:`beartype` codebase. Why? Because most
such utility functions are memoized for efficiency by the
:func:`beartype._util.cache.utilcachecall.callable_cached` decorator, which
maps passed parameters (typically including the standard ``hint`` parameter
accepting a type hint) based on object identity to previously cached return
values. You see the problem, we trust.

Non-self-cached type hints that are otherwise semantically equal are
nonetheless distinct objects and will thus be treated as distinct parameters by
memoization decorators. If this dictionary did *not* exist, non-self-cached
type hints could *not* be coerced into :mod:`beartype`-cached type hints and
thus could *not* be memoized, dramatically reducing the efficiency of
:mod:`beartype` for common type hints.

.. _PEP 484:
   https://www.python.org/dev/peps/pep-0484
.. _PEP 563:
   https://www.python.org/dev/peps/pep-0563
.. _PEP 585:
   https://www.python.org/dev/peps/pep-0585
'''
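The repr()-keyed coercion this cache enables can be sketched as follows. The ``coerce_hint`` helper below is a hypothetical illustration of the design, not an API this commit defines:

```python
# Hypothetical sketch: coerce semantically equal non-self-cached hints
# into one canonical object keyed on their machine-readable repr().
HINT_REPR_TO_HINT = {}

def coerce_hint(hint: object) -> object:
    '''Return the canonical cached hint sharing this hint's repr(),
    caching this hint as the canonical instance if its repr() is new.'''
    return HINT_REPR_TO_HINT.setdefault(repr(hint), hint)

# PEP 585 hints are not self-cached: each subscription is distinct...
assert list[int] is not list[int]
# ...but coercion maps semantically equal hints onto one object,
# making them safe keys for identity-based memoization.
assert coerce_hint(list[int]) is coerce_hint(list[int])
```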
Original file line number Diff line number Diff line change
@@ -146,7 +146,7 @@ def register_typistry_type(hint: type) -> str:
This function is syntactic sugar improving consistency throughout the
codebase, but is otherwise roughly equivalent to:
>>> from beartype._decor._typistry import bear_typistry
>>> from beartype._decor._cache.cachetype import bear_typistry
>>> from beartype._util.utilobject import get_object_classname
>>> bear_typistry[get_object_classname(hint)] = hint
97 changes: 88 additions & 9 deletions beartype/_decor/_code/_pep/_pephint.py
@@ -115,8 +115,57 @@
#* Grep the codebase for all existing uses of the @callable_cached decorator.
#* For each such use, if the decorated callable accepts a "hint" parameter,
# refactor that callable to use @callable_cached_hintable instead.

#FIXME: *WOOPS.* The "LRUDuffleCacheStrong" class designed below assumes that
#FIXME: *YIKES!* We are incredibly thankful we didn't actually do any of the
#above, but everything above is absolutely the *WRONG* approach. Yes, it would
#technically work, but it would be considerably slower, more fragile, and
#require considerably more work across the codebase than the preferable
#approach delineated below -- which is to say, everything above is bad.
#
#So, how do we do this the right way? *SIMPLE.* We "patch up" PEP 585-compliant
#type hints directly in the "func.__annotations__" dictionary once and only
#once sufficiently early in @beartype decoration that we don't actually need to
#do anything else, where "patch up" means:
#* If passed a callable decorated by a PEP 585-compliant type hint whose
# repr() is something that we've already seen, we *REPLACE* that hint in that
# callable's "func.__annotations__" dictionary with the hint we already saw.
#
#To do so, we should probably:
#* In the "beartype._decor._cache.cachehint" submodule:
# * Define a new private "HINT_REPR_TO_HINT" dictionary mapping from hint
# repr() strings to previously cached hints sharing the same repr() strings.
# This dictionary should actually be a trivial dictionary rather than a
# robust LRU cache, because the number of type hints used across a codebase
# is *ALWAYS* miniscule. Moreover, strong references to type hints are
# already stored in "func.__annotations__" dictionaries, so there's no space
# savings in dropping stale references to them in an LRU cache.
# * *ALL* hints except those that are already internally cached (e.g., by the
# "typing" module) should be cached into this dictionary. This obviously
# includes PEP 585-compliant type hints but also *ALL* hints produced by
# resolving deferred PEP 563-based type hint strings. Note that, in the
# latter case, we might want to additionally strip ignorable internal
# whitespace from those strings *IF* those strings contain such whitespace.
# We're pretty sure they don't (because they're programmatically constructed
# by the parser, we think), but we should still investigate this.
# * Any hint that appears in that cache should be *REPLACED* where it appears
# in the "func.__annotations__" dictionary with its cached value. Sweeeeeet.
#* Cache deferred annotations in the "beartype._decor._pep563" submodule. To do
# so, we probably want to define a new cache_hint_pep563() function in the new
# "beartype._decor._cache.cachehint" submodule. Note this function should
# internally defer to the new cache_hint_nonpep563() function detailed below
# (e.g., to ensure tuple unions are cached as typing unions).
#* Cache PEP 563 annotations in the "beartype._decor._code.codemain" or
# possibly "beartype._decor._code._pep.pepcode" submodule. *AHAH!* Yes. Here's
# what we want to do:
# * Shift the existing beartype._decor._code.pepcode.coerce_hint_pep()
# function into the new "beartype._decor._cache.cachehint" submodule as a
# new cache_hint_nonpep563() function. Note that if the passed hint was
# previously deferred and thus cached by a prior call to the
# cache_hint_pep563() function, then the current call to the
# cache_hint_nonpep563() function should just reduce to a noop.
#FIXME: *UNIT TEST THIS CACHE AND MAKE SURE IT ACTUALLY WORKS* for both PEP
#585- and 563-compliant hints, which are the principal use cases.
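The "patch up ``func.__annotations__``" idea sketched in the comments above might look like this. The ``patch_annotations`` name and logic are illustrative assumptions; the real coercion lands in later commits:

```python
# Hypothetical sketch of replacing each annotation whose repr() has
# been seen before with the previously cached hint sharing that repr().
HINT_REPR_TO_HINT = {}

def patch_annotations(func) -> None:
    for arg_name, hint in func.__annotations__.items():
        # Reassigning values of existing keys is safe during iteration.
        func.__annotations__[arg_name] = HINT_REPR_TO_HINT.setdefault(
            repr(hint), hint)

def muh_func(muh_arg: list[int]) -> list[int]: ...
def muh_other_func(other_arg: list[int]) -> None: ...

patch_annotations(muh_func)
patch_annotations(muh_other_func)

# Both callables now share one canonical "list[int]" hint object, so
# identity-keyed memoization treats them as the same parameter.
assert (muh_func.__annotations__['muh_arg'] is
        muh_other_func.__annotations__['other_arg'])
```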

#FIXME: *WOOPS.* The "LRUDuffleCacheStrong" class designed below assumes that
#calculating the semantic height of a type hint (e.g., 3 for the complex hint
#Optional[int, dict[Union[bool, tuple[int, ...], Sequence[set]], list[str]]])
#is largely trivial. It isn't -- at all. Computing that without a context-free
@@ -195,13 +244,20 @@
# greater than 1. This height *CANNOT* be defined during the first phase
# but *MUST* instead be deferred to the second phase.
# * ...probably loads more stuff, but that's fine.
#* In the second phase, another "while ...:" loop generates a Python code
# snippet type-checking the root hint and all child hints visitable from that
# hint in full by beginning *AT THE LAST CHILD HINT ADDED TO THE* "hints_meta"
# FixedList, generating code type-checking that hint, iteratively visiting all
# hints *IN THE REVERSE DIRECTION BACK UP THE TREE*, and so on.
#2. In the second phase, another "while ...:" loop generates a Python code
# snippet type-checking the root hint and all child hints visitable from that
# hint in full by beginning *AT THE LAST CHILD HINT ADDED TO THE*
# "hints_meta" FixedList, generating code type-checking that hint,
# iteratively visiting all hints *IN THE REVERSE DIRECTION BACK UP THE TREE*,
# and so on.
#
#That's insanely swag. It shames us that we only thought of it now. *sigh*
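The two-phase scheme above can be loosely sketched with stdlib ``typing`` introspection alone. This is purely illustrative; the real ``hints_meta`` FixedList carries far more per-hint state than a flat list of hints:

```python
import typing

def flatten_hint_bfs(hint: object) -> list:
    # Phase one: breadth-first flatten the hint tree into a list, so
    # every parent hint precedes all of its child hints.
    hints = [hint]
    hint_index = 0
    while hint_index < len(hints):
        hints.extend(typing.get_args(hints[hint_index]))
        hint_index += 1
    return hints

# Phase two would then iterate "reversed(hints)", generating code for
# each child hint *before* the parent hint embedding that code.
hints = flatten_hint_bfs(typing.List[typing.Union[int, str]])
assert list(reversed(hints))[:2] == [str, int]
```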
#FIXME: Note that this new approach will probably (hopefully only slightly)
#reduce decoration efficiency. This means that we should revert to optimizing
#the common case of PEP-noncompliant classes. Currently, we uselessly iterate
#over these classes with the same BFS below as we do PEP-compliant classes --
#which is extreme overkill. This will be trivial (albeit irksome) to revert,
#but it really is fairly crucial. *sigh*
#FIXME: Now that we actually have an audience (yay!), we *REALLY* need to avoid
#breaking anything. But implementing the above refactoring would absolutely
#break everything for an indeterminate period of time. So how do we do this?
@@ -240,7 +296,7 @@
#objects with the "beartypistry" is *NOT* a valid generic solution. That said,
#we *COULD* technically still do so for the subset of literal objects that are
#hashable -- which will probably be most of them, actually. To do so, we would
#then define a new beartype._decor._typistry.register_hashable() function
#then define a new beartype._decor._cache.cachetype.register_hashable() function
#registering a generic hashable. This would then necessitate a new prefix
#unique to hashables (e.g., "h"). In short, this actually entails quite a bit
#of work and fails in the general case. So, we might simply avoid this for now.
@@ -788,6 +844,29 @@
# cleanup routine *AFTER* code type-checking these parameters. While
# mildly inefficient, function calls incur considerably less overhead
# when compiled away from interpreted Python bytecode.
#FIXME: Note that the above scheme by definition *REQUIRES* assignment
#expressions and thus Python >= 3.8 for general-purpose O(1) type-checking of
#arbitrarily nested dictionaries and sets. Why? Because each time we iterate an
#iterator over those data structures we lose access to the previously iterated
#value, which means there is *NO* sane means of type-checking nested
#dictionaries or sets without assignment expressions. But that's unavoidable
#and Python <= 3.7 is the past, so that's largely fine.
#
#What we can do under Python <= 3.7, however, is the following:
#* If the (possibly nested) type hint is of the form
# "{checkable}[...,{dict_or_set}[{class},{class}],...]" where
# "{checkable}" is an arbitrary parent type hint safely checkable under Python
# <= 3.7 (e.g., lists, unions), "{dict_or_set}" is (wait for it) either "dict"
# or "set", and "{class}" is an arbitrary type, then that hint *IS* safely
# checkable under Python <= 3.7. Note that items (i.e., keys and values) can
# both be checked in O(1) time under Python <= 3.7 by just validating the key
# and value of a different key-value pair (e.g., by iterating once for the key
# and then again for the value). That does have the disadvantage of then
# requiring O(n) iteration to raise a human-readable exception if a dictionary
# value fails a type-check, but we're largely okay with that. Again, this only
# applies to an edge case under obsolete Python versions, so... *shrug*
#* Else, a non-fatal warning should be emitted and the portion of that type
# hint that *CANNOT* be safely checked under Python <= 3.7 should be ignored.
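A minimal sketch of the assignment-expression trick described above. The helper name and the sample-one-pair strategy are illustrative assumptions, not beartype's generated code:

```python
# Illustrative-only helper: shallow O(1) check of a "dict[str, int]"-style
# hint by sampling a single key-value pair with an assignment expression
# (Python >= 3.8). Without ":=", the pair grabbed from the items()
# iterator would be lost between the key check and the value check.
def check_dict_str_to_int(obj: object) -> bool:
    return isinstance(obj, dict) and (
        not obj or (
            isinstance((pair := next(iter(obj.items())))[0], str) and
            isinstance(pair[1], int)
        )
    )

assert check_dict_str_to_int({'muh_key': 0xBEA12})
assert not check_dict_str_to_int({0xBEA12: 'muh_key'})
assert check_dict_str_to_int({})  # vacuously true for empty dicts
```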

#FIXME: *WOOPS.* The "LRUCacheStrong" class is absolutely awesome and we'll
#absolutely be reusing that for various supplementary purposes across the
@@ -1195,7 +1274,7 @@
BeartypeDecorHintPepUnsupportedException,
BeartypeDecorHintPep484Exception,
)
from beartype._decor._typistry import (
from beartype._decor._cache.cachetype import (
register_typistry_forwardref,
register_typistry_type,
register_typistry_tuple,
2 changes: 1 addition & 1 deletion beartype/_decor/_code/_pep/pepcode.py
@@ -38,7 +38,7 @@
from beartype._decor._code._pep._pepsnip import (
PEP_CODE_PITH_ROOT_PARAM_NAME_PLACEHOLDER)
from beartype._decor._data import BeartypeData
from beartype._decor._typistry import register_typistry_forwardref
from beartype._decor._cache.cachetype import register_typistry_forwardref
from beartype._util.cache.utilcacheerror import reraise_exception_cached
from beartype._util.hint.utilhintget import (
get_hint_forwardref_classname_relative_to_obj,
2 changes: 2 additions & 0 deletions beartype/_decor/_code/codemain.py
@@ -853,6 +853,8 @@ def _code_check_params(data: BeartypeData) -> 'Tuple[str, bool]':
# is both PEP-compliant and supported, *OR* raise an exception
# otherwise (i.e., if this hint is neither PEP-noncompliant nor a
# supported PEP-compliant hint).
#
# Do this first *BEFORE* passing this hint to any further callables.
hint = coerce_hint_pep(
func=func,
pith_name=param_name,
2 changes: 1 addition & 1 deletion beartype/_decor/main.py
@@ -147,7 +147,7 @@
from beartype._decor._code.codesnip import (
ARG_NAME_FUNC, ARG_NAME_TYPISTRY)
from beartype._decor._data import BeartypeData
from beartype._decor._typistry import bear_typistry
from beartype._decor._cache.cachetype import bear_typistry
from beartype._util.cache.pool.utilcachepoolobjecttyped import (
acquire_object_typed, release_object_typed)
from beartype._decor._code._pep._error.peperror import (
2 changes: 1 addition & 1 deletion beartype/_util/utilclass.py
@@ -29,7 +29,7 @@
'''
Fully-qualified name of the module declaring all builtins followed by a ``.``,
defined purely as a trivial optimization for the frequently accessed
:class:`beartype._decor._typistry.Beartypistry.__setitem__` dunder method.
:class:`beartype._decor._cache.cachetype.Beartypistry.__setitem__` dunder method.
'''

# ....................{ VALIDATORS }....................
2 changes: 1 addition & 1 deletion beartype/roar.py
@@ -540,7 +540,7 @@ class _BeartypeDecorBeartypistryException(BeartypeDecorException):
This exception is raised at decoration time from the
:func:`beartype.beartype` decorator when erroneously accessing the
**beartypistry** (i.e., :class:`beartype._decor._typistry.bear_typistry`
**beartypistry** (i.e., :class:`beartype._decor._cache.cachetype.bear_typistry`
singleton).
This private exception denotes a critical internal issue and should thus
