[Feature Request] Plugin API, Phase III: The __beartype_hint__() protocol #192

Open
leycec opened this issue Nov 29, 2022 · 7 comments

@leycec
Member

leycec commented Nov 29, 2022

This feature request is the third milestone on the great journey to a first-class best-of-breed plugin architecture for @beartype. This milestone is orthogonal to the second milestone (i.e., beartype.sign); that is, these two plugin milestones may be implemented in any order. Indeed, despite offering more powerful plugin machinery than the second milestone, this milestone is arguably simpler to implement and document. In other words, we might want to do this one first. universe: 0. @beartype: 1.

In this phase, we introduce a new public __beartype_hint__() protocol enabling users to attach type hint transforms to arbitrary user-defined classes. Since type hint transforms include Turing-complete beartype validators (e.g., typing.Annotated[muh_class, beartype.vale.Is[lambda muh_instance: bool(muh_instance)]]), this protocol effectively enables users to define arbitrary runtime type-checking. Plugin architecture, I summon thee! 🧙‍♂️

By default, @beartype type-checks any passed parameter or return value against a user-defined type cls with a trivial isinstance(obj, cls) test at runtime. Now, however, you'll be able to enhance how your classes are type-checked with additional constraint checks dynamically defined by you at runtime and (more importantly) permanently attached to those classes and all instances of those classes.

Dig Deep Like Shovel Knight on a Digging Bender

Let's dig deep into what exactly we're talking about here.

Consider the following PEP 557-compliant dataclass:

from beartype import beartype
from beartype.typing import Literal, Optional
from dataclasses import dataclass

@beartype
@dataclass
class DoorsOfPerception(object):
    heaven_or_hell: Literal['heaven', 'hell']
    circle_of_hell: Optional[Literal[
        'limbo', 'lust', 'gluttony', 'greed', 'wrath',
        'heresy', 'violence', 'fraud', 'treachery',
    ]]

@beartype deeply type-checks typing.Literal[...] type hints and thus validates the heaven_or_hell field as expected. But what about the circle_of_hell type hint? Clearly, Heaven has no Circles of Hell. Clearly. We thus want @beartype to validate that:

  • If heaven_or_hell == 'heaven', then circle_of_hell is None.
  • If heaven_or_hell == 'hell', then circle_of_hell in ('limbo', 'lust', 'gluttony', 'greed', 'wrath', 'heresy', 'violence', 'fraud', 'treachery').
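
Spelled out as plain Python, the two conditions above amount to the predicate sketched below. This is only an illustration of the invariant, not anything @beartype generates; the names CIRCLES_OF_HELL and doors_are_consistent() are made up for this sketch.

```python
from typing import Optional

# The nine circles listed in the dataclass above.
CIRCLES_OF_HELL = (
    'limbo', 'lust', 'gluttony', 'greed', 'wrath',
    'heresy', 'violence', 'fraud', 'treachery',
)

def doors_are_consistent(
    heaven_or_hell: str, circle_of_hell: Optional[str]) -> bool:
    '''Hypothetical helper: True only if both fields agree with each other.'''
    if heaven_or_hell == 'heaven':
        # Heaven has no Circles of Hell.
        return circle_of_hell is None
    if heaven_or_hell == 'hell':
        # Hell requires exactly one of the known circles.
        return circle_of_hell in CIRCLES_OF_HELL
    # Anything else is neither heaven nor hell.
    return False
```

A plain isinstance() check cannot express this, because the valid values of one field depend on the value of another.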

In other words, we want @beartype to type-check conditionally depending upon the contents of the DoorsOfPerception instance being type-checked. How can we do this depraved act of fiendish difficulty? Simple.

Digging a Hole So Deep We Ended up on the Moon

We use the __beartype_hint__() protocol to instruct @beartype to apply additional constraints when runtime type-checking.

First, we easily define this protocol using the existing typing.Protocol machinery standardized by PEP 544:

from beartype.typing import Any, Protocol

class BeartypeHintable(Protocol):
    @classmethod
    def __beartype_hint__(cls) -> Any:
        '''
        **Beartype type hint transform** (i.e., :mod:`beartype`-specific
        dunder class method returning a new PEP-compliant type hint
        constraining this class with additional runtime type-checking).
        '''

        pass

Second, we augment our previously defined DoorsOfPerception dataclass with additional runtime type-checking defined via this protocol:

from beartype import BeartypeHintable, beartype
from beartype.typing import Any, Annotated, Literal, Optional
from beartype.vale import Is, IsAttr, IsEqual
from dataclasses import dataclass

@beartype
@dataclass
class DoorsOfPerception(BeartypeHintable):
    heaven_or_hell: Literal['heaven', 'hell']
    circle_of_hell: Optional[Literal[
        'limbo', 'lust', 'gluttony', 'greed', 'wrath',
        'heresy', 'violence', 'fraud', 'treachery',
    ]]

    @classmethod
    def __beartype_hint__(cls) -> Any:
        # Both IsAttr[...] checks inspect the instance itself, so they are
        # combined side by side rather than nested inside one another.
        return Annotated[cls,
            (IsAttr['heaven_or_hell', IsEqual['heaven']] &
             IsAttr['circle_of_hell',  IsEqual[None]]) |
            (IsAttr['heaven_or_hell', IsEqual['hell']] &
             IsAttr['circle_of_hell', ~IsEqual[None]])
        ]

Done and done. @beartype now detects that DoorsOfPerception satisfies the BeartypeHintable protocol and implicitly applies the additional conditional constraints defined by the __beartype_hint__() dunder method when type-checking instances of that dataclass at runtime.
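
To make the detection step concrete, here is a toy sketch of what "detect the dunder method, then apply its extra constraints" could look like using only the standard library. It is emphatically not how @beartype is implemented: toy_is_valid() and _circle_matches_place() are hypothetical names, and plain callables stand in for beartype validators inside Annotated[...].

```python
from dataclasses import dataclass
from typing import Annotated, Any, Optional, get_args, get_origin

def toy_is_valid(obj: object, cls: type) -> bool:
    '''Toy stand-in for a runtime type-checker (an assumption, not beartype).'''
    # Step 1: the default check, a trivial isinstance() test.
    if not isinstance(obj, cls):
        return False
    # Step 2: look for the hook; classes lacking it need nothing more.
    hook = getattr(cls, '__beartype_hint__', None)
    if hook is None:
        return True
    # Step 3: this toy only understands Annotated[cls, predicate, ...].
    hint = hook()
    if get_origin(hint) is not Annotated:
        return True
    _annotated_type, *predicates = get_args(hint)
    return all(predicate(obj) for predicate in predicates)

def _circle_matches_place(door) -> bool:
    # Heaven implies no circle; hell implies some circle.
    return (door.heaven_or_hell == 'heaven') == (door.circle_of_hell is None)

@dataclass
class DoorsOfPerception:
    heaven_or_hell: str
    circle_of_hell: Optional[str] = None

    @classmethod
    def __beartype_hint__(cls) -> Any:
        return Annotated[cls, _circle_matches_place]

heavenly = toy_is_valid(DoorsOfPerception('heaven'), DoorsOfPerception)
confused = toy_is_valid(DoorsOfPerception('heaven', 'lust'), DoorsOfPerception)
```

Here heavenly passes and confused fails, even though both objects are perfectly good DoorsOfPerception instances as far as isinstance() is concerned.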

Shorthand for Shorties

Okay. Everything is great, but it's also a bit long-winded. Can we improvise some sort of convenient syntactic sugar that elides away all of the boilerplate implied by actual real-world usage of the BeartypeHintable protocol?

Yes. Yes, we can. Consider this terse shorthand for implementing the exact same dataclass as implemented above – but with far fewer lines of DRY-violating scaffolding:

from beartype import beartype
from beartype.typing import Any, Annotated, Literal, Optional
from beartype.vale import Is, IsAttr, IsEqual
from dataclasses import dataclass

@beartype
@dataclass
class DoorsOfPerception(
    # Both IsAttr[...] checks inspect the instance itself, so they are
    # combined side by side rather than nested inside one another.
    (IsAttr['heaven_or_hell', IsEqual['heaven']] &
     IsAttr['circle_of_hell',  IsEqual[None]]) |
    (IsAttr['heaven_or_hell', IsEqual['hell']] &
     IsAttr['circle_of_hell', ~IsEqual[None]])
):
    heaven_or_hell: Literal['heaven', 'hell']
    circle_of_hell: Optional[Literal[
        'limbo', 'lust', 'gluttony', 'greed', 'wrath',
        'heresy', 'violence', 'fraud', 'treachery',
    ]]

...wut u say!?!?

So. The idea here is that beartype validators (i.e., public beartype.vale.Is*[...] constraint factories) can be generalized to be subclassable. When subclassed, beartype validators will automatically:

  • Replace themselves in the method resolution order (MRO) (__mro__) of the class being subclassed with the beartype.BeartypeHintable protocol.
  • Dynamically monkey-patch the class being subclassed by adding an appropriate __beartype_hint__() dunder classmethod to that class.

You are possibly now muttering to yourself in the candle-lit darkness of your own peculiar (wo)mancave: "B-b-but... that's impossible!" Actually, it's all too possible; I used to hate these sorts of maleficent shenanigans, because the typing module does exactly this sort of thing all over the place. PEP 484 refers to this decadent behaviour as "type erasure."

In fact, this sleight-of-hand was standardized over five years ago by PEP 560 via the mostly undocumented __mro_entries__() dunder method. By abusing __mro_entries__(), any type (including the type of beartype validators) can transparently replace itself on-the-fly with any other arbitrary type(s) in the MRO of subclasses.

In this case, the type of beartype validators will define a trivial __mro_entries__() that just replaces itself with BeartypeHintable when subclassed. Darkness: my old friend.
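
For anyone who has never seen __mro_entries__() in the wild, here is a minimal stdlib-only sketch of the trick. ToyValidator and this stand-in BeartypeHintable are hypothetical. Note one deliberate difference from the plan above: __mro_entries__() only receives the tuple of bases, not the class under construction, so instead of monkey-patching that class this sketch substitutes a generated mixin that already carries the classmethod.

```python
from typing import Annotated, Any, get_args

class BeartypeHintable:
    '''Hypothetical stand-in for the proposed mixin.'''

class ToyValidator:
    '''Toy validator object usable as a base class via PEP 560.'''

    def __init__(self, predicate):
        self.predicate = predicate

    def __mro_entries__(self, bases):
        # Called by Python because this base is an instance, not a class.
        # Whatever tuple we return replaces this object in the bases.
        validator = self

        class _Hinted(BeartypeHintable):
            @classmethod
            def __beartype_hint__(cls) -> Any:
                # "cls" is whichever subclass the hook is called on.
                return Annotated[cls, validator]

        return (_Hinted,)

class PositiveInt(int, ToyValidator(lambda number: number > 0)):
    pass

hint = PositiveInt.__beartype_hint__()
annotated_type, attached_validator = get_args(hint)
```

After class creation, the validator object itself is gone from PositiveInt.__mro__, issubclass(PositiveInt, BeartypeHintable) holds, and the original bases remain visible on PositiveInt.__orig_bases__.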

\o/

This syntactic sugar is strongly inspired by @antonagestam's wondrous phantom-types, which applies a similar syntax to similar great effect (albeit implemented in a completely different low-level way with __instancecheck__() dunder metaclass methods, which itself is endlessly fascinating). Thanks so much, @antonagestam!

That's Just How We Roll in 2023

In theory, the BeartypeHintable protocol should suffice to provide a complete plugin architecture. The __beartype_hint__() dunder method can be implemented to return Turing-complete beartype validators (i.e., typing.Annotated[{user_type}, beartype.vale.Is[{plugin_func}]]) capable of doing literally anything, including bad stuff, but let's ignore that.

Users: commence rejoicing once @leycec actually does something about this.

@rtbs-dev

YES. YESSSSS.

Ok, so once again, I have to digest this, but I'm pretty sure it completely nukes the usefulness of my efforts last night. STILL it would help me a great deal if you @leycec and perhaps @antonagestam would see if I was way off base with this proof-of-concept to realize the error of my protocol ways and embrace the way phantom-types hack-and-slashed the path through the jungle before me?

Meaning, I have a PR that is meant only to show a diff on how I think a beartype-backed phantom-type would operate... and it would help me a lot if I could understand what about that thing would change if this marvelous API was introduced.

see here pretty please?? 😄

@rtbs-dev

I'm probably gonna spam stuff, as I think of it, but first...

Say we have two BeartypeHintable objects, both with vale validators in their __init_subclass__ call. Ok so those will get transformed into __beartype_hint__'s as classmethods.

Then what happens if I don't explicitly use validators in an inheriting class, but it's implicit in the other classes' __beartype_hint__ already?

E.g. usage:

@beartype
class PositiveInt(int, Is[lambda x: x>0]):
    ...

@beartype
class PrimeInt(int, Is[sieve_eratosthenes]):
    ...

@beartype
class EvenInt(int, Is[lambda x: x%2==0]):
    ...

Then what happens with:

@beartype  # product type: 
class PositivePrime(PositiveInt, PrimeInt):
    ...

@beartype  # sum type: 
class PrimeOrEven(PrimeInt | EvenInt):
    ...

  • Would those also collect the __beartype_hint__'s from their arguments? Or would we need to wrap the arguments in e.g. IsSubclass[...]?
  • Could we provide tagged union narrowing? Either by default or with a BeartypeConfig option? Such that type(PrimeOrEven(17)) is PrimeInt at runtime? This would be sweeeet. Might require hacking into the Literal[...] system to get MyPy to care, but maybe this is possible with the sign system?

@rtbs-dev

OH another thought:

  • Can we... maybe... perhaps... consider being very bad and adding a sneaky __beartype_hint__ to the stuff in beartype.typing? 😅 This is a massive pro to using beartype.typing, and gives us a lot of functionality on base types for free, without messing with default type checkers at all really, right?

@rtbs-dev

rtbs-dev commented Dec 14, 2022

@leycec another thing I'm encountering (how do I always manage to break stuff the very first time I try to use said thing??) in my particular flavor of "the wild":

Annotated Generics

Yeah, so... I want a pluggable system for folks to add parameters to a "type" and have beartype.vale validators check those parameterized types at their runtime. For instance, say I want to make a sort of "schema" for the pandas-esque (and very cool) static-frame library:

import static_frame as sf
import beartype.typing as bt
from beartype.vale import IsAttr, IsEqual
from beartype.door import is_bearable

# first we need to make typevars for generics
EntityKind = bt.TypeVar('EntityKind', bound=str)

FlagArray = bt.Annotated[
    sf.SeriesHE, # it's a series
    IsAttr['name', IsEqual[bt.Tuple[EntityKind, EntityKind]]]  # it's got a (name1,name2) name
    ## ^ this part should propagate "hey this thing is generic!"
    & IsAttr['index', IsAttr['depth', IsEqual[2]]]  # <- this means it's hierarchical
]

"""doesn't work either: 
FlagArray = bt.Annotated[
    object, 
    IsInstance[sf.SeriesHE]
    & IsAttr['name', IsEqual[bt.Tuple[EntityKind, EntityKind]]]
    & IsAttr['index', IsAttr['depth', IsEqual[2]]]
]
"""

trying to subscript FlagArray now won't work: FlagArray['e','v'] gives

TypeError: typing.Annotated[object, IsInstance[static_frame.core.series.SeriesHE] & IsAttr['name', IsEqual[tuple[~EntityKind, ~EntityKind]]] & IsAttr['index', IsAttr['depth', IsEqual[2]]]] is not a generic class

Same for FlagArray[bt.Literal['e'],bt.Literal['v']], but in this case I don't think that's the point... I think Annotated is explicitly stopping us from using generics in the metadata part.

SO... how to make parameterized Validators for reuse? And... can I use the type parameterization system with this new proposal?
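
A stdlib-only sketch of what seems to be going on, and of one possible workaround. The typing module only counts type variables appearing in the annotated type, never in the metadata, so an Annotated[...] alias whose only TypeVar sits in the metadata is not subscriptable. A plain factory function sidesteps that by building a fresh hint per parameter set. NameIs and flag_array_hint() are hypothetical; with beartype, the same idea would presumably build the hint from IsAttr[...] and IsEqual[...] instead, which is an assumption.

```python
from types import SimpleNamespace
from typing import Annotated, List, TypeVar, get_args

T = TypeVar('T')

class NameIs:
    '''Hypothetical metadata object: checks an object's "name" attribute.'''

    def __init__(self, expected):
        self.expected = expected

    def __call__(self, obj) -> bool:
        return getattr(obj, 'name', None) == self.expected

# A TypeVar buried in the metadata is invisible to typing...
MetadataOnly = Annotated[object, NameIs(T)]
try:
    MetadataOnly[str]
    subscript_failed = False
except TypeError:
    # ..."is not a generic class", as in the traceback above.
    subscript_failed = True

# ...whereas a TypeVar in the annotated type itself is fine.
TypeLevel = Annotated[List[T], 'some metadata']
specialized = TypeLevel[int]

def flag_array_hint(first: str, second: str):
    '''Hypothetical factory: one concrete hint per (first, second) pair.'''
    return Annotated[object, NameIs((first, second))]

hint = flag_array_hint('e', 'v')
name_check = get_args(hint)[1]
matches = name_check(SimpleNamespace(name=('e', 'v')))
```

The factory gives up the FlagArray['e', 'v'] spelling in exchange for flag_array_hint('e', 'v'), but the resulting hint is an ordinary Annotated[...] object.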

e.g., using the new PEP 695 syntax, since I'm tired of typing TypeVar:

@beartype
@dataclass
def DoorsOfPerception[T<:str](IsAttr['niceplace_or_hell',
    (IsEqual[T] & IsAttr['circle_of_hell',  IsEqual[None]]) |
    (IsEqual['hell'] & IsAttr['circle_of_hell', ~IsEqual[None]])
]):
    niceplace_or_hell: T | Literal['hell']
    circle_of_hell: Optional[Literal[
        'limbo', 'lust', 'gluttony', 'greed', 'wrath',
        'heresy', 'violence', 'fraud', 'treachery',
    ]]

So I want to get something like, yknow, DoorsOfPerception[Literal['valhalla']] that gives my users some level of extensibility... who am I to say DoorsOfPerception[Literal['pizzahut']] isn't their most useful version of the class?? [1]

Is that... legal?

I mean...
What kind of question is that?! This is python, damnit!

The issue here is, I'm not even really sure how you're supposed to use generic parameters at runtime, anyway. All of your Is* magic type things are black-magic to me, so I presume I would only be able to do something like this with the __mro_entries__ sorcery you're cooking up, anyway.

EDIT: looks like PEP 675 people beat me to it, but this begs the questions 1) why we don't also get LiteralEnum, LiteralBool, etc, since Literal[T] seems way more elegant, and 2) is the equivalent possible, now, with your IsEqual system? Like, how to make an "IsEqualMyType(IsEqual)" subtype that checks for equality but higher-kinded? IsEqualFactory[MyType]-> IsEqual[T<:MyType]?

Footnotes

  1. which is another thing I was today-years-old when I learned that I am strictly forbidden from creating Literal[T] for typevar T, or anything of this kind, since apparently that is bad, and I should feel bad. So... now I can't get out of my head that I should have a runtime way to allow users to pass an x: str parameter and have a special context where I automatically assume they need Literal[x] at runtime. WHY? BECAUSE D.R.Y., THAT'S WHY. Sheesh, it's like the python.typing people want us to type Literal and Annotated and Union and all these verbose things as a form of penance for daring to have a type system in python at all in the first place 😨

@leycec
Member Author

leycec commented Dec 15, 2022

Gah! So sorry for the radio silence. I've been delving deep into the stinky bowels of issue #180 for our upcoming beartype 0.12.0 release, which I believed would only take a day. "What could be simpler?", I said. It's been over two weeks. Our commit history has seen better days.

You ask the pertinent questions that hit me hard in the gut. Like the entirety of PEP 695 -- "Type Parameter Syntax", which is both amazing and horrifying. Thank you so much for giving me the heads up on that, although my head is now throbbing with thunderous pain. Seriously. @beartype must support PEP 695, because everyone will immediately start using that in Python 3.12, but I don't even know if runtime type-checkers can reasonably support the entirety of PEP 695. Apparently, we're now throwing out explicit type variance (e.g., covariant and contravariant parameters to typing.TypeVar) and just requiring type-checkers to infer type variance implicitly based on an extremely non-trivial algorithm that I don't fully understand and may be too computationally expensive to even reasonably perform at runtime if I did.

This is QA madness. Wait. What were we chatbotting about again? 😮‍💨

Plugins! Right. In theory, __beartype_hint__() should be trivial for us to add. We are all now sadly chuckling into our pewter beer steins. So, my last action item for beartype 0.12.0 after finalizing #180 is supporting __beartype_hint__(). If I fail to do this by New Year's, I do hereby resolve to go outside and eat clumps of frozen grass while three wild-eyed cats watch in shock – live on GitHub. You heard it here in dismay first.

As for everything intriguing that you just asked... I swear I read everything but currently have no idea. The cat is snoring fitfully and I am pounding the keyboard uselessly. You are correct about everything, I suspect. The entire typing ecosystem is cobbled together with expired glue and wishful thinking. Like, is this even valid syntax under PEP 695?

# Seriously? We're seriously using comparison operators in
# implicit type parameter syntax? But... what does that even mean?
def DoorsOfPerception[T<:str](...): ...

Oh. I shudder as I see. Did you unearth that syntactic monstrosity from the "Rejected Ideas" section of PEP 695? Specifically, this?

We considered, but ultimately rejected, the use of a <: token like in Scala...

Darn you, PEP 695 authors! Darn you to all heck!

leycec added a commit that referenced this issue Jan 6, 2023
This commit is the first in a commit chain defining the PEP
544-compliant `beartype.plug.BeartypeHintable` protocol, en-route to
resolving feature request #192 technically submitted by myself but
spiritually submitted by super-nice NIST scientist @tbsexton (Thurston
Sexton) in his unquenchable search for a utopian DSL for typing Python
that may never exist – *but should*. Specifically, this commit:

* Defines a new public `beartype.plug` subpackage.
* Defines a new public `BeartypeHintable` protocol in that subpackage,

Naturally, everything is mostly undocumented, very untested, and
absolutely non-working. (*Conflicted conspiracy of lickable flicks!*)
@leycec
Member Author

leycec commented Jan 6, 2023

So. It begins with a whimper.

@tbsexton: Sincere apologies on failing to deeply respond to your incredibly enlivening exegesis on Pythonic type hint plugins here and elsewhere. Gaah! I can only hope that you and yours had a fantastic Christmas and New Year's – and do solemnly swear on my cat's grossly distended belly to reply to everything you wrote before resolving this feature request.

Let's do this, Team Bear That Types! 🐻 ⌨️

leycec added a commit that referenced this issue Jan 11, 2023
This commit is the next in a commit chain defining the PEP
544-compliant `beartype.plug.BeartypeHintable` protocol, en-route to
resolving feature request #192 technically submitted by myself but
spiritually submitted by super-nice NIST scientist @tbsexton (Thurston
Sexton) in his unquenchable search for a utopian DSL for typing Python
that may never exist – *but should*. Specifically, this commit
documents, unit tests, and finalizes the implementation of our recently
added `beartype.plug.BeartypeHintable` protocol. (*Ultimate ulna, mate!*)
leycec added a commit that referenced this issue Jan 11, 2023
This commit is the next in a commit chain defining the PEP
544-compliant `beartype.plug.BeartypeHintable` protocol, en-route to
resolving feature request #192 technically submitted by myself but
spiritually submitted by super-nice NIST scientist @tbsexton (Thurston
Sexton) in his unquenchable search for a utopian DSL for typing Python
that may never exist – *but should*. Specifically, this commit:

* Drafted initial (read: untested) usage of the `__beartype_hint__()`
  dunder class method in our core
  `beartype._check.conv.convreduce.reduce_hint()` function.
* Began drafting unit tests exercising that usage.
* Repairs unit tests broken by prior commits in this commit chain.
  Thankfully, they only pertained to Python 3.7 -- which nobody should
  care about in 2023. This means you!
* Unrelatedly, *very* lightly documented the read-only
  `beartype.door.TypeHint.{args,hint}` properties.

(*Beady eyes on a heady I!*)
leycec added a commit that referenced this issue Jan 12, 2023
This commit is the next in a commit chain defining the PEP
544-compliant `beartype.plug.BeartypeHintable` protocol, en-route to
resolving feature request #192 technically submitted by myself but
spiritually submitted by super-nice NIST scientist @tbsexton (Thurston
Sexton) in his unquenchable search for a utopian DSL for typing Python
that may never exist – *but should*. Specifically, this commit continues
iterating on the implementation, usage, and testing of this protocol.
Sadly, we suddenly realized in a fit of mad panic that this protocol
should *not* actually be a protocol. Why? Because doing so coerces *all*
user-defined classes subclassing this protocol into typing generics --
which is absolutely *not* intended. Consequently, the next commit in
this commit chain will:

* Relax `BeartypeHintable` into a standard abstract base class (ABC).
* Create a new private `_BeartypeHintableProtocol` for internal usage,
  mostly in detecting classes implicitly satisfying this ABC.

(*Thunderous terror or tons of nuns on tundra?*)
leycec added a commit that referenced this issue Jan 13, 2023
This commit is the next in a commit chain defining the PEP
544-compliant `beartype.plug.BeartypeHintable` protocol, en-route to
resolving feature request #192 technically submitted by myself but
spiritually submitted by super-nice NIST scientist @tbsexton (Thurston
Sexton) in his unquenchable search for a utopian DSL for typing Python
that may never exist – *but should*. Specifically, this commit relaxes
`BeartypeHintable` into a standard abstract mixin *and* substantially
revises the docstring for that mixin to reflect its new nature.
(*Ultrasonic Ultraman or Sonic?*)
leycec added a commit that referenced this issue Jan 14, 2023
This commit is the next in a commit chain defining the PEP
544-compliant `beartype.plug.BeartypeHintable` protocol, en-route to
resolving feature request #192 technically submitted by myself but
spiritually submitted by super-nice NIST scientist @tbsexton (Thurston
Sexton) in his unquenchable search for a utopian DSL for typing Python
that may never exist – *but should*. Specifically, this commit adds
support for reducing type hints via user-defined `__beartype_hint__()`
dunder methods to our existing reduction machinery. Unfortunately, doing
so induces infinite recursion (...that's bad) in the common case of a
`__beartype_hint__()` implementation returning a type hint of the form
`Annotated[cls, ...]`. Since resolving that would require non-trivial
refactoring of our current breadth-first code generation algorithm into
a depth-first code generation algorithm, we're temporarily stymied in
this commit chain. Fortunately, we will be performing this refactoring
shortly -- because we need to anyway, for various and sundry reasons.
Until then, this commit chain is sadly on hold.
(*Formidable middling form is gormless meddling!*)
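
The recursion described in this commit message can be reproduced in a few lines. The sketch below uses a toy reducer rather than beartype's reduction machinery, and its guard (remembering which classes are currently being expanded) is a swapped-in technique for illustration only, not the breadth-first to depth-first refactoring mentioned above. All names are hypothetical.

```python
from typing import Annotated, get_args, get_origin

class MuhClass:
    @classmethod
    def __beartype_hint__(cls):
        # The common case: the hook's hint mentions the class itself.
        return Annotated[cls, 'muh validator']

def reduce_naively(hint):
    '''Expands hooks wherever they appear, and therefore never terminates.'''
    if isinstance(hint, type) and hasattr(hint, '__beartype_hint__'):
        return reduce_naively(hint.__beartype_hint__())
    if get_origin(hint) is Annotated:
        base, *metadata = get_args(hint)
        return Annotated[(reduce_naively(base), *metadata)]
    return hint

def reduce_guarded(hint, _active=frozenset()):
    '''Same walk, but each class is expanded at most once per branch.'''
    if isinstance(hint, type):
        hook = getattr(hint, '__beartype_hint__', None)
        if hook is None or hint in _active:
            return hint
        return reduce_guarded(hook(), _active | {hint})
    if get_origin(hint) is Annotated:
        base, *metadata = get_args(hint)
        return Annotated[(reduce_guarded(base, _active), *metadata)]
    return hint

try:
    reduce_naively(MuhClass)
    naive_recursed = False
except RecursionError:
    naive_recursed = True

reduced = reduce_guarded(MuhClass)
```

The naive version blows the stack because reducing MuhClass yields Annotated[MuhClass, ...], whose first argument is MuhClass again; the guarded version stops at that inner occurrence.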
leycec added a commit that referenced this issue Jan 14, 2023
This commit is the next in a commit chain defining the PEP
544-compliant `beartype.plug.BeartypeHintable` protocol, en-route to
resolving feature request #192 technically submitted by myself but
spiritually submitted by super-nice NIST scientist @tbsexton (Thurston
Sexton) in his unquenchable search for a utopian DSL for typing Python
that may never exist – *but should*. Specifically, this commit documents
additional work in this commit chain needed to detect and squelch
unwanted recursion in our existing support for reducing type hints via
user-defined `__beartype_hint__()` dunder methods. (*Floppy lop-sided hides!*)
leycec added a commit that referenced this issue Jan 16, 2023
This commit is the next in a commit chain defining the PEP
544-compliant `beartype.plug.BeartypeHintable` protocol, en-route to
resolving feature request #192 technically submitted by myself but
spiritually submitted by super-nice NIST scientist @tbsexton (Thurston
Sexton) in his unquenchable search for a utopian DSL for typing Python
that may never exist – *but should*. Specifically, this commit documents
additional work in this commit chain needed to detect and squelch
unwanted recursion in our existing support for reducing type hints via
user-defined `__beartype_hint__()` dunder methods. While we *could*
perform this work in the `beartype 0.12.0` release cycle, doing so seems
increasingly unproductive. Let's just ship this monster already!
(*Mewling pews strewn with straw news!*)
@rtbs-dev

rtbs-dev commented Mar 1, 2023

Also, I think a better way for me to ask all that junk up there is:

does this API provide a mechanism to design beartype-native parametric types?

See, for instance, the way Plum enables Parametric Types. Since this isn't something strictly related to multiple dispatch, this plugin api might let @wesselb keep the parametric types as a separate micro-repository beartype plugin, or let users get them for free via @beartyped classes, etc.

The other place I've seen parametric types (recently) is Pandera's dataframe validators, since the CategoricalDtype constructors take user parameters (the valid categories, orders, etc).

I've been using Beartype validators to design these rather than heavier pandera schemas, but to allow better dispatching I'd love to know at runtime, say, whether the DoorsOfPerception('hell','gluttony') comes back as having a typehint DoorsOfPerception[L['hell']] or DoorsOfPerception[L['heaven']] (or equivalent enum, etc.)
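
To pin down what I mean by "comes back as having a typehint", here is a rough stdlib-only sketch. It is not how Plum or beartype do it: __class_getitem__() manufactures one cached subclass per parameter, and parametric_type_of() (a hypothetical helper) maps an instance back to that subclass so a dispatcher could key on it. For brevity the parameter is the bare string rather than Literal[...].

```python
from dataclasses import dataclass
from typing import Optional

# Cache so that DoorsOfPerception['hell'] is always the same class object.
_PARAMETRIC_CACHE: dict = {}

@dataclass
class DoorsOfPerception:
    heaven_or_hell: str
    circle_of_hell: Optional[str] = None

    def __class_getitem__(cls, place: str) -> type:
        key = (cls, place)
        if key not in _PARAMETRIC_CACHE:
            # Build a subclass named like the subscription, e.g.
            # "DoorsOfPerception['hell']", remembering its parameter.
            _PARAMETRIC_CACHE[key] = type(
                f'{cls.__name__}[{place!r}]', (cls,), {'place': place})
        return _PARAMETRIC_CACHE[key]

def parametric_type_of(door: DoorsOfPerception) -> type:
    '''Hypothetical: the parametric subclass matching this instance's data.'''
    return DoorsOfPerception[door.heaven_or_hell]

door = DoorsOfPerception('hell', 'gluttony')
door_type = parametric_type_of(door)
```

Note that door itself is still an instance of the unparameterized class; only parametric_type_of(door) reports DoorsOfPerception['hell'], which is exactly the gap a plugin API would need to close.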

(N.B. Also I realized where I got that <: syntax from...and it seems my coconut is showing 😅 sorry!)
