
Popular value constraint #32

Closed
Saphyel opened this issue Mar 14, 2021 · 11 comments

Comments

@Saphyel

Saphyel commented Mar 14, 2021

Hello! I'm creating this issue to see how people feel about choosing the most popular constraints, so we can agree which ones are more important/urgent. Feel free to comment and improve the list if you think I'm missing something.
I'm gonna create some categories as well for visibility.

String Constraints

  • Email
  • Uuid
  • Choice
  • Language
  • Locale
  • Country
  • Currency

Comparison Constraints

  • EqualTo
  • NotEqualTo
  • IdenticalTo
  • NotIdenticalTo
  • LessThan
  • GreaterThan
  • Range
  • DivisibleBy

The implementation I think should be something like:

    passports: Annotated[List[str], Country[alpha3=True]],
    age: Annotated[date, Range["1901-12-12", "2001-12-12"]],
    pets: Annotated[int, GreaterThan[-1]],
    bio: Annotated[Optional[str], Range[50, 500]],
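For context, PEP 593 already makes metadata like this introspectable at runtime, which is what makes such proposals feasible. A minimal sketch (the `GreaterThan` class here is purely hypothetical, not part of any existing package):

```python
from typing import Annotated, get_args, get_type_hints

# Hypothetical constraint object carried as PEP 593 metadata.
class GreaterThan:
    def __init__(self, bound: int) -> None:
        self.bound = bound

def count_pets(pets: Annotated[int, GreaterThan(-1)]) -> int:
    return pets

# Retrieve hints *with* their PEP 593 metadata preserved.
hints = get_type_hints(count_pets, include_extras=True)
base_type, *metadata = get_args(hints['pets'])
print(base_type)          # <class 'int'>
print(metadata[0].bound)  # -1
```

Any runtime checker can then pull the constraint object back out of the hint and decide what to do with it.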
@leycec
Member

leycec commented Mar 15, 2021

Thanks so much for the detailed theory-crafting, Carlos! Indeed, I love it. This means it will happen.

One practical alternative to adopting, embracing, and extending someone else's third-party constraints package would be to develop our own in-house beartype solution. I'm leaning "our way or the highway," because there don't actually appear to be any PEP-compliant third-party constraints packages. All existing constraints packages precede PEP 593 -- Flexible function and variable annotations, which means none of them were designed to embed in typing.Annotated type hints.

In either case, they're all bad news for certain definitions of "bad news" and just waste everyone's time. Instead...

Introducing Bearnote

Since we don't trust anyone else to get this right, let's just do this ourselves.

Specifically, let's publish a new beartype/{insert_punny_package_name_here} package under the @beartype organization, where prospective names for {insert_punny_package_name_here} might be:

  • bearnote!
  • bearspray!
  • bearspittle!

Very well, I have no idea. Something prefixed by bear. Beyond that, I'm all furry ears. And here's how we design that package:

  • In a public bearnote.abc submodule, we declare a BearNoteConstraintABC abstract base class (ABC) resembling:
from abc import (
    ABCMeta as _ABCMeta,
    abstractmethod as _abstractmethod,
)
from types import GenericAlias as _GenericAlias
from typing import Any as _Any

class BearNoteConstraintABC(object, metaclass=_ABCMeta):
    '''
    Abstract base class (ABC) of all :mod:`bearnote` 
    value constraint subclasses suitable for use in
    `PEP 593`_-compliant type hints.

    Instances of this ABC encapsulate a specific constraint on
    arbitrary values of a specific type (e.g., minimum and/or
    maximum character length for strings, minimum and/or
    maximum value for numbers).

    .. _PEP 593:
       https://www.python.org/dev/peps/pep-0593
    '''

    #FIXME: Unsure whether this works as intended. Let's find out!
    @classmethod
    @_abstractmethod
    def __class_getitem__(cls, *args) -> _GenericAlias:
        '''
        `PEP 560`_-compliant class method returning a generic alias
        of this subclass subscripted by the passed arguments.
        '''

        pass


    @_abstractmethod
    def is_valid(self, obj: _Any) -> bool:
        '''
        ``True`` only if the passed arbitrary object satisfies this constraint.
        '''

        pass
  • In a public bearnote.text submodule, we declare one BearNoteConstraintABC subclass for each string-specific value constraint. For example, a trivial constraint on string length might resemble:
from bearnote.abc import BearNoteConstraintABC as _BearNoteConstraintABC
from bearnote.roar import BearNoteException as _BearNoteException
from beartype import beartype as _beartype
from types import GenericAlias as _GenericAlias
from typing import Any as _Any

class StrLenMax(_BearNoteConstraintABC):
    '''
    :mod:`bearnote` **maximum string length constraint** (i.e., matching
    only strings no longer than a user-specified maximum character
    length), suitable for use in `PEP 593`_-compliant type hints.

    .. _PEP 593:
       https://www.python.org/dev/peps/pep-0593
    '''

    #FIXME: Obviously, this needs work. What you gonna do?
    @_beartype
    def __class_getitem__(cls, max_len: int) -> _GenericAlias:
        if max_len < 0:
            raise _BearNoteException(f'Maximum string length {max_len} < 0.')

        return _GenericAlias(StrLenMax, (max_len,))

    #FIXME: Obviously, this needs work too. What you still gonna do?
    def is_valid(self, obj: _Any) -> bool:
        return isinstance(obj, str) and len(obj) <= self.__args__[0]

End users then annotate @beartype-decorated callables with instances of those subclasses like so:

from bearnote.text import StrLenMax
from beartype import beartype
from typing import Annotated

@beartype
def strip_text(text: Annotated[str, StrLenMax[5]]) -> (
    Annotated[str, StrLenMax[3]]):
    return text[2:-1] if len(text) >= 3 else text

beartype then checks the arguments of PEP 593-compliant typing.Annotated type hints to decide whether any of them are bearnote-specific value constraints with a simple O(1) test resembling:

from bearnote.abc import BearNoteConstraintABC 
from types import GenericAlias
...

# If this "typing.Annotated" argument is a "bearnote" value constraint...
if (
    isinstance(annotated_arg, GenericAlias) and
    issubclass(annotated_arg.__origin__, BearNoteConstraintABC)
):
    #FIXME: Obviously, this is simply pseudo-code. Don't crucify me!
    # Then generate code calling this constraint's is_valid() method.
    return 'annotated_arg.__origin__.is_valid(obj)'

That's the gist, anyway.

@harens: I know you're absolutely full-up with being an over-achieving Londoner of studious success, but I dimly recall you being interested in Python data validation. Is that something you'd still be interested in? If so, I'd be delighted to have you either co-lead or just casually contribute to a project like the one outlined above.

Of course, a non-committal shrug is also an acceptable and expected response. 😜

@harens
Contributor

harens commented Mar 16, 2021

I dimly recall you being interested in Python data validation.

Those dodgy nootropics seem to be doing you well @leycec! 💊 Funnily enough, that project was just part of a school course back when I was learning Python...so I wouldn't really call myself a data validation connoisseur. Having said that, I'd still love to help where I can. 👍

Introducing Bearnote

This looks great! A mypy-compliant constraints package would certainly be amazing, and a sprinkle of bear puns makes it even better :) 🐻 It's definitely something that I would love to use.

Specifically, let's publish a new beartype/{insert_punny_package_name_here} package under the @beartype organization

If this is going to be a separate project (which it seems like it will be), I have a few suggestions from our experience with the beartype project which you might find interesting.

  • Start building the docs from the beginning. It doesn't have to be amazing, but I think just breaking down the content into different files should hopefully make things easier in the long run.
  • mypy as we go. Getting ~400 mypy errors just before a release is never fun (thanks again for fixing PEP 561 compliance #25). It might be easier and simpler if, similar to beartype, zero mypy errors are required for tests to pass.
  • Hypermodern Python setup. 🎸 This one's less necessary, but there are loads of hypermodern python templates/guides online which you might find fascinating. In particular, definitely check out poetry if you haven't already. Both @Saphyel and I use it! Having said that, there might also be benefits to sticking with the current layout since then a lot of things can be copied across easily. I'll leave that one up to you. 😉

Either way, whatever happens, I'll still be here to package the project when you need me. 📦 Thanks for all your work on the beartype project @leycec! You've done an amazing job.

Also hi 👋 @Saphyel. It's always great to see a fellow Londoner! 🇬🇧 I hope the chilly nightfall is treating you well.

@Heliotrop3
Contributor

Note that beartype will need to perform some value checking in order to incorporate PEP 586 -- Literal Type.

However, that PEP then states that adding the semantics for value checking as outlined above is currently outside the scope of the Literal type hint. Paraphrasing:

A full-fledged dependent type system that lets users predicate types based on their values in arbitrary ways, while certainly useful, is out of scope for PEP 586. Such a type system would require substantially more work with respect to implementation, discussion, and research than this PEP provides
....
This PEP should be seen as a stepping stone towards this goal, rather than an attempt at providing a comprehensive solution.

Admittedly, I'm not crystal clear on the interaction between PEP 593 -- Flexible Function and Variable Annotations and PEP 586 -- Literal Type. It seems, however, that the Literal type hint should be included in this conversation about data validation.

@leycec
Member

leycec commented Mar 18, 2021

@Heliotrop3 with the deep take, as always.

PEP 586 -- Literal Type is indeed a crude form of value constraint – the crudest! typing.Literal is literally (see wat i did there) just the == object equality operator encapsulated as a type hint: e.g.,

>>> CREEPY = 'I have a special plan for this world.'
>>> NOT_CREEPY = 'In her eyes tonight, there’s a glow tonight.'

# This is how normal code checks object equality.
>>> CREEPY == 'I have a special plan for this world.'
True
>>> NOT_CREEPY == 'I have a special plan for this world.'
False

# ...but this is how type checkers check object equality!
>>> @beartype
... def ligotti(creepy: Literal[
...     'I have a special plan for this world.',
...     'Imagine, he said, all the flesh that is eaten.',
...     'Now take away that flesh, he said.',
... ]) -> str:
...     return creepy[:16]

>>> ligotti(CREEPY)
'I have a special'
>>> ligotti(NOT_CREEPY)
beartype.roar.BeartypeCallHintPepParamException: @beartyped ligotti()
parameter creepy='In her eyes tonight, there’s a glow tonight.' violates type
hint typing.Literal[...], as value 'In her eyes tonight, there’s a glow tonight.'
not 'I have a special plan for this world.', 'Imagine, he said, all the flesh that
is eaten.', or 'Now take away that flesh, he said.'.

The takeaway is that PEP 586 doesn't really do much for us here, because PEP 586 only talks about object equality. But we're talking about every sort of object comparison here like integer comparison and regex-based pattern matching and yadda-yadda.
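For the curious, replicating PEP 586's == semantics at runtime takes only a few lines with typing.get_args. A hedged sketch, not beartype's actual implementation:

```python
from typing import Literal, get_args

def matches_literal(value, literal_hint) -> bool:
    """Return True only if ``value`` equals one of the values subscripting
    the passed ``Literal[...]`` hint. PEP 586 also mandates a type check,
    since e.g. ``True == 1`` would otherwise match ``Literal[1]``."""
    return any(
        value == allowed and type(value) is type(allowed)
        for allowed in get_args(literal_hint)
    )

Creepy = Literal['I have a special plan for this world.']
print(matches_literal('I have a special plan for this world.', Creepy))  # True
print(matches_literal('In her eyes tonight...', Creepy))                 # False
```

Which is to say: the entire PEP fits in one `any()` expression. That's how little it buys us.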

Object equality is neat and all... but how often in real-world code have you needed to type-check a callable parameter or return to be strictly equal to only one of n specific values? Those who like playing devil's advocate may now be thinking: "All the time, bro. All the friggin' time." If this is you, read on, because we have bad news.

PEP 586: The PEP That Barely Does Anything and Does It Badly

@beartype technically doesn't support PEP 586 yet – but it wouldn't take much to get us there. We mostly just lack motivation, because PEP 586 is mostly useless.

Why? Because PEP 586 only supports five possible types excluding None, which PEP 484 already explicitly supported five years ago:

Literal may be parameterized with literal ints, byte and unicode strings, bools, Enum values and None.

That's it. Like, who even wrote that specification? With constraints that narrow, why'd they even bother? We can't even type-check objects against complex numbers, containers, or instances of user-defined classes with that! That makes PEP 586 frustratingly inapplicable for 99.9999% of use cases.

Even using that to type-check equality against Enum members – which is really the only valid real-world use case here – violates DRY by requiring manual relisting of all Enum members (e.g., Literal[ShapeEnum.square, ShapeEnum.circle, ShapeEnum.ellipse, ShapeEnum.triangle]).

That pains me somewhere sensitive deep inside. What happens when you add a new ShapeEnum.dodecahedron member but forget to append that member to every Literal type hint enumerating ShapeEnum littered throughout your million-line codebase?

Bad stuff, Tyler. Bad stuff happens.

We still intend to support PEP 586, because it's an annotation PEP. That's what we do here. But it's the least useful annotation PEP yet, which means it's dead last on our TODO: list.

Y U do this 2 us, Guido? 😞

@leycec
Member

leycec commented Mar 18, 2021

Let's chat third-party validation packages. There are more than a few. Here at @beartype, we aim to please you while pleasing ourselves. We'd thus be happy to support all reasonably popular well-maintained validation packages with sane APIs, where "sane APIs" is defined here as packages that:

  • Provide a fast mechanism for detecting package-specific validation objects. We need to be able to detect these objects when listed as typing.Annotated arguments and we need that detection to be fast. This usually isn't a problem, because a sane API should root its object hierarchy at a public abstract base class (ABC). Given that class, we can trivially detect all validation objects produced by that package via issubclass(type_hint, ThirdPartyPackageABC). Next!
  • Provide a quasi-fast mechanism for validating arbitrary objects against those package-specific validation objects. This is probably the sticking point. We're not necessarily mandating @beartype-style O(1) behaviour here, but it would be nice if the package in question took pains to avoid unsafe O(n) behaviour. Even if it doesn't, that could still be okay. The inefficiency of other packages isn't necessarily our concern, right? What we do need here is for that package to make it really easy to validate arbitrary objects against those package-specific validation objects. "Really easy" means that validation objects like somepackage.GreaterThan(2):
    • Must define a tester method that returns booleans rather than raising exceptions when passed invalid objects. This tester should be defined as an abstract method by the aforementioned ABC, since that lets us treat validation objects generically.
    • May optionally also define a validation method that raises exceptions rather than returning booleans when passed invalid objects. This is optional, because we can raise exceptions ourselves; we don't need a specific method for that, although having a specific method for that would help us raise human-readable exceptions.

Minimal-length example or it didn't happen, so consider a third-party validation package somepackage with a sane API resembling:

from abc import ABCMeta, abstractmethod
from typing import Any

class ValidationABC(object, metaclass=ABCMeta):
    @abstractmethod
    def is_valid(self, obj: Any) -> bool: pass

class GreaterThan(ValidationABC):
    def __init__(self, number: int) -> None:
        self._number = number
    def is_valid(self, obj: Any) -> bool:
        return isinstance(obj, int) and obj > self._number
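Given that hypothetical API, beartype's side of the contract reduces to detecting instances of the ABC among the Annotated arguments and calling their tester. A sketch under those assumptions (classes redeclared so this snippet runs standalone):

```python
from abc import ABCMeta, abstractmethod
from typing import Annotated, Any, get_args

# Redeclared from the hypothetical "somepackage" API sketched above.
class ValidationABC(object, metaclass=ABCMeta):
    @abstractmethod
    def is_valid(self, obj: Any) -> bool: ...

class GreaterThan(ValidationABC):
    def __init__(self, number: int) -> None:
        self._number = number
    def is_valid(self, obj: Any) -> bool:
        return isinstance(obj, int) and obj > self._number

def check_annotated(hint, obj) -> bool:
    """Validate ``obj`` against every ABC-rooted constraint in ``hint``,
    a ``typing.Annotated[...]`` hint (simplified: assumes the first
    argument is a plain class)."""
    base, *metadata = get_args(hint)
    return isinstance(obj, base) and all(
        constraint.is_valid(obj)
        for constraint in metadata
        # Fast detection: one isinstance() check against the public ABC.
        if isinstance(constraint, ValidationABC)
    )

ok = check_annotated(Annotated[int, GreaterThan(2)], 5)
bad = check_annotated(Annotated[int, GreaterThan(2)], 1)
print(ok, bad)  # True False
```

That's the whole integration story: one isinstance() check to detect, one method call to validate.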

That's it. We have no idea whether the following packages satisfy those requirements, because we are lazy. Nonetheless, popular well-maintained validation packages include (in no particular order):

That... might be it.

Note that most third-party validation packages like Cerberus and Colander are obsessed with schemas and data exchange formats (e.g., JSON, YAML). Those are all sadly irrelevant and useless for our purposes. We need something general-purpose, unstructured, and fast – so, not those. Those are all slow behemoths from a bygone age when web devs didn't have Django or Panel. </sigh>

@leycec
Member

leycec commented Mar 30, 2021

...back from the GitHub gutter, it's that balding @leycec guy! I've just finalized Python 3.10 support in 89bb8d3 and am now dedicating the next several years months weeks days to this. Provisional support for data validation will land in the next stable release, which means beartype 0.7.0 by this Friday, because if I spend any longer on this my wife will seriously beat me up.

Here's how you'll use it:

from beartype import beartype
from beartype.constraint import Constraint
from typing import Annotated

@beartype
def validate_text(text: Annotated[str, Constraint[
    lambda text: 4 <= len(text) <= 14]]):
    ...

The validate_text() function defined above validates the passed value to be a string with length in the range [4, 14]. @beartype will do all that for you. All you do is supply the arbitrary user-defined constraint. I believe in you!

Of course, that's a bit unwieldy when copy-and-pasted across an entire codebase. Instead, everyone wants to define commonly used constraints as PEP 484-compliant type aliases: e.g.,

from beartype import beartype
from beartype.constraint import Constraint
from typing import Annotated

LengthyString = Annotated[str, Constraint[lambda data: 4 <= len(data) <= 14]]
'''
PEP-compliant type hint validating the passed or returned value to be
a string with length in the range ``[4, 14]``.
'''

@beartype
def munge_text(text: LengthyString): ...

@beartype
def plunge_text(text: LengthyString): ...

Everything above is PEP-compliant. That means static type checkers and smarty-pants Python IDEs like PyCharm will implicitly support all of that. That's good.

More importantly, users can define arbitrarily complex constraints satisfying their own stack-specific needs. That's even better. You don't need to wait for me or someone else who resembles me (so, my evil doppelgänger) to write those constraints for you. Instead, you do it and I'll unconditionally support it, whatever it is, no matter what.

Given that basic support for data validation, we can then gradually build out more involved support for specific types of data validation in the beartype.constraint submodule – all reusing the same core beartype.constraint.Constraint API. Higher-level constraints might resemble:

  • beartype.constraint.RegexConstraint, constraining the passed or returned string to match a compiled regex.
  • beartype.constraint.NumberConstraint, constraining the passed or returned integer, floating-point, or complex number to match a numeric operation (e.g., greater-than, less-than).
  • beartype.constraint.CountryCodeConstraint, constraining the passed or returned string to match an ISO 3166-compliant country code (probably internally implemented with beartype.constraint.RegexConstraint and A Really Big Or Ugly Regex (ARBOUR)).
  • ...and so on and so forth, hand-waving all the ugly details away.
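As a taste of what one of those higher-level constraints might look like, here's a hedged sketch of a RegexConstraint-style validator built on re.fullmatch (none of these names are real beartype API):

```python
import re
from typing import Any

class RegexConstraint:
    """Hypothetical constraint matching strings against a compiled regex."""
    def __init__(self, pattern: str) -> None:
        self._regex = re.compile(pattern)
    def is_valid(self, obj: Any) -> bool:
        return isinstance(obj, str) and self._regex.fullmatch(obj) is not None

# An ISO 3166-1 alpha-2-shaped country code check (two uppercase ASCII
# letters -- a shape check only, not validation against the real ISO list).
CountryCodeAlpha2 = RegexConstraint(r'[A-Z]{2}')
print(CountryCodeAlpha2.is_valid('GB'))   # True
print(CountryCodeAlpha2.is_valid('gbr'))  # False
```

A real CountryCodeConstraint would presumably match against the actual ISO code list rather than a bare regex, but the shape is the same.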

None of that except the core beartype.constraint.Constraint API will land in beartype 0.7.0, because time is slipping like a greased pig through my sausage fingers. I'd still love to hear what everyone thinks about that. If you think that sucks, please don't tell because I'm now emotionally invested.

@harens: Also, I can't believe I missed your enthusiastic reply suffused with venerable wisdom. I now feel bad. You are correct about everything. You often are. Thankfully, I have realized that nobody wants me to author 1,001 packages with cute and cuddly names like bearcat, bearspray, bearable, and unbearable. I won't be doing that. Instead, everyone just wants me to make beartype usable and passably documented. I'll try to be doing that instead. 📝

leycec added a commit that referenced this issue Apr 3, 2021
This commit is the first in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit defines a new well-tested private
`beartype._util.func.utilfuncmake.copy_func_shallow()` utility function
shallowly copying pure-Python callables in a robust and efficient manner
*and* a new untested and currently mostly empty public `beartype.must`
subpackage exposing the public API for this validation. (*Circumstantial happenstance!*)
@leycec
Member

leycec commented Apr 3, 2021

It's happening, bearmongers. Commit 036075b begins The Work That Transforms Beartype into a Useable Work Product for People.

beartype.constraint.Constraint seemed overly verbose and anti-fun. Instead, we're now aiming for a terse and pro-fun data validation API situated at either beartype.must.Must or beartype.note.Note. I'm leaning towards the former, because beartype must increase your data consistency.

It looks like:

from beartype import beartype
from beartype.must import Must
from typing import Annotated

@beartype
def get_text_middle(text: Annotated[str, Must[
    lambda text: 4 <= len(text) <= 14]]):
    '''
    Return the substring spanning characters ``[7, 9]`` inclusive
    from the passed string required to have a length in the range
    ``[4, 14]`` inclusive.
    '''

    # "This is guaranteed to work," says beartype.
    return text[7:10]

Tragically, the IRS insists I must file expat taxes... or else. I think they intend to commit depraved and unmentionable acts of perfidy on our cats. To save our cats, I'll be distracted next week with filling out an endless litany of Cold War-era Linux-incompatible bureaucracy that could really have used a reboot three decades ago.

Do not be alarmed if I commit nothing for a week. That's just me trading off my remaining sanity, patience, and hair for my cats.

leycec added a commit that referenced this issue Apr 3, 2021
This commit is the first in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit documents the new `beartype.must.Must` class -- complete with
usage instructions and a working example. (*Hypothetical hypochondriacs!*)
leycec added a commit that referenced this issue Apr 8, 2021
This commit is the first in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit renames the prior `beartype.must` subpackage to `beartype.vale`
and the prior `beartype.must.Must` class to `beartype.vale.Is`,
significantly improves documentation across this class, and begins
implementing the core `beartype.vale.Is.__class_getitem__()` dunder
class method. (*Arbitrary arbitration!*)
leycec added a commit that referenced this issue Apr 9, 2021
This commit is the first in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit significantly improves documentation across the `beartype.vale.Is` class
and defines a new private
`beartype._util.func.utilfuncarg.get_func_args_len_standard()` getter
introspecting the number of standard arguments accepted by the passed
callable, internally called by the core
`beartype.vale.Is.__class_getitem__()` dunder class method. (*Prestidigitation's predestination!*)
leycec added a commit that referenced this issue Apr 10, 2021
This commit is the next in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit (yet again) significantly improves documentation across the
`beartype.vale.Is` class, defines a new public
`beartype.vale.AnnotatedIs` class instantiated by the core
`beartype.vale.Is.__class_getitem__()` dunder class method, defines a
new private `beartype.vale._valeiscore.is_hint_pep593_beartype()` tester
detecting beartype-specific annotated type hints, and exhaustively
exercises *all* `beartype.vale._valeiscore` attributes with unit tests.
(*Insipid agility's tepid fragility!*)
leycec added a commit that referenced this issue Apr 14, 2021
This commit is the next in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit enables the new public `beartype.vale.AnnotatedIs` class
instantiated by the core `beartype.vale.Is.__class_getitem__()` dunder
class method to optionally avoid additional stack frames by generating
executable code and code locals and exhaustively exercises this
functionality with unit tests. (*Pallid limpets primped in pomposity!*)
leycec added a commit that referenced this issue Apr 16, 2021
This commit is the next in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit type-checks the `beartype.vale.AnnotatedIs` class instantiated by
the `beartype.vale.Is.__class_getitem__()` dunder class method, but has
yet to generate human-readable exceptions on type-checking failures.
(*Extraneous extemporaneousness!*)
leycec added a commit that referenced this issue Apr 17, 2021
This commit is the next in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit implements a variety of low-level utility functions required to
generate human-readable exceptions on violations of data validators
supplied as callables, which curiously is substantially more difficult
than simply validating that data. (*Munificent munitions!*)
leycec added a commit that referenced this issue Apr 19, 2021
This commit is the next in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit implements a draft private
beartype._util.func.utilfuncorigin.get_func_lambda_origin_code_or_none()
function introspecting the exact code substring declaring an arbitrary
lambda function as well as superficial support for a new
`beartype.vale.SubscriptedIs._get_repr()` static method enabling dynamic
generation of machine-readable object representations for arbitrarily
nested and complex "Is[...]" subscriptions, required to generate
human-readable exceptions on violations of data validators. Unrelatedly,
this commit also dramatically improves the *See Also* section of our
front-facing `README.rst` documentation with a comparative review of all
known runtime type checkers and new *Runtime Data Validators* subsection
enumerating all known runtime data validator (e.g., contract) packages.
(*Enigmatic intervals!*)
leycec added a commit that referenced this issue Apr 20, 2021
This commit is the next in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, it's so
late and it's raining and I no longer have a clear grip on what exactly
was done here but I'm fairly sure it was impressive.

## Features Optimized

* **Non-builtin types.** `@beartype` now checks non-builtin types
  optimally by avoiding an extraneous dictionary lookup in the
  beartypistry singleton previously required to check those types.

(*Outlandish outlanders!*)
leycec added a commit that referenced this issue Apr 22, 2021
This commit is the next in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit implements support for generating deferred memoized
machine-readable representations via the repr() builtin when passed
arbitrarily nested and complex "Is[...]" subscriptions, used by
@beartype to generate human-readable exception messages when a parameter
or return violates a data validator. (*Dissipative reparations!*)
leycec added a commit that referenced this issue Apr 23, 2021
This commit is the next in a commit chain adding support for arbitrary
caller-defined data validation en-route to resolving issue #32, enabling
callers to validate the internal structure of arbitrarily complex
scalars, data structures, and third-party objects. Specifically, this
commit substantially improves the robustness of getter functions defined
by the private `beartype._util.func.utilfunccode`, which internally call
surprisingly fragile `ast`, `inspect`, and `tokenize` functions that
have a bad habit of raising fatal non-human-readable exceptions in
common edge cases. Frankly, those functions are sufficiently buggy that
I have doubts whether anyone actually tested them to any reasonable
degree. (*Destructive derivatives!*)
leycec added a commit that referenced this issue Apr 24, 2021
This commit is the maybe second-to-last in a commit chain adding support
for arbitrary caller-defined data validation en-route to resolving the
issue that is #32, enabling callers to validate the internal structure
of arbitrarily complex scalars, data structures, and third-party
objects. Specifically, this commit finalizes the implementation of both
the core "beartype.vale.Is" class *and* code generated to validate type
hints annotated by subscriptions of that class in `@beartype`-decorated
callables as well as unit tests exercising these facilities. The next
commit will finalize unit tests exercising all edge cases and ideally be
the last commit in this commit chain – finalizing the world's first
PEP-compliant data validation API. (*Almost almighty!*)
@leycec
Member

leycec commented Apr 24, 2021

OMFG. People, the world's first PEP-friendly data validation framework that is also the world's fastest PEP-friendly data validation framework is happening. We will:

  • Finalize the API by tomorrow after dangerously high caffeine consumption and the agonized head-clutching that follows.
  • Release beartype 0.8.0 publishing both that API and Python 3.10 support on Sunday, which hereafter will be referred to as D.V.S. (Data Validation Sunday).

Final sneak preview for the crickets chirping dolefully in the audience: 🦗 🦗 🦗

from beartype import beartype
from beartype.vale import Is
from typing import Annotated

@beartype
def get_text_middle(text: Annotated[str, Is[
    lambda text: 4 <= len(text) <= 14]]):
    '''
    Return the substring spanning characters ``[7, 9]`` inclusive
    from the passed string required to have a length in the range
    ``[4, 14]`` inclusive.
    '''

    # "This is guaranteed to work," says beartype.
    return text[7:10]

</heavy_breathing>

leycec added a commit that referenced this issue Apr 25, 2021
This commit is the last in a commit chain adding support for arbitrary
caller-defined data validation, resolving issue #32. Specifically,
this commit finalizes unit tests exercising all edge cases associated
with this functionality – finalizing the world's first PEP-compliant
data validation API.

## Issues Resolved

* **Data validation.** `@beartype` now supports arbitrary caller-defined
  data validators enabling callers to efficiently validate the internal
  structure of arbitrarily complex scalars, data structures, and
  third-party objects. Specifically, `@beartype`-decorated callables may
  now be annotated by type hints of the form `typing.Annotated[{cls},
  beartype.vale.Is[lambda obj: {test_expr1}], ...,
  beartype.vale.Is[lambda obj: {test_exprN}]]`, where:
  * `{cls}` is any arbitrary class (e.g., `str`, `numpy.ndarray`).
  * `{test_expr1}` is any arbitrary expression evaluating to a boolean
    (e.g., `len(obj) <= 80`, `obj.dtype == np.dtype(np.float64)`).
  * `{test_exprN}` is any arbitrary expression evaluating to a boolean.
  `beartype.vale.Is` may also be subscripted (indexed) by non-lambda
  callables with similar signatures. For convenience, `beartype.vale.Is`
  objects support a rich domain-specific language (DSL) enabling new
  data validators to be synthesized from existing data validators using
  only standard operators:
  * **Negation** with `~beartype.vale.Is[lambda obj: {test_expr}]`,
    equivalent to
    `beartype.vale.Is[lambda obj: not {test_expr}]`.
  * **And-ing** with `beartype.vale.Is[lambda obj: {test_expr1}] &
    beartype.vale.Is[lambda obj: {test_expr2}]`, equivalent to
    `beartype.vale.Is[lambda obj: {test_expr1} and {test_expr2}]`.
  * **Or-ing** with `beartype.vale.Is[lambda obj: {test_expr1}] |
    beartype.vale.Is[lambda obj: {test_expr2}]`, equivalent to
    `beartype.vale.Is[lambda obj: {test_expr1} or {test_expr2}]`.
  This syntax fully complies with [PEP
  593](https://www.python.org/dev/peps/pep-0593) and thus requires
  Python ≥ 3.9. See `help(beartype.vale.Is)` for full usage
  instructions, complete with real-world examples. This resolves issue
  #32, kindly submitted by fashionable top hat-wearing London cat
  #seductress @Saphyel (Carlos Jimenez).
* **Byte strings in errors.** `@beartype` now correctly displays byte
  string values in exception and warning messages. It's amazing! Believe
  what you have never believed before.

(*Unseemly briars sired by seamstress tresses!*)
@leycec
Member

leycec commented Apr 25, 2021

Boom-shaka! 🔥 💥 🤯

@beartype now supports arbitrary caller-defined data validators. Thus ends a year-long journey culminating in the complete loss of all hair from my head, which we take a sober moment to mourn.

Everyone may now efficiently validate the internal structure of arbitrarily complex scalars, data structures, and third-party objects with PEP-compliant type hints that preserve everything you secretly love about fat bears, luscious berries, hot Spring weather, and valid app data.

Syntax

@beartype-decorated callables may now be annotated by type hints of the form typing.Annotated[{cls}, beartype.vale.Is[lambda obj: {test_expr1}], ..., beartype.vale.Is[lambda obj: {test_exprN}]], where:

  • {cls} is any arbitrary class (e.g., str, numpy.ndarray).
  • {test_expr1} is any arbitrary expression evaluating to a boolean (e.g., len(obj) <= 80, obj.dtype == np.dtype(np.float64)).
  • {test_exprN} is any arbitrary expression evaluating to a boolean, too.

Syntax: it's no sin and we don't charge tax.

Example 1: Make It So, Ensign NumPy!

Computational geometry example or it didn't happen, so let's validate a passed object as a two-dimensional NumPy array of floats of arbitrary precision:

from beartype import beartype
from beartype.vale import Is
from numpy import floating, issubdtype, ndarray
from typing import Annotated
import numpy as np

Numpy2DFloatArray = Annotated[ndarray, Is[
    lambda array: array.ndim == 2 and issubdtype(array.dtype, floating)]]
'''
Beartype-specific data validator matching only parameter and return values that
are two-dimensional NumPy arrays of floats of arbitrary precision.
'''

@beartype
def polygon_area(polygon: Numpy2DFloatArray) -> float:
    '''
    Area of a two-dimensional polygon of floats defined as a set of
    counter-clockwise points, calculated via Green's theorem à la a planimeter.

    *Don't ask.*
    '''

    # Calculate and return the desired area. Just pretend we understand this.
    polygon_rolled = np.roll(polygon, -1, axis=0)
    return np.abs(0.5*np.sum(
        polygon[:,0]*polygon_rolled[:,1] - polygon_rolled[:,0]*polygon[:,1]))

DSL: It's Not Just a Telecom Acronym Anymore

beartype.vale.Is also supports a rich domain-specific language (DSL) enabling new validators to be synthesized from existing validators with overloaded set operators, including:

  • Negation via ~beartype.vale.Is[lambda obj: {test_expr}], equivalent to beartype.vale.Is[lambda obj: not {test_expr}].
  • And-ing via beartype.vale.Is[lambda obj: {test_expr1}] & beartype.vale.Is[lambda obj: {test_expr2}], equivalent to beartype.vale.Is[lambda obj: {test_expr1} and {test_expr2}].
  • Or-ing via beartype.vale.Is[lambda obj: {test_expr1}] | beartype.vale.Is[lambda obj: {test_expr2}], equivalent to beartype.vale.Is[lambda obj: {test_expr1} or {test_expr2}].
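Under the hood, a DSL like this reduces to ordinary Python operator overloading. Here's a minimal pure-Python sketch of operator-composable validators — emphatically *not* beartype's actual implementation, just an illustration of the mechanism the operators above rely on:

```python
# Minimal sketch of an operator-composable validator, loosely mirroring the
# "beartype.vale.Is" DSL. This is NOT beartype's real implementation.
class Validator:
    def __init__(self, is_valid):
        self.is_valid = is_valid

    def __and__(self, other):
        # And-ing: both validators must accept the object.
        return Validator(lambda obj: self.is_valid(obj) and other.is_valid(obj))

    def __or__(self, other):
        # Or-ing: either validator may accept the object.
        return Validator(lambda obj: self.is_valid(obj) or other.is_valid(obj))

    def __invert__(self):
        # Negation: accept exactly what the original validator rejects.
        return Validator(lambda obj: not self.is_valid(obj))

IsLengthy  = Validator(lambda text: len(text) > 80)
IsSentence = Validator(lambda text: bool(text) and text[-1] == '.')

# Compose new validators from old ones, exactly as the DSL above does.
IsShortSentence = ~IsLengthy & IsSentence
assert IsShortSentence.is_valid('Short and sweet.')
assert not IsShortSentence.is_valid('No terminal period')
```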

Example 2: Validate My Strings or GTFO

Nonsensical string matching example or it didn't happen, so let's validate a passed object as a string either of at least 80 characters or both quoted and suffixed by a period. Look, it doesn't matter. Just do it already, @beartype!

from beartype import beartype
from beartype.vale import Is
from typing import Annotated

# Beartype-specific data validators defined as lambda functions.
IsLengthy = Is[lambda text: len(text) > 80]
IsSentence = Is[lambda text: text and text[-1] == '.']

# Beartype-specific data validator defined as a non-lambda function.
def _is_quoted(text): return '"' in text or "'" in text
IsQuoted = Is[_is_quoted]

# Combine multiple validators by just listing them sequentially.
@beartype
def desentence_lengthy_quoted_sentence(
    text: Annotated[str, IsLengthy, IsSentence, IsQuoted]) -> str:
    '''
    Strip the suffixing period from a lengthy quoted sentence... *just 'cause.*
    '''

    return text[:-1]  # this is horrible

# Combine multiple validators by just "&"-ing them sequentially. Yes, this is
# exactly identical to the prior function... just 'cause.
@beartype
def desentence_lengthy_quoted_sentence_part_deux(
    text: Annotated[str, IsLengthy & IsSentence & IsQuoted]) -> str:
    '''
    Strip the suffixing period from a lengthy quoted sentence... *just 'cause.*
    '''

    return text[:-1]  # this is still horrible

# Combine multiple validators with as many "&", "|", and "~" operators as you
# can possibly stuff into a file that your coworkers can stomach. They will
# thank you later... possibly much, much later.
@beartype
def strip_lengthy_or_quoted_sentence(
    text: Annotated[str, IsLengthy | (IsSentence & ~IsQuoted)]) -> str:
    '''
    Strip the suffixing character from a string that is lengthy and/or an
    unquoted sentence, because your web app deserves only the best data.
    '''

    return text[:-1]  # this is frankly outrageous

There's No Catch, I Swear and I Cannot Tell a Lie

Everything above fully complies with PEP 593 and thus requires Python ≥ 3.9. See help(beartype.vale.Is) in your favourite Python REPL (...which is, of course, Jupyter Lab, because I see that you are an end user of culture) for full usage instructions, complete with real-world examples.

Thus ends my last hair follicle. 👨‍🦲

@leycec leycec closed this as completed Apr 25, 2021
leycec added a commit that referenced this issue May 25, 2021
This release brings titillating support for
**[beartype validators][beartype validators]**, **Python 3.10**, [**full
PEP 563 – "Postponed Evaluation of Annotations" compliance**][PEP 563],
and [**full PEP 586 – "Literal Types" compliance**][PEP 586]. This release
resolves **4 outstanding issues** and merges **1 pending pull request.**
Changes include:

## Features Added

* **[Beartype validators][beartype validators],** the world's first
  PEP-compliant validation API. Validate anything with two-line type
  hints designed by you, built by the `@beartype` decorator for you. The
  new public `beartype.vale` subpackage enables `beartype` users to
  design their own PEP-compliant type hints enforcing arbitrary runtime
  constraints on the internal structure and contents of parameters and
  returns via user-defined lambda functions and nestable declarative
  expressions leveraging familiar `typing` syntax – all seamlessly
  composable with standard type hints through an expressive
  domain-specific language (DSL). Specifically, `@beartype`-decorated
  callables may now be annotated by type hints of the form
  `typing.Annotated[{cls}, beartype.vale.Is[lambda obj: {test_expr1}],
  ..., beartype.vale.Is[lambda obj: {test_exprN}]]`, where:
  * `{cls}` is any arbitrary class (e.g., `str`, `numpy.ndarray`).
  * `{test_expr1}` and `{test_exprN}` are any arbitrary expressions
    evaluating to booleans (e.g., `len(obj) <= 80`, `obj.dtype ==
    np.dtype(np.float64)`).
  `beartype.vale.Is` may also be subscripted (indexed) by non-lambda
  callables with similar signatures. For convenience, `beartype.vale.Is`
  objects support a rich domain-specific language (DSL) enabling new
  validators to be synthesized from existing validators with Pythonic
  set operators:
  * **Negation** with `~beartype.vale.Is[lambda obj: {test_expr}]`,
    equivalent to
    `beartype.vale.Is[lambda obj: not {test_expr}]`.
  * **And-ing** with `beartype.vale.Is[lambda obj: {test_expr1}] &
    beartype.vale.Is[lambda obj: {test_expr2}]`, equivalent to
    `beartype.vale.Is[lambda obj: {test_expr1} and {test_expr2}]`.
  * **Or-ing** with `beartype.vale.Is[lambda obj: {test_expr1}] |
    beartype.vale.Is[lambda obj: {test_expr2}]`, equivalent to
    `beartype.vale.Is[lambda obj: {test_expr1} or {test_expr2}]`.
  This syntax fully complies with [PEP 593][PEP 593] and thus requires
  Python ≥ 3.9. See [*Beartype validators*][beartype validators] for
  full usage instructions, complete with real-world examples including
  tensors. Rejoice machine learning data scientists! This resolves issue
  #32, kindly submitted by fashionable London steampunk cat pimp
  @Saphyel (Carlos Jimenez).

## Features Optimized

* **Package importation.** The first importation of both the `beartype`
  package and `@beartype` decorator has been significantly optimized,
  now consuming on the order of microseconds rather than milliseconds
  (or even seconds in the worst case). This critical optimization should
  significantly improve runtime performance for short-lived CLI
  applications. Isn't that great, guys? ...guys? *awkward cough*
* **Wrapper function attributes.** The `@beartype` decorator now
  generates unconditionally faster type-checking wrapper functions.
  Previously, attributes accessed in the bodies of those functions were
  indirectly resolved through a common dictionary singleton (the
  "beartypistry") passed to those functions; while trivial, this
  approach had the measurable harm of one dictionary
  lookup for each attribute access in those functions. Now, the same
  attributes are instead directly passed as optional private
  beartype-specific parameters to these functions; while non-trivial,
  this approach has the measurable benefit of avoiding *any* dictionary
  lookups by instead localizing all requisite attributes to the
  signatures of those functions. Of course, this isn't just an
  optimization; this is also a hard prerequisite for supporting both
  ["PEP 586 -- Literal Types"](https://www.python.org/dev/peps/pep-0586)
  and beartype validators. The beartypistry singleton remains used only
  to dynamically resolve forward references to undeclared user types.
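  The trick above can be sketched in two lines of plain Python. This is a
  hypothetical micro-example of the optimization being described — the
  names `_beartypistry` and `__beartype_checker` are illustrative, not
  beartype's actual internals:

  ```python
  # Shared registry of attributes needed by generated wrapper functions.
  _beartypistry = {'is_str': lambda obj: isinstance(obj, str)}

  # Old approach: one dictionary lookup on *every* call.
  def check_old(obj):
      return _beartypistry['is_str'](obj)

  # New approach: the attribute is bound once, at definition time, as the
  # default of a private parameter -- a cheap local variable at call time.
  def check_new(obj, __beartype_checker=_beartypistry['is_str']):
      return __beartype_checker(obj)

  assert check_old('grizzly') and check_new('grizzly')
  assert not check_old(0xBEA2) and not check_new(0xBEA2)
  ```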

## Compatibility Improved

* **Python >= 3.10.0.** `@beartype` now officially supports Python 3.10,
  currently in beta but maybe-soon-to-be-released thanks to
  Python's accelerated release schedule. Python 3.10 significantly broke
  backwards compatibility with runtime introspection of type hints and
  thus runtime type checkers, complicating support for Python 3.10 for
  most runtime type checkers (including us). Specifically, Python 3.10
  unconditionally enables ["PEP 563 -- Postponed Evaluation of
  Annotations"](https://www.python.org/dev/peps/pep-0563) – an abysmal
  standard preferentially improving the efficiency of statically
  type-checked applications by reducing the efficiency of applications
  also checked by runtime type checkers. We can only protest with skinny
  fists lifted like antennas to GitHub. *Praise be to Guido.*
* **[PEP 563 – "Postponed Evaluation of Annotations"][PEP 563].** While
  `beartype 0.1.1` only partially supported [PEP 563][PEP 563],
  `@beartype` now fully supports all edge cases associated with [PEP
  563][PEP 563] – including postponed methods, nested functions,
  closures, and forward references. Forward references merit particular
  mention: they are fundamentally indistinguishable from
  [PEP 563][PEP 563]-postponed type hints, because [PEP 563][PEP 563]
  was never intended to be usable at runtime. Unsurprisingly, it isn't.
  While numerous Python packages
  superficially support [PEP 563][PEP 563] by deferring to the broken
  `typing.get_type_hints()` function, `@beartype` is the first and thus
  far *only* annotation-based Python package to fully support [PEP
  563][PEP 563] and thus Python 3.10.
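  The breakage described above is easy to reproduce. Under PEP 563, every
  annotation is stored as an unevaluated string, behaving like the explicit
  string annotation in this hypothetical example — and the standard
  `typing.get_type_hints()` resolver cannot see types scoped to enclosing
  functions:

  ```python
  import typing

  def make_checker():
      # With PEP 563 enabled, *every* annotation behaves like this
      # explicit string annotation: stored unevaluated until resolved.
      def is_counter(obj: 'Counter') -> bool:
          return isinstance(obj, Counter)
      class Counter: ...
      return is_counter

  checker = make_checker()

  # The annotation survives only as a string...
  assert checker.__annotations__['obj'] == 'Counter'

  # ...and the standard resolver cannot see the locally scoped "Counter"
  # class, so the forward reference fails to resolve at runtime.
  try:
      typing.get_type_hints(checker)
      resolved = True
  except NameError:
      resolved = False
  assert not resolved
  ```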
* **[PEP 586 – "Literal Types"][PEP 586].**  The `@beartype` decorator
  now fully supports the [new `typing.Literal` type hint introduced by
  Python ≥
  3.9](https://docs.python.org/3/library/typing.html#typing.Literal).
  Note, however, that beartype validators offer similar but
  significantly more practical support for type hint-based equality
  comparison in our new `beartype.vale.IsEqual` class.

## Issues Resolved

* **`typing.OrderedDict` under Python 3.7.0 and 3.7.1.** `@beartype` now
  conditionally imports the `typing.OrderedDict` singleton *only* if the
  active Python interpreter targets Python ≥ 3.7.2, the patch release
  that bizarrely changed the [public `typing` API by introducing this
  new public
  attribute](https://docs.python.org/3/library/typing.html#typing.OrderedDict).
  Doing so improves compatibility with both Python
  3.7.0 and 3.7.1 *and* resolves issue #33 – kindly reported by
  @aiporre, the dancing unicorn that radiates sparkles named Ariel.

## Tests Improved

* **Test coverage.** The test suite for `@beartype` now automatically
  generates test coverage metrics – resolving #20:
  * Locally via the third-party `coverage` package (if importable under
    the active Python interpreter). `@beartype` intentionally leverages
    the `coverage` package directly rather than its higher-level
    `pytest-cov` wrapper, as the latter offers no tangible benefits over
    the former while suffering various tangible harms. These include:
    * Insufficient configurability, preventing us from sanely generating
      XML-formatted reports via our existing `tox.ini` configuration.
    * Ambiguous output, preventing us from sanely differentiating
      expected from unexpected behaviours.
    * Argumentative and strongly opinionated developers, which is
      frankly *never* a good look for open-source volunteerism.
  * Remotely via the [third-party Codecov.io coverage
    service](https://about.codecov.io), integrated with the
    [codecov/codecov-action](https://github.com/codecov/codecov-action)
    action now performed on each commit and pull request by our GitHub
    Actions continuous integration (CI) workflow.
* **Python Development Mode (PDM).** The PDM (e.g., `-X dev`,
  `PYTHONDEVMODE`) is now enabled by default under both pytest and tox
  and thus continuous integration (CI), mildly improving the robustness
  of our test suite in edge cases that absolutely should *never* apply
  (e.g., GIL and memory safety violations) but probably will, because
  bad things always happen to good coders. It's, like, a law.
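  For readers wanting the same safety net, here is a minimal `tox.ini`
  fragment enabling the PDM (hypothetical; beartype's actual configuration
  may differ):

  ```ini
  # tox.ini -- fragment enabling the Python Development Mode
  # (equivalent to passing "-X dev") in every test environment.
  [testenv]
  setenv =
      PYTHONDEVMODE = 1
  ```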

## Documentation Revised

* **See Also.** The *See Also* section of our front-facing `README.rst`
  documentation has been significantly expanded with:
  * A comparative review of all known runtime type checkers.
  * A new *Runtime Data Validators* subsection enumerating all known
    runtime validation (e.g., contract, trait) packages.

  [beartype validators]: https://github.com/beartype/beartype#beartype-validators
  [PEP 563]: https://www.python.org/dev/peps/pep-0563
  [PEP 586]: https://www.python.org/dev/peps/pep-0586
  [PEP 593]: https://www.python.org/dev/peps/pep-0593

(*Winsome winners ransom random dendritic endoscopy!*)
@xerz-one

xerz-one commented May 29, 2022

Hey there, any chance we'll get those higher-level beartype.constraints? I was particularly interested in RegexConstraint, as that seems better than compiling a regex somewhere (globally?) and then passing that regex object into a validator lambda without even checking its type first, or else compiling the regex on each validation.

@leycec
Member

leycec commented May 31, 2022

So much "Yes." Thanks for reminding me about my shameful laziness, @xerz-one. The issue of high-level constraints has, indeed, lain dormant for too long.

Most of the constraints originally listed by @Saphyel above ultimately reduce to regular expression-based matching. But as you astutely suggest, that itself raises non-trivial questions relating to space and time efficiency... like:

  • Do we compile the regular expression?
  • If so, where do we cache the compiled regular expression to?

These are fascinating conundrums. Let's open a new feature request for this, @beartype!
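
One plausible answer to both questions is a hypothetical sketch like the following: compile eagerly at hint definition time, and cache the compiled pattern process-wide, keyed on the pattern string. Names like `regex_validator` are illustrative here, not a real beartype API:

```python
import re
from functools import lru_cache

@lru_cache(maxsize=None)
def _compile_cached(pattern: str) -> re.Pattern:
    # Runs exactly once per distinct pattern string; every later
    # validation reuses the same compiled pattern object.
    return re.compile(pattern)

def regex_validator(pattern: str):
    compiled = _compile_cached(pattern)  # compiled once, up front
    # The returned callable is what a hint's "Is[...]" would subscript.
    return lambda text: compiled.fullmatch(text) is not None

is_hex_color = regex_validator(r'#[0-9a-fA-F]{6}')
assert is_hex_color('#ff8800')
assert not is_hex_color('red')
```

Note that `re.compile` already maintains a small internal cache of compiled patterns, but an explicit module-level cache makes the lifetime and eviction policy deliberate rather than incidental.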
