beartype performance is substantially slower than no beartype #58
Comments
Hah-hah! I love fielding questions like this, because overly scrupulous fixation on efficiency is my middle name(s). Thankfully, according to the wizened sages of old and our own Let's delve deeper.
Formulaic Formulas: They're Back in Fashion
First, let's formalize how exactly we arrive at the call-time overheads above. Given any pair of reasonably fair timings (which yours absolutely are) between an undecorated callable and its equivalent
Then the call-time overhead
Plugging in
Again, this superficially seems reasonable – but is it? Let's delve deeper.
Function Call Overhead: The New Glass Ceiling
Next, the added cost of calling Since Python decorators almost always add at least one additional stack frame (typically as a closure call) to the call stack of each decorated call, this measurable overhead is the minimal cost of doing business with Python decorators. Even the fastest possible Python decorator necessarily pays that cost. Our quandary thus becomes: "Is 1–0.01µsec of call-time overhead reasonable or is this sufficiently embarrassing as to bring multigenerational shame upon our entire extended family tree, including that second cousin twice-removed who never sends a kitsch greeting card featuring Santa playing with mischievous kittens at Christmas time?" We can answer that by first inspecting the theoretical maximum efficiency for a pure-Python decorator that performs minimal work by wrapping the decorated callable with a closure that just defers to the decorated callable. This excludes the identity decorator (i.e., decorator that merely returns the decorated callable unmodified), which doesn't actually perform any work whatsoever. The fastest meaningful pure-Python decorator is thus:
def fastest_decorator(func):
    def fastest_wrapper(*args, **kwargs): return func(*args, **kwargs)
    return fastest_wrapper
By replacing
Again, plugging in
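The closure comparison described in this section can be approximated with a short timeit sketch. This is not the timing code used in this thread: the names undecorated, decorated, and per_call_overhead_usec are assumptions made up for illustration, and the numbers it prints will vary by machine and Python version.

```python
from timeit import repeat

def fastest_decorator(func):
    # Minimal closure that only defers to the decorated callable.
    def fastest_wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return fastest_wrapper

def undecorated(arg01="__undefined__", arg02=0):
    # Hypothetical baseline callable, loosely modelled on the test functions in this thread.
    return len(arg01) + arg02

# Same body, wrapped once by the minimal closure above.
decorated = fastest_decorator(undecorated)

def per_call_overhead_usec(baseline, wrapped, number=100_000, rounds=5):
    """Estimate the extra microseconds per call that `wrapped` costs over `baseline`."""
    # Keep the best of several rounds to reduce scheduler noise.
    t_base = min(repeat(lambda: baseline("foo", 1), number=number, repeat=rounds))
    t_wrap = min(repeat(lambda: wrapped("foo", 1), number=number, repeat=rounds))
    return (t_wrap - t_base) / number * 1e6

if __name__ == "__main__":
    usec = per_call_overhead_usec(undecorated, decorated)
    print("closure overhead per call (usec):", round(usec, 4))
```

Both timings go through the same lambda, so its cost should roughly cancel out in the difference.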
Holy Balls of Flaming Dumpster Fires
Holy balls, people. I'm actually astounded myself. Above, we saw that Not only is Of course, even a negligible time delta accumulated over 10,000 function calls becomes slightly less negligible. Still, it's pretty clear that
But, but... That's Not Good Enough!
Yeah. None of us are pleased with the performance of the official CPython interpreter anymore, are we? CPython is that geriatric old man down the street that everyone puts up with because they've seen Pixar's Up! and he means well and he didn't really mean to beat your equally geriatric 20-year-old tomcat with a cane last week. Really, that cat had it comin'. If Does that fully satisfy your thirsty cravings for break-neck performance? If so, feel free to toggle that Close button. If not, I'd be happy to hash it out over a casual presentation of further timings, fake math, and Unicode abuse.
tl;dr
|
Thank you for a great analysis of the situation... as a retrospective, I am taking away the following from this exercise...
Summary: Enforcing PEP 484 slows things down. Duck typing + assert is as fast as we can get. If I must have PEP 484, beartype is the fastest option I've found. Disclaimer: |
These continue to be great questions! Thanks for all the fun engagement here. Let's see if I can't tackle a few of these...
Right. All else being equal, manually inlining code (e.g., with hand-rolled The same is actually true of all languages – not only high-level dynamically typed languages like Python and Ruby. This is why manual loop unrolling is still a thing in C and C++. i am shaking my head
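As a rough illustration of the inlining point, here is a minimal sketch that times the same isinstance check written inline and routed through a helper function. The names check_str, with_helper, and with_inlined_check are hypothetical and do not come from beartype or any library discussed here.

```python
from timeit import timeit

def check_str(value):
    # Hypothetical helper: the check lives behind an extra function call.
    if not isinstance(value, str):
        raise TypeError("expected str")

def with_helper(arg01):
    # Pays for one additional stack frame on every call.
    check_str(arg01)
    return len(arg01)

def with_inlined_check(arg01):
    # Same check written inline, so no extra frame is created.
    if not isinstance(arg01, str):
        raise TypeError("expected str")
    return len(arg01)

if __name__ == "__main__":
    n = 100_000
    print("helper :", round(timeit(lambda: with_helper("foo"), number=n), 4), "seconds")
    print("inlined:", round(timeit(lambda: with_inlined_check("foo"), number=n), 4), "seconds")
```

On CPython the inlined variant would be expected to come out ahead, but the gap depends on the interpreter version, so treat the printed numbers as indicative only.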
Right. This is because Enthought traits isn't actually being compiled to low-level machine code; like all pure-Python code, CPython is just byte-compiling traits into intermediary Python bytecode. But bytecode is still excruciatingly slow, sadly. Given your understandable interest in extreme speed, compilation, and optimizations, I wonder if you're familiar with pydantic? Unlike traits, pydantic is actually being compiled to low-level machine code via Cython. This means that pydantic should be faster than duck typing. The only disadvantage of pydantic's approach is that it will be slower than beartype's approach when JIT-ed by a JIT like PyPy3, because JITs like PyPy3 can't JIT low-level Cython, C, or machine code; they can only JIT pure-Python. In any case... I know, right? It's hard to beat the duck, but pydantic should get you there for 90% of common use cases. 🦆
Careful there! "Static" is an unfortunately overloaded term that means many strange different things in many strange different languages. When Python devs say "static typing," they mean static type checkers that perform their work at static analysis time rather than runtime. So, things like mypy, Facebook's Pyre, Microsoft's pyright, and Google's pytype are what I'm saying. What you probably mean here is more commonly known as type-safe dataclasses. This is also what pydantic and attrs (and eventually beartype itself) do: they inject automated type-checking into each read and write of class and instance variables.
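To make the idea of type-checking on attribute access concrete, here is a minimal sketch built on a plain Python descriptor. It is not how pydantic, attrs, or traits are implemented; the Typed and Main names are hypothetical, and for brevity the check runs only on writes.

```python
class Typed:
    """Hypothetical descriptor that type-checks every write to an attribute."""

    def __init__(self, expected_type):
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        # Remember the public name and store the value under a private one.
        self.public_name = name
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        # The runtime check happens on each assignment.
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"{self.public_name} must be {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        setattr(instance, self.private_name, value)

class Main:
    arg01 = Typed(str)
    arg02 = Typed(int)

    def __init__(self, arg01="__undefined__", arg02=0):
        # These assignments go through Typed.__set__ and are checked.
        self.arg01 = arg01
        self.arg02 = arg02
```

Assigning a non-int to arg02 on a Main instance raises TypeError, which is the per-write cost that class-based approaches pay and a decorated function does not.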
Fascinating! Again, pydantic should handily beat out both if you have Cython installed. Enthought traits and traitlets are actually both pure-Python, as far as I know. If Enthought traits significantly outperforms traitlets, this is likely due to extreme microoptimizations in the former versus the latter. Microoptimization is right up in our wheelhouse here, because beartype itself internally leverages extreme microoptimization to the hilt to extract every last ounce of the wall-clock time slice from CPython.
Actually... beartype should be dramatically faster than all other pure-Python runtime type checkers! This includes Enthought traits, traitlets, and attrs. If this is not the case, please post timings in a separate issue and I'd be delighted to repeatedly bang my head against CPython until we beat 'em up. 🤜 🩸 🤛 Beartype actually began life as a non-PEP 484 type checker. For this reason, If you're at all curious about what the code Thanks again for the fun discourse, Mike! From the Great White North to you, have a great Canadian Thanksgiving. 🦃 🍴 |
Thank you for pointing out pydantic, but pydantic tested worst of all when I configured type validation... see the
Actually Enthought traits compiles the base traits classes... also ref the source ->
My rig...
Test results...
$ python test_me.py
timeit duck getattr time: 0.0432 seconds
timeit duck assert time: 0.0466 seconds
timeit traits time: 0.0743 seconds
timeit bear time: 0.098 seconds
timeit traitlets time: 0.5947 seconds
timeit pydantic time: 1.6075 seconds
Test code:
# Filename: test_me.py
from beartype import beartype
import pydantic
from traits.api import HasTraits as eHasTraits
from traits.api import Unicode as eUnicode
from traits.api import Int as eInt
from traitlets import HasTraits as tHasTraitlets
from traitlets import Unicode as tUnicode
from traitlets import Integer as tInteger
from timeit import timeit
###############################################################################
# define duck type getattr() test function
###############################################################################
def main_duck_getattr(arg01="__undefined__", arg02=0):
    """Proof of concept code implementing duck-typed args and getattr"""
    getattr(arg01, "capitalize") # Type-checking with attributes
    getattr(arg02, "to_bytes") # Type-checking with attributes
    str_len = len(arg01) + arg02
    getattr(str_len, "to_bytes")
    return ("duck_bar", str_len,)
###############################################################################
# define duck type assert test function
###############################################################################
def main_duck_assert(arg01="__undefined__", arg02=0):
    """Proof of concept code implementing duck-typed args and assert"""
    assert isinstance(arg01, str)
    assert isinstance(arg02, int)
    str_len = len(arg01) + arg02
    assert isinstance(str_len, int)
    return ("duck_bar", str_len,)
###############################################################################
# define Enthought traits test class
###############################################################################
class MainTraits(eHasTraits):
    """Proof of concept code implementing Enthought traits args"""
    arg01 = eUnicode()
    arg02 = eInt()
    def __init__(self, *args, **kwargs):
        super(MainTraits, self).__init__(*args, **kwargs)
    def run(self, arg01="__undefined__", arg02=0):
        self.arg01 = arg01
        self.arg02 = arg02
        self.str_len = len(self.arg01) + self.arg02
        return ("traits_bar", self.str_len)
###############################################################################
# define traitlets test class
###############################################################################
class MainTraitlets(tHasTraitlets):
    """Proof of concept code implementing traitlets args"""
    arg01 = tUnicode()
    arg02 = tInteger()
    def __init__(self, *args, **kwargs):
        super(MainTraitlets, self).__init__(*args, **kwargs)
    def run(self, arg01="__undefined__", arg02=0):
        self.arg01 = arg01
        self.arg02 = arg02
        self.str_len = len(self.arg01) + self.arg02
        return ("traitlets_bar", self.str_len)
###############################################################################
# define beartype test function
###############################################################################
@beartype
def main_bear(arg01: str="__undefined__", arg02: int=0) -> tuple:
    """Proof of concept code implementing bear-typed args"""
    str_len = len(arg01) + arg02
    return ("bear_bar", str_len,)
###############################################################################
# define pydantic test class
###############################################################################
class MainPydantic(pydantic.BaseModel):
    """
    Proof of concept code implementing pydantic args
    - Warning: pydantic does NOT validate types by default
    https://github.com/samuelcolvin/pydantic/issues/578
    """
    arg01: pydantic.StrictStr = ""
    arg02: pydantic.StrictInt = 0
    str_len: pydantic.StrictInt = 0
    class Config(object):
        """Configure pydantic parameters"""
        validate_all = True
        validate_assignment = True
    def __init__(self, *args, **kwargs):
        super(MainPydantic, self).__init__(*args, **kwargs)
    def run(self, arg01="__undefined__", arg02=0):
        self.arg01 = arg01
        self.arg02 = arg02
        self.str_len = len(self.arg01) + self.arg02
        return ("pydantic_bar", self.str_len)
if __name__ == "__main__":
    num_loops = 100000
    duck_result_getattr = timeit('main_duck_getattr("foo", 1)', setup="from __main__ import main_duck_getattr", number=num_loops)
    print("timeit duck getattr time:", round(duck_result_getattr, 4), "seconds")
    duck_result_assert = timeit('main_duck_assert("foo", 1)', setup="from __main__ import main_duck_assert", number=num_loops)
    print("timeit duck assert time:", round(duck_result_assert, 4), "seconds")
    traits_result = timeit('mm.run("foo", 1)', setup="from __main__ import MainTraits;mm = MainTraits()", number=num_loops)
    print("timeit traits time:", round(traits_result, 4), "seconds")
    bear_result = timeit('main_bear("foo", 1)', setup="from __main__ import main_bear", number=num_loops)
    print("timeit bear time:", round(bear_result, 4), "seconds")
    traitlets_result = timeit('tt.run("foo", 1)', setup="from __main__ import MainTraitlets;tt = MainTraitlets()", number=num_loops)
    print("timeit traitlets time:", round(traitlets_result, 4), "seconds")
    pydantic_result = timeit('pp.run("foo", 1)', setup="from __main__ import MainPydantic;pp = MainPydantic()", number=num_loops)
    print("timeit pydantic time:", round(pydantic_result, 4), "seconds")
|
Wow! Thanks so much for your thoughtful corrections on Enthought traits and that extensive cross-type checker profiling suite. This is all mega-helpful and amazing stuff right here, which I'll now forcefully meld somehow into our existing but much less amazing test suite. Mike Penning is... The Optimization Boss.
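For a sense of scale, the totals in the test results posted above can be converted into a per-call overhead. The sketch below assumes overhead is defined as the difference from the "duck assert" baseline divided by the loop count of 100,000; that definition and the overhead_usec name are assumptions for illustration, while the timings themselves are copied from the posted results.

```python
NUM_LOOPS = 100_000

# Totals in seconds, copied from the test results posted earlier in this thread.
totals = {
    "duck getattr": 0.0432,
    "duck assert": 0.0466,
    "traits": 0.0743,
    "bear": 0.098,
    "traitlets": 0.5947,
    "pydantic": 1.6075,
}

def overhead_usec(name, baseline="duck assert"):
    """Extra microseconds per call relative to the chosen baseline."""
    return (totals[name] - totals[baseline]) / NUM_LOOPS * 1e6

if __name__ == "__main__":
    for name in totals:
        print(f"{name:>12}: {overhead_usec(name):8.3f} usec/call over duck assert")
```

Under these assumptions beartype lands at roughly half a microsecond per call over the assert baseline, and traits at a bit more than a quarter of a microsecond.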
Poor, poor My only useful thought here is that If
OMG. Just... wowzers. I'm honestly stunned. It's also fascinating that the pure-Python Do you have PyPy3 installed by any chance? You've already gone well above and beyond the call of profiling duty. But... it would be equally fascinating to see how everything fares under PyPy3. In theory, PyPy3 should narrow the performance gap between these six heroic contenders even further. I'm on the edge of my seat here, Mike. Which approach (excluding the two ducking baselines, which should still outperform everything else) will survive this bloody gladiatorial combat? I'm betting on you, fighting grizzly bear! 🏟️ 🐻 ⚔️ 💦 🩸 |
Good question, and fortunately
Well that seems like a testimony of your good work to make The
from traits.api import Range as eRange
#...
str_len = eRange(low=0, high=80, trait=eInt)
That code makes the
I just downloaded the latest pypy3 tarball... it's late and I would like to defer more analysis for another time :-) And I think it's worth saying... You truly represent the open source community well... I don't think I've seen such a lively and engaging discussion in other projects. Props to you! |
FYI... my latest copy of the type eval tests... in retrospect, I probably should have put this in git revision control... something for another day...
# Filename: test_me.py
from beartype import beartype
import pydantic
from traits.api import HasTraits as eHasTraits
from traits.api import Unicode as eUnicode
from traits.api import Int as eInt
from traits.api import Range as eRange
from traits.api import Disallow as eDisallow
from traitlets import HasTraits as tHasTraitlets
from traitlets import Unicode as tUnicode
from traitlets import Integer as tInteger
from timeit import timeit
###############################################################################
# define duck type getattr() test function
###############################################################################
def main_duck_getattr(arg01="__undefined__", arg02=0):
    """Proof of concept code implementing duck-typed args and getattr"""
    getattr(arg01, "capitalize") # Type-checking with attributes
    getattr(arg02, "to_bytes") # Type-checking with attributes
    str_len = len(arg01) + arg02
    getattr(str_len, "to_bytes")
    return ("duck_bar", str_len,)
###############################################################################
# define duck type assert test function
###############################################################################
def main_duck_assert(arg01="__undefined__", arg02=0):
    """Proof of concept code implementing duck-typed args and assert"""
    assert isinstance(arg01, str)
    assert isinstance(arg02, int)
    str_len = len(arg01) + arg02
    assert isinstance(str_len, int)
    return ("duck_bar", str_len,)
###############################################################################
# define Enthought traits test class
###############################################################################
class MainTraits(eHasTraits):
    """Proof of concept code implementing Enthought traits args"""
    arg01 = eUnicode
    arg02 = eInt
    #str_len = eRange(low=0, high=80, trait=eInt)
    str_len = eInt
    # disallow other variables unless they are explicitly called out here...
    _ = eDisallow
    def __init__(self, *args, **kwargs):
        super(MainTraits, self).__init__(*args, **kwargs)
    def run(self, arg01="__undefined__", arg02=0):
        self.arg01 = arg01
        self.arg02 = arg02
        self.str_len = len(self.arg01) + self.arg02
        return ("traits_bar", self.str_len)
###############################################################################
# define traitlets test class
###############################################################################
class MainTraitlets(tHasTraitlets):
    """Proof of concept code implementing traitlets args"""
    arg01 = tUnicode()
    arg02 = tInteger()
    str_len = tInteger()
    def __init__(self, *args, **kwargs):
        super(MainTraitlets, self).__init__(*args, **kwargs)
    def run(self, arg01="__undefined__", arg02=0):
        self.arg01 = arg01
        self.arg02 = arg02
        self.str_len = len(self.arg01) + self.arg02
        return ("traitlets_bar", self.str_len)
###############################################################################
# define beartype test function
###############################################################################
@beartype
def main_bear(arg01: str="__undefined__", arg02: int=0) -> tuple:
    """Proof of concept code implementing bear-typed args"""
    str_len = len(arg01) + arg02
    return ("bear_bar", str_len,)
###############################################################################
# define pydantic test class
###############################################################################
class MainPydantic(pydantic.BaseModel):
    """
    Proof of concept code implementing pydantic args
    - Warning: pydantic does NOT validate types by default
    https://github.com/samuelcolvin/pydantic/issues/578
    """
    arg01: pydantic.StrictStr = ""
    arg02: pydantic.StrictInt = 0
    str_len: pydantic.StrictInt = 0
    class Config(object):
        """Configure pydantic validation parameters"""
        validate_all = True
        validate_assignment = True
    def __init__(self, *args, **kwargs):
        super(MainPydantic, self).__init__(*args, **kwargs)
        if pydantic.compiled is False:
            error = ("During installation, the pydantic module was not"
                     " compiled with cython")
            raise SystemError(error)
    def run(self, arg01="__undefined__", arg02=0):
        self.arg01 = arg01
        self.arg02 = arg02
        self.str_len = len(self.arg01) + self.arg02
        return ("pydantic_bar", self.str_len)
if __name__ == "__main__":
    num_loops = 100000
    duck_result_getattr = timeit('main_duck_getattr("foo", 1)', setup="from __main__ import main_duck_getattr", number=num_loops)
    print("timeit duck getattr time:", round(duck_result_getattr, 4), "seconds")
    duck_result_assert = timeit('main_duck_assert("foo", 1)', setup="from __main__ import main_duck_assert", number=num_loops)
    print("timeit duck assert time:", round(duck_result_assert, 4), "seconds")
    traits_result = timeit('mm.run("foo", 1)', setup="from __main__ import MainTraits;mm = MainTraits()", number=num_loops)
    print("timeit traits time:", round(traits_result, 4), "seconds")
    bear_result = timeit('main_bear("foo", 1)', setup="from __main__ import main_bear", number=num_loops)
    print("timeit bear time:", round(bear_result, 4), "seconds")
    traitlets_result = timeit('tt.run("foo", 1)', setup="from __main__ import MainTraitlets;tt = MainTraitlets()", number=num_loops)
    print("timeit traitlets time:", round(traitlets_result, 4), "seconds")
    pydantic_result = timeit('pp.run("foo", 1)', setup="from __main__ import MainPydantic;pp = MainPydantic()", number=num_loops)
    print("timeit pydantic time:", round(pydantic_result, 4), "seconds")
|
So impressive. That's some seriously industrial-strength profiling. I'm now considering jettisoning our shoddy Bash-based profiling suite in favour of... exactly what you've done! Hypothetically speaking, would you strenuously object to my importing your Two approaches avail us – one that grants you co-ownership over everything and the other that grants you a welcome respite from odorous responsibilities:
Both definitely work. The latter's preferable if you're scarce on free time, volunteer enthusiasm, and the oppressive desire to shame our online rivals; the former's preferable if you're burning with a feverish need to minutely control, micromanage, and fine-tune every aspect of your own transformative genius. Or we could just do nothing and pretend this never happened like that one time our Maine Coon cat mewled incessantly for three hours preceding dinner time (...that was an ugly evening). Avoiding work works, too – but would sadden and depress me. Let's do bold and risky things instead. 👯♀️ 👯♀️ 👯♀️ |
Relatedly, this obsessively fascinates me...
Hah-hah! Suck it,
What will happen? Who will win? I don't know, but I'm gripping my white-knuckled fist with uncertainty. My vaguely hand-wavy suspicion is that Oh, boy. This gettin' gud. 🍿 |
Thank you for your profuse appreciation... officially, I'm a Cisco Systems employee and pretty-much everything I do is copyright Cisco Systems. Obviously, IANAL and I need to ask before we have a final disposition on this code... please give me a few days to work out what officially Cisco wants me to do in this case. |
...ohnoes I didn't mean to publicly put you on the spot, Mike. Now I feel bad, having failed to grok the subtle signs that you were on the company dime. Awkward. This is why I live in the woods, folks. We deal with simple Old World problems here – like how to politely separate an unruly pack of raccoons squabbling over post-Thanksgiving kitchen detritus without losing a hand. it happened Please. Don't go to any absurd trouble on I now formally swear on both of my arthritic pinky fingers that I neither read nor understood (...if I actually read, which I didn't!) a single line of the code you graciously posted above. I swear, Cisco. I didn't see nuffin'. |
Please don't worry... it's 100% my fault if there is a problem. I am not assuming there is or isn't a problem... Cisco has an open-source approval process... I formally asked for approval and we'll see what happens. |
Hello,
I was quite interested when I found this stackoverflow answer about beartype... As a POC, I cooked up a performance test using beartype, Enthought traits, traitlets, and plain-ole-python-ducktyping...
But... I found that beartype was pretty slow in my test.... As an attempt to be as fair as possible, I used assert to enforce types in my duck-typed function (ref - main_duck_assert())... I also confess that Enthought traits are compiled, so the Enthought traits data below is mostly just an FYI...
Running my comparison 100,000 times...
Question
Am I doing something wrong with bear-typing (ref my POC code below)? Is there a way to improve the "beartyped" performance?
My rig...
4.19.0-12-amd64