Instantiation performance vs. dataclasses #12

Closed
hohav opened this issue Dec 31, 2020 · 7 comments · Fixed by #14
Labels
question Further information is requested

Comments

@hohav

hohav commented Dec 31, 2020

Thanks for creating dataclassy! In terms of features it does everything I want, but I'm also very sensitive to performance since I'm working on a project that constructs hundreds of thousands of small objects in a tight loop.

I've observed roughly a 33% slowdown for small object creation between dataclasses and dataclassy in CPython 3.9.0 (benchmark below). Do you have any thoughts on this? How much of a priority is performance for dataclassy? Would you welcome a PR that closed the performance gap at the cost of significantly longer or more complex code?

from timeit import timeit
import dataclasses
import dataclassy

@dataclasses.dataclass
class Foo1:
    __slots__ = 'x', 'y'
    x: int
    y: int

@dataclassy.dataclass(slots=True)
class Foo2:
    x: int
    y: int

def f1():
    Foo1(1, 2)

def f2():
    Foo2(1, 2)

print(timeit(f1, number=1000000))
print(timeit(f2, number=1000000))
@biqqles
Owner

biqqles commented Dec 31, 2020

Hi there. I have been thinking about performance recently - for example see #10 where I am testing a change with the aim of improving performance in a different scenario. Earlier this month I created some rudimentary benchmarks and got the same results as you. Performance is something I am happy to focus on now I'm confident dataclassy has achieved its primary goal of being more expressive and enjoyable to use than dataclasses.

Now, as to what I think specifically:

  • dataclasses is fast because it minimises the amount of Python code CPython has to execute when an instance is initialised. It does this by only touching __init__. dataclassy currently uses __new__ or __call__ for initialisation so that it stays out of the user's way, leaving them free to define an __init__ that takes the role of __post_init__. The initialisation logic in the two libraries (setting instance variables to parameters) is identical; what slows dataclassy down is, more or less, having to call object.__new__ and return the object from Python.
  • dataclassy absolutely can gain the same performance in the simple case. Changing it to generate __init__ instead of __new__ required changing a grand total of 4 lines of code and, in my benchmarks, results in performance identical to (if not marginally better than) dataclasses. Of course this totally breaks the user-defined init feature, but the optimisation can be applied conditionally.
  • This "pay for what you need" paradigm is appealing and wouldn't be difficult or especially longwinded to implement. I have a few ideas for other optimisations too, but they're much less dramatic than this.
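The cost described in the first bullet can be seen in isolation with a minimal sketch. ViaInit, ViaNew and compare are hypothetical names chosen for illustration; this is not the code that either library generates, only two hand-written classes that assign the same fields by the two routes.

```python
from timeit import timeit


class ViaInit:
    """Fields are assigned in __init__, the approach dataclasses takes."""
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


class ViaNew:
    """Fields are assigned in __new__, which must create and return the object."""
    __slots__ = ('x', 'y')

    def __new__(cls, x, y):
        self = object.__new__(cls)  # explicit allocation, done from Python code
        self.x = x
        self.y = y
        return self


def compare(number=200_000):
    """Return (init_seconds, new_seconds) for `number` instantiations of each class."""
    t_init = timeit(lambda: ViaInit(1, 2), number=number)
    t_new = timeit(lambda: ViaNew(1, 2), number=number)
    return t_init, t_new


if __name__ == '__main__':
    t_init, t_new = compare()
    print(f'__init__: {t_init:.3f}s  __new__: {t_new:.3f}s')
```

Both classes end up with identical instances; only the route taken to get there differs, so any gap in the timings comes from the extra work done in __new__. The exact ratio will vary by interpreter version and machine.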

biqqles added the question label on Dec 31, 2020
@biqqles
Owner

biqqles commented Dec 31, 2020

OK, I realised that, with the simplifications made to how initialisation works over the months, I actually do not need to mess around with __new__ and __call__ and can do everything with __init__, meaning the performance upgrade is attainable across all use-cases. One test, for decorator reuse on a blank class, is still failing, and I need to figure out why.
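One way of doing everything with __init__ while keeping a user-defined __init__ in its post-init role might look like the sketch below. It is a hypothetical illustration, not the code in dataclassy: simple_dataclass, _user_init and Point are invented names, and the sketch assumes the user's __init__ takes no arguments beyond self.

```python
def simple_dataclass(cls):
    """Hypothetical decorator: generate __init__ from annotations, then call the user's own __init__."""
    fields = list(getattr(cls, '__annotations__', {}))
    user_init = cls.__dict__.get('__init__')  # None if the class body defined no __init__

    signature = ', '.join(['self'] + fields)
    lines = [f'def __init__({signature}):']
    lines += [f'    self.{name} = {name}' for name in fields]
    if user_init is not None:
        lines.append('    _user_init(self)')  # user code runs after the fields are set
    if len(lines) == 1:
        lines.append('    pass')  # blank class: nothing to assign, nothing to call

    namespace = {'_user_init': user_init}
    exec('\n'.join(lines), namespace)  # compile the generated source
    cls.__init__ = namespace['__init__']
    return cls


@simple_dataclass
class Point:
    x: int
    y: int

    def __init__(self):
        # Behaves like __post_init__: x and y already exist at this point.
        self.norm1 = abs(self.x) + abs(self.y)
```

With this arrangement Point(3, -4) sets x and y in the generated __init__ and then runs the user's body, so no __new__ or __call__ override is involved.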

@biqqles
Owner

biqqles commented Dec 31, 2020

OK, the prototype is in the performance branch. All tests pass (though it needs neatening) but I'm now off to enjoy the new year. Happy New Year to you too.

@hohav
Author

hohav commented Jan 1, 2021

Wow, that was quick! I tried the performance branch, and (for my use at least) it closely matches dataclasses' performance. Let me know if you want any help with further testing or implementation. Thanks, and happy new year to you too!

@biqqles
Owner

biqqles commented Jan 11, 2021

@TylerYep, would you be able to check that the code in the performance branch works in your applications? This is quite a big internal change and it would be good if a few people could check it doesn't break anything. It should also fix the instantiation performance problem you noted in #6.

@TylerYep

TylerYep commented Jan 14, 2021

Hey, I tested it out (I checked out the version of the code on the open PR) and it works great!
Prior to this, dataclassy was taking around 30+ seconds to complete either of the two benchmark tests.

My quick benchmark (my application is essentially a SAT solver that creates roughly 2^n instances of the dataclass).
I am using cProfile to measure the runtimes.

                dataclasses         dataclassy
Small N:          2.2 sec            2.3 sec
Larger N:         4.7 sec            4.9 sec
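For anyone wanting to take a comparable measurement, a cProfile run over an instantiation-heavy workload could be set up roughly as follows. The Assignment class, build_instances and profile_build are hypothetical stand-ins, not the SAT solver measured above; swap the decorator to compare libraries.

```python
import cProfile
import pstats
from dataclasses import dataclass


@dataclass
class Assignment:  # hypothetical stand-in for the application's dataclass
    index: int
    value: bool


def build_instances(n):
    """Create 2**n small instances, mimicking the workload's growth."""
    return [Assignment(i, i % 2 == 0) for i in range(2 ** n)]


def profile_build(n):
    """Run build_instances under cProfile and return the total profiled time in seconds."""
    profiler = cProfile.Profile()
    profiler.enable()
    build_instances(n)
    profiler.disable()
    return pstats.Stats(profiler).total_tt


if __name__ == '__main__':
    for n in (16, 18):
        print(f'n={n}: {profile_build(n):.2f} sec')
```

Note that cProfile adds overhead to every function call, so its absolute times are higher than those from timeit; it is best used for comparing the two libraries under the same conditions.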

@biqqles
Owner

biqqles commented Jan 14, 2021

Awesome, thanks!

biqqles linked a pull request on Jan 17, 2021 that will close this issue
3 participants