Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify asdict and has. #48

Merged
merged 4 commits into from
Aug 8, 2016
Merged

Conversation

Tinche
Copy link
Member

@Tinche Tinche commented Aug 5, 2016

So, as you might have noticed, I'm totally a fan of this library :) Also, we've been using it at work a lot lately, and it's really working out for us. So I thought I might see if anything can be optimized a little.

We use asdict a lot. I actually totally love the way attrs interacts with data contained in Python primitives, like dicts and lists. At a later date I might submit some converters that make these transformations really easy. However, now I was taking a look at asdict performance, to see if it can be optimized.

Actually my first instinct was to see if I can use Cython to speed things up, mostly because I wanted to learn Cython a little more. The changes I'm submitting now have a far greater effect than using Cython though, so I'm not sure Cython's worth it. Maybe later?

Anyway, on to measuring. I'm using the perf library as a more reliable, fancy timeit. (Newest version: pip install git+https://github.com/haypo/perf.git#egg=perf) We actually have a way of generating attrs classes, in tests: the nested_classes Hypothesis strategy. We can use a pre-seeded Random instance to make it reproducible.

For example, seeding with the number 1, pre-changes:

> pyperf timeit -g -s "from attr import asdict; from random import Random; from tests import nested_classes; r = Random(1); C = nested_classes.example(r); c=C()" "asdict(c)"
....................
499 us: 2 ##########################
500 us: 1 #############
502 us: 1 #############
503 us: 0 |
504 us: 1 #############
506 us: 4 #####################################################
507 us: 3 ########################################
508 us: 5 ##################################################################
510 us: 5 ##################################################################
511 us: 4 #####################################################
512 us: 6 ###############################################################################
514 us: 6 ###############################################################################
515 us: 5 ##################################################################
516 us: 6 ###############################################################################
518 us: 1 #############
519 us: 2 ##########################
521 us: 2 ##########################
522 us: 0 |
523 us: 2 ##########################
525 us: 2 ##########################
526 us: 2 ##########################

Median +- std dev: 513 us +- 6 us

Post changes:

> pyperf timeit -g -s "from attr import asdict; from random import Random; from tests import nested_classes; r = Random(1); C = nested_classes.example(r); c=C()" "asdict(c)"
....................
20.6 us: 2 ##################
20.7 us: 2 ##################
20.8 us: 4 ###################################
20.9 us: 5 ############################################
21.0 us: 5 ############################################
21.1 us: 8 ######################################################################
21.2 us: 3 ##########################
21.3 us: 2 ##################
21.4 us: 6 #####################################################
21.5 us: 3 ##########################
21.7 us: 9 ###############################################################################
21.8 us: 4 ###################################
21.9 us: 1 #########
22.0 us: 0 |
22.1 us: 2 ##################
22.2 us: 2 ##################
22.3 us: 0 |
22.4 us: 0 |
22.5 us: 1 #########
22.6 us: 0 |
22.7 us: 1 #########

Median +- std dev: 21.4 us +- 0.4 us

So that's a change of 513 us to 21.4 us. We can get more examples by seeding the Random instance with other numbers. Seeding with 2 will produce a larger class, and the difference will be 1.74 ms +- 0.03 ms to 72.9 us +- 1.4 us.

I think you'll find the changes are reasonable. :)

@codecov-io
Copy link

codecov-io commented Aug 5, 2016

Current coverage is 100% (diff: 100%)

Merging #48 into master will not change coverage

@@           master   #48   diff @@
===================================
  Files           7     7          
  Lines         368   364     -4   
  Methods         0     0          
  Messages        0     0          
  Branches       86    86          
===================================
- Hits          368   364     -4   
  Misses          0     0          
  Partials        0     0          

Powered by Codecov. Last update e833e0c...cf16f36

return False
else:
return True
return hasattr(cl, "__attrs_attrs__")

This comment was marked as spam.

@hynek
Copy link
Member

hynek commented Aug 6, 2016

  1. Thanks for the kind feedback. As you imagine, that’s not something people get a lot :)
  2. I was thinking about playing with Cython to make attrs faster too heh. But my guess is tat it doesn’t really make a lot of sense for the core functionality since it’s mostly dict wrangling that is really fast in Python? In any case it should be optional b/c Cython is sometimes slower than Python on PyPy.
  3. I feel like the changes warrant a changelog entry. :)

@Tinche
Copy link
Member Author

Tinche commented Aug 6, 2016

I don't think I made any behavioral changes, so I can just mention the speedup in the changelog.

A question though. Why does fields() do a deepcopy? The attrs_attrs is already a tuple (so immutable), right? Deep copy is slow as hell, it strikes me this might be a little too much effort to prevent someone from changing the attributes, especially in adults-consenting Python. The speedups in this PR are basically just from avoiding the deep copy.

@hynek
Copy link
Member

hynek commented Aug 7, 2016

Yeah I was kind of overly careful. I kind of still see it's value for user-facing stuff (maybe people don’t realize literally everything breaks if they fuck around with the returned Attributes?) but it def shouldn’t be used for internal stuff. IOW if you find more usages feel free to swap them out. :)

@Tinche
Copy link
Member Author

Tinche commented Aug 7, 2016

Alright, review comments applied.

Why don't we just make the attributes themselves immutable? Then sharing a tuple of immutable objects would always be safe. (Well, not many things in Python are truly immutable, but within reason.)

@hynek
Copy link
Member

hynek commented Aug 8, 2016

Sure that would work too. I didn’t spend too much time thinking about that back then.

@hynek hynek merged commit 588fe48 into python-attrs:master Aug 8, 2016
@Tinche Tinche deleted the feature/fast-asdict branch January 20, 2017 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants