New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a helper API for assoc + validate #116

Closed
exarkun opened this Issue Nov 29, 2016 · 14 comments

Comments

Projects
None yet
3 participants
@exarkun

exarkun commented Nov 29, 2016

It's common to want to do:

x = assoc(y, a=b)
attr.validate(x)

It would be nice if the easy, obvious way of making a copy of an object with some changes had the validation built in (instead of the easy, obvious way skipping validation).

@hynek

This comment has been minimized.

Member

hynek commented Nov 29, 2016

Huh!

I kind of would’ve expected that assoc() validates implicitly but it clearly isn’t.

I feel like it should!?

I don’t even care as much about possible breakage as about the implicit performance regression. But assoc’s job is to create new objects…

@Tinche

This comment has been minimized.

Member

Tinche commented Nov 29, 2016

Agreed. I was surprised that assoc doesn't create a new object using __init__ in the first place. (In fact it uses copy.copy.) However, this approach works for instances without an __init__.

My gut feeling is it's ok to expect assoc to require an attrs initializer, but I've used Clojure so that maybe colors it. Maybe introduce this requirement and change the implementation to not use copy.copy?

I can provide some benchmarks in the evening.

@Tinche

This comment has been minimized.

Member

Tinche commented Nov 29, 2016

A small addendum: I'm not sure about copy.copy, but copy.deepcopy is so slow as to be basically the worst choice for copying objects if you know there are no cycles. (We need this at work to be fast, and the fastest way we've found is to use pickle. This is almost 20 times faster.)

So there might be a performance win here?

@hynek

This comment has been minimized.

Member

hynek commented Nov 29, 2016

Show us the numbers sir!

@exarkun

This comment has been minimized.

exarkun commented Nov 29, 2016

I'd be happy to have assoc do validation all the time. I thought it might skip validation for consistency with regular attribute setting.

@Tinche

This comment has been minimized.

Member

Tinche commented Nov 29, 2016

I'm linking a module containing three classes of increasing complexity, and a new implementation of assoc using __init__ directly. Here are some benchmarks from my home desktop (64-bit AMD, Linux, CPython 3.5.2) using perf. Numbers are in microseconds.

https://gist.github.com/Tinche/372796f6e0e899ba099810981d5da541

Note that for old assoc, I'm actually using assoc(); validate() since we agree assoc should validate.

                   class A         class B        class C
------------------------------------------------------------------
Old assoc     |  10.7  +- 0.3    12.9  +- 0.2    15.3  +- 0.3
New assoc     |   2.05 +- 0.05    4.94 +- 0.11    8.07 +- 0.18

In more complex classes validators and converters dominate, I think.

The assoc API states if a user puts in a non-existent attribute, an AttrsAttributeNotFoundError should be raised. However, it'd be trivial (and a little faster, but we're talking hundreds of nanoseconds) to actually just propagate a TypeError out, like this:

TypeError: __init__() got an unexpected keyword argument 'aaaa'

this is what would happen if you tried feeding a non-existent field into an ordinary Python class initializer. To me this is more logical. I've implemented the old way to not break compatibility and pass the tests though.

Here are the actual commands if you want to reproduce:

sudo pyperf system tune
pyperf timeit --rigorous -g --duplicate 5 -s "from t import A, B, C; from attr import validate; from attr import assoc; a = A(1); b = B('t', 1, None); c = C(a=2, b='t', c='5', d=None)" "assoc(a, a=1); validate(a)"
pyperf timeit --rigorous -g --duplicate 5 -s "from t import A, B, C; from attr import validate; from t import assoc; a = A(1); b = B('t', 1, None); c = C(a=2, b='t', c='5', d=None)" "assoc(a, a=1)"
pyperf timeit --rigorous -g --duplicate 5 -s "from t import A, B, C; from attr import validate; from attr import assoc; a = A(1); b = B('t', 1, None); c = C(a=2, b='t', c='5', d=None)" "assoc(b, b='b'); validate(b);"
pyperf timeit --rigorous -g --duplicate 5 -s "from t import A, B, C; from attr import validate; from t import assoc; a = A(1); b = B('t', 1, None); c = C(a=2, b='t', c='5', d=None)" "assoc(b, b='b')"
pyperf timeit --rigorous -g --duplicate 5 -s "from t import A, B, C; from attr import validate; from attr import assoc; a = A(1); b = B('t', 1, None); c = C(a=2, b='t', c='5', d=None)" "assoc(c, a=2, i=b''); validate(c)"
pyperf timeit --rigorous -g --duplicate 5 -s "from t import A, B, C; from attr import validate; from t import assoc; a = A(1); b = B('t', 1, None); c = C(a=2, b='t', c='5', d=None)" "assoc(c, a=2, i=b'')"
sudo pyperf system reset

If this looks OK, a pull request can easily be made.

@hynek

This comment has been minimized.

Member

hynek commented Nov 30, 2016

Such vroom vroom, much fast, very nice!

There’s one thing though: does this work with aliased attributes? (i.e. _x is x in __init__). I’m afraid that case has to be handled (and added as a test case). I suspect it’s technically broken too ATM.

@Tinche

This comment has been minimized.

Member

Tinche commented Nov 30, 2016

Hm. What's the desired behavior?

class A:
    _x = attr.ib()

a = A(x=1), assoc(a, x=2) vs a = A(x=1), assoc(a, _x=2)

I guess it comes down to whether you consider assoc a setter or initializer. The current behavior is assoc(a, _x=2), so backwards compatibility favors that.

I'll take a look at what it takes and what the numbers are.

@hynek

This comment has been minimized.

Member

hynek commented Nov 30, 2016

good question :|

it’s kind of both

@hynek

This comment has been minimized.

Member

hynek commented Dec 10, 2016

I’ve meditated over it and it’s totally a setter but we should support both which IMHO shouldn’t be a big deal. We’d have to look twice anyways. I just hope it’s possible to implement without measurable penalties to normal cases. But as I see it, it’s rather faster?

@Tinche

This comment has been minimized.

Member

Tinche commented Dec 10, 2016

I have a version that works with private attributes, just haven't gotten around to submitting it yet.

Uh, support both ways? I think it can be done with a performance penalty only for private attributes.

Semi-relatedly, I think it'd be useful to have a comprehensive benchmark suite for attrs. I don't exactly know what this would look like.

@hynek

This comment has been minimized.

Member

hynek commented Dec 10, 2016

Agreed and I have no idea too. :-| Travis is not deterministic enough so we could just write a benchmark that can be ran locally to get some numbers but I don’t think we can do something automatic.

@hynek

This comment has been minimized.

Member

hynek commented Feb 4, 2017

This has been fixed by evolve which happens to be faster and all-around prettier!

@hynek hynek closed this Feb 4, 2017

@exarkun

This comment has been minimized.

exarkun commented Feb 4, 2017

evolve: #135

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment