Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

init hooks #68

Closed
glyph opened this issue Aug 21, 2016 · 96 comments
Closed

init hooks #68

glyph opened this issue Aug 21, 2016 · 96 comments
Labels

Comments

@glyph
Copy link
Contributor

glyph commented Aug 21, 2016

Sometimes I want to write an attr.s class that has some behavior in its constructor. Particularly:

  • I want to validate between/among different related values passed to the c'tor
  • I want to establish a bidirectional relationship to one of the arguments passed to the c'tor (for example: protocol/transport hook-up)
  • I want to do some gross I/O because it's a handle for some external gross I/O object (i.e. I want to open something, for example) (sorry)

Arguably, many of these things could be expressed as a function, but some of them involve preserving valid structural relationships between the object and its dependencies, so I'd really prefer to be able to use __init__ in some cases.

I know, I can write my own __init__. I know! But if I do that, I give up not just on the convenience of attr.s writing the __init__ for me, but also the useful behavior of attr.s: i.e. validators, default factories.

All of my use-cases here would be easily accomplished with either a pre-__init__ and post-__init__ hook. Most of the ones I'm actually interested in would all be achievable with a post-__init__ hook; I think pre-__init__ might be pointless given the availability of convert=.

This has been discussed already, but some of the other issues (#33 #24 #38 #38 #58) which have raised the possibility of a post-__init__ hook have been somewhat conflated with better support for inheritance. As we all know, inheritance is deprecated and will be removed in python 3.7 so we don't need to focus on that.

@hynek hynek added this to the 16.1 milestone Aug 21, 2016
@hynek hynek added the Feature label Aug 21, 2016
@Insoleet
Copy link
Contributor

As we all know, inheritance is deprecated and will be removed in python 3.7 so we don't need to focus on that.

I think the problem is that many libraries are based on inheritance. Gtk-python, PyQt, Django... I think it would reduce too much the use cases of attr if inheritance was totally not supported.

@wearpants
Copy link

Re: __pre_init__ could be useful if you need to set up some state on the instance before validating for some reason

@hynek
Copy link
Member

hynek commented Aug 21, 2016

@Insoleet I’m not sure what you’re trying to say; is there anything about this proposal contrary to that?

@Insoleet
Copy link
Contributor

I'm not sure, I would just like to be sure that you guys takes inheritance into account and not just don't care about it :)

The __pre_init__ hook is important for PyQt for example. It is the first thing to do before doing any PyQt related thing in the constructor (see for example : http://stackoverflow.com/questions/12280371/python-runtimeerror-super-class-init-of-s-was-never-called )

@hynek
Copy link
Member

hynek commented Aug 22, 2016

Random idea time! What do y’all think about:

import attr

class C:
    x = attr.ib()

   @attr.hooks.post_init
   def i_dont_care_how_its_called(self):
       # ...

Positive:

  • It’s much cleaner than adding another argument to attr.s
  • More explicit. Someone who never heard of attrs can deduce the meaning immediately.
  • you can have multiple of them which is useful in subclassing

Negative:

  • can be rather confusing if multiple hooks are registered?

@wearpants
Copy link

+1

I'd say just call them in order of definition within class... For
subclasses, after super().init()

On Aug 22, 2016 4:49 AM, "Hynek Schlawack" notifications@github.com wrote:

Random idea time! What do y’all think about:

import attr
class C:
x = attr.ib()

@attr.hooks.post_init
def i_dont_care_how_its_called(self):
# ...

Positive:

  • It’s much cleaner than adding another argument to attr.s
  • you can have multiple of them which is useful in subclassing

Negative:

  • can be rather confusing if multiple hooks are registered?


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#68 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAe2AJABwv2cDbn6tlD6dPimcn-9FhCrks5qiWKzgaJpZM4JpO43
.

@Tinche
Copy link
Member

Tinche commented Aug 22, 2016

I like the cleanness of the decorator idea, although I'd alias it attr.post_init.

attrs has never played too well with non-attrs superclasses (since we never call super), so I'm not too worried there, and this feature might even improve the situation by allowing users to manually do it themselves.

What about attrs superclasses? Do we call their hooks, and if so in which order? (Presumably MRO or reverse MRO. My gut feeling is reverse MRO, so the basest class first.)

@glyph
Copy link
Contributor Author

glyph commented Aug 22, 2016

What about attrs superclasses? Do we call their hooks, and if so in which order? (Presumably MRO or reverse MRO. My gut feeling is reverse MRO, so the basest class first.)

MRO. The only right way to cooperatively call an overriden method is super(MyClass, self).__init__(). In py3 this is built in to the language; it's what super().__init__(...) does. Reversing the MRO is just a mistake; the basest class is always object and you never want to upcall to its __init__.

@glyph
Copy link
Contributor Author

glyph commented Aug 22, 2016

I very much like the idea of this hook decorator, though! I don't like typing underscores, so my suggestion for the spelling would be @attr.after.init and (maybe later) @attr.before.init ("pre" and "post" are more jargon-y than "before" and "after", and are also prefixes rather than words, which means preinit would be closer to valid, but looks weird).

@glyph
Copy link
Contributor Author

glyph commented Aug 22, 2016

Regarding "if multiple hooks are registered" - my suggestion for the first iteration of this would be to just make registering a second hook into an immediate exception at registration time. If we come up with a non-confusing way to register multiple hooks, it's always fine to add it later.

@Tinche
Copy link
Member

Tinche commented Aug 22, 2016

MRO. The only right way to cooperatively call an overriden method is super(MyClass, self).init(). In py3 this is built in to the language; it's what super().init(...) does. Reversing the MRO is just a mistake; the basest class is always object and you never want to upcall to its init.

Just to clarify my gut feeling: usually when you have a class hierarchy, the first line in your __init__ will be a call to super() (in Java, for example, not doing this on the first line is a compile error if I remember correctly from a past life), then the business logic of that particular __init__. So __init__s get called in the MRO, but the business logic gets executed reverse MRO, with the most base class setting itself up first. Then subclasses can count on the superclass invariants holding.

Since attrs will be in charge of invoking hooks, the user doesn't get to have a choice.

class A:
    x = attr.ib()
    y = attr.ib()

    @after_init
    def after(self):
        # Set me up.
        # If I'm an instance of B, B.after has been called already.
        # Maybe I want to make sure y is greater than x.
        if self.x > self.y:
            self.x, self.y = self.y, self.x

class B(A):
    z = attr.ib()

    @after_init
    def after(self):
        # This gets called before A.after
        # Has the superclass set itself up?
        # Maybe I want to make sure z is greater than y, but I can't yet, because A.after might flip them.

Of course, even in Java you can call super() with the result of an expression or function IIRC, so the truth is you get to execute some code before the call to super anyway. And of course Python isn't Java.
¯_(ツ)_/¯

@max-sixty
Copy link

Have you guys thought about using __new__ for the attr logic, and then letting users use __init__ for themselves?

I've built a similar library (not nearly as impressive, but prior to this coming along), and that technique serves us very well

@glyph
Copy link
Contributor Author

glyph commented Aug 23, 2016

For serialization and deserialization, the expected protocol for __new__ is that it never requires any arguments. Violating this expectation is possible, but ugly. Not that you should use Pickle for most things, but it would be annoying to break the ability to use attrs with multiprocessing, for example, without a bunch of additional explicit work.

@max-sixty
Copy link

For serialization and deserialization, the expected protocol for __new__ is that it never requires any arguments

I don't have much context, so potentially I'm missing something
But serialization & deserialization generally doesn't call __new__ at all. Serialization pickles the contents of __dict__ and then deserialization puts the contents back on __dict__.

We use __getstate__ to only pickle what we want to retain. __new__ still isn't called

@wearpants
Copy link

There are a few variants of the pickle protocol that do call __new__.
It's also useful if you're writing a __setstate__. Finally, this
wouldn't work with inheriting from anything with its own __new__ like
native C extensions (like PyQt) or classes with slots; if memory serves,
you can't have more than one __new__ in an inheritance hierarchy

I may be wrong on some of the details above, it's been quite a while, but
__new__ is not the way to do this IMO

On Aug 23, 2016 12:44 AM, "Maximilian Roos" notifications@github.com
wrote:

For serialization and deserialization, the expected protocol for new
is that it never requires any arguments

I don't have much context, so potentially I'm missing something
But serialization & deserialization generally doesn't call new at
all. Serialization pickles the contents of dict and then
deserialization puts the contents back on dict.

We use getstate to only pickle what we want to retain. new still
isn't called


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#68 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAe2AOj_WFFqjdoBH3pb8fXH24Seu6sKks5qinrCgaJpZM4JpO43
.

@hynek
Copy link
Member

hynek commented Aug 23, 2016

Yeah so any type of classic meta programming is out of the question. attrs classes will always be regular Python classes with some generated methods attached to it. I think there’s enough alternatives that stretch Python’s meta possibilities but for me it was always a big design goal to spit out normal classes without surprises.

@max-sixty
Copy link

This isn't my battle, so I'll leave this here unless really needed. But I'd really encourage you guys to have a read up on how __new__ / __init__ works. The words above aren't correct and I think you'll find it much more pythonic and an easier fit with existing classes (The _post_init stuff makes me wince a bit!).

if memory serves, you can't have more than one __new__ in an inheritance hierarchy

Nope! It's just like __init__

Finally, this wouldn't work with inheriting from anything with its own __new__ like native C extensions (like PyQt) or classes with slots

It would need to super up, in the same way that you'd need to super up if using __init__ when inheriting from a class that relied on __init__. By using __new__, you get out of the way of the much more common user usage of __init__

any type of classic meta programming is out of the question

No meta programming here!

Overall I would love to transition to this library, and you guys have done some amazing work...

@wearpants
Copy link

On Aug 23, 2016 10:01 AM, "Maximilian Roos" notifications@github.com
wrote:

This isn't my battle, so I'll leave this here unless really needed. But
I'd really encourage you guys to have a read up on how new / init
works.

Likewise.

Let's drop it.

@Tinche
Copy link
Member

Tinche commented Aug 23, 2016

Ok looks like we got bogged down in discussing inheritance stuff, which is exactly what @glyph warned against in the OP.

A suggestion:

  • one after_init per class. Like __init__.
  • if your class has an after_init, attrs calls it and only it. If you're inheriting from a class that has an after_init too, call it yourself. Like __init__.
  • if your class doesn't have an after_init, at class creation time we will walk the MRO (in proper MRO, not reverse), find the first class that has it, and insert a call to that. Basically like __init__.

@hynek hynek removed this from the 16.1 milestone Aug 24, 2016
@hynek
Copy link
Member

hynek commented Aug 24, 2016

Removing milestone, this bikeshed will take a while to pain.

@glyph
Copy link
Contributor Author

glyph commented Aug 24, 2016

@hynek - Small PRs, small releases :)

@glyph
Copy link
Contributor Author

glyph commented Aug 24, 2016

It sounds like we're converging on after_init/after.init though. @Tinche's summary is more or less exactly what I think should happen - do you have any objections?

@hynek
Copy link
Member

hynek commented Sep 1, 2016

I feel there has to be the word hook somewhere. attr.hook.after[._]init looks good to me. I know you hate typing (except e-mails ;P) but shit should make some sense when looking at it, agreed?

@glyph
Copy link
Contributor Author

glyph commented Sep 1, 2016

@hynek I specifically wanted to avoid the word "hook", not for brevity, but it sounds like jargon. The word "after" is very clear (this is "after init") and "hook" is something some users might not have heard of, and for the users who have heard of it, it's redundant.

@glyph
Copy link
Contributor Author

glyph commented Sep 1, 2016

@hynek trust me, if I'm going for brevity, you'll know. import a; @a.tt.rs; @a.ft.rr for example :)

@glyph
Copy link
Contributor Author

glyph commented Sep 1, 2016

(I am not strongly opposed though, if you really think it's more readable it's fine with me.)

@vortec
Copy link
Contributor

vortec commented Sep 20, 2016

What's wrong with class methods?

@attr.s
class Object:
    id = attr.ib()
    position = attr.ib(validator=instance_of(int))

    @classmethod
    def create(cls, *args, **kwargs):
        obj = cls(*args, **kwargs)
        obj.position = obj.position + 1
        return obj

obj = Object.create(123, position=2)
print(obj.position)

@wearpants
Copy link

Because then the standard constructor is callable but gives an unitialized
instance. See previous comments

On Sep 20, 2016 9:03 AM, "Fabian Kochem" notifications@github.com wrote:

What's wrong with class methods?

@attr.sclass Object:
id = attr.ib()
position = attr.ib(validator=instance_of(int))

@classmethod
def create(cls, *args, **kwargs):
    obj = cls(*args, **kwargs)
    obj.position = obj.position + 1
    return obj

obj = Object.create(123, position=2)print(obj.position)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#68 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAe2AOBIeKhht6Iib-eGItRqvz_XAl4Eks5qr9mLgaJpZM4JpO43
.

@vortec
Copy link
Contributor

vortec commented Sep 20, 2016

Sorry, I just searched for "class method", not for "classmethod".

@wearpants
Copy link

No worries, I wouldn't read the whole thread either. :-)

@wearpants
Copy link

Here's SO on how to find yourself: http://stackoverflow.com/posts/9934299/revisions

@mathieulongtin
Copy link

Let's step back and talk about the use cases:

  1. You need pre-processing before the attributes are even set
  2. You need post-processing after the attributes are set
  3. A combination of both cases

In case 1 and 3, you might as well write __init__ completely, including attribute assignments. attr is still useful because it generates comparison and representation methods. However, this:

    self.a = a
    self.b = b

is not very different amount of typing, and has a clearer intent than this:

    attr.init(self, a, b)

In case 2, any of the post-init hook solution mentioned above would work. And there is no issue with inheritance in that case.

@glyph
Copy link
Contributor Author

glyph commented Oct 21, 2016

is not very different amount of typing, and has a clearer intent than this:

It's a potentially wrong intent, though. attr.init(self, a, b) will apply coercers and validators to a and b, but manually setting the attributes won't.

@mathieulongtin
Copy link

My thinking is by the time you write def __init__(self, a, b, c):, you might as well do all the coercing and validating and defaulting in init.

@mathieulongtin
Copy link

attr is mostly about not repeating yourself and avoiding boilerplate. attr.init makes for a lot of repeating.

How about this. Kind of like @argh.expect_obj.

@attr.s(wrap_init=True):
class Xyz(object):
  a = attr.ib(default="a"); b = attr.ib(default="b")
  def __init__(self, args):
    # preprocessing, submitted values for a and b are in args.a and args.b
    # they can be modified
    args.set_attr()  # default, type-checks, coerces and set attributes
    # postprocessing goes here, attributes are fully set
  1. attr wraps the given init with a generated one that passes the arguments in the args object.
  2. No need to repeat the list of arguments, just put them in the attr.ib declarations.
  3. args.set_attr() may be broken down in args.set_default(), args.coerce(), args.validate() and args.assign() if fine control is needed.

From the outside, Abc.__init__ takes an optional a and b argument and looks perfectly normal.

@glyph
Copy link
Contributor Author

glyph commented Oct 22, 2016

From the outside, Abc.init takes an optional a and b argument and looks perfectly normal.

Personally (and I suspect @hynek would agree) I find the idea of writing an __init__ but lying about its signature bad. Potentially a better suggestion would be our own special method name (I could have sworn that someone had suggested __post_init__ above). Perhaps __attrs_init__(args)?

@hynek
Copy link
Member

hynek commented Oct 22, 2016

__post_init__ was my original idea heh.

Problem is that it's gonna need some way to signal that it's there (checking using getattr seems gross and not fitting the explicit design of attrs) which brings us back to the decorator solution... 🎠

@mathieulongtin
Copy link

Wouldn't attr_init be able to use type(self).__dict__['__attr_init__'] to grab the class specific init, therefore skipping inheritance issues?

@ambv
Copy link
Contributor

ambv commented Nov 2, 2016

Two ideas.

  1. The order of field declarations is known. So, you could initialize your fields deterministically with a factory, if that factory were given a self attribute.
  2. But since "1." isn't entirely generic, having the generated __init__() call __post_init__() makes sense to me. It is explicit enough already that you use the class decorator so forcing users to put another decorator every time is counter-productive. Isn't attrs all about avoiding boilerplate?

Subclassing and the custom super() ideas all sound inferior to me. They are far less obvious which means I couldn't recommend using them to regular Joes (like myself).

@hynek
Copy link
Member

hynek commented Nov 3, 2016

A new painter appears! :D

At this I think the question is whether we go:

  • magic dunder method, or
  • decorator.

I’m not sure what you mean by subclassing/custom super, because it’s all just about people wanting to subclass non-attrs classes and call super on them. It’s not meant as a mean to solve the problem here but a goal to enable. :)

Thoughts

Magic Dunder

  • It’s more error prone since mistyping the name will result in nothing happening.
  • One hook per class. Or do we traverse the MRO to collect them all? I guess we have to, if the super classes depend on it?

Decorator

  • More explicit.
  • Allows for multiple hooks even throughout hierarchies. Meaning: attrs can collect all hooks and execute them.
  • That has the downside of possible conflicts though?

So personally, I feel like a decorator would be more attrs-style because it’s more explicit and less error prone and plays well WRT to intra-attrs subclassing.

I’d love thoughts on these two approaches.

I guess if someone offered me a good PR for Lukasz’s 1. point I’d begrudgingly merge it.

@ambv
Copy link
Contributor

ambv commented Nov 5, 2016

Well, a counter-argument to the typos in the name is that you can also mistype init, str, etc. and they will all happily do nothing.

As for allowing multiple hooks with a decorator, then you run into issues with ordering. I'd rather only have to override one hook, possibly calling super() inside it. While this also has problems, it's less custom and magical than decorators.

@hynek
Copy link
Member

hynek commented Nov 5, 2016

Yes but what about subclassing? What if someone relies on the hook being run to initialize an attribute? :-/

@glyph
Copy link
Contributor Author

glyph commented Nov 5, 2016

If you go with __post_init__ I think it's actually a lot more straightforward to deal with subclassing.

…reads through the whole previous discussion to make sure nobody else has raised this already…

The problem with __init__ and subclassing is that it doesn't have a fixed signature. So if the MRO might get reordered by diamond inheritance or what have you, you don't know what to call super().__init__ with; it's literally impossible to know in the general case. attrs actually has more information in __init__ than regular old Python code does, so it might actually be able to solve this problem, especially if the other superclasses are also attrs classes.

Let me posit such a solution, actually. Instead of trying to make __init__ cooperative, let's say attrs just makes all inherited attr.ibs part of your constructor. (Maybe it does this already? I dunno.) The important thing is that attrs defines an __init__ that doesn't upcall other __init__s at all, sidestepping the inheritance problem entirely. This uber-__init__ then calls __post_init__ once, at the end of the __init__.

Making __post_init__ itself cooperative is then very easy: it does have a fixed signature. Just super().__post_init__() at the beginning or end of your __post_init__ (depending on your behavioral dependencies); this will correctly invoke __post_init__ up and down the MRO, and you can either depend on your superclasses's post-init behavior, in which case you invoke it early, or you know you're dealing only with your own subclass's state, in which case you put it at the end.

@glyph
Copy link
Contributor Author

glyph commented Nov 5, 2016

After typing that all out and swapping the whole previous conversation into my brain again, I'm coming down more and more in favor of just a __post_init__ special method. In many ways it's less magical than the decorator-based solution.

@mathieulongtin
Copy link

attrs initializes parents attrs class's attributes without ever calling super(). It just inspects the list of arguments, including the parents.

So should pre-init and post-init follow the same rule as attrs' __init__? It gets called magically in the parent class by the generated __init__?

Should attr's generated init be like this:

def __init__(self, ...):
  self.__pre_init__()
  super(Ba, self).__pre_init__()  # do this for all parent classes
  # init attributes
  super(Ba, self).__post_init__()  # do this for all parent classes
  self.__post_init__()

Or should it be like this:

def __init__(self, ...):
  self.__pre_init__()
  # init attributes
  self.__post_init__()

And the defined __pre_init__ and __post_init__ have to call super().__pre_init__ and super().__post_init__, if needed?

@glyph
Copy link
Contributor Author

glyph commented Nov 5, 2016

@mathieulongtin Okay so I just re-invented what it already does? :) (I have very lightly used inheritance with attrs once or twice, but never had a complex hierarchy and never inherited anything that required an __init__, or one attrs class from another.)

@glyph
Copy link
Contributor Author

glyph commented Nov 5, 2016

__pre_init__ is a separate problem; I'm not entirely sure we need it.

__post_init__, though, should be called exactly once at the end of the generated __init__. super() works fine for methods other than __init__; you either rely on your superclass's behavior or not, you either call it first or call it last. So let's not do anything magic, just let users implement inheritance with the regular semantics that are usually used for method overloading.

Honestly, the more I think about it, the more I think that with this hook attrs might actually fix the second biggest problem with subclassing that is specific to Python. (The first biggest being confusion about who owns the self namespace when inheriting across library boundaries...)

@hynek
Copy link
Member

hynek commented Nov 7, 2016

Hm how do you mean “other than __init__”? It would be nice if the feature could stop people from opening issues, that they want to use attrs with PyQT etc.

@hynek
Copy link
Member

hynek commented Nov 20, 2016

Fixed in #111! 🚲 🏠 🌈

@Arcitec
Copy link

Arcitec commented Sep 29, 2019

Semi-related: Here's a ticket (and comment) describing how to run all post-init hooks in a full class inheritance hierarchy with attrs:

#167 (comment)

@Arcitec
Copy link

Arcitec commented Sep 29, 2019

PS, for anyone as confused as I was about the end of the OP's post:

This has been discussed already, but some of the other issues (#33 #24 #38 #38 #58) which have raised the possibility of a post-__init__ hook have been somewhat conflated with better support for inheritance. As we all know, inheritance is deprecated and will be removed in python 3.7 so we don't need to focus on that.

^ That comment is a joke. Class inheritance is not deprecated in python and is not being removed. In fact the python docs at https://docs.python.org/3/tutorial/classes.html#inheritance say "Python classes provide all the standard features of Object Oriented Programming: the class inheritance mechanism allows multiple base classes, a derived class can override any methods of its base class or classes, and a method can call the method of a base class with the same name." and "Of course, a language feature would not be worthy of the name "class" without supporting inheritance."

Python is not removing class inheritance.

@glyph
Copy link
Contributor Author

glyph commented Sep 30, 2019

Python is not removing class inheritance.

The joke is only funny because it should, though.

Looks like we missed the boat for 3.7 but there's always Python 4!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests