Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enum: Add Flags and IntFlags #67779

Closed
serhiy-storchaka opened this issue Mar 5, 2015 · 80 comments
Closed

enum: Add Flags and IntFlags #67779

serhiy-storchaka opened this issue Mar 5, 2015 · 80 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

serhiy-storchaka commented Mar 5, 2015

BPO 23591
Nosy @warsaw, @rhettinger, @ezio-melotti, @bitdancer, @ethanfurman, @serhiy-storchaka, @MojoVampire, @vedgar
Files
  • intflags.patch
  • intflags_2.patch
  • intflags_3.patch
  • issue23591.stoneleaf.02.patch
  • bit_length.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ethanfurman'
    closed_at = <Date 2016-11-21.16:31:14.558>
    created_at = <Date 2015-03-05.15:31:23.274>
    labels = ['type-feature', 'library']
    title = 'enum: Add Flags and IntFlags'
    updated_at = <Date 2016-11-21.16:31:14.556>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2016-11-21.16:31:14.556>
    actor = 'python-dev'
    assignee = 'ethan.furman'
    closed = True
    closed_date = <Date 2016-11-21.16:31:14.558>
    closer = 'python-dev'
    components = ['Library (Lib)']
    creation = <Date 2015-03-05.15:31:23.274>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['38344', '38387', '44114', '44278', '44319']
    hgrepos = []
    issue_num = 23591
    keywords = ['patch']
    message_count = 80.0
    messages = ['237269', '237491', '237494', '237532', '247018', '247026', '247033', '247040', '269018', '269023', '269792', '269919', '270110', '272202', '272230', '272303', '272651', '272674', '272693', '272701', '272719', '272732', '272737', '272741', '272788', '272793', '272795', '272796', '272797', '272798', '272799', '272926', '272927', '272956', '272958', '273934', '273939', '273948', '273954', '273958', '274003', '274048', '274105', '274107', '274206', '274230', '274241', '274279', '274280', '274311', '274317', '274343', '274372', '274373', '274377', '274379', '274403', '274822', '274826', '274828', '274868', '274869', '274870', '274875', '274876', '274975', '274976', '274979', '275093', '275736', '275743', '275801', '275802', '276335', '276348', '276354', '276358', '276907', '277432', '281376']
    nosy_count = 10.0
    nosy_names = ['barry', 'rhettinger', 'ezio.melotti', 'r.david.murray', 'eli.bendersky', 'ethan.furman', 'python-dev', 'serhiy.storchaka', 'josh.r', 'veky']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue23591'
    versions = ['Python 3.6']

    @serhiy-storchaka
    Copy link
    Member Author

    serhiy-storchaka commented Mar 5, 2015

    Here is preliminary implementation of IntFlags (no docs). This is an int subclass purposed to represent a set of integer flags. It supports named constants (as IntEnum), supports bitwise operations (&, |, ~) and has funny str and repr. See discussion on Python-Ideas [1].

    The patch includes tests and few examples of using IntFlags in the stdlib.

    [1] http://comments.gmane.org/gmane.comp.python.ideas/32267

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Mar 5, 2015
    @ethanfurman ethanfurman self-assigned this Mar 6, 2015
    @vadmium
    Copy link
    Member

    vadmium commented Mar 8, 2015

    It would make more sense and be more consistent if the str() and repr() used one’s complement in all cases, i.e.:

    self.assertEqual(str(Perm(~0)), "~0")

    Also, the repr() seems to be doing a bad attempt at Python pseudo code. Instead of

    <Perm.R|W: 3>

    maybe it could be something like these:

    <Perm.R|Perm.W: 3> # Mirroring str() -> "Perm.R|Perm.W"
    <Perm R|W: 3>
    <Perm: R|W = 3>

    I wonder if the addition (+) operator should also be overridden; I haven’t looked, but I suspect some people may do FLAG1 + FLAG2 instead of FLAG1 | FLAG2.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Mar 8, 2015

    The current patch is more along the lines of a proof-of-concept.

    The final IntFlag type (if there is one) would be quite a bit more extensive since part of the reason for its existence is to not lose type -- so pretty much every __op__ would have to be overridden.

    Agreed about the repr() and str(), though.

    @serhiy-storchaka
    Copy link
    Member Author

    serhiy-storchaka commented Mar 8, 2015

    I choose repr() so that for single flag it looks as for IntEnum instance. Other variants except <Perm.R|Perm.W: 3> have different format. <Perm.R|Perm.W: 3> is too verbose, it repeats the same class name multiple times.

    But I don't like current repr.

    Other operators are not overridden for purpose. IntFlags is int and can be used in all cases when int is used, but not in all cases the result should preserve the type. FLAG1 + FLAG2 works as if flags would be ints, but it should be rewritten to FLAG1 | FLAG2 to got the benefit from using IntFlags.

    However I forgot to override the exclusive-or operator. It should be overridden.

    Updated patch changes str and repr, override the exclusive-or operator and adds some documentation.

    @serhiy-storchaka
    Copy link
    Member Author

    serhiy-storchaka commented Jul 21, 2015

    Why have you rejected this issue Ethan? I think this is useful feature and must be in Python, and other core developers agreed with this. Only minor implementation details are discussable.

    @bitdancer
    Copy link
    Member

    bitdancer commented Jul 21, 2015

    I agree. Closing this does not follow our normal development workflow, since there has been a consensus in favor and none for rejection. If you think the API needs wider discussion a new thread can be started on python-ideas.

    @bitdancer bitdancer reopened this Jul 21, 2015
    @bitdancer
    Copy link
    Member

    bitdancer commented Jul 21, 2015

    Ethan: to clarify...based on my years of watching this tracker, the way we normally handle an issue like this is to leave the issue open until it is resolved, sometimes with people proposing competing patches. (There might be reasons to change that tradition, but if so the forum for that discussion would be python-workflow.)

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Jul 21, 2015

    My experience is that a module maintainer, or somebody claiming to speak for the module maintainer, can close any issue in their area at any time regardless of the number of core devs in favor of a change.

    Whatever. I'll leave this open and write up a spec of what IntFlags should look like in case somebody wants to write the patch for it.

    @serhiy-storchaka
    Copy link
    Member Author

    serhiy-storchaka commented Jun 21, 2016

    What should be done for landing this in 3.6?

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Jun 21, 2016

    Just look briefly through your patches, and they look pretty good. I'll take a more in-depth look in the next couple weeks. (Feel free to ping again if you don't see any activity from me, and thanks for your patience.)

    @serhiy-storchaka
    Copy link
    Member Author

    serhiy-storchaka commented Jul 4, 2016

    Ping again.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Jul 7, 2016

    Reviewing...

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Jul 10, 2016

    I don't think I'll have this in before the next alpha (today? tomorrow?) but I'll get it merged in the next couple weeks (need to do some integration work with the other Enum enhancements).

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Aug 9, 2016

    Flags -> int-backed, but not ints; useful for pure Python flags
    IntFlags -> ints (like IntEnum); useful for interfacing with other systems

    Are there better names?

    @ethanfurman ethanfurman changed the title Add IntFlags Add Flags and IntFlags Aug 9, 2016
    @serhiy-storchaka
    Copy link
    Member Author

    serhiy-storchaka commented Aug 9, 2016

    The name IntFlags is inspired by the Flags attribute in C# (this is the most close concept). The "Int" prefix is added for parallel with IntEnum and because the class behaves as int in some contexts (supports operators |, &, ^, ~ and can be passed to functions that require int). I don't know whether there is a need in Flags class that is not a subclass of int.

    Other languages provide something like EnumSet - a specialized set of enums. But it has different interface.

    1. If there are enum members FOO and BAR, their union is {FOO, BAR}, not FOO|BAR.
    2. The union member itself is not a set.
    3. A set can contain only predefined set of values (IntFlag allows non-defined bits be set).

    If you thing that non-int flags could be useful, you can consider adding EnumSet for general enums and IntFlags (or just Flags) for int-like enums. But I think that new API can just use a set or a tuple of enums. IntFlags is needed for compatibility with old API or C API.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Aug 10, 2016

    The idea behind Flags is that they are easily combined enums. The advantage they have over IntFlags is they do not interact with integers, and they cannot have non-existing values used. By keeping the same API as IntFlags we only have two APIs instead of three (Enum and Flag instead of Enum, EnumSet, and IntFlag), and also make life easier for changing code from one to the other (Flag <-> IntFlag).

    @serhiy-storchaka
    Copy link
    Member Author

    serhiy-storchaka commented Aug 14, 2016

    Due to dynamic typing EnumSet actually is not needed in Python. You can use sets, tuples or lists for combining enums in new API. IntFlags is needed for old API that already uses ints. I think introducing non-int Flags is overengineering, can't imagine somebody will use it in new code.

    There is no much time left before feature freeze. If general IntFlags will not added, I'm going to open separate issues for adding specialized class for every particular case (re, os, etc).

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Aug 14, 2016

    IntFlags will be in before feature freeze.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Aug 14, 2016

    I tried to implement this once. The biggest problem I encountered is inconsistency with Enum identity: Enum's main idea is that its objects are pre-created and __new__ just return one of them, so equality is simply identity.

    If you try to do this with flags, you quickly hit the exponential blowup and with most flags applications, it's utterly impractical to construct all the combinations at the start. So you have MyFlags.a|MyFlags.b is not MyFlags.a|MyFlags.b, and that is very weird and unexpected (especially if MyFlags.a is MyFlags.a, which it is if this case just delegates to Enum.__new__).

    How does this patch propose to solve this problem?

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Aug 14, 2016

    Currently, the patch has the main values pre-created (so identity works as expected), and combinations are created on the fly (like normal class instances) -- in which case identity cannot be relied upon.

    Once I have the basic work done, I'll go back and add a cache so each unique value is only created once; at some point I'll make those cached values weakref'ed so they disappear when no longer used.

    @serhiy-storchaka
    Copy link
    Member Author

    serhiy-storchaka commented Aug 15, 2016

    I don't think this is a problem. For now os.O_WRONLY is os.O_WRONLY, but os.O_WRONLY|os.O_CREAT|os.O_TRUNC is not os.O_WRONLY|os.O_CREAT|os.O_TRUNC. There is an exponential number of combinations (actually infinite number if count not named bits) and you shouldn't use identity check for flags.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Aug 15, 2016

    It's true, but it seems that with Enums, we're trying to retrain people to use identity testing (see https://docs.python.org/3/library/enum.html#comparisons). It would be unfortunate if we had to reuntrain them. :-/

    Ethan's proposal (of caching weakrefs) is fine, if he manages to do it correctly (I didn't). It's much more complicated than it seems at first, since it has to work during the class definition itself, too.

    Ok, second problem I encountered: zeros. Imagine

        class MyFlags(enum.Flags):
            NONE = 0
            FIRST = 1 << 0
            SECOND = 1 << 1

    What is MyFlags.FIRST & MyFlags.SECOND? 0 or MyFlags.NONE? Or even False? :-) I would almost be for the third option, if not for the fact that & is used not only for "membership" testing, but also (with ~) for clearing flags. Imitating Go, we might want to introduce another operator, &~, for clearing flags, but that would probably be a too great disruption.

    MyFlags.NONE seems cool, but then we have an enum member that is false, another thing we have tried to avoid with current Enum design (that's why functional API starts values at 1, for example). And what about the case when MyFlags doesn't define NONE, only FIRST and SECOND?

    If you opt for 0, you open another can of worms: first, return type depends on argument values, and Guido hates that. :-) Also, that 0 is still somehow tied to MyFlags: it would be a type error to write (MyFlags.FIRST & MyFlags.SECOND) | AnotherFlags.SOMETHING, right?

    Maybe the correct solution is to _always_ have a special value in every Flags class for such cases, but then what is it's ~NONE? :-) Fortunately, once you add __NONE__ and __ALL__, you get a Boolean closure, and you can meaningfully define &, |, ^ and ~ in a type-safe way. But that's probably overkill.

    And so on... there are a lot of problems, and I've been through them. I'd like you to succeed where I didn't, but I have a bad feeling.

    @serhiy-storchaka
    Copy link
    Member Author

    serhiy-storchaka commented Aug 15, 2016

    You still can use identity testing for named instances of IntFlags. But since the purpose of IntFlags is replacing int flags, tested values can be int (e.g. when read from files as ints). For unknown values you should use either equality testing or wrap them in IntFlags.

    In your example MyFlags.FIRST & MyFlags.SECOND is MyFlags.NONE. If MyFlags.NONE not exists, the result is MyFlags(0). You can apply the patch and experiment with it.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Aug 15, 2016

    Yes, IntFlags sidesteps a lot of these issues (though not all: inconsistency with a lot of principles of IntEnum is still jarring). But I thought we were talking about Flags too (it is not in the patch, as far as I see).

    But now I see that Flags was kinda abandoned, and we're going forward only with IntFlags, right? Maybe the title should be changed?

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 4, 2016

    New changeset 8e9d3a5d47d5 by Ethan Furman in branch 'default':
    bpo-23591: more docs; slight change to repr
    https://hg.python.org/cpython/rev/8e9d3a5d47d5

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Sep 4, 2016

    As a side note, I suspect there would be much less confusion with Flag if we had an AutoEnum.

    No, there would be _different_ confusion. :-P

    Anyway, let's get back to the discussion. I am quite aware about what's the intended use, but you can't just assume people will know about it. In my view, you must do one of two things:

    1. (at least in documentation, and preferably in the code by raising Exceptions at class definition time) forbid the use of Flags whose values are not either a) powers of two, or b) bitwise or of some already declared values

    -or-

    1. Specify and implement a robust algorithm for finding the "cover of bits" for a given numeric value. If the intended algorithm is really "pick the largest member value smaller than given value, subtract and repeat until 0 remains", then it should be said so in the documentation, and preferably some reasons given for the usage of that exact algorithm. ("it was easiest to implement" does not rank very high on my list.)

    In other words, MyFlags(7) should either be illegal, or I should be able to interpret what it will be by reading the documentation. Leaving it unspecified is not acceptable IMO.

    (In case it isn't clear: I'm for option 1. I _don't_ intend to write MyFlags ever in Python. But if I happen to write it unintentionally (e.g. if I forget to declare 2), I would like Python to tell me I'm doing something wrong.)

    (But if you really want option 2 for some reason, that's ok too. I'm just saying I would like it specified, with a rationale if it's not too much of a problem.)

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 5, 2016

    I am quite aware about what's the intended use, but you can't just
    assume people will know about it.

    Fair enough.

    In my view, you must do one of two things:

    1. (at least in documentation, and preferably in the code by raising Exceptions at class definition time) forbid the use of Flags whose values are not either a) powers of two, or b) bitwise or of some already declared values

    The closest I would get to forbidding something would be to forbid non-integers as values -- and I'm not going to do that. In general, Python doesn't take such steps (one notable exception being sum refusing to work with strings because it was such a common trap).

    -or-

    1. Specify and implement a robust algorithm for finding the "cover of
      bits" for a given numeric value. If the intended algorithm is really
      "pick the largest member value smaller than given value, subtract and
      repeat until 0 remains", then it should be said so in the
      documentation, and preferably some reasons given for the usage of that
      exact algorithm. ("it was easiest to implement" does not rank very high
      on my list.)

    How about "it works for me"? Seriously, if every bit is named then the exact repr used is irrelevant. If you don't like the standard method then write your own __repr__ method, because that's what we're talking about -- not the actual value (which isn't going to be affected by the algorithm for finding the "cover of bits", but the display of the names).

    In other words, MyFlags(7) should either be illegal, or I should be
    able to interpret what it will be by reading the documentation. Leaving
    it unspecified is not acceptable IMO.

    Thankfully, docs can still change during beta.

    (In case it isn't clear: I'm for option 1. I _don't_ intend to write
    MyFlags ever in Python. But if I happen to write it unintentionally
    (e.g. if I forget to declare 2), I would like Python to tell me I'm
    doing something wrong.)

    We could certainly add a decorator that double-checks that, just like we have the unique decorator to ensure no duplicates in an Enum. Any idea what to call it? complete? named? allbits?

    (But if you really want option 2 for some reason, that's ok too. I'm
    just saying I would like it specified, with a rationale if it's not
    too much of a problem.)

    (1) isn't going to happen, except possibly as a decorator. If you would like to contribute code and/or docs for (2) and/or a decorator for (1) I'd be happy to review.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 7, 2016

    I would like to add a function that can be used when creating a Flag (and Enum, but that's less important) that will generate the next value.

    This is important because in Flag the values are an important internal detail, but are largely irrelevant to the user (if they weren't then an IntFlag is probably more appropriate).

    Actually, this function already exists and is used to supply the values when using the Functional API: _generate_next_value_ . But that's a bit clunky for use during class definition:

    class Color(Flag):
        red = _generate_next_value_()
        blue = _generate_next_value_()
        green = _generate_next_value_()

    What would be a better name for class use?

    • _next_value_
    • _value_
    • _flag_
    • _next_flag_

    Any others?

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Sep 7, 2016

    I think we had that discussion in the other thread, and concluded that auto() (or _auto_() if you insist) is quite fine. I think it's important to emphasize it's automatically generated, not that it's really "next" in any particular order.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 7, 2016

    I like it! (Although I have no idea which other thread you are talking about.)

    And yes, it needs the leading and trailing underscore so as not to clash with member names.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 7, 2016

    Any problems with:

    class Color(Enum):  # or Color(Flag)
      red = _auto_
      green = _auto_
      blue = _auto_

    In other words:

    • is the missing parenthesis an issue?
    • are linters/checkers/IDEs going to have (or report) problems with that?

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Sep 7, 2016

    For me, it is an issue. But you probably already know that.
    (http://bugs.python.org/issue26988#msg273125) (That's "the other thread".)

    Try to explain: how exactly is that different than wanting "file.close" to close the file, or "quit" to quit the REPL? And I surely hope we're not going in that direction.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 7, 2016

    For me, it is an issue. But you probably already know that.
    (http://bugs.python.org/issue26988#msg273125) (That's "the other thread".)

    Ah, thanks. I had forgotten about that one.

    For what it's worth, I largely agree with you there.

    Try to explain: how exactly is that different than wanting "file.close"
    to close the file, or "quit" to quit the REPL?

    Enum is already full of magic:

    • class is iterable
    • class is indexable
    • member attributes are instances of class
    • members are all created when class is created
    • etc.

    Flag, especially, needs a way to specify "give me a working value" even when the user doesn't care what that value is (as your, um, interesting examples have shown ;).

    Whatever we choose as the placeholder (since it has been decided that no placeholder is too magical) will still have magic involved, so I don't want parenthesis there disguising that.

    I'd also be happy with _member_ instead of _auto_ -- maybe that makes even more sense.

    And I surely hope we're not going in that direction [dropping parens from function calls].

    No, we're not.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Sep 7, 2016

    _member_() is fine with me, though I prefer _auto_(). _auto_member_() is probably too much. :-)

    The argument "we already do magic, so let's do also this bit of non-connected magic" seems very weak to me. But in fact those other things are not _magic_ in the same sense as this one (except repatching __new__, I agree that's weird, but Guido started that game when he declared bool unsubclassable:). They are just added behaviors.

    This is something fundamental: it is breaking the promise that class body is a suite of commands, where Python statements (such as assignment) have their usual semantics. If I ever see, in any suite,

        a = x
        b = x

    and after that a is not b, I feel cheated. Even more strongly: I don't know how to think about the code anymore. Those look like assignment statements, but they are not. What are they? Is this Python?

    And what if you really do want to make aliases? Are you saying that

        a = x
        b = a

    should be fundamentally different than the above? Oh, please let's not go there. :-)

    @rhettinger
    Copy link
    Contributor

    rhettinger commented Sep 7, 2016

    Any problems with:

    class Color(Enum): # or Color(Flag)
    red = _auto_
    green = _auto_
    blue = _auto_

    As long as _auto_ has been defined somewhere (i.e. from enum import _auto_), it is normal Python and doesn't fight with the rest of language or its toolchains.

    The idea does seem to be at odds with the enum module itself. Heretofore, the rule for the module is that if the same object is assigned to multiple variable names, those names become synonyms.

        >>> from enum import Enum
        >>> class Color(Enum):
    	red = 1
    	magenta = 1
    	orange = 2
    	ochre = 2
    
        >>> Color.orange is Color.ochre
        True

    Also, I still question the need, you already have multiple ways to do it:

        Color = Enum('Color', ['red', 'green', 'blue'])
    
        Color = Enum('Color', '''
            red
            green
            blue
        ''')

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 8, 2016

    Vedran commented:

    This is something fundamental: it is breaking the promise that class body
    is a suite of commands, where Python statements (such as assignment) have
    their usual semantics.

    I find it curious that you're okay with

    >>> class Color(Enum):
    ...     red = 1
    ...
    >>> Color.red == 1
    False

    But you find this completely incomprehensible

    >>> class Color(Enum):
    ...    red = _auto_
    ...    blue = _auto_
    ...
    >>> Color.red == Color.blue
    False

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Sep 8, 2016

    Yes, I didn't explain well. I'm ok with postprocessing. After all, since we got PEP-520, it's quite obvious that "the namespace class body is executed in" is not the same as "the final __dict__ of a class".

    But even "the namespace class body is executed in" should be a Python namespace, with all its rules. If objects are the same, then after postprocessing they should also be same. Especially if we currently do have semantics for reassignment: aliasing. "Name is not a part of object" is a fundamental tenet of Python. And _auto_ is _auto_.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Sep 8, 2016

    Since you like examples, what do you say about

        class MyEnum(Enum):
            red = some_function()
            blue = red

    Now, is MyEnum.blue the same as MyEnum.red (watch: not "equal", but "same")? Well, it depends on what some_function returns, right? If it returns _auto_, they are not the same, but in all the other cases, blue is just an alias for red. So object identity depends on some value that could be external to the class. To me that's obviously unacceptable.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 8, 2016

    Just so we're clear: I'm going to go with "auto()" and (mostly) use Python's standard routines (with only a little bit of metaclass magic).

    Vedran opined:

    Since you like examples, what do you say about

    I love examples! Examples are like pictures, which are worth a thousand words. ;)

    class MyEnum(Enum):
    red = some_function()
    blue = red

    Now, is MyEnum.blue the same as MyEnum.red (watch: not "equal", but
    "same")?

    Yes.

    Well, it depends on what some_function returns, right?

    Nope.

    If it returns _auto_, they are not the same, but in all the other
    cases, blue is just an alias for red. So object identity depends on
    some value that could be external to the class. To me that's
    obviously unacceptable.

    That would be unacceptable. However, it would also be impossible*, because "_auto_" would be injected into the class namespace during "__prepare__()" and be unavailable for an external function to return.

    • as impossible as anything is in Python -- so really, really difficult

    But even "the namespace class body is executed in" _should_ be a Python
    namespace, with all its rules.

    One of those rules is: the namespace is dict-like.

    Which means it has methods such as __getitem__, __setitem__, etc., which means those methods can implement whatever is needed to give the namespace the desired semantics (within Python syntax).

    Which, to some extent, is what will be used to transform an instance of "auto" into an appropriate integer. In other words:

    >>> class Color(Enum):
    ...     red = auto()
    ...     print(red)
    ...
    1
    >>> Color.red
    <Color.red: 1>

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 11, 2016

    New changeset aad7443e62be by Ethan Furman in branch 'default':
    bpo-23591: add auto() for auto-generating Enum member values
    https://hg.python.org/cpython/rev/aad7443e62be

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Sep 11, 2016

    Which means it has methods such as __getitem__, __setitem__, etc., which means those methods can implement whatever is needed to give the namespace the desired semantics (within Python syntax).

    Ah, _that_'s what you had in mind. (All this time, I thought _auto_ was just a shorter name for _generate_next_value_.:) I'm ok with that. But aren't we then back to square one? (Using magic of the same kind as the "declarative style" that Raymond pronounced not Pythonic enough.)

    Even Raymond says

    As long as _auto_ has been defined somewhere (i.e. from enum import _auto_), it is normal Python

    I'm sure he didn't think of the loophole you're currently exploiting. :-D

    But anyway, I'm happy. Parentheses are there, so the intuition of calling a function to execute code is respected, and if you get out free with this hack, it opens a door a bit wider for complete declarative solution in the future.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 11, 2016

    > Which means it has methods such as __getitem__, __setitem__, etc.,
    > which means those methods can implement whatever is needed to give
    > the namespace the desired semantics (within Python syntax).

    Ah, _that_'s what you had in mind. (All this time, I thought _auto_
    was just a shorter name for _generate_next_value_.:) I'm ok with that.
    But aren't we then back to square one? (Using magic of the same kind
    as the "declarative style" that Raymond pronounced not Pythonic enough.)

    Not at all. The concern with that proposal (bpo-26988: AutoEnum via completely omitting values of any kind) was the transformation of a NameError into a value. The current method is taking a unique object and transforming that into a value, which is entirely analagous to what Enum already does.

    Besides being convenient, it's also necessary to present a cohesive, non-leaky abstraction to the user where Flag is concerned: using ints as the flag values is entirely an implementation detail, but without auto the user would be forced to manage that implementation detail when creating the Flag, which negates much of Flag's value.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 11, 2016

    Many thanks to everyone for their input and help.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Sep 13, 2016

    Now that I actually had the chance to play with the implementation, I see most of my worst fears were justified. :-( Look:

        >>> import enum
        >>> class B(enum.Flag):
                b = 3
                c = 4
                d = 6
        >>> B.b | B.c
        Traceback (most recent call last): ...
        ValueError: 7 is not a valid B
        >>> t = B.b | B.c
        >>> t
        <B.d|1: 7>
        >>> B.b | B.c
        >>> B.b | B.c
        <B.d|1: 7>
        >>> ~B.d
        <B.0: 0>

    Do you really find this behavior acceptable?? I see at least three bugs here.

    At least you did say in the documentation

    Individual flags should have values that are powers of two (1, 2, 4, 8, ...)

    but it seems to me it's not prominent enough.

    ---

    Also, auto, despite parentheses, works exactly as it shouldn't.

        >>> class C(enum.Enum):
    	a = b = enum.auto()	
        >>> C.a
        <C.a: 1>
        >>> C.b
        <C.b: 2>
    
        >>> def f():
                return enum.auto()
        >>> class E(enum.Enum):
                a = b = f()
                c = d = 2
        >>> E.a is E.b
        False
        >>> E.c is E.d
        True
        >>> E.b is E.c
        True

    In my opinion, this is simply horrible. Even _you_ said (http://bugs.python.org/issue23591#msg275093) this would be unacceptable. I'm really, really sad.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 13, 2016

    Vedran:

    1. Thank you for the bug report, it's appreciated.

    2. I would appreciate it even more if you just kept it to a bug report
      without the, for lack of a better word, drama.

    -----------------------------------------------

    >    >>> import enum
    >    >>> class B(enum.Flag):
    >            b = 3
    >            c = 4
    >            d = 6
    >    >>> B.b | B.c
    >    Traceback (most recent call last): ...
    >    ValueError: 7 is not a valid B

    This part is correct. If the programmer wants to use actual values instead of auto() then it is on them to do it correctly. I'd be happy to add a decorator to help ensure all bits have names, but I'm not sure what to call it.

    -----------------------------------------------

    >>> t = B.b | B.c
    >>> t
    <B.d|1: 7>

    >>> ~B.d
    <B.0: 0>

    These parts are not, and will be fixed.

    -----------------------------------------------

    >>> class C(enum.Enum):
    a = b = enum.auto()
    >>> C.a
    <C.a: 1>
    >>> C.b
    <C.b: 2>

    Bug, will be fixed.

    >>> def f():
    return enum.auto()
    >>> class E(enum.Enum):
    a = b = f()
    c = d = 2
    >>> E.a is E.b
    False

    Same as above bug, will be fixed

    >>> E.c is E.d
    True
    >>> E.b is E.c
    True

    Same as above bug, will be fixed. (Had "a" and "b" been given different "auto()"s that last True would be correct as "b" would then have had a value of 2 from auto(), and "c" and "d" also were given values of 2, so all three would have been the same.)

    @ethanfurman ethanfurman reopened this Sep 13, 2016
    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Sep 13, 2016

    About drama: I don't know how to approach the problem. I tell you about various concerns of mine, you seem to agree, and still you do exactly what I was warning you about. It could be that you just consider me a boring nag, but then saying so would be more useful than this situation.

    ---

    Yes, you solved some problems I did have, and it's great, but still many of them remain. This thing is inherently complicated. You are trying to construct a discrete Boolean algebra, but without any guarantee of atoms. Imagine if the language was different:

        class MySets(DiscreteSets):
            a = {0, 1}
            b = {2}
            c = {1, 2}

    You'd naturally view {0}, {1} and {2} as atoms, of different "sort" than {0,1} and such. That's what's missing here: a clear distinction between atoms and "molecules". It could be that the intended usage is for atoms to _always_ be constructed with auto(), and for molecules to be constructed from previous atoms, but then the documentation ought to say so.

    So far, the only thing you say is "then it is on them to do it correctly", but you never really tell what "correctly" means. I understand your desire not to hard forbid anything, but the docs should tell how your class is _intended_ to be used, _AND_ the class should make every reasonable effort to preserve what it can of its behavior even in the face of not-quite-intended usage. Look at Raymond's collections.Counter's treatment of non-integer values for example.

    BTW I'm not aware of any stdlib object whose repr raises ValueError. It surely makes debugging a lot harder.

    @ethanfurman
    Copy link
    Member

    ethanfurman commented Sep 13, 2016

    I'll respond to the bulk of your comment later. For now let me assure you your feedback has been invaluable.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 18, 2016

    New changeset b56290a80ff7 by Ethan Furman in branch '3.6':
    bpo-23591: fix flag decomposition and repr
    https://hg.python.org/cpython/rev/b56290a80ff7

    New changeset 7372c042e9a1 by Ethan Furman in branch 'default':
    bpo-23591: fix flag decomposition and repr
    https://hg.python.org/cpython/rev/7372c042e9a1

    @MojoVampire
    Copy link
    Mannequin

    MojoVampire mannequin commented Sep 26, 2016

    The "What’s New In Python 3.6" page should really get an update to mention Flag, IntFlag and auto as enhancements to the enum module. Right now, the ssl module update docs mentions (but does not properly link for some reason) IntFlag, but otherwise, there is no mention of the new feature in either the "What's New" docs or even in the theoretically comprehensive changelog; if you don't think to look at the enum module itself, you'd never know there were new features to use.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 21, 2016

    New changeset f404a59d834e by Ethan Furman in branch '3.6':
    closes bpo-23591: add NEWS entry
    https://hg.python.org/cpython/rev/f404a59d834e

    @python-dev python-dev mannequin closed this as completed Nov 21, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants