-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Turning off generation of __match_args__ for dataclasses #87930
Comments
The dataclass decorator can take multiple parameters to enable or disable the generation of certain methods. PEP-634 Structural Pattern Matching extends dataclasses to also generate a __match_args__ attribute. I think adding a parameter to enable and disable generation of __match_args__ should be to dataclass should also be considered for consistency. Note that setting compare=False on a dataclass.field already excludes that field from __match_args__, but there is no way to disable generation of __match_args__ completely. |
What's the situation where having __match_args__ is actually harmful in some way? I understand that if the generated version is wrong, you'd want to specify it yourself. But what's the use case for not having __match_args__ at all? |
I agree with Eric. You can already disable the automatic creation of __match_args__ by setting it yourself on the class being decorated, and "__match_args__ = ()" will make the class behave the exact same as if __match_args__ is not defined at all. >>> from dataclasses import dataclass
>>> @dataclass
... class X:
... __match_args__ = ()
... a: int
... b: int
... c: int
...
>>> X.__match_args__
() I too have trouble imagining a case where the actual *presence* of the attribute matters. But assuming those cases exist, wouldn't a simple "del X.__match_args__" suffice? |
It appears you did find a genuine bug though! I was surprised by this comment, and after digging a bit deeper into _process_class found that we should be generating these from "field_list", not "flds": >>> @dataclass(repr=False, eq=False, init=False)
... class X:
... a: int
... b: int
... c: int
...
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/home/bucher/src/cpython/Lib/dataclasses.py", line 1042, in wrap
return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
File "/home/bucher/src/cpython/Lib/dataclasses.py", line 1020, in _process_class
cls.__match_args__ = tuple(f.name for f in flds if f.init)
UnboundLocalError: local variable 'flds' referenced before assignment |
I assume the OP wants to have a class that doesn't allow positional patterns. The right way to spell that is indeed to add __match_args__ = () to the class, there's no need to add another flag to @DataClass. |
The same however is also true for all the other stuff generated by @DataClass. You can for example disable generation of the init method using def __init__(self): pass and dataclass still has a parameter to disable it. I agree that a new parameter isn't strictly required to achieve functionality, however I would still argue that it should be added for consistency with the rest of the dataclass api. |
init=False is used to make sure there's no __init__ defined, because there's a difference between a class with an __init__ and one without. If there was a difference between __match_args__ being not present and __match_args__=(), then I'd support a matchargs=False argument. |
Ah, I see now how this might possibly be useful. If you want to inherit a parent's __match_args__ in a dataclass, it currently must be as spelled something like: @dataclass
class Child(Parent):
__match_args__ = Parent.__match_args__
... It's even uglier when you're unsure if Parent defines __match_args__ at all, or if multiple-inheritance is involved: @dataclass
class Child(Parent, Mixin):
__match_args__ = ()
... del Child.__match_args__ I'm not sure how likely it is that code out in the wild may need to look like this. As I understand it, though, the fact that dataclasses allow for "normal" inheritance is one of their big selling-points. So it could be a valid reason to include this option. If it seems like the above code might become reasonably common, I agree that the proposed solution is much cleaner: @dataclass(match_args=False)
class Child(Parent):
... |
I just realized that in my first two examples, all that's really needed is a "del Child.__match_args__" following each class definition. The main point still stands though: this may be a more common issue than I previously thought. If it is, then a dedicated kwarg to the dataclass decorator probably makes more sense than requiring users to "del" the generated attribute. |
Hmm, good point on the inheritance case. I'd forgotten that this is where init=False and others would come in handy. Although I'm having a hard time figuring out why you'd want a derived class that adds fields but wants to use the parent's __match_args__ (or __init__, for that matter), but I guess it's possible. I don't like del Child.__match_args__. The same pattern could be used for init=False, after all, but there's a parameter for that. |
So, should we reopen this ad add the flag to suppress __match_args__ generation after all? |
I think so, although I'm still contemplating it. I can't decide if we should also add match_arg=False to field(). Or is it good enough to just tell the user to manually set __match_args__ if they want that level of control? I think this is different enough from repr, init, etc. that we don't need to allow the per-field control, although maybe doing so would make the code more robust in terms of re-ordering or adding fields at a later date. |
I don't think we need control at the field level here. Omitting one field |
Okay, I'll re-open this to just add @DataClass(match_args=False). |
Here's a question. If __init__ is not being generated, either because the user supplied one to this class, or if init=False is specified, should __match_args__ be generated? I think the answer should be no, since the code has no idea what the parameters to __init__ will be. But I'd like to hear from people more familiar with pattern matching. I'm working on a patch, and this is my last issue. |
Are there other cases where suppressing one thing affects others? |
Only the complex interactions among the unsafe_hash, eq, and frozen parameters. It feels like if __init__ is not being generated, then the @DataClass code would have no idea what it should set __match_args__ to. Not that this problem isn't strictly related to the match_args=False case, it just occurred to me that it's related while I was writing the documentation. Perhaps this should be a separate issue. |
There may be other reasons why |
I can go either way. It's easy enough for the user to add their own __match_args__, so I won't link them. Here's what I have for the documentation, which is why the issue came up:
|
LGTM |
__match_args__
generation logic for dataclasses #25284Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: