Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify meaning of Final in dataclasses #89547

Open
GBeauregard mannequin opened this issue Oct 5, 2021 · 25 comments
Open

Clarify meaning of Final in dataclasses #89547

GBeauregard mannequin opened this issue Oct 5, 2021 · 25 comments
Labels
stdlib Python modules in the Lib dir topic-typing type-bug An unexpected behavior, bug, or error

Comments

@GBeauregard
Copy link
Mannequin

GBeauregard mannequin commented Oct 5, 2021

BPO 45384
Nosy @ericvsmith, @carljm, @GBeauregard, @saaketp

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-10-05.23:00:02.707>
labels = ['type-bug', 'library']
title = 'Accept Final as indicating ClassVar for dataclass'
updated_at = <Date 2022-01-21.01:21:16.689>
user = 'https://github.com/GBeauregard'

bugs.python.org fields:

activity = <Date 2022-01-21.01:21:16.689>
actor = 'med2277'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2021-10-05.23:00:02.707>
creator = 'GBeauregard'
dependencies = []
files = []
hgrepos = []
issue_num = 45384
keywords = []
message_count = 12.0
messages = ['403277', '403552', '403553', '403555', '403560', '403565', '403615', '403616', '403617', '403890', '403931', '411067']
nosy_count = 6.0
nosy_names = ['eric.smith', 'carljm', 'GBeauregard', 'saaketp', 'msully4321', 'med2277']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue45384'
versions = []

Linked PRs

@GBeauregard
Copy link
Mannequin Author

GBeauregard mannequin commented Oct 5, 2021

PEP-591 for the Final Attribute states "Type checkers should infer a final attribute that is initialized in a class body as being a class variable. Variables should not be annotated with both ClassVar and Final."

This is a bit of a typing conflict for dataclasses, where ClassVar is used to indicate a desired library behavior, but one may want to indicate Final.

I propose accepting the Final attribute as an indicator of a ClassVar in dataclasses class bodies in order to be better compatible with the Final PEP.

There is at least one edge case that would need to be handled where someone might want to explicitly mark a dataclass field Final, which could be allowed as a field:
a: Final[int] = dataclasses.field(init=False, default=10)

@GBeauregard GBeauregard mannequin added type-feature A feature request or enhancement stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error and removed type-feature A feature request or enhancement labels Oct 5, 2021
@GBeauregard
Copy link
Mannequin Author

GBeauregard mannequin commented Oct 9, 2021

Hi Eric,

I've been shopping this idea around on the mailing list and haven't received any objections. Do you have any concerns? Can we handle Final with the same checks as ClassVar?

Regarding potentially merging a change, I'm not sure where this falls between a bug fix and a feature. On one hand the PEP-591 instruction to typecheckers on how to treat Final is relatively absolute and the dataclasses behavior could be considered a buggy interaction with it. On the other hand this is sorta a new behavior. Do you have any opinions? Should I not worry about it when working on patch and let core devs figure out if it would need backported?

I've read through the dataclasses code and I think I can implement this myself and submit a PR, but I may need a bit of a heavy handed code review since I've only recently started getting serious with Python.

Thanks for your time and work on dataclassess.

@ericvsmith
Copy link
Member

I was waiting for someone smarter than me to chime in on one of the discussions.

I wouldn't worry about whether it's a bug or feature, at this point. Assuming buy-in from type checkers, I'd probably call it a bug, but I can be reasoned with.

One thing I don't understand is what you mean by:

"""
There is at least one edge case that would need to be handled where someone might want to explicitly mark a dataclass field Final, which could be allowed as a field:
a: Final[int] = dataclasses.field(init=False, default=10)
"""

I assume we'd want to treat it like a ClassVar, whatever that does. What's your thought? Are you saying ClassVar works incorrectly in this instance.

@GBeauregard
Copy link
Mannequin Author

GBeauregard mannequin commented Oct 9, 2021

When I originally submitted the issue I hadn't finished going through all of the dataclasses code and it hadn't even occurred to me that it could be valid to use ClassVar with field(). I (wrongly) assumed this would always raise and that field() is only valid for things intended to be instance vars.

I do find this behavior a little surprising, but on reflection I don't think it's explicitly wrong as long we raise for default_factory like it currently does. I think it's then appropriate to just do the exact same behavior for Final as ClassVar.

I'm going to start working on a PR, thanks for your feedback.

@saaketp
Copy link
Mannequin

saaketp mannequin commented Oct 10, 2021

Treating Final as ClassVar by default may be fine,
but it should not throw when using default_factory like ClassVar does.

There are valid uses of Final with instance variable when one would want the value to be unchanged after the __init__ runs
but different instances can be initialized with different values that are generated by a default_factory.

A quick search on github for this pattern gives this
https://github.com/166MMX/hiro-python-library/blob/fb29e3247a8fe1b0f7dc4e68141cf7340a8dd0a5/src/arago/hiro/model/ws.py#L120
which will break if Final throws when using default_factory.

PEP-591 says:
Type checkers should infer a final attribute _that is initialized in a class body_ as being a class variable.
When using default_factory the attribute is not initialized inside the class body but when the instance is initialized.
So allowing instance level Final with default_factory will not be going against the PEP.

@GBeauregard
Copy link
Mannequin Author

GBeauregard mannequin commented Oct 10, 2021

Yeah, I was just discussing this with someone in IRC, and I appreciate an example of it in the wild.

I agree there's some wiggle room with what "initialized in a class body" means when it comes to dataclasses.

I see several interpretations there, and I would strongly prefer feedback from typing folks, particularly since they would be responsible for implementing any Final default_factory exceptions.

On the implementation side this does complicate things a bit depending on specifics. Are Final default_factory fields real fields or pseudo-fields? (i.e. are they returned by dataclasses.fields()?) Depending on how this works dataclasses might need a bit more refactoring than I'd be the right person for, but I'm still willing to give it a shot.

@carljm
Copy link
Member

carljm commented Oct 11, 2021

Are Final default_factory fields real fields or pseudo-fields? (i.e. are they returned by dataclasses.fields()?)

They are real fields, returned by dataclasses.fields().

In my opinion, the behavior change proposed in this bug is a bad idea all around, and should not be made, and the inconsistency with PEP-591 should rather be resolved by explicitly specifying the interaction with dataclasses in a modification to the PEP.

Currently the meaning of:

@dataclass
class C:
    x: Final[int] = 3

is well-defined, intuitive, and implemented consistently both in the runtime and in type checkers. It specifies a dataclass field of type int, with a default value of 3 for new instances, which can be overridden with an init arg, but cannot be modified (per type checker; runtime doesn't enforce Final) after the instance is initialized.

Changing the meaning of the above code to be "a dataclass with no fields, but one final class attribute of value 3" is a backwards-incompatible change to a less useful and less intuitive behavior.

I argue the current behavior is intuitive because in general the type annotation on a dataclass attribute applies to the eventual instance attribute, not to the immediate RHS -- this is made very clear by the fact that typecheckers happily accept x: int = dataclasses.field(...) which in a non-dataclass context would be a type error. Therefore the Final should similarly be taken to apply to the eventual instance attribute, not to the immediate assignment. And therefore it should not (in the case of dataclasses) imply ClassVar.

I realize that this means that if we want to allow final class attributes on dataclasses, it would require wrapping an explicit ClassVar around Final, which violates the current text of PEP-591. I would suggest this is simply because that PEP did not consider the specific case of dataclasses, and the PEP should be amended to carve out dataclasses specifically.

@GBeauregard
Copy link
Mannequin Author

GBeauregard mannequin commented Oct 11, 2021

Thanks for the feedback Carl.

Your proposed nesting PEP change is not possible: ClassVar[Final[int]] isn't valid because Final[int] isn't a type. As far as I know this would need type intersections to be possible.

I'm going to try sending a one-off email to the PEP authors since probably whatever happens the PEP needs a clarification anyway.

@carljm
Copy link
Member

carljm commented Oct 11, 2021

Good idea to check with the PEP authors.

I don’t think allowing both ClassVar and Final in dataclasses requires general intersection types. Neither ClassVar nor Final are real types; they aren’t part of the type of the value. They are more like special annotations on a name, which are wrapped around a type as syntactic convenience. You’re right that it would require more than just amendment to the PEP text, though; it might require changes to type checkers, and it would also require changes to the runtime behavior of the typing module to special-case allowing ClassVar[Final[…]]. And the downside of this change is that it couldn’t be context sensitive to only be allowed in dataclasses. But I think this isn’t a big problem; type checkers could still error on that wrapping in non dataclass contexts if they want to.

But even if that change can’t be made, I think backwards compatibility still precludes changing the interpretation of x: Final[int] = 3 on a dataclass, and it is more valuable to be able to specify Final instance attributes (fields) than final class attributes on dataclasses.

@msully4321
Copy link
Mannequin

msully4321 mannequin commented Oct 14, 2021

I tend to agree with Carl that making Final imply ClassVar for dataclass would make things worse.

For better or worse (mostly better, but there are downsides like this!), dataclass class bodies are essentially written in their own domain specific language, and allowances need to be made in how typing things are interpreted in such a case, and I think that Carl is right that the right interpretation in the case of dataclasses is that the annotation describes the eventual instance attribute.

At least, I feel this way 80% of the time. I can definitely see the argument for wanting consistency in the interpretation.

From a more directly practical perspective, though, the change would also be a breaking one (though /arguably/ only of broken code), and so maybe not worth it even if we would prefer the changed behavior.

I think the right approach is probably to just append the PEP and then maybe also support ClassVar[Final[Whatever]]. It shouldn't need intersection types or anything; if it's a pain to implement it would be for annoying reasons and not deep ones.

@GBeauregard
Copy link
Mannequin Author

GBeauregard mannequin commented Oct 14, 2021

Hi Michael,

Thanks for taking the time to look into this.

I don't feel strongly enough about following the existing PEP wording to desire creating a situation where instance vars can't be marked Final if we can instead make a workaround with ClassVar[Final[...]]. Plus, I can appreciate the argument that dataclasses are their own thing that should be treated specially in the PEP.

If you know what's involved in formally proposing and enacting a PEP amendment I can get behind that.

@med2277
Copy link
Mannequin

med2277 mannequin commented Jan 21, 2022

I recently hit this issue working on a config/parsing runtime type checking library (similar in spirit to pydantic).

The one other special typeform I was using these with that led me to discover this issue was Annotated. I use Annotated a fair amount to do some runtime analysis and I was used to Annotated[typeform] always works. But ClassVar and Final are special and Annotated[ClassVar[...]] and Annotated[Final[...]] both fail. I find Annotated interaction also weird. I ended up working around it by doing ClassVar[Annotated[...]] and stripping the classvar/final to look for the annotation metadata.

I think all 3 of annotated/final/classvar should be order compatible as they all serve to add information on the type they contain.

If we ignore Annotated, I would say ClassVar/Final should be order compatible and a rule that Final[ClassVar[...]] works but not ClassVar[Final[...]] or vice versa would be weird.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@benoit-dubreuil
Copy link

Any update on this?

@ekchew
Copy link

ekchew commented Sep 22, 2022

I have been reading this thread with much interest after realizing

@dataclass
class Foo:
    k_default_bar: Final[int] = 0
    bar: int = k_default_bar

does not do what one would think. (Writing Foo(1) at this point sets k_default_bar to 1 and bar to 0 from what I can see, though I'm not sure if the latter is UB?)

After reading through the replies, I agree that there is merit to leaving the current Final behavior as-is, as people may be relying on it to specify immutable instance attributes by this point. However, the class attribute form needs to be supported in some way, as it is extremely useful as in the above example.

There seems to be a lot of argument over whether Final[ClassVar[...]] (or ClassVar[Final[...]]) should or should not be allowed in the context of dataclass? Is there any reason these could not be merged into a single ClassFinal[...] or something along those lines? That would be my preferred solution as an end user as it would be clean and clearly specify intent, but I'm not sure what it would imply from a language architect's perspective?

@carljm
Copy link
Member

carljm commented Jan 4, 2023

My reading of this thread is that there is basically agreement on what should happen (though there are some details that may need further discussion), it's just that the specific next steps are not clear, because we are in this gray area where the issue seems too small for a new PEP, but amending existing finalized PEPs is not common. Concretely I think what should happen is this:

  • PEP 591 should either be amended or superseded to clarify that Final[...] = ... in a dataclass specifies a final instance attribute and does not imply ClassVar. This is just a clarification that the status quo interpretation is considered correct and not a bug, and is the minimal thing that should happen here.

Secondly, in order to allow final classvars on dataclasses, we could also do these:

  • Amend or supersede PEP 591 to allow ClassVar[Final[...]] in a dataclass.
  • Change typing.py so that it allows ClassVar[Final[...]] at runtime.
  • Update typecheckers to also allow this, at least in the context of a dataclass.

And there are a few open questions:

  • Should we also allow Final[ClassVar[[...]] or should the ordering be constrained?
  • Should Annotated also be allowed to wrap either Final or ClassVar, and/or be wrapped by them?

The question of Annotated wrapping ClassVar/Final is really a separate one, but similar so it may make sense to handle them together.

I think if we were only doing the first thing listed above, it could be done as an amendment clarification to PEP 591. But if we want to allow ClassVar[Final[...]], given the open questions and the need to get consensus from all type checker implementers, it would probably be best for someone who wants to move this forward to write a short new PEP on the topic of how all these pseudo-type special forms can be allowed interact with each other.

Is there any reason these could not be merged into a single ClassFinal[...]

I don't see much reason to do this. ClassFinal[...] does not seem any more clear than ClassVar[Final[...]], and it sets a precedent for proliferating a combinatorial explosion of special forms rather than composing orthogonal ones.

@Prometheus3375
Copy link
Contributor

Prometheus3375 commented Sep 8, 2023

  • Should we also allow Final[ClassVar[[...]] or should the ordering be constrained?

I think the correct order is Final[ClassVar[...]] because it spells final class variable .... The other order spells class variable final .. which I find awkward. Order Final[ClassVar[...]] is also compatible with order in some C-like languages final static ....

Given that, the order should be restricted, so only Final implementation will be updated to accept ClassVar.

But in my opinion it will be better to introduce dataclass-specific annotation FinalClassVar. There are already several dataclass-specific annotations like KW_ONLY and InitVar.

Pros:

  • No updates to Final and ClassVar implementation is required.
    • As a result, no type checker is forced to mutate their behavior.
    • Any third-party library that uses type annotations will not be forced to support and correctly resolve nesting of Final and ClassVar.
  • No discussion about the order of nesting.
  • No changes to Annotated.
  • No amendments to PEP 591.

Cons:

  • Amendment to dataclass PEP or docs to include note about differences between Final, ClassVar and FinalClassVar.
  • Type checker should support new annotation correctly.

Why I think this is the best solution:

  • The issue is only specific to dataclasses, because in any other situation Final attribute in class body automatically means ClassVar.
  • Altering well-defined behavior just to support dataclass-specific case can lead to some unpredictable consequences.
  • Solving dataclass-specific issue with dataclass-specific change I find safe.
  • Type checkers usually have separate rules for dataclasses; imo, it is better to alter only this set of rules, rather than updating global rules for Final and ClassVar.
  • Any dataclass-like third-party library (ex. attrs) can introduce their own annotation for this case or add support for FinalClassVar from dataclasses.

@Prometheus3375
Copy link
Contributor

Prometheus3375 commented Sep 8, 2023

For those who want to define final class variable now, there is a workaround:

from dataclasses import dataclass
from typing import Final


class Base:
    default_f1: Final = 1  # or Final[int]


@dataclass
class MyClass(Base):
    f1: int = Base.default_f1

Also, remember that not annotated attributes are not processed by dataclass decorator at all. Useful, when you do not want to add ClassVar annotation.

from dataclasses import dataclass
from typing import ClassVar


@dataclass
class MyClass:
    default_f1 = 1  # not added to __dataclass_fields__
    f1: int = default_f1
    class_field: ClassVar[int] = 1  # added to __dataclass_fields__, but is not returned by fields

Such attributes are not final and can be overwritten in subclasses.

@JelleZijlstra
Copy link
Member

I agree with @carljm's post above (#89547 (comment)), and the good news is that now we have a clear path to make it happen: the Typing Council can change the spec. The procedure is laid out at https://github.com/python/typing-council (basically, first open a discussion on discuss.python.org, then propose a concrete change to the spec and ask the Council for a decision).

Separately, the runtime should be changed to allow both of these:

>>> class a:
...     x: ClassVar[Final[int]]
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in a
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 377, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 489, in __getitem__
    return self._getitem(self, parameters)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 641, in ClassVar
    item = _type_check(parameters, f'{self} accepts only single type.')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 196, in _type_check
    raise TypeError(f"{arg} is not valid as type argument")
TypeError: typing.Final[int] is not valid as type argument
>>> class a:
...     x: Final[ClassVar[int]]
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in a
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 377, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 489, in __getitem__
    return self._getitem(self, parameters)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 663, in Final
    item = _type_check(parameters, f'{self} accepts only single type.')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 196, in _type_check
    raise TypeError(f"{arg} is not valid as type argument")
TypeError: typing.ClassVar[int] is not valid as type argument

I would accept a patch to CPython to allow both of these variants, regardless of whether the typing spec is changed, under the general principle that the runtime should be more permissive to allow experiments with extending the type system.

@hmc-cs-mdrissi
Copy link
Contributor

I'd be interested in working on this. This feels like an issue I understand and could use as my first time working on typing.py directly.

@JelleZijlstra
Copy link
Member

Sounds great, feel free to open a PR and I'll take a look. (I haven't checked recently but I think there's a place in typing._type_check or similar where Final/ClassVar are hardcoded; that's probably what needs to be changed.)

@carljm carljm changed the title Accept Final as indicating ClassVar for dataclass Clarify meaning of Final in dataclasses Feb 28, 2024
@carljm
Copy link
Member

carljm commented Feb 29, 2024

Thanks for taking on the runtime change, @hmc-cs-mdrissi ! I can do the typing spec PR: https://discuss.python.org/t/treatment-of-final-attributes-in-dataclass-likes/47154/4

@hmc-cs-mdrissi
Copy link
Contributor

hmc-cs-mdrissi commented Feb 29, 2024

I've gotten,

>>> class a:
...     x: ClassVar[Final[int]]
>>> class a:
...     x: Final[ClassVar[int]]

both to work. But a different test case fails that expected ClassVar[ClassVar[int]] to fail. My preference is while that's weird and not meaningful allowing some special forms in ClassVar like Final but not others (itself) will likely add unneeded runtime validation complexity. So I'd lean to allow ClassVar[ClassVar[int]] at runtime while type checkers are free to forbid it.

Should Annotated also be allowed to wrap either Final or ClassVar, and/or be wrapped by them?
The question of Annotated wrapping ClassVar/Final is really a separate one, but similar so it may make sense to handle them together.

I'm also leaning to include test cases for Annotated interaction and ensure that

class ACF:
  x: Annotated[ClassVar[Final[int]], "a decoration"]

and similar variations are well tested.

@JelleZijlstra
Copy link
Member

I think it's OK to allow ClassVar[ClassVar[int]]. It's weird but not especially harmful.

Type checkers in general should allow Annotated at any level around a qualifier, so I agree that your example should be allowed.

@JWCS
Copy link

JWCS commented Mar 27, 2024

Coming from the runtime type-checking side, now that #116096 is merged, I noticed that dataclass will handle ClassVar[Final[T]] correctly, since it already checks for ClassVar, but unwrapping special forms for inspecting if is InitVar or ClassVar is still TBD. (ie Annotated[ClassVar/InitVar, ...], Final[ClassVar[T]])
Just checking, this (hypothetical) dataclasses PR is still waiting on @carljm 's typing spec & subsequent approval ?
Or is the typing consensus we need outside of lifting the runtime limitations on the dataclasses implementation?

@carljm
Copy link
Member

carljm commented Mar 29, 2024

I've submitted a pull request to the typing spec: python/typing#1669

The typing spec applies to typecheckers, not to the runtime, so I don't think a dataclasses behavior change to unwrap type qualifiers looking for ClassVar necessarily needs to wait for approval of the typing spec change. That said, if the typing council chooses to specify that Final always implies ClassVar, even in dataclasses (the opposite of what my PR proposes), that would suggest a very different change to the runtime behavior (where dataclasses would start to detect Final and treat it as ClassVar), so it may be best to wait for a typing council decision.

diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir topic-typing type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

9 participants