Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: clarify interaction of Final and dataclass #1669

Merged
merged 2 commits into from
Apr 11, 2024

Conversation

carljm
Copy link
Member

@carljm carljm commented Mar 29, 2024

See discussion thread at https://discuss.python.org/t/treatment-of-final-attributes-in-dataclass-likes/47154 and issue at python/cpython#89547

Consider a dataclass with a Final-annotated initialized assignment in the class body:

@dataclass
class C:
    x: Final[int] = 3

This is currently under-specified in the typing spec. Neither PEP 591 or PEP 681 clearly specified this composition.

There are two possible consistent interpretations:

  1. It could be a class-only variable (implicit ClassVar) C.x with final value 3. In this interpretation, it should not be a dataclass field, because dataclass fields are inherently set on instances, they cannot be ClassVar. This is the interpretation suggested by PEP 591 (and thus also the current typing spec for Final), which say that Final with an assigned value in a class body is always implicitly a ClassVar.

  2. It could be a dataclass field x with default value 3, which cannot be reassigned on an instance after it is initialized in the generated __init__ method.

This pull request specifies option 2.

I will also provide an update to the conformance suite for this, if the Typing Council accepts the spec change.

Current runtime behavior

The stdlib dataclasses module considers x in this example to be a dataclass field.

If the assigned value is a default value (as in the example above), dataclasses leaves that default value in place as a class attribute (so x is a class-and-instance variable, and C.x == 3.) If the assigned value is a field(...) call, dataclasses does not leave any attribute x on the class at all, so x is purely a dataclass field / instance variable.

Current type-checker behavior

Playground links:

mypy
pyre
pyright

All three agree with the runtime behavior of dataclasses, considering x to be a dataclass field which is included in the dataclass __init__ method and set on instances in __init__. All three prohibit further assignments to x on instances, noting that it is a final attribute. Pyright also mentions that it is a ClassVar, which is confusing given that "ClassVar" and "dataclass field" are (or should be) mutually exclusive.

All three assume that C.x is available and of type int, whether we have x: Final[int] = 3 or x: Final[int] = field(default_value=3). That is, none of the type checkers model the actual runtime behavior that field() objects are removed from the class and not replaced with anything. (Arguably this behavior is just a bug/inconsistency in the dataclasses runtime implementation.)

Considerations

  • Option 2 (proposed in this PR) is clearly less disruptive to existing code, as it effectively just specifies the de facto current behavior of the runtime and all type checkers. The only practical change to type checker behavior in the given examples that would occur as a result of this spec change is that Pyright should not claim in its error message that x is a ClassVar, but otherwise behavior would remain as it is; errors would still be raised on exactly the same lines by all three type checkers.
  • Option 2 has the downside that one cannot assume as an invariant that the syntax x: Final[int] = 3 always means that x will always have value exactly 3. In dataclass bodies this will be more nuanced: the class attribute will always be 3, but it will be shadowed by an instance attribute that may have a different value on any given instance. On the other hand, it is already the case that the semantics of annotated assignments in dataclass bodies differ from other contexts: x: int = field(...) when field(...) does not return int would obviously be a type error anywhere else. This example illustrates that in dataclasses, the annotation generally does not apply to the immediate assignment, but to the type of the dataclass field (instance var) specified by the annotated assignment.
  • If we select Option 1, it will be quite cumbersome to achieve a Final dataclass field (one would have to write out the __init__ method manually instead of allowing dataclass to generate it, eliminating a key benefit of dataclasses.) If we select Option 2, it remains quite easy to have a final classvar on a dataclass, using a ClassVar[Final[...]] annotation.

@carljm carljm changed the title clarify interaction of Final and dataclass spec: clarify interaction of Final and dataclass Mar 29, 2024
@Tinche
Copy link

Tinche commented Mar 29, 2024

This will likely affect attrs too so I'm following the discussion. I'm personally rooting for option 2.

A small nit:

If the assigned value is a default value (as in the example above), dataclasses leaves that default value in place as a class attribute

That only seems to be true for non-slots dataclasses. If the dataclass is created using slots=True (and this is the default in attrs, for example), C.x will be a slot descriptor instead.

@srittau srittau added the Typing Council decision Needs to be approved by the Typing Council. Do not merge until approved. label Mar 29, 2024
erictraut added a commit to microsoft/pyright that referenced this pull request Mar 31, 2024
…rked `ClassVar`) within dataclass class bodies. This is consistent with the runtime and [this proposed change to the typing spec](python/typing#1669).
@erictraut
Copy link
Collaborator

Thanks for the PR. Personally, I agree that option 2 is the preferred option, and I like the wording in the proposed typing spec change.

I also agree that pyright's current error message is misleading (doesn't match the runtime behavior), so I've updated pyright accordingly so it doesn't mention ClassVar in this situation.

If you don't get any further feedback on the PR within the next few days, the next step is to create an issue in the typing council repo asking for a formal decision about the spec update.

If this is approved, we'll want to augment the compliance tests to cover the new cases. This could be done as part of the same PR or a separate PR.

@JelleZijlstra JelleZijlstra merged commit 7a70cd6 into python:main Apr 11, 2024
4 checks passed
JelleZijlstra pushed a commit to JelleZijlstra/typing that referenced this pull request Apr 11, 2024
@carljm
Copy link
Member Author

carljm commented Apr 22, 2024

(btw, I am still planning to add conformance tests for this)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing Council decision Needs to be approved by the Typing Council. Do not merge until approved.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants