Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stubgen: Fix generated dataclass __init__ signature #16906

Merged
merged 1 commit into from
Feb 16, 2024

Conversation

hamdanal
Copy link
Collaborator

Fixes #16811

stubgen was swallowing default values for __init__ methods generated by the dataclass plugin making their signature incorrect. This is because the plugin does not include the argument's initializer in the generated signature. I changed it to include a dummy ellipsis so that stubgen can generate correct code.
I also fixed arguments added by the dataclass plugin with the invalid names * and ** to have the valid and unique names *generated_args and **generated_kwargs (with extra underscores to make them unique if necessary). This removes the need for the hack to special case them in stubgen and is less confusing for someone looking at them in a stub file.

Copy link
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@AlexWaygood
Copy link
Member

I believe it's valid to use @dataclass in a stub file, and that if you do have @dataclass in a stub file, type checkers are meant to synthesize the generated __init__ method (and other generated methods) in the same way they do when they see dataclasses in .py files. Maybe the best behaviour here would be for stubgen not to add an __init__ method at all for dataclasses, if the original dataclass doesn't have an __init__ method? It would lead to simpler stubs

@hamdanal
Copy link
Collaborator Author

I believe it's valid to use @dataclass in a stub file, and that if you do have @dataclass in a stub file, type checkers are meant to synthesize the generated __init__ method (and other generated methods) in the same way they do when they see dataclasses in .py files. Maybe the best behaviour here would be for stubgen not to add an __init__ method at all for dataclasses, if the original dataclass doesn't have an __init__ method? It would lead to simpler stubs

You are right, that's why I changed stubgen to keep the dataclass decorator recently. The problem is that we cannot keep field assignments in the stub so we lose information that is usually inferred from calls like a = field(init=False) (a is not part of the __init__ signature) or b = field(default_factory=list) (b is optional because it has a default value) which leads to the type checker generating wrong signature for __init__. It is safe to delete the generated method in simple cases (where there is no loss of information due to absence of field() calls) later though.

@AlexWaygood
Copy link
Member

The problem is that we cannot keep field assignments in the stub so we lose information that is usually inferred from calls like a = field(init=False) (a is not part of the __init__ signature) or b = field(default_factory=list) (b is optional because it has a default value) which leads to the type checker generating wrong signature for __init__. It is safe to delete the generated method in simple cases (where there is no loss of information due to absence of field() calls) later though.

I see -- that makes sense, thanks!

Copy link
Member

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one aspect of the issue that isn't fixed by this PR: the __init__ parameters still don't have types. That can wait for another PR, though.

@hamdanal
Copy link
Collaborator Author

There is one aspect of the issue that isn't fixed by this PR: the __init__ parameters still don't have types. That can wait for another PR, though.

Including these types is a step into uncharted territory for stubgen as these types are inferred and stubgen doesn’t know how to handle them yet. I tried a little bit to include them before submitting the PR but they require some non trivial work (e.g. adding new imports for the inferred annotations and handling name collisions) so I left them out for now.
Note that there is a feature request already for stubgen including inferred types here #3475.

@JelleZijlstra JelleZijlstra merged commit bfbac5e into python:master Feb 16, 2024
18 checks passed
@hamdanal hamdanal deleted the stubgen-dataclass branch February 16, 2024 22:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incomplete and incorrect stub generation of a dataclass.
3 participants