add support for dataclasses #5010

Merged
merged 23 commits into from Jun 5, 2018

4 participants
@Bogdanp
Contributor

Bogdanp commented May 6, 2018

This is a work in progress for adding support for dataclasses as a plugin (heavily inspired by and reusing some parts of the attrs plugin). It's pretty close to complete, but I wanted to get some feedback before continuing as I'm new to the codebase. I'll take this over the finish line next weekend if it's deemed desirable, but here are my open questions:

  • Is the way I'm hooking the plugin in correct? fullname in get_class_decorator_hook doesn't seem to have the value I expect (i.e. the name is qualified to the module dataclass is imported in, so its value is some_package.some_module.dataclass rather than dataclasses.dataclass).
  • Should I be attempting to re-use more code between the two plugins? I only re-used the things that seemed to be fully generic between the two.
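
A minimal sketch of the name check the first question is about, written in plain Python rather than against mypy's plugin API. The names `DATACLASS_MAKERS` and `is_dataclass_decorator` are hypothetical, and the suffix fallback is only one possible workaround for the module-qualified name described above, not necessarily what this PR does.

```python
# Hypothetical stand-in for the set of decorator names a plugin recognises.
DATACLASS_MAKERS = {"dataclasses.dataclass"}


def is_dataclass_decorator(fullname: str) -> bool:
    """Return True if fullname looks like the dataclass decorator.

    The exact match covers the expected case. The suffix check is an
    assumed fallback for names qualified by the importing module,
    e.g. "some_package.some_module.dataclass".
    """
    return fullname in DATACLASS_MAKERS or fullname.endswith(".dataclass")
```

The fallback is deliberately loose: it would also accept an unrelated decorator that happens to be named `dataclass`, which is why it would only make sense as a temporary measure.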

I tried it out on a real-world Python 3.6 project that makes heavy use of dataclasses and it seems to work great so far.

This would close #4792.

@ilevkivskyi ilevkivskyi self-assigned this May 7, 2018

auto_attribs_default=True
)
# TODO: It's unclear if fullname should be a fully qualified

@euresti

euresti May 7, 2018

Contributor

I believe this happens if the module is not in the typeshed.

@Bogdanp

Bogdanp May 12, 2018

Author Contributor

Thanks! Should I add the module to the typeshed first?

@ilevkivskyi

ilevkivskyi May 12, 2018

Collaborator

I am fine with this solution. We can check later if this is fixed when the stub is in typeshed.

@JelleZijlstra

JelleZijlstra May 14, 2018

Collaborator

There is already an open typeshed PR to add a dataclasses stub: python/typeshed#1944. I found some issues in it, but the original author hasn't responded for a while. @Bogdanp if you're interested, you could take over that PR and I can help you get it merged into typeshed.

@ilevkivskyi
Collaborator

ilevkivskyi left a comment

Thanks for working on this! The PR looks good, I have made a first round of review. Please let me know if you are still interested in working on this.

@@ -2,20 +2,19 @@

from abc import abstractmethod
from functools import partial
-from typing import Callable, List, Tuple, Optional, NamedTuple, TypeVar, Dict
+from typing import Callable, Dict, List, NamedTuple, Optional, Tuple, TypeVar

@ilevkivskyi

ilevkivskyi May 14, 2018

Collaborator

Please don't sort imports automatically. It only increases the size of the diff and makes it harder to review.

if fullname in mypy.plugins.attrs.attr_class_makers:
return mypy.plugins.attrs.attr_class_maker_callback
elif fullname in mypy.plugins.attrs.attr_dataclass_makers:
from mypy.plugins import attrs

@ilevkivskyi

ilevkivskyi May 14, 2018

Collaborator

Why do you need the local import? If there is an import cycle, you should try breaking it. At mypy we try hard to not introduce import cycles, because they complicate the (already complex) code logic.

@Bogdanp

Bogdanp May 20, 2018

Author Contributor

There had already been a circular dependency that was being resolved by this line. When I extracted mypy.plugins.common the same thing continued to work fine for the test suite, but the dynamically-generated test to ensure import mypy.plugins.common could be imported failed. I think the circular dependency can be fixed by extracting ClassDefContext into a separate module (like mypy.plugin_context). Let me know if you'd like me to do that!

@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

> I think the circular dependency can be fixed by extracting ClassDefContext into a separate module (like mypy.plugin_context). Let me know if you'd like me to do that!

I think moving out all the interfaces to a separate file mypy.api is a more permanent solution. But it is a big refactoring, so it is better to do this in a separate PR.

from mypy.types import CallableType, Overloaded


def _get_decorator_bool_argument(

@ilevkivskyi

ilevkivskyi May 14, 2018

Collaborator

Do I need to review these three functions, or did you just copy them from attrs?

@Bogdanp

Bogdanp May 20, 2018

Author Contributor

Yes, these are the same functions from the attrs module, unchanged.

@@ -0,0 +1,27 @@
# Builtins stub used to support @dataclass tests.

@ilevkivskyi

ilevkivskyi May 14, 2018

Collaborator

Do you really need a separate fixture? You can try tweaking an existing one.


def dataclass_class_maker_callback(ctx: ClassDefContext) -> None:
    """Hooks into the class typechecking process to add support for dataclasses.
    """

@ilevkivskyi

ilevkivskyi May 14, 2018

Collaborator

It is good that you added docstrings to all functions. We like well-documented code 👍


app = Application.parse('')

[builtins fixtures/list.pyi]

@ilevkivskyi

ilevkivskyi May 14, 2018

Collaborator

It looks like several tests are missing. Could you please add at least these:

  • Incremental tests, these are tests where you update something in the code and check that (de-)serialisation works correctly (you can look at the initial and subsequent attrs PR to get some ideas).
  • Test all error messages that you add.
  • Test all special features in dataclasses, such as ClassVar and InitVar, and special methods (you can look for some examples in the PEP).
  • Check situations where user defines some dunder methods manually
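
For reference, the runtime features named in the list above can be seen together in a small standard-library sketch. The `Account` class and its fields are made up for illustration and are not taken from the PR's tests.

```python
from dataclasses import InitVar, dataclass, fields
from typing import ClassVar


@dataclass
class Account:
    owner: str
    balance: int = 0
    # ClassVar annotations are not fields and do not appear in __init__.
    currency: ClassVar[str] = "EUR"
    # InitVar pseudo-fields are passed to __init__ and __post_init__ only.
    bonus: InitVar[int] = 0

    def __post_init__(self, bonus: int) -> None:
        self.balance += bonus

    # A hand-written dunder: dataclass keeps an explicitly defined __repr__.
    def __repr__(self) -> str:
        return f"<Account {self.owner}>"


account = Account("ann", balance=10, bonus=5)
field_names = [f.name for f in fields(account)]
```

Here `field_names` contains only `owner` and `balance`, which is the distinction a type checker has to reproduce when it builds the `__init__` signature.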

@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

It looks like you haven't added incremental tests yet. They live in check-incremental.test (you can look for examples in the attrs PR).

@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

Could you please add some tests for generic dataclasses, in particular for type inference? (Again, you can look for examples in the original attrs PR.)

frozen: bool = ...) -> Callable[[_C], _C]: ...


def field(*,

@ilevkivskyi

ilevkivskyi May 14, 2018

Collaborator

It is good that you added this to lib-stub. Be sure to make updates here, if python/typeshed#1944 gets updated.

@dataclass
class Person:
    name: str
    age: int = field(default=0, init=False)

@ilevkivskyi

ilevkivskyi May 14, 2018

Collaborator

I would add more combinations of different values for init and present/absent defaults.
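
As an illustration of the combinations being asked for, here is a runtime sketch using only the standard library. The extra fields `nickname` and `tags` are hypothetical additions to the `Person` example and are not part of the PR's test data.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str                                  # init=True, no default: required
    nickname: str = field(default="")          # init=True, default: optional
    age: int = field(default=0, init=False)    # init=False, default: set automatically
    tags: list = field(default_factory=list, init=False)  # init=False, factory default


john = Person("John")
john.age = 24  # still assignable after construction

try:
    Person("John", "Johnny", 24)  # age is not an __init__ parameter
except TypeError:
    rejected = True
else:
    rejected = False
```

Each combination changes the generated `__init__` signature differently, so each one is a separate case for the plugin to get right.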

@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

Have you added these tests?

@Bogdanp

Bogdanp May 27, 2018

Author Contributor

I have now


reveal_type(Person) # E: Revealed type is 'def (name: builtins.str) -> __main__.Person'
john = Person('John')
john.age = 24

@ilevkivskyi

ilevkivskyi May 14, 2018

Collaborator

I would move this below the next line to check the type is inferred from class, not from this assignment.

@dataclass
class Application:
    name: str = 'Unnamed'
    rating: int = 0

@ilevkivskyi

ilevkivskyi May 14, 2018

Collaborator

I would add a test that an error is given if a field without a default comes after one with a default.
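
At runtime the dataclasses module itself rejects this ordering when the class is created, which is the behaviour such a test would mirror. A small sketch using only the standard library (the variable name `ordering_error` is just for illustration):

```python
from dataclasses import dataclass

try:
    @dataclass
    class Application:
        name: str = "Unnamed"
        rating: int  # no default, but follows a field that has one
except TypeError as exc:
    # Raised by the decorator while it builds __init__.
    ordering_error = exc
else:
    ordering_error = None
```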

@ilevkivskyi

Collaborator

ilevkivskyi commented May 20, 2018

@Bogdanp are you still interested in working on this? If you don't have time or desire, then we can pick up the PR.

@Bogdanp

Contributor Author

Bogdanp commented May 20, 2018

@ilevkivskyi yup. I plan on working on it later today.

@Bogdanp

Contributor Author

Bogdanp commented May 20, 2018

@ilevkivskyi I believe I have addressed all the points you raised except for

  • the circular import (see this comment) and
  • adding support for InitVars.

I'm a bit out of my depth when it comes to adding support for InitVars. I could not find how to add a new generic alias type (which I think InitVar counts as). Help would be appreciated here or I would be happy to let someone more experienced implement that part.

@Bogdanp Bogdanp changed the title [wip] add basic support for dataclasses add support for dataclasses May 20, 2018

self._freeze(attributes)

info.metadata['dataclass'] = {
'attributes': {attr.name: attr.serialize() for attr in attributes},

@JelleZijlstra

JelleZijlstra May 20, 2018

Collaborator

This probably should be an OrderedDict in order to preserve order in pre-3.6 Python versions. I think that's what's causing the test failure on 3.4 and 3.5 in Travis.
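
A minimal sketch of the suggested change, with made-up attribute records standing in for the serialized dataclass attributes:

```python
from collections import OrderedDict

# Hypothetical attribute records in declaration order.
attributes = [("name", {"has_default": False}), ("rating", {"has_default": True})]

# OrderedDict keeps insertion order on every Python 3 version. Plain dicts
# only guarantee it from Python 3.7 (CPython 3.6 preserves it as an
# implementation detail).
serialized = OrderedDict((name, data) for name, data in attributes)
```

Field order matters here because it determines the order of the generated `__init__` parameters after the metadata is read back.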

@Bogdanp

Bogdanp May 20, 2018

Author Contributor

Ahh, that's right! I'll make the change.

@Bogdanp

Bogdanp May 20, 2018

Author Contributor

Change made. :)

@JelleZijlstra

JelleZijlstra May 20, 2018

Collaborator

Thanks! For the future though, it's generally easier on the reviewer to push new commits instead of amending your previous commit and force-pushing as you're doing now. That way, it's easier for the reviewer to see what changes were made in response to the review.

We squash the commits anyway before merging, so it's OK to have a lot of commits in your PR branch.

@ilevkivskyi
Collaborator

ilevkivskyi left a comment

Thanks for continuing to work on this!

I have several more comments and suggestions. For next time, please don't rebase: I need to see your full commit history, otherwise it makes reviewing hard (especially for large PRs like this one).

auto_attribs_default=True
)
# TODO: Drop the or clause once dataclasses lands in typeshed.

@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

I think you can actually take over the typeshed PR python/typeshed#1944; the original author does not seem to have time to finish it.

@Bogdanp

Bogdanp May 27, 2018

Author Contributor

Sorry, but I don't think I'll have the time to carry that through at the moment.

from mypy.types import CallableType, NoneTyp, Type, TypeVarDef, TypeVarType
from mypy.typevars import fill_typevars

#: The set of decorators that generate dataclasses.

@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

Why a colon after #?

@Bogdanp

Bogdanp May 27, 2018

Author Contributor

Habit. :)

Sphinx picks these up as docstrings for the module vars, but I'll remove this one.

def __init__(self, ctx: ClassDefContext) -> None:
    self._ctx = ctx

def transform(self) -> None:

@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

Add a docstring for this function.

args=[attr.to_argument(info) for attr in attributes if attr.is_in_init],
return_type=NoneTyp(),
)
for stmt in self._ctx.cls.defs.body:

@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

Why do you need this block? Add a comment about this.

class A:
    a: int


@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

Please use at most one empty line in tests (even between tests).


reveal_type(SpecializedApplication) # E: Revealed type is 'def (id: Union[builtins.int, None], name: builtins.str, rating: builtins.int =) -> __main__.SpecializedApplication'

[builtins fixtures/list.pyi]

@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

Newline is missing at the end of file.




@overload
def dataclass(_cls: _C,

@ilevkivskyi

ilevkivskyi May 24, 2018

Collaborator

You also need to add InitVar (here and on typeshed). You can define it as a normal generic class:

class InitVar(Generic[_T]):
    pass

Then in the plugin you can test for it, e.g. like this: if isinstance(some_type, Instance) and some_type.info.fullname() == 'dataclasses.InitVar'.

@Bogdanp

Bogdanp May 27, 2018

Author Contributor

I had tried that and couldn't get it to work, but I must've messed something up last time because it works just fine now.

Bogdanp added some commits May 27, 2018

@Bogdanp

Contributor Author

Bogdanp commented May 27, 2018

@ilevkivskyi do you have any advice with regards to how to best remove InitVar attributes from a class? I've tried removing them from the SymbolTable, but that breaks inheritance. The best I could come up with was to remove them from the class' symbol table and add the types to the DataclassAttributes but then I wasn't sure how to best serialize/deserialize the types.

@ilevkivskyi
Collaborator

ilevkivskyi left a comment

Thanks! This looks almost ready, but there are a few more comments.

Most importantly it looks like you still didn't add incremental tests. They live in check-incremental.test (you can look for examples in attrs tests). Please add at least these tests:

  • Basic functionality (creating boilerplate methods) still works after loading from cache
  • Base classes work after loading from cache (the base class and main class should be in different files).
  • Forward references work correctly. For example:
@dataclass
class C:
    x: 'Other'
class Other:
    ...

IIRC all these kinds of tests should be there for attrs.
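
To make the forward-reference case concrete, here is a runtime sketch using only the standard library. It shows why the case needs care: the annotation is stored as a string and can only be resolved once `Other` exists. The explicit `globalns` argument is a choice made for this sketch so that it does not depend on where the code runs.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class C:
    x: "Other"  # forward reference: Other is defined below


class Other:
    pass


# dataclasses stores the annotation as written, i.e. the string "Other".
raw_annotation = fields(C)[0].type
# Resolving the string needs a namespace that already contains Other.
resolved = get_type_hints(C, globalns={"Other": Other})
```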

'has_default': self.has_default,
'line': self.line,
'column': self.column,
}

@ilevkivskyi

ilevkivskyi May 28, 2018

Collaborator

> The best I could come up with was to remove them from the class' symbol table and add the types to the DataclassAttributes but then I wasn't sure how to best serialize/deserialize the types.

You can add a list of InitVars here. Most types in mypy support (de-)serialization (those that are not supported, like partial types or forward reference types, should not appear at the point where serialization happens). You can find the signatures in mypy.types.Type.serialize and mypy.types.Type.deserialize.

@Bogdanp

Bogdanp May 28, 2018

Author Contributor

I had tried that, but deserialization failed with De-serialization failure: TypeInfo not fixed. Reading the docstring for FakeInfo, I understand this is supposed to be done in two passes, but it looks like the fixup step requires a lot of information I don't necessarily have/that it would be ugly to pass around. Am I missing something obvious?

info = self._ctx.cls.info
for attr in attributes:
    try:
        node = info.names[attr.name].node

@ilevkivskyi

ilevkivskyi May 28, 2018

Collaborator

I think this will look cleaner if refactored via info.names.get(attr.name) which cannot raise.
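
The two lookup styles being compared, sketched with a plain dict standing in for the class symbol table. The names `names`, `lookup_with_try` and `lookup_with_get` are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical symbol table mapping attribute names to nodes.
names: Dict[str, str] = {"x": "node-for-x"}


def lookup_with_try(name: str) -> Optional[str]:
    try:
        return names[name]
    except KeyError:
        return None


def lookup_with_get(name: str) -> Optional[str]:
    # dict.get returns None for a missing key instead of raising.
    return names.get(name)
```

Both functions behave the same; the second simply avoids the exception handler.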

setup.cfg Outdated
@@ -48,3 +48,6 @@ parallel = true

[coverage:report]
show_missing = true

[isort]
multi_line_output = 5

@ilevkivskyi

ilevkivskyi May 28, 2018

Collaborator

Why do you want this addition? Can it be made in a separate PR?

@Bogdanp

Bogdanp May 28, 2018

Author Contributor

That was accidental, sorry! I'll remove it.

try:
    # parse_bool returns an optional bool, so we coerce it
    # to a bool here in order to appease the type checker.
    is_in_init = bool(ctx.api.parse_bool(field_args['init']))

@ilevkivskyi

ilevkivskyi May 28, 2018

Collaborator

This will definitely be cleaner if refactored via field_args.get('init'), which can't raise. (Also, the comment/coercion will be unnecessary after refactoring.)

@Bogdanp

Bogdanp May 28, 2018

Author Contributor

I may be missing something but I don't think that's true. field_args is a dict from str -> Expression so I'll still need the parse_bool call, which means I'll also need the bool coercion since parse_bool returns an Optional[bool]. I'll refactor to drop the try-catch though (although I do think this approach is cleaner)!
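
A sketch of the point being made, using stand-ins rather than mypy's API: here `parse_bool` is a hypothetical function returning `Optional[bool]`, and `field_args` maps names to source strings instead of `Expression` nodes.

```python
from typing import Dict, Optional


def parse_bool(expr: str) -> Optional[bool]:
    """Hypothetical stand-in: None means the expression is not a bool literal."""
    return {"True": True, "False": False}.get(expr)


def is_in_init(field_args: Dict[str, str]) -> bool:
    expr = field_args.get("init")  # cannot raise, unlike field_args["init"]
    if expr is None:
        return True  # init defaults to True when the argument is absent
    # parse_bool returns Optional[bool], so bool() is still needed here.
    return bool(parse_bool(expr))
```

Using `.get` removes the try/except, but the coercion stays because the parsed value can still be `None`.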


# Treat the assignment as an instance-level assignment
# even though it looks like a class-level assignment.
node.is_initialized_in_class = False

@ilevkivskyi

ilevkivskyi May 28, 2018

Collaborator

Could you please add a test that fails without this change? Or if you already added, could you please tell me the name?

@Bogdanp

Bogdanp May 28, 2018

Author Contributor

It appears that wasn't necessary after all so I removed it.

def problem(self) -> T:
    return self.z # E: Incompatible return value type (got "List[T]", expected "T")

reveal_type(A) # E: Revealed type is 'def [T] (x: T`1, y: T`1, z: builtins.list[T`1]) -> __main__.A[T`1]'

@ilevkivskyi

ilevkivskyi May 28, 2018

Collaborator

This check is not very useful; I would rather call it with incompatible args to see an error, e.g. A('no', 'way', [1, 2]).

reveal_type(a) # E: Revealed type is '__main__.A[builtins.int*]'
reveal_type(a.x) # E: Revealed type is 'builtins.int*'
reveal_type(a.y) # E: Revealed type is 'builtins.int*'
reveal_type(a.z) # E: Revealed type is 'builtins.list[builtins.int*]'

@ilevkivskyi

ilevkivskyi May 28, 2018

Collaborator

I would add a test that a.foo() returns an int.


[builtins fixtures/list.pyi]

[case testDataclassGenerics]

@ilevkivskyi

ilevkivskyi May 28, 2018

Collaborator

Thanks for adding some tests!

Bogdanp added some commits May 28, 2018

@Bogdanp

Contributor Author

Bogdanp commented May 28, 2018

Thanks for the review @ilevkivskyi. I believe I've made all the requested changes apart from InitVar support which I'll try to tackle again towards the end of the week.

@ilevkivskyi
Collaborator

ilevkivskyi left a comment

> I'll try to tackle again towards the end of the week.

We are close to the end of the week, so here is another (hopefully last) round of review.

Note that we will still need to add function plugin hooks to make asdict, astuple, replace etc. more precise (they can't be typed precisely in typeshed) in a separate PR. Are you interested in working on the second PR?
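
For context, a runtime sketch of the three helpers mentioned above, using a made-up `Point` class. The comments note what depends on the specific dataclass, which is presumably the information a generic stub signature cannot express.

```python
from dataclasses import asdict, astuple, dataclass, replace


@dataclass
class Point:
    x: int
    y: int


p = Point(1, 2)
as_mapping = asdict(p)     # keys and value types depend on Point's fields
as_sequence = astuple(p)   # tuple length and item types depend on Point's fields
moved = replace(p, y=5)    # accepted keywords depend on Point's __init__
```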

auto_attribs_default=True
)
# TODO: Drop the or clause once dataclasses lands in typeshed.

@ilevkivskyi

ilevkivskyi Jun 2, 2018

Collaborator

The typeshed PR has landed. Is this still needed?

@Bogdanp

Bogdanp Jun 4, 2018

Author Contributor

Looks like it still is. I'm not sure why, but after updating the typeshed submodule to the current master, the issue still happens if I remove the or clause and run mypy against my project.

'has_default': self.has_default,
'line': self.line,
'column': self.column,
}

@ilevkivskyi

ilevkivskyi Jun 2, 2018

Collaborator

> Am I missing something obvious?

No, you are right, this way it will not work. Here is another idea about InitVars that I think should work. You can just store the names of InitVars in a special field in .metadata, then later you can find the types in the type of the __init__ method, since in mypy both arg types and arg names are stored. This is a bit hacky, but should be robust and is much simpler than manually traversing types in the fixup phase.
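
The idea can be sketched with plain data, without mypy's classes. Everything here is a stand-in: the parallel lists imitate the argument names and types of a generated `__init__`, the type strings are placeholders, and the `"initvars"` key is a hypothetical metadata field.

```python
from typing import Dict, List


def initvar_types(arg_names: List[str], arg_types: List[str],
                  initvar_names: List[str]) -> Dict[str, str]:
    """Look up each stored InitVar name in the parallel name/type lists."""
    types_by_name = dict(zip(arg_names, arg_types))
    return {name: types_by_name[name] for name in initvar_names}


# Hypothetical data: the argument names and types of a generated __init__,
# plus the InitVar names that would have been saved in the metadata.
init_arg_names = ["self", "x", "scale"]
init_arg_types = ["C", "builtins.int", "builtins.str"]
metadata = {"initvars": ["scale"]}

recovered = initvar_types(init_arg_names, init_arg_types, metadata["initvars"])
```

Only names are serialized in this scheme, so no type has to be deserialized and fixed up separately.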

frozen: bool = ...) -> Callable[[_C], _C]: ...


def field(*,

@ilevkivskyi

ilevkivskyi Jun 2, 2018

Collaborator

It looks like the current definition in typeshed is a bit different. Could you please sync this one?

@Bogdanp

Bogdanp Jun 4, 2018

Author Contributor

This definition diverges from the one in typeshed in that it returns _T rather than Field[_T]. Returning Field[_T] breaks things like:

@dataclass
class Foo:
  x: int = field(default=42)

Because int and Field[int] are incompatible.
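
The runtime behaviour behind this trade-off can be checked with the standard library. This sketch shows that `field()` really does return a `Field` object, while the decorated class ends up exposing the plain default value:

```python
from dataclasses import Field, dataclass, field

# On its own, field() returns a Field object.
standalone = field(default=42)


@dataclass
class Foo:
    x: int = field(default=42)


# After the decorator runs, the class attribute is the default value itself.
foo = Foo()
```

So a stub returning `_T` describes what the annotated attribute holds after decoration, whereas one returning `Field[_T]` describes the literal return value of the call.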


[builtins fixtures/list.pyi]
[out1]
[out2]

@ilevkivskyi

ilevkivskyi Jun 2, 2018

Collaborator

I think this test is not testing anything. If all files are unchanged, as they are here, then mypy will at most load caches, but will not make any checks. What you should do is put the first part of the test in another module (e.g. [file b.py]), and then on the second run (i.e. in [file b.py.2]) just make a cosmetic change, like adding a comment somewhere, so that mypy rechecks b.py while using the cache for the (unchanged) a.py.


[builtins fixtures/list.pyi]
[out1]
main:7: error: Revealed type is 'def (x: builtins.int) -> __main__.B'

@ilevkivskyi

ilevkivskyi Jun 2, 2018

Collaborator

Any error (including a reveal) prevents writing the cache, so you should use reveal_type only on the second run.


[case testIncrementalDataclassesDunder]
from a import A
reveal_type(A) # E: Revealed type is 'def (a: builtins.int) -> a.A'

@ilevkivskyi

ilevkivskyi Jun 2, 2018

Collaborator

Also here, please don't add errors on the first run.


[builtins fixtures/list.pyi]
[out1]
[out2]

@ilevkivskyi

ilevkivskyi Jun 2, 2018

Collaborator

I like this test case, simple and to the point.


[builtins fixtures/list.pyi]
[out1]
main:2: error: Argument 2 to "B" has incompatible type "str"; expected "int"

@ilevkivskyi

ilevkivskyi Jun 2, 2018

Collaborator

I like this one too. Although it has an error on the first run, we should keep it, just to be sure the error is cleared once it is fixed.


[builtins fixtures/list.pyi]
[out1]
[out2]

@ilevkivskyi

ilevkivskyi Jun 2, 2018

Collaborator

I would introduce a simple change in one of a or b on the second run.

A(b=B(42))
A(b=42) # E: Argument "b" to "A" has incompatible type "int"; expected "B"

[builtins fixtures/list.pyi]

@ilevkivskyi

ilevkivskyi Jun 2, 2018

Collaborator

No newline at the end of file. We really should make this a lint error :-)

Bogdanp added some commits Jun 4, 2018

@Bogdanp

Contributor Author

Bogdanp commented Jun 4, 2018

> Note that we will still need to add function plugin hooks to make asdict, astuple, replace etc. more precise (they can't be typed precisely in typeshed) in a separate PR. Are you interested in working on the second PR?

I don't think I'll have the free time to tackle that. In fact, I won't have time to work on this PR either in the next two or three weeks. Feel free to take over if you feel there are any other changes that need to be made.

Ivan Levkivskyi

@ilevkivskyi ilevkivskyi referenced this pull request Jun 5, 2018

Merged

Sync typeshed #5151

Ivan Levkivskyi added some commits Jun 5, 2018

@ilevkivskyi
Collaborator

ilevkivskyi left a comment

OK, I added some polish, this is now ready. I will merge once tests pass. Thanks for all your work!

I will continue working on this in subsequent PRs.

@ilevkivskyi ilevkivskyi merged commit df828b0 into python:master Jun 5, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details