Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preliminary implementation of stdlib/3.7/dataclasses.pyi. #1944

Merged
merged 7 commits into from
May 30, 2018

Conversation

gwk
Copy link
Contributor

@gwk gwk commented Mar 6, 2018

This is a first effort at adding typing for the 3.7 dataclasses module.

@@ -0,0 +1,47 @@
from typing import *
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding these stubs!

A couple of style comments:

  • don't use import *
  • names that don't actually exist at runtime (like T and DictType) should be private, so their names should start with an underscore.
  • indent with four spaces
  • add spaces after : in annotations and around = for annotated functions with arguments.

@JelleZijlstra JelleZijlstra mentioned this pull request Mar 18, 2018
41 tasks
@gwk
Copy link
Contributor Author

gwk commented Mar 26, 2018

@JelleZijlstra thanks for the feedback; working on an update now. Does the private rule include public names imported from typing? from typing import Type or from typing import Type as _Type?

@JelleZijlstra
Copy link
Member

Names imported with from A import B are not public; those imported with from A import B as C (or from A import B as B) are, unless their name starts with an underscore.

@gwk
Copy link
Contributor Author

gwk commented Mar 26, 2018

I made the requested style changes, and one important fix: dataclass is now typed as an overloaded function, for the uncalled @dataclass and called @dataclass(...) cases.

@JelleZijlstra
Copy link
Member

There are still a few Travis failures that should be fixable.

@gwk gwk force-pushed the dataclasses branch 2 times, most recently from 2648516 to c031abd Compare March 26, 2018 20:20
@gwk
Copy link
Contributor Author

gwk commented Mar 27, 2018

@JelleZijlstra latest push passed all tests - please give it another look!


@overload
def dataclass(*, init: bool = ..., repr: bool = ..., eq: bool = ...,
order: bool = ..., hash: Optional[bool] = ..., frozen: bool = ...) -> Callable[[Type[_T]], Type[_T]]: ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@gwk gwk Mar 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The hash tristate parameter changed to unsafe_hash boolean in cpython commit dbf9cff48a4ad0fd58e1c623ce1f36c3dd3d5f38.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the code against the implementation of dataclasses.py (not the documentation since it doesn't exist yet: https://bugs.python.org/issue32216) and noticed a couple of places where it can be improved.

Also cc @ericvsmith who implemented dataclasses and @ilevkivskyi who is interested in writing dataclass support in mypy.

_DictType = TypeVar('_DictType', bound=dict)
_TupleType = TypeVar('_TupleType', bound=tuple)

class _MISSING_TYPE: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're missing MISSING.


def field(*, default: Union[_T, _MISSING_TYPE] = ..., default_factory: Union[Callable[[], _T], _MISSING_TYPE] = ...,
init: bool = ..., repr: bool = ..., hash: Optional[bool] = ..., compare: bool = ...,
metadata: Optional[Mapping[str, Any]] = ...) -> Field: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments in the code say

# This function is used instead of exposing Field creation directly,
#  so that a type checker can be told (via overloads) that this is a
#  function whose type depends on its parameters.

I do think we should use overloads. An overload with default given should return a Field[_T] and allow default_factory as an argument, and one with default not given or _MISSING_TYPE should not allow default_factory and return Field[Any].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An overload with default given should return a Field[_T] and allow default_factory as an argument

I think you mean "disallow default_factory as an argument", correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops yes, they are exclusive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I read it, there are three cases: neither provided (which returns Field[Any]), just default provided, and just default_factory provided. The last two both return Field[_T]. Is that correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds right.

init: bool = ..., repr: bool = ..., hash: Optional[bool] = ..., compare: bool = ...,
metadata: Optional[Mapping[str, Any]] = ...) -> Field: ...

def fields(class_or_instance: Type) -> Tuple[Field, ...]: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also accepts dataclass instances. We should probably just type it as Any; the mypy plugin can do something smarter internally if we want to.

Also, we should be explicit here and say Field[Any].


class _InitVarMeta(type): ...

def asdict(obj: Any, *, dict_factory: _DictType = ...) -> _DictType: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite correct; the argument type should be something like Type[_DictType].

Also, the range of acceptable type is a little wider than what's given here: in practice, dict_factory can be any callable accepting a list of (key, value) tuples. We could perhaps better type this as: def asdict(obj: Any, *, dict_factory: Callable[[List[Tuple[str, Any]], _T]) -> _T: ..., where _T = TypeVar('_T').

All this also goes for astuple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This took me a couple of readings but I now understand and agree that using Callable is the most accurate approach.

class InitVar(metaclass=_InitVarMeta): ...

def make_dataclass(cls_name: str, fields: Iterable[Union[str, Tuple[str, type], Tuple[str, type, Field]]], *,
bases: Tuple[type, ...] = ..., namespace: Dict[str, Any] = ...,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namespace is Optional

@glenjamin
Copy link

Is there a good way to pull this definition into a python 3.6 project? Will the file need to be duplicated this repo into the stdlib and thirdparty sections?

@gvanrossum
Copy link
Member

gvanrossum commented Apr 2, 2018 via email

@glenjamin
Copy link

Having now played around with this definition a bit, am I right in thinking that mypy is going to need some sort of custom support to be able to correctly type dataclasses?

The problem I'm currently running into is that although the dataclass' properties are correctly specified, there is nothing in the decorator's type signature to indicate that an __init__ method has been generated, so the type checker falls back to object.__init__ which doesn't accept any arguments.

@gwk
Copy link
Contributor Author

gwk commented Apr 2, 2018

thanks to everyone for the reviews; I'll continue working whenever I find a free moment, hopefully next week. @glenjamin what do you mean about 3.6? Here is my attempt to backport dataclasses for use in both 3.6 and 3.7: https://github.com/gwk/pithy/tree/master/pithy/dataclasses. backport.py is copied from the cpython, and backport.pyi is the annotations file from this PR. __init__.py uses the backport only if the 3.7 std import fails. Is this more or less what you are imagining?

Regarding __init__, you are correct. See above for the referenced mypy issue.

@glenjamin
Copy link

glenjamin commented Apr 2, 2018

@glenjamin what do you mean about 3.6?

I was only asking about the typeshed mechanims required to use the same pyi file for both without having to maintain two copies - the answer provided by Guido in #1944 (comment) seems obvious in retrospect.

@JelleZijlstra
Copy link
Member

It's actually not that obvious: see our woes in #1175.

@ilevkivskyi
Copy link
Member

@gwk Do you have time to finish this within next week (or preferably next few days)? If you don't, then I will need to continue your work myself (Python 3.7 release is coming very soon).

@JelleZijlstra
Copy link
Member

@ilevkivskyi if you like, I can take over the implementation here if you do the review.

@gwk
Copy link
Contributor Author

gwk commented May 28, 2018

@JelleZijlstra @ilevkivskyi I'll try to update this now to address your comments.

@ilevkivskyi
Copy link
Member

I'll try to update this now to address your comments.

Great! Just please don't rebase so that it will be easier to review, and don't forget to push your local changes so that we see the new commits :-)

@gwk
Copy link
Contributor Author

gwk commented May 29, 2018

@JelleZijlstra I have pushed the changes. @ilevkivskyi apologies in advance, I had previously rebased this branch against master and was not in a position to undo that operation. I hope it doesn't cause too much trouble for review. I did make the changes as a second commit so you will at least be able to see them separately!

@JelleZijlstra
Copy link
Member

Thanks! Could you fix the CI errors?



@overload # `default` and `default_factory` are optional and mutually exclusive.
def field(*, default: _T = ..., default_factory: _MISSING_TYPE = ...,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All three overloads are actually the same unless I'm missing something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate? The return type of the last one is Field[Any], while the first two are constrained.


class _InitVarMeta(type): ...

def asdict(obj: Any, *, dict_factory: _DictFactory = ...) -> _T: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be _DictFactory[_T]. Maybe just skip the type alias since it's only used once and write Callable... here directly.


_T = TypeVar('_T')

_DictFactory = Callable[List[Tuple[str, Any]], _T]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite right; you need one more set of [] around the first argument to Callable, because that's how you indicate the argument list.

@bluetech
Copy link
Contributor

Is there any avenue in the future to make dataclasses.replace type-safe? e.g.

@dataclass
class C:
    field: int

c = C(field='field')
dataclasses.replace(c, doesnotexist=10)  # Should not type check.
dataclasses.replace(c, field='notint')  # Should not type check.

Asking as a heavy user of this function (or rather attr.evolve currently).

@ilevkivskyi
Copy link
Member

Is there any avenue in the future to make dataclasses.replace type-safe?

This is work for mypy plugin, the stub is only part of the (unfinished) picture.

@gwk
Copy link
Contributor Author

gwk commented May 29, 2018

@JelleZijlstra @ilevkivskyi I can't figure out how to fix the error regarding field overlapping overloads, even after removing the _MISSING_TYPE parameters from the overload signatures.

stdlib/3.7/dataclasses.pyi:37: error: Overloaded function signatures 1 and 3 overlap with incompatible return types
stdlib/3.7/dataclasses.pyi:42: error: Overloaded function signatures 2 and 3 overlap with incompatible return types

What is strange is that when I remove the additional optional parameters as shown below, it passes:

@overload
def field(*, default: _T) -> Field[_T]: ...

@overload
def field(*, default_factory: Callable[[], _T]) -> Field[_T]: ...

@overload
def field() -> Field[Any]: ...

This suggests to me that mypy has a bug, possibly around non-optional parameters following a *.

@JelleZijlstra
Copy link
Member

That does look like a bug. Some of the work @Michael0x2a is currently doing on mypy may end up fixing this.

Sometimes mypy reports errors on the overload definition, but then behaves correctly when you use it. Can you try adding # type: ignores into the stub and seeing whether mypy has correct output for a few calls like field(default=3), field(default_factory=int), etc.?

@ilevkivskyi
Copy link
Member

@JelleZijlstra Yes, Michael's PR fixed it. I restarted the build after mypy merge and it is all green now. Is anything left to do with this PR?


class FrozenInstanceError(AttributeError): ...

class InitVar(metaclass=_InitVarMeta): ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think this one should be Generic[_T].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the usage of InitVar; I simply translated this from the implementation. Nor do I know quite what you are asking me to change the definition to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ilevkivskyi if you tell me what you want me to change it to I'll do so and we can close this out. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would write

class InitVar(Generic[_T]): ...

and remove _InitVarMeta completely because it is private.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK thanks.

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! @JelleZijlstra do you have other comments?

@JelleZijlstra JelleZijlstra merged commit 66226ab into python:master May 30, 2018
@gwk gwk deleted the dataclasses branch May 30, 2018 20:38
yedpodtrzitko pushed a commit to yedpodtrzitko/typeshed that referenced this pull request Jan 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants