from __future__ import annotations breaks dataclasses ClassVar and InitVar handling #77634
Comments
This is broken in 3.7 (both beta 3 and 4):

    from __future__ import annotations
    from dataclasses import dataclass
    from typing import ClassVar, Any

    @dataclass
    class C():
        class_var: ClassVar[Any] = object()
        data: str

Traceback:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Users\ricky\AppData\Local\Programs\Python\Python37\lib\dataclasses.py", line 850, in dataclass
        return wrap(_cls)
      File "C:\Users\ricky\AppData\Local\Programs\Python\Python37\lib\dataclasses.py", line 842, in wrap
        return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
      File "C:\Users\ricky\AppData\Local\Programs\Python\Python37\lib\dataclasses.py", line 763, in _process_class
        else 'self',
      File "C:\Users\ricky\AppData\Local\Programs\Python\Python37\lib\dataclasses.py", line 442, in _init_fn
        raise TypeError(f'non-default argument {f.name!r} '
    TypeError: non-default argument 'data' follows default argument |
To be clear: it works just fine with the annotations import. |
Sorry, meant to say it works just fine *without* the import. |
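For context, here is a minimal sketch of why the import matters. Under `from __future__ import annotations`, annotations are stored as strings instead of evaluated objects, so any check of the form "is this annotation typing.ClassVar?" only sees text. The example quotes the annotations explicitly, which produces the same stored form without depending on where the future import is placed. The class name `Example` is purely illustrative.

```python
from typing import Any, ClassVar


class Example:
    # Quoting the annotation mimics what `from __future__ import annotations`
    # does to every annotation in a module: it is stored as a string.
    class_var: "ClassVar[Any]" = object()
    data: "str"


stored = Example.__annotations__["class_var"]
print(type(stored).__name__, repr(stored))

# An identity check against typing.ClassVar cannot succeed on a plain string.
print(stored is ClassVar)
```

Because the stored value is just the text "ClassVar[Any]", code that expects a typing object will treat `class_var` as an ordinary field with a default, which is consistent with the TypeError reported above.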
This is a known issue, but it wasn't being tracked here. So, thanks for opening the issue. ericvsmith/dataclasses#92 (comment) Not to put Łukasz on the spot (he's sitting behind me even as we speak), but I think we missed the window for 3.7.0 for this. I'll discuss it with him. |
Well, this is a rather big deal and I'd like to see it solved for 3.7.0. Ned, this is after the last beta but as far as I understand, we're still fine committing fixes that maintain ABI until RC1, right? Note that this isn't specific to the field: "ClassVar[SomeTypeReferencedLater]" = get_some_type_object() the effect is the same. There are two ways to solve it, the right way and the fast way. The right way is to call
This is why attrs went with the fast way which covers most (but not all) bases in this case. If the annotation is a string, just check if it starts with "typing.ClassVar", "t.ClassVar", or just "ClassVar". That's a fast check and doesn't ever require importing typing. On the flip side, the 0.001% of users [1]_ who import ClassVar differently, will not have their class variables recognized. So, Eric, unless you really want to do the right thing here and make dataclasses forever slower to start up than attrs, I would be happy to provide you with a patch for this during sprints. [1] .. Figure made up on the spot. |
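The "fast way" described above can be sketched roughly as follows. This is an illustration of the idea only, not the code that attrs or dataclasses ships; the function name `is_classvar_string` and the constant name are hypothetical, and the prefixes are the three quoted in the comment.

```python
# Prefixes quoted in the comment above; the names below are hypothetical.
_CLASSVAR_PREFIXES = ("typing.ClassVar", "t.ClassVar", "ClassVar")


def is_classvar_string(annotation):
    """Return True if a string annotation looks like a ClassVar.

    No import of typing is needed: only the text of the annotation is used.
    """
    return isinstance(annotation, str) and annotation.startswith(_CLASSVAR_PREFIXES)


for candidate in ("ClassVar[Any]", "typing.ClassVar[int]", "t.ClassVar", "CV[int]"):
    print(candidate, is_classvar_string(candidate))
```

The last candidate shows the trade-off mentioned above: a user who imports ClassVar under another name is not recognized.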
Yes |
See also [2]_ for a brief discussion of forward references, which makes get_type_hints() undesirable in this case.
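A small sketch of the forward-reference problem, assuming nothing beyond the standard typing module: `typing.get_type_hints()` has to evaluate the annotation string, so it fails when a name inside it is not defined yet. The class name `Holder` is illustrative, and `SomeTypeReferencedLater` is left undefined on purpose.

```python
import typing


class Holder:
    # SomeTypeReferencedLater is deliberately not defined anywhere yet.
    field: "typing.ClassVar[SomeTypeReferencedLater]" = None


try:
    typing.get_type_hints(Holder)
except NameError as exc:
    failure = exc
    print("get_type_hints failed:", exc)
else:
    failure = None
    print("get_type_hints succeeded")
```

A class decorator that called get_type_hints() at class creation time would hit the same error for any annotation that refers to something defined later in the module.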
I'm okay with the fast check for the string "ClassVar". My only concern is how the user would figure out what's going on if, for example, they "import typing as T". The generated __init__ wouldn't have the params they expect, but with default values thrown in I'm not so sure how quickly they'd notice. Hopefully they'd figure out soon enough there's a problem, but I'm not sure they'd know the cause of it.

Maybe we could do some additional checking if typing has been imported? If we see "T.ClassVar" ("[identifier].ClassVar" appears at the beginning of the string) then, if typing has been imported, further check that "T is typing"? This isn't going to help with "from typing import ClassVar as CV" (but only 0.00004% of users would do that [3]_), and I don't want to use regexes for this. str.partition() is likely good enough, if we decide to go this route.

Is there any scenario where the following code would be useful, or expected to work, if "import typing as T" hadn't been executed before @dataclass runs? After all, if we do decide to call get_type_hints() it wouldn't work without it.

    field: "T.ClassVar[SomeTypeReferencedLater]" = get_some_type_object()

But again, unless [2]_ is addressed, get_type_hints() will fail unless both T and SomeTypeReferencedLater are known when @dataclass runs. So, I guess this is my roundabout way of saying we should do the best we can with string inspection, and not use get_type_hints(). Maybe we can discuss it with Guido at the sprints.

For all of this, I'm assuming we'll do something similar with InitVar. Although we obviously know it's been imported, it doesn't solve all of the other issues with get_type_hints.

[2] .. python/typing#508
[3] .. Also a made up number. |
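The additional check floated in this comment (split off the identifier with string methods, then confirm it is bound to the real typing module) might look something like the sketch below. It is only one possible reading of the proposal, not the implementation that was eventually committed. The function name `looks_like_classvar` and the `namespace` parameter (assumed to be the globals of the module that defines the class) are hypothetical.

```python
import sys


def looks_like_classvar(annotation, namespace):
    """Heuristically decide whether a string annotation names typing.ClassVar.

    `namespace` is assumed to be the globals of the module defining the class.
    """
    if not isinstance(annotation, str):
        return False
    # Drop any subscript: "T.ClassVar[int]" -> "T.ClassVar".
    head = annotation.partition("[")[0].strip()
    qualifier, dot, name = head.rpartition(".")
    if name != "ClassVar":
        return False
    if not dot:
        # Bare "ClassVar": accept on the name alone.
        return True
    # Qualified form: accept only if typing has been imported at all and
    # the qualifier is bound to that very module object ("T is typing").
    typing_module = sys.modules.get("typing")
    return typing_module is not None and namespace.get(qualifier) is typing_module


import typing as T

ns = {"T": T}
print(looks_like_classvar("T.ClassVar[SomeTypeReferencedLater]", ns))
print(looks_like_classvar("T.ClassVar[int]", {}))
print(looks_like_classvar("CV[int]", ns))
```

Note that the string is never evaluated, so `SomeTypeReferencedLater` does not need to exist. As the comment says, `from typing import ClassVar as CV` is still not covered.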
I see that python/typing#508 is also referenced in python-attrs/attrs#361, where it contributed to attrs using string inspection. |
Perhaps if someone, in search of a speedup, were sort of rolling their own lighter-weight equivalent to the typing module (in order to avoid importing the full typing module), but duck typed enough to fool the average type checker? Is that possible? |
The more I think about this, the more I think Łukasz is correct that just checking for strings starting with "ClassVar", "typing.ClassVar", or "t.ClassVar" is the correct thing to do. This is the change he made in python-attrs/attrs#367.

In some private email I had discussed extracting the module name, if any, from the string annotation, and looking it up in sys.modules. This doesn't buy you much because you have to know how the caller imported typing. If they used "import typing as t", then you can't look up "t" in sys.modules. You could do some horrible frame trick to find out what the caller knew as "t", but that still won't work in plenty of cases. I don't want to use a complex solution that still doesn't work in all cases, and would indeed work in fewer places than just examining the string. The only name we could reliably detect is "typing.ClassVar"; all others wouldn't be in sys.modules correctly.

So, that leaves us with just looking at the string and guessing what the caller meant. I think the three strings that Łukasz suggested are good enough.

This will need to be well documented. The documentation should note that things like this will break:

    from __future__ import annotations
    from dataclasses import dataclass
    from typing import ClassVar as CV

    @dataclass
    class C:
        x: CV

x will not be a class variable here, since @dataclass will see the annotation as "CV" and not know that it means typing.ClassVar.

InitVar has a similar problem. What strings to use there? "InitVar" and "dataclasses.InitVar", of course. But maybe "dc.InitVar"? It's hard to say with no in-the-field usage examples to search for. I'll start with just the first two strings.

I really don't want to use regexes here. But a refinement would be to not just use .startswith, and instead use a word boundary. After all, "ClassVarOfMine" should probably not be considered a ClassVar. I'll leave that issue for another day, though.

I'll have a patch ready before the sprints are over. |
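Two points from this comment can be illustrated with a short sketch. First, sys.modules is keyed by real module names, so a local alias cannot be looked up there. Second, a plain .startswith check accepts "ClassVarOfMine", while a word-boundary variant written without regexes does not. The helper names, the alias `t_alias`, and the prefix tuple are hypothetical, and this is not the patch that was eventually written.

```python
import sys
import typing as t_alias  # local alias, standing in for "import typing as t"

# The real module name is a key in sys.modules; the alias is not.
print("typing" in sys.modules, "t_alias" in sys.modules)

_PREFIXES = ("ClassVar", "typing.ClassVar", "t.ClassVar")


def is_classvar_prefix(annotation):
    """Plain prefix check, as discussed above."""
    return annotation.startswith(_PREFIXES)


def is_classvar_word(annotation):
    """Like is_classvar_prefix, but the prefix must end at a word boundary."""
    for prefix in _PREFIXES:
        if annotation.startswith(prefix):
            rest = annotation[len(prefix):]
            # Accept only if nothing follows, or the next character cannot
            # continue an identifier (for example "[" or whitespace).
            if not rest or not (rest[0].isalnum() or rest[0] == "_"):
                return True
    return False


for candidate in ("ClassVar[int]", "ClassVarOfMine", "CV"):
    print(candidate, is_classvar_prefix(candidate), is_classvar_word(candidate))
```

Both helpers still miss "CV", which is the limitation the comment says must be documented.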
"t.ClassVar" looks ugly... |
We can't break the API at this point in the release cycle. But I am open to what string prefixes we should allow. |
We haven't released an RC, and we haven't documented the dataclasses module yet. I think we've learned the lesson that we shouldn't use typing in modules other than typing...
If it is really needed, I think we should only allow "typing.ClassVar". |
This is a blanket statement that is as hurtful as it is factually incorrect. Let me address the latter.
I came up with "typing", "t", and straight "ClassVar" by grepping through typed codebases I work with. I agree that "t" is rather arbitrary so I'm totally fine with leaving that one out. |
Lending my voice to the suggestion of limiting the class attribute check to |
When I said "use typing module", I meant "using type information from
Yes. I think we shouldn't parse annotation value until we can expect
How about performance? I agree that annotations could be very useful. But once this happens, it's very hard to remove it from the stdlib. |
At this stage in the release cycle, if you really feel strongly about this, you should take it up with Guido directly. |
There have been comments on the PR, but I'd like to focus the higher level issue back here. Specifically, see my comment #6768 (comment) To summarize: I still think string inspections are the best we can do. I'm going to try to organize a meeting with Guido, Ivan, and Łukasz at the sprints on Monday. |
Followup from our meeting at the sprints: we're going to go with inspecting the type annotation string and using heuristics to determine if the type is a ClassVar or InitVar. I'll follow up with more specifics on the approach. This will obviously need to make it into the documentation. |