Many functions that accept Iterable should actually accept Iterable | SupportsGetItem #7813
Related python/mypy#2220 |
I suggest adding to |
Or, well, since SupportsGetItem has two type parameters that won't exactly work, but something along those lines. |
This affects every single place where We could add an alias like you suggest and use it everywhere, but that doesn't seem like a great solution. The alias wouldn't get used in code outside typeshed, it would be harder for users to pick up this special case, and type checker error messages may get worse. |
I don't think it's common to call
True. But in a minor way. It would probably only change from showing "Iterable" to "SequenceOrIterable". Which is in fact more correct and less misleading. |
In the end, the job of typeshed is to correctly annotate the stdlib |
Anything useful you do with an iterable will ultimately call one of those functions (or some other part of the Python implementation that supports the |
Right. So maybe indeed the right thing is to add it to |
I don't think typeshed needs changes here, the problem is that |
I disagree, the type annotations are wrong. They specify the argument has to have |
Wait what behavior do you mean? Apart from having |
The behavior is representable in a stub. Iterable would be defined something like this:

    class _IterIterable(Protocol[_T_co]):
        def __iter__(self) -> Iterator[_T_co]: ...

    class _GetItemIterable(Protocol[_T_co]):
        def __getitem__(self, __idx: int) -> _T_co: ...

    Iterable: TypeAlias = _IterIterable[_T_co] | _GetItemIterable[_T_co]

But I still don't like this change much. Suppose we add a
And all that for an edge case so rare that it's only been reported a handful of times in mypy's years of history. |
Even that's not exactly correct, since it doesn't do the |
I ran into it with |
I really wouldn't mind if the fix is in type checkers instead of in typeshed. But I do think there should be a fix :) Personally I think that having a special case for Iterable in all type checkers is an "even worse" solution. @erictraut maybe you could have an opinion or suggestion here |
The fix here cannot (or rather should not) be provided by type checkers. If this problem needs to be fixed, it should be done so in the stubs. I also question whether this is a problem that needs fixing. As Jelle indicated, this seems like an extreme edge case that rarely comes up. Perhaps we'd be better off documenting a workaround? In addition to the downsides that Jelle mentioned, I'll add one more: replacing |
Thanks for weighing in, good points
Do you mean implementing |
Or wrapping it in a call to the |
Definitely not an option for |
I opened a PR against torch that might help :-) pytorch/pytorch#77116 |
Here's another possible workaround, assuming I can't modify original library's code (as @hauntsaninja did), and I want to have a function accepting any iterable (e.g. pytorch Dataset and other "normal" iterables), with only a thin wrapper without coercing to

    from typing import Generic, Iterable, Iterator, Protocol, TypeVar, Union
    from typing_extensions import TypeAlias

    T_co = TypeVar("T_co", covariant=True)

    class GetItemIterable(Protocol[T_co]):
        def __getitem__(self, __idx: int) -> T_co: ...

    AnyIterable: TypeAlias = Union[Iterable[T_co], GetItemIterable[T_co]]

    class GetItemIterableWrapper(Generic[T_co]):
        def __init__(self, iterable: GetItemIterable[T_co]):
            self.iterable = iterable

        def __iter__(self) -> Iterator[T_co]:
            return iter(self.iterable)  # type: ignore

    def to_iterable(x: AnyIterable[T_co]) -> Iterable[T_co]:
        return x if isinstance(x, Iterable) else GetItemIterableWrapper(x)

    # Example use:
    class MyIterable:
        def __getitem__(self, idx: int) -> str:
            if idx >= 3:
                raise IndexError()
            return "blah"

    def strjoin1(sep: str, iterable: AnyIterable[str]):
        # The following line gives type error (but works at runtime)
        return sep.join(iterable)

    def strjoin2(sep: str, iterable: AnyIterable[str]):
        # The following line gives no error
        return sep.join(to_iterable(iterable))

    print("Testing __iter__ iterable:")
    print(strjoin1(" ", ["1", "2", "3"]))
    print(strjoin2(" ", ["1", "2", "3"]))
    print("\nTesting __getitem__ iterable:")
    print(strjoin1(" ", MyIterable()))
    print(strjoin2(" ", MyIterable()))

|
Hmm, I guess I would be in favour of a single typeshed change: allow |
Yes, that would remove the need for |
Oh, you are totally right, it isn't! Because an iterator is also iterable... |
Hopefully #7817 will go some way towards making things more ergonomic :-) |
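The point that an iterator is also iterable can be checked at runtime with a small sketch. The class name Countdown and the variable names are hypothetical and used only for illustration. Whether a type checker accepts the iter(source) call depends on the stubs in use, so treat this as a runtime demonstration only.

```python
from collections.abc import Iterable, Iterator


class Countdown:
    """Hypothetical class that defines only __getitem__, not __iter__."""

    def __getitem__(self, index: int) -> int:
        # Raising IndexError is what ends sequence-protocol iteration.
        if index >= 3:
            raise IndexError(index)
        return 3 - index


source = Countdown()
wrapped = iter(source)  # falls back to the sequence protocol at runtime

# The original object does not look iterable to the ABC check...
source_is_iterable = isinstance(source, Iterable)

# ...but the iterator returned by iter() has __iter__ and __next__.
wrapped_is_iterator = isinstance(wrapped, Iterator)
wrapped_is_iterable = isinstance(wrapped, Iterable)

# The wrapped object can be consumed like any other iterable.
total = sum(wrapped)
```

Because the object returned by iter() is an iterator, it can be passed to anything annotated as taking an Iterable without a custom wrapper class.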
Thanks for the report, and I agree that this is one of 9857 things that are slightly annoying about Python's typing system. I think I speak for my fellow maintainers, though, when I say that we're not going to make the change you're asking for here. While it's true that objects with I'm glad @hauntsaninja's ingenious #7817 was merged, and I hope his pytorch PR is also merged :) |
Thanks all for your insightful comments, I totally agree with what you said @AlexWaygood. I think the change applied by @hauntsaninja is an excellent workaround. One thing that could be done is to document this somewhere, but it seems there is no real "Python typing docs" place; only the type checkers document this individually. |
We could put up an essay on typing.readthedocs.io, similar to https://typing.readthedocs.io/en/latest/source/unreachable.html about exhaustiveness checking. |
Many functions in stdlib that are documented to accept an "iterable" are annotated as Iterable[T]. This means the arguments must have __iter__. However, the actual functions don't require the arguments to have __iter__; they also work with arguments that have only __getitem__. That's because internally they use iter (in Python) or PySequence_Fast (in C).

As the docs for both iter and PySequence_Fast functions show, these functions support two protocols for the argument: the "iterable" protocol (__iter__) and the "sequence" protocol (__getitem__).

Example functions that have this bug right now:

For enumerate, a bug was actually filed on the mypy repo but in fact I believe this should be corrected in typeshed.

Aside: for loops also similarly accept "iterables" that have only __getitem__, but that is implemented inside type checkers themselves. For now mypy still doesn't support it properly, but Pyright does.
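The runtime behavior described in this report can be reproduced with a minimal sketch. The class Letters is a hypothetical example, not taken from any library. The calls below succeed at runtime, even though a type checker working from Iterable annotations may reject them.

```python
from collections.abc import Iterable


class Letters:
    """Hypothetical class that defines only __getitem__, not __iter__."""

    def __init__(self, text: str) -> None:
        self._text = text

    def __getitem__(self, index: int) -> str:
        # Indexing past the end raises IndexError, which ends iteration.
        return self._text[index]


letters = Letters("abc")

# The ABC check looks for __iter__, so this object is not an Iterable.
has_iter = isinstance(letters, Iterable)

# The runtime still accepts it, via the sequence (__getitem__) protocol.
as_list = list(letters)
joined = "-".join(letters)
numbered = list(enumerate(letters))
```

Each call falls back to indexing from 0 upward until IndexError is raised, which is why the object behaves like an iterable despite having no __iter__ method.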