Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surprising error involving bytes duck-type compatibility #12643

Closed
dvarrazzo opened this issue Apr 21, 2022 · 9 comments
Closed

Surprising error involving bytes duck-type compatibility #12643

dvarrazzo opened this issue Apr 21, 2022 · 9 comments
Labels
bug mypy got something wrong topic-reachability Detecting unreachable code topic-type-narrowing Conditional type narrowing / binder

Comments

@dvarrazzo
Copy link

Mypy seems to fail to narrow down union with not conditions. I couldn't find an issue already open on the topic.

To Reproduce

Running mypy on the following sample, an error is reported:

from typing import Union

Buffer = Union[bytes, bytearray, memoryview]

def f(data: Buffer) -> str:
    if not isinstance(data, bytes):
        data = bytes(data)

    return data.decode()
$ mypy testunion.py 
testunion.py:9: error: Item "memoryview" of "Union[bytes, bytearray, memoryview]" has no attribute "decode"  [union-attr]

This is unexpected because, after the if, the type of data is definitely bytes.

If a test if isinstance(data, (bytearray, memoryview)) is used instead there is no error. However this has likely a lower performance and requires more maintenance should other types be added to the union.

Your Environment

  • Mypy version used: 0.941
  • Python version used: 3.8.10
  • In pyproject.toml:
[tool.mypy]
warn_unused_ignores = true
show_error_codes = true
strict = true
@dvarrazzo dvarrazzo added the bug mypy got something wrong label Apr 21, 2022
@AlexWaygood AlexWaygood added the topic-type-narrowing Conditional type narrowing / binder label Apr 21, 2022
@AlexWaygood
Copy link
Member

AlexWaygood commented Apr 21, 2022

Unfortunately I believe this may actually be expected behaviour, despite the fact that it is somewhat baffling and counter-intuitive. It likely stems from the documentation for typing.ByteString:

class typing.ByteString(Sequence[int])
A generic version of collections.abc.ByteString.

This type represents the types bytes, bytearray, and memoryview of byte sequences.

As a shorthand for this type, bytes can be used to annotate arguments of any of the types mentioned above.

In other words (although it is not stated in a PEP anywhere), the typing module docs state that type checkers should treat bytearray and memoryview as implicit subtypes of bytes. This causes mypy to make false assumptions about reachability, as we can see if we run mypy on your snippet with the --warn-unreachable flag:

from typing import Union

Buffer = Union[bytes, bytearray, memoryview]

def f(data: Buffer) -> str:
    if not isinstance(data, bytes):
        data = bytes(data)  # error: Statement is unreachable

    return data.decode()  # error: Item "memoryview" of "Union[bytes, bytearray, memoryview]" has no attribute "decode

We can also see that mypy is able to quite happily do type narrowing using not conditions if we use different types in the Union; it's only the weird nexus of {bytes, bytearray, memoryview} in this snippet that causes confusion for mypy. Mypy emits no errors if run on the following code:

from typing import Union

Buffer = Union[bytes, int, list[int]]

def f(data: Buffer) -> str:
    if not isinstance(data, bytes):
        data = bytes(data)

    return data.decode()

@AlexWaygood AlexWaygood added the topic-reachability Detecting unreachable code label Apr 21, 2022
@JelleZijlstra
Copy link
Member

Was just going to write the same thing @AlexWaygood said :)

We maybe should reconsider this ByteString behavior. It's confusing, underspecified, and not useful as frequently as with int/float.

@AlexWaygood
Copy link
Member

AlexWaygood commented Apr 21, 2022

We maybe should reconsider this ByteString behavior. It's confusing, underspecified, and not useful as frequently as with int/float.

Agree 100%.

@hauntsaninja hauntsaninja changed the title Failure to narrow down unions using not conditions Surprising error involving bytes duck-type compatibility Apr 21, 2022
@dvarrazzo
Copy link
Author

As a shorthand for this type, bytes can be used to annotate arguments of any of the types mentioned above.

You are going to send lady Liskov to an early grave 😄 This looks like a bug to me: how can bytes be a a valid annotation for memoryview, if the latter doesn't have methods such as decode()?

I believe I have been fighting mypy issues for two years in psycopg 3 codebase thanks to this helpful shortcut.

@JelleZijlstra
Copy link
Member

@CarliJoy
Copy link

The problem with duck typing is even worse: #12824

@CarliJoy
Copy link

Let's fix it: https://github.com/JelleZijlstra/peps/blob/bufferpep/pep-9999.rst.
@JelleZijlstra the link gives a 404 not found

@AlexWaygood
Copy link
Member

@ilevkivskyi
Copy link
Member

FWIW the original example doesn't give an error in the current master (and probably also in 0.990)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug mypy got something wrong topic-reachability Detecting unreachable code topic-type-narrowing Conditional type narrowing / binder
Projects
None yet
Development

No branches or pull requests

5 participants