Conversation
Merging this PR will not alter performance🎉 Hooray!
|
There was a problem hiding this comment.
Code Review
This pull request modernizes the project's type-checking infrastructure by migrating mypy configuration to pyproject.toml and adding automated stub validation tests using stubtest and pyright. The type stubs in xxhash/init.pyi have been updated to utilize a custom buffer protocol and explicit exports. Feedback suggests using the tempfile module in tests to prevent potential race conditions and adding positional-only markers to various function signatures in the stubs to correctly match the underlying C extension's behavior.
|
|
||
| def _run_pyright(self, source: str) -> subprocess.CompletedProcess: | ||
| """Run pyright on a temporary file with the given source.""" | ||
| tmp = Path(__file__).resolve().parent / "__pyright_check_tmp.py" |
| def __init__(self, data: _DataType = ..., seed: int = ...) -> None: ... | ||
| def update(self, data: _DataType) -> None: ... |
There was a problem hiding this comment.
The xxhash C extension methods do not support keyword arguments. Adding the positional-only marker / ensures that type checkers will correctly flag invalid usage like xxh32(data=b'...'), which currently results in a TypeError at runtime.
| def __init__(self, data: _DataType = ..., seed: int = ...) -> None: ... | |
| def update(self, data: _DataType) -> None: ... | |
| def __init__(self, data: _DataType = ..., seed: int = ..., /) -> None: ... | |
| def update(self, data: _DataType, /) -> None: ... |
| def xxh32_digest(data: _DataType, seed: int = ...) -> bytes: ... | ||
| def xxh32_hexdigest(data: _DataType, seed: int = ...) -> str: ... | ||
| def xxh32_intdigest(data: _DataType, seed: int = ...) -> int: ... |
There was a problem hiding this comment.
These module-level functions are positional-only at runtime. Adding the / marker improves the accuracy of the stubs and prevents users from attempting to use keyword arguments that are not supported by the underlying C implementation.
| def xxh32_digest(data: _DataType, seed: int = ...) -> bytes: ... | |
| def xxh32_hexdigest(data: _DataType, seed: int = ...) -> str: ... | |
| def xxh32_intdigest(data: _DataType, seed: int = ...) -> int: ... | |
| def xxh32_digest(data: _DataType, seed: int = ..., /) -> bytes: ... | |
| def xxh32_hexdigest(data: _DataType, seed: int = ..., /) -> str: ... | |
| def xxh32_intdigest(data: _DataType, seed: int = ..., /) -> int: ... |
| def xxh3_64_digest(data: _DataType, seed: int = ...) -> bytes: ... | ||
| def xxh3_64_hexdigest(data: _DataType, seed: int = ...) -> str: ... | ||
| def xxh3_64_intdigest(data: _DataType, seed: int = ...) -> int: ... |
There was a problem hiding this comment.
These functions are positional-only at runtime. Adding the / marker ensures type-level correctness.
| def xxh3_64_digest(data: _DataType, seed: int = ...) -> bytes: ... | |
| def xxh3_64_hexdigest(data: _DataType, seed: int = ...) -> str: ... | |
| def xxh3_64_intdigest(data: _DataType, seed: int = ...) -> int: ... | |
| def xxh3_64_digest(data: _DataType, seed: int = ..., /) -> bytes: ... | |
| def xxh3_64_hexdigest(data: _DataType, seed: int = ..., /) -> str: ... | |
| def xxh3_64_intdigest(data: _DataType, seed: int = ..., /) -> int: ... |
| def xxh3_128_digest(data: _DataType, seed: int = ...) -> bytes: ... | ||
| def xxh3_128_hexdigest(data: _DataType, seed: int = ...) -> str: ... | ||
| def xxh3_128_intdigest(data: _DataType, seed: int = ...) -> int: ... |
There was a problem hiding this comment.
These functions are positional-only at runtime. Adding the / marker ensures type-level correctness.
| def xxh3_128_digest(data: _DataType, seed: int = ...) -> bytes: ... | |
| def xxh3_128_hexdigest(data: _DataType, seed: int = ...) -> str: ... | |
| def xxh3_128_intdigest(data: _DataType, seed: int = ...) -> int: ... | |
| def xxh3_128_digest(data: _DataType, seed: int = ..., /) -> bytes: ... | |
| def xxh3_128_hexdigest(data: _DataType, seed: int = ..., /) -> str: ... | |
| def xxh3_128_intdigest(data: _DataType, seed: int = ..., /) -> int: ... |
|
@gemini review |
There was a problem hiding this comment.
Code Review
This pull request migrates the project's Mypy configuration to pyproject.toml, removes legacy Pipfile and run_mypy.sh files, and introduces new test suites for stub validation using mypy.stubtest and pyright. The type stubs in xxhash/init.pyi were updated to use a custom buffer protocol and standardized parameter names. Feedback suggests that the custom _Buffer protocol should be replaced with typing_extensions.Buffer to maintain compatibility with Python versions prior to 3.12, where built-in types do not explicitly define the buffer method in their stubs.
| from typing import Protocol, final | ||
|
|
||
| # __buffer__ protocol makes this redundant on python 3.12+ | ||
| Buffer.register(array.ArrayType) | ||
| _InputType = Union[str, Buffer] | ||
| class _Buffer(Protocol): | ||
| """Objects that support the buffer protocol (PEP 688).""" | ||
| def __buffer__(self, flags: int, /) -> memoryview: ... | ||
|
|
||
| _DataType = _Buffer |
There was a problem hiding this comment.
Using a custom Protocol to define the buffer protocol is problematic for compatibility with Python versions prior to 3.12. In those versions, built-in types like bytes, bytearray, and memoryview do not explicitly define the __buffer__ method in their stubs, which will cause type checkers like mypy to report errors when these types are passed where _Buffer is expected.
It is recommended to use typing_extensions.Buffer (which backports collections.abc.Buffer from PEP 688). Type checkers have special-case handling for this type to ensure it correctly matches all buffer-providing types across all supported Python versions.
from typing import final
from typing_extensions import Buffer
_DataType = Buffer
|
@gemini review |
There was a problem hiding this comment.
Code Review
This pull request migrates mypy configuration to pyproject.toml, removes legacy environment files, and updates the type stubs in xxhash/__init__.pyi to use a custom buffer protocol. It also adds a new test suite to validate these stubs using pyright. Feedback indicates that the custom _Buffer protocol based on PEP 688 will break type checking on Python versions earlier than 3.12, suggesting a return to typing_extensions.Buffer. Additionally, it is recommended to handle cases where the pyright executable is missing in the new test suite to avoid hard failures.
| from typing import Protocol, final | ||
|
|
||
| # __buffer__ protocol makes this redundant on python 3.12+ | ||
| Buffer.register(array.ArrayType) | ||
| _InputType = Union[str, Buffer] | ||
| class _Buffer(Protocol): | ||
| """Objects that support the buffer protocol (PEP 688).""" | ||
| def __buffer__(self, flags: int, /) -> memoryview: ... | ||
|
|
||
| _DataType = _Buffer |
There was a problem hiding this comment.
The custom _Buffer protocol defined here is based on PEP 688, which was introduced in Python 3.12. On older Python versions (like 3.9, which this project appears to support), built-in types like bytes, bytearray, and memoryview do not have a Python-accessible __buffer__ method, so they will not match this protocol in type checkers like mypy or pyright unless special-cased.
Furthermore, array.array (which was explicitly supported in the previous version of the stubs) also lacks this method. Using typing_extensions.Buffer is the recommended way to represent the buffer protocol across all supported Python versions, as it is specifically recognized by type checkers to include all buffer-like types (including C-level implementations).
| from typing import Protocol, final | |
| # __buffer__ protocol makes this redundant on python 3.12+ | |
| Buffer.register(array.ArrayType) | |
| _InputType = Union[str, Buffer] | |
| class _Buffer(Protocol): | |
| """Objects that support the buffer protocol (PEP 688).""" | |
| def __buffer__(self, flags: int, /) -> memoryview: ... | |
| _DataType = _Buffer | |
| import array | |
| from typing import final | |
| from typing_extensions import Buffer | |
| # __buffer__ protocol makes this redundant on python 3.12+ | |
| Buffer.register(array.ArrayType) | |
| _DataType = Buffer |
| try: | ||
| return subprocess.run( | ||
| ["pyright", "--project", str(repo_root), tmp_path], | ||
| capture_output=True, | ||
| text=True, | ||
| cwd=repo_root, | ||
| timeout=30, | ||
| ) |
There was a problem hiding this comment.
The test will fail with a FileNotFoundError if the pyright executable is not installed in the environment. It is better to catch this exception and skip the test gracefully, especially for local development environments where pyright might not be globally available.
| try: | |
| return subprocess.run( | |
| ["pyright", "--project", str(repo_root), tmp_path], | |
| capture_output=True, | |
| text=True, | |
| cwd=repo_root, | |
| timeout=30, | |
| ) | |
| try: | |
| return subprocess.run( | |
| ["pyright", "--project", str(repo_root), tmp_path], | |
| capture_output=True, | |
| text=True, | |
| cwd=repo_root, | |
| timeout=30, | |
| ) | |
| except FileNotFoundError: | |
| self.skipTest("pyright executable not found") |
- Rename input -> data, _InputType -> _DataType in .pyi stubs - Add __all__ and algorithms_guaranteed to .pyi - Add test_stubs_pyright.py (pyright validation) - Migrate mypy.ini -> pyproject.toml [tool.mypy] - Add pyright to CI test dependencies - Remove Pipfile, Pipfile.lock, typecheck.sh - Bump version to 3.8.0.dev7
2c13672 to
fcd50e7
Compare
|
@gemini review |
There was a problem hiding this comment.
Code Review
This pull request migrates the mypy configuration to pyproject.toml, removes legacy Pipfile and mypy configuration files, and updates the type stubs in xxhash/__init__.pyi. The stub updates include renaming parameters to data, adding __all__, and replacing typing_extensions.Buffer with a custom _Buffer protocol based on PEP 688. Additionally, a new test suite has been added to validate these stubs using pyright. Feedback was provided regarding the custom _Buffer protocol, noting that it may cause type-checking errors on Python versions earlier than 3.12 and suggesting the use of typing_extensions.Buffer for better compatibility.
| from typing import Protocol, final | ||
|
|
||
| # __buffer__ protocol makes this redundant on python 3.12+ | ||
| Buffer.register(array.ArrayType) | ||
| _InputType = Union[str, Buffer] | ||
| class _Buffer(Protocol): | ||
| """Objects that support the buffer protocol (PEP 688).""" | ||
| def __buffer__(self, flags: int, /) -> memoryview: ... | ||
|
|
||
| _DataType = _Buffer |
There was a problem hiding this comment.
Using a custom Protocol with __buffer__ to represent the buffer protocol is based on PEP 688, which is fully supported in Python 3.12+. For older Python versions, built-in types like bytes, bytearray, and array.array do not have a __buffer__ method in their type stubs within typeshed. This can lead to false-positive type errors in type checkers (like mypy or pyright) when they are configured to target Python versions prior to 3.12.
Since this project appears to support Python 3.9+, it is recommended to use typing_extensions.Buffer. This is the standard, backported way to represent the buffer protocol across all supported Python versions and is correctly recognized by modern type checkers for all built-in buffer types.
from typing import final
from typing_extensions import Buffer
_DataType = Buffer
Changes
input→dataparameter in.pyistubs_InputType→_DataType(Buffer-only,stris rejected by C code)__all__to.pyimatching runtime exports__init__.pymissingalgorithms_guaranteedto stubsmypy.ini→pyproject.toml [tool.mypy]Pipfile,Pipfile.lock,typecheck.sh(no longer needed)mypyandpyrightto CI test dependencies