Add Py_hash_t Py_HashBytes(const void*, Py_ssize_t)
#13
Comments
This is necessary to implement a hashable object that compares equal to a I'd name the public function |
What's the behavior for a negative length? Currently, it seems that a negative length is cast to an unsigned length and so behaves oddly. I suggest to change the length type to unsigned I would be fine with Other examples of
In the C library, size_t is common. Examples:
|
All object lengths in Python are signed, including the I don't really care either way, but insisting on |
If you want to keep |
Negative length could for example be clamped to 0, and perhaps coupled with a debug-mode assertion? |
I'm fine with |
If the parameter type is not |
It's not a @pitrou, I see you've added a thumbs-up to my comment. IMO, the docs will be fairly important in this case; I'd really like to frame it as “implementing equality & hashes for (immutable) buffer objects” rather than “here's a function you can use”. Negatives aren't the only case of invalid sizes. IMO it's OK if |
Perhaps something like:

.. c:function:: Py_hash_t Py_HashBuffer(const void* ptr, Py_ssize_t len)

   Compute and return the hash value of a buffer of *len* bytes
   starting at address *ptr*. This hash value is guaranteed to be
   equal to the hash value of a :class:`bytes` object with the same
   contents.

   This function is meant to ease implementation of hashing for
   immutable objects providing the :ref:`buffer protocol <bufferobjects>`.

|
I meant something like:

.. c:function:: Py_hash_t Py_HashBuffer(const void* ptr, Py_ssize_t len)

   Compute and return the hash value of a buffer of *len* bytes
   starting at address *ptr*. The hash is guaranteed to match that of
   :class:`bytes`, :class:`memoryview`, and other built-in objects
   that implement the :ref:`buffer protocol <bufferobjects>`.

   Use this function to implement hashing for immutable
   objects whose `tp_richcompare` function compares
   to another object's buffer.

But the details can be hashed out in review. |
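The guarantee described in the draft wording above can be observed from Python today for the built-in types. The following is only a minimal sketch: the variable names and the `is_hashable` helper are made up for illustration, and it exercises existing `bytes` and `memoryview` behavior rather than the proposed C function.

```python
# Illustrative only: built-in buffer objects with equal contents hash equally.
data = b"hello buffer"
view = memoryview(data)  # read-only view over the same bytes

# A read-only, one-dimensional byte view hashes like the bytes it exposes.
same_hash = hash(view) == hash(data)

# The same holds for a slice of the view and a bytes object with equal contents.
slice_hash = hash(view[6:]) == hash(b"buffer")


def is_hashable(obj):
    """Hypothetical helper: report whether hash(obj) succeeds."""
    try:
        hash(obj)
    except (TypeError, ValueError):
        return False
    return True


# Views over mutable storage are not hashable, which is why the draft
# wording restricts itself to immutable objects.
writable_ok = is_hashable(memoryview(bytearray(data)))
```

The writable case matters for the documentation framing: a hash derived from buffer contents is only meaningful if those contents cannot change.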
Well, I think this is overdue for a vote:
|
Ping @iritkatriel who didn't vote. |
Sorry. |
@pitrou: You can go ahead with |
CPython has an internal API `Py_hash_t _Py_HashBytes(const void*, Py_ssize_t)` that implements hashing of a buffer of bytes, consistently with the hash output of the `bytes` object. It was added (by me) in python/cpython@ce4a9da.

It is currently used internally for hashing `bytes` objects (of course), but also `str` objects, `memoryview` objects, some `datetime` objects, and a couple other duties:

Third-party libraries may want to define buffer-like objects and ensure that they are hashable in a way that's compatible with built-in `bytes` objects. Currently this would mean relying on the aforementioned internal API. An example I'm familiar with is the `Buffer` object in PyArrow.

I simply propose making the API public and renaming it to `Py_HashBytes`, such that third-party libraries have access to the same facility.
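To make the use case concrete, here is a pure-Python sketch of the invariant a third-party buffer-like type would want. The `ImmutableBuffer` class is hypothetical and is not PyArrow's `Buffer`. A C extension would presumably call the function proposed here from its `tp_hash` slot, whereas this sketch simply delegates to `hash()` on a `bytes` copy.

```python
class ImmutableBuffer:
    """Hypothetical immutable buffer-like object (illustration only)."""

    def __init__(self, data):
        # Copy into bytes so the contents can never change after construction.
        self._data = bytes(data)

    def __eq__(self, other):
        if isinstance(other, ImmutableBuffer):
            return self._data == other._data
        if isinstance(other, (bytes, memoryview)):
            # Compare by contents, as a buffer-aware tp_richcompare would.
            return self._data == bytes(other)
        return NotImplemented

    def __hash__(self):
        # Objects that compare equal must hash equal, so reuse the bytes hash.
        return hash(self._data)


buf = ImmutableBuffer(b"abc")
table = {b"abc": 1}

# Because hash and equality agree with bytes, the custom object can be used
# to look up an entry that was stored under a bytes key.
found = table[buf]
```

Without matching hashes, `buf == b"abc"` would still be true, but the dictionary lookup would fail, which is the inconsistency the proposal is meant to let extension authors avoid.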