mmh3 not 64-bit ready #10

Closed
pitrou opened this issue May 24, 2017 · 1 comment

pitrou commented May 24, 2017

mmh3 cannot hash data of 2**31 bytes or more, because the buffer length has to fit in a C int:

>>> import mmh3
>>> import numpy as np
>>> a = np.zeros(2**30, dtype=np.int8)
>>> mmh3.hash_bytes(a)
b"O\xc5\xf1\xf2\x80';s\x1b\xddc\xa1E\x8d\xe3r"
>>> a = np.zeros(2**32, dtype=np.int8)
>>> mmh3.hash_bytes(a)
Traceback (most recent call last):
  File "<ipython-input-9-918a38167947>", line 1, in <module>
    mmh3.hash_bytes(a)
OverflowError: size does not fit in an int

The solution is either to use the s* format code instead of s# in PyArg_ParseTuple(), or to define the PY_SSIZE_T_CLEAN macro and change the size fields from int to Py_ssize_t. See https://docs.python.org/2.7/c-api/arg.html . I can also make a PR if you want.
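To illustrate why the length type matters, here is a minimal sketch that checks which buffer sizes fit in a C int versus an ssize_t, using only the standard `struct` module. The helper name `fits` is hypothetical and not part of mmh3; the sketch assumes a platform where C int is 32 bits, and the ssize_t column is only wider on a 64-bit build.

```python
import struct


def fits(fmt: str, size: int) -> bool:
    """Return True if `size` can be packed into the native C type `fmt`."""
    try:
        struct.pack(fmt, size)
        return True
    except struct.error:
        # struct refuses values outside the range of the C type.
        return False


# "i" is a C int; "n" is ssize_t, which Py_ssize_t matches on common platforms.
for size in (2**30, 2**31 - 1, 2**31, 2**32):
    print(size, "int:", fits("i", size), "ssize_t:", fits("n", size))
```

With a 32-bit int, lengths up to 2**31 - 1 fit and anything larger does not, which is consistent with the OverflowError in the traceback above.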

Also, there's no test suite?

hajimes (Owner) commented May 27, 2017

Hi pitrou. Thank you so much for your suggestion and providing concrete solutions to improve the code. I just updated this library to incorporate your suggestion. Feel free to ask me if you still need other features.

hajimes closed this as completed May 27, 2017