Max Hashes #34

Open
wolfospealain opened this issue Jul 7, 2018 · 1 comment

Comments

@wolfospealain
Collaborator

What is the logic behind the max_hashes of 1024 * 128?

@chadnetzer
Collaborator

MAX_HASHES is likely an artifact of the port from C++, which probably used a static hash table (C++ had no standard hash table until `std::unordered_map` arrived in C++11). It's totally superfluous in Python, and the branch I'm currently working on eliminates it entirely. Python dictionaries grow dynamically and do the hashing for you, so it makes little sense to hash ahead of time (except possibly to make a compact string hash to use with shelve or anydbm). And since the table stores all the file entries anyway, limiting the range of hash values doesn't really save any memory.
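To make the point concrete, here is a minimal sketch, not the project's actual code. The function names, the `files` mapping, and the choice of SHA-256 are all assumptions for illustration. It compares grouping files through a capped bucket index with grouping them directly in a dict keyed by the digest.

```python
import hashlib
from collections import defaultdict

MAX_HASHES = 1024 * 128  # the cap under discussion


def digest(data: bytes) -> str:
    """Content digest; SHA-256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(data).hexdigest()


def group_capped(files):
    """Hypothetical ported style: reduce each digest to a bucket in [0, MAX_HASHES)."""
    table = defaultdict(list)
    for name, data in files.items():
        bucket = int(digest(data), 16) % MAX_HASHES
        table[bucket].append(name)  # unrelated files can share a bucket
    return table


def group_plain(files):
    """Let the dict do the hashing: key directly on the digest string."""
    table = defaultdict(list)
    for name, data in files.items():
        table[digest(data)].append(name)
    return table


files = {"a.txt": b"same", "b.txt": b"same", "c.txt": b"different"}
plain = group_plain(files)
capped = group_capped(files)
```

Both tables hold every file name, so the cap saves nothing on entries. It only adds a chance that files with different contents land in the same bucket.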

As for the actual value (1024 * 128), I suspect that since C++, unlike Python, has no exponentiation operator, it was left as a multiplication to show visually that MAX_HASHES is a power of two (2**17).
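A quick check of that arithmetic in Python, as a sketch (the variable name mirrors the constant discussed here):

```python
MAX_HASHES = 1024 * 128

# 1024 is 2**10 and 128 is 2**7, so the product is 2**17.
assert MAX_HASHES == 2 ** 17 == 1 << 17 == 131072

# A power of two has exactly one bit set, so n & (n - 1) is zero.
assert MAX_HASHES & (MAX_HASHES - 1) == 0
```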

In any case, it is a vestige and can be removed.
