Rename Hashtbl.SeededHashedType.{hash => seeded_hash}
#11157
I forgot about this issue, thanks for resurrecting it! I like the proposed change very much, as it makes it possible to export both kinds of hash function from the same module. In turn this facilitates the use of seeded hash tables.
cc @ygrek @thierry-martinez @anmonteiro @pqwy @aantron the maintainers of some/most of the affected packages (not sure who is a contact for the Tezos codebase)
Also cc @raphael-proust @yurug for the Tezos packages
AFAICT, it'd be relatively easy to adapt […]. I'm surprised […]. My main question is: how can we make a library that's compatible with both the existing interface and the proposed interface? In the general case, do we need a preprocessor with an OCaml version check?
I don't see an easy way without a preprocessor. Another possibility is to depend on a compatibility shim such as https://github.com/thierry-martinez/stdcompat (which itself implements the preprocessor logic).
It does not seem to use seeded hash tables.
I would be very happy with this change, and I think that compatibility shims (generic existing ones and/or a dedicated one) will be enough to have a convenient transition path.
There is buy-in from the maintainers of most of the affected packages, so I think this PR is on the right path. We still need an official review or two of this easy PR. Perhaps @alainfrisch and @gasche could do the honours?
I have approved the PR because I believe that the implementation is correct.
In the long run, I don't know if the strategy of picking a seed at table-creation time is the right way to protect ourselves from collision attacks. Also, I don't know if hash has the right API: as a user having to write hash functions, I have often wished for a "digest"-style API where we take as input a hash state and we feed it new data, with a separate finalization step. This is all fairly orthogonal from the present PR -- even though it suggests that the hash API may want to evolve again in the future.
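For readers unfamiliar with the idea, here is a minimal sketch of what such a "digest"-style interface could look like. Everything in it is hypothetical: the module name Hash_state, its functions, and the toy mixing function are assumptions made for illustration, not part of the standard library or of this PR, and the mixing is not collision-resistant.

```ocaml
(* Hypothetical digest-style hashing interface: initialise a state from a
   seed, feed it data piece by piece, then finalize it into an int. *)
module Hash_state : sig
  type t
  val init : seed:int -> t
  val feed_int : t -> int -> t
  val feed_string : t -> string -> t
  val finalize : t -> int
end = struct
  type t = int

  (* Toy mixing step, for illustration only. *)
  let mix h x = (h * 31) + x

  let init ~seed = mix 17 seed
  let feed_int h n = mix h n

  (* Feed the length first, then each byte of the string. *)
  let feed_string h s =
    String.fold_left
      (fun acc c -> mix acc (Char.code c))
      (mix h (String.length s))
      s

  (* Clear the sign bit so the result is non-negative. *)
  let finalize h = h land max_int
end

type point = { x : int; label : string }

(* A user-written seeded hash for a record, built field by field. *)
let seeded_hash_point (seed : int) (p : point) : int =
  let st = Hash_state.init ~seed in
  let st = Hash_state.feed_int st p.x in
  let st = Hash_state.feed_string st p.label in
  Hash_state.finalize st
```

With an interface of this shape, the author of a key type only decides which fields to feed and in what order; seeding and finalization stay in one place.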
That's the usual approach, AFAIK. Re-seeding an existing table needs a complete rebuild anyway.
This would be nice but is completely orthogonal to the issue at hand. A first step would be to design this API as a separate library, implemented using the C functions from […]
Here is a second approval, if it can help :-)
Yes, it does, thanks! Will merge soon.
With this change it is possible to define both seeded and unseeded hash functions in the same module. This breaks code instantiating the Hashtbl.MakeSeeded functor, but an OPAM-wide grep shows that not too many packages would be affected:
The list only contains the packages directly affected; other packages which vendor one of the previous ones would be affected by transitivity:
Unblocks #8878 #10259
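
As an illustration of the description above, here is a minimal sketch of a key module that exports both kinds of hash function and is passed to both functors. It assumes a compiler that includes this rename; the names String_key, Plain_tbl, and Seeded_tbl are made up for the example.

```ocaml
(* One key module exporting both an unseeded and a seeded hash function. *)
module String_key = struct
  type t = string
  let equal = String.equal
  (* Used by Hashtbl.Make *)
  let hash (s : t) : int = Hashtbl.hash s
  (* Used by Hashtbl.MakeSeeded after the rename *)
  let seeded_hash (seed : int) (s : t) : int = Hashtbl.seeded_hash seed s
end

(* The same module now satisfies both functor argument signatures. *)
module Plain_tbl = Hashtbl.Make (String_key)
module Seeded_tbl = Hashtbl.MakeSeeded (String_key)

let plain : int Plain_tbl.t = Plain_tbl.create 16
let seeded : int Seeded_tbl.t = Seeded_tbl.create ~random:true 16

let () =
  Plain_tbl.replace plain "ocaml" 1;
  Seeded_tbl.replace seeded "ocaml" 2
```

Before the rename, both signatures required a value named hash with different types, so a single module could not be passed to both functors.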