Rename `Hashtbl.SeededHashedType.{hash => seeded_hash}` #11157

nojb · 2022-04-04T12:06:47Z

With this change it is possible to define both seeded and unseeded hash functions in the same module. This breaks code instantiating the Hashtbl.MakeSeeded functor, but an OPAM-wide grep shows that not too many packages would be affected:

distributed
extlib
h2
lru
memtrace
ocaml-in-python
stdcompat
tezos-lwt-result-stdlib

The list only contains the packages directly affected; other packages that vendor one of the above would be affected transitively:

coccinelle (stdcompat)
dream (h2)
piaf (h2)
ocaml-gist (h2)
tezos (all packages) (tezos-lwt-result-stdlib)

Unblocks #8878 #10259
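
For illustration, here is a minimal sketch of what the rename allows, assuming OCaml 5.0 or later (where `Hashtbl.SeededHashedType` expects `seeded_hash`). The module name `Key` and the table names are hypothetical; both hash functions simply delegate to the polymorphic `Hashtbl.hash` and `Hashtbl.seeded_hash`.

```ocaml
(* Assumes OCaml >= 5.0, where SeededHashedType expects [seeded_hash]. *)
module Key = struct
  type t = string
  let equal = String.equal
  (* Unseeded hash, as required by Hashtbl.HashedType. *)
  let hash (s : t) = Hashtbl.hash s
  (* Seeded hash, as required by Hashtbl.SeededHashedType after the rename. *)
  let seeded_hash (seed : int) (s : t) = Hashtbl.seeded_hash seed s
end

(* The same module can now be passed to both functors. *)
module Plain = Hashtbl.Make (Key)
module Seeded = Hashtbl.MakeSeeded (Key)

let plain : int Plain.t = Plain.create 16
let seeded : int Seeded.t = Seeded.create ~random:true 16

let () =
  Plain.replace plain "x" 1;
  Seeded.replace seeded "x" 2
```

Before the rename, both signatures required a value named `hash` with different types, so a single module could not satisfy both.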

xavierleroy · 2022-04-04T14:43:03Z

I forgot about this issue, thanks for resurrecting it! I like the proposed change very much, as it makes it possible to export both kinds of hash function from the same module. In turn this facilitates the use of seeded hash tables.

nojb · 2022-04-04T15:15:58Z

cc @ygrek @thierry-martinez @anmonteiro @pqwy @aantron the maintainers of some/most of the affected packages (not sure who is a contact for the Tezos codebase)

nojb · 2022-04-05T07:28:13Z

Also cc @raphael-proust @yurug for the Tezos packages

raphael-proust · 2022-04-05T07:48:41Z

AFAICT, it'd be relatively easy to adapt tezos-lwt-result-stdlib.
There might be some other minor things that need to be adapted in the tezos codebase but not much.

I'm surprised ringo is not affected. I'll have a look into this.

My main question is: How can we make a library that's compatible with both the existing interface and the proposed interface? In the general case do we need a preprocessor with an OCaml version check?

nojb · 2022-04-05T07:53:33Z

> My main question is: How can we make a library that's compatible with both the existing interface and the proposed interface? In the general case do we need a preprocessor with an OCaml version check?

I don't see an easy way without a preprocessor. Another possibility is to depend on a compatibility shim such as https://github.com/thierry-martinez/stdcompat (which itself implements the preprocessor logic).
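
For the narrower case of a module that is only ever passed to `Hashtbl.MakeSeeded`, one preprocessor-free possibility can be sketched as follows. This is an assumption of this note, not something proposed in the thread: it relies on functor application ignoring extra items, so a module exposing the seeded function under both names should match the old and the new signature. The names `Key` and `Tbl` are hypothetical.

```ocaml
module Key = struct
  type t = int
  let equal = Int.equal
  (* Name expected by Hashtbl.SeededHashedType after the rename. *)
  let seeded_hash (seed : int) (x : t) = Hashtbl.seeded_hash seed x
  (* Name expected before the rename; same seeded type, so the module
     matches whichever of the two signatures the compiler provides. *)
  let hash = seeded_hash
end

module Tbl = Hashtbl.MakeSeeded (Key)

let tbl : string Tbl.t = Tbl.create ~random:true 16
let () = Tbl.replace tbl 42 "answer"
```

The cost is that `hash` here has the seeded type, so `Key` cannot also be passed to `Hashtbl.Make`; modules that want both kinds of hash function still need a version check or a shim.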

nojb · 2022-04-05T07:54:53Z

> I'm surprised ringo is not affected.

It does not seem to use seeded hash tables.

thierry-martinez · 2022-04-05T10:15:44Z

I would be very happy with this change, and I think that compatibility shims (generic existing ones and/or a dedicated one) will be enough to have a convenient transition path.

nojb · 2022-04-06T06:59:40Z

There is buy-in from the maintainers of most of the affected packages, so I think this PR is on the right path. We still need an official review or two of this easy PR. Perhaps @alainfrisch and @gasche could do the honours?

gasche

I have approved the PR because I believe that the implementation is correct.

In the long run, I don't know if the strategy of picking a seed at table-creation time is the right way to protect ourselves from collision attacks. Also, I don't know if hash has the right API: as a user having to write hash functions, I have often wished for a "digest"-style API where we take as input a hash state and we feed it new data, with a separate finalization step. This is all fairly orthogonal to the present PR, even though it suggests that the hash API may want to evolve again in the future.

xavierleroy · 2022-04-06T15:43:55Z

> I don't know if the strategy of picking a seed at table-creation time is the right way to protect ourselves from collision attacks

That's the usual approach, AFAIK. Re-seeding an existing table needs a complete rebuild anyway.
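
As a small illustration with the generic interface (a sketch, assuming OCaml 4.12 or later for `Hashtbl.rebuild`): the seed is chosen by `Hashtbl.create ~random:true`, and obtaining a table with a different seed means re-hashing every binding into a new table.

```ocaml
(* The seed is picked once, when the table is created. *)
let t : (string, int) Hashtbl.t = Hashtbl.create ~random:true 16

let () =
  List.iter (fun (k, v) -> Hashtbl.replace t k v)
    [ ("a", 1); ("b", 2); ("c", 3) ]

(* Re-seeding: Hashtbl.rebuild re-hashes all bindings into a fresh table,
   which is randomized here because ~random:true is passed. *)
let t' = Hashtbl.rebuild ~random:true t
```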

> Also, I don't know if hash has the right API: as a user having to write hash functions, I have often wished for a "digest"-style API where we take as input a hash state and we feed it new data, with a separate finalization step.

This would be nice but is completely orthogonal to the issue at hand. A first step would be to design this API as a separate library, implemented using the C functions from <caml/hash.h>, or just an OCaml reimplementation of MurmurHash.
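
To make the idea concrete, here is a rough sketch of what such a "digest"-style interface could look like. Everything in it is hypothetical: the module name `Hash_state`, its functions, and the constants are made up for illustration, and the mixing step is a simple FNV-style multiply-and-xor, not MurmurHash and not the functions from <caml/hash.h>. It assumes OCaml 4.13 or later for `String.fold_left`.

```ocaml
module Hash_state : sig
  type t
  val init : int -> t                (* start from a seed *)
  val add_int : int -> t -> t        (* feed an integer *)
  val add_string : string -> t -> t  (* feed a string *)
  val finalize : t -> int            (* produce the final hash *)
end = struct
  type t = int
  (* Illustrative mixing step with arbitrary constants; a real library
     would use a properly designed mixer. *)
  let mix h x = (h lxor x) * 0x01000193
  let init seed = mix 0x1234567 seed
  let add_int x h = mix h x
  let add_string s h =
    let h = String.fold_left (fun h c -> mix h (Char.code c)) h s in
    (* Also feed the length so that string boundaries affect the result. *)
    mix h (String.length s)
  (* Clear the sign bit so the result is non-negative. *)
  let finalize h = h land max_int
end

type point = { x : int; y : int; label : string }

(* A seeded hash for a record, written by feeding its fields in order. *)
let seeded_hash_point seed p =
  Hash_state.(
    init seed |> add_int p.x |> add_int p.y |> add_string p.label |> finalize)
```

A function like `seeded_hash_point` has the type expected for `seeded_hash` in `Hashtbl.SeededHashedType`, so a state-based API of this kind could sit alongside the existing signatures rather than replace them.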

xavierleroy

Here is a second approval, if it can help :-)

nojb · 2022-04-06T17:30:04Z

> Here is a second approval, if it can help :-)

Yes, it does, thanks!

Will merge soon.

nojb force-pushed the seeded_hash branch from 8b20352 to b4c1856 on April 4, 2022 12:30

Hashtbl.SeededHashedType.{hash => seeded_hash}

b6cdd19

nojb force-pushed the seeded_hash branch from b4c1856 to a6bdd14 on April 4, 2022 12:30

gasche approved these changes Apr 6, 2022

xavierleroy approved these changes Apr 6, 2022

Changes

5cb0493

nojb force-pushed the seeded_hash branch from a6bdd14 to 5cb0493 on April 6, 2022 17:29

nojb merged commit eaa2cb3 into ocaml:trunk Apr 6, 2022

nojb deleted the seeded_hash branch April 6, 2022 17:49

nojb mentioned this pull request Apr 6, 2022

Add String.hash and String.seeded_hash #8878

Merged

patricoferris mentioned this pull request Apr 9, 2022

OCaml 5.0.0 Seeded Hash Compatibility anmonteiro/ocaml-h2#167

Closed

nojb mentioned this pull request May 9, 2022

Add hash, seeded_hash to Int, Char, Bool, Float, Int32, Int64, Nativeint #11246

Merged
