Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scala.util.hashing.MurmurHash3 produces different outputs on 2.12 and 2.13 #11646

Open
pshirshov opened this issue Jul 20, 2019 · 2 comments

Comments

@pshirshov
Copy link

commented Jul 20, 2019

The following code

  def hash(bytes: Array[Byte]): Array[Byte] = {
    BigInt(MurmurHash3.arrayHash(bytes)).toByteArray
  }
  hash("abc".getBytes(StandardCharsets.UTF_8))

produces different output on 2.12 and 2.13. This is not something expected to happen. In case the implementation is considered unstable/incompatible with others it must be explicitly stated in the documentation.

@Ichoran

This comment has been minimized.

Copy link

commented Jul 20, 2019

I think this is a documentation issue--hashes of this nature (fast, non-cryptographic) tend to evolve rapidly and shouldn't be relied upon to be the same from version to version (not even in point releases!) because annoying weaknesses are often detected and fixed, and you don't want to be (dangerously) tied to old problematic versions for a long time.

Also, it would be good to have a stable (if perhaps deprecated) version that gives you a guaranteed algorithm, even if it is weak. But that's a longer-term fix.

In the short run, assuming that the algorithm matches what it's supposed to, we should just fix the documentation to explain that hashes can't be relied upon across versions, with a brief note about why. (I think for people who work seriously with hashing this would be the expectation, but a lot of people use hashes without becoming hashing experts, and we should help them out.)

@pshirshov

This comment has been minimized.

Copy link
Author

commented Jul 20, 2019

I agree, documentation fix would work. At least it will allow people to avoid unnecessary confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.