New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a sane multiplier in @EqualsAndHashCode #660

Closed
lombokissues opened this Issue Jul 14, 2015 · 8 comments

Comments

Projects
None yet
1 participant
@lombokissues
Collaborator

lombokissues commented Jul 14, 2015

Migrated from Google Code (issue 625)

@lombokissues

This comment has been minimized.

Show comment
Hide comment
@lombokissues

lombokissues Jul 14, 2015

Collaborator

👤 Maaartinus   🕗 Jan 09, 2014 at 16:09 UTC

For whatever crazy reason, Java uses in many places the "proven multiplier" 31 in the hashCode computation. This leads to more collisions than necessary, partly because of the unfortunate number, but mainly because of re-using this number in different places.

My attached example shows collisions due to using 31 in String, List, and lombok-generated hashCode. For the 256 generated objects, there are only 64 unique hashes. If lombok used a better multiplier, this number would increase to 144.

Here, nearly any odd number would do. Using 31 is one of the few bad choices.

Collaborator

lombokissues commented Jul 14, 2015

👤 Maaartinus   🕗 Jan 09, 2014 at 16:09 UTC

For whatever crazy reason, Java uses in many places the "proven multiplier" 31 in the hashCode computation. This leads to more collisions than necessary, partly because of the unfortunate number, but mainly because of re-using this number in different places.

My attached example shows collisions due to using 31 in String, List, and lombok-generated hashCode. For the 256 generated objects, there are only 64 unique hashes. If lombok used a better multiplier, this number would increase to 144.

Here, nearly any odd number would do. Using 31 is one of the few bad choices.

@lombokissues

This comment has been minimized.

Show comment
Hide comment
@lombokissues

lombokissues Jul 14, 2015

Collaborator

👤 Maaartinus   🕗 Jan 09, 2014 at 16:09 UTC

🔗 LombokHashCodeDemo.java View file

Collaborator

lombokissues commented Jul 14, 2015

👤 Maaartinus   🕗 Jan 09, 2014 at 16:09 UTC

🔗 LombokHashCodeDemo.java View file

@lombokissues

This comment has been minimized.

Show comment
Hide comment
@lombokissues

lombokissues Jul 14, 2015

Collaborator

👤 reinierz   🕗 Jan 09, 2014 at 20:27 UTC

Such a shame that THIS issue isn't itself a prime number, or I would have used that. Should have waited until 631 ;P

There are some caveats to just changing the implementation on a dime, but, you made your case quite well and we gotta go through this pain at some point, right?

Crossing my fingers that this won't break stuff, I've committed to a randomly chosen shortish prime number (277 is the lucky winner). While I was there, I also updated the primes used to stuff booleans into hashCode (they, too, were directly copied from Effective Java, which is where the '31' comes from and probably explains why its use is so widespread). Instead of 1231 and 1237, now we use 2591 and 2609.

I don't think there's a need to spin out an edge release for this update, but, we'll release soon, so, coming to a maven repo near you soon :P

Collaborator

lombokissues commented Jul 14, 2015

👤 reinierz   🕗 Jan 09, 2014 at 20:27 UTC

Such a shame that THIS issue isn't itself a prime number, or I would have used that. Should have waited until 631 ;P

There are some caveats to just changing the implementation on a dime, but, you made your case quite well and we gotta go through this pain at some point, right?

Crossing my fingers that this won't break stuff, I've committed to a randomly chosen shortish prime number (277 is the lucky winner). While I was there, I also updated the primes used to stuff booleans into hashCode (they, too, were directly copied from Effective Java, which is where the '31' comes from and probably explains why its use is so widespread). Instead of 1231 and 1237, now we use 2591 and 2609.

I don't think there's a need to spin out an edge release for this update, but, we'll release soon, so, coming to a maven repo near you soon :P

@lombokissues

This comment has been minimized.

Show comment
Hide comment
@lombokissues

lombokissues Jul 14, 2015

Collaborator

👤 reinierz   🕗 Jan 15, 2014 at 21:26 UTC

This fix is now part of the latest official release. 1.12.4 was just pushed for download and to maven central.

Collaborator

lombokissues commented Jul 14, 2015

👤 reinierz   🕗 Jan 15, 2014 at 21:26 UTC

This fix is now part of the latest official release. 1.12.4 was just pushed for download and to maven central.

@lombokissues

This comment has been minimized.

Show comment
Hide comment
@lombokissues

lombokissues Jul 14, 2015

Collaborator

👤 Maaartinus   🕗 Jan 15, 2014 at 23:23 UTC

Surprisingly, there's a tiny speed penalty for using such a "huge" multiplier as 227. This penalty is 1-2% in the case of integer pair (which is I guess the maximum as it minimizes the other computations). I'd bet that in any realistic scenario, this is more than offset by the reduction of hash collisions, but I'd suggest to change the multiplier again to avoid this penalty.

An amd86 CPU can encode an immediate operand in a single byte if it fits, otherwise it uses as many bytes as the instruction works with. Since the byte is signed, 227 doesn't fit in, so four bytes instead of one get used, which may make the instruction fetch to a bottleneck. I was curious if this can be actually measured in Java, and it looks like it can and the results are fairly consistent:
https://microbenchmarks.appspot.com/runs/9237a1d7-6836-441e-ab77-7f4ab218b8b7﹟r:scenario.benchmarkSpec.parameters.seed,scenario.benchmarkSpec.parameters.use227

I'd suggest to use the biggest prime fitting in a signed byte, i.e., 113.

Collaborator

lombokissues commented Jul 14, 2015

👤 Maaartinus   🕗 Jan 15, 2014 at 23:23 UTC

Surprisingly, there's a tiny speed penalty for using such a "huge" multiplier as 227. This penalty is 1-2% in the case of integer pair (which is I guess the maximum as it minimizes the other computations). I'd bet that in any realistic scenario, this is more than offset by the reduction of hash collisions, but I'd suggest to change the multiplier again to avoid this penalty.

An amd86 CPU can encode an immediate operand in a single byte if it fits, otherwise it uses as many bytes as the instruction works with. Since the byte is signed, 227 doesn't fit in, so four bytes instead of one get used, which may make the instruction fetch to a bottleneck. I was curious if this can be actually measured in Java, and it looks like it can and the results are fairly consistent:
https://microbenchmarks.appspot.com/runs/9237a1d7-6836-441e-ab77-7f4ab218b8b7﹟r:scenario.benchmarkSpec.parameters.seed,scenario.benchmarkSpec.parameters.use227

I'd suggest to use the biggest prime fitting in a signed byte, i.e., 113.

@lombokissues

This comment has been minimized.

Show comment
Hide comment
@lombokissues

lombokissues Jul 14, 2015

Collaborator

👤 Maaartinus   🕗 Jan 15, 2014 at 23:23 UTC

🔗 LombokHashCodeBenchmark.java View file

Collaborator

lombokissues commented Jul 14, 2015

👤 Maaartinus   🕗 Jan 15, 2014 at 23:23 UTC

🔗 LombokHashCodeBenchmark.java View file

@lombokissues

This comment has been minimized.

Show comment
Hide comment
@lombokissues

lombokissues Jul 14, 2015

Collaborator

👤 r.spilker   🕗 Jan 19, 2014 at 20:53 UTC

We now chose, using pseudo random selection, 59 for the PRIME and 79 and 97 for boolean true and false. It is available in the edge release: https://projectlombok.org/download-edge.html

Collaborator

lombokissues commented Jul 14, 2015

👤 r.spilker   🕗 Jan 19, 2014 at 20:53 UTC

We now chose, using pseudo random selection, 59 for the PRIME and 79 and 97 for boolean true and false. It is available in the edge release: https://projectlombok.org/download-edge.html

@lombokissues

This comment has been minimized.

Show comment
Hide comment
@lombokissues

lombokissues Jul 14, 2015

Collaborator

End of migration

Collaborator

lombokissues commented Jul 14, 2015

End of migration

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment