Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provider stronger hash codes #39

Closed
davidmorgan opened this issue Nov 11, 2019 · 8 comments
Closed

Provider stronger hash codes #39

davidmorgan opened this issue Nov 11, 2019 · 8 comments
Assignees
Labels
Projects

Comments

@davidmorgan
Copy link

@davidmorgan davidmorgan commented Nov 11, 2019

I was looking for ways to ease operator == and hashCode in Dart and found equatable.

It seems a reasonable approach overall.

But, looking at the implementation, it looks like it uses XOR to combine hashes. This would mean, for example, that

class Point {
  final int x = 3;
  final int y = 4;
}

and

class Point {
  final int x = 4;
  final int y = 3;
}

will hash to the same value. Did I miss something?

I think the correct approach would be to use something like

https://en.wikipedia.org/wiki/Jenkins_hash_function

--which is what package:quiver uses

https://github.com/google/quiver-dart/blob/master/lib/src/core/hash.dart

Good news is it should be possible to add this within the current implementation with no changes to how it's used :)

Thanks!

@felangel

This comment has been minimized.

Copy link
Owner

@felangel felangel commented Nov 11, 2019

Hi @davidmorgan 👋
Thanks for opening an issue!

You bring up a great point and we can definitely make this improvement. I would love to review a pull request if you want to contribute or alternatively, I can try to make this adjustment over the next few days 👍

Thanks again!

@felangel felangel added this to To do in equatable Nov 11, 2019
@davidmorgan

This comment has been minimized.

Copy link
Author

@davidmorgan davidmorgan commented Nov 13, 2019

I currently have a cold so I'm sticking to email right now ;)

BTW, the reason I was looking into this is I'm considering proposing a lint recommending that people not implement operator== and hashCode by hand; I was looking for libraries to point people to as better. I wrote built_value which is one such library, but I didn't want to just point people to my own library :D

I wonder if you know of any others?

@felangel

This comment has been minimized.

Copy link
Owner

@felangel felangel commented Nov 14, 2019

@davidmorgan hope you feel better! That's awesome, I think a lint would be really nice to have. I honestly only know of built_value and equatable as well 😛.

BTW I was thinking about your proposal and am wondering if there's a concrete use-case where XOR would not work? I agree that it theoretically doesn't satisfy all cases, but given the props override is part of the class implementation I don't think there would be a problem unless props dynamically builds the list and the order changes at runtime and even then runtimeType is included in the hashCode generation so it shouldn't be a problem.

I created a gist to test this behavior and it seems to work fine unless I'm missing something.

Thoughts?

@felangel felangel self-assigned this Nov 14, 2019
@davidmorgan

This comment has been minimized.

Copy link
Author

@davidmorgan davidmorgan commented Nov 14, 2019

The problem is that naturally occurring data might lead to a large number of hash collisions.

For example, if you have a Set of Point objects that represent two diagonal lines:

(1, 2), (2, 4), (3, 6), (4, 8), (2, 1), (4, 2), (6, 3), (8, 4)

then these eight objects hash to only four hash codes.

Another natural example would be if you have one field for first name, one field for last name; and then someone called Morgan David hashes to the same code as David Morgan.

Jenkins hash should be very fast to compute--just a little slower than xor by itself--but distributes the hash keys much more evenly, pretty much removing cases where particular patterns in the data can lead to hash collisions. (i.e. the data that leads to hash collisions looks like random data, and you can only hit it either by coincidence--rare enough that it doesn't matter--or by malicious input crafted to cause hash collisions--which is a whole other conversation).

@felangel

This comment has been minimized.

Copy link
Owner

@felangel felangel commented Nov 14, 2019

Makes sense, thanks for clarifying 💯

@felangel felangel mentioned this issue Nov 17, 2019
3 of 3 tasks complete
@felangel felangel moved this from To do to In progress in equatable Nov 17, 2019
@felangel

This comment has been minimized.

Copy link
Owner

@felangel felangel commented Nov 18, 2019

Done in #40 and will be released as part of v1.0.0

@felangel felangel closed this Nov 18, 2019
@felangel felangel moved this from In progress to Done in equatable Nov 18, 2019
@Renesanse

This comment has been minimized.

Copy link

@Renesanse Renesanse commented Nov 19, 2019

Thank you @davidmorgan for improvements!

@davidmorgan

This comment has been minimized.

Copy link
Author

@davidmorgan davidmorgan commented Nov 19, 2019

Awesome, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
equatable
  
Done
3 participants
You can’t perform that action at this time.