Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRILL-5816: Hash function produces skewed results on String values wi… #959

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
3 participants
@sohami
Copy link
Contributor

commented Sep 25, 2017

…th same leading prefix

        Note: Changing hash32 computation to use Murmur3.hash32 instead of int casted version of Murmur3.hash64
DRILL-5816: Hash function produces skewed results on String values wi…
…th same leading prefix

            Note: Changing hash32 computation to use Murmur3.hash32 instead of int casted version of Murmur3.hash64

@sohami sohami force-pushed the sohami:DRILL-5816 branch from 4b9dd5b to 711f750 Sep 25, 2017

@sohami

This comment has been minimized.

Copy link
Contributor Author

commented Sep 25, 2017

@amansinha100 - Please help to review.

@paul-rogers

This comment has been minimized.

Copy link
Contributor

commented Sep 26, 2017

The change looks benign. I wonder; do we have a test case that exercises the hash functions so we can be certain that the fix actually solves the original problem?

@sohami

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2017

@paul-rogers - I have added few tests and findings using hash32 and hash64 to compute the 32 bit hash codes in JIRA.

@amansinha100

This comment has been minimized.

Copy link
Contributor

commented Sep 26, 2017

lgtm. +1. @sohami could you pls check with @chunhui-shi if unit tests were added previously for DRILL-4237 (another skew issue)? If not, could you create a new JIRA to add such tests, which should include the data set for DRILL-4237, DRILL-4119, DRILL-5816 among others.

@sohami

This comment has been minimized.

Copy link
Contributor Author

commented Sep 29, 2017

I have created DRILL-5287 for adding tests.

@asfgit asfgit closed this in d59421b Oct 2, 2017

ilooner pushed a commit to ilooner/drill that referenced this pull request Oct 18, 2017

DRILL-5816: Hash function produces skewed results on String values wi…
…th same leading prefix

            Note: Changing hash32 computation to use Murmur3.hash32 instead of int casted version of Murmur3.hash64

closes apache#959
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.