Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PUBDEV-6319: Fix a race condition in group-by implementation #3725

Merged
merged 3 commits into from Jul 25, 2019

Conversation

michalkurka
Copy link
Contributor

No description provided.

@michalkurka michalkurka force-pushed the michalk_fix-group-by branch 3 times, most recently from 1cfa766 to 41b0ca1 Compare July 23, 2019 18:26
@michalkurka michalkurka changed the base branch from rel-yau to master July 23, 2019 18:30
michalkurka added 3 commits July 24, 2019 10:40
This makes sure data is not duplicated when serialized,
this can still have memory impact because keys are not guaranteed to stay
the same as the values.
@michalkurka michalkurka changed the title [DO-NOT-MERGE] Correct usage of NonBlockingHashMap in AstGroup PUBDEV-6319: Fix a race condition in group-by implementation Jul 24, 2019
Copy link
Contributor

@wendycwong wendycwong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you just use a Hashset instead but with it delegating all works to the original nonblockingHashMap iced. Great! I would like to do a timing test after this is merge.

@wendycwong
Copy link
Contributor

Do you want to merge this into rel-yau? Need to fix it there too, no? Wendy

@michalkurka
Copy link
Contributor Author

So you just use a Hashset instead but with it delegating all works to the original nonblockingHashMap iced. Great! I would like to do a timing test after this is merge.

Exactly... that prevents users to use the HashMap the wrong way. But that is just a syntactic sugar; the core of the fix is in the first comment where I swapped working with keys (that can change) to values that are immutable once inserted.

@michalkurka
Copy link
Contributor Author

Do you want to merge this into rel-yau? Need to fix it there too, no? Wendy

Yes, absolutely correct - I will merge into master first and when appropriate backport to a desired Yau fix release.

I didn't see any changes in performance in my tests but I will reach out to you and we can coordinate tests with your test suite.

@michalkurka
Copy link
Contributor Author

@wendycwong I couldn't do full tests yet but multinode tests show the performance is definitely not worse (times in seconds):

New:
Time taken to perform groupby is 129.045972824
Time taken to perform groupby is 139.069935083
Time taken to perform groupby is 136.047094822

Old:
Time taken to perform groupby is 152.91086483
Time taken to perform groupby is 162.909729958
Time taken to perform groupby is 171.010210991

This could be attributed to a faster serialization - we are only serializing the values before we were serializing both keys and values.

@michalkurka michalkurka merged commit 7b92d9c into master Jul 25, 2019
@wendycwong
Copy link
Contributor

@michalk: Nice side effect. You fixed the main problem and then gain more speed along the way. Wish I could do that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants