-
Notifications
You must be signed in to change notification settings - Fork 233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrap the call to update_index_info in global:trans [JIRA: RIAK-2441] #1370
Conversation
This fixes a race condition that was causing stats corruption if two processes tried to update the data for the same index at the same time. As detailed in the code comments, this may not have the best possible performance characteristics, but it seems to be good enough and is a very easy workaround for the problems we found. For future reference, we originally found and reproduced this bug using yz_aae_test. It didn't always trigger the race, but adding a short timer:sleep call in between the lookup and insert calls made it very easy to catch.
%% contention if lots of different pieces of code are all relying on | ||
%% it at once. | ||
%% | ||
%% If we do ever want to remove the call to global:trans and replace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rewrite the "If we ever do" part as simply:
If we find that global:trans/3 is a bottleneck, we should explore other options.
And leave it at that. I'm still not sure a CRDT-style conflict resolution system is necessary for this - if nothing, the "put a process in front of the updates" approach, or if it's too slow, a pool of processes, is a simpler approach than attempting to do conflict resolution here.
Otherwise, +1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JeetKunDoug I see your point, but I did want to leave some reference to a conflict resolution approach, rather than just burying the idea. With all the focus on optimization lately, it seems like it's worth preserving an idea that could potentially be faster than using a worker pool, particularly in an area that could become a bottleneck. I'll definitely tone down the comment a bit, but I'd still like to leave a passing mention to the idea in case it turns out to be a good approach in the future. Deal?
Have discussed this with both Torben and Jon - unless our performance testing finds an issue, will move forward with this. |
👍 7058ad7 |
…info Wrap the call to update_index_info in global:trans Reviewed-by: JeetKunDoug
@borshop merge |
create jira issue |
Merged to 2.0 - needs forward merge _[posted via JIRA by Douglas Rohrer]_ |
Forward merge complete. Marking as done. _[posted via JIRA by Nick Marino]_ |
This fixes a race condition that was causing stats corruption if two
processes tried to update the data for the same index at the same time.
As detailed in the code comments, this may not have the best possible
performance characteristics, but it seems to be fine in practice and is a
very easy workaround for the problems we found.
For future reference, we originally found and reproduced this bug using
yz_aae_test. It didn't always trigger the race, but adding a short
timer:sleep call in between the lookup and insert calls made it very
easy to catch.