Unblock relaxed write consistency level #3901
Conversation
@@ -274,32 +278,32 @@ func (w *PointsWriter) writeToShard(shard *meta.ShardInfo, database, retentionPo
 	var wrote int
 	timeout := time.After(w.WriteTimeout)
 	var writeError error
-	for _, nodeID := range shard.OwnerIDs {
+	for i := 0; i < len(shard.OwnerIDs); i++ {
Why did this change? I don't see us using `i`, but I could be missing it. The original is much clearer and idiomatic in Go. If it was to stop using the `nodeID`, you can simply do:

for range shard.OwnerIDs {
Agreed. The `AsyncWriteResult` and for-loop change is unnecessary. Moving the …
@takayuki If you could rebase, revert the looping change, and make sure you've signed the CLA, this change should be good to go. Thanks for the fix!
Thank you @jwilder. Would you spend some more time with me, letting me think about the concern I originally had? There are two loops involved here: one for starting goroutines and another for waiting for them to complete their tasks. The problem, I thought, is that the second loop might receive the results in a different order from the one in which the goroutines were started. That's why I added `AsyncWriteResult` to handle out-of-order arrivals. Here is a conceptual example, taking the suggestion from @corylanou: the first loop starts node#1, then node#2, whereas the second loop gets the results from node#2, then node#1. I thought `PointsWriter` would have to take that into account.
This patch introduces an option in influx allowing users to select the write consistency level, which has implicitly been set to 'any' by default so far.
Immediately return once the required number of writes have completed; otherwise, requests running with relaxed consistency levels (e.g. any or one) could be blocked unexpectedly, for instance waiting for dead nodes to respond.
Yeah, that's what I thought too @takayuki -- the new loop made sense to me.
I've confirmed @takayuki has signed the CLA.
Force-pushed from 5d5a650 to da8efa5
@takayuki @otoolep The order in which the responses arrive does not matter from a write perspective. This code is writing to all shard owners in parallel and returning as soon as it receives the required number of successful responses. It does look like there is a bug in the logging here though: https://github.com/influxdb/influxdb/pull/3901/files#diff-99b560e650ccc1fa3f20c23df5a9b41bL287 which might be what the looping change was intended to fix. It could log the wrong …
…ncy-level Unblock relaxed write consistency level
Thanks @takayuki!
Yes, your second point is the one @takayuki and I were making; it was just a logging thing.
This pull request contains two commits: one introducing an option in cmd/influx to select the write consistency level, which has been set to 'any' by default, and another fixing writes that could be blocked even with a relaxed write consistency level (e.g. any, one, or quorum).
Assume you have a 3-node cluster and one of those nodes went down. Write requests with a write consistency level more relaxed than, or equal to, quorum should complete immediately and successfully. But the requests could be blocked waiting for the dead node to respond.