
receive: Only wait for write quorum #2621

Merged
merged 1 commit into thanos-io:master from the improved-replication branch
May 20, 2020

Conversation

Member

@brancz brancz commented May 18, 2020

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

This patch modifies receive replication slightly: it no longer always
waits for all requests to complete. If a quorum of replication requests
has already succeeded, it does not wait for the remaining requests to
finish, since they are no longer needed to reach quorum. In error cases
where quorum is not reached, it still waits for all requests to finish
in an attempt to return a quorum-related error.

Additionally, this patch moves the log lines printed in the
parallelize-requests function to debug level. Calling functions already
log the resulting error(s), so these lines were previously just noise,
even in cases where requests actually succeeded.

Let me know if you think there should be a changelog entry for this; in reality there is no user-noticeable change other than less noisy logs and lower latency for requests.
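
For illustration, a minimal sketch of the quorum-wait pattern described above (names such as waitForWriteQuorum are assumed for the example; this is not the actual Thanos receive handler code): fan out the replication requests, then return as soon as a write quorum of them has succeeded.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitForWriteQuorum reads up to n results from ec and returns nil as soon
// as quorum of them are nil errors. If the results are exhausted without
// reaching quorum, it reports how many requests failed.
func waitForWriteQuorum(ctx context.Context, n, quorum int, ec <-chan error) error {
	successes, failures := 0, 0
	for i := 0; i < n; i++ {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case err := <-ec:
			if err == nil {
				if successes++; successes >= quorum {
					return nil // Quorum reached; don't wait for the rest.
				}
				continue
			}
			failures++
		}
	}
	return fmt.Errorf("write quorum not reached: %d/%d requests failed", failures, n)
}

func main() {
	ec := make(chan error, 3)
	// Simulate replication factor 3: two fast successes, one slow failure.
	go func() { ec <- nil }()
	go func() { ec <- nil }()
	go func() { time.Sleep(time.Second); ec <- errors.New("tsdb not ready") }()

	// With quorum 2 this returns well before the slow request finishes.
	fmt.Println(waitForWriteQuorum(context.Background(), 3, 2, ec))
}
```

With replication factor 3 and quorum 2, the call above returns as soon as two nil results arrive, without waiting for the slow third request.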

Verification

  • Tested various failure cases manually with the quickstart script to ensure that errors are still printed appropriately.
  • Duplicated the existing replication tests to 1) only wait for quorum and 2) run the tests with an added consistency delay, since in a test run there should never be any problem, verifying overall correctness.

@bwplotka @krasi-georgiev @metalmatze @squat @kakkoyun

@brancz brancz force-pushed the improved-replication branch 3 times, most recently from f60b63f to f6fb934 Compare May 18, 2020 09:18
Member

@kakkoyun kakkoyun left a comment

It'd be nice to mention the added flag in the changelog.
Also, I believe this fixes #2567.

level.Error(h.logger).Log("msg", "storing locally", "err", err, "endpoint", endpoint)
}
ec <- err
ec <- errors.Wrapf(err, "storagin locally, endpoint %v", endpoint)
Member

Suggested change
ec <- errors.Wrapf(err, "storagin locally, endpoint %v", endpoint)
ec <- errors.Wrapf(err, "storing locally, endpoint %v", endpoint)

Member Author

brancz commented May 18, 2020

@kakkoyun good point! Added! :)

@brancz brancz force-pushed the improved-replication branch 6 times, most recently from ce24cc8 to 2258e46 on May 18, 2020 14:58
Member

@kakkoyun kakkoyun left a comment

🥇

Member

@metalmatze metalmatze left a comment

Love those integration test additions!
For a full-on review I would need to check this out locally and start diving into the gritty details. From a reviewing point of view, all LGTM! 😊 👍

Member

@bwplotka bwplotka left a comment

OK, I am generally happy with the direction, but I am not a fan of the current parallelizeRequests syntax; we might work on the API a bit.

Essentially it confused me, and then it also confused you (there is a small bug) ;p Let's maybe find something better..

Also, you use errors.Wrap(nil, ...) and tsdbError.MultiError.Add(nil) a lot, and it just looks too scary to me. It might be just me, but I feel like these should be antipatterns, mentioned in the code style guide ): It just scares the reader a lot (: For me it's just an opportunity for errors to slip in more easily.

@@ -324,29 +326,39 @@ func (h *Handler) forward(ctx context.Context, tenant string, r replica, wreq *p
}
h.mtx.RUnlock()

return h.parallelizeRequests(ctx, tenant, replicas, wreqs)
n, ec := h.parallelizeRequests(ctx, tenant, replicas, wreqs)
Member

why not just for err := range <-ec? without n?

Member Author

We need to know the potential number of results so we know when we have reached quorum. It could also return the quorum amount instead, but that would really result in the same code.

defer func() {
go func() {
for {
err, more := <-ec
Member

defer cancel()
for err := range <-ec {

Would do the work I think.

Member

Ok, because you relied on err != nil on the caller side... and you forgot about it here. I would really recommend my suggestion in the comment above =D

Member

And yes, I understand this is how we know whether it was a success or not... but maybe we can come up with a cleaner API.

Member Author

The problem is that we want to drain the channel and stop only when it's actually empty and closed, so we need the more value returned.

Any suggestions for a better API? The problem is that we need to know in advance how many potential results we will get from the channel.

Member

But for ... <-ec has exactly the same semantics, no?

Member

@bwplotka bwplotka May 19, 2020

It will stop iterating when it's closed. If the channel is empty, err, more := <-ec will block as well.
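
To make the two drain patterns in this thread concrete, here is a small stand-alone illustration (the channel setup is hypothetical; note that the compiling form ranges over the channel ec itself):

```go
package main

import "fmt"

func main() {
	// Pattern 1: range over the channel itself. This keeps receiving until
	// the channel is closed and drained; it cannot stop early once a quorum
	// of nil results has been seen.
	ec := make(chan error, 3)
	ec <- nil
	ec <- fmt.Errorf("tsdb not ready")
	ec <- nil
	close(ec)
	for err := range ec {
		fmt.Println("ranged result:", err)
	}

	// Pattern 2: receive a known number of results. Knowing n up front lets
	// the caller stop as soon as quorum successes have arrived, which is
	// what returning n alongside the channel enables.
	ec2 := make(chan error, 3)
	ec2 <- nil
	ec2 <- nil
	ec2 <- fmt.Errorf("conflict")
	n, quorum, successes := 3, 2, 0
	for i := 0; i < n; i++ {
		if err := <-ec2; err == nil {
			if successes++; successes >= quorum {
				fmt.Println("quorum reached after", i+1, "results")
				break
			}
		}
	}
}
```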

level.Error(h.logger).Log("msg", "storing locally", "err", err, "endpoint", endpoint)
}
ec <- err
ec <- errors.Wrapf(err, "storing locally, endpoint %v", endpoint)
Member

I think we should just return if no error?

Member

@bwplotka bwplotka May 18, 2020

I think you rely on err != nil on the caller side; maybe I would do it here for readability, but not a blocker.

Member Author

errors.Wrap returns nil if the input err is nil
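
For reference, this behavior of github.com/pkg/errors can be verified directly:

```go
package main

import (
	"fmt"

	"github.com/pkg/errors"
)

func main() {
	// Wrap and Wrapf return nil when the wrapped error is nil, so sending
	// the wrapped value on the channel still signals success.
	fmt.Println(errors.Wrap(nil, "storing locally") == nil)           // true
	fmt.Println(errors.Wrapf(nil, "endpoint %v", "127.0.0.1") == nil) // true
	fmt.Println(errors.Wrap(fmt.Errorf("boom"), "storing locally"))   // storing locally: boom
}
```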

go func(endpoint string) {
ec <- h.replicate(ctx, tenant, wreqs[endpoint])
defer wg.Done()
ec <- errors.Wrap(h.replicate(ctx, tenant, wreqs[endpoint]), "could not replicate write request")
Member

Should pass error only on error?

Member Author

errors.Wrap returns nil if the input err is nil

Member

I am totally aware of that, but IMO it's extremely confusing and error-prone. Plus it adds major overhead on the critical path.

continue
}

if uint64(countCause(errs, isNotReady)) >= (h.options.ReplicationFactor+1)/2 {
Member

Maybe worth keeping this important number in some function... (h.options.ReplicationFactor+1)/2
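
A sketch of the kind of helper suggested here (the name writeQuorum is illustrative, not code from this PR):

```go
// Hypothetical helper: keep the quorum threshold in one place instead of
// repeating (h.options.ReplicationFactor+1)/2 at every call site.
func writeQuorum(replicationFactor uint64) uint64 {
	// Majority threshold: 2 for replication factor 3, 3 for factor 5.
	return (replicationFactor + 1) / 2
}
```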

return nil
}
}
errs.Add(err)
Member

I am so confused, why are we passing a nil err to the multiError?

Member

(it can flow through above)

Member Author

nil errors are not actually added
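
For reference, a minimal stand-in for the behavior being relied on (not the actual tsdb errors package implementation): Add ignores nil, so adding the error of a successful request is a no-op.

```go
// Minimal stand-in for the behavior relied on here (illustrative only):
// Add skips nil, so errs.Add(err) does nothing for successful requests.
type multiError []error

func (es *multiError) Add(err error) {
	if err == nil {
		return
	}
	*es = append(*es, err)
}
```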

Member

Again: I am totally aware of that but IMO it's extremely confusing and prone to error.

Member Author

brancz commented May 19, 2020

Also, you use errors.Wrap(nil, ...) and tsdbError.MultiError.Add(nil) a lot, and it just looks too scary to me. It might be just me, but I feel like these should be antipatterns, mentioned in the code style guide ): It just scares the reader a lot (: For me it's just an opportunity for errors to slip in more easily.

I don't see anything in the style guide that's violated here.

The only other API that I could think of that wouldn't end up being just a rearrangement of the current code is using two channels, one for errors and one for successes, but that would complicate draining them a lot.

@brancz brancz force-pushed the improved-replication branch 2 times, most recently from 3de93f8 to 9c72f1e on May 19, 2020 08:44
@bwplotka
Member

I don't see anything in the style guide that's violated here.

Yea, I am proposing to add that.

Plus, only this PR is doing so; 100% of the Thanos codebase does not put nil into a multierror and does not wrap nils with a message.

@bwplotka
Member

Let me think about the API. I totally see the aim of it; maybe we can find something cleaner.

return ctx.Err()
case err, more := <-ec:
if !more {
return errs
Member

@bwplotka bwplotka May 19, 2020

I was thinking about this case, but I thought that this would never happen (: We either have a success quorum or an error quorum, kind of, in my previous version (:

Member Author

@brancz brancz May 19, 2020

The attempt here is to return the best possible error at the potential cost of higher latency. For example, with 3x replication, if 2 replicas return tsdb-not-ready/unavailable and 1 returns a conflict, we would end up with a generalized error, when in reality a retry is likely to resolve it.

It's a trade-off: either better error reporting or lower latency. Since the request is failing in this case anyway, I prefer better errors over latency.

Member

@bwplotka bwplotka left a comment

LGTM, just some question (:

}
return errors.Wrap(err, "could not replicate write request")
if countCause(err, isConflict) >= quorum {
Member

I still think it should be len - quorum

Member Author

Let's take the example of replication factor 3, which has a quorum factor of 2, and we get 1 conflict error. 3-2=1, so we would be returning conflict, even though write quorum was met.

Member

@bwplotka bwplotka May 19, 2020

Ok, then it has to be > len(reqs) - quorum, right? (not >=)

I think we are both right if we assume that quorum is always one more than half. This is, however, very dependent on the quorum value... This algorithm should never assume things like that. Let's say quorum is 1 for some reason, with replication 3; then this logic will not hold true, whereas > len(reqs) - quorum is always correct. (:
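
A small illustration of how the two checks compare (not code from the PR; the quorum-of-1 case is hypothetical):

```go
package main

import "fmt"

func main() {
	check := func(label string, conflicts, quorum, reqs int) {
		fmt.Printf("%s: conflicts >= quorum -> %v, conflicts > reqs-quorum -> %v\n",
			label, conflicts >= quorum, conflicts > reqs-quorum)
	}
	// Replication factor 3 with the usual majority quorum of 2 and one
	// conflict: both checks agree the write can still reach quorum.
	check("quorum=2, 1 conflict", 1, 2, 3)
	// Hypothetical quorum of 1 with the same single conflict: ">= quorum"
	// would already report failure, while "> reqs - quorum" correctly lets
	// the two remaining requests satisfy the quorum.
	check("quorum=1, 1 conflict", 1, 1, 3)
}
```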

Member Author

@brancz brancz May 19, 2020

This is, however, very dependent on the quorum value...

what do you mean by this?

Member

We talked offline. Not a blocker so merging.

@bwplotka bwplotka merged commit 929864c into thanos-io:master May 20, 2020
@bwplotka
Member

Thanks!
