kvserver: reject requests when quiescing #51566

Open · andreimatei wants to merge 4 commits into master

Conversation

andreimatei (Contributor)

This patch makes the Store reject requests once its stopper is
quiescing. It also makes quiescing wait until all requests have
drained, and cancels requests on quiescing so that they drain faster.

Before this patch, we didn't seem to have good protection against
requests running after the stopper has been stopped. We've seen this
in some tests, where requests were racing with the engine closing.
Running after the stopper has stopped is generally pretty undefined
behavior, so let's avoid it.
I think the reason why we didn't see a lot of errors from such races is
that we're stopping the gRPC server even before we start quiescing, so
at least for requests coming from remote nodes we had some draining
window.

A gotcha of this commit is that async tasks spun off by requests will
now get canceled when the requests finish. Before, most requests did
not have a cancelable ctx, so the point was moot. This cancelation
behavior is the right thing to do; callers to, say,
stopper.RunAsyncTask() are in charge of passing a non-cancelable
context if they don't want cancelation.
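
For illustration, the caller-side pattern in question might look like this (a sketch, not code from this diff; repl and the task name are made up):

// Assumes a *stop.Stopper and a Replica-like object whose AnnotateCtx
// re-derives log tags onto a fresh, non-cancelable context.
bgCtx := repl.AnnotateCtx(context.Background())
_ = stopper.RunAsyncTask(bgCtx, "background-work", func(ctx context.Context) {
	// This ctx no longer inherits the request's cancelation.
})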

Release note: None

Some stopper.RunAsyncTask calls want to inherit the caller's cancelation,
some don't. RunAsyncTask inherits. This patch adds a flavor that doesn't
inherit, and uses it for a couple of kvserver tasks.

Besides being the right thing to do, this comes in preparation for the
next commit, where the contexts of all request evaluations will be
canceled once the evaluation is done - at which point async tasks that
didn't want to be canceled would break.

Release note: None
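
A sketch of what such a flavor could look like (hypothetical body; the actual patch may differ), carrying over log tags while dropping the cancelation signal:

// Hypothetical implementation sketch, assuming the logtags package.
func (s *Stopper) RunAsyncTaskNoCancelation(
	ctx context.Context, taskName string, f func(context.Context),
) error {
	detached := context.Background()
	if tags := logtags.FromContext(ctx); tags != nil {
		detached = logtags.AddTags(detached, tags) // keep tags, drop cancelation
	}
	return s.RunAsyncTask(detached, taskName, f)
}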
Command application sometimes inherits a context from a local request.
The local request's ctx might be cancelable. This patch makes the
derived application context not inherit the cancelation. Nobody should
be checking for said cancelation, but making sure that even if they
checked they wouldn't find a cancelation seems like a good safeguard
since the consequences of not applying a command are bad.

This is in preparation of the next commit which makes all requests
evaluate under cancelable contexts.

Release note: None
@cockroach-teamcity (Member)

This change is Reviewable

@tbg (Member) left a comment


Thanks for taking this on!
Also, heads up - I will be out for the rest of the week pretty much, so I hope @nvanbenschoten can help you get this merged if we run out of time today.

Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 2 of 2 files at r3, 1 of 1 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @bdarnell, and @nvanbenschoten)


pkg/kv/kvserver/replica_application_decoder.go, line 147 at r3 (raw file):

			// is not cancelable; nobody should be checking for this ctx's
			// cancelation, but let's not tempt the users.
			cmd.ctx = d.r.AnnotateCtx(opentracing.ContextWithSpan(context.Background(), cmd.sp))

I have mixed feelings about this as well. This will erase profiler labels, and whatever else the context may carry later.

If cancellation were just a property of the context without added semantics, we could just

type uncanceledContext struct {
  context.Context
}

var _ context.Context = (*uncanceledContext)(nil)

func (c *uncanceledContext) Err() error { return nil }
func (c *uncanceledContext) Done() <-chan struct{} { return nil }

but this is just another footgun. For example, the spans in a cancelled context are probably being re-used in another span and you can't touch them. Seeing this, I think we would need a contextutil.Transfer(from, to context.Context) context.Context method that picks out everything that can move over (like tags) and handles that which can't accordingly (fork trace span etc). Idk, there seems to be a bunch of work to do. I guess I'm okay with this in the meantime.
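
(To make that concrete, a hypothetical shape for such a helper; nothing like this exists yet, and the span handling is hand-waved:)

// Hypothetical contextutil.Transfer: copy what can move (log tags) and
// fork what can't be shared (the trace span).
func Transfer(from, to context.Context) context.Context {
	if tags := logtags.FromContext(from); tags != nil {
		to = logtags.AddTags(to, tags)
	}
	if sp := opentracing.SpanFromContext(from); sp != nil {
		fork := sp.Tracer().StartSpan(
			"fork", opentracing.FollowsFrom(sp.Context()))
		to = opentracing.ContextWithSpan(to, fork)
	}
	return to
}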


pkg/kv/kvserver/replica_proposal.go, line 235 at r2 (raw file):

	// Compute SHA asynchronously and store it in a map by UUID.

	// Don't inherit the ctx cancelation.

isn't it "cancellation"?


pkg/kv/kvserver/replica_proposal.go, line 236 at r2 (raw file):

	// Don't inherit the ctx cancelation.
	if err := stopper.RunAsyncTaskNoCancelation(ctx, "storage.Replica: computing checksum", func(ctx context.Context) {

I'm lukewarm on this since you could just context.Background() on the callers. I guess this would go away with the options so I don't feel strongly.


pkg/kv/kvserver/store_send.go, line 44 at r4 (raw file):

	ctx context.Context, ba roachpb.BatchRequest,
) (*roachpb.BatchResponse, *roachpb.Error) {
	// Run the request as a task so that it blocks the closing of the stopper

I thought I had cautioned about this approach when we chatted. a) I have slight concerns about the extensive use of WithCancelOnQuiesce in the hot path (@nvanbenschoten can probably tell us whether this matters) which both acquires an exclusive lock and allocates under it. b) we might immediately catch new flakes due to the pervasive addition of aggressive context cancellation. This is good in the long run but bad in the short run. I just want to point this out, we'll need to be vigilant here - I assume you did review all of the tasks in the request path for this.
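
(For context, the hot-path pattern under discussion looks roughly like this; shape assumed from the description rather than copied from the diff:)

// Per request: registering the cancel func acquires the stopper's mutex
// and allocates a map entry; cancel() deregisters on completion.
ctx, cancel := stopper.WithCancelOnQuiesce(ctx)
defer cancel()
// ... run the BatchRequest as a task under the now-cancelable ctx ...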


pkg/util/stop/stopper.go, line 333 at r2 (raw file):

}

// RunAsyncTaskNoCancelation is like RunAsyncTask except that it runs the task

Looking at this again, I think this method is generally a bad idea. You make a new context in the method, but there is no "inventory" of what needs to be carried over into the new context. Instead, this method will silently lose anything not in the log tags. That is not great, I would much rather be explicit about that at the caller. Maybe, with the appropriate weighing of options we would still want the method, but since we're not currently focusing on this API, I would hold off on adding potential footguns to it.


pkg/util/stop/stopper.go, line 337 at r2 (raw file):

//
// TODO(andrei): Another thing that's reasonable for the caller to want is to
// not pass a cancelation, but for the take to get a cancelation on quiescing

task


pkg/util/stop/stopper.go, line 339 at r2 (raw file):

// not pass a cancelation, but for the take to get a cancelation on quiescing
// (WithCancelOnQuiesce). We should refactor this method into a set of options
// to RunAsyncTask.

+1


pkg/util/tracing/tracer.go, line 648 at r3 (raw file):

	if span := opentracing.SpanFromContext(ctx); span != nil {
		if _, noop := span.(*noopSpan); noop {
			// Optimization: avoid ContextWithSpan call if tracing is disabled.

But now you've lost the optimization? You need this check in ForkCtxSpan.
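
(Something like this inside ForkCtxSpan, say; a sketch of the suggested shape, details assumed:)

// Short-circuit before paying for StartSpan/ContextWithSpan when tracing
// is disabled; noopSpan is the tracer's disabled-span type.
func ForkCtxSpan(ctx context.Context, opName string) (context.Context, opentracing.Span) {
	if sp := opentracing.SpanFromContext(ctx); sp != nil {
		if _, noop := sp.(*noopSpan); noop {
			return ctx, sp // tracing disabled: no new span, no new ctx
		}
		fork := sp.Tracer().StartSpan(
			opName, opentracing.FollowsFrom(sp.Context()))
		return opentracing.ContextWithSpan(ctx, fork), fork
	}
	return ctx, nil
}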

@nvanbenschoten (Member)

@andreimatei were we aware of grpc.Server.GracefulStop when writing this patch? It seems like gRPC already provides a way to drain active RPCs.

@tbg (Member) commented Aug 11, 2020 via email

@andreimatei (Contributor, Author) left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @bdarnell, @nvanbenschoten, and @tbg)


pkg/kv/kvserver/replica_application_decoder.go, line 147 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I have mixed feelings about this as well. This will erase profiler labels, and whatever else the context may carry later.

If cancellation were just a property of the context without added semantics, we could just

type uncanceledContext struct {
  context.Context
}

var _ context.Context = (*uncanceledContext)(nil)

func (c *uncanceledContext) Err() error { return nil }
func (c *uncanceledContext) Done() <-chan struct{} { return nil }

but this is just another footgun. For example, the spans in a cancelled context are probably being re-used in another span and you can't touch them. Seeing this, I think we would need a contextutil.Transfer(from, to context.Context) context.Context method that picks out everything that can move over (like tags) and handles that which can't accordingly (fork trace span etc). Idk, there seems to be a bunch of work to do. I guess I'm okay with this in the meantime.

I actually like this uncanceledContext. I don't really see the footgun. You say something about spans being reused, but I don't think that'd ever be the case, at least not without extra synchronization. The simple fact that you know that a ctx is canceled doesn't let you infer anything... Whatever was running under that ctx might still be running. Even when there is some extra synchronization and you know that something has responded to cancelation, we don't reuse spans in non-thread-safe ways, because they can always be referenced from async spans (there's an exception for a special immutable span that we always share). Spans generally have internal synchronization where they need it.

So I like this idea and I think we should try it. I'm really bothered by the lack of options in RunAsyncTask - this PR shows 3 examples of tasks that really should not be canceled.
No?

andreimatei added a commit to andreimatei/cockroach that referenced this pull request Aug 14, 2020
This patch makes the Store reject requests once its stopper is
quiescing.

Before this patch, we didn't seem to have good protection against
requests running after the stopper has been stopped. We've seen this
in some tests, where requests were racing with the engine closing.
Running after the stopper has stopped is generally pretty undefined
behavior, so let's avoid it.
I think the reason why we didn't see a lot of errors from such races is
that we're stopping the gRPC server even before we start quiescing, so
at least for requests coming from remote nodes we had some draining
window.

This is a lighter version of cockroachdb#51566. That patch was trying to run
requests as tasks, so that they properly synchronize with server
shutdown. This patch still allows races between requests that started
evaluating before the server started quiescing and server shutdown.

Touches cockroachdb#51544

Release note: None
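
A sketch of the lighter check (assumed shape; see cockroachdb#52843 for the real diff):

// Bail out up front if the stopper has begun quiescing, without
// registering the request as a task.
select {
case <-stopper.ShouldQuiesce():
	return nil, roachpb.NewError(errors.New("node shutting down"))
default:
}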
@andreimatei (Contributor, Author)

The new #52843 is a lighter version of the shutdown part of this PR.

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Aug 17, 2020

This commit optimizes the Stopper for task creation by ripping out the
existing heavyweight task tracking in production builds. I realized that
my biggest concern with most of the proposals (cockroachdb#52843 and cockroachdb#51566) being
floated to address cockroachdb#51544 was that they bought more into the inefficient
tracking in the Stopper, not that they were doing anything inherently
wrong themselves.

Before this change, creating a task acquired an exclusive mutex and then
wrote to a hashmap. At high levels of concurrency, this would have
become a performance chokepoint. After this change, the cost of
launching a Task is three atomic increments – one to acquire a read
lock, one to register with a WaitGroup, and one to release the read
lock. When no one is draining the Stopper, these are all wait-free
operations, which means that task creation becomes wait-free.

With a change like this, I would feel much more comfortable pushing on
Stopper tasks to solve cockroachdb#51544.
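
A sketch of the scheme with assumed names (the real change may differ); task creation takes only a read lock plus a WaitGroup add, both wait-free while nobody holds the write lock:

type taskTracker struct {
	mu       sync.RWMutex   // write-locked only while draining
	draining bool
	tasks    sync.WaitGroup // counts in-flight tasks
}

func (t *taskTracker) runTask(f func()) bool {
	t.mu.RLock()
	if t.draining {
		t.mu.RUnlock()
		return false // refuse new tasks once draining
	}
	t.tasks.Add(1) // register before dropping the read lock
	t.mu.RUnlock()
	go func() {
		defer t.tasks.Done()
		f()
	}()
	return true
}

func (t *taskTracker) drain() {
	t.mu.Lock()
	t.draining = true
	t.mu.Unlock()
	t.tasks.Wait() // returns once all registered tasks have finished
}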
@tbg tbg added the X-noremind Bots won't notify about PRs with X-noremind label May 6, 2021
@nvanbenschoten nvanbenschoten removed their request for review September 27, 2022 19:37
Labels: X-noremind (Bots won't notify about PRs with X-noremind)
Projects: none yet
Development: successfully merging this pull request may close these issues (none yet)
4 participants