
apiserver: terminate watch with rate limiting during shutdown #114925

Merged
merged 3 commits into kubernetes:master from the watch-termination branch on Feb 27, 2023

Conversation

tkashem
Contributor

@tkashem tkashem commented Jan 9, 2023

What type of PR is this?

/kind bug

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #111886 (it addresses termination of active watches during shutdown)

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. labels Jan 9, 2023
@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 9, 2023
@tkashem
Contributor Author

tkashem commented Jan 9, 2023

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Jan 9, 2023
@k8s-ci-robot k8s-ci-robot added area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 9, 2023
@tkashem
Contributor Author

tkashem commented Jan 9, 2023

/cc @wojtek-t @marseel

@wojtek-t
Member

wojtek-t commented Jan 9, 2023

@tkashem - I think this is achieving roughly the same thing as: #113546

If so - I think I would prefer Marcel's one, as it is more consistent with our existing shutdown sequence, no?

@wojtek-t
Member

wojtek-t commented Jan 9, 2023

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 9, 2023
@wojtek-t
Member

wojtek-t commented Jan 9, 2023

Or actually wait - you're not storing additional info here, so that is more efficient in terms of memory. Let me take a deeper look.

// too late, the net/http server has shutdown and
// probably has returned an error to the caller
return
case <-s.ServerTerminationWindow.ShuttingDown():
Member

@tkashem I have a feeling that it should be possible to do that more consistently with the current support for short-running requests.

Currently we have WithWaitGroup, which waits for short-running requests, and only that is a signal for further steps of graceful shutdown. I think we should be consistent with that for long-running requests.

So what I would try is to:

  • plumb into the exact same place (just introduce a second SafeWaitGroup for long-running requests)
  • expose an IsShuttingDown() channel from the SafeWaitGroup that will trigger exactly this code path
  • introduce the rate limiter into this SafeWaitGroup and change the code here to something like:
  s.AcquireToken()
  return

[The above suggests that it shouldn't be SafeWaitGroup but a new type being a wrapper around it; a sketch of such a wrapper follows this comment.]

Now - you additionally don't need to pass the ServerTerminationWindow across the codebase - all you need is to add that to the context in the WithWaitGroup filter.

So finally, this probably no longer belongs to WithWaitGroup, it should be a separate filter doing that.

I think that would be cleaner than both this PR and @marseel's PR...

@tkashem @marseel
thoughts?
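
A minimal sketch of the wrapper described above, assuming hypothetical names (WatchWaitGroup, AcquireToken, ShuttingDown) and the max(100, #watches/grace) pacing suggested later in this thread; the actual type and signatures merged in the PR may differ:

package waitgroup

import (
	"context"
	"sync"
	"time"

	"golang.org/x/time/rate"
)

// WatchWaitGroup wraps a wait group for long-running (watch) requests and
// exposes a shutdown signal plus a rate limiter, so active watches drain
// gradually instead of all terminating at once.
type WatchWaitGroup struct {
	mu       sync.Mutex
	wg       sync.WaitGroup
	count    int
	shutdown chan struct{}
	limiter  *rate.Limiter
}

func NewWatchWaitGroup() *WatchWaitGroup {
	return &WatchWaitGroup{shutdown: make(chan struct{})}
}

// Add registers an in-flight watch request. (A production version, like
// SafeWaitGroup, would also reject Add once Wait has started.)
func (w *WatchWaitGroup) Add() {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.count++
	w.wg.Add(1)
}

// Done marks a watch request as finished.
func (w *WatchWaitGroup) Done() {
	w.mu.Lock()
	w.count--
	w.mu.Unlock()
	w.wg.Done()
}

// ShuttingDown returns a channel that is closed once Wait has been called,
// signalling watch handlers to start terminating.
func (w *WatchWaitGroup) ShuttingDown() <-chan struct{} { return w.shutdown }

// Wait sizes a rate limiter from the number of active watches and the grace
// period, signals shutdown, and blocks until all watches have returned.
func (w *WatchWaitGroup) Wait(grace time.Duration) {
	w.mu.Lock()
	qps := float64(w.count) / grace.Seconds()
	if qps < 100 {
		qps = 100
	}
	w.limiter = rate.NewLimiter(rate.Limit(qps), 1)
	close(w.shutdown)
	w.mu.Unlock()
	w.wg.Wait()
}

// AcquireToken blocks until the terminating handler is allowed to return.
func (w *WatchWaitGroup) AcquireToken(ctx context.Context) error {
	return w.limiter.Wait(ctx)
}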

Member

I think it would make sense to POC that anyway - I would love to try it, though I probably won't get to it for at least next two weeks...

Contributor Author

@tkashem tkashem Jan 10, 2023

all you need is to add that to the context in the WithWaitGroup filter.

This sounds good to me; I wired the context (a sketch of the wiring follows this comment).

plumb into the exact same place (just introduce a second SafeWaitGroup for long-running requests)

Do we need to keep track of the active watches? Currently all active watches return at the same moment when the net/http server closes; I thought the goal was to make sure these active watches return with a rate limiter or a random sleep.

I think it would make sense to POC that anyway - I would love to try it, though I probably won't get to it for at least next two weeks...

I am happy to tackle it, if you and @marseel think this approach is in the right direction
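
For the context wiring mentioned above, here is a sketch of how a dedicated filter could stash the drain signal in the request context; all names here (WatchDrainSignal, WithWatchTermination, WatchTerminationFrom) are placeholders, not the identifiers used in the PR:

package filters

import (
	"context"
	"net/http"
)

type watchTerminationKeyType int

const watchTerminationKey watchTerminationKeyType = iota

// WatchDrainSignal is the minimal surface a watch handler needs during
// shutdown: a signal to start terminating and a token gate that paces it.
type WatchDrainSignal interface {
	ShuttingDown() <-chan struct{}
	AcquireToken(ctx context.Context) error
}

// WithWatchTermination stores the drain signal in the request context so the
// watch handler can pick it up without plumbing a new field across the codebase.
func WithWatchTermination(handler http.Handler, sig WatchDrainSignal) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		ctx := context.WithValue(req.Context(), watchTerminationKey, sig)
		handler.ServeHTTP(w, req.WithContext(ctx))
	})
}

// WatchTerminationFrom retrieves the drain signal; when ok is false the caller
// can fall back to a nil channel, which blocks forever in a select.
func WatchTerminationFrom(ctx context.Context) (WatchDrainSignal, bool) {
	sig, ok := ctx.Value(watchTerminationKey).(WatchDrainSignal)
	return sig, ok
}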

Member

Do we need to keep track of the active watches? Currently all active watches return at the same moment when the net/http server closes; I thought the goal was to make sure these active watches return with a rate limiter or a random sleep.

Yes - the goal is to rate-limit them.
But if there are very few watches and I'm setting a window of, say, 1m (I expect values around this number to be most common), I don't want to wait the full 1m, but rather shut them down quicker, because that would be safe.

So while I don't need to know the exact watches, I need to know their number to know when they are done.

I am happy to tackle it, if you and @marseel think this approach is in the right direction

I'm not 100% convinced, because we may find something unexpected in the code that makes this harder. But we also won't know until we POC it.

Member

But if there are very few watches and I'm setting a window of, say, 1m (I expect values around this number to be most common), I don't want to wait the full 1m, but rather shut them down quicker, because that would be safe.

Yea, I think it makes sense. We could just rate limit to max(some constant like 100, #watches/timeout) QPS.
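
To make that concrete with purely illustrative numbers: 12,000 active watches draining over a 60s window would give max(100, 12000/60) = 200 QPS, so they finish within the window, while 600 watches would hit the 100 QPS floor and drain in roughly 6 seconds instead of being stretched across the full minute.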

I am happy to tackle it, if you and Marcel Zięba think this approach is in the right direction

I personally like your approach more than mine :). Once we have working POC I can run some scale tests to see if everything works as expected on large scale.

Member

I personally like your approach more than mine :). Once we have working POC I can run some scale tests to see if everything works as expected on large scale.

Given that it doesn't maintain a single cache of all connections, I like it more too. And by addressing my comments, I think we can make it even cleaner, I hope.

So yes - if you have some time, Abu, it would be great if you could POC it!

Contributor Author

Thanks @wojtek-t @marseel

Yes, I am going to work on it. I am also going to add active watch requests to the graceful termination unit tests we have and make sure it works as expected. Once the new unit tests pass, you can test it at scale.

@wojtek-t wojtek-t self-assigned this Jan 9, 2023
@tkashem tkashem force-pushed the watch-termination branch 4 times, most recently from 984a956 to dbf863a on January 10, 2023 02:55
"k8s.io/apiserver/pkg/util/wsstream"
)

// nothing will ever be sent down this channel
var neverExitWatch <-chan time.Time = make(chan time.Time)

// nothing will ever be sent down this channel, used when the request
// context does not have a server shutdown signal associated
var neverShuttingDownCh <-chan struct{} = make(chan struct{})
Contributor

out of curiosity, why is this global?

Contributor

Or even just set nil as the shutdown channel. Nil blocks on select.
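
A tiny self-contained illustration of the property being relied on here, namely that a receive from a nil channel never fires in a select:

package main

import (
	"fmt"
	"time"
)

func main() {
	// nil when the request context has no server shutdown signal associated
	var shuttingDownCh <-chan struct{}

	select {
	case <-shuttingDownCh:
		// never selected: receiving from a nil channel blocks forever
		fmt.Println("shutting down")
	case <-time.After(100 * time.Millisecond):
		fmt.Println("timed out normally")
	}
}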

Contributor Author

Yeah, I don't think we need it; I removed it, thanks.

@@ -156,7 +167,8 @@ type WatchServer struct {
// used to correct the object before we send it to the serializer
Fixup func(runtime.Object) runtime.Object

TimeoutFactory TimeoutFactory
TimeoutFactory TimeoutFactory
ServerShuttingdownCh <-chan struct{}
Contributor

nit: ServerShuttingDownCh

Contributor Author

fixed

@@ -260,6 +266,20 @@ type GenericAPIServer struct {
// If enabled, after ShutdownDelayDuration elapses, any incoming request is
// rejected with a 429 status code and a 'Retry-After' response.
ShutdownSendRetryAfter bool

// ShutdownWatchTerminationGracePeriod, if set to a positive value,
// is the maximum grace period the apiserver will allow for
Contributor

what does "maximum" mean here? Is there another grace period that might overrule this?

Contributor Author

I updated the comments; let me know if it's clearer now.

// Wait for all active watches to finish
grace := s.ShutdownWatchTerminationGracePeriod
activeBefore, activeAfter, err := s.WatchRequestWaitGroup.Wait(func(count int) (utilwaitgroup.RateLimiter, context.Context, context.CancelFunc) {
// TODO: this is back of the envelope for now
Contributor

What's written on the envelope? What's the one-liner of intuition here?

Contributor Author

comment fixed
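
For intuition, a hedged sketch of what a callback with the WatchRequestWaitGroup.Wait signature shown in the diff above might compute; the constants and the helper name are illustrative, not taken from the merged code, and *rate.Limiter stands in for the utilwaitgroup.RateLimiter interface:

package shutdown

import (
	"context"
	"time"

	"golang.org/x/time/rate"
)

// watchDrainParams derives a rate limiter that paces the termination of
// `count` active watches, plus a context bounding the whole drain window.
func watchDrainParams(count int, grace time.Duration) (*rate.Limiter, context.Context, context.CancelFunc) {
	// back of the envelope: spread terminations across the grace period,
	// but never drain slower than roughly 100 watches per second
	qps := float64(count) / grace.Seconds()
	if qps < 100 {
		qps = 100
	}
	ctx, cancel := context.WithTimeout(context.Background(), grace)
	return rate.NewLimiter(rate.Limit(qps), 1), ctx, cancel
}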

@@ -950,6 +1011,7 @@ func newGenericAPIServer(t *testing.T, fAudit *fakeAudit, keepListening bool) *G
config, _ := setUp(t)
config.ShutdownDelayDuration = 100 * time.Millisecond
config.ShutdownSendRetryAfter = keepListening
config.ShutdownWatchTerminationGracePeriod = 2 * time.Second
Contributor

nit: // much smaller than normal 30s grace period for other non-long running requests

Contributor Author

We just need to enable watch draining here, so any positive value is fine; added a comment.

@sttts
Contributor

sttts commented Feb 27, 2023

A few nits. SGTM overall.

@tkashem
Contributor Author

tkashem commented Feb 27, 2023

@sttts I addressed the feedback, please take a look when you have a moment

@wojtek-t
Member

This looks great.
I was chatting with Stefan in the morning and he confirmed that all of his comments were nits.

So I'm going ahead and tagging it, so we can merge it as soon as possible and get some reasonable soak time.
If needed, we can always adjust things in a follow-up if he considers that the comment updates (the only potentially unresolved thing) require further tweaking.

/lgtm
/approve

/hold cancel

/priority important-soon

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 27, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 380cd25e56fe248241b3bbb2d9b8e67f732d3f68

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tkashem, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 27, 2023
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 27, 2023
@tkashem
Contributor Author

tkashem commented Feb 27, 2023

I had to fix a failing unit test caused by my rename; this is the diff:

--- a/staging/src/k8s.io/apiserver/pkg/endpoints/watch_test.go
+++ b/staging/src/k8s.io/apiserver/pkg/endpoints/watch_test.go
@@ -647,9 +647,8 @@ func TestWatchHTTPErrors(t *testing.T) {
                Encoder:         newCodec,
                EmbeddedEncoder: newCodec,
 
-               Fixup:                func(obj runtime.Object) runtime.Object { return obj },
-               TimeoutFactory:       &fakeTimeoutFactory{timeoutCh, done},
-               ServerShuttingdownCh: make(chan struct{}),
+               Fixup:          func(obj runtime.Object) runtime.Object { return obj },
+               TimeoutFactory: &fakeTimeoutFactory{timeoutCh, done},
        }

@wojtek-t
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 27, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 7bd7e81c1b315de3226b196b358a6a7842a5212e

@k8s-ci-robot k8s-ci-robot merged commit a16fd54 into kubernetes:master Feb 27, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.27 milestone Feb 27, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files.
area/apiserver
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
kind/bug Categorizes issue or PR as related to a bug.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
release-note-none Denotes a PR that doesn't merit a release note.
sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.
size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
triage/accepted Indicates an issue or PR is ready to be actively worked on.
Development

Successfully merging this pull request may close these issues.

kube-apiserver: some HTTP requests might hang indefinitely
6 participants