apiserver: terminate watch with rate limiting during shutdown #114925
Conversation
Skipping CI for Draft Pull Request.
/ok-to-test
/hold
Or actually wait - you're not storing additional info here, so that is more efficient in terms of memory. Let me take a deeper look.
```go
		// too late, the net/http server has shutdown and
		// probably has returned an error to the caller
		return
	case <-s.ServerTerminationWindow.ShuttingDown():
```
@tkashem I have a feeling that it should be possible to do this more consistently with the current support for short-running requests.

Currently we have `WithWaitGroup`, which waits for short-running requests, and only that is a signal for further steps of graceful shutdown. I think we should be consistent with that for long-running requests.

So what I would try is to:

- plumb into the exact same place (just introduce a second `SafeWaitGroup` for long-running requests)
- expose a channel via `IsShuttingDown()` from the `SafeWaitGroup` that will trigger exactly this code path
- introduce the rate-limiter into this `SafeWaitGroup` and change the code here to something like `s.AcquireToken(); return`

[The above suggests that it shouldn't be `SafeWaitGroup` itself but a new type wrapping it.]

Now - you additionally don't need to pass the `ServerTerminationWindow` across the codebase - all you need is to add that to the context in the `WithWaitGroup` filter.

So finally, this probably no longer belongs in `WithWaitGroup`; it should be a separate filter doing that.

I think that would be cleaner both than this PR and than @marseel's PR...
I think it would make sense to POC that anyway - I would love to try it, though I probably won't get to it for at least the next two weeks...
> all you need is to add that to the context in the WithWaitGroup filter.

This sounds good to me, I wired the context.

> plumb to exact same place (just introduce a second SafeWaitGroup for long-running requests)

Do we need to keep track of the active watches? Currently all active watches return at the same moment when the net/http server closes; I thought the goal was to make sure these active watches return with a rate limiter or random sleep.

> I think it would make sense to POC that anyway - I would love to try it, though I probably won't get to it for at least next two weeks...

I am happy to tackle it, if you and @marseel think this approach is in the right direction.
> do we need to keep track of the active watches? currently all active watches return at the same moment when net/http server closes, I thought the goal was to make sure these active watches return with a rate limiter or random sleep.

Yes - the goal is to rate-limit them.

But if there are very few watches and I'm setting a window of, say, 1m (I expect setting around this number to be most common), I don't want to wait the full 1m, but rather shut them down quicker, because that would be safe.

So while I don't need to know the exact watches, I need to know their number to know when they are done.

> I am happy to tackle it, if you and @marseel think this approach is in the right direction

I'm not 100% convinced, because I'm not sure we won't find something unexpected in the code that would make this harder. But we also won't know until we POC it.
> But if there are very few watches and I'm setting a window of say 1m (I expect setting around this number to be most common), I don't want to wait 1m, but rather shut them down quicker, because that would be safe.

Yeah, I think it makes sense. We could just rate limit to max(some constant like 100, #watches/timeout) QPS.

> I am happy to tackle it, if you and @marseel think this approach is in the right direction

I personally like your approach more than mine :). Once we have a working POC I can run some scale tests to see if everything works as expected at large scale.
> I personally like your approach more than mine :). Once we have a working POC I can run some scale tests to see if everything works as expected at large scale.

I like it more too, because it avoids maintaining a single cache of all connections. And by addressing my comments I think we can make it even cleaner, I hope.

So yes - if you have some time Abu, if you could POC it, that would be great!
Force-pushed from 984a956 to dbf863a.
```go
	"k8s.io/apiserver/pkg/util/wsstream"
)

// nothing will ever be sent down this channel
var neverExitWatch <-chan time.Time = make(chan time.Time)

// nothing will ever be sent down this channel, used when the request
// context does not have a server shutdown signal associated
var neverShuttingDownCh <-chan struct{} = make(chan struct{})
```
out of curiosity, why is this global?
Or even just set nil as the shutdown channel. Nil blocks on select.
Yeah, I don't think we need it; I removed it, thanks.
```diff
@@ -156,7 +167,8 @@ type WatchServer struct {
 	// used to correct the object before we send it to the serializer
 	Fixup func(runtime.Object) runtime.Object
 
-	TimeoutFactory TimeoutFactory
+	TimeoutFactory       TimeoutFactory
+	ServerShuttingdownCh <-chan struct{}
```
nit: ServerShuttingDownCh
fixed
```diff
@@ -260,6 +266,20 @@ type GenericAPIServer struct {
 	// If enabled, after ShutdownDelayDuration elapses, any incoming request is
 	// rejected with a 429 status code and a 'Retry-After' response.
 	ShutdownSendRetryAfter bool
 
+	// ShutdownWatchTerminationGracePeriod, if set to a positive value,
+	// is the maximum grace period the apiserver will allow for
```
what does "maximum" mean here? Is there another grace period that might overrule this?
I updated the comments, let me know if they're clearer now.
```go
// Wait for all active watches to finish
grace := s.ShutdownWatchTerminationGracePeriod
activeBefore, activeAfter, err := s.WatchRequestWaitGroup.Wait(func(count int) (utilwaitgroup.RateLimiter, context.Context, context.CancelFunc) {
	// TODO: this is back of the envelope for now
```
What's written on the envelope? What's the one-liner of intuition here?
comment fixed
```diff
@@ -950,6 +1011,7 @@ func newGenericAPIServer(t *testing.T, fAudit *fakeAudit, keepListening bool) *G
 	config, _ := setUp(t)
 	config.ShutdownDelayDuration = 100 * time.Millisecond
 	config.ShutdownSendRetryAfter = keepListening
+	config.ShutdownWatchTerminationGracePeriod = 2 * time.Second
```
nit: // much smaller than normal 30s grace period for other non-long running requests
We just need to enable watch draining here, so any positive value is fine; added a comment.
Few nits. SGTM overall.
Force-pushed from 1f8faef to 31466d7.
@sttts I addressed the feedback, please take a look when you have a moment.
This looks great. So I'm going ahead and tagging it, so we can merge it as soon as possible to get some reasonable soak time.
/lgtm
/hold cancel
/priority important-soon
LGTM label has been added. Git tree hash: 380cd25e56fe248241b3bbb2d9b8e67f732d3f68
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tkashem, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Force-pushed from 31466d7 to 791fcd6.
I had to fix a failing unit test from my rename operation; this is the diff:
/lgtm
LGTM label has been added. Git tree hash: 7bd7e81c1b315de3226b196b358a6a7842a5212e
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #111886 (it addresses termination of active watches during shutdown)
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: