Fix apiserver shutdown in integration tests #110000
Conversation
/hold
This depends on #109978 - otherwise it increases the time to run integration tests by 3-5 seconds for each test.
```go
// Additionally wait for the preshutdown hooks to also be finished, as some of
// them need to send API calls to clean up after themselves (e.g. lease
// reconcilers removing themselves from the active servers).
<-preShutdownHooksHasStoppedCh
```
yeah, I think this is okay
why do we want to block twice on preShutdownHooks?
It's not blocking twice - it's blocking on two different things:
- here we're ensuring that we won't "activate" the waitgroup filter (which effectively rejects all incoming requests with RetryAfter) before the preshutdown hooks have finished
- the one you linked above ensures that the preshutdown hooks block the closing of the HTTP server

(1) [what I'm adding here] should happen earlier.
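Roughly, the ordering I'm after looks like this (a minimal sketch - the channel and waitgroup names mirror the real code, everything else is simplified and illustrative):

```go
package sketch

import "sync"

// shutdownOrderSketch illustrates the ordering being discussed; this is not
// the actual apiserver implementation (which uses a SafeWaitGroup).
func shutdownOrderSketch(preShutdownHooksHasStoppedCh <-chan struct{}, handlerChainWaitGroup *sync.WaitGroup) {
	// (1) Wait until the preshutdown hooks have finished, because some of
	// them still need to send API calls to this very server (e.g. lease
	// reconcilers removing themselves from the active servers).
	<-preShutdownHooksHasStoppedCh

	// (2) Only then "activate" the waitgroup filter: from here on new
	// requests are rejected with RetryAfter while in-flight ones drain.
	handlerChainWaitGroup.Wait()
}
```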
So what you're suggesting is that the existing one is no longer needed?
not suggesting - it was curiosity about the possible consequences of the changes; there are many goroutines and channels here and I'm afraid we may deadlock somehow ...
how hard is it to unit test this change?
kubernetes/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go, line 431 at 076168b:

```go
func TestPreShutdownHooks(t *testing.T) {
```
The existing tests are failing - which is actually good :)
So yes - we need to modify them to actually test what we need (which I guess is roughly what you're asking for, because we need to cover this use case).
I will work on it in the upcoming days (though not today).
I can't remember the details, but that change was required because it was closing the Listener ... with the Listener closed, the port is closed, so no new requests ...
it was closing one of these channels before the preshutdown hooks:

```go
return stoppedCh, listenerStoppedCh, nil
```

closing the listener, so no new request could be created because the port was no longer listening
But closing the listener before the preshutdown hooks is incorrect - because preshutdown hooks may need to send API calls (and some do send them).
In a real production scenario, where we have the default 70s ShutdownDelayDuration (https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/server/lifecycle_signals.go#L42), the preshutdown hooks are already finished by the time we start rejecting new requests.
But in tests (where we don't use these 70s) we actually need to ensure that the preshutdown hooks have had a chance to run.
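In other words (a sketch only - the names startDraining and preShutdownHooksDone are illustrative, not the real apiserver API):

```go
package sketch

import "time"

// drainAfterDelaySketch shows why production gets the right ordering "for
// free" while tests do not.
func drainAfterDelaySketch(shutdownDelay time.Duration, preShutdownHooksDone <-chan struct{}, startDraining func()) {
	// In production shutdownDelay is the 70s ShutdownDelayDuration, so the
	// hooks are long finished by the time this sleep returns.
	time.Sleep(shutdownDelay)

	// In integration tests shutdownDelay is ~0, so without this explicit
	// wait we would start rejecting requests while the hooks still run.
	<-preShutdownHooksDone

	startDraining()
}
```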
TestPreShutdownHooks seems to be broken - I have made some changes that I believe make it functional; check #110026.
Please note that there are two modes of graceful shutdown: the new mode, in which we don't shut down the HTTP server until all in-flight requests drain, can be enabled with shutdown-send-retry-after.
I think we are discussing the original graceful termination approach here.
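For reference, my understanding (an assumption about the current wiring, worth double-checking against the tree) is that the flag maps onto the generic server run options roughly like this:

```go
package sketch

import (
	genericoptions "k8s.io/apiserver/pkg/server/options"
)

// Assumption: ShutdownSendRetryAfter is the field behind the
// --shutdown-send-retry-after flag; when true, the HTTP server is kept up
// (rejecting new requests with Retry-After) until in-flight requests drain.
func newDrainModeSketch() *genericoptions.ServerRunOptions {
	opts := genericoptions.NewServerRunOptions()
	opts.ShutdownSendRetryAfter = true
	return opts
}
```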
> But closing the listener before the preshutdown hooks is incorrect - because preshutdown hooks may need to send API calls (and some do send them).

yes, that's correct. I would add that the preshutdown hooks should be invoked after in-flight requests are drained, because in-flight requests might depend on the state cleaned up by the preshutdown hooks, right?
the current sequence roughly is (see the sketch after this list):
- T0: ShutdownInitiated: TERM signal received
  - run preshutdown hooks
  - /readyz starts returning red
- T0+ShutdownDelayDuration:
  - wait for the preshutdown hooks to finish (this PR accomplishes this)
  - reject incoming requests from here on by invoking HandlerChainWaitGroup.Wait()
  - initiate net/http server shutdown
  - wait for in-flight requests to drain (this includes requests from the preshutdown hooks)
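In Go terms the sequence above looks roughly like this (a sketch only - the real code uses dedicated lifecycle signals and a SafeWaitGroup, and every name here is illustrative):

```go
package sketch

import (
	"context"
	"net/http"
	"sync"
	"time"
)

func terminationSequenceSketch(
	termSignal <-chan struct{}, // closed on TERM
	shutdownDelay time.Duration, // ShutdownDelayDuration
	runPreShutdownHooks func(),
	markNotReady func(), // makes /readyz return red
	inFlight *sync.WaitGroup, // stands in for HandlerChainWaitGroup
	srv *http.Server,
) {
	<-termSignal // T0: ShutdownInitiated

	hooksDone := make(chan struct{})
	go func() {
		defer close(hooksDone)
		runPreShutdownHooks()
	}()
	markNotReady()

	time.Sleep(shutdownDelay) // T0 + ShutdownDelayDuration

	// Wait for the preshutdown hooks to finish (what this PR adds).
	<-hooksDone

	// Block until in-flight handler-chain requests complete; once we are
	// waiting, the filter starts rejecting new requests.
	inFlight.Wait()

	// Initiate net/http shutdown; Shutdown itself waits for the remaining
	// connections (including preshutdown-hook calls) to drain.
	_ = srv.Shutdown(context.Background())
}
```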
> But closing the listener before the preshutdown hooks is incorrect - because preshutdown hooks may need to send API calls (and some do send them).

that was the bug fixed by #108033 :)

> the current sequence roughly is:
> - T0: ShutdownInitiated: TERM signal received
>   - run preshutdown hooks
>   - /readyz starts returning red
> - T0+ShutdownDelayDuration:
>   - wait for the preshutdown hooks to finish (this PR accomplishes this)
>   - reject incoming requests from here on by invoking HandlerChainWaitGroup.Wait()
>   - initiate net/http server shutdown
>   - wait for in-flight requests to drain (this includes requests from the preshutdown hooks)

the last sentence is confusing, since you mention that you've already waited "for preshutdown hooks to finish"
/lgtm
Force-pushed from fdf7086 to bf9dece
```go
// |                                    |
// |------------------------------------|
// |                                    |
// |   (HandlerChainWaitGroup::Wait)    |
```
hehe, now it's clear
I was working on a hackmd a while back - https://hackmd.io/yjCkQzImQs-_qC2yAW9EuA - let me know if you can access it.
it needs some more work though
Yes - I can access it. We can link it here eventually - for now we can leave what I added here.
/triage accepted
Force-pushed from 93bfc4f to ff3c15a
:/
Force-pushed from 93fde5d to 85aede8
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: wojtek-t
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Force-pushed from 85aede8 to 773b501
@aojea - this is now ready for review
/hold cancel
```go
	}
	if err := prepared.Run(stopCh); err != nil {
		errCh <- err
		t.Error(err)
```
return?
It's the last thing in this function anyway.
But it wouldn't hurt either - added.
lol, sorry - you replaced the t.Error(err) with a return in the previous line, since the err is now processed on the other side of the channel.
I was asking if you wanted to replace this t.Error(err) with a return too, but keep it this way - it doesn't hurt.
```go
	if errCh != nil {
		err, ok := <-errCh
		if ok && err != nil {
			klog.Errorf("Failed to run test server clearly: %v", err)
```
nit:

```diff
-			klog.Errorf("Failed to run test server clearly: %v", err)
+			klog.Errorf("Failed to shutdown test server clearly: %v", err)
```
done
```diff
@@ -209,8 +219,9 @@ func StartTestServer(t Logger, instanceOptions *TestServerInstanceOptions, custo
 		server.GenericAPIServer.StorageVersionManager = instanceOptions.StorageVersionWrapFunc(server.GenericAPIServer.StorageVersionManager)
 	}
 
-	errCh := make(chan error)
+	errCh = make(chan error)
 	go func(stopCh <-chan struct{}) {
```
do you know why we have to capture the stopCh in this goroutine?
well - it's probably not needed...
Do you want me to change this?
nope, it was a question for me, for learning ...
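For the record, a contrived sketch of both variants - they are equivalent here, because stopCh is never reassigned after the goroutine starts:

```go
package sketch

func spawnSketch() {
	stopCh := make(chan struct{})

	// Variant 1: pass the channel explicitly (what the test server does).
	go func(stop <-chan struct{}) {
		<-stop
	}(stopCh)

	// Variant 2: capture the channel from the enclosing scope.
	go func() {
		<-stopCh
	}()

	close(stopCh) // unblocks both goroutines
}
```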
```go
	// We tear it down in the background to ensure that
	// pending requests should work fine.
```
I don't fully understand this change - before, we "guaranteed" that new requests would only come once the apiserver was down; isn't it going to race this way?
Sorry - can you clarify your comment?
Before the changes I'm making in this PR to make server.TearDownFn actually synchronous, this was only triggering the shutdown, and the lines below were executing in the meantime.
With server.TearDownFn being synchronous, by the time it finishes the server is already down. So there is no chance that writing anything to the body in line 96 will succeed.
[What would happen without this adjustment to the test is that the request hangs, the server waits to be drained, and finally the request times out and the server then shuts down.]
In order to make it work, we need to trigger the shutdown here, but actually let it proceed in the background.
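A sketch of that test-side pattern (tearDown and writeToRequestBody are hypothetical stand-ins for the test's helpers, not real names from the PR):

```go
package sketch

import "testing"

// backgroundTearDownSketch runs the now-synchronous teardown in the
// background so an already-started request can still make progress.
func backgroundTearDownSketch(t *testing.T, tearDown func(), writeToRequestBody func() error) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		tearDown() // returns only once the server is fully down
	}()

	// The server drains in-flight requests before closing, so writing to
	// the body of a request that is already in flight still succeeds.
	if err := writeToRequestBody(); err != nil {
		t.Errorf("in-flight request failed during shutdown: %v", err)
	}

	<-done // now the teardown has fully completed
}
```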
I see - the thing that synchronizes with the shutdown is when it starts to fail "new requests", so you know it started to shut down.
Then it waits 500 ms, writes "garbage", and checks that it returns something.
Then you wait for the shutdown to exit.
Is there any way to guarantee that the shutdown never finishes before we can write?
> Is there any way to guarantee that the shutdown never finishes before we can write?

In general, the request that we're writing "garbage" to is started earlier (line 44), so it has already reached the apiserver.
Part of the apiserver shutdown sequence is that it won't close the HTTP server before all requests are actually drained - and in our case that includes this particular request.
The request will not be drained until it finishes, and given that we don't finish it before writing to it, the only other way for it to finish is to time out on the server side.
So our only risk is the request timing out on the server side. But the default timeout is 1m, and I think it's fine to leave it that way.
[FWIW - this is not a regression in this PR; the same failure mode existed before, if we didn't write anything to the body for 1m and it timed out on the server side.]
yeah, I just wanted to fully understand the test and the change, thanks for the explanation
```go
	// get a leased session
	session, err := concurrency.NewSession(rawClient)
	if err != nil {
		t.Fatal(err)
	}

	// then build and use an etcd lock
	// this prevents more than one of these api servers from running at the same time
	lock := concurrency.NewLocker(session, "kube_integration_etcd_raw")
	lock.Lock()
```
why was this needed?
I don't think it was really needed.
I think the problem is that we have N different implementations, every one of them doing things differently.
We need to unify this stuff, as it is unsustainable - and none of the other implementations is doing anything like that.
```go
		errCh <- err
		return
```
in cmd/kube-apiserver/app/testing/testserver.go, the wait.PollImmediate that waits for the /healthz endpoint checks this channel:

```go
select {
case err := <-errCh:
	return false, err
default:
}
```

if PrepareRun doesn't work, we can fail faster this way; otherwise I assume we'll fail only after waiting for healthz to be ok, because we are returning without checking whether PrepareRun() failed - errCh is only checked in the cleanup
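i.e. roughly this pattern (a sketch; checkHealthz and the intervals are stand-ins for the real test code):

```go
package sketch

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// waitForHealthySketch peeks at errCh while polling /healthz, so a
// PrepareRun/Run failure surfaces immediately rather than only at cleanup.
func waitForHealthySketch(errCh <-chan error, checkHealthz func() bool) error {
	return wait.PollImmediate(100*time.Millisecond, time.Minute, func() (bool, error) {
		select {
		case err := <-errCh:
			return false, fmt.Errorf("server failed to run: %w", err)
		default:
		}
		return checkHealthz(), nil
	})
}
```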
good point - added.
```go
		if err := kubeAPIServer.GenericAPIServer.PrepareRun().Run(stopCh); err != nil {
			t.Error(err)
			errCh <- err
```
same comment as in https://github.com/kubernetes/kubernetes/pull/110000/files#r882448491 - this channel is only checked on cleanup; we can do a quick check when waiting for healthz to fail fast, instead of waiting for the apiserver /healthz endpoint
good point - added
Force-pushed from 773b501 to 7be6073
Force-pushed from 7be6073 to 6f70677
@aojea - comments applied - PTAL
lgtm with one question about the shutdown test change
/lgtm
Ref #108483
/kind bug
/priority important-longterm
/sig api-machinery