
Fix apiserver shutdown in integration tests #110000

Merged
merged 1 commit into from May 26, 2022

Conversation

wojtek-t
Member

Ref #108483

NONE

/kind bug
/priority important-longterm
/sig api-machinery

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 12, 2022
@wojtek-t
Member Author

/hold

This depends on #109978 - otherwise it increases the time to run integration tests by 3-5 seconds per test.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 12, 2022
// Additionally wait for preshutdown hooks to also be finished, as some of them need
// to send API calls to clean up after themselves (e.g. lease reconcilers removing
// themselves from the active servers).
<-preShutdownHooksHasStoppedCh
Contributor

yeah, I think this is okay


Member Author

It's not blocking twice - it's blocking on two different things:

  1. here we're ensuring that we won't "activate" the waitgroup filter (which effectively rejects all incoming requests with RetryAfter)
  2. the one you linked above ensures that preshutdown hooks block the closing of the HTTP server

(1) [what I'm adding here] should happen earlier.
So are you suggesting that the existing one is no longer needed?
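
As an aside for readers following the channel discussion: in Go, a receive on a closed channel never blocks, so several goroutines can each wait on the same shutdown signal for their own purpose. A minimal, self-contained sketch - the channel name is only a stand-in for something like preShutdownHooksHasStoppedCh, not the real apiserver wiring:

package main

import (
	"fmt"
	"sync"
)

func main() {
	// Hypothetical stand-in for a lifecycle signal such as preShutdownHooksHasStoppedCh.
	hooksDone := make(chan struct{})

	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			<-hooksDone // every waiter unblocks once the channel is closed
			fmt.Printf("waiter %d proceeding with its own shutdown step\n", id)
		}(i)
	}

	close(hooksDone) // closing the channel releases all waiters at once
	wg.Wait()
}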

Member

Not suggesting - it was curiosity about the possible consequences of the changes; there are many goroutines and channels here and I'm afraid we could deadlock somehow ...
How hard is it to unit test this change?

Member Author

The existing tests are failing - which is actually good :)

So yes - we need to modify them to actually test what we need (which I guess is roughly what you're asking for - because we need to cover this use case).

I will work on it in the upcoming days (though not today).

Member

I can't remember the details, but that change was required because it was closing the Listener ... with the Listener closed, the port is closed, so there are no new requests ...

Member

it was closing one of these channels before the preshutdown hooks:

return stoppedCh, listenerStoppedCh, nil

closing the listener, so no new request could be created because the port was no longer listening
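
As a generic illustration of that behavior (not the apiserver code): once a net.Listener is closed, its port stops accepting connections, so new requests can't even reach the server.

package main

import (
	"fmt"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	addr := ln.Addr().String()

	ln.Close() // closing the listener frees the port; nothing is accepting anymore

	if _, err := net.Dial("tcp", addr); err != nil {
		fmt.Println("new connections fail after Close:", err) // typically "connection refused"
	}
}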

Member Author

But closing the listener before preshutdown hooks is incorrect - because preshutdown hooks may need to send API calls (and some do send them).

In a real prod scenario, where we have the default 70s ShutdownDelayDuration:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/server/lifecycle_signals.go#L42
preshutdown hooks are already finished by the time we start rejecting new requests.
But in tests (where we don't use these 70s) we actually need to ensure that preshutdown hooks had a chance to run.

Contributor
@tkashem May 13, 2022

TestPreShutdownHooks seems to be broken - I believe I have made some changes that make it functional, check #110026

please note that there are two modes of graceful shutdown; the new mode, where we don't shut down the HTTP server until all in-flight requests drain, can be enabled by shutdown-send-retry-after.

I think we are discussing the original graceful termination approach here.

But closing the listener before preshutdown hooks is incorrect - because preshutdown hooks may need to send API calls (and some do send them).

yes, that's correct; I think I would add to it by saying that preshutdown hooks should be invoked after in-flight requests are drained, because in-flight requests might depend on the state cleaned up by the preshutdown hooks, right?

the current sequence roughly is (sketched in code after the list):

  • T0: ShutdownInitiated: TERM signal received
    • run preshutdown hooks
    • /readyz starts returning red
  • T0+ShutdownDelayDuration:
    • wait for preshutdown hooks to finish (this PR accomplishes this)
    • reject incoming request(s) from here on by invoking HandlerChainWaitGroup.Wait()
    • initiate net/http server shutdown
    • wait for in-flight requests to drain (this includes requests from preshutdown hooks)
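
A rough sketch of that ordering in Go - the function and channel names below are hypothetical and only mirror the list above, not the actual GenericAPIServer wiring:

package sketch

import "time"

// gracefulShutdown mirrors the sequence described above; all names are
// illustrative stand-ins, not the real apiserver code.
func gracefulShutdown(
	termSignal <-chan struct{},
	shutdownDelay time.Duration,
	startPreShutdownHooks func(),
	preShutdownHooksDone <-chan struct{},
	rejectNewRequests func(),
	shutdownHTTPServer func(),
	waitForInFlightRequests func(),
) {
	<-termSignal            // T0: ShutdownInitiated (TERM received)
	startPreShutdownHooks() // hooks run in the background and may issue API calls
	// /readyz starts returning red around here.

	time.Sleep(shutdownDelay) // T0 + ShutdownDelayDuration

	<-preShutdownHooksDone    // what this PR adds: wait for the hooks to finish first
	rejectNewRequests()       // e.g. via HandlerChainWaitGroup.Wait(); new requests get retry-after
	shutdownHTTPServer()      // initiate net/http server shutdown
	waitForInFlightRequests() // includes requests issued by the preshutdown hooks
}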

Member

But closing the listener before preshutdown hooks is incorrect - because preshutdown hooks may need to send API calls (and some do send them).

that was the bug fixed by #108033 :)

the current sequence roughly is:

  • T0: ShutdownInitiated: TERM signal received
    • run preshutdown hooks
    • /readyz starts returning red
  • T0+ShutdownDelayDuration:
    • wait for preshutdown hooks to finish (this PR accomplishes this)
    • reject incoming request(s) from here on by invoking HandlerChainWaitGroup.Wait()
    • initiate net/http server shutdown
    • wait for in-flight requests to drain (this includes requests from preshutdown hooks)

the last sentence is confusing, since you mention that you've already waited "for preshutdown hooks to finish"

@p0lyn0mial
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 12, 2022
@aojea
Member

aojea commented May 12, 2022

/assign @tkashem

we went through this a few months ago

#108033

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 12, 2022
// | |
// |---------------------------------- |
// | | |
// | (HandlerChainWaitGroup::Wait) |
Member

hehe, now it's clear

Contributor

I was working on a hackmd a while back - https://hackmd.io/yjCkQzImQs-_qC2yAW9EuA, let me know if you can access it.

Contributor

it needs some more work though

Member Author

Yes - I can access it. We can link it here eventually - for now we can leave what I added here.

@leilajal
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 12, 2022
@wojtek-t wojtek-t force-pushed the fix_shutdown_sequence branch 3 times, most recently from 93bfc4f to ff3c15a on May 24, 2022 07:08
@aojea
Member

aojea commented May 24, 2022

k8s.io/kubernetes/test/integration/controlplane: TestGracefulShutdown

:/

@wojtek-t
Member Author

k8s.io/kubernetes/test/integration/controlplane: TestGracefulShutdown

@aojea - this is expected, this PR depends on #110026 (and I didn't add that explicitly to this PR)

@wojtek-t wojtek-t force-pushed the fix_shutdown_sequence branch 2 times, most recently from 93fde5d to 85aede8 on May 25, 2022 17:17
@wojtek-t wojtek-t changed the title from "Fix genericapiserver to wait for preshutdown hooks before rejecting all calls" to "Fix apiserver shutdown in integration tests" on May 25, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wojtek-t
Member Author

@aojea - this is now ready for review

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 26, 2022
}
if err := prepared.Run(stopCh); err != nil {
	errCh <- err
	t.Error(err)
Member

return?

Member Author

It's the last thing in this function anyway.
But it wouldn't hurt either - added.

Member

lol, sorry - you replaced the t.Error(err) with a return in the previous line, since the err is now processed on the other side of the channel.
I was asking if you wanted to replace this t.Error(err) with a return too, but keep it this way, it doesn't hurt
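
For readers following along, the shape under discussion is roughly the following - a self-contained sketch where the runner type and the cleanup code are stand-ins for the real testserver.go helper:

package main

import (
	"fmt"
	"time"
)

// runner stands in for the prepared GenericAPIServer used by the test helper.
type runner struct{}

func (runner) Run(stopCh <-chan struct{}) error {
	<-stopCh
	return nil // pretend the server shut down cleanly
}

func main() {
	stopCh := make(chan struct{})
	errCh := make(chan error)

	prepared := runner{}
	go func() {
		defer close(errCh)
		if err := prepared.Run(stopCh); err != nil {
			errCh <- err // hand the failure to the cleanup path
			return       // optional here, since Run is the last statement anyway
		}
	}()

	time.Sleep(10 * time.Millisecond)
	close(stopCh) // ask the "server" to stop

	// Cleanup side: report an error only if the goroutine actually sent one.
	if err, ok := <-errCh; ok && err != nil {
		fmt.Printf("Failed to shut down test server cleanly: %v\n", err)
	}
}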

if errCh != nil {
	err, ok := <-errCh
	if ok && err != nil {
		klog.Errorf("Failed to run test server clearly: %v", err)
Member

nit,

Suggested change
klog.Errorf("Failed to run test server clearly: %v", err)
klog.Errorf("Failed to shutdown test server clearly: %v", err)

Member Author

done

@@ -209,8 +219,9 @@ func StartTestServer(t Logger, instanceOptions *TestServerInstanceOptions, custo
server.GenericAPIServer.StorageVersionManager = instanceOptions.StorageVersionWrapFunc(server.GenericAPIServer.StorageVersionManager)
}

errCh := make(chan error)
errCh = make(chan error)
go func(stopCh <-chan struct{}) {
Member

do you know why we have to capture the stopCh in this goroutine?

Member Author

well - it's probably not needed...

Do you want me to change this?

Member

nope, it was a question for me, for learning ...
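
The question above is mostly about Go closure mechanics: for a channel variable that is never reassigned, passing it as a goroutine parameter and capturing it in the closure behave the same. A small generic illustration (not the testserver.go code):

package main

import (
	"fmt"
	"sync"
)

func main() {
	stopCh := make(chan struct{})

	var wg sync.WaitGroup
	wg.Add(2)

	// Form 1: pass the channel as a parameter; the goroutine receives the
	// channel value as it was when the goroutine was started.
	go func(stop <-chan struct{}) {
		defer wg.Done()
		<-stop
		fmt.Println("parameter form stopped")
	}(stopCh)

	// Form 2: capture the variable in the closure; this only differs if the
	// variable is reassigned later, which the test helper does not do.
	go func() {
		defer wg.Done()
		<-stopCh
		fmt.Println("closure form stopped")
	}()

	close(stopCh)
	wg.Wait()
}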

Comment on lines +73 to +74
// We tear it down in the background to ensure that
// pending requests still work fine.
Member

I don't fully understand this change - before, we "guaranteed" that new requests come once the apiserver was down; isn't it going to race this way?

Member Author

Sorry - can you clarify your comment?

Before the changes I'm making in this PR to make "server.TearDownFn" actually synchronous, this was only triggering the shutdown and the lines below were happening in the meantime.

With "server.TearDownFn" being synchronous, by the time it finishes the server is already down. So there is no chance that writing anything to the body in line 96 will succeed.
[What would happen without this adjustment to the test is that the request hangs, the server waits to be drained, but finally the request times out and the server then shuts down.]

In order to make it work, we need to trigger the shutdown here, but actually let it do its work in the background.
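
A simplified sketch of the adjusted test flow - the function arguments below are hypothetical stand-ins for the real test code and only show why the teardown has to run in the background:

package main

// runShutdownTest shows the shape of the adjusted test: teardown is started in
// the background so the request opened earlier can still be written to and
// finished, which lets the drain (and therefore the shutdown) complete.
func runShutdownTest(
	tearDown func(), // synchronous: returns only once the server is fully down
	writeToPendingRequest func(), // writes "garbage" to the request opened earlier
	finishPendingRequest func(),
) {
	tearDownFinished := make(chan struct{})
	go func() {
		defer close(tearDownFinished)
		tearDown() // run in the background so the pending request isn't stuck behind it
	}()

	// The request was opened before the shutdown was triggered, so the server
	// keeps it alive while draining and we can still write to it and finish it.
	writeToPendingRequest()
	finishPendingRequest()

	<-tearDownFinished // the drain is complete and the server is down
}

func main() {
	runShutdownTest(func() {}, func() {}, func() {})
}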

Member

ic, the thing that synchronizes with the shutdown is when it starts to fail "new requests", so you know it started shutting down.
Then, it waits 500 ms, writes "garbage" and checks that it returns something.
Then, you wait for the shutdown to exit.

Is there any way to guarantee that the shutdown never finishes before we can write?

Member Author

Is there any way to guarantee that the shutdown never finishes before we can write?

In general, the request that we're writing "garbage" to is started earlier (line 44). So it has already reached the apiserver.

Part of the apiserver shutdown sequence is that it won't close the HTTP server before all requests are actually drained. In our case, that includes this particular request.

The request will not be drained until it finishes, so given that we don't finish it before writing to it, the only other way for it to finish is if it times out on the server side.

So our only risk is the request timing out on the server side. But the default timeout is 1m, and I think it's fine to leave it that way.

[FWIW - this is not a regression in this PR - the same failure mode existed before (if we didn't write anything to the body for 1m, it would time out on the server side).]

Member

yeah, I just wanted to fully understand the test and the change, thanks for the explanation

Comment on lines -121 to -130
// get a leased session
session, err := concurrency.NewSession(rawClient)
if err != nil {
t.Fatal(err)
}

// then build and use an etcd lock
// this prevents more than one of these api servers from running at the same time
lock := concurrency.NewLocker(session, "kube_integration_etcd_raw")
lock.Lock()
Member

why was this needed?

Member Author

I don't think it was really needed.

I think the problem is that we have N different implementations, every one of them doing things differently.
We need to unify this stuff as it is unsustainable. And none of the other implementations does anything like that.

Comment on lines +154 to +155
errCh <- err
return
Member
@aojea May 26, 2022

in cmd/kube-apiserver/app/testing/testserver.go the wait.PollImmediate that waits for the /healthz endpoint checks this channel

		select {
		case err := <-errCh:
			return false, err
		default:
		}

if PrepareRun doesn't work, we can fail faster this way; otherwise I assume we'll only fail after waiting for healthz to become OK, because we are returning without checking whether PrepareRun() failed - errCh is only checked in the cleanup
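
The fail-fast pattern being referenced looks roughly like this - a sketch where checkHealthz is a hypothetical stand-in for the real /healthz probe and errCh is the channel fed by the server goroutine:

package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

func main() {
	errCh := make(chan error, 1)
	checkHealthz := func() bool { return true } // hypothetical /healthz probe

	// If the server goroutine already reported a failure on errCh, stop polling
	// immediately instead of waiting for the /healthz check to time out.
	err := wait.PollImmediate(100*time.Millisecond, time.Minute, func() (bool, error) {
		select {
		case err := <-errCh:
			return false, err
		default:
		}
		return checkHealthz(), nil
	})
	fmt.Println("healthz wait finished with:", err)
}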

Member Author

good point - added.

if err := kubeAPIServer.GenericAPIServer.PrepareRun().Run(stopCh); err != nil {
t.Error(err)
errCh <- err
Member

same comment as in https://github.com/kubernetes/kubernetes/pull/110000/files#r882448491 - this channel is only checked on cleanup; we can do a quick check while waiting for healthz, to fail fast instead of waiting on the apiserver /healthz endpoint

Member Author

good point - added

@wojtek-t
Member Author

@aojea - comments applied - PTAL

@aojea
Member

aojea commented May 26, 2022

lgtm with one question about the shutdown test change

@aojea
Member

aojea commented May 26, 2022

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 26, 2022
@k8s-ci-robot k8s-ci-robot merged commit eb37a5d into kubernetes:master May 26, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.25 milestone May 26, 2022
@wojtek-t wojtek-t added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed kind/bug Categorizes issue or PR as related to a bug. labels May 26, 2022