Shutdown servers when resetting cluster membership by claudiubelu · Pull Request #295 · canonical/microcluster

claudiubelu · 2024-11-29T08:35:56Z

In the k8s-snap, we're seeing a significant amount of EOF errors client-side, typically when operations such as k8s bootstrap or k8s join-cluster are run and an error occurs server-side.

When closing a listener, the server won't accept any new connections. However, we should still allow for the current connections / requests to finish. Calling server.Shutdown allows us to wait for them to finish. We'll be doing this only if drainConnectionsTimeout has been set for the server.

Note that in the case we fail to bootstrap / join the cluster, we'll be resetting a few things, including the cluster membership. This includes the HTTPS and unix socket servers we have open by closing them.

However, we cannot gracefully shutdown the servers, as there's at least one connection that is still open: the bootstrap / join request. Forcing the connection to close before we're able to write the request response will result in the client getting an EOF error, and no information regarding the failure.

Running the revert actions in a goroutine will address this issue: while the revert happens, we'll be able to return and write the HTTP response and then close the connection, finally allowing the Servers to gracefully shutdown, and the clients to be happy. As a bonus, this will also significantly reduce the amount of time spent by the bootstrap / join requests in case of failures.

roosterfish · 2024-11-29T08:53:24Z

Hi, thanks for the PR. Have you seen #198?
The idea is to add a new DrainConnections field to the Server struct which allows to configure what you are proposing in this PR.

Could you please make the functionality dependent on this field?
You should be able to pass another flag into NewNetwork() when creating the endpoint so that when calling Close() on it you have the necessary information available.

claudiubelu · 2024-12-03T12:25:49Z

Hi, thanks for the PR. Have you seen #198? The idea is to add a new DrainConnections field to the Server struct which allows to configure what you are proposing in this PR.

Could you please make the functionality dependent on this field? You should be able to pass another flag into NewNetwork() when creating the endpoint so that when calling Close() on it you have the necessary information available.

Hmm, there might also be the question of how long to wait for connections to drain. Currently, we're waiting 5 seconds, but that could be configurable. We could have DrainConnectionsTimeout, and setting it to 0 would disable this aspect. Wdyt?

roosterfish · 2024-12-03T13:11:13Z

We could have DrainConnectionsTimeout, and setting it to 0 would disable this aspect. Wdyt?

Sounds good to me. By default (in case DrainConnectionsTimeout is 0) I would use the endpoints ctx which is present for both types Network and Socket when calling Shutdown(ctx) on the server.
In your changes you currently use context.Background(). Only if DrainConnectionsTimeout != 0 I would derive a new context with timeout from the endpoint's ctx.

claudiubelu · 2024-12-04T16:48:22Z

We could have DrainConnectionsTimeout, and setting it to 0 would disable this aspect. Wdyt?

Sounds good to me. By default (in case DrainConnectionsTimeout is 0) I would use the endpoints ctx which is present for both types Network and Socket when calling Shutdown(ctx) on the server. In your changes you currently use context.Background(). Only if DrainConnectionsTimeout != 0 I would derive a new context with timeout from the endpoint's ctx.

Added the DrainConnectionsTimeout, though it's been added in the daemon.Args as well, which is to be used for core servers.

Also, the original intention was to solve the EOF issue we've been seeing around, and I've included a fix for that as well.

roosterfish

Thank you for this.
Good call also allowing to set the timeout for the core listeners.

Please have a look at my few comments.

roosterfish · 2024-12-11T08:57:09Z

 			if err != nil {
 				return err
 			}
+			err = d.endpoints.Shutdown()


Shouldn't we rather extend Down() instead of adding another Shutdown()? As we are already looping through the listeners in Down() you can also call endpoint.ShutdownServer() from there.

Indeed. That was my original implementation, as you can see from the fact that this separation was introduced in the second commit (will squash before merging).

However, the reason behind this is a bit complicated. First of all, Down() shutting down the listeners is a good idea, it allows us to no longer accept new connections, and it would be best to call it early in such cases. But we can't necessarily close down the servers just yet, as this may be called due to a request (e.g.: if bootstrap fails (aka during a request), we're "resetting" the cluster). Because this happens during a request, we cannot gracefully complete the request and shutdown the server on the same thread (or at least with how microcluster's code flows), which is why reverter.Fail() was goroutined. If we forcefully shutdown the servers while a connection is still in progress, the client will get an EOF and with no explanation of what actually happens (it doesn't get the error we're supposed to send back, which is the original reason this PR was sent).

We could go back to how I've done this previously, and having Down() also shutdown the servers as you suggest, but the same thread response + shutdown issue I've mentioned could still occur, like it does in clusterMemberPut (the listeners are stopped before writing the reply).

If I understand it right this is now fixed as you run StopListeners within the reExec go routine?

roosterfish · 2024-12-11T09:01:37Z

+		// Running the revert actions in a goroutine will address this issue: while the revert happens,
+		// we'll be able to return and write the HTTP response and then close the connection, finally
+		// allowing the Servers to gracefully shutdown, and the clients to be happy.
+		go reverter.Fail()


Is this working just because the sending of the response is faster than the revert? I fear we require some form of synchronization so that it works consistently.

Can we do something like this to flush the response beforehand?

Indeed, we're doing this to allow the request to actually finish and the response to be written back before closing down the server. The reverter also includes shutting down the server, so we need to finish our request beforehand.

The ManualResponse bit from clusterMemberPut is interesting. Note that resetClusterMember is called before actually getting to write any response. If resetClusterMember would result in the servers to shutdown before getting to write the response, it would result in another EOF error. There is a way in which the ManualResponse may help... though it will be a bit odd: stop the servers in the ManualResponse.

Same as #295 (comment)

roosterfish

Can you please again squash the commits to clean up the history? Please also check my comment on the endpoint ctx.

claudiubelu · 2024-12-16T12:43:58Z

Can you please again squash the commits to clean up the history? Please also check my comment on the endpoint ctx.

Squashed the commits.

I won't change the defer part though, I think it's fine as it is. According to the golang documentation, "A defer statement defers the execution of a function until the surrounding function returns.", which means that the context gets canceled after the Close method returns. We have to cancel the context in case an errors occurs, and if no error occurs, we'd have to write something like this:

err = shutdownServer(n.ctx, n.server, n.drainConnectionsTimeout)
if err != nil {
  n.cancel()
  //
}

n.cancel()

We'd have that call duplicated, and it would have been worse if there were more error cases. Having a defer simplifies this. Plus, In a lot of official golang documentation defers are used, including examples with contexts: https://pkg.go.dev/context#WithTimeout

roosterfish · 2024-12-16T13:29:10Z

Thanks for squashing.

We'd have that call duplicated

That's a good point so let's keep the defer where it is

A defer statement defers the execution of a function until the surrounding function returns.

This is right but doesn't account for the call to shutdownServers which happens afterwards. You can run this example which will always print hello from ff2 before hello from ff which got deferred:

ff := func() string {
    return "hello from ff"
}

ff2 := func() string {
    defer fmt.Println("hello from ff2")
    return ff()
}

fmt.Println(ff2())

I would propose to just run shutdownServers outside of the return to prevent this situation and just do return nil at the end.

roosterfish · 2024-12-16T14:05:54Z

Sorry maybe I wasn't clear enough with the squashing. I saw your comment here and thought you will squash only the changes which are anyway reverted in one of the follow up commits.

We group the changes by package (indicated as prefix in the commit message). Meaning your changes to the daemon's config would be in a commit message like "internal/daemon: Adding DrainConnectionsTimeout field to config" and so on.

claudiubelu · 2024-12-16T14:46:14Z

Thanks for squashing.

We'd have that call duplicated

That's a good point so let's keep the defer where it is

A defer statement defers the execution of a function until the surrounding function returns.

This is right but doesn't account for the call to shutdownServers which happens afterwards. You can run this example which will always print hello from ff2 before hello from ff which got deferred:
ff := func() string {
    return "hello from ff"
}

ff2 := func() string {
    defer fmt.Println("hello from ff2")
    return ff()
}

fmt.Println(ff2())
I would propose to just run shutdownServers outside of the return to prevent this situation and just do return nil at the end.

Yes, it does print hello from ff2 before hello from ff, and that is to be expected, because the defer's "surrounding function" mentioned in the golang documentation is actually the anonymous assigned to ff2, not the function in which ff2 is defined.

Sorry maybe I wasn't clear enough with the squashing. I saw your comment here and thought you will squash only the changes which are anyway reverted in one of the follow up commits.

We group the changes by package (indicated as prefix in the commit message). Meaning your changes to the daemon's config would be in a commit message like "internal/daemon: Adding DrainConnectionsTimeout field to config" and so on.

Done.

roosterfish

You are right on the defer, my example doesn't make sense as it returns the value which changes the effective order of the prints. Forget about it :)

Almost there, just two smaller comments.

claudiubelu · 2024-12-17T09:17:24Z

Seeing these errors in the logs now:

Dec 16 23:14:13 ubuntu k8s.k8sd[1038456]: time="2024-12-16T23:14:13Z" level=info msg="Stopping REST API handler - closing https socket" address="172.17.201.20:6400"
Dec 16 23:14:13 ubuntu k8s.k8sd[1038456]: time="2024-12-16T23:14:13Z" level=error msg="Failed to start server" err="accept tcp 172.17.201.20:6400: use of closed network connection"
Dec 16 23:14:13 ubuntu k8s.k8sd[1038456]: time="2024-12-16T23:14:13Z" level=info msg="Stopping REST API handler - closing socket" socket=/var/snap/k8s/common/var/lib/k8sd/state/control.socket
Dec 16 23:14:13 ubuntu k8s.k8sd[1038456]: time="2024-12-16T23:14:13Z" level=error msg="Failed to start server" err="accept unix /var/snap/k8s/common/var/lib/k8sd/state/control.socket: use of closed network connection"

Amended Serve from socket.go and network.go to address this issue. Could you review those changes? The logs are now (tested on a cherry-pick to v2 branch for k8s-snap):

Dec 17 09:12:30 ubuntu k8s.k8sd[1159247]: time="2024-12-17T09:12:30Z" level=info msg="Stopping REST API handler - closing socket" socket=/var/snap/k8s/common/var/lib/k8sd/state/control.socket
Dec 17 09:12:30 ubuntu k8s.k8sd[1159247]: time="2024-12-17T09:12:30Z" level=info msg="Received shutdown signal - stopped serving unix socket listener"
Dec 17 09:12:30 ubuntu k8s.k8sd[1159247]: time="2024-12-17T09:12:30Z" level=info msg="Stopping REST API handler - closing https socket" address="172.17.201.20:6400"
Dec 17 09:12:30 ubuntu k8s.k8sd[1159247]: time="2024-12-17T09:12:30Z" level=info msg="Received shutdown signal - stopped serving https socket listener"

roosterfish · 2024-12-18T15:47:42Z

Could you review those changes?

Hi, I have ran some extended tests and it looks like we face two other issue here.

In case you set DrainConnectionsTimeout on one of the listeners (core or extended), if you CTRL+C (kill -SIGINT <pid>) any of the daemons, you will see the following error:

ERROR  [2024-12-18T16:30:09+01:00] Failed to gracefully shutdown server          err="context canceled"

This is because as soon as the daemon receives the signal, it cancels its main shutdownCtx context.
As the endpoints (socket/network) context is derived from this one, it gets cancelled too which means the call to server.Shutdown() can never wait and errors out immediately.

So we have to detach shutdownServer from the endpoint's context to allow waiting for x seconds to drain the connections.

We should not return the err from Shutdown back to the caller. Otherwise you might see errors like Error: Daemon stopped with error: context canceled which are misleading as only the draining of the connections timed out.
But returning the error from the fallback Close() seems legit as this should work and returning the error in case it fails can be valuable.

claudiubelu · 2024-12-19T15:51:24Z

Could you review those changes?

Hi, I have ran some extended tests and it looks like we face two other issue here.
1. In case you set `DrainConnectionsTimeout` on one of the listeners (core or extended), if you CTRL+C (`kill -SIGINT <pid>`) any of the daemons, you will see the following error:
ERROR  [2024-12-18T16:30:09+01:00] Failed to gracefully shutdown server          err="context canceled"
This is because as soon as the daemon receives the signal, it cancels its main shutdownCtx context. As the endpoints (socket/network) context is derived from this one, it gets cancelled too which means the call to server.Shutdown() can never wait and errors out immediately.

So we have to detach shutdownServer from the endpoint's context to allow waiting for x seconds to drain the connections.

Right. I initially had context.Background used in this case. Amended commit.

2. We should not return the [`err`](https://github.com/canonical/microcluster/pull/295/files#diff-34c26aa9a2a857700db2ae6c687e4156109197dbe1de7611d2ef5aeecccf8389R37) from `Shutdown` back to the caller. Otherwise you might see errors like `Error: Daemon stopped with error: context canceled` which are misleading as only the draining of the connections timed out.

If we are going to use a separated context (context.Background), we shouldn't be seeing this type of error (context canceled) (which is not related to the operation timing out). Instead, if the operation times out (context deadline exceeded), or any other error occurs, it's still something we should return, handle, and log accordingly. If an error occurs, I would want to see it at least.

roosterfish

Thanks for the efforts, LGTM!

roosterfish · 2025-01-07T08:17:02Z

@claudiubelu wanted to merge now but saw the commits aren't signed. Could you please sign them?

Adds configuration option for draining server connections before shutting it down. Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>

Adds configuration option for draining core server connections before shutting it down. Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>

When closing a listener, the server won't accept any new connections. However, we should still allow for the current connections / requests to finish. Calling ``server.Shutdown`` allows us to wait for them to finish. We'll be doing this only if drainConnectionsTimeout has been set for the server. Note that server.Serve always returns an error, and it will return http.ErrServerClosed if server.Shutdown or server.Close was called, and net.ErrClosed if the listener was closed. Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>

Note that in the case we fail to bootstrap / join the cluster, we'll be resetting a few things, including the cluster membership. This includes the HTTPS and unix socket servers we have open by closing them. However, we cannot gracefully shutdown the servers, as there's at least one connection that is still open: the bootstrap / join request. Forcing the connection to close before we're able to write the request response will result in the client getting an EOF error, and no information regarding the failure. Shutting down the server in a goroutine will address this issue: we'll be able to return and write the HTTP response and then close the connection, finally allowing the Servers to gracefully shutdown, and the clients to be happy. Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>

Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>

claudiubelu · 2025-01-08T09:40:52Z

@claudiubelu wanted to merge now but saw the commits aren't signed. Could you please sign them?

Done, Sorry, I forgot to sign them. :)

Backports #295.

This addresses a regression introduced with #295. Before you were able to close endpoints (listeners) without closing the underlying server. As part of allowing graceful shutdown of servers (and listeners), we added code to implicitly shutdown the listeners underlying server for proper cleanup. However it looks the code always assumed the underlying server never gets closed which allows instructing a remote Microcluster member to join the cluster without closing the connection that was used to instruct this member to join the cluster. The PR introduces another flag that explicitly allows shutting down an endpoint's (listener's) underlying server. We can do this in most of the cases but should not do it in the following two: * Close the server when a remote client instructed to join an existing cluster * Close the server when restarting one of the extension servers. This allows keeping active connections alive.

claudiubelu force-pushed the shutdown-server branch 2 times, most recently from e80421f to bc4314c Compare November 29, 2024 08:47

claudiubelu force-pushed the shutdown-server branch 3 times, most recently from 0902544 to 87196d3 Compare December 4, 2024 16:30

claudiubelu force-pushed the shutdown-server branch 2 times, most recently from 7aa3265 to 62764f9 Compare December 5, 2024 10:40

claudiubelu changed the title ~~Shutdown servers when closing the listeners~~ Shutdown servers when reseting cluster membership Dec 5, 2024

claudiubelu changed the title ~~Shutdown servers when reseting cluster membership~~ Shutdown servers when resetting cluster membership Dec 5, 2024

roosterfish requested changes Dec 11, 2024

View reviewed changes

claudiubelu force-pushed the shutdown-server branch from b1d7762 to 5cb95f0 Compare December 11, 2024 23:37

roosterfish requested changes Dec 12, 2024

View reviewed changes

Comment thread internal/endpoints/network.go

Comment thread internal/endpoints/network.go

Comment thread internal/rest/resources/cluster.go Outdated

claudiubelu force-pushed the shutdown-server branch from 41bbf5d to d26cd86 Compare December 12, 2024 15:04

roosterfish requested changes Dec 16, 2024

View reviewed changes

Comment thread internal/endpoints/network.go

claudiubelu force-pushed the shutdown-server branch from d26cd86 to 69c7a9d Compare December 16, 2024 12:36

claudiubelu force-pushed the shutdown-server branch from 69c7a9d to b6c2a37 Compare December 16, 2024 14:46

roosterfish requested changes Dec 16, 2024

View reviewed changes

Comment thread internal/endpoints/network.go Outdated

Comment thread internal/endpoints/socket.go Outdated

Comment thread internal/endpoints/util.go Outdated

claudiubelu force-pushed the shutdown-server branch from b6c2a37 to b526e5d Compare December 16, 2024 15:23

claudiubelu marked this pull request as draft December 17, 2024 08:21

claudiubelu marked this pull request as ready for review December 17, 2024 09:17

claudiubelu force-pushed the shutdown-server branch from b526e5d to bf7c167 Compare December 19, 2024 15:51

roosterfish approved these changes Dec 20, 2024

View reviewed changes

claudiubelu added 5 commits January 8, 2025 09:39

rest: Adds DrainConnectionsTimeout for Server

165f9f5

Adds configuration option for draining server connections before shutting it down. Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>

internal/daemon: Adds DrainConnectionsTimeout for Server

d0431e1

Adds configuration option for draining core server connections before shutting it down. Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>

internal/state: Updates comments for StopListeners and Stop

a4b91df

Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>

claudiubelu force-pushed the shutdown-server branch from bf7c167 to a4b91df Compare January 8, 2025 09:39

roosterfish merged commit d50fa50 into canonical:v3 Jan 8, 2025

roosterfish mentioned this pull request Jan 8, 2025

Allow draining connections of additional servers #198

Closed

claudiubelu mentioned this pull request Jan 8, 2025

Backport shutdown servers when resetting cluster membership (v2) #309

Merged

roosterfish added a commit that referenced this pull request Jan 13, 2025

Backport shutdown servers when resetting cluster membership (v2) (#309)

5bfc384

Backports #295.

claudiubelu mentioned this pull request Feb 12, 2025

Adds DrainConnectionsTimeout microcluster configuration canonical/k8s-snap#867

Merged

roosterfish mentioned this pull request Mar 18, 2025

Endpoints: Fix implicit shutdown #353

Merged

Conversation

claudiubelu commented Nov 29, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

roosterfish commented Nov 29, 2024

Uh oh!

claudiubelu commented Dec 3, 2024

Uh oh!

roosterfish commented Dec 3, 2024

Uh oh!

claudiubelu commented Dec 4, 2024

Uh oh!

roosterfish left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

roosterfish Dec 11, 2024

Choose a reason for hiding this comment

Uh oh!

claudiubelu Dec 11, 2024

Choose a reason for hiding this comment

Uh oh!

roosterfish Dec 16, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

roosterfish Dec 11, 2024

Choose a reason for hiding this comment

Uh oh!

roosterfish Dec 11, 2024

Choose a reason for hiding this comment

Uh oh!

claudiubelu Dec 11, 2024

Choose a reason for hiding this comment

Uh oh!

roosterfish Dec 16, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

roosterfish left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

claudiubelu commented Dec 16, 2024

Uh oh!

roosterfish commented Dec 16, 2024

Uh oh!

roosterfish commented Dec 16, 2024

Uh oh!

claudiubelu commented Dec 16, 2024

Uh oh!

roosterfish left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

claudiubelu commented Dec 17, 2024

Uh oh!

roosterfish commented Dec 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

claudiubelu commented Dec 19, 2024

Uh oh!

roosterfish left a comment

Choose a reason for hiding this comment

Uh oh!

roosterfish commented Jan 7, 2025

Uh oh!

claudiubelu commented Jan 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

claudiubelu commented Nov 29, 2024 •

edited

Loading

roosterfish commented Dec 18, 2024 •

edited

Loading