Skip to content

Shutdown servers when resetting cluster membership#295

Merged
roosterfish merged 5 commits into
canonical:v3from
claudiubelu:shutdown-server
Jan 8, 2025
Merged

Shutdown servers when resetting cluster membership#295
roosterfish merged 5 commits into
canonical:v3from
claudiubelu:shutdown-server

Conversation

@claudiubelu
Copy link
Copy Markdown
Contributor

@claudiubelu claudiubelu commented Nov 29, 2024

In the k8s-snap, we're seeing a significant amount of EOF errors client-side, typically when operations such as k8s bootstrap or k8s join-cluster are run and an error occurs server-side.

When closing a listener, the server won't accept any new connections. However, we should still allow for the current connections / requests to finish. Calling server.Shutdown allows us to wait for them to finish. We'll be doing this only if drainConnectionsTimeout has been set for the server.

Note that in the case we fail to bootstrap / join the cluster, we'll be resetting a few things, including the cluster membership. This includes the HTTPS and unix socket servers we have open by closing them.

However, we cannot gracefully shutdown the servers, as there's at least one connection that is still open: the bootstrap / join request. Forcing the connection to close before we're able to write the request response will result in the client getting an EOF error, and no information regarding the failure.

Running the revert actions in a goroutine will address this issue: while the revert happens, we'll be able to return and write the HTTP response and then close the connection, finally allowing the Servers to gracefully shutdown, and the clients to be happy. As a bonus, this will also significantly reduce the amount of time spent by the bootstrap / join requests in case of failures.

@claudiubelu claudiubelu force-pushed the shutdown-server branch 2 times, most recently from e80421f to bc4314c Compare November 29, 2024 08:47
@roosterfish
Copy link
Copy Markdown
Contributor

Hi, thanks for the PR. Have you seen #198?
The idea is to add a new DrainConnections field to the Server struct which allows to configure what you are proposing in this PR.

Could you please make the functionality dependent on this field?
You should be able to pass another flag into NewNetwork() when creating the endpoint so that when calling Close() on it you have the necessary information available.

@claudiubelu
Copy link
Copy Markdown
Contributor Author

Hi, thanks for the PR. Have you seen #198? The idea is to add a new DrainConnections field to the Server struct which allows to configure what you are proposing in this PR.

Could you please make the functionality dependent on this field? You should be able to pass another flag into NewNetwork() when creating the endpoint so that when calling Close() on it you have the necessary information available.

Hmm, there might also be the question of how long to wait for connections to drain. Currently, we're waiting 5 seconds, but that could be configurable. We could have DrainConnectionsTimeout, and setting it to 0 would disable this aspect. Wdyt?

@roosterfish
Copy link
Copy Markdown
Contributor

We could have DrainConnectionsTimeout, and setting it to 0 would disable this aspect. Wdyt?

Sounds good to me. By default (in case DrainConnectionsTimeout is 0) I would use the endpoints ctx which is present for both types Network and Socket when calling Shutdown(ctx) on the server.
In your changes you currently use context.Background(). Only if DrainConnectionsTimeout != 0 I would derive a new context with timeout from the endpoint's ctx.

@claudiubelu claudiubelu force-pushed the shutdown-server branch 3 times, most recently from 0902544 to 87196d3 Compare December 4, 2024 16:30
@claudiubelu
Copy link
Copy Markdown
Contributor Author

We could have DrainConnectionsTimeout, and setting it to 0 would disable this aspect. Wdyt?

Sounds good to me. By default (in case DrainConnectionsTimeout is 0) I would use the endpoints ctx which is present for both types Network and Socket when calling Shutdown(ctx) on the server. In your changes you currently use context.Background(). Only if DrainConnectionsTimeout != 0 I would derive a new context with timeout from the endpoint's ctx.

Added the DrainConnectionsTimeout, though it's been added in the daemon.Args as well, which is to be used for core servers.

Also, the original intention was to solve the EOF issue we've been seeing around, and I've included a fix for that as well.

@claudiubelu claudiubelu force-pushed the shutdown-server branch 2 times, most recently from 7aa3265 to 62764f9 Compare December 5, 2024 10:40
@claudiubelu claudiubelu changed the title Shutdown servers when closing the listeners Shutdown servers when reseting cluster membership Dec 5, 2024
@claudiubelu claudiubelu changed the title Shutdown servers when reseting cluster membership Shutdown servers when resetting cluster membership Dec 5, 2024
Copy link
Copy Markdown
Contributor

@roosterfish roosterfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this.
Good call also allowing to set the timeout for the core listeners.

Please have a look at my few comments.

Comment thread rest/rest.go
Comment thread internal/endpoints/util.go Outdated
Comment thread internal/endpoints/util.go Outdated
Comment thread internal/daemon/daemon.go Outdated
if err != nil {
return err
}
err = d.endpoints.Shutdown()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we rather extend Down() instead of adding another Shutdown()? As we are already looping through the listeners in Down() you can also call endpoint.ShutdownServer() from there.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. That was my original implementation, as you can see from the fact that this separation was introduced in the second commit (will squash before merging).

However, the reason behind this is a bit complicated. First of all, Down() shutting down the listeners is a good idea, it allows us to no longer accept new connections, and it would be best to call it early in such cases. But we can't necessarily close down the servers just yet, as this may be called due to a request (e.g.: if bootstrap fails (aka during a request), we're "resetting" the cluster). Because this happens during a request, we cannot gracefully complete the request and shutdown the server on the same thread (or at least with how microcluster's code flows), which is why reverter.Fail() was goroutined. If we forcefully shutdown the servers while a connection is still in progress, the client will get an EOF and with no explanation of what actually happens (it doesn't get the error we're supposed to send back, which is the original reason this PR was sent).

We could go back to how I've done this previously, and having Down() also shutdown the servers as you suggest, but the same thread response + shutdown issue I've mentioned could still occur, like it does in clusterMemberPut (the listeners are stopped before writing the reply).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it right this is now fixed as you run StopListeners within the reExec go routine?

Comment thread internal/daemon/daemon.go Outdated
Comment thread internal/rest/resources/control.go Outdated
// Running the revert actions in a goroutine will address this issue: while the revert happens,
// we'll be able to return and write the HTTP response and then close the connection, finally
// allowing the Servers to gracefully shutdown, and the clients to be happy.
go reverter.Fail()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this working just because the sending of the response is faster than the revert? I fear we require some form of synchronization so that it works consistently.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do something like this to flush the response beforehand?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, we're doing this to allow the request to actually finish and the response to be written back before closing down the server. The reverter also includes shutting down the server, so we need to finish our request beforehand.

The ManualResponse bit from clusterMemberPut is interesting. Note that resetClusterMember is called before actually getting to write any response. If resetClusterMember would result in the servers to shutdown before getting to write the response, it would result in another EOF error. There is a way in which the ManualResponse may help... though it will be a bit odd: stop the servers in the ManualResponse.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as #295 (comment)

Comment thread internal/endpoints/network.go
Comment thread internal/endpoints/network.go
Comment thread internal/rest/resources/cluster.go Outdated
Copy link
Copy Markdown
Contributor

@roosterfish roosterfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please again squash the commits to clean up the history? Please also check my comment on the endpoint ctx.

Comment thread internal/endpoints/network.go
@claudiubelu
Copy link
Copy Markdown
Contributor Author

Can you please again squash the commits to clean up the history? Please also check my comment on the endpoint ctx.

Squashed the commits.

I won't change the defer part though, I think it's fine as it is. According to the golang documentation, "A defer statement defers the execution of a function until the surrounding function returns.", which means that the context gets canceled after the Close method returns. We have to cancel the context in case an errors occurs, and if no error occurs, we'd have to write something like this:

err = shutdownServer(n.ctx, n.server, n.drainConnectionsTimeout)
if err != nil {
  n.cancel()
  //
}

n.cancel()

We'd have that call duplicated, and it would have been worse if there were more error cases. Having a defer simplifies this. Plus, In a lot of official golang documentation defers are used, including examples with contexts: https://pkg.go.dev/context#WithTimeout

@roosterfish
Copy link
Copy Markdown
Contributor

Thanks for squashing.

We'd have that call duplicated

That's a good point so let's keep the defer where it is

A defer statement defers the execution of a function until the surrounding function returns.

This is right but doesn't account for the call to shutdownServers which happens afterwards. You can run this example which will always print hello from ff2 before hello from ff which got deferred:

ff := func() string {
    return "hello from ff"
}

ff2 := func() string {
    defer fmt.Println("hello from ff2")
    return ff()
}

fmt.Println(ff2())

I would propose to just run shutdownServers outside of the return to prevent this situation and just do return nil at the end.

@roosterfish
Copy link
Copy Markdown
Contributor

Sorry maybe I wasn't clear enough with the squashing. I saw your comment here and thought you will squash only the changes which are anyway reverted in one of the follow up commits.

We group the changes by package (indicated as prefix in the commit message). Meaning your changes to the daemon's config would be in a commit message like "internal/daemon: Adding DrainConnectionsTimeout field to config" and so on.

@claudiubelu
Copy link
Copy Markdown
Contributor Author

Thanks for squashing.

We'd have that call duplicated

That's a good point so let's keep the defer where it is

A defer statement defers the execution of a function until the surrounding function returns.

This is right but doesn't account for the call to shutdownServers which happens afterwards. You can run this example which will always print hello from ff2 before hello from ff which got deferred:

ff := func() string {
    return "hello from ff"
}

ff2 := func() string {
    defer fmt.Println("hello from ff2")
    return ff()
}

fmt.Println(ff2())

I would propose to just run shutdownServers outside of the return to prevent this situation and just do return nil at the end.

Yes, it does print hello from ff2 before hello from ff, and that is to be expected, because the defer's "surrounding function" mentioned in the golang documentation is actually the anonymous assigned to ff2, not the function in which ff2 is defined.

Sorry maybe I wasn't clear enough with the squashing. I saw your comment here and thought you will squash only the changes which are anyway reverted in one of the follow up commits.

We group the changes by package (indicated as prefix in the commit message). Meaning your changes to the daemon's config would be in a commit message like "internal/daemon: Adding DrainConnectionsTimeout field to config" and so on.

Done.

Copy link
Copy Markdown
Contributor

@roosterfish roosterfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right on the defer, my example doesn't make sense as it returns the value which changes the effective order of the prints. Forget about it :)

Almost there, just two smaller comments.

Comment thread internal/endpoints/network.go Outdated
Comment thread internal/endpoints/socket.go Outdated
Comment thread internal/endpoints/util.go Outdated
@claudiubelu claudiubelu marked this pull request as draft December 17, 2024 08:21
@claudiubelu
Copy link
Copy Markdown
Contributor Author

Seeing these errors in the logs now:

Dec 16 23:14:13 ubuntu k8s.k8sd[1038456]: time="2024-12-16T23:14:13Z" level=info msg="Stopping REST API handler - closing https socket" address="172.17.201.20:6400"
Dec 16 23:14:13 ubuntu k8s.k8sd[1038456]: time="2024-12-16T23:14:13Z" level=error msg="Failed to start server" err="accept tcp 172.17.201.20:6400: use of closed network connection"
Dec 16 23:14:13 ubuntu k8s.k8sd[1038456]: time="2024-12-16T23:14:13Z" level=info msg="Stopping REST API handler - closing socket" socket=/var/snap/k8s/common/var/lib/k8sd/state/control.socket
Dec 16 23:14:13 ubuntu k8s.k8sd[1038456]: time="2024-12-16T23:14:13Z" level=error msg="Failed to start server" err="accept unix /var/snap/k8s/common/var/lib/k8sd/state/control.socket: use of closed network connection"

Amended Serve from socket.go and network.go to address this issue. Could you review those changes? The logs are now (tested on a cherry-pick to v2 branch for k8s-snap):

Dec 17 09:12:30 ubuntu k8s.k8sd[1159247]: time="2024-12-17T09:12:30Z" level=info msg="Stopping REST API handler - closing socket" socket=/var/snap/k8s/common/var/lib/k8sd/state/control.socket
Dec 17 09:12:30 ubuntu k8s.k8sd[1159247]: time="2024-12-17T09:12:30Z" level=info msg="Received shutdown signal - stopped serving unix socket listener"
Dec 17 09:12:30 ubuntu k8s.k8sd[1159247]: time="2024-12-17T09:12:30Z" level=info msg="Stopping REST API handler - closing https socket" address="172.17.201.20:6400"
Dec 17 09:12:30 ubuntu k8s.k8sd[1159247]: time="2024-12-17T09:12:30Z" level=info msg="Received shutdown signal - stopped serving https socket listener"

@claudiubelu claudiubelu marked this pull request as ready for review December 17, 2024 09:17
@roosterfish
Copy link
Copy Markdown
Contributor

roosterfish commented Dec 18, 2024

Could you review those changes?

Hi, I have ran some extended tests and it looks like we face two other issue here.

  1. In case you set DrainConnectionsTimeout on one of the listeners (core or extended), if you CTRL+C (kill -SIGINT <pid>) any of the daemons, you will see the following error:
ERROR  [2024-12-18T16:30:09+01:00] Failed to gracefully shutdown server          err="context canceled"

This is because as soon as the daemon receives the signal, it cancels its main shutdownCtx context.
As the endpoints (socket/network) context is derived from this one, it gets cancelled too which means the call to server.Shutdown() can never wait and errors out immediately.

So we have to detach shutdownServer from the endpoint's context to allow waiting for x seconds to drain the connections.

  1. We should not return the err from Shutdown back to the caller. Otherwise you might see errors like Error: Daemon stopped with error: context canceled which are misleading as only the draining of the connections timed out.
    But returning the error from the fallback Close() seems legit as this should work and returning the error in case it fails can be valuable.

@claudiubelu
Copy link
Copy Markdown
Contributor Author

Could you review those changes?

Hi, I have ran some extended tests and it looks like we face two other issue here.

1. In case you set `DrainConnectionsTimeout` on one of the listeners (core or extended), if you CTRL+C (`kill -SIGINT <pid>`) any of the daemons, you will see the following error:
ERROR  [2024-12-18T16:30:09+01:00] Failed to gracefully shutdown server          err="context canceled"

This is because as soon as the daemon receives the signal, it cancels its main shutdownCtx context. As the endpoints (socket/network) context is derived from this one, it gets cancelled too which means the call to server.Shutdown() can never wait and errors out immediately.

So we have to detach shutdownServer from the endpoint's context to allow waiting for x seconds to drain the connections.

Right. I initially had context.Background used in this case. Amended commit.

2. We should not return the [`err`](https://github.com/canonical/microcluster/pull/295/files#diff-34c26aa9a2a857700db2ae6c687e4156109197dbe1de7611d2ef5aeecccf8389R37) from `Shutdown` back to the caller. Otherwise you might see errors like `Error: Daemon stopped with error: context canceled` which are misleading as only the draining of the connections timed out.

If we are going to use a separated context (context.Background), we shouldn't be seeing this type of error (context canceled) (which is not related to the operation timing out). Instead, if the operation times out (context deadline exceeded), or any other error occurs, it's still something we should return, handle, and log accordingly. If an error occurs, I would want to see it at least.

Copy link
Copy Markdown
Contributor

@roosterfish roosterfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the efforts, LGTM!

@roosterfish
Copy link
Copy Markdown
Contributor

@claudiubelu wanted to merge now but saw the commits aren't signed. Could you please sign them?

Adds configuration option for draining server connections before
shutting it down.

Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>
Adds configuration option for draining core server connections before
shutting it down.

Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>
When closing a listener, the server won't accept any new connections.
However, we should still allow for the current connections / requests to
finish. Calling ``server.Shutdown`` allows us to wait for them to
finish. We'll be doing this only if drainConnectionsTimeout has been set
for the server.

Note that server.Serve always returns an error, and it will return
http.ErrServerClosed if server.Shutdown or server.Close was called, and
net.ErrClosed if the listener was closed.

Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>
Note that in the case we fail to bootstrap / join the cluster, we'll be
resetting a few things, including the cluster membership. This includes
the HTTPS and unix socket servers we have open by closing them.

However, we cannot gracefully shutdown the servers, as there's at least
one connection that is still open: the bootstrap / join request. Forcing
the connection to close before we're able to write the request response
will result in the client getting an EOF error, and no information
regarding the failure.

Shutting down the server in a goroutine will address this issue: we'll be able
to return and write the HTTP response and then close the connection, finally
allowing the Servers to gracefully shutdown, and the clients to be happy.

Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>
Signed-off-by: Claudiu Belu <claudiu.belu@canonical.com>
@claudiubelu
Copy link
Copy Markdown
Contributor Author

@claudiubelu wanted to merge now but saw the commits aren't signed. Could you please sign them?

Done, Sorry, I forgot to sign them. :)

@roosterfish roosterfish merged commit d50fa50 into canonical:v3 Jan 8, 2025
roosterfish added a commit that referenced this pull request Mar 26, 2025
This addresses a regression introduced with
#295.

Before you were able to close endpoints (listeners) without closing the
underlying server. As part of allowing graceful shutdown of servers (and
listeners), we added code to implicitly shutdown the listeners
underlying server for proper cleanup.

However it looks the code always assumed the underlying server never
gets closed which allows instructing a remote Microcluster member to
join the cluster without closing the connection that was used to
instruct this member to join the cluster.

The PR introduces another flag that explicitly allows shutting down an
endpoint's (listener's) underlying server. We can do this in most of the
cases but should not do it in the following two:

* Close the server when a remote client instructed to join an existing
cluster
* Close the server when restarting one of the extension servers. This
allows keeping active connections alive.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants