
[#3699] Nio|EpollEventLoopGroup.shutdownGracefully() needs to gracefully close connections #3706

Closed

Conversation

normanmaurer
Member


Motivation:

Currently, when calling Nio|EpollEventLoopGroup.shutdownGracefully(), all active Channels are closed right away. This is not what should happen for a graceful shutdown.

Modifications:

Only close Channels after the quiet period is complete.

Result:

Correct behavior when calling shutdownGracefully()

@normanmaurer
Member Author

@trustin please check

@normanmaurer
Member Author

@Scottmitch also please check

@normanmaurer normanmaurer self-assigned this Apr 29, 2015
@normanmaurer normanmaurer added this to the 4.0.28.Final milestone Apr 29, 2015
@trustin
Member

trustin commented Apr 30, 2015

TL;DR - We don't need to fix this because it is working as intended.

The intention of closeAll() is to make sure all channels are closed when an event loop is shutting down, so that the channels do not produce new events anymore. If we moved closeAll() to after confirmShutdown() returns true, the channel events related to the closure might not be processed correctly (e.g. rejected), because an event loop will not accept new tasks once confirmShutdown() returns true.

#3699 says the connections are closed before in-flight messages are handled. Actually, that's intended behavior. Why? If we waited until all incoming messages were handled, shutdown would always take as much time as the grace period, and we still would not be able to ensure that all messages are handled; i.e. we would have solved nothing.

I think it's better for a user to communicate with the remote peers to schedule the shutdown, and to call shutdownGracefully() once the application is ready to terminate the event loop.

/cc @fantayeneh

@normanmaurer
Member Author

@trustin true ... let me close as won't fix...

@fantayeneh

@trustin I am still confused about the shutdownGracefully() documentation then. Can you please shed some light on the purpose of the quietPeriod param? If closeAll() happens right away, why the need for the quiet period?

The current doc says as follows.

Unlike shutdown(), graceful shutdown ensures that no tasks are submitted for 'the quiet period'
(usually a couple seconds) before it shuts itself down. If a task is submitted during the quiet
period, it is guaranteed to be accepted and the quiet period will start over.
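
For reference, the quiet period and the hard timeout are the two arguments of the three-argument overload of shutdownGracefully(); a minimal usage sketch, where the values and the workerGroup name are illustrative:

// quietPeriod = 2s, hard timeout = 15s (illustrative values);
// workerGroup is an existing Nio/EpollEventLoopGroup
io.netty.util.concurrent.Future<?> terminated =
        workerGroup.shutdownGracefully(2, 15, java.util.concurrent.TimeUnit.SECONDS);
terminated.syncUninterruptibly(); // block until the group has fully terminated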

Thanks

@pyr

pyr commented Nov 4, 2022

To support graceful shutdown of applications, the usual workflow is as follows:

  1. Stop listening for new requests
  2. Wait for ongoing tasks to be cleared from the event loop (finish serving in-flight requests)
  3. Stop related executors and shut down service

This makes blue-green style deployments easy to put together. If I am reading @trustin's comment correctly, .shutdownGracefully does not allow for step 2 to happen, since it will cancel ongoing work on the event loop.

Is my understanding correct, and if so, what would be the recommended way to gracefully stop a Netty-based proxy which may have in-flight requests waiting for responses from downstream services before shutting down?

cc @normanmaurer for insight (for context: we are trying to get clj-commons/aleph to do the right thing when closing down)

@pyr

pyr commented Nov 8, 2022

Following up here in case someone else lands on this issue. It seems that the best way to approach this is to keep track of connections with a ChannelGroup (by adding another ChannelHandler to the pipeline which acts during channelActive and channelInactive). After closing the listening channel, ChannelGroup::newCloseFuture can be called and waited upon; once it completes, the rest of the shutdown procedure can be carried out.
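
A minimal sketch of that approach, assuming the tracker handler is added to every accepted channel's pipeline; ConnectionTracker, drainAndShutdown, and the bootstrap variables are illustrative names, not Netty API:

import io.netty.channel.Channel;
import io.netty.channel.ChannelHandler;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.group.ChannelGroup;
import io.netty.channel.group.DefaultChannelGroup;
import io.netty.util.concurrent.GlobalEventExecutor;

// Shared handler that records every accepted connection in a ChannelGroup.
@ChannelHandler.Sharable
public class ConnectionTracker extends ChannelInboundHandlerAdapter {
    private final ChannelGroup connections = new DefaultChannelGroup(GlobalEventExecutor.INSTANCE);

    @Override
    public void channelActive(ChannelHandlerContext ctx) {
        connections.add(ctx.channel()); // a closed channel is removed from the group automatically
        ctx.fireChannelActive();
    }

    // Shutdown sequence; serverChannel and the event loop groups come from the server bootstrap.
    public void drainAndShutdown(Channel serverChannel,
                                 EventLoopGroup boss, EventLoopGroup workers) throws InterruptedException {
        serverChannel.close().sync();        // 1. stop accepting new connections
        connections.newCloseFuture().sync(); // 2. wait for in-flight connections to finish and close
        boss.shutdownGracefully();           // 3. only now stop the event loops
        workers.shutdownGracefully();
    }
}

Because the listening channel is closed before newCloseFuture() is requested, no new connections can join the group afterwards, so the future covers exactly the connections that were in flight.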
