#1412 Wait for async close tasks before close completes #1435

Open
wants to merge 5 commits into
base: main

@MiErnst MiErnst commented Jul 5, 2017

#1412: Implemented a CountDownLatch and called some Netty API methods to close Netty's resources.

Contributor

@slandelle slandelle left a comment

Thanks a lot!
Could you please have a look at my question?


//see https://github.com/netty/netty/issues/2084#issuecomment-44822314
try {
ThreadDeathWatcher.awaitInactivity(5, TimeUnit.SECONDS);
Contributor

@slandelle slandelle Jul 5, 2017

Shouldn't the awaitInactivity timeout be aligned with AsyncHttpClientConfig#getShutdownTimeout or AsyncHttpClientConfig#getShutdownQuietPeriod instead of being hard-coded?

See https://github.com/AsyncHttpClient/async-http-client/blob/master/client/src/main/java/org/asynchttpclient/netty/channel/ChannelManager.java#L309

Author

Yes, aligning it with AsyncHttpClientConfig#getShutdownTimeout makes perfect sense. The quietPeriod seems to have a different meaning, so I think it is inappropriate here.

Contributor

👍


//see https://github.com/netty/netty/issues/2084#issuecomment-44822314
try {
GlobalEventExecutor.INSTANCE.awaitInactivity(5, TimeUnit.SECONDS);
Author

Same here

// Ignore
}

closeLatch.countDown();
Contributor

In finally block?

Author

Yes, that could be done to cover the corner cases.
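A minimal sketch of the "finally" suggestion, assuming a hypothetical doClose helper that takes the cleanup step and the latch as parameters (neither name nor signature is taken from the PR). It only illustrates that the latch is released even when the cleanup throws, so a caller waiting on the latch is not left blocked.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration; not the PR's actual ChannelManager code.
class LatchInFinally {

    // Runs a cleanup step and always releases the latch,
    // even if the cleanup step throws.
    static void doClose(Runnable cleanup, CountDownLatch closeLatch) {
        try {
            cleanup.run();
        } finally {
            closeLatch.countDown();
        }
    }

    public static void main(String[] args) {
        CountDownLatch closeLatch = new CountDownLatch(1);
        try {
            doClose(() -> { throw new IllegalStateException("cleanup failed"); }, closeLatch);
        } catch (IllegalStateException e) {
            // The failure still propagates to the caller...
        }
        // ...but the latch has been released, so a waiter would not block.
        System.out.println("latch count = " + closeLatch.getCount());
    }
}
```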

@@ -310,6 +324,13 @@ public void close() {
.addListener(future -> doClose());
} else
doClose();

try {
closeLatch.await();
Contributor

No timeout for this await?

Author

Yes, the timeout could be reused!
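For reference, a small sketch of a bounded wait on a latch. The awaitClose helper and its parameters are assumptions for illustration; in the PR the timeout would presumably come from the client config. CountDownLatch#await(long, TimeUnit) returns false when the timeout elapses first, which lets the caller tell a completed close from a timed-out one.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a timed await.
class TimedAwait {

    // Waits up to timeoutMillis for the latch.
    // Returns true if the latch reached zero in time, false otherwise.
    static boolean awaitClose(CountDownLatch closeLatch, long timeoutMillis) {
        try {
            return closeLatch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            return false;
        }
    }

    public static void main(String[] args) {
        CountDownLatch neverReleased = new CountDownLatch(1);
        System.out.println(awaitClose(neverReleased, 50)); // false: timed out

        CountDownLatch released = new CountDownLatch(1);
        released.countDown();
        System.out.println(awaitClose(released, 50));      // true: already released
    }
}
```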

@@ -23,7 +23,7 @@ public AsyncHttpClientState(AtomicBoolean closed) {
this.closed = closed;
}

-public boolean isClosed() {
+public boolean isClosedOrClosingIsTriggered() {
return closed.get();
Contributor

I guess it's best to rename closed to closeTriggered.

@@ -597,7 +597,7 @@ public void replayRequest(final NettyResponseFuture<?> future, FilterContext fc,
}

public boolean isClosed() {
Contributor

Rename method too.

Author

Done and attached another commit to the pull request!

@twz123
Contributor

twz123 commented Jul 5, 2017

I'm just wondering... Will this interfere with other parts of an application that also use Netty? So let's say I close an AHC instance but have some other Netty-based component, e.g. a server component - or just another AHC instance - that is still in use. To me, it sounds like those global helper methods would then just run into a timeout.

@slandelle
Contributor

Yeah, actually I think @twz123 is right, and this would also break for people who spawn and shut down AHC instances at runtime without stopping the application :(

@MiErnst
Author

MiErnst commented Jul 6, 2017

@twz123 thanks for checking, but I can't see any interference with other instances of AHC or Netty. If there is still activity on the threads, the awaitInactivity calls triggered in the close method of the AHC client should run into the timeout, and that's it. The close returns successfully without any exception. Maybe the close then needs a while till it returns, but it should not affect other instances. Or did I miss something?

The awaitInactivity implementations call join on a thread, but this only waits for the thread to die and does not actively kill it. Or did I miss something here? Javadoc of join: "Waits at most millis milliseconds for this thread to die."
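The point about join can be checked with plain JDK threads. The sketch below is an illustration only (class and method names are made up, and it does not involve Netty): a timed join returns once the timeout elapses, and the joined thread keeps running.

```java
// Hypothetical illustration: a timed join waits, it does not stop the thread.
class JoinDoesNotKill {

    // Starts a worker that sleeps for workMillis, joins it for at most
    // joinMillis, and reports whether the worker is still alive afterwards.
    static boolean aliveAfterJoin(long workMillis, long joinMillis) {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(workMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive for this demo
        worker.start();
        try {
            worker.join(joinMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.isAlive();
    }

    public static void main(String[] args) {
        // Worker outlives the join timeout: it is still alive afterwards.
        System.out.println(aliveAfterJoin(2000, 100));
        // Worker finishes well before the join timeout: it is no longer alive.
        System.out.println(aliveAfterJoin(10, 2000));
    }
}
```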

If we can't clarify this today, we will have to discuss it another time, because I'm out of office for four weeks (starting tomorrow). Sorry for the inconvenience.

@twz123
Contributor

twz123 commented Jul 6, 2017

Hey @MiErnst,

> should not affect other instances

Okay, that's true. Maybe I should have written in passive voice "is affected by other parts of the app that use Netty" instead of "interferes with". My caveat is exactly what you wrote:

> close then needs a while till it returns

So the blocking behavior of close only works as expected when no other parts of the app are currently using Netty. This is certainly not a complete show-stopper but boils down to "just wait some amount of time and just assume that all resources have been closed concurrently by then" for those cases where Netty is still in use.

From my point of view, there are several possibilities now:

  1. "Live with it!" 😄 Then I'd add a prominent note to the Javadoc of close that explains that behavior.
  2. Have a second, blocking version of close that does exactly what you've implemented. Maybe with some boolean blocking parameter, or returning some Future that can be waited on by the caller if desired. So people can choose according to their use case.
  3. Find a way that only waits for resources to be closed that actually belong to the AHC instance being closed. I have the uneasy feeling that this isn't possible, since those global helper methods are obviously there for a reason.
  4. Don't call the static Netty methods inside close at all. Since they're static, one could just call them manually if desired, after the close call on the AHC instance itself.
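Option 2 above could look roughly like the following. This is a sketch under assumptions: SketchClient, closeAsync, and closeAndAwait are hypothetical names, not AHC API, and the cleanup is simulated with a short sleep. The non-blocking variant hands back a future; the blocking variant is a thin wrapper that waits on it with a timeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical client; not the AHC API.
class SketchClient {
    private final AtomicBoolean released = new AtomicBoolean(false);

    // Simulates cleanup work that takes a little while.
    private void releaseResources() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        released.set(true);
    }

    // Non-blocking close: starts the cleanup and returns a future
    // the caller may wait on if desired.
    CompletableFuture<Void> closeAsync() {
        return CompletableFuture.runAsync(this::releaseResources);
    }

    // Blocking close: returns true if the cleanup finished within the timeout.
    boolean closeAndAwait(long timeoutMillis) {
        try {
            closeAsync().get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException | TimeoutException e) {
            return false;
        }
    }

    boolean isReleased() {
        return released.get();
    }

    public static void main(String[] args) {
        SketchClient client = new SketchClient();
        System.out.println("closed in time: " + client.closeAndAwait(1000));
        System.out.println("released: " + client.isReleased());
    }
}
```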

}

public boolean isClosed() {
return closed.get();
public boolean isCloseTriggered() {
Contributor

What's wrong with the old name?

try {
ThreadDeathWatcher.awaitInactivity(config.getShutdownTimeout(), TimeUnit.MILLISECONDS);
} catch(InterruptedException t) {
// Ignore
Contributor

Re-assert interrupted status
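A minimal sketch of what re-asserting the interrupted status means, using a plain CountDownLatch rather than the Netty calls from the diff (the class and method names are made up for illustration). Blocking methods clear the interrupt flag when they throw InterruptedException, so a catch block that only ignores the exception hides the interrupt from the caller.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of restoring the interrupt flag.
class RestoreInterrupt {

    // Waits on the latch; if interrupted, restores the flag instead of
    // silently swallowing the interrupt.
    static void awaitQuietly(CountDownLatch latch, long timeoutMillis) {
        try {
            latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // await() cleared the flag when it threw; set it again.
            Thread.currentThread().interrupt();
        }
    }

    // Calls awaitQuietly on an already-interrupted thread and reports
    // whether the interrupt flag is still set afterwards.
    static boolean flagSurvives() {
        Thread.currentThread().interrupt();        // simulate a pending interrupt
        awaitQuietly(new CountDownLatch(1), 1000); // throws at once, flag is restored
        return Thread.interrupted();               // read and clear the flag for the demo
    }

    public static void main(String[] args) {
        System.out.println(flagSurvives()); // true
    }
}
```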

try {
GlobalEventExecutor.INSTANCE.awaitInactivity(config.getShutdownTimeout(), TimeUnit.MILLISECONDS);
} catch(InterruptedException t) {
// Ignore
Contributor

Re-assert interrupted status

try {
closeLatch.await(config.getShutdownTimeout(), TimeUnit.MILLISECONDS);
} catch (InterruptedException e) {
// Ignore
Contributor

Re-assert interrupted status

@zhan-ge

zhan-ge commented Sep 4, 2017

Hi guys, is there a plan to release these commits?
In my scenario I have to close a specific client at runtime, so I urgently need this. Thanks a lot!

@He-Pin

He-Pin commented Sep 4, 2017

@zhan-ge Changes were requested, so there won't be a release any time soon. Also, AHC has its own release cycle.

@slandelle
Contributor

slandelle commented Sep 4, 2017

Frankly, I'm not fond of this implementation.
Netty is just everywhere nowadays.
People using Netty for multiple components are probably at least 50% of AHC users, e.g.:

  • server side HTTP layer
  • whatever NoSQL database client (Cassandra, MongoDB...)
  • new AWS async client
  • multiple AHC instances

So having a mechanism based on this GlobalEventExecutor.INSTANCE singleton seems broken to me.

Why not keep track of in-flight requests? For example, an AsyncHttpClient decorator could increment an AtomicInteger when executing a request, register a listener to decrement it on completion, and switch to a closing state (reject new requests, wait for the in-flight requests to finish, and then close).
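The decorator idea can be sketched with JDK types only. Everything here is an assumption for illustration: InFlightTracker is a made-up name, requests are modelled as suppliers of CompletableFuture rather than real AHC requests, and close only returns a future that completes once the counter drains.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical decorator; requests are modelled as future suppliers.
class InFlightTracker {
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicBoolean closing = new AtomicBoolean(false);
    private final CompletableFuture<Void> drained = new CompletableFuture<>();

    // Counts the request, rejects it if closing has started, and
    // decrements the counter when the request completes.
    <T> CompletableFuture<T> execute(Supplier<CompletableFuture<T>> request) {
        inFlight.incrementAndGet();
        if (closing.get()) {
            release();
            CompletableFuture<T> rejected = new CompletableFuture<>();
            rejected.completeExceptionally(new IllegalStateException("client is closing"));
            return rejected;
        }
        try {
            return request.get().whenComplete((result, error) -> release());
        } catch (RuntimeException e) {
            release(); // the request never started, so undo the count
            throw e;
        }
    }

    private void release() {
        if (inFlight.decrementAndGet() == 0 && closing.get()) {
            drained.complete(null);
        }
    }

    // Switches to the closing state; the returned future completes
    // once no requests are in flight.
    CompletableFuture<Void> close() {
        closing.set(true);
        if (inFlight.get() == 0) {
            drained.complete(null);
        }
        return drained;
    }

    public static void main(String[] args) {
        InFlightTracker tracker = new InFlightTracker();
        CompletableFuture<String> pending = new CompletableFuture<>();
        tracker.execute(() -> pending);
        CompletableFuture<Void> closed = tracker.close();
        System.out.println(closed.isDone()); // false: one request still in flight
        pending.complete("response");
        System.out.println(closed.isDone()); // true: counter drained
    }
}
```

The counter is incremented before the closing flag is checked so that a request racing with close is either counted before close inspects the counter or rejected afterwards.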

@MiErnst
Author

MiErnst commented Sep 26, 2017

Hi all,

Sorry that I wasn’t able to work on this pull request sooner.

@twz123
I like your comment and would like to respond to the options you listed:

  1. “Live with it”: yes, this is possible, but I think your concerns are right.
  2. I really like this idea. I added a commit to the pull request with a second method. I have rewritten the fix to use CompletableFutures, and there is only one “get” call, which waits until it times out or the close completes. I have tested it in our application, and it always completes within one second when calling the new method. I don’t return the future because the only thing the caller could do is call the get method itself, with the client config and its timeout settings in mind. From an API designer perspective, I think the return type of the new method should be void.
  3. Finding a way that only waits for resources to be closed that actually belong to the AHC instance is now the default behavior of calling the existing method close. The new method closeAndAwaitInactivity additionally waits for the shared resources.
  4. Yes, this is possible but I think that other AHC users could benefit from a fix and finding these hidden API methods was a pain. :-)
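The "only one get call" approach described in point 2 might be sketched as follows. This is not the PR's code: SingleTimedGet and awaitAll are hypothetical names, and the close steps are stand-in futures. In this sketch, combining the steps with CompletableFuture.allOf means the total wait is bounded by one timeout instead of one timeout per step.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration of a single bounded wait over several close steps.
class SingleTimedGet {

    // Blocks once, for at most timeoutMillis, until every close step is done.
    // Returns normally whether the steps finished, failed, or timed out.
    static void awaitAll(long timeoutMillis, CompletableFuture<?>... closeSteps) {
        try {
            CompletableFuture.allOf(closeSteps).get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ExecutionException | TimeoutException e) {
            // Best effort: give up waiting and let close return.
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> neverDone = new CompletableFuture<>(); // never completes
        CompletableFuture<Void> alreadyDone = CompletableFuture.completedFuture(null);

        long start = System.nanoTime();
        awaitAll(200, neverDone, alreadyDone);
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("returned after about " + elapsedMillis + " ms");
    }
}
```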

As you can see in the implementation, I changed the return type of the ChannelManager’s close method. Although this method is public, the ChannelManager instance seems to be managed only by the DefaultAsyncHttpClient and is not returned by any public method of this client, so I think this internal API change should not be a problem.

The current close method also waits for the ChannelManager to complete successfully. If the eventLoopGroup is shut down gracefully, the close will wait until that completes, but I think this is OK because it’s a resource created by the AHC. To avoid TimeoutExceptions in the Travis build, the system property org.asynchttpclient.shutdownTimeout in the surefire plugin configuration has to be increased to 1500, because the graceful shutdown needs at least 1.2 seconds to complete. I don’t know whether this affects the Travis build time too much, so it is currently not part of the pull request.

I don’t think the currently failed test is related to this implementation. Can’t see what the cause is.
Don’t hesitate to provide feedback for the new implementation.

@slandelle
I can’t see how keeping track of in-flight requests would help to wait on these shared Netty resources. A decorator would only help if it were implemented around the Netty stack and if AHC, as well as all other applications that use Netty, used this decorator. I don’t think this is a realistic fix. With the new implementation, all Netty resources that are not shared between AHC clients are closed by calling close or the new method. The new method additionally waits for the shared Netty resources. With the new method, users can choose which close method they want to use.

@transamericamoon

Can you guys just change it to this and be done with it? I think that by the time they are closing the pool, waiting sync vs. async does not matter that much:

  private void doClose() {
    openChannels.close().awaitUninterruptibly();
    channelPool.destroy();
  }

@slandelle slandelle force-pushed the master branch 4 times, most recently from 6ea11f4 to f8fab66 Compare February 7, 2020 12:25
@TomGranot TomGranot added this to To do in Triage Mar 17, 2021
@TomGranot TomGranot moved this from Triage to In progress in Triage Mar 17, 2021