Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log stacktraces of threads that failed to terminate on shutdown within timeout in ExecutorProvider #9840

Merged
merged 5 commits into from
Mar 9, 2021

Conversation

jerrypeng
Copy link
Contributor

@jerrypeng jerrypeng commented Mar 9, 2021

Motivation

Log stacktraces of threads that failed to terminate on shutdown within timeout in ExecutorProvider so that we have more insights on where threads are getting stuck especially for message listener threads that run user code.

Example of output when an executor failed to shutdown within timeout

15:54:42.861 [main] ERROR org.apache.pulsar.client.util.ExecutorProvider - Failed to terminate executor with pool name pulsar-external-listener within timeout. The following are stack traces of still running threads.

"pulsar-external-listener-39-1"  prio=5 tid=151 TIMED_WAITING
java.lang.Thread.State: TIMED_WAITING
        at java.lang.Thread.sleep(Native Method)
        at org.apache.pulsar.client.api.SimpleProducerConsumerTest$3.received(SimpleProducerConsumerTest.java:3326)
        at org.apache.pulsar.client.impl.ConsumerBase.callMessageListener(ConsumerBase.java:880)
        at org.apache.pulsar.client.impl.ConsumerBase.lambda$triggerListener$6(ConsumerBase.java:860)
        at org.apache.pulsar.client.impl.ConsumerBase$$Lambda$759/2081957035.run(Unknown Source)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)

@jerrypeng jerrypeng added this to the 2.8.0 milestone Mar 9, 2021
@jerrypeng jerrypeng requested review from merlimat and sijie March 9, 2021 00:00
@jerrypeng jerrypeng self-assigned this Mar 9, 2021
@@ -71,7 +75,9 @@ public void shutdownNow() {
executors.forEach(executor -> {
executor.shutdownNow();
try {
executor.awaitTermination(10, TimeUnit.SECONDS);
if(!executor.awaitTermination(10, TimeUnit.SECONDS)) {
log.error("Failed to terminate executor with pool name {} within timeout. The following are stack traces of still running threads.\n{}", poolName, getThreadDump());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use log.warn here?

@@ -791,7 +791,7 @@ private static EventLoopGroup getEventLoopGroup(ClientConfigurationData conf) {
return EventLoopUtil.newEventLoopGroup(conf.getNumIoThreads(), threadFactory);
}

private static ThreadFactory getThreadFactory(String poolName) {
public static ThreadFactory getThreadFactory(String poolName) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it makes more sense to just move this static method over to the other class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed not needed anymore

import static com.google.common.base.Preconditions.checkNotNull;
import com.google.common.collect.Lists;
import io.netty.util.concurrent.DefaultThreadFactory;
import javafx.util.Pair;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not use javafx import, rather commons.lang3.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@jerrypeng jerrypeng merged commit d1a6393 into apache:master Mar 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants