Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add functionality for an automatic thread dump upon thread pool exhaustion #137

Open
wants to merge 2 commits into
base: 2.4
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
9 changes: 9 additions & 0 deletions src/main/java/org/jboss/threads/EnhancedQueueExecutor.java
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;
import java.util.concurrent.atomic.AtomicLongFieldUpdater;
Expand Down Expand Up @@ -278,6 +279,11 @@ public final class EnhancedQueueExecutor extends AbstractExecutorService impleme
*/
volatile Runnable terminationTask;

/**
* If the thread pool has reached its max and generated a thread dump.
*/
private AtomicBoolean reachedMax = new AtomicBoolean();

// =======================================================
// Statistics fields and counters
// =======================================================
Expand Down Expand Up @@ -1514,6 +1520,9 @@ int tryAllocateThread(final float growthResistance) {
oldSize = currentSizeOf(oldStat);
if (oldSize >= maxSizeOf(oldStat)) {
// max threads already reached
if (reachedMax.compareAndSet(false, true)) {
ThreadDumpUtil.threadDump(maxSizeOf(oldStat));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be unconditional. Thread pools reaching capacity is not an error condition, it just means that the behavior changes from submitting tasks to new or waiting threads, to instead adding the task to a queue that the worker threads read from.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should be the condition? This has a boolean to only log on the first occurrence of exhaustion. But if we'd accept such a thread dump utility, I think there should be some further configuration required to enable it for the pool if desired. So this would then prompt another change primarily in XNIO to enable this thread dumping for its task pool where such dumping would be desired. Such a dump would likely not be desired from the MSC thread pool upon its exhaustion for instance.

While the behavior of queuing upon is exhaustion is the expected behavior, the reality for customers using JBoss and support engineers assisting them is that this is universally a critical issue state for the XNIO task thread pool used by undertow as that begins to immediately degrade response times if not lead to full request unresponsiveness as well if already processing requests are not quickly completing. We often see multiple cases a day requiring diagnosis of such thread exhaustion induced by many varied issues and most often not from a conventional deadlock. Further details about this is recorded at https://issues.redhat.com/browse/EAPSUP-1325.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition as you suggest should be a configuration switch, and I'd also propose that there should be either two switches or a single three-state switch to separately configure whether a warning is logged when the thread pool queues, and separately whether the warning includes the thread dump.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved this to make it configurable. Let me know any suggestions.

}
return AT_NO;
}
if (oldSize >= coreSizeOf(oldStat) && oldSize > 0) {
Expand Down
19 changes: 19 additions & 0 deletions src/main/java/org/jboss/threads/Messages.java
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,25 @@ interface Messages extends BasicLogger {
@LogMessage(level = Logger.Level.ERROR)
void taskSubmitFailed(@Cause RejectedExecutionException e, Runnable task);

@Message(id = 15, value = "Thread pool has reached %d max threads in use. Performance may be impacted. Thread dump:\n"+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These messages probably shouldn't have IDs like this; a thread dump should probably just go directly to the output, or be formatted otherwise like a normal JDK thread dump.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, IDs have previously been assigned to auto generated thread dump to logging functions like in eap/wildfly-core itself:

https://github.com/aogburn/wildfly-core/blob/b3cfcb741c8ced604d8895de48ec920fc130057d/controller/src/main/java/org/jboss/as/controller/logging/ControllerLogger.java#L3710

Or in activemq-artemis:

https://github.com/apache/activemq-artemis/blob/main/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/ActiveMQServerLogger.java#L901

Message ids can be quite helpful to support and I and others use them in simple tooling (https://github.com/aogburn/yala) to quickly search and handle critical known message ids or sum up key information in logging. So for a usability perspective, I'd request that it have an ID.

I think to be truly formatted exactly like a normal JDK thread dump it would probably require being generated through normal JDK utilities, like forking an external process that's calling the JDK jstack command if available (then a dump can be missed if someone has not installed the jdk-devel package or is running a JRE).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, then let's make it one big message with one ID instead of multiple. Loggers can accept a StringBuilder as a string argument, if that is helpful.

Copy link
Contributor Author

@aogburn aogburn Oct 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this to just one message id for the thread dump messages. I did leave one additional id to handle any exception from a dump attempt. Is that suitable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmlloyd Just wanted to check if you'd have any further feedback here? Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been ruminating on this. For too long I guess. :) I have a couple more comments.

"*******************************************************************************\n" +
"%s\n===============================================================================\n" +
"End Thread dump\n*******************************************************************************\n")
@LogMessage(level = Logger.Level.WARN)
void exhaustedPoolDump(int maxPoolSize, String dump);

@LogMessage(level = Logger.Level.WARN)
@Message(id = 16, value = "Deadlock detected!\n"+
"*******************************************************************************\n" +
"{0}\n===============================================================================\n" +
"End Deadlock\n*******************************************************************************\n",
format = Message.Format.MESSAGE_FORMAT)
void deadLock(String dump);

@Message(id = 17, value = "Exception thrown during generation of thread dump")
@LogMessage(level = Logger.Level.WARN)
void threadDumpException(@Cause Exception cause);

// validation

@Message(id = 100, value = "Keep-alive may only be set to 0 for this executor type")
Expand Down
127 changes: 127 additions & 0 deletions src/main/java/org/jboss/threads/ThreadDumpUtil.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,127 @@
/*
* JBoss, Home of Professional Open Source.
*
* Copyright 2022 Red Hat, Inc., and individual contributors
* as indicated by the @author tags.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.jboss.threads;

import java.io.PrintWriter;
import java.io.StringWriter;
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class ThreadDumpUtil {

public static void threadDump(int maxPoolSize) {

try (
StringWriter dumpstr = new StringWriter();
PrintWriter dumpout = new PrintWriter(dumpstr);
StringWriter deadlockstr = new StringWriter();
PrintWriter deadlockout = new PrintWriter(deadlockstr);
) {

ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();

for (ThreadInfo threadInfo : threadMXBean.dumpAllThreads(true, true)) {
dumpout.println(threadInfoToString(threadInfo));
}

Messages.msg.exhaustedPoolDump(maxPoolSize, dumpstr.toString());

long[] deadlockedThreads = threadMXBean.findDeadlockedThreads();

if (deadlockedThreads != null && deadlockedThreads.length > 0) {
for (ThreadInfo threadInfo : threadMXBean.getThreadInfo(deadlockedThreads, true, true)) {
deadlockout.println(threadInfoToString(threadInfo));
}

Messages.msg.deadLock(deadlockstr.toString());
}

} catch (Exception e) {
Messages.msg.threadDumpException(e);
}
}

private static String threadInfoToString(ThreadInfo threadInfo) {
StringBuilder sb = new StringBuilder("\"" + threadInfo.getThreadName() + "\"" +
" Id=" + threadInfo.getThreadId() + " " +
threadInfo.getThreadState());
if (threadInfo.getLockName() != null) {
sb.append(" on " + threadInfo.getLockName());
}
if (threadInfo.getLockOwnerName() != null) {
sb.append(" owned by \"" + threadInfo.getLockOwnerName() +
"\" Id=" + threadInfo.getLockOwnerId());
}
if (threadInfo.isSuspended()) {
sb.append(" (suspended)");
}
if (threadInfo.isInNative()) {
sb.append(" (in native)");
}
sb.append('\n');
int i = 0;
for (; i < threadInfo.getStackTrace().length; i++) {
StackTraceElement ste = threadInfo.getStackTrace()[i];
sb.append("\tat " + ste.toString());
sb.append('\n');
if (i == 0 && threadInfo.getLockInfo() != null) {
Thread.State ts = threadInfo.getThreadState();
switch (ts) {
case BLOCKED:
sb.append("\t- blocked on " + threadInfo.getLockInfo());
sb.append('\n');
break;
case WAITING:
sb.append("\t- waiting on " + threadInfo.getLockInfo());
sb.append('\n');
break;
case TIMED_WAITING:
sb.append("\t- waiting on " + threadInfo.getLockInfo());
sb.append('\n');
break;
default:
}
}

for (MonitorInfo mi : threadInfo.getLockedMonitors()) {
if (mi.getLockedStackDepth() == i) {
sb.append("\t- locked " + mi);
sb.append('\n');
}
}
}

LockInfo[] locks = threadInfo.getLockedSynchronizers();
if (locks.length > 0) {
sb.append("\n\tNumber of locked synchronizers = " + locks.length);
sb.append('\n');
for (LockInfo li : locks) {
sb.append("\t- " + li);
sb.append('\n');
}
}
sb.append('\n');
return sb.toString();
}

}