thread leakage checker and memory usage reporter #1226 #1452
Merged
jiajunwang
merged 6 commits into
apache:master
from
kaisun2000:threadleak_mem_check_#1226
Oct 9, 2020
Commits
83b031e
fix #1226
4e8b571
address review comments from Meng and Lei
976ffbb
address hz's comments
e67334e
address JJ's comments
71f8f06
address one more logging.
5d0d97e
added todo to remove system.out and once we achieve 0 thread
220 changes: 220 additions & 0 deletions
helix-core/src/test/java/org/apache/helix/ThreadLeakageChecker.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,220 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package org.apache.helix; | ||
|
|
||
| import java.util.ArrayList; | ||
| import java.util.Arrays; | ||
| import java.util.HashMap; | ||
| import java.util.HashSet; | ||
| import java.util.List; | ||
| import java.util.Map; | ||
| import java.util.Set; | ||
| import java.util.function.Predicate; | ||
| import java.util.stream.Collectors; | ||
|
|
||
| import org.apache.helix.common.ZkTestBase; | ||
|
|
||
|
|
||
| public class ThreadLeakageChecker { | ||
|
||
| private static ThreadGroup getRootThreadGroup() { | ||
| ThreadGroup candidate = Thread.currentThread().getThreadGroup(); | ||
| while (candidate.getParent() != null) { | ||
| candidate = candidate.getParent(); | ||
| } | ||
| return candidate; | ||
| } | ||
|
|
||
| private static List<Thread> getAllThreads() { | ||
| ThreadGroup rootThreadGroup = getRootThreadGroup(); | ||
| Thread[] threads = new Thread[32]; | ||
|
||
| int count = rootThreadGroup.enumerate(threads); | ||
| while (count == threads.length) { | ||
| threads = new Thread[threads.length * 2]; | ||
| count = rootThreadGroup.enumerate(threads); | ||
| } | ||
| return Arrays.asList(Arrays.copyOf(threads, count)); | ||
| } | ||
|
|
||
| private static final String[] ZKSERVER_THRD_PATTERN = | ||
| {"SessionTracker", "NIOServerCxn", "SyncThread:", "ProcessThread"}; | ||
| private static final String[] ZKSESSION_THRD_PATTERN = | ||
| new String[]{"ZkClient-EventThread", "ZkClient-AsyncCallback", "-EventThread", "-SendThread"}; | ||
| private static final String[] FORKJOIN_THRD_PATTERN = new String[]{"ForkJoinPool"}; | ||
| private static final String[] TIMER_THRD_PATTERN = new String[]{"time"}; | ||
| private static final String[] TASKSTATEMODEL_THRD_PATTERN = new String[]{"TaskStateModel"}; | ||
|
||
|
|
||
| /* | ||
| * The two thresholds, warning and limit, are mostly empirical. | ||
| * | ||
| * ZkServer: the current version has only 4 threads. In case a later version uses more, we set the limit to 100. | ||
| * The reasoning is that ZkServer threads are not deemed leaks no matter how many there are. | ||
| * | ||
| * ZkSession covers the ZkClient and native ZooKeeper client threads. ZkTestBase has 12 at startup. | ||
| * Thus, if there are more than that, the test code is leaking ZkClient instances. | ||
| * | ||
| * ForkJoin threads are created by parallel streams or similar Java features. This is out of our control. | ||
| * As with ZkServer, the limit is 100, while _warningLimit is kept small. | ||
| * | ||
| * Timer threads should not exist. The limit is 2 rather than 0 mostly because, even after a timer is | ||
| * cancelled, its thread may take a nondeterministic amount of time to go away, so we allow some slack. | ||
| * | ||
| * Also note that this thread leakage checker depends on tests running sequentially. | ||
| * Otherwise, the report will not be accurate. | ||
| */ | ||
| private static enum ThreadCategory { | ||
|
||
| ZkServer("zookeeper server threads", 4, 100, ZKSERVER_THRD_PATTERN), | ||
| ZkSession("zkclient/zooKeeper session threads", 12, 12, ZKSESSION_THRD_PATTERN), | ||
| ForkJoin("fork join pool threads", 2, 100, FORKJOIN_THRD_PATTERN), | ||
| Timer("timer threads", 0, 2, TIMER_THRD_PATTERN), | ||
| TaskStateModel("TaskStateModel threads", 0, 0, TASKSTATEMODEL_THRD_PATTERN), | ||
| Other("Other threads", 0, 2, new String[]{""}); | ||
|
|
||
| private String _description; | ||
| private List<String> _pattern; | ||
| private int _warningLimit; | ||
| private int _limit; | ||
|
|
||
| public String getDescription() { | ||
| return _description; | ||
| } | ||
|
|
||
| public Predicate<String> getMatchPred() { | ||
| if (this != ThreadCategory.Other) { | ||
| Predicate<String> pred = target -> { | ||
| for (String p : _pattern) { | ||
| if (target.toLowerCase().contains(p.toLowerCase())) { | ||
| return true; | ||
| } | ||
| } | ||
| return false; | ||
| }; | ||
| return pred; | ||
| } | ||
|
|
||
| List<Predicate<String>> predicateList = new ArrayList<>(); | ||
| for (ThreadCategory threadCategory : ThreadCategory.values()) { | ||
| if (threadCategory == ThreadCategory.Other) { | ||
| continue; | ||
| } | ||
| predicateList.add(threadCategory.getMatchPred()); | ||
| } | ||
| Predicate<String> pred = target -> { | ||
| for (Predicate<String> p : predicateList) { | ||
| if (p.test(target)) { | ||
| return false; | ||
| } | ||
| } | ||
| return true; | ||
| }; | ||
|
|
||
| return pred; | ||
| } | ||
|
|
||
| public int getWarningLimit() { | ||
| return _warningLimit; | ||
| } | ||
|
|
||
| public int getLimit() { | ||
| return _limit; | ||
| } | ||
|
|
||
| private ThreadCategory(String description, int warningLimit, int limit, String[] patterns) { | ||
| _description = description; | ||
| _pattern = Arrays.asList(patterns); | ||
| _warningLimit = warningLimit; | ||
| _limit = limit; | ||
| } | ||
| } | ||
|
|
||
| public static boolean afterClassCheck(String classname) { | ||
| ZkTestBase.reportPhysicalMemory(); | ||
| // step 1: get all active threads | ||
| List<Thread> threads = getAllThreads(); | ||
| System.out.println(classname + " has active threads cnt:" + threads.size()); | ||
|
|
||
| // step 2: categorize threads | ||
| Map<String, List<Thread>> threadByName = new HashMap<>(); // stays empty if filtering fails, avoiding an NPE below | ||
| Map<ThreadCategory, Integer> threadByCnt = new HashMap<>(); | ||
| Map<ThreadCategory, Set<Thread>> threadByCat = new HashMap<>(); | ||
| try { | ||
| threadByName = threads. | ||
| stream(). | ||
| filter(p -> p.getThreadGroup() != null && p.getThreadGroup().getName() != null | ||
| && ! "system".equals(p.getThreadGroup().getName())). | ||
| collect(Collectors.groupingBy(p -> p.getName())); | ||
| } catch (Exception e) { | ||
| System.out.println("filtering thread failure with exception: " + e + " " + Arrays.toString(e.getStackTrace())); | ||
| } | ||
|
|
||
| threadByName.entrySet().stream().forEach(entry -> { | ||
| String key = entry.getKey(); // thread name | ||
| Arrays.asList(ThreadCategory.values()).stream().forEach(category -> { | ||
| if (category.getMatchPred().test(key)) { | ||
| Integer count = threadByCnt.containsKey(category) ? threadByCnt.get(category) : 0; | ||
| threadByCnt.put(category, count + entry.getValue().size()); | ||
| Set<Thread> thisSet = threadByCat.getOrDefault(category, new HashSet<>()); | ||
| thisSet.addAll(entry.getValue()); | ||
| threadByCat.put(category, thisSet); | ||
| } | ||
| }); | ||
| }); | ||
|
|
||
| // todo: We should change the following System.out calls to LOG.info once we achieve 0 thread leakage. | ||
| // todo: also the calling point of this method would fail the test | ||
| // step 3: enforce checking policy | ||
| boolean checkStatus = true; | ||
| for (ThreadCategory threadCategory : ThreadCategory.values()) { | ||
| int limit = threadCategory.getLimit(); | ||
| int warningLimit = threadCategory.getWarningLimit(); | ||
|
|
||
| Integer categoryThreadCnt = threadByCnt.get(threadCategory); | ||
| if (categoryThreadCnt != null) { | ||
| boolean dumpThread = false; | ||
| if (categoryThreadCnt > limit) { | ||
| checkStatus = false; | ||
| System.out.println( | ||
| "Failure " + threadCategory.getDescription() + " has " + categoryThreadCnt + " thread"); | ||
| dumpThread = true; | ||
| } else if (categoryThreadCnt > warningLimit) { | ||
| System.out.println( | ||
| "Warning " + threadCategory.getDescription() + " has " + categoryThreadCnt + " thread"); | ||
| dumpThread = true; | ||
| } else { | ||
| System.out.println(threadCategory.getDescription() + " has " + categoryThreadCnt + " thread"); | ||
| } | ||
| if (!dumpThread) { | ||
| continue; | ||
| } | ||
| // print first 100 thread names | ||
| int i = 0; | ||
| for (Thread t : threadByCat.get(threadCategory)) { | ||
| System.out.println(i + " thread:" + t.getName()); | ||
| i++; | ||
| if (i == 100) { | ||
|
||
| System.out.println(" skipping the rest"); | ||
| break; | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| return checkStatus; | ||
| } | ||
| } | ||
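
The two core techniques in the file above, enumerating every live thread from the root thread group and counting threads by a case-insensitive name pattern, can be tried in isolation. The following is a minimal standalone sketch, not the code merged in this PR: the class name `ThreadCountSketch`, the methods `allThreads` and `countMatching`, and the thread name `demo-leaked-EventThread` are all hypothetical, and it uses only the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; it only mirrors the approach of ThreadLeakageChecker.
public class ThreadCountSketch {

  // Walk up to the root thread group, then enumerate all live threads below it.
  public static List<Thread> allThreads() {
    ThreadGroup root = Thread.currentThread().getThreadGroup();
    while (root.getParent() != null) {
      root = root.getParent();
    }
    Thread[] buf = new Thread[32];
    int count = root.enumerate(buf);
    // enumerate() silently drops threads that do not fit in the array,
    // so a completely full buffer is treated as "retry with a bigger one".
    while (count == buf.length) {
      buf = new Thread[buf.length * 2];
      count = root.enumerate(buf);
    }
    return new ArrayList<>(Arrays.asList(buf).subList(0, count));
  }

  // Count threads whose name contains the pattern, ignoring case.
  public static long countMatching(List<Thread> threads, String pattern) {
    String needle = pattern.toLowerCase(Locale.ROOT);
    return threads.stream()
        .filter(t -> t.getName().toLowerCase(Locale.ROOT).contains(needle))
        .count();
  }

  public static void main(String[] args) throws InterruptedException {
    // Simulate a leaked thread: it blocks until the latch is released.
    CountDownLatch release = new CountDownLatch(1);
    Thread leaked = new Thread(() -> {
      try {
        release.await();
      } catch (InterruptedException ignored) {
        // exit quietly; this is only a demo thread
      }
    }, "demo-leaked-EventThread");
    leaked.setDaemon(true);
    leaked.start();

    System.out.println("matching threads: " + countMatching(allThreads(), "-eventthread"));

    release.countDown();
    leaked.join();
  }
}
```

As in the checker itself, the count is only a snapshot: threads can start or exit between `enumerate` and the report, which is one reason the limits above leave some slack and the checker assumes tests run sequentially.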