Describe the bug
When running GCToolkit against a large G1GC log file, the regex used to parse out the decorator tags appears to cause a significant performance hit.
Running the sample against a 58 MB log file, the Maven build reports:
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:06 min
After some profiling, the poorly performing lines turn out to be the following, in com.microsoft.gctoolkit.parser.jvm.Decorators:
    Matcher tagMatcher = UnifiedLoggingTokens.TAGS.matcher(line);
    if (tagMatcher.find()) {
        numberOfDecorators++;
        tags = String.join(",", Arrays.asList(tagMatcher.group(1).trim().split(",")));
    }
If these lines are commented out and the sample is re-run against the same file, the total time drops from 01:06 min to under 8 s, roughly 8x faster:
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 7.869 s
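One possible direction, offered only as a sketch and not as a confirmed fix: since the cost seems to come from running a regex `find()` over every line, the tags could be picked out with a plain `indexOf` scan instead. The helper below is hypothetical (it is not part of GCToolkit), and it assumes that decorators appear as consecutive `[...]` groups at the start of the line with the tags decorator last. If a line carries no tags decorator, this sketch would return whatever the last leading group is (for example the level), so a real change would need an extra check.

```java
import java.util.Optional;

public class TagScanSketch {

    // Hypothetical helper, not GCToolkit API. Walks the consecutive "[...]"
    // groups at the start of a unified logging line and returns the trimmed
    // contents of the last one, which is ASSUMED here to be the tags decorator.
    static Optional<String> lastLeadingBracketGroup(String line) {
        int pos = 0;
        String last = null;
        while (pos < line.length() && line.charAt(pos) == '[') {
            int close = line.indexOf(']', pos + 1);
            if (close < 0) {
                break; // unterminated group: stop scanning, keep what we have
            }
            last = line.substring(pos + 1, close).trim();
            pos = close + 1; // the next decorator, if any, starts right after ']'
        }
        return Optional.ofNullable(last);
    }

    public static void main(String[] args) {
        // Made-up sample line in the unified logging layout, for illustration only.
        String line = "[2021-07-01T12:00:00.000+0000][0.123s][info][gc,start    ] GC(0) Pause Young";
        System.out.println(lastLeadingBracketGroup(line).orElse("<none>"));
    }
}
```

The scan touches only the decorator prefix of each line and never backtracks, so its cost is linear in the length of that prefix. Whether it matches everything the existing TAGS pattern accepts would still need to be checked against the parser's test logs.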
To Reproduce
Steps to reproduce the behavior:
- Run the sample application against the attached file:
largegc.zip