Remove garbage creation introduced by LOG4J2-2301 #251

Merged
1 commit merged into apache:master on Jan 14, 2019

Conversation

mprusakov-rbc
Contributor

After upgrading to 2.11.1, we have started seeing garbage being generated here:

Stack Trace	TLABs	Total TLAB Size(bytes)	Pressure(%)
java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal, Object) line: 481	10	3,638,864	56.192
   java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal$ThreadLocalMap, ThreadLocal, Object) line: 298	10	3,638,864	56.192
      java.lang.ThreadLocal.setInitialValue() line: 184	10	3,638,864	56.192
         java.lang.ThreadLocal.get() line: 170	10	3,638,864	56.192
            org.apache.logging.log4j.core.async.AsyncLoggerConfig.log(LogEvent, LoggerConfig$LoggerConfigPredicate) line: 91	10	3,638,864	56.192

The purpose of this patch is to eliminate that allocation.

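For context, the stack trace above bottoms out in ThreadLocal.get() calling setInitialValue(), which allocates a new ThreadLocalMap.Entry whenever the current thread has no live entry for that ThreadLocal. The sketch below is a generic illustration of that allocation path and of how keeping the entry alive avoids per-call garbage; the names are hypothetical, and it is not the actual AsyncLoggerConfig code or the exact change in this patch.

```java
// Illustrative sketch only: class and field names are hypothetical,
// not the actual Log4j AsyncLoggerConfig code.
final class ReentrancyGuard {

    // Per-thread flag used to detect re-entrant logging on the current thread.
    private static final ThreadLocal<Boolean> ENTERED =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    void runGuarded(Runnable logAction) {
        if (!ENTERED.get()) {              // first get() per thread creates one map entry
            ENTERED.set(Boolean.TRUE);
            try {
                logAction.run();
            } finally {
                // Calling ENTERED.remove() here would delete the thread's map entry,
                // so the next get() would go through setInitialValue() and allocate
                // a new ThreadLocalMap.Entry on every call (the garbage shown above).
                ENTERED.set(Boolean.FALSE); // keeps the existing entry alive instead
            }
        }
    }
}
```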
@carterkozak
Contributor

Thank you for reporting! I'm surprised that this wasn't caught by GcFreeMixedSyncAyncLoggingTest; we should fix whatever gap allowed this regression to slip through.

@remkop
Contributor

remkop commented Jan 11, 2019

Yes, that is worrisome. It would be great if we could make the test fail first, to confirm that the patch fixes the issue.

@garydgregory
Member

garydgregory commented Jan 12, 2019 via email

@carterkozak
Contributor

carterkozak commented Jan 12, 2019

This issue is a bit difficult to test for because it requires several threads to log. Allocation only occurs when a ThreadLocal set operation has to resize the internal ThreadLocalMap hash table.

@garydgregory
Member

Can't this be tested with a single thread? Maybe we need to add package-private methods to ease testing?

@mprusakov-rbc
Contributor Author

First of all, thank you for spending time looking at my patch.

I can confirm that I have tested it by patching this one file, and the allocations do disappear. Taking into account that the build has passed, I have every confidence that the patch works.

Frankly, I wasn't aware of the test mentioned above. With the default settings, the disruptor generates garbage even in 2.11.0, so I could look into creating a generic test, but then I would have to change the defaults, which I am sure people would generally be unhappy about; let's be honest, very few Java developers care about garbage...

Having said that, would it be possible to submit this patch and tackle the test separately? Many thanks in advance.

@carterkozak
Contributor

Can't this be tested with a single thread?

I don't believe we can; the issue is that changes in the ThreadLocalMap size (which can only change by one on the current thread, which doesn't trigger the issue) result in the allocation of new Entry objects if rehashing/expungeStaleEntry has nulled out the current value after a hash collision.
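To make that concrete, here is a rough sketch of what a multi-threaded, allocation-measuring probe could look like. It is hypothetical, not the project's existing GcFree test harness, and as explained above a probe like this still needs genuinely concurrent logging (and some luck with ThreadLocalMap collisions) to reproduce this particular regression. It relies on the HotSpot-specific com.sun.management.ThreadMXBean to count bytes allocated per thread.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a multi-threaded allocation probe (HotSpot-specific).
public final class MultiThreadAllocationProbe {

    public static void main(String[] args) throws InterruptedException {
        com.sun.management.ThreadMXBean threadMx =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        Runnable hotAction = () -> {
            // In a real test this would be the logging call under test.
        };

        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            threads.add(new Thread(() -> {
                long self = Thread.currentThread().getId();
                for (int i = 0; i < 10_000; i++) {
                    hotAction.run();                 // warm-up: JIT, thread-locals, buffers
                }
                long before = threadMx.getThreadAllocatedBytes(self);
                for (int i = 0; i < 100_000; i++) {
                    hotAction.run();                 // measured section
                }
                long allocated = threadMx.getThreadAllocatedBytes(self) - before;
                System.out.println(Thread.currentThread().getName()
                        + " allocated " + allocated + " bytes");
            }, "probe-" + t));
        }
        for (Thread thread : threads) {
            thread.start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
    }
}
```

A real test would plug the logging call under test into hotAction and assert that the measured delta stays at zero after warm-up.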

would it be possible to submit this patch and tackle the test separately?

I would be okay with merging the fix before we put together a test (pending other devs' agreement, of course) because:

  1. The test will be significantly more complex than the issue
  2. We understand why this is happening
  3. I have audited the codebase for this pattern, and don't see other occurrences in gc-free code.

@jvz
Member

jvz commented Jan 14, 2019

The @PerformanceSensitive annotation can be used to mark code that needs to be treated with care. We don't have any tooling around that annotation currently, but it's useful for catching regressions like this later on when modifying that code.
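As a small illustration of that suggestion, the class below is hypothetical, and the import assumes @PerformanceSensitive lives in org.apache.logging.log4j.util; adjust to wherever the annotation is actually declared.

```java
// Hypothetical example class; the annotation package is an assumption.
import org.apache.logging.log4j.util.PerformanceSensitive;

public class GcFreeHotPath {

    // Marks this method as allocation-sensitive so that later modifications
    // get extra scrutiny, even though no tooling currently enforces it.
    @PerformanceSensitive("allocation")
    public void logOnHotPath() {
        // ... garbage-free logging logic ...
    }
}
```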

@remkop
Contributor

remkop commented Jan 14, 2019

Carter, thanks for the clarification. I don’t object to merging the fix first.

@carterkozak
Contributor

I've filed this ticket for tracking: https://issues.apache.org/jira/browse/LOG4J2-2533

asfgit pushed a commit that referenced this pull request Jan 14, 2019
carterkozak merged commit b3a910d into apache:master on Jan 14, 2019
@carterkozak
Contributor

Thank you for your contribution @mprusakov-rbc!
