Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LOG4J2-2816: Handle Disruptor event translation exceptions #488

Conversation

jacobrshields
Copy link

LOG4J2-2816

Problem

If RingBufferLogEventTranslator#translateTo throws an exception for any reason, Disruptor's RingBuffer#translateAndPublish will still publish the sequence in a finally block (source).

In such a case, the "untranslated" and unpopulated event will later be consumed by RingBufferLogEventHandler, since its sequence was published. However, the event will be missing all values, including asyncLogger. This causes a NullPointerException to be thrown during event handling in RingBufferLogEvent#execute:

AsyncLogger error handling event seq=0, value='org.apache.logging.log4j.core.async.RingBufferLogEvent@7bb6d06c': java.lang.NullPointerException: null
java.lang.NullPointerException
        at org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:154)
        at org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:46)
        at org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:29)
        at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:168)
        at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:125)
        at java.lang.Thread.run(Thread.java:748)

Solution

Log4j needs to handle the case where an exception is thrown by RingBufferLogEventTranslator#translateTo.

The Disruptor documentation makes it clear that once a slot is claimed in the ring buffer, the sequence must be published. Otherwise the state of the Disruptor can be corrupted (source). That is why RingBuffer#translateAndPublish is designed the way it is, and why its usage is recommended over more manual methods.

In order to handle such exceptions, then, it seems like the EventHandler must be responsible for checking that the event it is handling is sufficiently populated. Such an approach is also suggested by this Disruptor GitHub issue discussion. This PR implements that approach by adding an isPopulated property to RingBufferLogEvent.

Tests

The new test in AsyncLoggerEventTranslationExceptionTest fails on release-2.x and passes on this branch.

@birdayz
Copy link

birdayz commented Jul 22, 2021

Can we get this merged? We're hitting the same issue.

@vy
Copy link
Member

vy commented Jul 23, 2021

Thanks so much @jacobrshields! Tests also look great too. I have merged your changes into release-2.x. I will port this to master too. Looking forward to your next contribution!

@vy vy closed this Jul 23, 2021
@jacobrshields
Copy link
Author

@vy Thank you for merging this in, I appreciate it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants