Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce GC pressure in CompositeJsonEncoder and *LogstashTcpSocketAppender #461

Closed
brenuart opened this issue Dec 30, 2020 · 4 comments · Fixed by #472
Closed

Reduce GC pressure in CompositeJsonEncoder and *LogstashTcpSocketAppender #461

brenuart opened this issue Dec 30, 2020 · 4 comments · Fixed by #472
Milestone

Comments

@brenuart
Copy link
Collaborator

brenuart commented Dec 30, 2020

CompositeJsonEncoder implements the Encoder interface and therefore must return the encoded event as a byte array.
The implementation makes use of an intermediate ByteArrayOutputStream to collect the various parts produced by the formatter and prefix/suffix encoders. When done, the result is returned as a byte array.

A new ByteArrayOutputStream is initialised for every log event. It starts with an initial size of about 1Kb (+ prefix/suffix length) by default and grows if the formatter produces a larger output. If we are lucky and the initial size is large enough, this process allocates 2 byte arrays and 2 memory copy. If the buffer needs to grow, a new one is allocated (larger) and the content of the previous is copied into it. We end up with 3 allocations and 3 copy operations.

This process is repeated for every event and imposes an extra overhead on the garbage collector.

Most of the time, the caller will write the output of the Encoder into an output stream. In this case, using an intermediate byte array isn't the most efficient design (well, I know, this is how Logback's Encoder interface is designed :-( ...

But maybe we could do better... I was thinking about introducing a new StreamingEncoder interface similar to this:

public interface StreamingEncoder {
  void encode(Event event, OutputStream stream) throws IOException;
}

CompositeJsonEncoder can be easily modified to implement this new interface alongside the existing byte[] encode(event) method. Then we can adapt AbstractLogstashTcpSocketAppender around lines L598-L602 to tell the encoder to write directly into the output stream if it implements the new StreamingEncoder interface.

This would be highly efficient while preserving support for "legacy" encoders.

I made a first POC with this idea and everything looks OK.
What do you think ?
Do you see other areas/classes that could be optimised using a similar technique?

@philsttr
Copy link
Collaborator

<rant>
It's funny you mention this, because this was one of my biggest complaints in logback 1.2. Between logback 1.1 and 1.2, the Encoder interface changed to return a byte[], instead of being able to write to an internal stream. Besides not being backwards compatible (which is another complaint), encoders in logback 1.2 cannot be as efficient as logback 1.1. :(
</rant>

Your idea sounds intriguing. Can you submit the POC as draft PR so I can take a deeper look?

@brenuart
Copy link
Collaborator Author

brenuart commented Dec 31, 2020 via email

@philsttr
Copy link
Collaborator

philsttr commented Jan 1, 2021

Btw, do we still have to support Logback 1.0.x?

1.0 no
1.1 yes, but support can potentially be dropped in a future major release of logstash-logback-encoder #465

brenuart added a commit to brenuart/logstash-logback-encoder that referenced this issue Jan 5, 2021
…f returning a byte array

Introduce a new (internal) StreamingEncoder interface to be implemented by Encoders that supports writing directly into the output stream instead of returning their results as a byte array.

Update both the AbstractLogstashTcpSocketAppender and the CompositeJsonEncoder to support this new interface. This should hoppefully reduce the amount of short-lived byte arrays created for each log event.

See logfellow#461 for more information.
brenuart added a commit to brenuart/logstash-logback-encoder that referenced this issue Jan 5, 2021
…f returning a byte array

Introduce a new (internal) StreamingEncoder interface to be implemented by Encoders that supports writing directly into the output stream instead of returning their results as a byte array.

Update both the AbstractLogstashTcpSocketAppender and the CompositeJsonEncoder to support this new interface. This should hoppefully reduce the amount of short-lived byte arrays created for each log event.

See logfellow#461 for more information.
@philsttr
Copy link
Collaborator

philsttr commented Jan 9, 2021

Closing this issue. Continue discussion on PR #472

@philsttr philsttr closed this as completed Jan 9, 2021
@philsttr philsttr modified the milestone: 7.0 Jan 9, 2021
brenuart added a commit to brenuart/logstash-logback-encoder that referenced this issue Jan 9, 2021
…f returning a byte array

Introduce a new (internal) StreamingEncoder interface to be implemented by Encoders that supports writing directly into the output stream instead of returning their results as a byte array.

Update both the AbstractLogstashTcpSocketAppender and the CompositeJsonEncoder to support this new interface. This should hoppefully reduce the amount of short-lived byte arrays created for each log event.

See logfellow#461 for more information.
philsttr added a commit that referenced this issue Jul 5, 2021
Reduce memory allocations by writing directly into the output stream (#461)
@philsttr philsttr added this to the 7.0 milestone Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants