ShallowEtagHeaderFilter should use a more efficiently dynamically resizing buffer than ResizableByteArrayOutputStream [SPR-12081] #16697
ShallowEtagHeaderFilter buffers the response in ResizableByteArrayOutputStream. When it needs to grow the buffer, it calls ResizableByteArrayOutputStream.resize(int).
ResizableByteArrayOutputStream then creates a new buffer, copies the old buffer into the new buffer, then (implicitly) releases the old buffer.
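To make the cost concrete, here is a minimal sketch of the copy-on-grow pattern described above. It is not Spring's ResizableByteArrayOutputStream; the class name `CopyOnGrowBuffer`, the doubling policy, and the `bytesCopied` counter are assumptions added purely for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration of a copy-on-grow buffer (not Spring's class).
class CopyOnGrowBuffer {
    private byte[] buf;
    private int count;
    // Assumed bookkeeping: total bytes moved by resize operations so far.
    long bytesCopied;

    CopyOnGrowBuffer(int initialCapacity) {
        buf = new byte[initialCapacity];
    }

    void write(byte[] data) {
        if (count + data.length > buf.length) {
            // Assumed growth policy: at least double, or as large as needed.
            resize(Math.max(buf.length * 2, count + data.length));
        }
        System.arraycopy(data, 0, buf, count, data.length);
        count += data.length;
    }

    private void resize(int newCapacity) {
        bytesCopied += count;
        // Allocates a new array and copies the old contents into it;
        // the old array becomes garbage once the reference is replaced.
        buf = Arrays.copyOf(buf, newCapacity);
    }

    int size() {
        return count;
    }

    byte[] toByteArray() {
        return Arrays.copyOf(buf, count);
    }
}
```

During each resize both the old and the new array are live at the same time, which is where the extra peak memory comes from.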
It would be more efficient to use a linked list of buffers. The resize operation would create a new empty buffer and add it to the linked list of buffers. This approach results in less copying and therefore less overall memory use (reducing the frequency of problems such as #15482 when they are caused by buffer resizing).
For an example of this approach in use today, see Grails' StreamCharBuffer: https://github.com/grails/grails-core/blob/355a031bfacbdc94c60b4a8fe4131a500c8833cb/grails-encoder/src/main/groovy/org/grails/buffer/StreamCharBuffer.java Note that this approach is also in use in MyFaces, see https://issues.apache.org/jira/browse/MYFACES-3450
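A rough sketch of the linked-list-of-buffers idea follows. It is not the code from StreamCharBuffer, MyFaces, or the pull request; `ChunkedBuffer`, the fixed chunk size, and `chunkCount()` are hypothetical names and choices made for this example.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch: growing by appending chunks instead of copying.
class ChunkedBuffer {
    private final List<byte[]> fullChunks = new LinkedList<>();
    private byte[] current;
    private int pos;   // next write index within the current chunk
    private int size;  // total bytes written

    ChunkedBuffer(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        current = new byte[chunkSize];
    }

    void write(byte[] data) {
        int off = 0;
        int len = data.length;
        while (len > 0) {
            if (pos == current.length) {
                addChunk();
            }
            int n = Math.min(len, current.length - pos);
            System.arraycopy(data, off, current, pos, n);
            pos += n;
            off += n;
            len -= n;
            size += n;
        }
    }

    // "Resize": keep the full chunk as-is and start a new empty one.
    // Previously written bytes are never copied.
    private void addChunk() {
        fullChunks.add(current);
        current = new byte[current.length];
        pos = 0;
    }

    int size() {
        return size;
    }

    int chunkCount() {
        return fullChunks.size() + 1;
    }

    // Single copy at the end, only if a contiguous array is really needed.
    byte[] toByteArray() {
        byte[] result = new byte[size];
        int offset = 0;
        for (byte[] chunk : fullChunks) {
            System.arraycopy(chunk, 0, result, offset, chunk.length);
            offset += chunk.length;
        }
        System.arraycopy(current, 0, result, offset, pos);
        return result;
    }
}
```

A real implementation would likely grow the chunk size over time rather than keep it fixed; the fixed size here just keeps the sketch short.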
Juergen Hoeller commented
I suppose this is only really inefficient when the content length has not been specified? Since with a specified content length, we'll use an appropriately sized buffer in the first place...
I'm just wondering how important this is for the 4.1.x line. We'll start 4.2 work pretty soon and this kind of rearchitecting in the details would be a fine fit there from my perspective.
Agreed - this is only important when the content length is not specified. However, for most responses, the content length isn't known ahead of time. I think the length is known only when serving static content; for typical dynamic responses (gsp, jsp, velocity, freemarker, raw response.write, etc.) there's no way to know, so a lot of resizing happens. These types of responses are also the least cacheable by caching proxies, so they're the most likely to be processed by the application server, which makes this issue that much more important.
I'd love to have the non-API change parts made for 4.1.x. But if that's not possible, I'll still be more than happy to see the improvement made for 4.2.x. :-)
Brian Clozel commented
This is something we can work on, for sure.
Initial results show that the current implementation performs as well as or better than the proposal.
For 25Kb responses
For 100Kb responses
This benchmark (like many of those) may be flawed, so don't hesitate to point out any issues with it.
Brian Clozel, sorry for the delayed response - it took me a while to figure out why your test got such different results from what I expected. It turns out that a rather small thing makes a measurable difference.
In your test, you do:
which causes FastByteArrayOutputStream to produce a "safe" byte array. It creates a new byte array, copies all the buffered data into it, releases all the previously buffered data, uses the newly created byte array as the new buffer, then copies that buffer and returns the copy. This final copy is the slow part: toByteArrayUnsafe() returns the internal buffer instead of copying it. If you use baos.toByteArrayUnsafe() you'll get different results, with FastByteArrayOutputStream beating ResizableByteArrayOutputStream with a fixed size and roughly tying for the unknown size.
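The difference between the two accessors can be sketched as below. Only the method names toByteArray() and toByteArrayUnsafe() come from the discussion; the class `SafeVsUnsafe` and its single-array internals are hypothetical and much simpler than the real FastByteArrayOutputStream.

```java
// Hypothetical sketch of a "safe" (copying) versus "unsafe" (shared) accessor.
class SafeVsUnsafe {
    private final byte[] internal;

    SafeVsUnsafe(byte[] contents) {
        // Take a private copy so the caller's array is not shared.
        this.internal = contents.clone();
    }

    // Safe: returns a defensive copy, so callers cannot corrupt the buffer.
    // Costs one extra allocation and one pass over the data.
    byte[] toByteArray() {
        return internal.clone();
    }

    // Unsafe: returns the internal array itself. No copy, but any change the
    // caller makes is visible to everyone else holding this buffer.
    byte[] toByteArrayUnsafe() {
        return internal;
    }
}
```

The unsafe variant is only appropriate when the caller is trusted not to modify the returned array.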
In my pull request, toByteArray() is only used in the first commit in order to be safe (we can't rely on classes that extend ShallowEtagHeaderFilter and override isEligibleForEtag or generateETagHeaderValue to not modify the byte array). So the lower performance your jmh test highlights would be present if only the first commit is merged.
However, the second commit eliminates this problem. toByteArray() is never called; the OutputStream is read as a stream (instead of as a byte array). So it's both safe and fast.
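As an illustration of reading buffered data as a stream rather than as one array, the sketch below feeds chunks to a digest one at a time. It is not the pull request's code: `ChunkDigest`, `md5Hex`, and the use of MD5 over a list of chunks are assumptions for this example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical sketch: hash buffered chunks without building one big array.
class ChunkDigest {
    static String md5Hex(List<byte[]> chunks) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
        // Each chunk is only read, never copied or exposed for modification.
        for (byte[] chunk : chunks) {
            md.update(chunk);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}
```

Because the digest is updated incrementally, the result is the same as hashing the concatenated bytes, but no contiguous copy of the response is ever created.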
Finally, I think that it's much more realistic for little bits of data to be written many times rather than one bit of data written once. So I added these 2 tests:
Using these tests, FastByteArrayOutputStream beats ResizableByteArrayOutputStream by ~50%. It even wins when using toByteArray() instead of toByteArrayUnsafe().