
Limit the max length of decoded content in Decoding{Service,Client} #4564

Merged
minwoox merged 6 commits into line:master from the decoding-limit branch on Dec 20, 2022

Conversation

Contributor

@ikhoon ikhoon commented Dec 7, 2022

Motivation:

ServerBuilder.maxRequestLength() and
ClientBuilder.maxResponseLength() can limit the maximum content length at the session layer.

/**
 * Sets the maximum allowed length of the content decoded at the session layer.
 * e.g. the content length of an HTTP request.
 *
 * @param maxRequestLength the maximum allowed length. {@code 0} disables the length limit.
 */
public ServerBuilder maxRequestLength(long maxRequestLength) {
The length-limited compressed content could still be a humongous object if the original content was compressed with a high ratio. Such big objects may cause OutOfMemoryError even if the content length is limited at the session layer. For that reason, it is necessary to check the length of the output content while decompressing the compressed content.
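
For reference, here is a minimal sketch of configuring the session-layer limits mentioned above; the endpoint, service, and 10 MiB values are illustrative, not taken from this PR.

import com.linecorp.armeria.client.WebClient;
import com.linecorp.armeria.common.HttpResponse;
import com.linecorp.armeria.common.HttpStatus;
import com.linecorp.armeria.server.Server;

final class SessionLayerLimits {
    public static void main(String[] args) {
        // Caps the *encoded* request content at the session layer; a small compressed
        // body can still decompress into something far larger than this limit.
        final Server server = Server.builder()
                .maxRequestLength(10 * 1024 * 1024) // 10 MiB, illustrative value
                .service("/", (ctx, req) -> HttpResponse.of(HttpStatus.OK))
                .build();

        // The same idea on the client side for the encoded response content.
        final WebClient client = WebClient.builder("https://example.com")
                .maxResponseLength(10 * 1024 * 1024) // 10 MiB, illustrative value
                .build();
    }
}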

This PR also fixes #4469.

Note:

ZlibDecoder is able to limit the size at the decoder level. Unfortunately, there is no such option in BrotliDecoder, but its output is chunked into 4 MiB blocks when it exceeds 4 MiB. We can't enforce the output size precisely, but 4 MiB seems an acceptable granularity.
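
As a rough sketch, a size-capped zlib decoder can be constructed as below; the wrapper class and method name are illustrative, while the decoder constructors match the diff reviewed later in this PR.

import io.netty.handler.codec.compression.JZlibDecoder;
import io.netty.handler.codec.compression.JdkZlibDecoder;
import io.netty.handler.codec.compression.ZlibDecoder;
import io.netty.handler.codec.compression.ZlibWrapper;

final class ZlibDecoders {
    // maxLength caps the decompression buffer; the decoder raises a
    // DecompressionException once the decoded data grows beyond it.
    static ZlibDecoder newZlibDecoder(ZlibWrapper wrapper, boolean noJdkZlibDecoder, int maxLength) {
        if (noJdkZlibDecoder) {
            return new JZlibDecoder(wrapper, maxLength);
        }
        // `true` enables decompressConcatenated, matching ZlibCodecFactory.newZlibDecoder().
        return new JdkZlibDecoder(wrapper, true, maxLength);
    }
}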

Modifications:

  • Added a new API that takes the max length of a decompressed object to StreamDecoderFactory.
  • Fixed StreamDecoder to raise ContentTooLargeException if the total length of a decoded object exceeds the limit.
    • Fixed ZlibStreamDecoder to hand over the given maxLength to ZlibDecoder.
    • Fixed AbstractStreamDecoder to limit the content length after decoding.
  • Added com.aayushatharva.brotli4j:native-osx-aarch64 as an optional dependency.
  • Removed the obsolete client.encoding.StreamDecoderFactories and client.encoding.ZlibStreamDecoder, which can be replaced with the same classes in common.encoding.
  • Made ContentTooLargeException take a cause, which is most likely a DecompressionException raised by ZlibDecoder.
  • Fixed DefaultServerErrorHandler to return 413 Request Entity Too Large for ContentTooLargeException.
  • Replaced ByteArrayOutputStream with ByteBufOutputStream in HttpEncodedResponse for zero-copy.
  • The Content-Encoding header is now removed by DecodingClient and DecodingService when compressed content is to be decompressed.

Result:

  • DecodingClient and DecodingService can now limit the maximum length of decompressed data.
  • Fixes #4469
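
A hedged usage sketch of the new limit follows; DecodingService.builder(), DecodingClient.builder(), and the maxRequestLength()/maxResponseLength() method names are assumptions made for illustration and are not confirmed by this PR text.

import com.linecorp.armeria.client.WebClient;
import com.linecorp.armeria.client.encoding.DecodingClient;
import com.linecorp.armeria.common.HttpResponse;
import com.linecorp.armeria.common.HttpStatus;
import com.linecorp.armeria.server.Server;
import com.linecorp.armeria.server.encoding.DecodingService;

final class DecodingLimits {
    public static void main(String[] args) {
        // Server side: cap the decompressed request content (builder and method names assumed).
        final Server server = Server.builder()
                .service("/", (ctx, req) -> HttpResponse.of(HttpStatus.OK))
                .decorator(DecodingService.builder()
                                          .maxRequestLength(10 * 1024 * 1024) // assumed API
                                          .newDecorator())
                .build();

        // Client side: cap the decompressed response content (builder and method names assumed).
        final WebClient client = WebClient.builder("https://example.com")
                .decorator(DecodingClient.builder()
                                         .maxResponseLength(10 * 1024 * 1024) // assumed API
                                         .newDecorator())
                .build();
    }
}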
@ikhoon ikhoon added this to the 1.21.0 milestone Dec 7, 2022
Member

@minwoox minwoox left a comment

Looks great! Left some small suggestions. 😉

    decoder.writeInbound(obj.byteBuf());
} catch (DecompressionException ex) {
    final String message = ex.getMessage();
    if (message != null && message.startsWith("Decompression buffer has reached maximum size:")) {
Member

A small suggestion. 😄
How about filing an issue to the upstream for introducing a specific exception type for this so that we don't rely on the message?

Contributor Author

Agreed.

decoded.release();
}
newBuf.release();
throw ContentTooLargeException.builder()
Member

Could you check the call path of this method to see whether we throw this exception where we shouldn't?
It seems like AbstractStreamDecoder.finish() is called in onComplete() and the exception isn't caught.

Contributor Author

Nice point. We need to call delegate.onError() if beforeError() raises an exception.
Let me check possible cases.

Contributor Author

Fixed to handle ContentTooLargeException (see the sketch after this list) in:

  • beforeComplete(): Halt onComplete() and call onError() instead
  • beforeError(): Create a CompositeException and set both the original exception and ContentTooLargeException
  • onCancellation(): Since the stream was canceled, just warn the error message.
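
A generic sketch of the completion-path handling described above; this is not the actual Armeria code, and the helper and parameter names are made up for illustration.

import org.reactivestreams.Subscriber;

final class CompleteOrError {
    // Work deferred to stream completion, such as StreamDecoder.finish(), can still fail
    // with ContentTooLargeException, so the failure must be surfaced through onError()
    // instead of letting the stream complete normally.
    static <T> void completeOrError(Runnable finishDecoding, Subscriber<? super T> downstream) {
        try {
            finishDecoding.run(); // e.g. flush the decoder and verify the total decoded length
        } catch (RuntimeException cause) {
            downstream.onError(cause); // halt onComplete() and propagate the failure instead
            return;
        }
        downstream.onComplete();
    }
}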

if (noJdkZlibDecoder) {
    return new JZlibDecoder(wrapper, maxLength);
} else {
    return new JdkZlibDecoder(wrapper, true, maxLength);
Member

What does the true mean? It seems like the default value is false in JdkZlibDecoder.

Contributor Author

The original API we used, ZlibCodecFactory.newZlibDecoder(), internally sets decompressConcatenated to true.
https://github.com/netty/netty/blob/4.1/codec/src/main/java/io/netty/handler/codec/compression/ZlibCodecFactory.java#L124

Member

netty/netty@6ff48dc
It seems like a gzip stream could be a concatenation of multiple streams.
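
For illustration, a gzip stream containing two members can be produced as below (RFC 1952 allows concatenated members, which is why decompressConcatenated matters); the strings are arbitrary.

import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class ConcatenatedGzip {
    public static void main(String[] args) throws Exception {
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Each try-with-resources block writes a complete, independent gzip member.
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write("Hello, ".getBytes(StandardCharsets.UTF_8));
        }
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write("Armeria!".getBytes(StandardCharsets.UTF_8));
        }
        // `out` now holds two back-to-back gzip members; a decoder with
        // decompressConcatenated enabled decodes both, yielding "Hello, Armeria!".
        System.out.println("Concatenated gzip stream: " + out.size() + " bytes");
    }
}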

@@ -92,7 +101,7 @@ protected HttpObject filter(HttpObject obj) {
         return obj;
     }

-    encodedStream = new ByteArrayOutputStream();
+    encodedStream = new ByteBufOutputStream(alloc.buffer());
Member

How about specifying the initial capacity using the value from the content-length header?

Contributor Author

Nice idea.

Contributor Author
By the way, most text data shrinks by a factor of 5 to 8 when compressed, so I'm not sure what ratio is a sensible value. Let me set content-length / 3 as the initial value.

Contributor Author

@ikhoon ikhoon Dec 15, 2022

I changed the initial value to content-length / 2. The compression ratio would usually be higher than 50%.
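
A rough sketch of this approach; the class, method, and fallback behavior are illustrative rather than the exact PR code.

import io.netty.buffer.ByteBufAllocator;
import io.netty.buffer.ByteBufOutputStream;

final class EncodedStreams {
    // Assume the compressed output is at most about half of the original content and use
    // that as the initial buffer capacity to avoid repeated buffer expansions.
    static ByteBufOutputStream newEncodedStream(ByteBufAllocator alloc, long contentLength) {
        if (contentLength > 0) {
            final int initialCapacity = (int) Math.min(Integer.MAX_VALUE, contentLength / 2);
            return new ByteBufOutputStream(alloc.buffer(initialCapacity));
        }
        return new ByteBufOutputStream(alloc.buffer()); // unknown Content-Length: default capacity
    }
}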

Contributor

@jrhee17 jrhee17 left a comment

Overall looks good 👍 Left a minor question

@@ -41,6 +41,11 @@ final class HttpEncoders {

     private static final Encoder.Parameters BROTLI_PARAMETERS = new Encoder.Parameters().setQuality(4);

+    static {
+        // Invoke to load Brotli native binary.
+        Brotli.isAvailable();
Contributor
Question) Why is it necessary to pre-load this class here?

Contributor Author

@ikhoon ikhoon Dec 15, 2022

It seems to be a defect of brotli4j that the Brotli native library is not initialized when a BrotliOutputStream is created.
It is not a problem in production because Brotli.isAvailable() is called before the Brotli codec is applied in the request call path.

However, when I tried to test the Brotli codec with HttpEncoders, the test failed because the Brotli native library wasn't loaded. I added this call for testing convenience.

Member

However, when I tried to test the Brotli codec with HttpEncoders, the test failed because the Brotli native library wasn't loaded. I added this call for testing convenience.

That's interesting. Which test failed without this?

Contributor Author

You can see the brotli initialization error in HttpEncodedResponseTest.shouldReleaseEncodedStreamOnError() after removing Brotli.isAvailable().

Member

Ah, I get it. determineEncoding(), which calls Brotli.isAvailable(), is called beforehand, so it was no big deal when the code is used in production. 😉

Member

@minwoox minwoox left a comment

Thanks, @ikhoon!


Contributor

@jrhee17 jrhee17 left a comment

Looks great! Thanks @ikhoon ! 🙇 👍 🙇

@minwoox minwoox merged commit dd49448 into line:master Dec 20, 2022
Member

minwoox commented Dec 20, 2022

🎉 🎉 🎉

@ikhoon ikhoon deleted the decoding-limit branch May 25, 2023 10:04

Successfully merging this pull request may close these issues.

Consider removing content-encoding header in DecodingService and DecodingClient