Limit the max length of decoded content in Decoding{Service,Client}
#4564
Conversation
Motivation:

`ServerBuilder.maxRequestLength()` and `ClientBuilder.maxResponseLength()` can limit the maximum content length at the session layer. https://github.com/line/armeria/blob/ec7c4251994f4b3b0089f95904dc1a68e5e9ce30/core/src/main/java/com/linecorp/armeria/server/ServerBuilder.java#L1657-L1662

However, length-limited compressed content can still decompress into a humongous object if the original content was compressed at a high ratio. Such large objects may cause an `OutOfMemoryError` even though the content length is limited at the session layer. For that reason, it is necessary to check the length of the output while decompressing the compressed content.

Additionally, this PR also fixes #4469.

Note: `ZlibDecoder` is able to limit the size at the decoder level. Unfortunately, `BrotliDecoder` has no such option, but its output is chunked into 4MiB blocks when it exceeds 4MiB. We can't precisely enforce the output size, but 4MiB seems an acceptable size.

Modifications:
- Added a new API that takes the maximum length of a decompressed object to `StreamDecoderFactory`.
- Fixed `StreamDecoder` to raise `ContentTooLargeException` if the total length of an object exceeds the limit.
- Fixed `ZlibStreamDecoder` to hand over the given `maxLength` to `ZlibDecoder`.
- Fixed `AbstractStreamDecoder` to limit the content length after decoding.
- Added `com.aayushatharva.brotli4j:native-osx-aarch64` as an optional dependency.
- Removed the obsolete `client.encoding.StreamDecoderFactories` and `client.encoding.ZlibStreamDecoder`, which can be replaced with the same classes in `common.encoding`.
- Made `ContentTooLargeException` take a cause, which is most likely a `DecompressionException` raised by `ZlibDecoder`.
- Fixed `DefaultServerErrorHandler` to return `413 Request Entity Too Large` for `ContentTooLargeException`.
- Replaced `ByteArrayOutputStream` with `ByteBufOutputStream` in `HttpEncodedResponse` for zero-copy.
- The `Content-Encoding` header is now removed by `DecodingClient` and `DecodingService` when compressed content is to be decompressed.

Result:
- `DecodingClient` and `DecodingService` can now limit the maximum length of decompressed data.
- Fixes #4469
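The decompression-bomb risk described above can be demonstrated with plain JDK gzip APIs. This is a dependency-free sketch, not Armeria's implementation; Armeria raises `ContentTooLargeException` where the sketch throws an `IllegalStateException`:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch (not Armeria code): a tiny compressed payload can expand far beyond
// any session-layer length limit, so the *decoded* length must be capped too.
final class DecompressionBombSketch {

    /** Gzips {@code raw} and returns the compressed bytes. */
    static byte[] gzip(byte[] raw) {
        final ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Decompresses {@code compressed}, failing once the output exceeds {@code maxLength}. */
    static byte[] decompressWithLimit(byte[] compressed, int maxLength) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            final ByteArrayOutputStream out = new ByteArrayOutputStream();
            final byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                if (out.size() + n > maxLength) {
                    // Armeria raises ContentTooLargeException here; the sketch
                    // uses IllegalStateException to stay dependency-free.
                    throw new IllegalStateException(
                            "decoded content exceeds " + maxLength + " bytes");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A 1MiB body of zeros gzips to roughly a kilobyte, so a session-layer limit on the compressed bytes alone would happily accept it while the decoded side balloons.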
Looks great! Left some small suggestions. 😉
            decoder.writeInbound(obj.byteBuf());
        } catch (DecompressionException ex) {
            final String message = ex.getMessage();
            if (message != null && message.startsWith("Decompression buffer has reached maximum size:")) {
A small suggestion. 😄
How about filing an issue to the upstream for introducing a specific exception type for this so that we don't rely on the message?
Agreed.
            decoded.release();
        }
        newBuf.release();
        throw ContentTooLargeException.builder()
Could you check the call path of this method to see whether we throw this exception where we shouldn't? It seems like `AbstractStreamDecoder.finish()` is called in `onComplete` and the exception isn't caught.
Nice point. We need to call `delegate.onError()` if `beforeError()` raises an exception. Let me check possible cases.
Fixed to handle `ContentTooLargeException` in:
- `beforeComplete()`: Halt `onComplete()` and call `onError()` instead.
- `beforeError()`: Create a `CompositeException` and set both the original exception and the `ContentTooLargeException`.
- `onCancellation()`: Since the stream was canceled, just log the error message as a warning.
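The completion-path part of that fix can be sketched generically: if the pre-completion hook raises (for example on an over-sized final decoded chunk), suppress `onComplete()` and surface the failure through `onError()` instead. All names below are illustrative, not Armeria's actual classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Sketch of the completion-path fix: a failure raised by beforeComplete()
// (e.g. ContentTooLargeException from the final decoded chunk) must halt
// onComplete() and be delivered via onError() instead of being swallowed.
final class CompletionPathSketch {
    final List<String> events = new ArrayList<>();

    void onComplete(Callable<Void> beforeComplete) {
        try {
            beforeComplete.call();
        } catch (Exception e) {
            // Halt onComplete() and surface the failure downstream instead.
            events.add("onError:" + e.getMessage());
            return;
        }
        events.add("onComplete");
    }
}
```

The key point is the early `return`: once `onError()` has been signalled, `onComplete()` must never follow, because reactive-streams subscribers expect exactly one terminal signal.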
        if (noJdkZlibDecoder) {
            return new JZlibDecoder(wrapper, maxLength);
        } else {
            return new JdkZlibDecoder(wrapper, true, maxLength);
What does the `true` mean? It seems like the default value is `false` in `JdkZlibDecoder`.
The original API we used, `ZlibCodecFactory.newZlibDecoder()`, internally set `decompressConcatenated` to `true`.
https://github.com/netty/netty/blob/4.1/codec/src/main/java/io/netty/handler/codec/compression/ZlibCodecFactory.java#L124
netty/netty@6ff48dc
It seems like a gzip stream can be a concatenation of multiple streams.
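That multi-member behavior can be demonstrated without Netty: the JDK's `GZIPInputStream` also decodes concatenated gzip members, which is what `decompressConcatenated = true` preserves. A small sketch:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch: a gzip payload may consist of several members concatenated back to
// back. Decoders with decompressConcatenated=true (and the JDK's
// GZIPInputStream) decode all members, not just the first.
final class ConcatenatedGzipSketch {

    static byte[] gzip(String s) {
        final ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(s.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Decodes every gzip member in {@code data}, not just the first. */
    static String gunzipAll(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            final ByteArrayOutputStream out = new ByteArrayOutputStream();
            final byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Compressing two strings separately and concatenating the raw bytes still round-trips to the joined plaintext, so a decoder that stops at the first member would silently truncate such a body.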
    @@ -92,7 +101,7 @@ protected HttpObject filter(HttpObject obj) {
                return obj;
            }
    
    -       encodedStream = new ByteArrayOutputStream();
    +       encodedStream = new ByteBufOutputStream(alloc.buffer());
How about specifying the initial capacity using the value from the `content-length` header?
Nice idea.
By the way, most text data shrinks by a factor of about 5~8 when compressed, so I'm not sure what ratio is a sensible default. Let me set `content-length / 3` as the initial value.
I changed the initial value to `content-length / 2`. The compression ratio would usually be higher than 50%.
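The heuristic discussed in this thread might look like the following sketch. `initialCapacity` and `DEFAULT_CAPACITY` are illustrative names, not Armeria's API:

```java
// Sketch of the heuristic discussed above: size the buffer that will hold the
// *encoded* (compressed) body at half the original content-length, assuming a
// compression ratio of at least 50%. Names are illustrative, not Armeria's API.
final class InitialCapacitySketch {
    static final int DEFAULT_CAPACITY = 256;  // fallback when content-length is unknown

    static int initialCapacity(long contentLength) {
        if (contentLength <= 0) {
            // Unknown or empty body: fall back to a small default.
            return DEFAULT_CAPACITY;
        }
        // content-length / 2, clamped between the default and Integer.MAX_VALUE.
        return (int) Math.min(Integer.MAX_VALUE, Math.max(DEFAULT_CAPACITY, contentLength / 2));
    }
}
```

The exact divisor is a tuning knob: too small wastes memory up front, too large causes repeated buffer growth and copying during encoding.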
Overall looks good 👍 Left a minor question
core/src/main/java/com/linecorp/armeria/common/encoding/StreamDecoderFactory.java
    @@ -41,6 +41,11 @@ final class HttpEncoders {
    
        private static final Encoder.Parameters BROTLI_PARAMETERS = new Encoder.Parameters().setQuality(4);
    
    +   static {
    +       // Invoke to load Brotli native binary.
    +       Brotli.isAvailable();
    +   }
Question) Why is it necessary to pre-load this class here?
It seems to be a defect of brotli4j that the Brotli native library is not initialized when a `BrotliOutputStream` is created.
It is not a problem in production because `Brotli.isAvailable()` is called before the Brotli codec is applied in the request call path.
However, when I tried to test the Brotli codec with `HttpEncoders`, the test failed because the Brotli native library wasn't loaded. I added this for the convenience of testing.
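The workaround is the classic static-initializer preload pattern: touch the library class once when the enclosing class is initialized, so its native setup runs before first real use. A self-contained sketch, where `NativeLib` is a stand-in for brotli4j's `Brotli` class:

```java
// Sketch of the preload pattern discussed above. NativeLib stands in for
// brotli4j's Brotli; in the real code the static block calls
// Brotli.isAvailable() to force the native binary to load.
final class PreloadSketch {
    static boolean loaded;

    static final class NativeLib {
        static boolean isAvailable() {
            loaded = true;  // stands in for loading the native binary
            return true;
        }
    }

    static {
        // Invoke once so the library is initialized even in tests that
        // bypass the normal request path.
        NativeLib.isAvailable();
    }
}
```

Because JVM class initialization is lazy but guaranteed before first use, any code path that touches `PreloadSketch` sees the library already loaded.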
> However, when I tried to test the Brotli codec with `HttpEncoders`, the test failed because the Brotli native library wasn't loaded. I added this for the convenience of testing.

That's interesting. Which test failed without this?
You can see the Brotli initialization error in `HttpEncodedResponseTest.shouldReleaseEncodedStreamOnError()` after removing `Brotli.isAvailable()`.
Ah ha, I get it. `determineEncoding`, which calls `Brotli.isAvailable()`, is called ahead of time, so it was no big deal when the code is used in production. 😉
Thanks, @ikhoon!
Looks great! Thanks @ikhoon! 🙇 👍 🙇
🎉 🎉 🎉