[AVRO-4247] enforce decompression size limits#3745
Open
steveloughran wants to merge 6 commits intoapache:mainfrom
…b DoS Add maximum decompression size limit in DeflateCodec to prevent OutOfMemoryError when processing maliciously crafted Avro files with high compression ratios (decompression bombs). The limit defaults to 200MB and can be configured via system property: org.apache.avro.limits.decompress.maxLength
….java Thanks! Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
….java Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
- Move MAX_DECOMPRESS_LENGTH initialization to static block (read once at class load)
- Add WARNING log for invalid property values (NumberFormatException)
- Validate negative and zero values, reject with warning
- Add "(bytes)" to error message for clarity
- Add quotes around property name in error message

Test command:
  java -Xmx64m -Dorg.apache.avro.limits.decompress.maxLength=1048576 \
    -jar avro-tools-1.13.0-SNAPSHOT.jar tojson poc.avro

Expected behavior:
  Exception in thread "main" org.apache.avro.AvroRuntimeException: Decompressed size 1056768 (bytes) exceeds maximum allowed size 1048576. This can be configured by setting the system property 'org.apache.avro.limits.decompress.maxLength'
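The property handling described in this commit message can be sketched roughly as below. This is a minimal illustration, not the code in the PR: the class name `DecompressLimitSketch`, the helper `parseLimit`, the use of `java.util.logging`, and the use of `IllegalStateException` in place of `AvroRuntimeException` are all assumptions made to keep the example self-contained. The 200MB default is taken here as 200 * 1024 * 1024 bytes, which is also an assumption.

```java
import java.util.logging.Logger;

// Hypothetical sketch; names other than the property key are illustrative only.
final class DecompressLimitSketch {
  static final String MAX_DECOMPRESS_LENGTH_PROPERTY = "org.apache.avro.limits.decompress.maxLength";

  // Assumption: "200MB" is interpreted as 200 MiB.
  static final long DEFAULT_MAX_DECOMPRESS_LENGTH = 200L * 1024 * 1024;

  private static final Logger LOG = Logger.getLogger(DecompressLimitSketch.class.getName());

  // Read once, when the class is initialized.
  static final long MAX_DECOMPRESS_LENGTH = parseLimit(System.getProperty(MAX_DECOMPRESS_LENGTH_PROPERTY));

  private DecompressLimitSketch() {
  }

  // Returns the parsed limit, or the default when the value is missing,
  // not a number, zero, or negative. Invalid values are logged at WARNING.
  static long parseLimit(String value) {
    if (value == null) {
      return DEFAULT_MAX_DECOMPRESS_LENGTH;
    }
    try {
      long parsed = Long.parseLong(value.trim());
      if (parsed <= 0) {
        LOG.warning("Ignoring non-positive value " + parsed + " for '" + MAX_DECOMPRESS_LENGTH_PROPERTY + "'");
        return DEFAULT_MAX_DECOMPRESS_LENGTH;
      }
      return parsed;
    } catch (NumberFormatException e) {
      LOG.warning("Ignoring invalid value '" + value + "' for '" + MAX_DECOMPRESS_LENGTH_PROPERTY + "'");
      return DEFAULT_MAX_DECOMPRESS_LENGTH;
    }
  }

  // Throws when the decompressed size exceeds the limit; the message mirrors
  // the wording quoted in the commit message.
  static void checkDecompressedSize(long size, long limit) {
    if (size > limit) {
      throw new IllegalStateException("Decompressed size " + size + " (bytes) exceeds maximum allowed size " + limit
          + ". This can be configured by setting the system property '" + MAX_DECOMPRESS_LENGTH_PROPERTY + "'");
    }
  }
}
```

Keeping the parsing in a static method that takes the raw string makes the validation rules testable without touching system properties.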
Change-Id: Ib24c52cdf3234a3805628041946b229b221383ad
* Automatically available to all codecs
* Does need an explicit constructor with no limit, used in DataFileWriter
* No tests, though that new constructor makes it trivial

Note: merged in main as DataFileWriter changes would otherwise stop merging

Change-Id: Ifc5b8921a00425df331a4889472b3e78c6677bde
steveloughran commented Apr 29, 2026
  private static final long MAX_DECOMPRESS_LENGTH;

  static {
    String prop = System.getProperty(MAX_DECOMPRESS_LENGTH_PROPERTY);
Author
could move to SystemLimitException, as that's where the int equivalent lives.
   * @throws IllegalArgumentException if size is negative
   */
  public NonCopyingByteArrayOutputStream(int size) {
    this(size, MAX_DECOMPRESS_LENGTH);
Author
This does change the default behaviour. Apart from DataFileWriter, the class is only ever used in decompressors.
Options:
- change the default (as done here)
- change the call sites to pass a limit
- make the two-arg constructor private and add a public static factory method
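To make the third option concrete, here is a rough sketch of a size-limited `ByteArrayOutputStream` with a private two-arg constructor and static factory methods. It is not the PR's `NonCopyingByteArrayOutputStream`: the class name `LimitedBufferSketch`, the factory names `withLimit` and `unlimited`, and the use of `IllegalStateException` are hypothetical, chosen only to illustrate the shape of the API.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of "private two-arg constructor plus static factories".
final class LimitedBufferSketch extends ByteArrayOutputStream {
  // Sentinel meaning "never reject a write because of size".
  private static final long NO_LIMIT = Long.MAX_VALUE;

  private final long maxLength;

  private LimitedBufferSketch(int size, long maxLength) {
    super(size); // throws IllegalArgumentException if size is negative
    this.maxLength = maxLength;
  }

  // For decompressors: writes past maxLength bytes are rejected.
  static LimitedBufferSketch withLimit(int size, long maxLength) {
    if (maxLength <= 0) {
      throw new IllegalArgumentException("maxLength must be positive: " + maxLength);
    }
    return new LimitedBufferSketch(size, maxLength);
  }

  // For writers, where the data is produced locally and needs no cap.
  static LimitedBufferSketch unlimited(int size) {
    return new LimitedBufferSketch(size, NO_LIMIT);
  }

  @Override
  public synchronized void write(int b) {
    ensureWithinLimit(1);
    super.write(b);
  }

  @Override
  public synchronized void write(byte[] b, int off, int len) {
    ensureWithinLimit(len);
    super.write(b, off, len);
  }

  // Checks the size the buffer would reach before any bytes are copied.
  private void ensureWithinLimit(int extra) {
    long newSize = (long) count + extra;
    if (newSize > maxLength) {
      throw new IllegalStateException(
          "Decompressed size " + newSize + " (bytes) exceeds maximum allowed size " + maxLength);
    }
  }
}
```

With this shape every caller has to state its intent, limited or unlimited, which avoids silently changing the behaviour of an existing public constructor.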
What is the purpose of the change
This is #3625 with the size limit checks moved into NonCopyingByteArrayOutputStream.
There is a new constructor for NonCopyingByteArrayOutputStream that sets a size limit, or no limit, and the default constructor now automatically picks up the size set by the system property, or the fallback default.
Those choices could be discussed, with options including moving the limit parsing to org.apache.avro.SystemLimitException, where the int parser lives.
AI: No AI was used for this PR.
Verifying this change
Needs tests. If people are happy with the design, I can add one in whichever module people prefer; it is pretty straightforward.
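As a rough idea of what such a test could exercise, the sketch below builds a small, highly compressible payload with the JDK's `java.util.zip` classes and inflates it under a cap. It does not use any Avro class: `DecompressionBombSketch`, `compressZeros`, and `inflateWithLimit` are hypothetical names, and the real test would presumably go through the codec and the system property instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical, JDK-only illustration of a decompression size check.
final class DecompressionBombSketch {
  private DecompressionBombSketch() {
  }

  // Deflates `length` zero bytes; zeros compress at a very high ratio.
  static byte[] compressZeros(int length) {
    ByteArrayOutputStream compressed = new ByteArrayOutputStream();
    try (DeflaterOutputStream deflater = new DeflaterOutputStream(compressed)) {
      deflater.write(new byte[length]);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return compressed.toByteArray();
  }

  // Inflates the input, failing as soon as the output would exceed maxLength.
  static byte[] inflateWithLimit(byte[] compressed, long maxLength) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[8192];
    try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
      int read;
      while ((read = in.read(buffer)) != -1) {
        long newSize = (long) out.size() + read;
        if (newSize > maxLength) {
          throw new IllegalStateException(
              "Decompressed size " + newSize + " (bytes) exceeds maximum allowed size " + maxLength);
        }
        out.write(buffer, 0, read);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }
}
```

A test along these lines would assert that the payload inflates normally under a generous limit and is rejected under a tight one, before the full output is ever buffered.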
Documentation
Does this pull request introduce a new feature? (yes / no)
yes
If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
javadocs