[COMPRESS-726] Add optional decompression bomb protection to CompressorStreamFactory#777
[COMPRESS-726] Add optional decompression bomb protection to CompressorStreamFactory#777tomasilluminati wants to merge 1 commit into
Conversation
…orStreamFactory Adds output-counting guard BombGuardCompressorInputStream that caps decompressed size and an optional mean compression-ratio check with a configurable grace. Off by default, opt-in via CompressorStreamFactory setters. CompressorInputStream now implements InputStreamStatistics. Tests and a changes.xml entry are included.
|
@ppkarwasz COMPRESS-726 PR Opened |
garydgregory
left a comment
There was a problem hiding this comment.
Hi All,
I have some initial thoughts on this line of work:
- Let's loose the whole "bomb" FUD terminology. I might just want to enforce limits. I should be able to do this in one place that also happens to catch possible attacks if an app decides to accept untrusted input. The goal is to enforce a limit, regardless of the input's intent. I have a door on my house to avoid people, animals, weather, and so on from coming in, I don't have a "thief-guard".
- Use a builder for construction with a private ctor(Builder), otherwise, this will be another class that will get one constructor after another over time.
- Why no pull up the feature in the super class or move it to IO's
BoundedInputStream? The code already uses aBoundedInputStreamafter all.
|
@garydgregory If all we want is a decompressed-size limit, IO's BoundedInputStream already does it (setMaxCount plus an onMaxCount that throws). So we don't really need a compress-specific class or a new exception. I'm happy to drop BombGuardCompressorInputStream and DecompressionBombException and just have CompressorStreamFactory wrap the stream in a BoundedInputStream when a max size is set. That's the single place you're describing, it reuses IO, and the bomb wording goes away. The builder point sorts itself out too, since there's no class left to build. The only bit that doesn't fit there is the compression-ratio check, since BoundedInputStream has no idea about ratios and it needs getCompressedCount() from the compressor side. So really the only open question is scope. Just the size limit, or the ratio guard too?
My hunch from your note is size-only, but just say the word and I'll redo it that way. Either way the InputStreamStatistics change stays, the compressed count is handy on its own. Thanks again for the steer. |
|
@tomasilluminati |
|
@garydgregory Nice pointer on setAfterRead, hadn't thought of it. That actually means neither piece needs a new class. The size cap is setMaxCount plus a throwing onMaxCount, and if the ratio ever comes back it can ride along as a setAfterRead consumer on the same BoundedInputStream. So my lean is to keep it minimal, wire an absolute decompressed-size limit into CompressorStreamFactory via BoundedInputStream, drop the bomb class and the exception, and leave the ratio out for now. We can always document it as a recipe later. Happy to let it simmer and hear what @ppkarwasz thinks before I change anything. |
|
@garydgregory , I've got the minimal version ready, absolute decompressed-size limit composed over IO's BoundedInputStream, dropping the extra class and the exception. One note, since the cap is on the real decompressed output, it already covers the faked-payload-size case Piotr raised (it ignores any declared size). @ppkarwasz, given that, do you still want the mean-ratio guard and an on-by-default limit? The ratio fires earlier and needs no tuned number, so I'm happy to fold it back as a second, opt-in check if you want it. Otherwise I'll push the size-only version. |
As with most features, we probably need to provide sensible defaults, so users don't need to modify them. AI gives the following usual compression rates:
Higher compression rates are impossible, unless you compress zeros, so 100× looks like a safe default. When I chose the default ratio for apache/mina#52, I tried to copy what HTTP Server does:
|
I guess, we can deliver a mean-ratio guard in a separate wrapper: only a limited number of users need it. Basically only if the compressed stream comes primarily from untrusted sources (like in a server), users might want to add a mean-ratio check as a fail-fast mechanism to prevent a DoS. However, I think it would be nice if every |
|
To make it clear, I'll move this PR draft since it is not ready as there are a few aspects still being discussed. |
|
@ppkarwasz thanks, that lines up well. A separate opt-in wrapper for the mean-ratio guard sounds right, it keeps the cost at zero for the common case and puts the fail-fast check where untrusted-source users actually want it. Happy to add it that way, following the mod_deflate shape you described (a burst tolerance in output buffers rather than a byte grace), with 100x as the default once it is opt-in. On making every CompressorInputStream inherit from BoundedInputStream to use the declared ZIP/7z size: appealing for catching corrupt archives, but it is a much bigger change than this PR, it touches the base type every format extends and the japicmp surface, and the declared size lives at the archive layer rather than the codec layer, so wiring it cleanly is its own design. I'd keep it out of 726 and open a separate ticket if you want to pursue it. So where I think this lands: absolute decompressed-size cap as the core here (composed over IO's BoundedInputStream), the mean-ratio guard as a separate opt-in wrapper with a 100x default, and the inherit-from-BoundedInputStream idea as its own follow-up. Does that match what you and @garydgregory have in mind before I rework the PR? |
|
Giving defaults like:
is why this should NOT be added as a feature to this class. It's based on guesses at best and will certainly shoot someone in the foot or a more useful body part. |
I think I like that idea. What we are saying today is: If you want to bound decompression, you must compose your Instead, what we would be saying is: Bounding decompression is so important to all possible call sites, that we are no longer offering composition as an option, we are now hard-coding the feature by inheriting it and making it available to all call sites, all the time. You still need to set a boundary in you call site. Doing so would require breaking binary compatibility because the deprecated method |
I like the intent, one honest catch beyond the getCount int/long break you flagged. BoundedInputStream bounds the bytes it reads from its wrapped stream, which for a decompressor is the compressed input side, whereas bomb protection needs to bound the decompressed output. So the inheritance only does the right thing if every format routes its output through the proxy, which is the big part of the change. Before anyone builds it, worth pinning down what the inherited bound counts, decompressed output or the compressed input it proxies. Happy to prototype whichever way you and @ppkarwasz settle on. |
Adds output-counting guard BombGuardCompressorInputStream that caps decompressed size and an optional mean compression-ratio check with a configurable grace. Off by default, opt-in via CompressorStreamFactory setters. CompressorInputStream now implements InputStreamStatistics. Tests and a changes.xml entry are included.