check output buffer bounds in SnappyBlockDecompressor#63910
check output buffer bounds in SnappyBlockDecompressor#63910sahvx655-wq wants to merge 3 commits into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
Updated the description with the problem, the exact overflow path, and the fix. Short version: the large-block length is bounds-checked but each small block's uncompressed length comes from its own snappy header and goes straight into RawUncompress (no capacity arg), so a crafted block can write past the output buffer. The patch checks the per-block length against the remaining output before decompressing. |
| // without a destination-capacity argument, so the header-declared length must be | ||
| // checked against the remaining output buffer to avoid an out-of-bounds write. | ||
| std::size_t available_output_len = output_max_len - (output_ptr - output); | ||
| if (decompressed_small_block_len > available_output_len) { |
There was a problem hiding this comment.
please add beut for this case
There was a problem hiding this comment.
Added a beut in be/test/util/snappy_block_decompressor_test.cpp. It crafts a single-small-block SNAPPYBLOCK stream whose snappy header declares 4096 bytes but hands the decompressor a 64-byte output buffer (with the large-block length set to 1 so the outer check passes and we reach the inner one). Before the fix that path overflowed; now it returns an error. There's also a valid round-trip case so the check doesn't regress normal streams.
Problem
SnappyBlockDecompressor::decompressinbe/src/util/decompressor.cppcan write past the line-reader output buffer when handling a crafted SNAPPYBLOCK stream (e.g. a CSV load).The decompress loop has two levels. The large-block length is read into
remaining_decompressed_large_block_lenand bounds-checked against the remaining output:But inside the inner loop each small block's uncompressed length comes from the per-block snappy header via
snappy::GetUncompressedLength, which is attacker-controlled and independent of the large-block length. That value is then passed straight tosnappy::RawUncompress, which has no destination-capacity argument:So a small block can declare a decompressed length larger than the space actually left in the output buffer, and
RawUncompresswrites out of bounds (heap OOB write past the output buffer).Fix
Before calling
RawUncompress, checkdecompressed_small_block_lenagainst the remaining output space (output_max_len - (output_ptr - output)) and return an error if it doesn't fit, instead of trusting the per-block header.Behavior change
Valid SNAPPYBLOCK streams are unaffected. A malformed/crafted stream that previously overflowed now returns an
InternalErrorreporting the declared length and the available output length.