[v23.2.x] compression: correctly compute the output buffer size for lz4 #15088

Conversation

vbotbuildovich
Collaborator

Backport of PR #15059
Fixes: #15087

Our lz4 frame compressor needs to take a lot of care to avoid monolithic
allocations, which aren't friendly to the seastar memory allocator. To
facilitate this, the lz4 library provides a helpful method that computes
the worst-case compressed output size for a given input size:
`LZ4F_compressBound`.

Before this change, in a very rare scenario, our code could throw the
following exception: `lz4f_compressupdate
error:ERROR_dstMaxSize_tooSmall`.

In hindsight, the bug is obvious if you look at the following code excerpt:

```
// Compress an input chunk and advance cursor.
...

// Compute the worst case compressed output size for the next chunk.
input_chunk_size = std::min(input_sz - input_cursor, max_input_chunk_size);
size_t next_output_size = LZ4F_compressBound(input_chunk_size, &prefs);

// Allocate a new output buffer and reset the cursor if the next output
// won't fit in the current buffer.
if (output_sz - output_cursor < next_output_size) {
  ...
}
```

Each time we compress the last chunk of a fragment, `input_chunk_size`
will equal 0. We then call `LZ4F_compressBound(0, &prefs)`, which for
our `prefs` always returns `65543`.

Side note/fun fact: the value is particularly large because lz4 buffers
data internally, up to 64K, so this value always covers that potentially
non-empty buffer. This also means that `LZ4F_compressUpdate` very often
returns just 0: the data is only buffered internally, not written to the
output buffer.

We then check that the hypothetical output of `65543` bytes fits into
our own buffer (in the buggy scenario it just barely does, so the buffer
is kept), go back to the beginning of the loop, move to the next
non-empty fragment, and call `LZ4F_compressUpdate` on its first chunk.

This is where the bug manifests itself. `LZ4F_compressUpdate` also calls
`LZ4F_compressBound` internally and throws an exception if the output
buffer is smaller than its return value. In our case, calling
`LZ4F_compressBound` with any input size 1 <= x <= 65536 always returns
`65544`: just 1 byte larger than the bound for an input size of 0. If
the output buffer has only 65543 bytes free, we get the
`lz4f_compressupdate error:ERROR_dstMaxSize_tooSmall` exception.
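
As a quick illustration, a tiny standalone program can print these two
bounds. This is a sketch, not Redpanda's code: the preferences below
assume a 64 KiB block size, which is what the quoted values correspond
to; the actual `prefs` may set more fields.

```
// Hedged sketch: print LZ4F_compressBound for a few input sizes under
// hypothetical preferences with 64K blocks. Per the discussion above,
// this is expected to print 65543 for size 0 and 65544 for 1..65536.
#include <lz4frame.h>
#include <cstddef>
#include <cstdio>

int main() {
    LZ4F_preferences_t prefs{};                 // assumption: zeroed defaults
    prefs.frameInfo.blockSizeID = LZ4F_max64KB; // assumption: 64K block size

    const size_t sizes[] = {0, 1, 65536};
    for (size_t sz : sizes) {
        std::printf("LZ4F_compressBound(%zu) = %zu\n",
                    sz, LZ4F_compressBound(sz, &prefs));
    }
    return 0;
}
```

(Build with something like `g++ demo.cc -llz4`.)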

Note: I'm curious to learn why the value is 1 byte larger, but haven't
gotten to it yet.

This, however, is very rare in practice and almost impossible to catch
in our tests, which use randomized inputs. In particular, these inputs
don't compress, so they lead to outputs of size `65544`, and our maximum
output buffer size is set to `128 * 1024`. After writing one compressed
output to a buffer this large, only 65528 bytes are free, which is lower
than the `65543` returned by `LZ4F_compressBound` for an input size of
0, so a fresh buffer is allocated and the bug never triggers.
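
Spelled out, the arithmetic in the test scenario looks like this:

```
output buffer capacity:           128 * 1024 = 131072 bytes
one incompressible chunk written:              65544 bytes
free space afterwards:        131072 - 65544 = 65528 bytes
bound computed for input size 0:               65543 bytes

65528 < 65543, so a new output buffer is allocated before the next
LZ4F_compressUpdate call and the undersized-buffer path is never hit.
```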

For the bug to manifest, the following conditions must be met:

* The input iobuf must consist of 2 or more fragments.
* The prefix (P) must be at least `input_chunk_size` (64K) bytes long,
  followed by at least one more fragment.
* The prefix (P), when passed to `LZ4F_compressUpdate`, must compress in
  such a way that the output buffer is left with exactly 65543 available
  bytes. This means finding a sequence of a specific length such that a
  flush is triggered at the right moment.

The fix simply moves the `LZ4F_compressBound` calculation right before
`LZ4F_compressUpdate`, guaranteeing that both methods are called with
the same value for `input_chunk_size`.
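
A minimal sketch of the corrected loop shape, under stated assumptions
(it is not the actual Redpanda code: `max_input_chunk_size` is a
hypothetical constant, and a `std::vector` resize stands in for
Redpanda's allocate-new-buffer-and-reset-cursor logic):

```
#include <lz4frame.h>
#include <algorithm>
#include <stdexcept>
#include <vector>

// Hedged sketch of the fixed ordering. The worst-case bound is computed
// from the same input_chunk_size that is passed to LZ4F_compressUpdate
// immediately afterwards, so the capacity check can never disagree with
// the update call.
size_t compress_chunks(LZ4F_cctx* ctx, const LZ4F_preferences_t& prefs,
                       const char* input, size_t input_sz,
                       std::vector<char>& output, size_t output_cursor) {
    constexpr size_t max_input_chunk_size = 64 * 1024; // hypothetical value

    size_t input_cursor = 0;
    while (input_cursor < input_sz) {
        size_t input_chunk_size
          = std::min(input_sz - input_cursor, max_input_chunk_size);

        // Fix: compute the bound right before the compressUpdate call
        // that consumes this exact chunk size.
        size_t bound = LZ4F_compressBound(input_chunk_size, &prefs);
        if (output.size() - output_cursor < bound) {
            // Stand-in for "allocate a new output buffer and reset the
            // cursor" in the original code.
            output.resize(output_cursor + bound);
        }

        size_t n = LZ4F_compressUpdate(
          ctx,
          output.data() + output_cursor, output.size() - output_cursor,
          input + input_cursor, input_chunk_size,
          nullptr);
        if (LZ4F_isError(n)) {
            throw std::runtime_error(LZ4F_getErrorName(n));
        }
        output_cursor += n;
        input_cursor += input_chunk_size;
    }
    return output_cursor;
}
```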

Fixes: redpanda-data#15024
(cherry picked from commit b1e1576)
@vbotbuildovich vbotbuildovich added this to the v23.2.x-next milestone Nov 22, 2023
@vbotbuildovich vbotbuildovich added the kind/backport PRs targeting a stable branch label Nov 22, 2023
@piyushredpanda piyushredpanda modified the milestones: v23.2.x-next, v23.2.17 Nov 22, 2023
@piyushredpanda piyushredpanda marked this pull request as ready for review November 22, 2023 04:02
@nvartolomei nvartolomei self-requested a review November 22, 2023 10:59
@piyushredpanda piyushredpanda merged commit ae9a71b into redpanda-data:v23.2.x Nov 22, 2023
22 of 25 checks passed
Labels

area/redpanda, kind/backport (PRs targeting a stable branch)