[v23.2.x] compaction: do not adjacent compact non self compacted segments #15372
Conversation
and bumps the log level to info.
Adjacent compaction works by concatenating the compaction indexes of adjacent segments. When a segment has transactional batches, we prematurely close the compaction index during segment append because we do not know upfront whether a batch will be committed or aborted. We recompute the index at the time of self compaction, since by then every batch is guaranteed to be committed or aborted. Adjacent compacting a non self compacted index can produce an incomplete concatenated index; this fix avoids that.
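To illustrate why concatenating a prematurely closed index is unsafe, here is a minimal toy model (hypothetical types, not the Redpanda ones): each segment carries a key-to-offset index, and merging is only sound when every input index is complete, i.e. the segment has finished self compaction.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Toy model, assumed for illustration only: key -> latest offset in segment.
struct toy_segment {
    std::map<std::string, long> index;
    bool self_compacted = false; // true once the index has been rebuilt
};

// Concatenate adjacent indexes by keeping the latest offset per key. If any
// input index was closed early (segment not yet self compacted), entries may
// be missing here and the merged index is silently incomplete.
std::map<std::string, long>
concat_indexes(const std::vector<toy_segment>& segs) {
    std::map<std::string, long> merged;
    for (const auto& s : segs) {
        for (const auto& [key, offset] : s.index) {
            merged[key] = std::max(merged[key], offset);
        }
    }
    return merged;
}
```

A segment with uncommitted transactional batches simply has fewer entries in `index` than it should, which is why the fix gates adjacent compaction on every segment having self compacted first.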
@@ -527,6 +533,22 @@ ss::future<compaction_result> disk_log_impl::compact_adjacent_segments(
  std::vector<ss::lw_shared_ptr<segment>> segments;
  std::copy(range.first, range.second, std::back_inserter(segments));

  bool all_segments_self_compacted = std::all_of(
Note for reviewers, we already check for this condition in do_compact()
// loop until we compact segment or reached end of segments set
while (!_segs.empty()) {
const auto end_it = std::prev(_segs.end());
auto seg_it = std::find_if(
_segs.begin(),
end_it,
[offsets_compactible](ss::lw_shared_ptr<segment>& s) {
return !s->has_appender() && s->is_compacted_segment()
&& !s->finished_self_compaction() // <===
&& offsets_compactible(*s);
});
// nothing to compact
if (seg_it == end_it) {
break;
}
but that is not tight enough: segments may be appended to _segs after the loop breaks, so a non self compacted segment can land in _segs and then be considered for adjacent segment compaction. This is why the bug was exposed when rolling segments in a tight loop alongside the compaction tests.
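The shape of the resulting fix can be sketched as a re-check of the candidate range immediately before the indexes are concatenated (names simplified from the PR; `seg` here is a stand-in for the real segment type):

```cpp
#include <algorithm>
#include <vector>

// Simplified stand-in for storage::segment; only the flag we care about.
struct seg {
    bool finished_self_compaction = false;
};

// Even though do_compact() filters for self-compacted segments, new segments
// can be appended to the segment set after its loop exits. So the adjacent
// compaction path must re-verify the whole range right before merging.
bool safe_to_adjacent_compact(const std::vector<seg>& range) {
    return std::all_of(range.begin(), range.end(), [](const seg& s) {
        return s.finished_self_compaction;
    });
}
```

This mirrors the `all_segments_self_compacted` check added in `compact_adjacent_segments()`: if any segment in the range has not self compacted yet, adjacent compaction is skipped for this round rather than producing an incomplete index.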
Force-pushed from c225812 to d07e9f9
Very nice work. @andrwng may also be interested, and I think we may have this issue in upstream.
  const auto unstable = std::any_of(
    range.first, range.second, [&cfg](ss::lw_shared_ptr<segment>& seg) {
-     return seg->has_appender() || !seg->has_compactible_offsets(cfg);
+     return !seg->finished_self_compaction() || seg->has_appender()
Great find!
config::shard_local_cfg().log_disable_housekeeping_for_tests.set_value(
  true);
config::shard_local_cfg().log_segment_ms_min.set_value(1ms);
You should either reset these manually at the end of the test or use a scoped_config to set these, otherwise they'll bleed into other tests
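The idea behind a scoped config override can be sketched as a generic RAII helper (this is an assumption-level sketch in the spirit of `scoped_config`; Redpanda's real helper differs): remember the old value and restore it on scope exit so test overrides never bleed into later tests.

```cpp
#include <utility>

// Hypothetical RAII override: saves the current value of a setting and
// restores it in the destructor, even if the test exits early or throws.
template <typename T>
class scoped_override {
    T& _slot;
    T _old;

public:
    scoped_override(T& slot, T new_value)
      : _slot(slot)
      , _old(std::exchange(slot, std::move(new_value))) {}
    ~scoped_override() { _slot = std::move(_old); }
    scoped_override(const scoped_override&) = delete;
    scoped_override& operator=(const scoped_override&) = delete;
};
```

Manual `set_value()` calls at the end of a test achieve the same thing, but are skipped whenever an assertion fails mid-test, which is exactly how settings leak across tests.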
oo, good point, fixed
Rolling and running housekeeping in a tight loop provides additional test coverage for more interesting compaction test cases.
Force-pushed from d07e9f9 to bfc9574
unrelated failure:
@bharathv did you discover if this exists in upstream or not?
This issue was exposed by running housekeeping in a tight loop with a low segment.ms, but it has a very low probability of occurrence in practice.
Backports Required
Release Notes
Bug Fixes