Backport #3372 (Delay prefix truncation of offset translator until segments are deleted) #3541

Merged
merged 5 commits into from
Jan 19, 2022

Conversation

ztlpn
Contributor

@ztlpn ztlpn commented Jan 19, 2022

Cover letter

Backport #3372 to v21.11.x.

See redpanda-data#3377. The critical
invariant for _start_offset is that it never decreases.

Previously this invariant could be violated in two ways:
* Because _start_offset was updated only at the end of the
_truncate_prefix function, if all segments were deleted, an external
observer could see _start_offset go back to the previous value after
the deletion but before the in-memory value was updated.
* Because kvstore updates were not serialized, we could never be sure
that we were not overwriting a bigger _start_offset value written
as part of a concurrent prefix_truncate.

To mitigate that, we serialize the kvstore updates with a mutex and
update the in-memory _start_offset at the beginning of prefix_truncate.
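A minimal sketch of the two mitigations, assuming plain std::mutex in place of the Seastar primitives the real code would use. The class log_sketch and its members are hypothetical names for illustration and are not taken from the Redpanda sources.

```cpp
#include <algorithm>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the log; not Redpanda's actual class.
class log_sketch {
public:
    void prefix_truncate(int64_t new_start) {
        {
            std::lock_guard<std::mutex> g(_mem_lock);
            if (new_start <= _start_offset) {
                return; // the start offset never moves backwards
            }
            // Update the in-memory value first, before any segment is deleted,
            // so observers never see the old value after deletion.
            _start_offset = new_start;
        }
        persist_start_offset();
        // ... segment deletion would follow here ...
    }

    int64_t start_offset() const {
        std::lock_guard<std::mutex> g(_mem_lock);
        return _start_offset;
    }

    int64_t persisted_start_offset() const {
        std::lock_guard<std::mutex> g(_kvstore_lock);
        return _persisted;
    }

private:
    void persist_start_offset() {
        // Serialize writers: only one persists at a time, and each one
        // re-reads the latest in-memory value under the lock, so a slow
        // writer cannot overwrite a bigger value with a smaller one.
        std::lock_guard<std::mutex> g(_kvstore_lock);
        _persisted = std::max(_persisted, start_offset());
    }

    mutable std::mutex _mem_lock;
    mutable std::mutex _kvstore_lock;
    int64_t _start_offset = 0;
    int64_t _persisted = 0; // models the value stored in the kvstore
};
```

In this sketch a stale call such as prefix_truncate(5) after prefix_truncate(10) is a no-op for both the in-memory and the persisted value.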

(cherry picked from commit 038d6c8)

It doesn't make much sense for max_collectible_offset to decrease,
and making it non-decreasing guards against reordered calls to
set_collectible_offset.
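The non-decreasing update can be sketched as a max against the current value. The struct collectible_tracker is a hypothetical name; only max_collectible_offset and set_collectible_offset come from the commit message, and their real signatures are not shown here.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical holder for the collectible offset; illustration only.
struct collectible_tracker {
    int64_t max_collectible_offset = std::numeric_limits<int64_t>::min();

    // A late (reordered) call carrying a smaller offset is ignored.
    void set_collectible_offset(int64_t o) {
        max_collectible_offset = std::max(max_collectible_offset, o);
    }
};
```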

(cherry picked from commit a78a6c6)

Previously we prefix-truncated the offset translator immediately after
writing the snapshot, but the actual deletion of log segments happened
later in the log gc loop. After this commit, segments are deleted
immediately and the offset translator is prefix-truncated afterwards.
This lets us wait for readers positioned before the prefix truncation
point to finish before prefix-truncating the offset translator.
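A rough sketch of the new ordering, assuming that segment deletion is what blocks on outstanding readers. All names here (reader_tracker, truncation_sketch, on_snapshot_written) are hypothetical, and std::condition_variable stands in for whatever waiting mechanism the real code uses.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical: counts readers positioned before the truncation point.
class reader_tracker {
public:
    void reader_started() {
        std::lock_guard<std::mutex> g(_m);
        ++_active;
    }
    void reader_finished() {
        {
            std::lock_guard<std::mutex> g(_m);
            --_active;
        }
        _cv.notify_all();
    }
    void wait_for_readers() {
        std::unique_lock<std::mutex> l(_m);
        _cv.wait(l, [this] { return _active == 0; });
    }

private:
    std::mutex _m;
    std::condition_variable _cv;
    int _active = 0;
};

struct truncation_sketch {
    reader_tracker readers;
    std::vector<std::string> events; // records the order of the steps
    int64_t log_start = 0;
    int64_t translator_start = 0;

    void on_snapshot_written(int64_t offset) {
        // Step 1: delete segments right away; this blocks until readers
        // that still hold the old segments have finished.
        readers.wait_for_readers();
        log_start = offset;
        events.push_back("segments_deleted");
        // Step 2: only now drop offset translation state below the offset.
        translator_start = offset;
        events.push_back("translator_truncated");
    }
};
```

The point of the ordering is that the translator state is discarded only after no reader can still need it.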

(cherry picked from commit 2c9e003)

Because we need offset translation information to get the offset ranges
of aborted transactions, we extend the reader lifetime to cover
the part.aborted_transactions call. This way the reader holds on to
the segment locks and delays any prefix truncation that could
discard the needed offset translation info.

(cherry picked from commit bf9af3f)
@jcsp jcsp merged commit d918ba3 into redpanda-data:v21.11.x Jan 19, 2022
@ztlpn ztlpn deleted the v21.11.x-bp-3372 branch November 27, 2023 13:16