Improved handling of truncation with ACKS=1 #16475

Merged

mmaslankaprv
Member

@mmaslankaprv mmaslankaprv commented Feb 5, 2024

If an offset was already visible, a follower must not be allowed to
truncate it, as doing so may lead to a situation in which an offset is
visible but not readable.

Visible batches have the same replication guarantees as committed batches,
as the leader still waits for the majority to acknowledge the message at a
given offset before making it visible to readers. This makes it possible
to avoid truncating offsets which were previously visible.
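
A minimal sketch of the guard described above, not the actual `consensus::do_append_entries` code. The names `classify_append`, `append_action`, and `protected_offset` are hypothetical and exist only for this illustration; the real follower logic handles many more cases. The point is only that moving the protected boundary from the commit index to the last visible index changes which requests are allowed to truncate.

```cpp
#include <cstdint>

// Hypothetical simplified offset type; not Redpanda's model::offset.
using offset_t = std::int64_t;

// Hypothetical outcomes of the follower-side check.
enum class append_action {
    append,               // prev_log_index is at the end of the log, nothing to truncate
    skip_truncation,      // request reaches below the protected offset, keep the log
    truncate_then_append, // suffix is not protected, may be truncated
};

// protected_offset is the boundary below which the follower refuses to
// truncate: the commit index before this change, the last visible index after.
inline append_action classify_append(
  offset_t last_log_offset, offset_t protected_offset, offset_t prev_log_index) {
    if (prev_log_index >= last_log_offset) {
        return append_action::append;
    }
    if (prev_log_index < protected_offset) {
        return append_action::skip_truncation;
    }
    return append_action::truncate_then_append;
}
```

For example, with a log ending at offset 100, a commit index of 40, and a last visible index of 80, a request with `prev_log_index` 60 would truncate under the commit-index boundary but not under the visible-index boundary.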

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x
  • v23.1.x

Release Notes

Improvements

  • Improved handling of follower fetching offset validation when used with relaxed consistency

@mmaslankaprv
Member Author

/dt

@mmaslankaprv mmaslankaprv marked this pull request as ready for review February 5, 2024 17:15
src/v/raft/types.h (outdated, resolved)
@@ -2013,7 +2013,7 @@ consensus::do_append_entries(append_entries_request&& r) {

 // section 3
 if (request_metadata.prev_log_index < last_log_offset) {
-    if (unlikely(request_metadata.prev_log_index < _commit_index)) {
+    if (unlikely(request_metadata.prev_log_index < last_visible_index())) {
Contributor

So IIUC the problem is with follower fetching + acks=1: on a follower, the log can be truncated below the high watermark (because the check uses the committed offset), say during recovery, and a subsequent read from the follower may then result in an out-of-range read. Is that correct?

Member Author

exactly
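
The failure mode confirmed in this thread can be illustrated with a toy model. This is a sketch under stated assumptions: `toy_follower` and all of its members are hypothetical, record `i` is stored at offset `i`, and the high watermark is treated as an exclusive upper bound on fetchable offsets. None of this is Redpanda's storage or Kafka-layer code.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using offset_t = std::int64_t;

// Toy follower log (hypothetical): the vector index is the offset.
struct toy_follower {
    std::vector<int> records;    // payloads, records[i] lives at offset i
    offset_t high_watermark = 0; // offsets below this are advertised as readable

    // Drop every record at or after `from`.
    void truncate(offset_t from) {
        if (from >= 0 && from < static_cast<offset_t>(records.size())) {
            records.resize(static_cast<std::size_t>(from));
        }
    }

    // A consumer is allowed to fetch any offset below the high watermark.
    bool fetch_allowed(offset_t o) const {
        return o >= 0 && o < high_watermark;
    }

    // Returns the record, or nullopt when the offset is not in the log.
    std::optional<int> read(offset_t o) const {
        if (o < 0 || o >= static_cast<offset_t>(records.size())) {
            return std::nullopt;
        }
        return records[static_cast<std::size_t>(o)];
    }
};
```

If such a follower holds offsets 0 through 4 with a high watermark of 5 and is then truncated from offset 3, a fetch at offset 4 is still allowed by the watermark check but finds no record, which is the visible-but-not-readable state the change is meant to rule out.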

@@ -2013,7 +2013,7 @@ consensus::do_append_entries(append_entries_request&& r) {

 // section 3
 if (request_metadata.prev_log_index < last_log_offset) {
-    if (unlikely(request_metadata.prev_log_index < _commit_index)) {
+    if (unlikely(request_metadata.prev_log_index < last_visible_index())) {
Contributor

Should we use last_visible_index() in line 2020 as well? (not entirely understanding its significance, but seems logical...)

Member Author

makes sense, i will change this

src/v/raft/tests/basic_raft_fixture_test.cc (outdated, resolved)
src/v/raft/consensus.cc (outdated, resolved)
@piyushredpanda piyushredpanda added this to the v23.3.5 milestone Feb 7, 2024
Signed-off-by: Michal Maslanka <michal@redpanda.com>
If an offset was already visible, a follower must not be allowed to
truncate it, as doing so may lead to a situation in which an offset is
visible but not readable.

Visible batches have the same replication guarantees as committed batches,
as the leader still waits for the majority to acknowledge the message at a
given offset before making it visible to readers. This makes it possible
to avoid truncating offsets which were previously visible.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
Added a test checking that an offset that is visible on the leader is not
truncated on the followers.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
@mmaslankaprv mmaslankaprv merged commit 441fe68 into redpanda-data:dev Feb 7, 2024
15 checks passed
@mmaslankaprv mmaslankaprv deleted the raft-acks-1-improvements branch February 7, 2024 17:34
@vbotbuildovich
Collaborator

/backport v23.3.x

@vbotbuildovich
Collaborator

/backport v23.2.x

@vbotbuildovich
Collaborator

Failed to create a backport PR to v23.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-16475-v23.2.x-872 remotes/upstream/v23.2.x
git cherry-pick -x ec7b6edd06cce93c046b6b6d8079c33b5aeec8e9 3d9a794c2fa339f540d71f5b212e2a8c500a4dd9 8975291ecca7ed5282260e0394b97d9ac6faf6e8

Workflow run logs.

5 participants