[DocDB] Retryable requests file can be ahead of committed_op_id during tablet bootstrap #18412

Huqicheng · 2023-07-25T18:34:33Z

Jira Link: DB-7402

Description

On a leader, there's a scenario where the retryable requests file has some op ids that are greater than the committed_op_id stored in the last write batch in WAL:

1. Majority replicated N write ops (1.1 ~ 1.N+1). Those writes are not yet applied and committed on the leader.
2. Kill both followers.
3. Before the leader lease expires, the leader receives a new write N+1 (1.N+2).
4. The leader starts to apply pending writes; it can apply up to the Nth write and leaves N+1 as pending. The retryable requests file has also marked the first N writes as replicated and is flushed to disk.
5. Kill the leader and restart it.
6. The last committed op id read from the WALs is still 1.1, so all write ops are treated as orphaned pending replicates. Since the retryable requests file has stored the first N writes, registering the orphaned pending replicates hits a duplicate write error.
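The failure in the last step can be modelled with a small toy program. This is only a sketch under stated assumptions: `ToyRetryableRequests`, `ReplicateMsg`, `RequestKey` and `CountDuplicateErrors` are hypothetical names invented for illustration and are not the actual YugabyteDB classes; a request is assumed to be identified by a (client id, request id) pair.

```cpp
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical model: a request is identified by (client_id, request_id).
using RequestKey = std::pair<uint64_t, uint64_t>;

// Hypothetical stand-in for a replicate message replayed from the WAL.
struct ReplicateMsg {
  int64_t term;
  int64_t index;
  RequestKey request;
};

// Toy stand-in for the persisted retryable requests structure.
class ToyRetryableRequests {
 public:
  // Record a request as replicated, as if loaded from the flushed file.
  void MarkReplicated(const RequestKey& key) { replicated_.insert(key); }

  // Returns false when the request is already recorded as replicated,
  // mimicking the duplicate write error described in the issue.
  bool Register(const RequestKey& key) {
    if (replicated_.count(key) != 0) return false;
    running_.insert(key);
    return true;
  }

 private:
  std::set<RequestKey> replicated_;
  std::set<RequestKey> running_;
};

// Counts how many orphaned pending replicates (entries above the last
// committed index) fail to register because the file already covers them.
int CountDuplicateErrors(ToyRetryableRequests& requests,
                         const std::vector<ReplicateMsg>& wal,
                         int64_t last_committed_index) {
  int duplicates = 0;
  for (const ReplicateMsg& msg : wal) {
    if (msg.index <= last_committed_index) continue;  // treated as committed
    if (!requests.Register(msg.request)) ++duplicates;
  }
  return duplicates;
}
```

With writes at indexes 2 to 4 recorded in the toy file and a committed index of 1, every one of those writes is counted as a duplicate; raising the committed index to 4 leaves only the genuinely pending write to register.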

If an op id is covered by the retryable requests file, it must be committed. So we can do the following fix:

last_committed_op_id = std::max(
          std::max(last_committed_op_id, read_result.committed_op_id),
          last_op_id_in_retryable_requests);
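
To make the comparison concrete, here is a minimal self-contained sketch of the same idea. The `OpId` struct, its ordering (term first, then index) and the function name `EffectiveLastCommittedOpId` are assumptions made for illustration; the real types in the codebase may differ.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>

// Hypothetical minimal op id, ordered by term and then by index.
struct OpId {
  int64_t term;
  int64_t index;
};

inline bool operator<(const OpId& a, const OpId& b) {
  return std::tie(a.term, a.index) < std::tie(b.term, b.index);
}

// Picks the highest of the three candidates, so that an op id covered by the
// retryable requests file is never treated as uncommitted during bootstrap.
OpId EffectiveLastCommittedOpId(const OpId& last_committed_op_id,
                                const OpId& wal_committed_op_id,
                                const OpId& last_op_id_in_retryable_requests) {
  return std::max({last_committed_op_id, wal_committed_op_id,
                   last_op_id_in_retryable_requests});
}
```

For example, with both committed op ids at 1.1 and the retryable requests file ending at 1.4, the function returns 1.4.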

cc @bogdan

Warning: Please confirm that this issue does not contain any sensitive information

I confirm this issue does not contain any sensitive information.

rthallamko3 · 2023-07-26T17:58:32Z

This can result in bootstrap failure on the follower. Mitigation is to delete the retryable requests file.

…s file is ahead of committed ops stored in the last write batch

Summary:
On a leader, there's a scenario where the retryable requests file has some op ids that are greater than the committed_op_id stored in the last write batch in WAL:

1. Majority replicated N write ops (1.1 ~ 1.N). The leader knows they have been majority replicated but has not yet applied them.
2. Kill both followers.
3. Before the leader lease expires, the leader receives a new write N+1 (1.N+1).
4. The leader starts to apply pending writes; it can apply up to the Nth write and leaves N+1 as pending. The retryable requests file has also marked the first N writes as replicated.
5. Flush the retryable requests to disk with last op id 1.N.
6. Kill the leader and restart it.
7. The last committed op index read from the WALs is still 0, so all write ops are treated as orphaned pending replicates. Since the retryable requests file has stored the first N writes, registering the orphaned pending replicates hits a duplicate write error.

Since any op id in the retryable requests file has to be applied and committed, we should apply all ops with op id less than or equal to the last op id in the retryable requests file during bootstrap.

Test Plan: RetryableRequestTest.PersistedRetryableRequestsWithUncommittedOpId

Reviewers: sergei
Reviewed By: sergei
Subscribers: ybase, qhu, bogdan
Differential Revision: https://phorge.dev.yugabyte.com/D27367
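
The rule stated in the commit summary (apply every op whose id is at or below the last op id in the retryable requests file) can be sketched as a simple partition of the replayed entries. This is an illustrative sketch only: it assumes a single term so that plain indexes can be compared, and `BootstrapSplit` and `SplitByCommittedIndex` are hypothetical names rather than functions from the actual change.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical result of replaying the WAL during bootstrap.
struct BootstrapSplit {
  std::vector<int64_t> to_apply;  // indexes treated as committed and applied
  std::vector<int64_t> pending;   // indexes kept as orphaned pending replicates
};

// Assumes all entries share one term, so only indexes are compared.
BootstrapSplit SplitByCommittedIndex(const std::vector<int64_t>& wal_indexes,
                                     int64_t wal_committed_index,
                                     int64_t retryable_last_index) {
  // Anything covered by the retryable requests file counts as committed.
  const int64_t committed = std::max(wal_committed_index, retryable_last_index);
  BootstrapSplit split;
  for (int64_t index : wal_indexes) {
    if (index <= committed) {
      split.to_apply.push_back(index);
    } else {
      split.pending.push_back(index);
    }
  }
  return split;
}
```

Using the numbering from the summary with N = 3, a WAL holding indexes 1 to 4, a WAL committed index of 0 and a retryable requests file ending at index 3 yields three ops to apply and one pending op.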

…retryable requests file is ahead of committed ops stored in the last write batch

Summary:
Original commit: f45ec20 / D27367

On a leader, there's a scenario where the retryable requests file has some op ids that are greater than the committed_op_id stored in the last write batch in WAL:

1. Majority replicated N write ops (1.1 ~ 1.N). The leader knows they have been majority replicated but has not yet applied them.
2. Kill both followers.
3. Before the leader lease expires, the leader receives a new write N+1 (1.N+1).
4. The leader starts to apply pending writes; it can apply up to the Nth write and leaves N+1 as pending. The retryable requests file has also marked the first N writes as replicated.
5. Flush the retryable requests to disk with last op id 1.N.
6. Kill the leader and restart it.
7. The last committed op index read from the WALs is still 0, so all write ops are treated as orphaned pending replicates. Since the retryable requests file has stored the first N writes, registering the orphaned pending replicates hits a duplicate write error.

Since any op id in the retryable requests file has to be applied and committed, we should apply all ops with op id less than or equal to the last op id in the retryable requests file during bootstrap.

Test Plan: RetryableRequestTest.PersistedRetryableRequestsWithUncommittedOpId

Reviewers: sergei
Reviewed By: sergei
Subscribers: bogdan, qhu, ybase
Differential Revision: https://phorge.dev.yugabyte.com/D27571

Huqicheng added the area/docdb (YugabyteDB core features) and status/awaiting-triage (Issue awaiting triage) labels Jul 25, 2023

Huqicheng self-assigned this Jul 25, 2023

yugabyte-ci added the kind/enhancement (This is an enhancement of an existing feature) and priority/medium (Medium priority issue) labels Jul 25, 2023

Huqicheng removed the status/awaiting-triage (Issue awaiting triage) label Jul 25, 2023

yugabyte-ci added the kind/bug (This issue is a bug) label and removed the kind/enhancement (This is an enhancement of an existing feature) label Jul 26, 2023

Huqicheng changed the title ~~[DocDB] Retryable requests file can be ahead of committed_op_id at follower~~ [DocDB] Retryable requests file can be ahead of committed_op_id during tablet bootstrap Jul 29, 2023

yugabyte-ci added the priority/highest (Highest priority issue) label and removed the priority/medium (Medium priority issue) label Jul 30, 2023

Huqicheng closed this as completed Aug 9, 2023
