Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CDCSDK] Catalog manager can delete entries from cdc_state on new tablets #19746

Closed
yugabyte-ci opened this issue Oct 31, 2023 · 0 comments
Closed
Assignees
Labels
area/cdcsdk CDC SDK jira-originated kind/new-feature This is a request for a completely new feature priority/high High Priority

Comments

@yugabyte-ci
Copy link
Contributor

yugabyte-ci commented Oct 31, 2023

Jira Link: DB-8590

We have a method in xrepl_catalog_manager named CleanUpCDCStreamsMetadata - this method gets the tablets to keep per stream and then drops the rest of the tablets from cdc_state table considering the table for them has been dropped.

The flow in which the cleanup works is:

  1. GetTablets for a table using table->GetTablets(IncludeInactive::kTrue)
  2. Get all the tablets from cdc_state
  3. If there are any extra tablet in the result of step-2 then delete it from the table

Now the bug here is a race condition where we got some tablets in step-1 and then a tablet got split:

  1. The split will result in the entry for the child tablet to be added in the cdc_state so step-2 will return child entries as well.
  2. We still have the stale list from the result in step-1 i.e. the list where we do not have any child entry
  3. In step-3 above, when deletion will happen, we will consider the child entries eligible for a delete since they are not present in the result of step-1 and thus we will end up deleting rows from the cdc_state table.
@yugabyte-ci yugabyte-ci added area/cdcsdk CDC SDK jira-originated kind/bug This issue is a bug priority/low Low priority status/awaiting-triage Issue awaiting triage labels Oct 31, 2023
@yugabyte-ci yugabyte-ci added kind/new-feature This is a request for a completely new feature priority/high High Priority and removed kind/bug This issue is a bug priority/low Low priority status/awaiting-triage Issue awaiting triage labels Nov 3, 2023
siddharth2411 added a commit that referenced this issue Nov 10, 2023
… new tablets

Summary:
CleanUpCDCStreamsMetadata() method is responsible for cleaning up stream metadata from cdc_state table, system catalog for dropped tables.
Method flow:
Step-1: For each received stream, call GetTablets() on all its associated tables and include all hidden tablets as well.
For dropped tables, GetTablets will be empty. (Store table_id in the set drop_stream_table_list)
For all other tables, GetTablets will be non-empty. (Store tablet_id in the set tablets_to_keep_per_stream)
Step-2: Read all entries of the cdc_state table.
For each state table's entry, take the stream-id and get the set of tablets from  tablets_to_keep_per_stream. If the tablet-id of the current state table's entry is not present in this set, delete it from cdc_state table.

Race condition:
Step1 -> tablet split -> step 2
Reason: On tablet split, we add entries for children tablets in the cdc_state table. So, Step 1 will not have children entries since split hasn't happened whereas step 2 will have children entries. Since we didn't receive children tablets entry in GetTablets call, we will infer that those tablet_ids belong to a dropped table and hence we will delete those entries from the cdc_state table.

Fix: Reverse the call order of step 1 & 2 in CleanUpCDCStreamsMetadata flow.
Step 1: Read cdc_state table
Step 2: GetTablets() to compute tablets_to_keep_per_stream and drop_stream_table_list

In the fix, in step-1 (reading cdc_state), since the split hasn't happened yet, we won't have entries for children tables. Step-2 i.e. calling GetTablets() will have entries for children tablets but the logic is to only delete extra entries that are present in cdc_state but not in GetTablets(). So, having extra entries in GetTablets() call is perfectly fine.

Also verified the following:
In the tablet split code path, GetTablets picks up the children tablets before entries for the children tablets are inserted into the cdc_state. So, if we are getting children tablets entries in the cdc_state table, we are bound to get them in the GetTablets call. Hence, writing to cdc_state table at the end (in tablet split thread) and reading it at the start (in cleanup thread) ensures that we read stale entries from the cdc_state table in race conditions described above.
Jira: DB-8590

Test Plan: ./ybd --cxx-test cdcsdk_tablet_split-test --gtest_filter CDCSDKTabletSplitTest.TestCleanUpCDCStreamsMetadataDuringTabletSplit

Reviewers: skumar, vkushwaha, stiwary, asrinivasan, hsunder, jhe

Reviewed By: asrinivasan, hsunder

Subscribers: ycdcxcluster

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D29992
siddharth2411 added a commit that referenced this issue Nov 10, 2023
…om cdc_state on new tablets

Summary:
Original commit: 3063164 / D29992
CleanUpCDCStreamsMetadata() method is responsible for cleaning up stream metadata from cdc_state table, system catalog for dropped tables.
Method flow:
Step-1: For each received stream, call GetTablets() on all its associated tables and include all hidden tablets as well.
For dropped tables, GetTablets will be empty. (Store table_id in the set drop_stream_table_list)
For all other tables, GetTablets will be non-empty. (Store tablet_id in the set tablets_to_keep_per_stream)
Step-2: Read all entries of the cdc_state table.
For each state table's entry, take the stream-id and get the set of tablets from  tablets_to_keep_per_stream. If the tablet-id of the current state table's entry is not present in this set, delete it from cdc_state table.

Race condition:
Step1 -> tablet split -> step 2
Reason: On tablet split, we add entries for children tablets in the cdc_state table. So, Step 1 will not have children entries since split hasn't happened whereas step 2 will have children entries. Since we didn't receive children tablets entry in GetTablets call, we will infer that those tablet_ids belong to a dropped table and hence we will delete those entries from the cdc_state table.

Fix: Reverse the call order of step 1 & 2 in CleanUpCDCStreamsMetadata flow.
Step 1: Read cdc_state table
Step 2: GetTablets() to compute tablets_to_keep_per_stream and drop_stream_table_list

In the fix, in step-1 (reading cdc_state), since the split hasn't happened yet, we won't have entries for children tables. Step-2 i.e. calling GetTablets() will have entries for children tablets but the logic is to only delete extra entries that are present in cdc_state but not in GetTablets(). So, having extra entries in GetTablets() call is perfectly fine.

Also verified the following:
In the tablet split code path, GetTablets picks up the children tablets before entries for the children tablets are inserted into the cdc_state. So, if we are getting children tablets entries in the cdc_state table, we are bound to get them in the GetTablets call. Hence, writing to cdc_state table at the end (in tablet split thread) and reading it at the start (in cleanup thread) ensures that we read stale entries from the cdc_state table in race conditions described above.
Jira: DB-8590

Test Plan: ./ybd --cxx-test cdcsdk_tablet_split-test --gtest_filter CDCSDKTabletSplitTest.TestCleanUpCDCStreamsMetadataDuringTabletSplit

Reviewers: skumar, vkushwaha, stiwary, asrinivasan, hsunder, jhe, xCluster

Reviewed By: asrinivasan

Subscribers: ycdcxcluster, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D30117
@yugabyte-ci yugabyte-ci reopened this Nov 16, 2023
Spikatrix pushed a commit to Spikatrix/yugabyte-db that referenced this issue Nov 29, 2023
…tries from cdc_state on new tablets

Summary:
Original commit: 3063164 / D29992
NOTE: Had to do some manual changes because cherry-pick wasnt succesful due to code refactoring in
latest master.

CleanUpCDCStreamsMetadata() method is responsible for cleaning up stream metadata from cdc_state table, system catalog for dropped tables.
Method flow:
Step-1: For each received stream, call GetTablets() on all its associated tables and include all hidden tablets as well.
For dropped tables, GetTablets will be empty. (Store table_id in the set drop_stream_table_list)
For all other tables, GetTablets will be non-empty. (Store tablet_id in the set tablets_to_keep_per_stream)
Step-2: Read all entries of the cdc_state table.
For each state table's entry, take the stream-id and get the set of tablets from  tablets_to_keep_per_stream. If the tablet-id of the current state table's entry is not present in this set, delete it from cdc_state table.

Race condition:
Step1 -> tablet split -> step 2
Reason: On tablet split, we add entries for children tablets in the cdc_state table. So, Step 1 will not have children entries since split hasn't happened whereas step 2 will have children entries. Since we didn't receive children tablets entry in GetTablets call, we will infer that those tablet_ids belong to a dropped table and hence we will delete those entries from the cdc_state table.

Fix: Reverse the call order of step 1 & 2 in CleanUpCDCStreamsMetadata flow.
Step 1: Read cdc_state table
Step 2: GetTablets() to compute tablets_to_keep_per_stream and drop_stream_table_list

In the fix, in step-1 (reading cdc_state), since the split hasn't happened yet, we won't have entries for children tables. Step-2 i.e. calling GetTablets() will have entries for children tablets but the logic is to only delete extra entries that are present in cdc_state but not in GetTablets(). So, having extra entries in GetTablets() call is perfectly fine.

Also verified the following:
In the tablet split code path, GetTablets picks up the children tablets before entries for the children tablets are inserted into the cdc_state. So, if we are getting children tablets entries in the cdc_state table, we are bound to get them in the GetTablets call. Hence, writing to cdc_state table at the end (in tablet split thread) and reading it at the start (in cleanup thread) ensures that we read stale entries from the cdc_state table in race conditions described above.
Jira: DB-8590

In the unit test, have added include guards since syncpoint in not supported for non-debug builds

Test Plan: ./ybd --cxx-test cdcsdk_tablet_split-test --gtest_filter CDCSDKTabletSplitTest.TestCleanUpCDCStreamsMetadataDuringTabletSplit

Reviewers: skumar, vkushwaha, stiwary, asrinivasan, hsunder, jhe, xCluster

Reviewed By: hsunder

Subscribers: ycdcxcluster, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D30351
jharveysmith pushed a commit that referenced this issue May 24, 2024
…om cdc_state on new tablets

Summary:
Original commit: bb2c7eb9260951721eb3bbed0f42315cbd4ab8b0 / D29992
CleanUpCDCStreamsMetadata() method is responsible for cleaning up stream metadata from cdc_state table, system catalog for dropped tables.
Method flow:
Step-1: For each received stream, call GetTablets() on all its associated tables and include all hidden tablets as well.
For dropped tables, GetTablets will be empty. (Store table_id in the set drop_stream_table_list)
For all other tables, GetTablets will be non-empty. (Store tablet_id in the set tablets_to_keep_per_stream)
Step-2: Read all entries of the cdc_state table.
For each state table's entry, take the stream-id and get the set of tablets from  tablets_to_keep_per_stream. If the tablet-id of the current state table's entry is not present in this set, delete it from cdc_state table.

Race condition:
Step1 -> tablet split -> step 2
Reason: On tablet split, we add entries for children tablets in the cdc_state table. So, Step 1 will not have children entries since split hasn't happened whereas step 2 will have children entries. Since we didn't receive children tablets entry in GetTablets call, we will infer that those tablet_ids belong to a dropped table and hence we will delete those entries from the cdc_state table.

Fix: Reverse the call order of step 1 & 2 in CleanUpCDCStreamsMetadata flow.
Step 1: Read cdc_state table
Step 2: GetTablets() to compute tablets_to_keep_per_stream and drop_stream_table_list

In the fix, in step-1 (reading cdc_state), since the split hasn't happened yet, we won't have entries for children tables. Step-2 i.e. calling GetTablets() will have entries for children tablets but the logic is to only delete extra entries that are present in cdc_state but not in GetTablets(). So, having extra entries in GetTablets() call is perfectly fine.

Also verified the following:
In the tablet split code path, GetTablets picks up the children tablets before entries for the children tablets are inserted into the cdc_state. So, if we are getting children tablets entries in the cdc_state table, we are bound to get them in the GetTablets call. Hence, writing to cdc_state table at the end (in tablet split thread) and reading it at the start (in cleanup thread) ensures that we read stale entries from the cdc_state table in race conditions described above.
Jira: DB-8590

Test Plan: ./ybd --cxx-test cdcsdk_tablet_split-test --gtest_filter CDCSDKTabletSplitTest.TestCleanUpCDCStreamsMetadataDuringTabletSplit

Reviewers: skumar, vkushwaha, stiwary, asrinivasan, hsunder, jhe, xCluster

Reviewed By: asrinivasan

Subscribers: ycdcxcluster, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D30117
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cdcsdk CDC SDK jira-originated kind/new-feature This is a request for a completely new feature priority/high High Priority
Projects
None yet
Development

No branches or pull requests

3 participants