
Incomplete cleanup in case of a create stream failure due to a late alter table #20725

Closed
yugabyte-ci opened this issue Jan 22, 2024 · 0 comments
Labels: area/cdcsdk, jira-originated, kind/bug, priority/low

yugabyte-ci commented Jan 22, 2024

Jira Link: DB-9727

@yugabyte-ci yugabyte-ci added area/cdcsdk CDC SDK jira-originated kind/bug This issue is a bug priority/low Low priority status/awaiting-triage Issue awaiting triage labels Jan 22, 2024
asrinivasanyb added a commit that referenced this issue Jan 24, 2024
…e to a late alter table

Summary:
When a create stream fails, a cleanup is performed.

The UpdatePeersAndMetrics background thread is involved in this cleanup.
As part of the rollback, the cdc_state table entries associated with
the stream have their checkpoint set to OpId::Max. This is the indication
to UpdatePeersAndMetrics to perform cleanup for the stream in question.
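The max-checkpoint sentinel described above can be sketched as follows. This is a minimal illustration, not the actual YugabyteDB types or API; `NeedsCleanup` is a hypothetical name for the check UpdatePeersAndMetrics performs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical sketch: in cdc_state, setting a tablet's checkpoint to
// OpId::Max is the sentinel telling UpdatePeersAndMetrics that the
// stream's resources should be released.
struct OpId {
  int64_t term;
  int64_t index;
  static OpId Max() {
    return {std::numeric_limits<int64_t>::max(),
            std::numeric_limits<int64_t>::max()};
  }
  bool operator==(const OpId& o) const {
    return term == o.term && index == o.index;
  }
};

// UpdatePeersAndMetrics treats a max checkpoint as "clean this entry up".
bool NeedsCleanup(const OpId& checkpoint) { return checkpoint == OpId::Max(); }
```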

One of the ways in which a create stream could fail is if the ALTER TABLEs
that were issued as part of the create stream time out. However, an ALTER
TABLE response that arrives late, after the rollback and cleanup have been
performed, could put an entry back in the cdc_state table for the same stream
with a valid checkpoint. This in turn could once again cause resources to be
used. This issue arises because the stream metadata is not refreshed from
master by UpdatePeersAndMetrics.

The solution is for UpdatePeersAndMetrics to refresh the stream metadata from
master for a stream that is in the INITIATED state. This results in
UpdatePeersAndMetrics realizing that the stream has been deleted and performing
the necessary cleanup again. Care is taken to limit the number of metadata
refresh calls to master per stream per round of UpdatePeersAndMetrics.
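The rate-limited refresh described above can be sketched roughly as below. All names (`MetadataRefresher`, `MaybeRefresh`, `NewRound`) are illustrative assumptions, not the real YugabyteDB implementation; the point is the shape of the logic: only INITIATED streams are re-checked, and at most once per stream per round.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical sketch: UpdatePeersAndMetrics refreshes stream metadata from
// master only for streams still cached as INITIATED, at most once per stream
// per round, so late ALTER TABLE responses cannot silently resurrect a
// deleted stream's cdc_state entries.
enum class StreamState { INITIATED, ACTIVE, DELETED };

struct MetadataRefresher {
  std::map<std::string, StreamState> master_view;  // authoritative state
  std::set<std::string> refreshed_this_round;      // rate limit: 1 call/round
  int master_calls = 0;

  // Returns true if master says the stream was deleted, meaning cleanup
  // should run again for its cdc_state entries.
  bool MaybeRefresh(const std::string& stream_id, StreamState cached) {
    if (cached != StreamState::INITIATED) return false;  // only stale streams
    if (!refreshed_this_round.insert(stream_id).second) return false;  // limit
    ++master_calls;  // stands in for the RPC to master
    return master_view[stream_id] == StreamState::DELETED;
  }

  void NewRound() { refreshed_this_round.clear(); }
};
```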
Jira: DB-9727

Test Plan: All tests

Reviewers: stiwary, skumar

Reviewed By: stiwary

Subscribers: ycdcxcluster

Differential Revision: https://phorge.dev.yugabyte.com/D31882
@yugabyte-ci yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label Jan 24, 2024
asrinivasanyb added a commit that referenced this issue Jan 25, 2024
…ream failure due to a late alter table

Summary:
**Backport Description**

There was one merge conflict in cdc_service.cc

**Original Description**
Original commit: 054374b / D31882
When a create stream fails, a cleanup is performed.

The UpdatePeersAndMetrics background thread is involved in this cleanup.
As part of the rollback, the cdc_state table entries associated with
the stream have their checkpoint set to OpId::Max. This is the indication
to UpdatePeersAndMetrics to perform cleanup for the stream in question.

One of the ways in which a create stream could fail is if the ALTER TABLEs
that were issued as part of the create stream time out. However, an ALTER
TABLE response that arrives late, after the rollback and cleanup have been
performed, could put an entry back in the cdc_state table for the same stream
with a valid checkpoint. This in turn could once again cause resources to be
used. This issue arises because the stream metadata is not refreshed from
master by UpdatePeersAndMetrics.

The solution is for UpdatePeersAndMetrics to refresh the stream metadata from
master for a stream that is in the INITIATED state. This results in
UpdatePeersAndMetrics realizing that the stream has been deleted and performing
the necessary cleanup again. Care is taken to limit the number of metadata
refresh calls to master per stream per round of UpdatePeersAndMetrics.
Jira: DB-9727

Test Plan: All tests

Reviewers: stiwary, skumar

Reviewed By: stiwary

Subscribers: ycdcxcluster

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D31921
dr0pdb added a commit that referenced this issue Apr 17, 2024
…e operation via Walsender

Summary:
##### Backport Description
There was a small merge conflict in the `TestPgReplicationSlot.java` file in the `getTServerFlags` function. This is because https://phorge.dev.yugabyte.com/D33201 has not yet been backported to 2024.1 and I decided not to wait for it.

All the other files were merged without conflicts.

##### Original Description(s)

Original commits:
fddfec6 / D32859
459e6b6 / D33539
8d56bc5 / D33567

###### [#20726] YSQL: Introduce support for streaming update operation via Walsender
Before this revision, we only supported streaming Insert & Delete operations as part of the PG-compatible logical replication support. These were added in
https://phorge.dev.yugabyte.com/D31997 and https://phorge.dev.yugabyte.com/D32868 respectively.

This revision extends the support to update operations as well. To do this, the following has been implemented:
1. Decoding update operations and streaming them
2. Ability to get the replica identities stored in the stream metadata from yb-master and use it while decoding update records. It is used to determine whether we want to handle omitted values differently vs NULL values. We only want to do that for YB specific replica identities.

---

A key part of this support requires the capability to distinguish between NULL and omitted values for YB specific replica identity values. For example: consider replica identity (record type) `CHANGE`:

```
create table t (a int primary key, b int, c text);

insert into t values (1,1, 'abc');

update t set b = 2 where a = 1;            -- Statement 1
update t set b = 3, c = NULL where a = 1;  -- Statement 2
```

In Statement 1, c is not being updated, while in Statement 2, it is being explicitly set to `NULL`. For Statement 1, we need to be able to tell the client that we have decided not to send the column value for `c` as it is unchanged. We cannot send `c` as NULL, as that would imply that the value was explicitly set to `NULL`. To do this, we will send omitted values as `'u'`, which indicates an unchanged toasted value. The PG Debezium connector (our fork) will be modified to treat this as a special case.

This requirement doesn't arise in replica identity values inherited from PG (DEFAULT, NOTHING, FULL) because you can deduce whether the column is omitted or NULL just through the replica identity value.

For example, consider the same example as above but with `FULL`. In this case, you'll also get the complete old tuple, i.e. none of the columns are omitted. So if a column is sent as NULL, it means that it truly was NULL.

Similar deductions apply to `DEFAULT` and `NOTHING` which have no before image except primary keys.
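The tri-state a consumer must handle under the `CHANGE` replica identity can be sketched as below. The `ColKind`/`WireMarker` names are hypothetical; the one-byte markers mirror the ones PG's pgoutput tuple format uses (`'n'` for NULL, `'u'` for unchanged/omitted, `'t'` for a text value).

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: under YB's CHANGE replica identity, a column in an
// UPDATE is one of three things: a concrete value, an explicit NULL, or
// omitted (unchanged). NULL and omitted must not be conflated.
enum class ColKind { VALUE, NULL_VALUE, OMITTED };

struct ColDatum {
  ColKind kind;
  std::string value;  // meaningful only when kind == VALUE
};

// Encode as a logical-replication-style tuple marker:
// 'u' = unchanged/omitted, 'n' = NULL, 't' = text value.
char WireMarker(const ColDatum& c) {
  switch (c.kind) {
    case ColKind::OMITTED:    return 'u';
    case ColKind::NULL_VALUE: return 'n';
    case ColKind::VALUE:      return 't';
  }
  return '?';  // unreachable; keeps compilers quiet
}
```

For Statement 1 above, column `c` would be sent with marker `'u'`; for Statement 2, with marker `'n'`.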

---

**Upgrade/Rollback safety:**
The logical replication support is disabled via the test flag `ysql_TEST_enable_replication_slot_consumption`.

###### [#21620] YSQL: Handle schema change (DDL) in the walsender for streaming of records
Before this revision, we didn't support schema changes in the logical replication protocol implementation.

This revision introduces a mechanism to handle them. The implementation works by keeping a per-table read_time_ht, which is the hybrid time at which the
Relation descriptor will be read from the PG catalog for that table.

The implementation is as follows:
1. At the start of streaming, the record_commit_ht stored in the cdc state table is used.
2. Upon receiving a DDL response for table t
   - Remember, in a hash table, that we received a DDL event for this table
3. Upon the first DML on table t
   - Set the yb_read_time to what we stored in step 2
   - Invalidate the relcache
   - Notify the reorder buffer, which then notifies the pgoutput plugin that the schema for table t has changed. This ensures that pgoutput sends the schema to the client with the DML
   - Mark in the hash table that we have handled the DDL event for table t
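The per-table bookkeeping in the steps above can be sketched as follows. `DdlTracker`, `OnDdl`, and `OnDml` are illustrative names, not the actual walsender code; the sketch only shows the hash-table protocol: a DDL records a read time, and the first subsequent DML consumes it and triggers one schema message.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical sketch of the walsender's per-table DDL bookkeeping: a DDL
// response records the read time for the new schema; the first DML on that
// table afterwards consumes it, standing in for relcache invalidation and
// re-sending the schema to the client before the row change.
using TableId = std::string;
using HybridTime = uint64_t;

struct DdlTracker {
  std::unordered_map<TableId, HybridTime> pending_ddl;  // table -> read_time_ht
  int schema_messages_sent = 0;

  void OnDdl(const TableId& t, HybridTime ddl_ht) { pending_ddl[t] = ddl_ht; }

  // Returns the read time to use for this DML, emitting a schema message
  // exactly once per pending DDL event.
  HybridTime OnDml(const TableId& t, HybridTime default_ht) {
    auto it = pending_ddl.find(t);
    if (it == pending_ddl.end()) return default_ht;
    HybridTime ht = it->second;
    ++schema_messages_sent;  // pgoutput re-sends the relation/schema message
    pending_ddl.erase(it);   // mark the DDL event as handled
    return ht;
  }
};
```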

**Upgrade/Rollback safety:**
Feature is disabled under test flag: `ysql_TEST_enable_replication_slot_consumption`

###### [#21686] CDCSDK: Store replica identity of tables created after the creation of the stream

Before this revision, the replica identities of tables were stored at stream creation time and used throughout streaming. This didn't
include tables that were created and added to the stream after the creation of the stream.

This revision adds logic to store replica identities of such tables as part of our background addition logic.

Note: With the present implementation, the only replica identity value possible for these tables is CHANGE (default) unless the users are able to sneak in an `ALTER TABLE REPLICA IDENTITY` statement immediately after the table creation and before the background thread handling such tables for CDC runs. Filed #21718 to allow an alternate mechanism for the same.

Jira: DB-9727

Test Plan:
`./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot'`
`./yb_build.sh --java-test 'org.yb.pgsql.TestPgReplicationSlot#testWithDDLs'`

Reviewers: asrinivasan, skumar

Reviewed By: skumar

Subscribers: bogdan, yql, ycdcxcluster, ybase, hsunder

Differential Revision: https://phorge.dev.yugabyte.com/D34212