[CDCSDK] Replica Identity Support For CDC #21314

Closed
yugabyte-ci opened this issue Mar 5, 2024 · 0 comments
Labels: area/cdcsdk (CDC SDK), jira-originated, kind/new-feature (This is a request for a completely new feature), priority/medium (Medium priority issue)

yugabyte-ci commented Mar 5, 2024

Jira Link: DB-10218

yugabyte-ci added the area/cdcsdk, jira-originated, kind/new-feature, and priority/medium labels Mar 5, 2024
Sumukh-Phalgaonkar added a commit that referenced this issue Mar 19, 2024
Summary:
This diff adds support for using replica identity in CDC to populate before image records. Currently, to get the before image of a record, the before image mode (also known as CDCRecordType) must be specified in the yb-admin command at stream creation time. CDCRecordType is a stream-level construct, so every table in the stream has the same CDCRecordType.

Postgres uses replica identity, a table-level property, to determine the before image information for records in CDC. A separate diff (https://phorge.dev.yugabyte.com/D31074) enables the YSQL syntax for setting replica identity. In YB CDC, at stream creation time, the replica identity of each table in the namespace is obtained and stored in the stream metadata. For this purpose, a `map<TableId, ReplicaIdentity>` is added to the stream metadata proto. In the CDCSDK producer, the replica identity of the relevant table is fetched and the before image of each record is populated accordingly. UpdatePeersAndMetrics uses the replica identity of the relevant table to decide whether a retention barrier is required and sets `cdc_sdk_safe_time` accordingly.
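The flow described above can be illustrated with a minimal Python sketch. It is not the actual implementation (which lives in the proto and C++ code of this diff). `StreamMetadata`, `create_stream`, `replica_identity_for`, and `retention_barrier_required` are hypothetical names, and the rule that only FULL and DEFAULT need a retention barrier is an assumption, since the summary does not spell out which modes require one.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Mapping


class ReplicaIdentity(Enum):
    FULL = "FULL"
    DEFAULT = "DEFAULT"
    NOTHING = "NOTHING"
    CHANGE = "CHANGE"


@dataclass
class StreamMetadata:
    # Hypothetical stand-in for the stream metadata proto and its
    # map<TableId, ReplicaIdentity>; the real field name is not given.
    replica_identity_map: Dict[str, ReplicaIdentity] = field(default_factory=dict)


def create_stream(catalog: Mapping[str, ReplicaIdentity]) -> StreamMetadata:
    """Copy each table's replica identity into the stream metadata at creation time."""
    return StreamMetadata(replica_identity_map=dict(catalog))


def replica_identity_for(metadata: StreamMetadata, table_id: str) -> ReplicaIdentity:
    """Producer-side lookup: per table, not one value for the whole stream."""
    return metadata.replica_identity_map[table_id]


def retention_barrier_required(metadata: StreamMetadata, table_id: str) -> bool:
    # Assumption: only modes that need old row values (FULL, DEFAULT)
    # require history to be retained for the table.
    return replica_identity_for(metadata, table_id) in (
        ReplicaIdentity.FULL,
        ReplicaIdentity.DEFAULT,
    )


if __name__ == "__main__":
    catalog = {"orders": ReplicaIdentity.FULL, "audit_log": ReplicaIdentity.CHANGE}
    meta = create_stream(catalog)
    for table_id in catalog:
        print(table_id, replica_identity_for(meta, table_id).value,
              retention_barrier_required(meta, table_id))
```

Unlike the stream-level CDCRecordType, two tables in the same stream can carry different modes here. The sketch copies the values at creation time to mirror "obtained and stored"; whether later changes to a table's replica identity reach an existing stream is not covered by this summary.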

Four replica identity modes are supported as of now:

  - FULL
  - DEFAULT
  - NOTHING
  - CHANGE
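As a rough illustration of how the four modes could shape the before image, here is a short Python sketch. The semantics are assumptions, not taken from this diff: FULL, DEFAULT, and NOTHING follow the PostgreSQL documentation's description of replica identity (all old column values, old primary key values, and no old values, respectively), and CHANGE is assumed to carry no before image. `before_image` is a hypothetical helper, and the exact record contents emitted by the CDCSDK producer may differ.

```python
from typing import Dict, Optional, Sequence


def before_image(
    replica_identity: str,
    old_row: Dict[str, object],
    primary_key: Sequence[str],
) -> Optional[Dict[str, object]]:
    """Return the old-row values a change record would carry under each mode."""
    if replica_identity == "FULL":
        # Assumption (PostgreSQL semantics): old values of all columns.
        return dict(old_row)
    if replica_identity == "DEFAULT":
        # Assumption (PostgreSQL semantics): old values of primary key columns only.
        return {column: old_row[column] for column in primary_key}
    if replica_identity in ("NOTHING", "CHANGE"):
        # Assumption: no before image is populated for these modes.
        return None
    raise ValueError(f"unsupported replica identity: {replica_identity}")


if __name__ == "__main__":
    row = {"id": 1, "name": "widget", "qty": 5}
    for mode in ("FULL", "DEFAULT", "NOTHING", "CHANGE"):
        print(mode, before_image(mode, row, ["id"]))
```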

### Upgrade/Rollback Safety
The replica identity feature is guarded by the preview flag `ysql_yb_enable_replica_identity`, which defaults to false. To use replica identity, set this flag to true on both master and tserver once the upgrade is complete. A yb-admin command that upgrades streams from the older version to the newer version, which contains the replica identity map along with replication slot details, will be added in a separate diff.
Jira: DB-10218

Test Plan: ./yb_build.sh --cxx-test integration-tests_cdcsdk_replica_identity-test

Reviewers: skumar, asrinivasan, stiwary

Reviewed By: stiwary

Subscribers: yql, ybase, ycdcxcluster, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D32756
Sumukh-Phalgaonkar added a commit that referenced this issue Mar 21, 2024
Summary:
Original commit: 39aeeed / D32756
This diff adds support for using replica identity in CDC to populate before image records. Currently, to get the before image of a record, the before image mode (also known as CDCRecordType) must be specified in the yb-admin command at stream creation time. CDCRecordType is a stream-level construct, so every table in the stream has the same CDCRecordType.

Postgres uses replica identity, a table-level property, to determine the before image information for records in CDC. A separate diff (https://phorge.dev.yugabyte.com/D31074) enables the YSQL syntax for setting replica identity. In YB CDC, at stream creation time, the replica identity of each table in the namespace is obtained and stored in the stream metadata. For this purpose, a `map<TableId, ReplicaIdentity>` is added to the stream metadata proto. In the CDCSDK producer, the replica identity of the relevant table is fetched and the before image of each record is populated accordingly. UpdatePeersAndMetrics uses the replica identity of the relevant table to decide whether a retention barrier is required and sets `cdc_sdk_safe_time` accordingly.

Four replica identity modes are supported as of now:

  - FULL
  - DEFAULT
  - NOTHING
  - CHANGE

### Upgrade/Rollback Safety
The replica identity feature is guarded by the preview flag `ysql_yb_enable_replica_identity`, which defaults to false. To use replica identity, set this flag to true on both master and tserver once the upgrade is complete. A yb-admin command that upgrades streams from the older version to the newer version, which contains the replica identity map along with replication slot details, will be added in a separate diff.
Jira: DB-10218

Test Plan: ./yb_build.sh --cxx-test integration-tests_cdcsdk_replica_identity-test

Reviewers: skumar, asrinivasan, stiwary, xCluster, hsunder

Reviewed By: skumar

Subscribers: bogdan, ycdcxcluster, ybase, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D33377