cluster streaming: stream client should be resilient to source cluster topology changes #66722
Labels
A-disaster-recovery
A-tenant-streaming
Including cluster streaming
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-cdc
Today, in cluster-to-cluster streaming, the destination cluster communicates with a single node on the source cluster. However, because the stream is a long-lived job, that node may be removed from the source cluster while the job is still running. To be resilient to topology changes in the source cluster, the stream client needs to maintain a list of active nodes it can reach.
This will likely be exposed as a service that the stream client (running on the destination cluster) will either poll periodically or subscribe to for notifications of nodes being added to or removed from the source cluster.
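A rough sketch of the polling variant, to make the shape of the problem concrete. Everything here is hypothetical: `TopologySource`, `Tracker`, `NodeAddr`, and `StaticSource` are illustrative names, not existing stream client APIs, and the actual service interface is still undecided.

```go
package main

import "sort"

// NodeAddr is a hypothetical address of a source-cluster node.
type NodeAddr string

// TopologySource is a hypothetical stand-in for the service described
// above; the real interface has not been designed yet.
type TopologySource interface {
	ActiveNodes() ([]NodeAddr, error)
}

// StaticSource is a trivial in-memory TopologySource for experiments.
type StaticSource struct {
	Nodes []NodeAddr
}

func (s *StaticSource) ActiveNodes() ([]NodeAddr, error) {
	return s.Nodes, nil
}

// Tracker remembers the node set seen on the previous poll.
type Tracker struct {
	src   TopologySource
	nodes map[NodeAddr]struct{}
}

func NewTracker(src TopologySource) *Tracker {
	return &Tracker{src: src, nodes: map[NodeAddr]struct{}{}}
}

// Poll fetches the current node list and reports which nodes were added
// or removed since the previous successful poll. On error the last known
// set is left untouched so the client can keep using it.
func (t *Tracker) Poll() (added, removed []NodeAddr, err error) {
	current, err := t.src.ActiveNodes()
	if err != nil {
		return nil, nil, err
	}
	next := make(map[NodeAddr]struct{}, len(current))
	for _, n := range current {
		if _, seen := next[n]; seen {
			continue // ignore duplicates in the reported list
		}
		next[n] = struct{}{}
		if _, known := t.nodes[n]; !known {
			added = append(added, n)
		}
	}
	for n := range t.nodes {
		if _, still := next[n]; !still {
			removed = append(removed, n)
		}
	}
	t.nodes = next
	// Sort so callers get deterministic output regardless of map order.
	sort.Slice(added, func(i, j int) bool { return added[i] < added[j] })
	sort.Slice(removed, func(i, j int) bool { return removed[i] < removed[j] })
	return added, removed, nil
}
```

A caller would invoke `Poll` on a timer and redirect the stream whenever the node it is connected to shows up in `removed`. A notification-based variant could reuse the same diffing logic and only change how updates arrive.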
The set of nodes that the destination cluster believes are active in the source cluster should be persisted in the job so that the job can be resumed.
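As an illustration of the persistence requirement, here is a minimal round-trip sketch. `StreamProgress`, `SaveNodes`, and `LoadNodes` are hypothetical names, and JSON is used only to keep the example self-contained; it is an assumption, not the encoding the job would actually use.

```go
package main

import (
	"encoding/json"
	"sort"
)

// StreamProgress is a hypothetical stand-in for the state persisted
// with the streaming job.
type StreamProgress struct {
	SourceNodes []string `json:"source_nodes"`
}

// SaveNodes serializes the active node set. Nodes are sorted so the
// persisted bytes are stable across runs.
func SaveNodes(active map[string]struct{}) ([]byte, error) {
	nodes := make([]string, 0, len(active))
	for n := range active {
		nodes = append(nodes, n)
	}
	sort.Strings(nodes)
	return json.Marshal(StreamProgress{SourceNodes: nodes})
}

// LoadNodes restores the active node set on job resumption.
func LoadNodes(data []byte) (map[string]struct{}, error) {
	var p StreamProgress
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, err
	}
	active := make(map[string]struct{}, len(p.SourceNodes))
	for _, n := range p.SourceNodes {
		active[n] = struct{}{}
	}
	return active, nil
}
```

On resume, the client could try the restored nodes in turn until one accepts a connection, then refresh the list from the source cluster.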
Epic CRDB-18753
Jira issue: CRDB-8201