2.27.0.0-b439

@artem-mindrov artem-mindrov tagged this 12 Aug 22:52
Summary:
A Txn-scoped xCluster config can end up being restarted in its entirety when a single DB has more failed tables than there are healthy tables across the whole config and the restart request contains only that DB.

Instead of only comparing set sizes, compute a set difference to find out whether any tables in the config do not require bootstrapping.
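
The difference between the two checks can be sketched as follows. This is a minimal illustration, not the actual YBA code: the class name, method names, and table IDs are hypothetical, and the size-only check is only one plausible form of the old logic.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these names come from the YBA code base.
final class RestartScopeSketch {

  // Assumed form of a size-only check: treats the request as a whole-config
  // restart as soon as it names at least as many tables as the config holds.
  static boolean isRestartWholeConfigBySize(
      Set<String> configTableIds, Set<String> tableIdsToBootstrap) {
    return tableIdsToBootstrap.size() >= configTableIds.size();
  }

  // Set-difference check: a whole-config restart is needed only if no table
  // in the config is left out of the bootstrap set.
  static boolean isRestartWholeConfigByDifference(
      Set<String> configTableIds, Set<String> tableIdsToBootstrap) {
    Set<String> tablesNotNeedingBootstrap = new HashSet<>(configTableIds);
    tablesNotNeedingBootstrap.removeAll(tableIdsToBootstrap);
    return tablesNotNeedingBootstrap.isEmpty();
  }

  public static void main(String[] args) {
    // Two healthy tables remain in the config; the request names only db2's tables.
    Set<String> configTableIds = Set.of("yugabyte.t1", "db1.t1");
    Set<String> tableIdsToBootstrap = Set.of("db2.t1", "db2.t2", "db2.t3");

    System.out.println(isRestartWholeConfigBySize(configTableIds, tableIdsToBootstrap));
    System.out.println(isRestartWholeConfigByDifference(configTableIds, tableIdsToBootstrap));
  }
}
```

With these inputs the size-only check flags a whole-config restart because three requested tables outnumber two healthy ones, while the set-difference check sees that neither healthy table is in the request and leaves them alone.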

Test Plan:
Create a DR config on top of Txn xCluster with the following DBs:
- yugabyte w/ 1 table
- db1 w/ 1 table
- db2 w/ 3 tables

To simulate a failure:
- remove db2 from DR
- make a code change to remove WITH FORCE from the drop DB statement during DeleteKeyspace
- connect to db2 through ysqlsh on the target
- add db2 back so that DeleteKeyspace fails and leaves all of db2's tables in an error state
- restart replication only on db2

No bootstrapping or replication restart should happen on either yugabyte or db1.

```
2025-08-11T16:08:12.395Z  [info] a6b2bcbd-8abf-4102-a500-e3701c3dd7f7 EditXClusterConfig.java:313 [TaskPool-14] com.yugabyte.yw.commissioner.tasks.EditXClusterConfig tableIdsDeleteReplication is [] and isRestartWholeConfig is false
2025-08-11T16:08:12.459Z  [info] a6b2bcbd-8abf-4102-a500-e3701c3dd7f7 CreateXClusterConfig.java:567 [TaskPool-14] com.yugabyte.yw.commissioner.tasks.CreateXClusterConfig Creating subtasks to set up replication using bootstrap for tables [00004001000030008000000000004006, 00004001000030008000000000004016, 0000400100003000800000000000400e, 00004001000030008000000000004002, 00004001000030008000000000004012, 0000400100003000800000000000400a] in keyspace db2
```

Reviewers: hzare

Reviewed By: hzare

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D45969