Skip to content

2.27.0.0-b223

@hari90 hari90 tagged this 13 Jun 06:05
Summary:
On the target cluster all writes to the user tables are written with the external hybrid time that we get from the source cluster.
For a failover we pick the xCluster safe time and PITR the target database to this consistent time.

DDLs however are not replicated with external hybrid times. We need to execute them, by running the actual PG statement, which results in the rows in pg catalog table (sys_catalog) to have a hybrid time as of the time the DDLs were executed.
For scans on the target this is ok, since we use the latest pg_catalog which is semantically in sync with the xCluster safe time. i.e. even though the rows in pg_catalog have a different hybrid time, they match the data in the user tables as of the xCluster safe time.
The process of executing DDLs and updating the xCluster safe time to account for it does not happen atomically. If the target poller crashes between the DDL and the update to the safe time table then the pg_catalog will be ahead of the xCluster safe time. This leads to some issues during normal scan as well that are tracked by #27071. But they are transient.
However, on a failover we cannot have such inconsistency, so we need to take into account the safe time of all executed DDLs, and ensure the xCluster safe time is semantically equivalent to the state of the pg_catalog.

In order to get a consistent xCluster safe time for the failover PITR, we first pause all pollers and wait for them to persist the last known apply safe time to the safe time table (4e7696f8a3d58505db1e19f254bc76e16178e158/D39975).
This change extends this to the DDL queue poller as well, and ensures the safe time it publishes includes all executed DDLs. This is done in `XClusterDDLQueueHandler::UpdateSafeTimeForPause` using a similar logic `XClusterDDLQueueHandler::ExecuteCommittedDDLs` uses to determine pending DDLs to execute. Instead of executing the DDLs, `UpdateSafeTimeForPause` only gets the max `commit_time` for which all DDLs have been executed and updates the pollers safe time to it. We also persist `last_commit_time_processed` in the safe time batch, to handle poller crashes after all DDLs in the queue is executed.
Jira: DB-12857

Test Plan:
Jenkins: urgent

```
ybd --cxx-test xcluster_ddl_replication-test --gtest_filter "XClusterDDLReplicationFailoverTest.FailoverWithPendingAlterDDLs"
ybd --cxx-test xcluster_ddl_queue_handler-test --gtest_filter "XClusterDDLQueueHandlerMockedTest.GetSafeTimeBetweenDDLProcessing"
```

Reviewers: mlillibridge

Reviewed By: mlillibridge

Subscribers: xCluster, yql

Differential Revision: https://phorge.dev.yugabyte.com/D44588
Assets 2
Loading