Skip to content

2.27.0.0-b152

@mdbridge mdbridge tagged this 28 May 15:30
Summary:
Currently, failed DDLs -- either on the source or target universe -- can cause the DocDB column IDs to diverge across the two universes.

This diff fixes this by rolling back the next DocDB column ID counter when a DDL fails.  This is only done for YSQL databases participating in automatic mode replication.

This rollback of the DocDB column ID counter can cause problems via the tablet metadata del_columns field.  In particular, when we notice that a column currently present in the RocksDB schema is missing in the new schema we are being given, we add its column ID to this field, which is used to determine which columns can be dropped during compaction.  The rollback of an ADD COLUMN will thus mark the column ID temporarily used by the new column as deleted; however, a future ADD COLUMN can reuse that column ID and that column will get incorrectly compacted away.

To prevent this, I introduce a notion of "provisional column IDs", which are column IDs for columns that could be rolled back in the future.  Such IDs are never added to the del_columns field.

Determining which column IDs are provisional is easy:
  * take the minimum of
    * the next column ID we would use for this table
    * the column ID we might roll back to if the current transaction aborts

This information (the first provisional column ID) is passed down to the tablet server via the a new field in the ChangeMetadataRequestPB.  Note this information is also passed across xCluster to the tablet server(s) on the other side.

Note that we have the nice property here that if we accidentally lose the rollback to the tablet server (this can happen in some very rare cases), absolute worst case we fail to compact some columns for a while.

I have done a small number of clangd tidy fixes.

**Upgrade/Rollback safety:**

This diff adds fields to 2 persisted protobufs:
```
~/code/yugabyte-db/src/yb/tablet/operations.proto:106:
message ChangeMetadataRequestPB {
  ...
  // If present, holds the first column ID that may be provisional.  Columns with this ID or later
  // may be rolled back then possibly rolled forward with different types.
  //
  // If missing, then all existing columns may safely be treated as non-provisional.
  optional int32 first_provisional_column_id = 21;
}
```

```
~/code/yugabyte-db/src/yb/master/catalog_entity_info.proto:49:
  message YsqlDdlTxnVerifierStatePB {
    ...
    optional int32 previous_next_column_id = 6;
  }
```

The second case is unproblematic because it is not possible to upgrade to this version while automatic mode is running.  (This is the first stable release with automatic mode available and we have banned upgrading between preview releases with automatic mode on.)  Thus, old code will never see previous_next_column_id present and the new code knows what to do when it's missing (don't attempt to roll back the column ID).

For the first case,
  * the old code will ignore the first_provisional_column_id; this is safe because we cannot have actually rolled back the column ID because automatic mode is not running
  * if the new code sees the field missing, it assumes the columns it already has are non-provisional; this is safe because no rollbacks of column ID could have happened by that point.

Fixes #25787
Jira: DB-15076

Test Plan:
Added a full set of tests for the various cases:
  * backup and restore preserves the next DocDB column ID (currently disabled due to a separate issue)
  * DDL failure on source
    * above but on a partitioned table (this causes 2 tables to get their columns changed in a single DDL)
  * DDL failure on target
  * failover where writes but not the relevant DDLs have been replicated
  * rolling back IDs does not cause columns to be interpreted as deleted

There is not a case for just dropping replication because we are documenting that that doesn't work.

I verified that these tests fail without the fixes in this diff.

```
ybd release --cxx-test xcluster_ddl_replication-test --gtest_filter '*.DocdbNextColumnAboveLastUsedColumn'
ybd release --cxx-test xcluster_ddl_replication-test --gtest_filter '*.FailedSchemaChangeOnSource'
ybd release --cxx-test xcluster_ddl_replication-test --gtest_filter '*.FailedSchemaChangeOnSourceWithPartitioning'
ybd release --cxx-test xcluster_ddl_replication-test --gtest_filter '*.FailedSchemaChangeOnTarget'
ybd release --cxx-test xcluster_ddl_replication-test --gtest_filter '*.ColumnIdsOnFailover'
ybd release --cxx-test xcluster_ddl_replication-test --gtest_filter '*.RollbackPreservesDeletedColumns'
```

Another useful existing test lets us verify that we are actually compacting away deleted columns:
```
ybd release --cxx-test pg_mini-test --gtest_filter '*.TablegroupCompaction' --test-args '--ysql_enable_packed_row=false'
```

Reviewers: xCluster, hsunder, jhe

Reviewed By: jhe

Subscribers: jhe, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D43676
Assets 2
Loading