
2.27.0.0-b568

@es1024 es1024 tagged this 15 Sep 20:48
Summary:
Today, transactions are categorized into one of:
1. region-local: these use transaction status tablets replicated solely within the same region as the TServer ("local region") processing the transaction
 - there must be at least one transaction status tablet with a peer on the TServer (leader is preferred)
 - all writes must be replicated within the local region
2. global: these use transaction status tablets from system.transactions

Consider the case where a user has data that is replicated over multiple geographically close regions. For example, in a cluster that spans the US, EU, and Asia Pacific, there may be a table that is replicated across US-West, US-Central, and US-East. When writing only to a table in the US, we should ideally not require replicating commits to EU/Asia Pacific. Today, however, any transaction that writes to this table must be global, since region-local requires writes that replicate only within the (singular) local region.

To improve handling of this situation, this revision introduces the concept of tablespace-local(X) localities:
3. tablespace-local(X): use transaction status tablets corresponding to tablespace X
 - there must be at least one transaction status tablet with a peer on the TServer (leader is preferred)
 - all writes must be to tables placed in tablespace X
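
The three categories can be illustrated with a small Python sketch. This is not YugabyteDB's implementation: `Table`, `classify`, and the tablespace and region names are hypothetical, and the sketch ignores the requirement that a status tablet peer exist on the TServer.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Table:
    """Hypothetical model of a table and where its tablespace places it."""
    name: str
    tablespace: str
    regions: FrozenSet[str]  # regions the tablespace replicates to


def classify(tables: Iterable[Table], local_region: str) -> str:
    """Return the narrowest locality that covers all written tables."""
    tables = list(tables)
    # region-local: every write replicates only within the local region.
    if all(t.regions == frozenset({local_region}) for t in tables):
        return "region-local"
    # tablespace-local(X): every write goes to a table in the same tablespace X.
    tablespaces = {t.tablespace for t in tables}
    if len(tablespaces) == 1:
        return f"tablespace-local({next(iter(tablespaces))})"
    # Anything else needs the global status tablets.
    return "global"


us_orders = Table("orders", "us_ts", frozenset({"us-west", "us-central", "us-east"}))
west_only = Table("sessions", "west_ts", frozenset({"us-west"}))
eu_users = Table("users", "eu_ts", frozenset({"eu-west"}))
```

With these assumed tables, a transaction writing only to `us_orders` from a TServer in `us-west` classifies as `tablespace-local(us_ts)`, whereas without the new locality it would have to be global.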

This revision only allows promotion of tablespace-local(X) to global. A transaction is started as tablespace-local(X) instead of region-local if the newly added `use_tablespace_based_transaction_placement` gflag is true (default false). To mask CREATED heartbeat latency, a transaction pool for each tablespace-local(X) locality is also lazily created as needed. Removal of transaction pools for dropped tablespaces will be done in a future change (#28486).
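
The one-way promotion and the lazily created per-tablespace pools can be sketched as follows. All names here (`TransactionPools`, `pool_for`, `Transaction`, `record_write`) are hypothetical illustrations and do not correspond to identifiers in the code base.

```python
GLOBAL = "global"


class TransactionPools:
    """Hypothetical registry with one pool per tablespace-local(X) locality."""

    def __init__(self, make_pool):
        self._make_pool = make_pool  # factory called at most once per tablespace
        self._pools = {}

    def pool_for(self, tablespace):
        # Lazily create the pool the first time a tablespace is seen.
        if tablespace not in self._pools:
            self._pools[tablespace] = self._make_pool(tablespace)
        return self._pools[tablespace]


class Transaction:
    """Starts as tablespace-local(X) and may only be promoted to global."""

    def __init__(self, tablespace):
        self._tablespace = tablespace
        self.locality = f"tablespace-local({tablespace})"

    def record_write(self, tablespace):
        # A write outside tablespace X forces promotion; there is no demotion.
        if self.locality != GLOBAL and tablespace != self._tablespace:
            self.locality = GLOBAL


created = []
pools = TransactionPools(lambda ts: created.append(ts) or [f"txn-for-{ts}"])
first = pools.pool_for("us_ts")
second = pools.pool_for("us_ts")  # reuses the pool created above

txn = Transaction("us_ts")
txn.record_write("us_ts")
locality_before = txn.locality
txn.record_write("eu_ts")
txn.record_write("us_ts")
locality_after = txn.locality
```

In this sketch a pool is never removed once created, which mirrors the note above that cleanup for dropped tablespaces is deferred to a later change.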

When table-level object locks are enabled (ignoring the shared memory fastpath for simplicity), the first RPC in the transaction is not Perform but AcquireObjectLocks. There is an existing issue (#28317) where, on this path, transactions are not started based on the locality of the first operation but are instead unconditionally started as region-local. Fixing this issue for the new tablespace-local locality is left for the future change that fixes #28317.

The transaction table version increment that happens when `enable_tablespace_based_transaction_placement` changes has also been moved to a background task, since it involves a write to the sys catalog, which is not allowed from the thread running flag callbacks.

**Upgrade/Downgrade Safety**
The test flag `TEST_enable_tablespace_based_transaction_placement` was converted to a kLocalPersisted autoflag (`enable_tablespace_based_transaction_placement`).

Using tablespace-local localities requires tablespace-tagged status tablets to be present in the GetTransactionStatusTablets response; otherwise we revert to global, which is safe to use in older versions. These tablespace-tagged status tablets are not sent in the GetTransactionStatusTablets response until `enable_tablespace_based_transaction_placement` is enabled, so the entire feature is effectively gated behind the autoflag. This gating is necessary because tablespace-local transactions write a new value for the transaction locality into transaction metadata, which older versions do not recognize.
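
The gating described above can be sketched in Python. The response shape and the function names `build_status_tablets_response` and `choose_locality` are assumptions made for illustration; they are not the actual RPC schema or client code.

```python
def build_status_tablets_response(global_tablets, tagged_tablets, autoflag_enabled):
    """Hypothetical server side: omit tablespace-tagged tablets until the flag is on."""
    return {
        "global": list(global_tablets),
        "tablespace": dict(tagged_tablets) if autoflag_enabled else {},
    }


def choose_locality(response, tablespace):
    """Hypothetical client side: fall back to global when no tagged tablets exist."""
    tablets = response["tablespace"].get(tablespace)
    if not tablets:
        # Global is understood by older versions, so it is the safe fallback.
        return "global", response["global"]
    return f"tablespace-local({tablespace})", tablets


tagged = {"us_ts": ["status-tablet-us-1"]}
before_upgrade = build_status_tablets_response(["status-tablet-g"], tagged, False)
after_upgrade = build_status_tablets_response(["status-tablet-g"], tagged, True)
```

Because the client in this sketch only picks a tablespace-local locality when tagged tablets are present, no new locality value reaches transaction metadata until the flag is enabled.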
Jira: DB-17954

Test Plan:
Added unit tests:
`./yb_build.sh --cxx-test pgwrapper_geo_transactions-test --gtest_filter 'GeoTransactionsTablespaceLocalityTest.*'`

Reviewers: sergei

Reviewed By: sergei

Subscribers: yql, rthallam, svc_phabricator, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D46211