2.27.0.0-b625

@huapengy huapengy tagged this 01 Oct 02:57
Summary:
In YugabyteDB, conflict resolution for a write request, or for a read request that requires lock
acquisition (such as reads in serializable isolation or explicit row locking in PG), has the following steps:

1. Acquire in-memory locks
2. Check intents db for writes with conflicting modes
3. Check regular db for writes that have modified the data (this is skipped for fast-path and serializable transactions)
4. Write provisional records to intents db
5. Write to regular db on transaction commit

Currently, in steps 1-4, we take weak locks on all logical prefixes of the row. For example,
if a row is inserted with pk: ((h1, h2), r1, r2), we take weak locks on (), ((h1, h2)), ((h1, h2), r1) and ((h1, h2), r1, r2).
The weak locks on the prefixes ensure that the enclosing parent doc key is not modified.
More information is available [[ https://docs.yugabyte.com/preview/architecture/transactions/isolation-levels/#locking-granularities | here ]].
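
The prefix enumeration described above can be sketched as follows. This is only an illustration of the
example in this summary, not the DocDB implementation; the function name doc_key_prefixes and the tuple
representation of a doc key are hypothetical.

```python
from typing import List, Tuple


def doc_key_prefixes(hash_cols: Tuple[str, ...], range_cols: Tuple[str, ...]) -> List[tuple]:
    """Return every logical prefix of a primary key, shortest first.

    Hypothetical helper: the hash columns are treated as one indivisible
    group, and each range column extends the previous prefix by one component.
    """
    prefixes: List[tuple] = [()]              # the empty, top-level key
    base: tuple = (hash_cols,) if hash_cols else ()
    if base:
        prefixes.append(base)                 # all hash columns together
    for i in range(1, len(range_cols) + 1):
        prefixes.append(base + range_cols[:i])
    return prefixes


prefixes = doc_key_prefixes(("h1", "h2"), ("r1", "r2"))
for p in prefixes:
    print(p)
```

For pk ((h1, h2), r1, r2) this yields the four keys listed above: (), ((h1, h2)), ((h1, h2), r1) and
((h1, h2), r1, r2).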

However, weak locks on prefixes are only needed when a transaction acquires a strong lock
on some prefix, which can only happen when a serializable isolation transaction takes explicit row locks
on a prefix of the pk (i.e., when some but not all pk columns are specified in the where clause).
Given that serializable isolation is not widely used, this diff adds a mode of operation where
locks on intermediate prefixes are avoided. Instead, serializable isolation transactions will
acquire strong modes on the top-level prefix when the full pk is not specified, and RC/RR
transactions will acquire weak modes on the top-level prefix.

The downside of this mode is that if there are serializable transactions in the workload,
they can cause a lot of contention due to the top-level strong locks. A warning will be
logged in such cases.
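
The contention trade-off can be illustrated with a toy model. This is a sketch under loud assumptions:
it uses only two lock modes and treats two locks on the same key as conflicting whenever at least one
is strong, ignoring the read/write distinction that the real lock modes carry. All names and the
hard-coded lock sets are hypothetical.

```python
WEAK, STRONG = "weak", "strong"


def modes_conflict(a: str, b: str) -> bool:
    # Simplified assumption: two locks on the same key conflict iff one is strong.
    return STRONG in (a, b)


def txns_conflict(locks_a: dict, locks_b: dict) -> bool:
    # Two transactions conflict if they hold conflicting modes on any shared key.
    return any(
        key in locks_b and modes_conflict(mode, locks_b[key])
        for key, mode in locks_a.items()
    )


TOP = ()                                   # empty, top-level key
PARTIAL_PK = (("h1", "h2"),)               # serializable txn specifies only the hash columns
OTHER_ROW = (("h9", "h9"), "r1", "r2")     # RC txn writes an unrelated row

# Prefix locks enabled (current behavior): weak locks on every shorter prefix.
ser_legacy = {TOP: WEAK, PARTIAL_PK: STRONG}
rc_legacy = {TOP: WEAK, OTHER_ROW[:1]: WEAK, OTHER_ROW[:2]: WEAK, OTHER_ROW: STRONG}

# Prefix locks skipped: serializable txn without a full pk takes a strong top-level lock.
ser_skip = {TOP: STRONG, PARTIAL_PK: STRONG}
rc_skip = {TOP: WEAK, OTHER_ROW: STRONG}

legacy_conflict = txns_conflict(ser_legacy, rc_legacy)
skip_conflict = txns_conflict(ser_skip, rc_skip)
print("prefix locks enabled:", legacy_conflict)
print("prefix locks skipped:", skip_conflict)
```

In this model the two transactions touch unrelated rows, yet with prefix locks skipped they still
collide on the top-level key, which is the contention described above.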

Some implementation details:
(1) Introduction of skip_prefix_locks gflag.
A new gflag, skip_prefix_locks, has been introduced. When skip_prefix_locks is
false (current behavior): in addition to the lock on the full doc key, weak locks are taken on
all prefixes for both Serializable and Snapshot/RC isolation levels.
When skip_prefix_locks is true: locks are taken only on the top-level key and the full doc key in all
isolation levels, skipping intermediate prefix locks. For the Serializable level, however, the top-level key
needs a strong lock if the associated key doesn't contain the full primary key.
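
A minimal sketch of the two modes, assuming a doc key is modeled as a tuple of components and that the
key named in the request is locked in strong mode (an assumption; this summary only states that the
prefix locks are weak). The function lock_plan and its parameters are hypothetical, apart from the
names skip_prefix_locks and pk_is_known, which come from this diff.

```python
from typing import List, Tuple

WEAK, STRONG = "weak", "strong"


def lock_plan(key: tuple, isolation: str, pk_is_known: bool,
              skip_prefix_locks: bool) -> List[Tuple[tuple, str]]:
    """Return (key, mode) pairs for one request key, shortest key first.

    key          -- components named in the request, e.g. (("h1", "h2"), "r1", "r2")
    isolation    -- "serializable", "snapshot" or "read committed"
    pk_is_known  -- whether the request specifies the full primary key
    """
    if not skip_prefix_locks:
        # Current behavior: weak lock on every proper prefix, then the key itself.
        plan = [(key[:i], WEAK) for i in range(len(key))]
        plan.append((key, STRONG))
        return plan

    # skip_prefix_locks=true: only the top-level key and the request key are locked.
    top_mode = STRONG if (isolation == "serializable" and not pk_is_known) else WEAK
    plan_by_key = {(): top_mode}
    plan_by_key[key] = STRONG   # assumption: the request key is always locked strongly
    return list(plan_by_key.items())


full_pk = (("h1", "h2"), "r1", "r2")
print(lock_plan(full_pk, "snapshot", True, skip_prefix_locks=False))
print(lock_plan(full_pk, "snapshot", True, skip_prefix_locks=True))
print(lock_plan((("h1", "h2"),), "serializable", False, skip_prefix_locks=True))
```

For the four-component example key, the plan shrinks from four entries to two when skip_prefix_locks is
true, which is where the reduction in keys comes from in this model.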

(2) Support for enabling/disabling skip_prefix_locks.
skip_prefix_locks can be enabled or disabled, but doing so is not safe while transactions are running at
serializable isolation level. Enabling/disabling skip_prefix_locks safely will be supported in a separate diff.

(3) To determine whether to lock the top-level key in strong mode for serializable isolation,
we store information about whether the full pk is known for the operation (refer to pk_is_known).

(4) The skip prefix locks feature assumes that in Snapshot/RC isolation levels a full primary key or
an empty key is always specified in any request that needs an intents db write (i.e., we never
specify a non-empty prefix of a primary key in the request). Some checks are added to ensure
the assumption is never broken.

(5) Performance. With the skip prefix locks feature enabled, the number of keys written to the intents
db is expected to decrease, which can help improve write performance. Some benchmark numbers:
Bulk load on a regular table with 2 indexes on various columns:
 2 columns,  bulkload duration reduced 14.2%,  avg tserver latency reduced 117%
 6 columns,  bulkload duration reduced 22%,    avg tserver latency reduced 205%
12 columns,  bulkload duration reduced 60%,    avg tserver latency reduced 469%

**Upgrade/Rollback safety:**
A regular gflag "skip_prefix_locks" is added with default value false. For now it can only be enabled on
new installations. A separate diff will change it to an autoflag of type kLocalPersisted so that the
feature can be enabled/disabled safely and is supported across upgrades.
Upgrade is not affected when the feature is disabled. The field 'skip_prefix_locks' in TransactionMetadataPB
has default value false, so the feature is disabled automatically if the field is missing from the message.
Support for upgrade with the feature enabled will be added in a separate diff. Existing unit tests already
cover upgrade; more tests specific to skip prefix locks will be added in a separate diff.
Jira: DB-18373

Test Plan:
Jenkins: urgent

ybd release --cxx-test pgwrapper_pg_row_lock-test
ybd release --cxx-test pg_get_lock_status-test
ybd release --cxx-test doc_key-test
ybd release --cxx-test conflict_resolve_keys_verification-itest  (new test)
ybd release --java-test 'org.yb.cql.TestTransaction'
ybd release --java-test 'org.yb.pgsql.TestPgExplicitLocks'

Reviewers: sergei, pjain, bkolagani, rthallam, patnaik.balivada

Reviewed By: sergei, pjain

Subscribers: ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D45672