kv/concurrency: increase kv.lock_table.coordinator_liveness_push_delay to 50ms

This change increases the default value for
`kv.lock_table.coordinator_liveness_push_delay`. Increasing this delay
is a big win for contended workloads because it allows transactions to
queue locally to conflicting locks (in the lockTable's lockWaitQueue)
instead of remotely next to the conflicting lock holders' txn records
(in the txnWaitQueue) in more cases.

This has a big effect on high contention / high concurrency workloads.
As we'll see in the following experiment, YCSB workload A would change
regimes somewhere between 200 and 256 concurrent threads. Operations
would slow down just enough that waiting txns would start performing
liveness pushes and become much less effective at queuing and responding
to transaction commits. This would cause the rest of the operations to
slow down and suddenly everyone was pushing and the front of every
lockWaitQueue was waiting in the txnWaitQueue.

The first group of selects and updates on the left (50/50 ratio, so they
overlap) shows YCSB workload A run with 256 threads and a 10ms liveness
push delay. We expect ~36k qps, but we're only sustaining between 5k and 15k
qps. When we bump this delay to 50ms on the right, we see much higher
throughput, stabilizing at ~36k. The workload is able to stay in the
efficient regime (no liveness pushes, listening directly to lock state
transitions) for far longer. I've tested up to 512 concurrent workers on
the same workload and never seen us enter the slow regime.

<todo throughput graph>

<todo pushes graph>

This partially addresses an existing TODO:

> We could increase this default to somewhere on the order of the
> transaction heartbeat timeout (5s) if we had a better way to avoid paying
> the cost on each of a transaction's abandoned locks and instead only pay
> it once per abandoned transaction per range or per node. This could come
> in a few different forms, including:
> - a per-store cache of recently detected abandoned transaction IDs
> - a per-range reverse index from transaction ID to locked keys
>
> TODO(nvanbenschoten): increasing this default value.

The finalizedTxnCache (introduced in cockroachdb#49218) gets us part of the way
here. It allows us to pay the liveness push delay cost once per
abandoned transaction per range instead of once per each of an abandoned
transaction's locks. This helped us to feel comfortable increasing the
default delay from the original 10ms to the current 50ms. Still, to feel
comfortable increasing this further, we'd want to improve this cache
(e.g. lifting it to the store level) to reduce the cost to once per
abandoned transaction per store.

To confirm that we're not regressing performance noticeably here, we run
the same tests that we ran in cockroachdb#49218:
```
--- BEFORE
SET CLUSTER SETTING kv.lock_table.coordinator_liveness_push_delay = '10ms';

CREATE TABLE keys (k BIGINT NOT NULL PRIMARY KEY);
BEGIN; INSERT INTO keys SELECT generate_series(1, 10000); ROLLBACK;
SELECT * FROM keys;

  k
-----
(0 rows)

Time: 5.574853s

CREATE TABLE keys2 (k BIGINT NOT NULL PRIMARY KEY);
BEGIN; INSERT INTO keys2 SELECT generate_series(1, 10000); ROLLBACK;
INSERT INTO keys2 SELECT generate_series(1, 10000);

INSERT 10000

Time: 33.155231s

--- AFTER
SET CLUSTER SETTING kv.lock_table.coordinator_liveness_push_delay = '50ms';

CREATE TABLE keys (k BIGINT NOT NULL PRIMARY KEY);
BEGIN; INSERT INTO keys SELECT generate_series(1, 10000); ROLLBACK;
SELECT * FROM keys;

  k
-----
(0 rows)

Time: 5.547465s

CREATE TABLE keys2 (k BIGINT NOT NULL PRIMARY KEY);
BEGIN; INSERT INTO keys2 SELECT generate_series(1, 10000); ROLLBACK;
INSERT INTO keys2 SELECT generate_series(1, 10000);

INSERT 10000

Time: 35.429984s
```

Release note (performance improvement): Transaction liveness pushes are
now delayed by 50ms, up from 10ms. This allows high contention workloads
to sustain high throughput up to much larger concurrency levels.
nvanbenschoten committed Jun 12, 2020
1 parent 5be564a commit 7b41b80
Showing 1 changed file with 10 additions and 2 deletions.
12 changes: 10 additions & 2 deletions in pkg/kv/kvserver/concurrency/lock_table_waiter.go
```diff
@@ -48,8 +48,16 @@ var LockTableLivenessPushDelay = settings.RegisterDurationSetting(
 	// - a per-store cache of recently detected abandoned transaction IDs
 	// - a per-range reverse index from transaction ID to locked keys
 	//
-	// TODO(nvanbenschoten): increasing this default value.
-	10*time.Millisecond,
+	// EDIT: The finalizedTxnCache gets us part of the way here. It allows us to
+	// pay the liveness push delay cost once per abandoned transaction per range
+	// instead of once per each of an abandoned transaction's locks. This helped
+	// us to feel comfortable increasing the default delay from the original
+	// 10ms to the current 50ms. Still, to feel comfortable increasing this
+	// further, we'd want to improve this cache (e.g. lifting it to the store
+	// level) to reduce the cost to once per abandoned transaction per store.
+	//
+	// TODO(nvanbenschoten): continue increasing this default value.
+	50*time.Millisecond,
 )
 
 // LockTableDeadlockDetectionPushDelay sets the delay before pushing in order to
```
