
storage, lock_manager: Use the new lock waiting queue instead of WaiterManager to handle pessimistic lock waking up #13447

Merged
merged 78 commits into tikv:master on Oct 26, 2022

Conversation

@MyonKeminta MyonKeminta commented Sep 9, 2022

What is changed and how it works?

Issue Number: ref #13298

What's Changed:

This PR refactors the implementation of lock waiting and waking up by introducing a new lock waiting queue, without changing the current lock-waiting behavior. It is part of the work of introducing the new lock-waiting model (#13298).

Note that this PR doesn't introduce any optimization to the lock-waiting model; it is a refactoring that lays the groundwork for that optimization.

This PR updates the write paths of acquiring and releasing locks to make use of the new `LockWaitQueues`. The important points are (a simplified sketch follows this list):

1. `WriteResultLockInfo` (returned by `AcquirePessimisticLock::process_write`) carries the request's parameters, so the request can be resumed in the future.
2. `WriteResultLockInfo` is converted into a `LockWaitContext` and a `LockWaitEntry`, which are then sent to both the `LockManager` and the new `LockWaitQueues`.
3. When a storage command releases locks, it returns the released locks to `Scheduler::process_write`, which then calls `on_release_locks` to pop lock-waiting entries from the queues and wake them up asynchronously (to avoid adding too much latency to the current command).
4. The `LockManager` (and its inner module `WaiterManager`) is no longer responsible for waking up waiters, but it keeps handling timeouts and performing deadlock detection. It gains a new `remove_lock_wait` method for removing a waiter from it.
5. Waiters in `WaiterManager` can now be uniquely identified by a `LockWaitToken`, and the data structure inside `WaiterManager` is changed accordingly. Access by lock hash and transaction ts is still needed to handle deadlock-detection results.
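
To make the flow above easier to follow, here is a minimal, self-contained sketch of the per-key queue idea. The types and method names (`LockWaitQueues`, `LockWaitEntry`, `LockWaitToken`, `push_lock_wait`, `pop_for_waking_up`, `remove_by_token`) are simplified stand-ins, not the exact API introduced by this PR.

```rust
// Illustrative sketch only; not TiKV's actual implementation.
use std::collections::{HashMap, VecDeque};

/// Uniquely identifies a lock-waiting request (point 5 above).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct LockWaitToken(u64);

/// What is remembered about a blocked pessimistic-lock request so that it
/// can be cancelled or (in the future) resumed (point 1 above).
#[derive(Debug)]
struct LockWaitEntry {
    token: LockWaitToken,
    start_ts: u64,
}

/// Per-key FIFO queues of waiters.
#[derive(Default)]
struct LockWaitQueues {
    queues: HashMap<Vec<u8>, VecDeque<LockWaitEntry>>,
}

impl LockWaitQueues {
    /// Called when a pessimistic-lock request meets a conflicting lock
    /// (point 2 above).
    fn push_lock_wait(&mut self, key: &[u8], entry: LockWaitEntry) {
        self.queues.entry(key.to_vec()).or_default().push_back(entry);
    }

    /// Called when a command releases the lock on `key`; the scheduler then
    /// wakes the popped waiter asynchronously (point 3 above).
    fn pop_for_waking_up(&mut self, key: &[u8]) -> Option<LockWaitEntry> {
        let queue = self.queues.get_mut(key)?;
        let entry = queue.pop_front();
        if queue.is_empty() {
            self.queues.remove(key);
        }
        entry
    }

    /// Called when a waiter is cancelled (e.g. on timeout reported by the
    /// waiter manager); mirrors the new `remove_lock_wait` idea (point 4).
    fn remove_by_token(&mut self, key: &[u8], token: LockWaitToken) -> Option<LockWaitEntry> {
        let queue = self.queues.get_mut(key)?;
        let pos = queue.iter().position(|e| e.token == token)?;
        queue.remove(pos)
    }
}

fn main() {
    let mut queues = LockWaitQueues::default();
    queues.push_lock_wait(b"k1", LockWaitEntry { token: LockWaitToken(1), start_ts: 10 });
    queues.push_lock_wait(b"k1", LockWaitEntry { token: LockWaitToken(2), start_ts: 11 });

    // Releasing the lock on k1 wakes up the first waiter.
    let woken = queues.pop_for_waking_up(b"k1").unwrap();
    assert_eq!(woken.token, LockWaitToken(1));

    // The second waiter times out and is removed by its token instead.
    let removed = queues.remove_by_token(b"k1", LockWaitToken(2)).unwrap();
    assert_eq!(removed.start_ts, 11);
}
```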

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests (WIP)

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

Side effects

  • Performance regression
    • Consumes more CPU
    • Consumes more MEM

Release note

None

ti-chi-bot commented Sep 9, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • cfzjywxk
  • sticnarf

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@MyonKeminta
Contributor Author

/release

@MyonKeminta
Contributor Author

/release

@MyonKeminta
Contributor Author

/release

@MyonKeminta
Contributor Author

/release

MyonKeminta commented Sep 19, 2022

Some test results:

Hot update

| branch | threads | qps | min (ms) | avg (ms) | p95 (ms) | max (ms) |
|--------|---------|--------|------|---------|----------|----------|
| this   | 1       | 410.96 | 5.00 | 7.30    | 8.58     | 213.08   |
| this   | 16      | 427.75 | 4.96 | 112.19  | 484.44   | 1258.74  |
| this   | 64      | 424.94 | 5.22 | 451.48  | 1561.52  | 2231.76  |
| this   | 128     | 426.06 | 4.83 | 899.94  | 3511.19  | 4320.65  |
| this   | 512     | 410.84 | 4.94 | 3715.17 | 16819.24 | 20016.79 |
| master | 1       | 402.02 | 5.45 | 7.46    | 8.74     | 37.34    |
| master | 16      | 401.51 | 5.51 | 119.52  | 475.79   | 919.10   |
| master | 64      | 405.23 | 4.80 | 473.39  | 1506.29  | 2981.45  |
| master | 128     | 414.36 | 5.05 | 925.11  | 3040.14  | 4031.46  |
| master | 512     | 412.20 | 4.62 | 3702.80 | 12163.09 | 19075.38 |

Sysbench common

| branch | scenario | threads | qps | avg (ms) | p99 (ms) | p999 (ms) |
|--------|----------------------|-----|-------|------|------|------|
| this   | oltp_read_write      | 200 | 82390 | 2.17 | 14.6 | 28.8 |
| master | oltp_read_write      | 200 | 83619 | 2.12 | 14.2 | 27.7 |
| this   | oltp_read_write      | 400 | 86140 | 4.09 | 27.8 | 48.8 |
| master | oltp_read_write      | 400 | 86207 | 4.03 | 27.3 | 47.4 |
| this   | oltp_read_write      | 800 | 84894 | 8.22 | 54.3 | 88.6 |
| master | oltp_read_write      | 800 | 84330 | 8.23 | 52.6 | 73.5 |
| this   | oltp_insert          | 200 | 46839 | 4.16 | 13.9 | 22.3 |
| master | oltp_insert          | 200 | 45583 | 4.28 | 14.2 | 21.8 |
| this   | oltp_insert          | 400 | 49513 | 7.91 | 25.6 | 38.2 |
| master | oltp_insert          | 400 | 47189 | 8.34 | 40.4 | 47.0 |
| this   | oltp_insert          | 800 | 53247 | 14.8 | 47.1 | 57.2 |
| master | oltp_insert          | 800 | 49555 | 15.9 | 57.1 | 80.3 |
| this   | oltp_update_non_index | 200 | 52848 | 3.67 | 12.3 | 26.1 |
| master | oltp_update_non_index | 200 | 52431 | 3.71 | 11.3 | 24.7 |
| this   | oltp_update_non_index | 400 | 62729 | 6.25 | 25.0 | 32.9 |
| master | oltp_update_non_index | 400 | 60274 | 6.51 | 26.3 | 34.6 |
| this   | oltp_update_non_index | 800 | 64708 | 12.2 | 42.9 | 64.4 |
| master | oltp_update_non_index | 800 | 65010 | 12.1 | 46.2 | 69.4 |

TPCC

| branch | threads | tps | qps | avg (ms) | p99 (ms) | p999 (ms) |
|--------|---------|------|-------|------|-----|-----|
| this   | 800     | 4686 | 72893 | 10.7 | 187 | 399 |
| master | 800     | 4322 | 67236 | 11.7 | 178 | 419 |

(WIP...)

```diff
     }

     fn is_empty(&self) -> bool {
-        self.wait_table.is_empty()
+        self.waiter_pool.is_empty()
     }

     /// Returns the duplicated `Waiter` if there is.
```
Contributor


The comment is stale. And the return value of this function is confusing now.


@sticnarf sticnarf left a comment


The rest looks okay to me.

As this is such a large pull request, I'm not confident that code review alone can find most of the problems. We probably need to rely on further integration tests for that.

@cfzjywxk cfzjywxk requested a review from you06 October 20, 2022 09:00
@ti-chi-bot ti-chi-bot added the status/LGT1 Status: PR - There is already 1 approval label Oct 21, 2022
@MyonKeminta
Contributor Author

@ekexium @you06 Please help review this PR if you have time, thanks! Let me know if you have any questions.


@cfzjywxk cfzjywxk left a comment


LGTM

```diff
@@ -60,13 +61,6 @@ lazy_static! {
         exponential_buckets(0.0001, 2.0, 20).unwrap() // 0.1ms ~ 104s
     )
     .unwrap();
-    pub static ref WAIT_TABLE_STATUS_GAUGE: WaitTableStatusGauge = register_static_int_gauge_vec!(
```
Collaborator


Do any related panels need to be removed accordingly?

Contributor Author


I didn't remove that panel; instead, I put LOCK_WAIT_QUEUE_ENTRIES_GAUGE_VEC into the same panel.

@ti-chi-bot ti-chi-bot added status/LGT2 Status: PR - There are already 2 approvals and removed status/LGT1 Status: PR - There is already 1 approval labels Oct 24, 2022
```diff
                 }
             }
             DetectType::CleanUpWaitFor => {
-                detect_table.clean_up_wait_for(txn_ts, lock.ts, lock.hash)
+                let wait_info = wait_info.unwrap();
+                detect_table.clean_up_wait_for(txn_ts, wait_info.lock_digest)
```
Contributor


How about merging them into one line?

```diff
@@ -12,6 +12,7 @@ make_auto_flush_static_metric! {
         detect,
         clean_up_wait_for,
         clean_up,
+        update_wait_for,
```
Contributor


Is it intentionally unused?

Contributor Author


Yes, it's currently unused, and so is the `update_wait_for` method of `LockManager`.

```diff
-    pub(crate) fn unlock_key(&mut self, key: Key, pessimistic: bool) -> Option<ReleasedLock> {
-        let released = ReleasedLock::new(&key, pessimistic);
+    /// Append a modify that unlocks the key. If the lock is removed due to
+    /// committing, a non-zero `commit_ts` need to be provided; otherwise if
```
Contributor


Suggested change:

```diff
-    /// committing, a non-zero `commit_ts` need to be provided; otherwise if
+    /// committing, a non-zero `commit_ts` needs to be provided; otherwise if
```

```rust
        // requests to detect deadlock, clean up its wait-for entries in the
        // deadlock detector.
        if is_pessimistic_txn && self.remove_from_detected(lock_ts) {
            self.detector_scheduler.clean_up(lock_ts);
```

@ekexium ekexium Oct 24, 2022


Is there an equivalent part of the clean up logic now?

Contributor Author


Here the old logic wakes up all waiters waiting for the lock with `lock_ts` on the specified key, and cleans up all edges that wait for `lock_ts` in the detector. In the new logic, waking up happens in the new lock waiting queue, and when a lock-waiting request finishes (either canceled, or resumed and successfully acquired, which is not yet supported), the `remove_lock_wait` function of `LockManager` is invoked, which then leads to a call to `clean_up_wait_for` (here).
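
A minimal sketch of the flow described above. The types and signatures here (`LockManager`, `DeadlockDetector`, `remove_lock_wait`, `clean_up_wait_for`) are simplified stand-ins, not TiKV's actual ones.

```rust
// Illustrative only: simplified stand-ins for the clean-up flow described above.
#[derive(Clone, Copy, Debug)]
struct LockWaitToken(u64);

struct DeadlockDetector;

impl DeadlockDetector {
    /// Remove the waiter's wait-for edge from the detector.
    fn clean_up_wait_for(&mut self, waiter_ts: u64, lock_digest: u64) {
        println!("detector: remove edge {} -> lock {:x}", waiter_ts, lock_digest);
    }
}

struct LockManager {
    detector: DeadlockDetector,
}

impl LockManager {
    /// Called when a lock-waiting request finishes (canceled, or resumed and
    /// successfully acquired). The LockManager no longer wakes anyone up; it
    /// only cleans up its own bookkeeping and the detector state.
    fn remove_lock_wait(&mut self, _token: LockWaitToken, waiter_ts: u64, lock_digest: u64) {
        // ... remove the waiter from the timeout-tracking structure ...
        self.detector.clean_up_wait_for(waiter_ts, lock_digest);
    }
}

fn main() {
    let mut lock_mgr = LockManager { detector: DeadlockDetector };
    // A waiter (token 42, start_ts 100) waiting on a lock with digest 0xabcd
    // has just been woken up by the lock wait queue and finished.
    lock_mgr.remove_lock_wait(LockWaitToken(42), 100, 0xabcd);
}
```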

```rust
            .remove("wake_up_delay_duration")
            .map(ReadableDuration::from)
        {
            info!(
```
Contributor


Why does only the `wake_up_delay_duration` config need to print a log?

Contributor Author


The logs about the two config items mentioned here were previously printed in waiter_manager.rs when handling the Task::ChangeConfig message. Now `wake_up_delay_duration` is handled elsewhere, so I print its log here. When `wait_for_lock_timeout` is changed, the log is still printed at the old place.
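
For illustration, the behavior described above might look roughly like the sketch below. The names (`apply_change`, the `HashMap<String, u64>` change set, the atomic holding the delay in milliseconds) are hypothetical, not TiKV's actual dynamic-config machinery.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical sketch: apply a dynamic config change and log only the item
/// that is handled at this place.
fn apply_change(change: &mut HashMap<String, u64>, wake_up_delay_ms: &AtomicU64) {
    // `wake_up_delay_duration` is handled (and therefore logged) here now,
    // because it is no longer processed by the waiter manager.
    if let Some(new_ms) = change.remove("wake_up_delay_duration") {
        let old_ms = wake_up_delay_ms.swap(new_ms, Ordering::SeqCst);
        println!(
            "lock_manager changed wake_up_delay_duration: {} ms -> {} ms",
            old_ms, new_ms
        );
    }
    // `wait_for_lock_timeout` is still handled by the waiter manager, so its
    // log is still printed at the old place, not here.
}

fn main() {
    let wake_up_delay_ms = AtomicU64::new(20);
    let mut change = HashMap::new();
    change.insert("wake_up_delay_duration".to_string(), 50);
    apply_change(&mut change, &wake_up_delay_ms);
    assert_eq!(wake_up_delay_ms.load(Ordering::SeqCst), 50);
}
```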

```rust
        }
        self.waiter_mgr_scheduler.wait_for(
```
Contributor


Is there any special meaning in moving `self.waiter_mgr_scheduler.wait_for` behind `self.detector_scheduler.detect`?

Contributor Author


No. It seems to be an accident. I'll change it back.

```diff
-    /// `Notify` consumes the `Waiter` to notify the corresponding transaction
-    /// going on.
-    fn notify(self) {
+    /// Consumes the `Waiter` to notify the corresponding transaction `going on.
```
Contributor


Suggested change:

```diff
-    /// Consumes the `Waiter` to notify the corresponding transaction `going on.
+    /// Consumes the `Waiter` to notify the corresponding transaction going on.
```

```rust
    }

    fn cancel_for_timeout(self, _skip_resolving_lock: bool) -> KeyLockWaitInfo {
        let lock_info = self.wait_info.lock_info.clone();
```
Contributor


Why do we need to clone lock_info?

Contributor Author


If we moved it out instead, we would not be able to call the other method `cancel` on `self`, since `self` would be partially moved. Also, the function needs to return a complete `KeyLockWaitInfo` to the caller.
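
A standalone illustration of the partial-move issue described above. The `Waiter` / `KeyLockWaitInfo` types here are simplified stand-ins, not TiKV's actual ones.

```rust
// Simplified stand-ins; not TiKV's actual `Waiter` / `KeyLockWaitInfo` types.
#[derive(Clone, Debug)]
struct LockInfo {
    key: Vec<u8>,
}

#[derive(Debug)]
struct KeyLockWaitInfo {
    lock_info: LockInfo,
}

struct Waiter {
    wait_info: KeyLockWaitInfo,
}

impl Waiter {
    fn cancel(self) {
        // Consumes the whole waiter to notify the waiting transaction.
        println!("cancelled waiter on key {:?}", self.wait_info.lock_info.key);
    }

    fn cancel_for_timeout(self) -> KeyLockWaitInfo {
        // If this were `let lock_info = self.wait_info.lock_info;`, `self`
        // would be partially moved and `self.cancel()` below would not
        // compile. Cloning keeps `self` intact and also lets us return a
        // complete `KeyLockWaitInfo` to the caller.
        let lock_info = self.wait_info.lock_info.clone();
        self.cancel();
        KeyLockWaitInfo { lock_info }
    }
}

fn main() {
    let waiter = Waiter {
        wait_info: KeyLockWaitInfo { lock_info: LockInfo { key: b"k1".to_vec() } },
    };
    let wait_info = waiter.cancel_for_timeout();
    println!("returned wait info for key {:?}", wait_info.lock_info.key);
}
```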

@cfzjywxk cfzjywxk requested a review from ekexium October 25, 2022 14:41

@ekexium ekexium left a comment


I don't see any other problems.

```diff
             detector_scheduler,
             default_wait_for_lock_timeout: cfg.wait_for_lock_timeout,
-            wake_up_delay_duration: cfg.wake_up_delay_duration,
+            // wake_up_delay_duration: cfg.wake_up_delay_duration,
```
Contributor


Shall we just remove it?

Contributor Author


Yes, I forgot to remove it.

@MyonKeminta
Contributor Author

/merge

@ti-chi-bot
Member

@MyonKeminta: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 77caef7

@ti-chi-bot ti-chi-bot added the status/can-merge Status: Can merge to base branch label Oct 26, 2022
@ti-chi-bot
Member

@MyonKeminta: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot merged commit a4dc37b into tikv:master Oct 26, 2022
@ti-chi-bot ti-chi-bot added this to the Pool milestone Oct 26, 2022
@MyonKeminta MyonKeminta deleted the m/new-lock-waiting-queue branch November 29, 2022 19:07