Release 2.25.0.0-b300: [#24478] DocDB: Resume contentious waiter on tserver's rpc threadpool · yugabyte/yugabyte-db

2.25.0.0-b300
710c9b1

basavaraj29 tagged this 13 Nov 20:15

Summary:
We resume waiters from the wait-queue serially. When resuming a waiter, we try to re-acquire the shared in-memory locks with a deadline of 100ms. If that fails, the intent was to schedule the resumption on the tserver's rpc threadpool (which is used to serve new incoming reads/writes) so as to introduce back-pressure on new incoming requests.
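
The intended resume path described above can be sketched in isolation. This is a minimal illustration, not the wait-queue code: `FakeRpcPool`, `ResumeOrDefer`, `RunDemo`, and the 20ms demo deadline are hypothetical names and values chosen for the example (the commit uses 100ms), and a plain `std::shared_timed_mutex` stands in for the shared in-memory locks. The deferred task here simply blocks on the lock, which is an assumption about how a pool thread would be occupied.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <shared_mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the rpc threadpool: it only records submitted
// tasks so that the caller decides when they run.
struct FakeRpcPool {
  std::vector<std::function<void()>> queued;
  void Submit(std::function<void()> task) { queued.push_back(std::move(task)); }
};

// Tries to take the shared lock within `deadline`. On success the waiter is
// resumed inline; on timeout the resumption is handed to `pool`, where the
// task blocks on the lock when it eventually runs.
bool ResumeOrDefer(std::shared_timed_mutex& locks, FakeRpcPool& pool,
                   std::function<void()> resume,
                   std::chrono::milliseconds deadline) {
  if (locks.try_lock_shared_for(deadline)) {
    resume();
    locks.unlock_shared();
    return true;
  }
  pool.Submit([&locks, resume = std::move(resume)] {
    std::shared_lock<std::shared_timed_mutex> guard(locks);
    resume();
  });
  return false;
}

struct DemoResult {
  bool inline_ok;  // uncontended waiter resumed inline
  bool deferred;   // contended waiter was handed to the pool
  int resumed;     // total number of resumed waiters
};

DemoResult RunDemo() {
  std::shared_timed_mutex locks;
  FakeRpcPool pool;
  int resumed = 0;
  auto resume = [&resumed] { ++resumed; };
  const std::chrono::milliseconds deadline(20);

  // Uncontended: the lock is free, so the waiter resumes inline.
  bool inline_ok = ResumeOrDefer(locks, pool, resume, deadline);

  // Contended: another thread holds the lock exclusively past the deadline.
  std::promise<void> held, release;
  std::thread holder([&] {
    locks.lock();
    held.set_value();
    release.get_future().wait();
    locks.unlock();
  });
  held.get_future().wait();
  bool deferred = !ResumeOrDefer(locks, pool, resume, deadline);
  release.set_value();
  holder.join();

  // Drain the fake pool; the deferred task can now take the lock.
  for (auto& task : pool.queued) task();
  return {inline_ok, deferred, resumed};
}
```

Calling `RunDemo()` exercises both paths: one waiter resumes inline, one is deferred and only completes once the pool runs it.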

In the existing code, we appear to have been scheduling contentious waiters on the wrong threadpool. In particular, we were doing:
```
messenger_->scheduler().Schedule([serial_waiter](const Status& s) { ... })
```
which uses the scheduler's `io_service_`, backed by a threadpool of size 4 (`FLAGS_io_thread_pool_size`). That is much smaller than the rpc threadpool `Messenger::normal_thread_pool_` (controlled by `FLAGS_rpc_workers_limit`, which defaults to 1024).

Using the io threadpool as opposed to the rpc threadpool could have the following consequences:
1. It would not produce the desired back-pressure on new incoming requests, as we aren't consuming any threads from the rpc threadpool.
2. If contention exists at various tablets, waiting requests from all of them get scheduled on the same io threadpool. Since the size of the io threadpool is limited to 4, at most 4 waiters can be scheduled for resumption concurrently. If contention then reduces, the requests scheduled on the io threadpool could face higher latencies than if they were resumed on the rpc threadpool (due to the higher capacity of the rpc threadpool).
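
The concurrency bound in point 2 can be reproduced with a toy pool. This is only a sketch under simplifying assumptions: `PeakConcurrency` is a hypothetical helper, the worker loop is not YugabyteDB's threadpool, and each "task" just stays in progress until released. The pool size of 4 mirrors the io threadpool size quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Runs `num_tasks` blocking tasks on `pool_size` worker threads and returns
// the peak number of tasks that were in progress at the same time.
int PeakConcurrency(int pool_size, int num_tasks) {
  std::mutex mu;
  std::condition_variable cv;
  int next_task = 0, active = 0, peak = 0;
  bool release = false;

  auto worker = [&] {
    std::unique_lock<std::mutex> lock(mu);
    while (next_task < num_tasks) {
      ++next_task;  // claim one task
      ++active;
      peak = std::max(peak, active);
      cv.notify_all();
      // The task stays in progress until the main thread releases it.
      cv.wait(lock, [&] { return release; });
      --active;
    }
  };

  std::vector<std::thread> workers;
  for (int i = 0; i < pool_size; ++i) workers.emplace_back(worker);

  {
    // Wait until every worker that can be busy is busy, then let tasks finish.
    std::unique_lock<std::mutex> lock(mu);
    const int expected = std::min(pool_size, num_tasks);
    cv.wait(lock, [&] { return active == expected; });
    release = true;
  }
  cv.notify_all();
  for (auto& t : workers) t.join();
  return peak;
}
```

With 16 pending tasks, `PeakConcurrency(4, 16)` never sees more than 4 in progress, while a pool as large as the task count lets all 16 run at once. The same reasoning applies to waiters queued behind a 4-thread pool versus a much larger one.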

This revision resolves the bug by scheduling contentious waiters on the tserver's rpc threadpool instead (the one used to serve regular writes).
Jira: DB-13389

Test Plan:
./yb_build.sh tsan --cxx-test='TEST_F(PgWaitQueueContentionStressTest, TestResumeWaitersOnRpcThreadpool) {' --test-args --vmodule=wait_queue=1 -n 10 --tp 1 -- -k

microbenchmark run - https://perf.dev.yugabyte.com/report/view/W3siaXNCYXNlbGluZSI6ZmFsc2UsIm5hbWUiOiJTZWxlY3RlZCBUZXN0LWlkIiwidGVzdF9pZCI6ODI3ODIwMn0seyJpc0Jhc2VsaW5lIjp0cnVlLCJuYW1lIjoiQmFzZWxpbmUiLCJ0ZXN0X2lkIjoiODI4MTQwMiJ9XQ==

Checked `yb_latency_histogram` in pg stats. For the most part, the tail latencies seem slightly better, but the benchmark itself has some variance, so I wouldn't dig into it much.

Reviewers: sergei, pjain, patnaik.balivada

Reviewed By: sergei

Subscribers: ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D39516
