Use Readers-Writer lock for resource group #17131

Draft · wants to merge 1 commit into master

Conversation


@abhiseksaikia (Contributor) commented on Dec 22, 2021

There are endpoints like /v1/queryState that take the resource group lock while fetching pathToRoot. This leads to lock contention when a few thousand queries are queued on a cluster. We can use a readers-writer lock to reduce this contention.
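
To illustrate the intent, here is a minimal sketch, not the PR's actual diff (in the real change the root group appears to own the lock, per the hunk below; the `ResourceGroup` shape, `parent` field, and `enqueueQuery` stub here are assumed for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch only: per-group lock used here for brevity.
public class ResourceGroup
{
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final ResourceGroup parent; // null for the root group

    public ResourceGroup(ResourceGroup parent)
    {
        this.parent = parent;
    }

    // Read-only traversal: many concurrent /v1/queryState requests can hold
    // the read lock at once instead of serializing on a single exclusive lock.
    public List<ResourceGroup> pathToRoot()
    {
        lock.readLock().lock();
        try {
            List<ResourceGroup> path = new ArrayList<>();
            for (ResourceGroup group = this; group != null; group = group.parent) {
                path.add(group);
            }
            return path;
        }
        finally {
            lock.readLock().unlock();
        }
    }

    // Mutations still take the exclusive write lock.
    public void enqueueQuery(Runnable query)
    {
        lock.writeLock().lock();
        try {
            // ... add the query to this group's queue ...
        }
        finally {
            lock.writeLock().unlock();
        }
    }
}
```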

Test plan:

  1. Run existing unit test
  2. Perform verifier run
  3. Shadow testing
== NO RELEASE NOTE ==

@abhiseksaikia abhiseksaikia marked this pull request as draft December 22, 2021 18:27
@abhiseksaikia abhiseksaikia changed the title Use Readers-Writer lock for resource group [WIP] Use Readers-Writer lock for resource group Dec 22, 2021
@abhiseksaikia abhiseksaikia changed the title [WIP] Use Readers-Writer lock for resource group Use Readers-Writer lock for resource group Dec 22, 2021
@abhiseksaikia abhiseksaikia marked this pull request as ready for review December 22, 2021 18:46
@swapsmagic (Contributor) left a comment


Adding some unit tests to make sure the methods fail when the calling thread has not acquired the lock beforehand would be great (i.e., around internalStartNext, startInBackground, enqueueQuery, etc.).

Also, for the shadow run, what exactly did we test? The test should cover a cluster with ~2K queries on it (running + queued); we should check that the adjusted queue size metric is still publishing data and not timing out.
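
One way such a guard-plus-test could look is sketched below (not the PR's code; the guard pattern, the checkState call, and TestNG usage are assumptions for illustration):

```java
import static com.google.common.base.Preconditions.checkState;

import java.util.concurrent.locks.ReentrantReadWriteLock;
import org.testng.annotations.Test;

public class TestResourceGroupLockGuards
{
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Hypothetical guard: methods that mutate state assert the write lock first.
    private void internalStartNext()
    {
        checkState(lock.isWriteLockedByCurrentThread(), "write lock must be held");
        // ... start the next queued query ...
    }

    @Test(expectedExceptions = IllegalStateException.class)
    public void testInternalStartNextFailsWithoutLock()
    {
        // No lock acquired, so the guard should throw IllegalStateException.
        internalStartNext();
    }
}
```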

The following inline review thread is attached to this hunk of the diff:

```java
}
else {
    id = new ResourceGroupId(name);
    root = this;
    this.lock = new ReentrantReadWriteLock();
```
A contributor commented:

Since the default acquisition policy is non-fair, where the order of entry to the read and write locks is unspecified, it makes sense to use the fair policy instead of the default.
https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/locks/ReentrantReadWriteLock.html

> Acquisition order
> This class does not impose a reader or writer preference ordering for lock access. However, it does support an optional fairness policy.
>
> Non-fair mode (default)
> When constructed as non-fair (the default), the order of entry to the read and write lock is unspecified, subject to reentrancy constraints. A nonfair lock that is continuously contended may indefinitely postpone one or more reader or writer threads, but will normally have higher throughput than a fair lock.
>
> Fair mode
> When constructed as fair, threads contend for entry using an approximately arrival-order policy. When the currently held lock is released, either the longest-waiting single writer thread will be assigned the write lock, or if there is a group of reader threads waiting longer than all waiting writer threads, that group will be assigned the read lock.
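
For reference, fairness is chosen at construction time; a minimal sketch of the two modes (the demo class is illustrative only, not from the PR):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockFairnessDemo
{
    public static void main(String[] args)
    {
        // Non-fair (default): higher throughput, but a continuously contended
        // lock may indefinitely postpone some readers or writers.
        ReentrantReadWriteLock nonFair = new ReentrantReadWriteLock();

        // Fair: approximately arrival-order entry; avoids starvation at some
        // cost in throughput.
        ReentrantReadWriteLock fair = new ReentrantReadWriteLock(true);

        System.out.println(nonFair.isFair()); // false
        System.out.println(fair.isFair());    // true
    }
}
```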

@abhiseksaikia (Author) replied, quoting the review comment:

> Adding some unit tests to make sure the methods fail when the calling thread has not acquired the lock beforehand would be great (i.e., around internalStartNext, startInBackground, enqueueQuery, etc.).
>
> Also, for the shadow run, what exactly did we test? The test should cover a cluster with ~2K queries on it (running + queued); we should check that the adjusted queue size metric is still publishing data and not timing out.

My initial shadow run was mainly to make sure no deadlocks were introduced. I will do more testing with both fair and non-fair modes. I was a bit worried about using fair mode because of its impact on throughput, but I will do more testing to verify this.

@abhiseksaikia (Author) commented:

Based on testing, and after digging deeper into the code, a readers-writer lock is not going to help much here: when query throughput is high, we take the write lock far more often than the read lock. Also, based on testing, metrics (not just adjusted queue size, but all other metrics) stop being published when CPU utilization stays at ~83% for a long time, so the issue does not appear to be lock contention. I have converted this PR to a draft for future reference, in case we want to optimize locking further (places where we can downgrade from a write lock to a read lock).
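
For future reference, ReentrantReadWriteLock supports downgrading (write to read, never the reverse). A minimal sketch of the pattern described in the class Javadoc (the demo class and recompute helper are illustrative only):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class DowngradeDemo
{
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private Object cachedState;

    public Object updateThenRead()
    {
        lock.writeLock().lock();
        try {
            cachedState = recompute(); // mutate under the write lock
            lock.readLock().lock();    // downgrade: acquire the read lock...
        }
        finally {
            lock.writeLock().unlock(); // ...then release the write lock
        }
        try {
            return cachedState;        // continue reading under the read lock
        }
        finally {
            lock.readLock().unlock();
        }
    }

    private Object recompute()
    {
        return new Object(); // placeholder
    }
}
```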
