Description
Jira Link: DB-17212
The catalog_manager has a number of thread pools of fixed size; for example:
~/code/yugabyte-db/src/yb/master/catalog_manager.cc:982:
CHECK_OK(ThreadPoolBuilder("CatalogManagerBGTasks").Build(&background_tasks_thread_pool_));
For fixed-size pools to be safe, a task running on one of them must not submit another task to the same pool and then block waiting for it.
As an example, the xCluster setup tasks currently run on the async_task_pool_ but (indirectly, by calling CreateTable) submit a table-creation task to that same thread pool and wait for it. If setup tasks occupy every available slot in the pool, the table-creation tasks never get a thread, so the setup tasks cannot succeed. (In practice, a timeout prevents a true deadlock.)
(For more info on this example, see #26617: DocDBDDL Replication create_xcluster_checkpoint cmd fails with Timed out waiting for Table Creation if there are multiple DBs)
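The starvation described above can be reproduced outside the catalog manager. The following is a minimal sketch, assuming a hypothetical `FixedPool` class (not YugabyteDB's `ThreadPool`) and a hypothetical helper `CountTimedOutSetups`: every "setup" task submits an inner task to its own pool and waits for it with a timeout.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical minimal fixed-size pool, for illustration only.
class FixedPool {
 public:
  explicit FixedPool(std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) workers_.emplace_back([this] { Run(); });
  }
  ~FixedPool() {
    { std::lock_guard<std::mutex> l(mu_); stop_ = true; }
    cv_.notify_all();
    for (auto& t : workers_) t.join();
  }
  void Submit(std::function<void()> f) {
    { std::lock_guard<std::mutex> l(mu_); queue_.push(std::move(f)); }
    cv_.notify_one();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> f;
      {
        std::unique_lock<std::mutex> l(mu_);
        cv_.wait(l, [this] { return stop_ || !queue_.empty(); });
        if (queue_.empty()) return;  // stop requested and queue drained
        f = std::move(queue_.front());
        queue_.pop();
      }
      f();
    }
  }
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> queue_;
  std::vector<std::thread> workers_;
  bool stop_ = false;
};

// Runs num_setups "setup" tasks; each submits an inner task to the SAME pool
// and waits for it. Returns how many setup tasks hit the timeout.
int CountTimedOutSetups(std::size_t pool_size, std::size_t num_setups,
                        std::chrono::milliseconds timeout) {
  assert(num_setups <= pool_size);  // the start barrier below relies on this
  FixedPool pool(pool_size);
  std::atomic<std::size_t> started{0};
  std::atomic<int> timed_out{0};
  std::vector<std::future<void>> done;
  for (std::size_t i = 0; i < num_setups; ++i) {
    auto outer_done = std::make_shared<std::promise<void>>();
    done.push_back(outer_done->get_future());
    pool.Submit([&pool, &started, &timed_out, num_setups, timeout, outer_done] {
      // Barrier: wait until every setup task holds a worker thread.
      started.fetch_add(1);
      while (started.load() < num_setups) std::this_thread::yield();
      auto inner_done = std::make_shared<std::promise<void>>();
      auto inner_future = inner_done->get_future();
      pool.Submit([inner_done] { inner_done->set_value(); });
      if (inner_future.wait_for(timeout) == std::future_status::timeout) {
        ++timed_out;  // no free worker ran the inner task in time
      }
      outer_done->set_value();
    });
  }
  for (auto& f : done) f.wait();
  return timed_out.load();
}
```

With two workers and two setup tasks, at least one setup task is guaranteed to time out, because no worker is free until some setup task gives up. With two workers and a single setup task, the inner task runs on the spare worker and nothing times out.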
As with ordinary lock deadlocks, this problem generalizes to cycles: a task on pool A submits and waits for a task on pool B, which submits and waits for a task on pool A, and so on.
One way to solve this problem is to introduce a "locking order": thread pools have a global order and a task on one thread pool can only submit tasks on thread pools later in the order.
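One way such an ordering could be enforced at runtime is sketched below. This is an assumption-laden illustration: the integer ranks, `tls_pool_rank`, `ScopedPoolRank`, and `MaySubmitAndWait` are all hypothetical names, and nothing like this exists in the catalog manager today as far as this issue states.

```cpp
#include <cassert>

// Rank of the pool whose worker is running on this thread (hypothetical);
// kNotOnPool means the thread is not a pool worker.
constexpr int kNotOnPool = -1;
thread_local int tls_pool_rank = kNotOnPool;

// RAII helper that a pool's worker loop would hold while running a task.
class ScopedPoolRank {
 public:
  explicit ScopedPoolRank(int rank) : saved_(tls_pool_rank) {
    assert(rank >= 0);
    tls_pool_rank = rank;
  }
  ~ScopedPoolRank() { tls_pool_rank = saved_; }
  ScopedPoolRank(const ScopedPoolRank&) = delete;
  ScopedPoolRank& operator=(const ScopedPoolRank&) = delete;

 private:
  int saved_;
};

// True if the current thread may submit (and wait on) a task for the pool
// with the given rank: only pools strictly later in the order are allowed,
// which also rules out submitting to the caller's own pool.
bool MaySubmitAndWait(int target_rank) { return tls_pool_rank < target_rank; }
```

A submit path could check `MaySubmitAndWait(target_rank)` in debug builds and fail fast, so an ordering violation shows up as an assertion rather than as a timeout under load.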
An alternative is to disallow tasks on these thread pools from submitting and waiting on tasks on any of the catalog_manager thread pools. This would require converting some tasks to be truly asynchronous; for example, the xCluster setup task would have to call a new asynchronous version of create table, passing it a callback that schedules a task to continue the original work.
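The callback-based shape could look roughly like the sketch below. It models a pool with a single slot as a plain FIFO queue so the behavior is deterministic; `OneSlotExecutor`, `CreateTableAsync`, and `RunSetup` are hypothetical names, and the table-creation step is a stand-in rather than the real CreateTable.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Models a pool with exactly one worker: tasks run one at a time, FIFO.
class OneSlotExecutor {
 public:
  void Submit(std::function<void()> task) { queue_.push_back(std::move(task)); }
  void RunUntilIdle() {
    while (!queue_.empty()) {
      auto task = std::move(queue_.front());
      queue_.pop_front();
      task();
    }
  }

 private:
  std::deque<std::function<void()>> queue_;
};

// Hypothetical asynchronous create-table: returns immediately, and schedules
// `done` as a new task on the pool once the (stand-in) creation has run.
void CreateTableAsync(OneSlotExecutor& pool, const std::string& name,
                      std::function<void(bool)> done) {
  assert(done != nullptr);
  pool.Submit([&pool, name, done = std::move(done)] {
    bool ok = !name.empty();  // stand-in for the real table creation
    pool.Submit([done, ok] { done(ok); });
  });
}

// The setup task never blocks: it starts the creation, releases its slot,
// and resumes in the callback.
std::vector<std::string> RunSetup() {
  OneSlotExecutor pool;
  std::vector<std::string> log;
  pool.Submit([&pool, &log] {
    log.push_back("setup: start");
    CreateTableAsync(pool, "t1", [&log](bool ok) {
      log.push_back(ok ? "setup: table created, continuing" : "setup: failed");
    });
    log.push_back("setup: slot released");
  });
  pool.RunUntilIdle();
  return log;
}
```

Even with a single slot, the setup work completes, because the setup task gives up its slot before the table-creation task needs one. The cost is that the setup logic has to be split at every point where it previously waited.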
A third alternative is to make some of the thread pools unlimited in size. This would need careful study to make sure it doesn't introduce too much parallelism or otherwise cause problems.
Issue Type
kind/bug
Warning: Please confirm that this issue does not contain any sensitive information
- I confirm this issue does not contain any sensitive information.