Summary:
Our codebase currently suffers from technical debt related to long-running blocking calls. A particularly problematic manifestation of this occurs when these blocking calls occupy threads that are also required to process the responses for those same calls.
**Example Scenario with YSQL Connections:**
Consider a situation where there are more YSQL connections than available RPC workers. If all workers simultaneously block while fetching new table information after a schema change, the system can deadlock: the thread pool that must process the table information responses is the same one blocked by the initial requests. As a result, the entire tserver becomes unresponsive.
**Solution Implemented:**
We addressed this issue by redesigning the table information fetching mechanism in PgClient to use an asynchronous approach, thereby preventing thread pool exhaustion.
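The idea can be illustrated with a minimal, self-contained sketch. It is not the PgClient code: `WorkQueue`, `FetchTableInfoAsync`, and `HandleRequests` are hypothetical names, and the single-threaded queue only models one RPC worker. Because no handler waits for its response, the same worker is free to run the response callbacks.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

using Task = std::function<void()>;

// Hypothetical model of a pool with a single worker: tasks run one at a time.
struct WorkQueue {
  std::deque<Task> tasks;

  void Submit(Task task) { tasks.push_back(std::move(task)); }

  // Runs queued tasks, including ones submitted while running, until none remain.
  void RunUntilIdle() {
    while (!tasks.empty()) {
      Task task = std::move(tasks.front());
      tasks.pop_front();
      task();
    }
  }
};

// Hypothetical async fetch: the response is processed later on the same queue,
// and the caller is notified through a callback instead of blocking.
void FetchTableInfoAsync(WorkQueue& queue, std::function<void(int)> callback) {
  assert(callback);
  queue.Submit([callback = std::move(callback)] {
    callback(/* schema_version = */ 2);
  });
}

// Each request handler starts the fetch and returns immediately, so the
// single worker stays available to process the responses.
int HandleRequests(int num_requests) {
  WorkQueue queue;
  int completed = 0;
  for (int i = 0; i < num_requests; ++i) {
    queue.Submit([&queue, &completed] {
      FetchTableInfoAsync(queue, [&completed](int schema_version) {
        if (schema_version == 2) ++completed;
      });
    });
  }
  queue.RunUntilIdle();
  return completed;
}
```

If the handlers instead waited synchronously for the response, the single worker in this model would never get to run the response task.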
**Related Issue with Transaction Status:**
A similar problem can arise when read queries check transaction status. In this case, all worker threads might become blocked while waiting for transaction status responses. Crucially, these responses need to be processed by the same thread pool that's currently being blocked by the requests.
**Solution Implemented:**
To resolve this, we used an existing high-priority thread pool for handling transaction status responses. This separation ensures that transaction status queries can be processed even when the main worker pool is under heavy load.
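A rough sketch of why the separation helps, under stated assumptions: `SimplePool` and `RunBlockingReads` are hypothetical stand-ins, not the tserver's thread pools. Workers still block on the status response, but the response is completed on a dedicated pool, so progress does not depend on a free worker.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal fixed-size pool; the destructor drains the queue and joins.
class SimplePool {
 public:
  explicit SimplePool(std::size_t num_threads) {
    for (std::size_t i = 0; i < num_threads; ++i) {
      threads_.emplace_back([this] { Run(); });
    }
  }
  ~SimplePool() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cond_.notify_all();
    for (auto& thread : threads_) thread.join();
  }
  void Submit(std::function<void()> task) {
    assert(task);
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    cond_.notify_one();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // Stopping and nothing left to run.
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }
  std::mutex mutex_;
  std::condition_variable cond_;
  std::queue<std::function<void()>> tasks_;
  bool stopping_ = false;
  std::vector<std::thread> threads_;  // Declared last: starts after other members.
};

// Each read blocks a worker until its (simulated) transaction status response
// is handled. The response runs on a separate pool, so it cannot be starved
// by blocked workers. Returns the number of reads that completed.
int RunBlockingReads(std::size_t num_workers, int num_reads) {
  std::atomic<int> completed{0};
  SimplePool response_pool(1);  // Stand-in for the high-priority pool.
  SimplePool worker_pool(num_workers);
  std::vector<std::future<void>> done;
  for (int i = 0; i < num_reads; ++i) {
    auto finished = std::make_shared<std::promise<void>>();
    done.push_back(finished->get_future());
    worker_pool.Submit([&response_pool, &completed, finished] {
      auto status = std::make_shared<std::promise<bool>>();
      std::future<bool> status_future = status->get_future();
      response_pool.Submit([status] { status->set_value(true); });
      if (status_future.get()) ++completed;  // Worker blocks here.
      finished->set_value();
    });
  }
  for (auto& future : done) future.wait();
  return completed.load();
}
```

Submitting the response task to `worker_pool` instead would hang in this model as soon as every worker is waiting in `status_future.get()`.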
Jira: DB-16330
Test Plan: PgSingleTServerTest.ManyYsqlConnections
Reviewers: rthallam, dmitry
Reviewed By: rthallam, dmitry
Subscribers: yql, ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D43500