-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Async reads for JDBC #13196
Async reads for JDBC #13196
Conversation
Prevents JDBC timeouts on long queries by returning empty batches when a batch fetch takes too long. Uses an async model to run the result fetch concurrently with JDBC requests.
I always thought the HTTP timeout is a client behavior, and didn't know that it can be controlled by the server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM with some minor comments
sql/src/main/java/org/apache/druid/sql/avatica/DruidJdbcResultSet.java
Outdated
Show resolved
Hide resolved
sql/src/main/java/org/apache/druid/sql/avatica/DruidJdbcResultSet.java
Outdated
Show resolved
Hide resolved
sql/src/main/java/org/apache/druid/sql/avatica/DruidJdbcResultSet.java
Outdated
Show resolved
Hide resolved
sql/src/main/java/org/apache/druid/sql/avatica/DruidJdbcResultSet.java
Outdated
Show resolved
Hide resolved
@FrankChen021, you get the Fastest Review Ever award! Thanks!
You are right: the client (and proxies and Router) all determine timeout: the shortest timeout wins. The challenge is when that HTTP timeout is shorter than the amount of time Druid needs to compute the next batch of results. In a regular REST API via the The challenge is with the occasional "BI" style query that, for whatever reason, returns many rows, or takes a long time to compute. Maybe you've got billions of rows and need to find the one event, anywhere in that set, that represents access to some resource. There is nothing for it but to scan all the data, and that might take a while depending on the filter used. Or, maybe someone wants to grab several million rows of data to feed into an ML model. And so on. Normally you would not want to run such a query on Druid: that;'s not what Druid is designed for. But, sometimes you just gotta do what you gotta do. In this case, if your query takes a minute to run, and one of the network elements in the chain has a 30 second timeout, you can't successfully run the query with request response. But, JDBC is different. It works by sending multiple requests for each query, each returning some number of rows, generally in the thousands. The client keeps asking for more batches until the server returns EOF. So, if we had a long running query, but each batch of records could be computed in less time than the HTTP timeout, the query would succeed. Most of the time in a big query, however, is the work before we get the first row. So, we want to generalize the above idea. JDBC allows us to return a batch of 0 rows. If so, the client just turns around and requests another batch until it gets rows or is told that an EOF occurred. Just to be clear, this polling is a feature of the Avatica client. The application just iterates though a That's what the async thing in this PR does for us. We tell Druid to go get a batch of rows. If this takes longer than the fetch timeout, we return an empty batch, and remember that Druid is busily working away to get us a batch. Maybe it will be ready next time. If, so, we return it. If not, we again return an empty batch and the whole thing repeats. For this to work, the fetch timeout has to be less than the HTTP timeout. Given jitter, we might use the Nyquist rule and set the fetch timeout to half of the HTTP timeout. It doesn't have to be exact, any value less than 1/2 the network timeout should be fine. Voila! We've decoupled query run time from network timeouts by running the query (or at least each batch fetch) asynchronously with the Avatica REST HTTP requests. So, you see, we don't set the HTTP timeout, we just work around it by always returning within the timeout, knowing that the client will do the right thing if we return 0 rows to stay within the timeout. |
I get it, it's a mechanism provided by the client. No wonder I didn't see such behaviour in other HTTP-based JDBC clients. |
I was thinking what's the proper tag to label this PR. We have a label And What do you think @paul-rogers @kfaraz |
Fixed race condition in Druid's Avatica server-side handler Fixed issue with no-user connections
@FrankChen021, thanks for taking a look at the PR and for working out the proper tag. You suggest:
When used for issues, it is hard for a user to know that an issue is in JDBC vs. the other areas you mention. Us developers know, but JDBC is a rather small area. Maybe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
Prevents JDBC timeouts on long queries by returning empty batches when a batch fetch takes too long. Uses an async model to run the result fetch concurrently with JDBC requests.
Release Notes
Druid's Avatica-based JDBC handler is not the preferred way to run long-running queries in Druid.
Druid supports the Avatica JDBC driver. Avatica uses HTTP requests to communicate with the server. When using JDBC with long-running queries, the HTTP request can time out, producing an error. This PR uses an internal feature of JDBC to avoid timeouts by returning "empty batches" of rows when Druid takes too long to return the actual rows. The JDBC client automatically requests more rows, resulting in Druid queries running asynchronously with JDBC requests. The result is that JDBC queries no longer time out.
This feature is enabled by default with a timeout of 5 seconds. You can modify the time out by changing the
druid.sql.avatica.fetchTimeoutMs
property to the new timeout. Specify the timeout in milliseconds. Druid enforces a minimum of 1000 milliseconds to prevent hammering of the Broker.Description
Druid's Avatica handler implementation already uses async execution to open and close the "yielder" for a query. This PR extends the idea by using the same executor for fetches. The fetch state (yielder, fetch offset) resides in a new
ResultFetcher
class that is invoked on each request to get results.The async logic is simple:
ExecutorService
which returns a future.The close operation is modified to wait for any in-flight fetch to return before shutting down the result set. A
ResultFetcherFactory
creates the fetcher for each query. The factory is needed only to allow introducing artificial delays for testing.The Avatica handler tests are cleaned up to eliminate some redundancy. This then allowed tests for async to be created with less copy/paste than with the existing code.
This PR has: