Setting a timeout for BigQuery async_query doesn't work #4135
Comments
/cc @jonparrott
Just to clarify, the … You have to run … One reason for the behavior you are experiencing is that the …
I'm trying to provide backward compatibility for this feature: googleapis/python-bigquery-pandas#76. The goal is to prevent stalled jobs from hanging an entire script, and to have an elegant way to move on in the script without having to write our own timer function. If the actual query can be killed in the background, all the better, but at a bare minimum it should stop looking for results after a set amount of time and move on.

Unfortunately I can't replicate this with async_query, but I can with sync_query. As for seconds, I tried several more complex queries that definitely take more than 1 second to complete, and it still returned results for all of them. I'm OK with updating to seconds in my project, since I can't imagine many use cases needing sub-second timeouts, but if it is indeed supposed to work this way, the documentation here should be updated to reflect it: there is no mention of the timeout parameter being in seconds, and legacy users (or anyone perusing the docs) might mistakenly assume it is in milliseconds because of the timeoutMs and timeout_ms references.
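The "stop waiting and move on" behavior described above can be approximated client-side with the standard library alone. A minimal sketch — the helper name `run_with_deadline` and the use of a thread pool are my own for illustration, not part of the google-cloud-bigquery API:

```python
import concurrent.futures
import time

# Hypothetical helper (not part of google-cloud-bigquery): run a callable
# in a worker thread and give up waiting after a client-side deadline.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def run_with_deadline(fn, deadline_seconds, *args, **kwargs):
    future = _pool.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=deadline_seconds)
    except concurrent.futures.TimeoutError:
        # Move on; note this only abandons the wait -- the job may still
        # be running on the server side.
        return None

# Stand-ins for a fast and a slow query:
fast = run_with_deadline(lambda: "rows", deadline_seconds=5)
slow = run_with_deadline(lambda: time.sleep(2) or "rows", deadline_seconds=0.1)
```

As the comment notes, this only stops the script from waiting; cancelling the query itself would require a separate server-side call.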
I think I know what's going on. The call to result repeatedly calls the done method, but since done makes an API call that can block for up to 10 seconds, any timeout of less than 10 seconds won't work. @jonparrott Do you think we could add an optional timeout parameter to done? Or should we pass the timeout from result to done in some other way?
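A toy model of this failure mode, in pure Python, with a 0.5-second stand-in for the up-to-10-second blocking call:

```python
import time

def poll_until_done(is_done, timeout, poll_block=0.5):
    # The timeout is only checked *between* polls, and each poll can
    # block for up to poll_block seconds -- so any timeout shorter than
    # the blocking window can never fire before the first poll returns.
    deadline = time.monotonic() + timeout
    while True:
        time.sleep(poll_block)  # stands in for the blocking done() API call
        if is_done():
            return True
        if time.monotonic() > deadline:
            raise TimeoutError("query did not finish in time")

# The job finishes during the first (blocking) poll, so even a 1 ms
# timeout "succeeds" instead of timing out -- the reported symptom.
start = time.monotonic()
result = poll_until_done(lambda: True, timeout=0.001)
elapsed = time.monotonic() - start
```

The loop returns True after roughly 0.5 seconds even though the timeout was 1 millisecond, which mirrors why queries "complete regardless" in the original report.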
@tswast I don't think we should use connection/response timeouts to indicate that a job is incomplete. If we are going to poll indefinitely for the response, then blocking on the request is fine, but if the poll has a deadline, then the request to check whether a job is done should return as quickly as possible.
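One way to implement this suggestion is to cap each blocking status request by the time remaining in the overall deadline. A sketch under stated assumptions — `wait_for_done` and the `timeout=` parameter on the status check are hypothetical, not the library's actual signatures:

```python
import time

def wait_for_done(check_done, deadline_s, per_request_s=10.0):
    # Cap each blocking status request by the remaining budget, so the
    # overall wait honors deadlines shorter than the server-side
    # long-poll window (assumed to be per_request_s seconds).
    stop = time.monotonic() + deadline_s
    while True:
        remaining = stop - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("deadline exceeded")
        if check_done(timeout=min(per_request_s, remaining)):
            return True

# Stand-in for a job that never finishes: each status check blocks for
# exactly the timeout it is given, then reports "not done".
def never_done(timeout):
    time.sleep(timeout)
    return False

start = time.monotonic()
try:
    wait_for_done(never_done, deadline_s=0.2)
    timed_out = False
except TimeoutError:
    timed_out = True
elapsed = time.monotonic() - start
```

With a 0.2-second deadline, the wait times out in roughly 0.2 seconds instead of blocking for the full 10-second per-request window.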
Since the deadline is an integer in seconds, maybe it would make sense if …
This will be fixed in the next release after the …
The documentation and code appear to allow setting a maximum duration before timing out an async_query by passing timeout (an int for the number of milliseconds) to the result() function that is called on a query job (https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/bigquery/google/cloud/bigquery/job.py#L476 and https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/bigquery/google/cloud/bigquery/job.py#L1311). Similarly, there also appear to be some references to the fetch_data() function of a QueryResults object being able to accept a timeout parameter (https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/bigquery/google/cloud/bigquery/query.py#L390).

However, passing a timeout value of 1 or 0 in either fashion with an async_query does nothing to interrupt the query. Queries complete regardless and output results:
Output:
By contrast, setting a timeout value for a sync_query via the timeout_ms property of a QueryJob works as expected and raises the appropriate JobComplete = False flag:

Output:
It appears the timeout for async_queries has yet to be implemented, and it is unclear how to access a JobIncomplete property via an async_query, forcing users to continue using sync_queries if they want the ability to time out their queries.
cc: @tswast who requested I raise the issue which I identified in this pandas-gbq PR: googleapis/python-bigquery-pandas#25 (comment)
OSX
Python 2.7.13
google-cloud-python 0.27.0 and google-cloud-bigquery 0.26.0