query causes bigrquery to fail, likely due to page_size memory issue #169
Comments
I had the same error, and changing the page_size to a smaller value solved my issue.
Changing the page_size is not helping me. I have 20,000 rows with 175 columns. It processes some megabytes and then shows:
Can someone please attempt to create a reprex using publicly available data? Unfortunately, the chances of me being able to locate and fix the underlying issue are slim.
Oh, this is actually a duplicate of #209.
#209 is actually an issue related to handling the response from the GBQ API. It is resolved (98dec6f) by outputting the "responseTooLarge" reason received from the API. This issue is related to a memory error on the local machine while attempting to parse the response received with the default page_size = 10,000. Whether a reprex reproduces it depends on available memory, but by increasing the page_size I was able to reproduce it with the following on a machine with 4 GB of RAM:
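Something along these lines (a sketch only, using the current bq_* interface; the billing project is a placeholder and the public table and page_size value are illustrative rather than the exact reprex):

```r
library(bigrquery)

billing <- "my-gcp-project"  # placeholder billing project

# A wide public table; SELECT * keeps each row large, so pages add up quickly.
sql <- "SELECT * FROM `bigquery-public-data.samples.natality` LIMIT 500000"
tb  <- bq_project_query(billing, sql)

# Increasing page_size makes each page of the response large enough to
# trigger the failure on a 4 GB machine (or a "responseTooLarge" reply).
df <- bq_table_download(tb, page_size = 100000)
```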
@ras44 thanks! I'll look into it, but it's unlikely I'll be able to do much more than give an informative error.
@hadley I attempted to reproduce this again today and it looks like it's not a memory issue: bigrquery is actually receiving a response from the GBQ API. With the results of #209 and 98dec6f, bigrquery now outputs the "Response too large" reason, which should help people understand that they need to reduce the page_size. I think we can close this one again if you agree.
It's also possible that you need to explicitly save to a temporary table: https://cloud.google.com/bigquery/docs/writing-results#large-results
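Roughly (a sketch, assuming the bq_* interface; the project, dataset, and table names are placeholders):

```r
library(bigrquery)

billing <- "my-gcp-project"                           # placeholder project
dest    <- bq_table(billing, "my_dataset", "results") # placeholder table
sql     <- "SELECT * FROM `bigquery-public-data.samples.natality` LIMIT 500000"

# Write the query results to an explicit destination table first,
# then download from that table in smaller pages.
tb <- bq_project_query(billing, sql, destination_table = dest)
df <- bq_table_download(tb, page_size = 5000)
```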
I can confirm that in both cases (when I create a temporary table or just run the query) I receive the "responseTooLarge" reason from the API. It's as if the API has an internal limit it uses to decide whether a response is too large. In either case, I think it's safe to say this is not an issue with R, and the response is now being handled appropriately, with the reason included in the output.
I have a table with 128 columns and about 10,000 rows in BigQuery. In RStudio running on a Google Cloud instance, I have been assigning the result of a query to a variable with:
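Roughly the following (a sketch; the project and table names here are placeholders for the real ones):

```r
library(bigrquery)

billing <- "my-gcp-project"                     # placeholder project
sql     <- "SELECT * FROM my_dataset.my_table"  # ~10,000 rows, 128 columns

tb     <- bq_project_query(billing, sql)
result <- bq_table_download(tb)  # default page_size of 10,000 rows
```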
This has worked for the past few weeks with no problems. Today, I received the error:
Solution:
If we change page_size to a smaller value (the default is 10,000), the query succeeds, presumably because the results are spread across more page requests and whatever is causing the internal error (possibly a memory issue?) is not triggered. I'm skeptical that it's a system memory issue, because I can continue to assign values to other variables in RStudio with no problems. It seems more likely that there is an issue with iterating through the pages of the response.
If this paging exception were caught, outputting a "decrease page_size" suggestion might help others.
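For example (the value is illustrative; anything small enough to keep each response under the limit works):

```r
# tb is the bq_table from the query above.
# Request smaller pages so each API response stays within the size limit.
result <- bq_table_download(tb, page_size = 2000)
```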