Storage: Default timeout for requests breaks chunked downloads #18

Closed
bjoernpollex-sc opened this issue Jan 6, 2020 · 9 comments · Fixed by #45
Labels: api: storage · priority: p1 · 🚨 This issue needs some love. · type: bug

Comments

@bjoernpollex-sc

The default timeout introduced in googleapis/google-resumable-media-python#88 is causing crashes in our application. We use chunked downloads by setting chunk_size on the blob and then calling download_to_file. Our application is multi-threaded, and we download files into a custom stream backed by a ring buffer, so writes may block until space is available again. In some cases (I haven't figured out the pattern yet), our application hits the default timeout when fetching a new chunk of data inside AuthorizedSession.request. Currently, google.cloud.storage offers no way to use a custom transport (as suggested here), so that workaround is not applicable, and there is no way to override the timeout.
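To make the setup above concrete, here is a minimal sketch of a file-like sink whose write() blocks while its buffer is full. It is not the reporter's actual code: `BoundedStream`, its `capacity` argument, and the `drain` helper are hypothetical names, and a plain bounded bytearray stands in for a real ring buffer.

```python
import threading


class BoundedStream:
    """Hypothetical file-like sink; write() blocks while the buffer is full."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._buf = bytearray()
        self._cond = threading.Condition()
        self._closed = False

    def write(self, data):
        view = memoryview(data)
        with self._cond:
            while len(view) > 0:
                # Block until the reader has freed at least one byte.
                while len(self._buf) >= self._capacity:
                    self._cond.wait()
                free = self._capacity - len(self._buf)
                self._buf += view[:free]
                view = view[free:]
                self._cond.notify_all()
        return len(data)

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def read(self, n):
        with self._cond:
            # Block until data arrives or the writer closes the stream.
            while not self._buf and not self._closed:
                self._cond.wait()
            chunk = bytes(self._buf[:n])
            del self._buf[:n]
            self._cond.notify_all()
            return chunk


def drain(stream, out, step=3):
    """Consumer thread body: read until the stream is closed and empty."""
    while True:
        chunk = stream.read(step)
        if not chunk:
            break
        out.append(chunk)


stream = BoundedStream(capacity=4)
received = []
reader = threading.Thread(target=drain, args=(stream, received))
reader.start()
for chunk in (b"hello ", b"chunked ", b"world"):
    stream.write(chunk)  # each call stands in for one downloaded chunk
stream.close()
reader.join()
result = b"".join(received)
```

In the scenario described, an object like this would be the file object handed to download_to_file, so the time spent blocked in write() falls between consecutive chunk requests.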

I can't provide a simple reproduction here yet. I'm still investigating, but since it's an issue that only occurs sporadically, I'm not even sure that's possible. I'm wondering if in some scenarios Python's multi-threading just hits an unlucky timing, and the thread running the request doesn't get scheduled for longer than usual, making the observed request duration much higher. I'm not sure how to test this, though.

I'm posting here because the upstream change was made to fix googleapis/google-cloud-python#5909. I'm not sure what the proper fix would be.

@HemangChothani
Contributor

@crwilcox Should we expose transport_kwargs for blob uploads and downloads, to be passed along to transport.request?

@plamut
Contributor

plamut commented Jan 9, 2020

FWIW, there is a resumable media PR that I coincidentally opened yesterday which adds customizable timeouts. It's for uploads only, but the same changes can be added to downloads as well.

A generalization of this would be exposing all transport_kwargs as @HemangChothani already pointed out.

@bjoernpollex-sc
Author

@HemangChothani @crwilcox If there is a clear approach, I'm happy to support the implementation. Any thoughts on this?

@plamut
Contributor

plamut commented Jan 19, 2020

@bjoernpollex-sc Is the error you are seeing a Timeout error raised by the TimeoutGuard context manager? In other words, is the issue similar to https://github.com/googleapis/google-cloud-python/issues/10137 ?

If so, there is a pull request awaiting review that will address this. The default timeout will no longer be interpreted as the "total timeout", but rather just a timeout for the transport layer for each request (as implemented by the requests lib). The role of the "total timeout" will be taken over by a separate parameter.

@bjoernpollex-sc
Author

@plamut Yes, the timeout is raised by that context manager. It sounds like your PR might fix my issue as well, thank you very much! I am not entirely sure, but my guess is that the timeout in my case gets triggered by context switches between threads, since the timeout is measured against the wall clock and the timed operations are not atomic. I can try it out sometime this week.
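The guess above can be illustrated with a small sketch. `WallClockGuard` below is a hypothetical stand-in, not google-auth's TimeoutGuard implementation; it only assumes that the guard compares elapsed wall-clock time against a limit when the guarded block exits. The `stall` argument simulates the thread sitting unscheduled inside the guarded region.

```python
import time


class WallClockGuard:
    """Hypothetical total-timeout guard; not google-auth's actual code."""

    def __init__(self, timeout):
        self._timeout = timeout

    def __enter__(self):
        self._start = time.monotonic()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Only the elapsed wall-clock time is checked, not where it was spent.
        if exc_type is None:
            elapsed = time.monotonic() - self._start
            if elapsed > self._timeout:
                raise TimeoutError(
                    f"took {elapsed:.3f}s, limit {self._timeout}s"
                )
        return False


def guarded_step(timeout, stall):
    """Run a trivial 'request' while the thread stalls inside the guard."""
    try:
        with WallClockGuard(timeout):
            time.sleep(stall)  # simulates the thread not being scheduled
        return "ok"
    except TimeoutError:
        return "timeout"


fast = guarded_step(timeout=0.5, stall=0.0)
stalled = guarded_step(timeout=0.05, stall=0.2)
```

Under these assumptions the guarded work itself is identical in both calls; only the time the thread spends idle differs, yet the second call reports a timeout.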

@plamut
Contributor

plamut commented Jan 24, 2020

@bjoernpollex-sc Yesterday's google-auth release contains the change that separates transport timeouts from the total wall-clock timeout. The latter is None by default, meaning that TimeoutGuard should no longer raise unnecessary Timeout errors.

Until a new storage version is released, you can try manually pinning google-auth==1.11.0, which should help based on the information in this thread.

One note: the release also changed the default transport timeout to 120 seconds to prevent requests from hanging indefinitely. It's the same timeout that is already used for automatically refreshing credentials, but I'm mentioning it in case it plays a role in your setup with all the threads, etc.
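The split between the two timeouts can be sketched as a pure simulation. The function and both parameter names (`per_request_timeout`, `total_deadline`) are illustrative assumptions, not the google-auth signature; the only values taken from this thread are the 120-second transport default and the None default for the overall limit.

```python
def run_requests(latencies, per_request_timeout=120, total_deadline=None):
    """Simulate sequential requests.

    `latencies` are pretend transport times in seconds. Both keyword
    names are illustrative, not the real google-auth parameters.
    """
    elapsed = 0.0
    for latency in latencies:
        # Transport-level check: applies to each request on its own.
        if latency > per_request_timeout:
            raise TimeoutError("transport timeout on a single request")
        elapsed += latency
        # Optional overall check: skipped entirely when total_deadline is None.
        if total_deadline is not None and elapsed > total_deadline:
            raise TimeoutError("total deadline exceeded")
    return elapsed


# Three slow but individually acceptable requests pass with the defaults.
total = run_requests([100, 100, 100])
```

With the defaults, a long sequence of requests never fails on accumulated time alone; it fails only if a single request exceeds the transport timeout, or if the caller opts in to an overall deadline.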

@bjoernpollex-sc
Author

@plamut Yeah, I already got notified, thanks! I'll try out the new release on Monday, and report here if it resolves the issue.

@bjoernpollex-sc
Author

bjoernpollex-sc commented Jan 31, 2020

@plamut I updated to google-auth==1.11.0 and everything is working fine, thank you very much!

@plamut
Contributor

plamut commented Jan 31, 2020

@bjoernpollex-sc That's great to hear!

I will still keep the issue open for a little while, though, because the PR that bumps the google-auth version should be merged first.

plamut reopened this Jan 31, 2020
plamut assigned plamut and unassigned crwilcox Jan 31, 2020
crwilcox transferred this issue from googleapis/google-cloud-python Jan 31, 2020
product-auto-label bot added the api: storage label Jan 31, 2020
yoshi-automation added the 🚨 This issue needs some love. and triage me labels Feb 3, 2020
plamut added the priority: p1 and type: bug labels and removed the 🚨 This issue needs some love. and triage me labels Feb 4, 2020
yoshi-automation added the 🚨 This issue needs some love. label Feb 8, 2020

5 participants