Unable to write into GCS bucket with papermill[gcs] #312
Do you know how the GCS rate limiting system works? We're emitting a save after each cell executes today. We could capture rate-limiting responses and try to respect them, but the number of saves here should be #cells + 2, which seems reasonable for most interfaces.
Looks like we are experiencing this: "There is no limit to how quickly you can create or update different objects in a bucket. However, a single particular object can only be updated or overwritten up to once per second. For example, if you have an object bar in bucket foo, then you should only upload a new copy of foo/bar about once per second. Updating the same object faster than once per second may result in 429 Too Many Requests errors."
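Respecting that limit client-side would mean pacing writes to the same object. As a hypothetical sketch (not papermill's actual code; `ThrottledSaver` and `write_fn` are invented names), spacing saves to any one object path at least a second apart could look like:

```python
import time

class ThrottledSaver:
    """Hypothetical helper: never rewrite the same object more than
    about once per second, per the GCS docs quoted above."""

    MIN_INTERVAL = 1.0  # seconds between writes to one object

    def __init__(self):
        self._last_write = {}  # object path -> timestamp of last write

    def save(self, path, data, write_fn):
        # write_fn is whatever callable performs the actual upload
        elapsed = time.time() - self._last_write.get(path, 0.0)
        if elapsed < self.MIN_INTERVAL:
            time.sleep(self.MIN_INTERVAL - elapsed)
        write_fn(path, data)
        self._last_write[path] = time.time()
```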
We can modify the client wrapper in papermill to retry with a backoff on 429. It sounds like that would resolve this issue?
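A minimal sketch of that idea, assuming the wrapper surfaces 429s as an exception (`RateLimitError` and `do_write` are placeholders, not papermill's real names):

```python
import random
import time

class RateLimitError(Exception):
    """Placeholder for however the client surfaces an HTTP 429."""

def write_with_backoff(do_write, max_retries=5, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return do_write()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the last attempt
            # Exponential backoff with jitter: ~1s, 2s, 4s, ...
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```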
Going to release 0.18.1 with the fix. Thanks for getting the issue resolved.
This issue seems to be happening again with:
Is this with the latest papermill release (1.1.0) or an earlier one?
Yes, with papermill==1.1.0. I haven't tried other papermill versions.
OK, thanks for the heads up. If no one else gets to it, I can look at it this weekend. 1.1.0 has another minor bug that needs addressing anyway.
I tried to reproduce with the same versions, and I see that in gcsfs 0.3.0 Google Cloud responds with a 429 first, then a 410 error.

In gcsfs 0.2.3 I see GCSFS sending:

https://www.googleapis.com:443 "POST /upload/storage/v1/b/dpe-sandbox/o?uploadType=multipart HTTP/1.1" 429 463

In 0.3.0:

Related Google Issue Tracker entry: https://b.corp.google.com/issues/137168102
Thanks for helping to look into it @gogasca !
FYI @MichelleUfford was taking a look at this one. I got my gcsfs setup running on this computer to test once there's a fix. |
So neither I nor @MichelleUfford can reproduce the issue. Based on the changes in
I believe this is now fixed in 1.2.0, but I was unable to reproduce the issue to prove it. Can one of the reporters of the problem test with the latest papermill version and confirm if this issue can be closed again?
I am facing this issue with version 1.2.0: papermill is unable to write the output to GCS.
@abdsamad1 Could you open a new issue with details for your failed request (as much as you can share)? Details like the notebook, the rate of cell execution, the stack trace, consistency of failure (happens sometimes, every time, on Tuesdays), whether the failure occurs across buckets or only on a specific key, etc.
For the record, I've heard from Google support about this. To quote:

As of right now, the issue is a bug and not a customer issue, and while a fix is on the way, there is a workaround that can be done on the customer's side. The official workaround to circumvent 5xx and 410 errors is to implement retries, as was indicated in this comment from an Issue Tracker entry you have commented on yourself (see https://issuetracker.google.com/issues/137168102#comment2). The retry method recommendation can also be seen here (https://issuetracker.google.com/issues/35903805#comment2).

To retry successfully, catching 500 and 410 errors is required and, as the official documentation recommends (https://cloud.google.com/storage/docs/json_api/v1/status-codes#410_Gone), implementing a retry by starting a new session for the upload that received an unsuccessful status code but still needs uploading. The new session creation may be what was missing on your end, causing retries to be unsuccessful as you mentioned previously. Additionally, exponential backoffs recommended in comments (see https://issuetracker.google.com/35903805#comment2) are the way to go to mitigate the issue (see https://cloud.google.com/storage/docs/exponential-backoff).
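Translated into code, that workaround might look like the sketch below; `start_new_upload` is a stand-in for whatever opens a fresh upload session in your client, not a real gcsfs API:

```python
import time

RETRYABLE = {410, 500, 502, 503}  # statuses the support answer says to retry

def upload_with_new_session(start_new_upload, max_retries=5):
    delay = 1.0
    for _ in range(max_retries):
        # Each attempt opens a *new* session rather than resuming the
        # failed one, as the 410 documentation recommends.
        status = start_new_upload()  # returns the attempt's HTTP status
        if status not in RETRYABLE:
            return status
        time.sleep(delay)
        delay *= 2  # exponential backoff, per the linked GCS docs
    raise RuntimeError(f"upload still failing after {max_retries} new sessions")
```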
Thanks for the link @j256 ! We do have retries and exponential backoff on writes, but it sounds like that's not always sufficient either. Looking forward to the API finally getting fixed.
Hi all,
according to the output, it seems that papermill retries when the "rate of changes exceeds" error occurs. But if I download the notebook from the bucket and try to open it locally in Jupyter, I am NOT ABLE to open it, so I think the notebook in Cloud Storage is not being saved correctly by papermill.
The error I get is:
I have not hit such an error, but I don't consistently use gcsfs, so you may need to open an issue on the gcsfs extension. That being said, some things to check are:
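As one check along those lines, a sketch (project and bucket names are placeholders) that tests whether gcsfs alone can write to the bucket, isolating the problem from papermill:

```python
import gcsfs

# Placeholder project/bucket; adjust to your environment.
fs = gcsfs.GCSFileSystem(project="my-project")
with fs.open("my-bucket/papermill-test.txt", "w") as f:
    f.write("hello from gcsfs")
print(fs.cat("my-bucket/papermill-test.txt"))  # should echo the contents back
```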
When running a GCSFS application via papermill[gcs], I'm getting Error: HTTP 429 Rate exceeds.

It works if the output notebook is written locally; the local file size is 57K.

GCSFS reference: fsspec/gcsfs#130
How to reproduce?
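A minimal reproduction along these lines (the bucket name is a placeholder; any small notebook will do):

```python
import papermill as pm

# papermill[gcs] routes gs:// output paths through gcsfs.
pm.execute_notebook(
    "input.ipynb",
    "gs://my-bucket/output.ipynb",
)
```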
Logs:
I already defined:
In a macOS environment I get similar errors (debugging added):