External Storage: Google Drive: 403 User Rate Limit Exceeded #20481

Closed
swurzinger opened this Issue Nov 12, 2015 · 41 comments


swurzinger commented Nov 12, 2015

Steps to reproduce

  1. Set up Google Drive as external storage. In my case I was accessing it via a shared (sub)folder.
  2. Upload a lot of files to that folder, e.g. using the ownCloud client (downloading probably triggers it as well).
  3. Errors occur once more than roughly 10 requests/second are made.

Expected behaviour

Files should be uploaded without errors.

Actual behaviour

Error: (403) User Rate Limit Exceeded

Google Drive limits the number of requests per second; according to Google's API documentation and the (maximum) value set in the Google Developer Console, that limit is 10 requests per second.

According to Google's documentation, an application should implement exponential backoff when it receives that error, see https://developers.google.com/drive/web/handle-errors#implementing_exponential_backoff

Although ownCloud/Google returns a 403 error, the upload sometimes succeeds anyway, see also http://stackoverflow.com/questions/18578768/403-rate-limit-on-insert-sometimes-succeeds

In the end the ownCloud client uploaded all my files successfully; I assume it retried the ones that failed initially.

Server configuration

Operating system:
Linux af91f 2.6.32-504.8.1.el6.x86_64
Web server:
Apache (unknown version; shared hosting provider)
Database:
5.5.46 - MySQL Community Server
PHP version:
PHP Version 5.4.45
ownCloud version: (see ownCloud admin page)
ownCloud 8.2.0 (stable)
Updated from an older ownCloud or fresh install:
fresh install
List of activated apps:
default + external storage

The content of config/config.php:
config.txt

Are you using external storage, if yes which one: local/smb/sftp/...
Yes, Google Drive.

Are you using encryption: yes/no
No

Are you using an external user-backend, if yes which one: LDAP/ActiveDirectory/Webdav/...
No

Client configuration

Browser:
irrelevant

Operating system:
irrelevant; in my case it was a Windows XP Virtual Machine

Logs

Web server error log

empty

ownCloud log (data/owncloud.log)

owncloud.txt

Browser log

irrelevant


Member

PVince81 commented Nov 16, 2015

The only way to fix this is to reduce the number of API calls.
Currently it is likely that too many calls are made.

Such repeated calls could be buffered/prevented by using a local stat cache, similar to #7897
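
A minimal sketch of what such a stat cache could look like, assuming a hypothetical wrapper around the Drive storage's metadata lookups (the class, method and callback names are illustrative, not the actual ownCloud API):

```php
<?php

/**
 * Illustrative per-request stat cache: metadata that was already fetched is
 * remembered, so repeated lookups for the same path do not trigger another
 * Drive API call. All names here are hypothetical.
 */
class CachedDriveStat
{
    /** @var array path => metadata (false means "not found") */
    private $statCache = [];

    /** @var callable the real lookup, e.g. a closure around the Drive API call */
    private $fetchFile;

    public function __construct(callable $fetchFile)
    {
        $this->fetchFile = $fetchFile;
    }

    public function stat($path)
    {
        if (!array_key_exists($path, $this->statCache)) {
            // Only the first lookup for a path hits the Drive API.
            $this->statCache[$path] = call_user_func($this->fetchFile, $path);
        }
        return $this->statCache[$path];
    }

    public function invalidate($path)
    {
        // Must be called after writes/renames so stale entries are not served.
        unset($this->statCache[$path]);
    }
}
```

Listing a folder with 100 entries would then cost one API call for the listing plus at most one call per file, instead of one call for every repeated stat of the same file.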


swurzinger commented Nov 16, 2015

I really like the idea of buffering the API requests. That would both improve speed and reduce the number of requests.

Another way to deal with that behavior is to retry the request when it fails with that error, as Google suggests. The question is whether that's the task of files_external or of the client. For more abstract things I'd say this is the task of the client; for internal things I'd say this can be the task of files_external.



Member

PVince81 commented Dec 23, 2015

A bit of research shows that the AWS library and others refer to a Guzzle plugin called "BackoffStrategy".

That might be it: https://github.com/Guzzle3/plugin-backoff/blob/master/CurlBackoffStrategy.php
It seems to react to certain error codes and then retry after a delay.
That plugin isn't available in ownCloud's 3rdparty libs, so it might need to be included.
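
For reference, wiring that plugin into a Guzzle 3 client would look roughly like this (a sketch only; it assumes the Guzzle 3 `BackoffPlugin` factory and `addSubscriber()`, and the retried status codes are chosen here purely for illustration):

```php
<?php

require 'vendor/autoload.php';

use Guzzle\Http\Client;
use Guzzle\Plugin\Backoff\BackoffPlugin;

$client = new Client('https://www.googleapis.com');

// Retry up to 5 times with exponential backoff when one of the listed HTTP
// status codes comes back. 403 is added because Google signals the rate
// limit that way; a stricter filter would also check the
// "userRateLimitExceeded" reason in the error body.
$backoff = BackoffPlugin::getExponentialBackoff(5, array(403, 500, 502, 503));
$client->addSubscriber($backoff);

$response = $client->get('/drive/v2/about')->send();
```

This only helps for traffic that actually goes through Guzzle, which is exactly the catch discussed in the next comment.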


swurzinger commented Dec 23, 2015

BackoffStrategy seems to implement what I mentioned in the first post and what Google recommends. Here's the link again (it is from the REST API documentation, but it applies to the other APIs as well): https://developers.google.com/drive/web/handle-errors#implementing_exponential_backoff

Batching will probably not help with this issue, as "A set of n requests batched together counts toward your usage limit as n requests, not as one request." (from https://developers.google.com/drive/v2/web/batch)



Member

PVince81 commented Dec 23, 2015

Unfortunately it seems the library OC uses doesn't go through Guzzle but rather uses PHP's curl_* functions directly: https://github.com/owncloud/core/blob/v8.2.2/apps/files_external/3rdparty/google-api-php-client/src/Google/IO/Curl.php#L85

So even if that plugin were added, I'm not sure it would fit in.
We'd need to find another library that actually uses Guzzle.
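
Since the shipped library talks to curl directly, a retry would have to be wrapped around that layer instead. A rough sketch of exponential backoff around a raw curl_exec() call, following Google's recommendation (plain PHP; this is not the actual Google/IO/Curl.php code):

```php
<?php

/**
 * Execute a curl handle, retrying with exponential backoff plus jitter when
 * the response is a 403 (rate limit) or a 5xx error. A real implementation
 * should also parse the error body and only retry when the reason is
 * "userRateLimitExceeded" or "rateLimitExceeded", since 403 is used for
 * genuine permission errors too.
 */
function executeWithBackoff($ch, $maxRetries = 5)
{
    for ($attempt = 0; ; $attempt++) {
        $body = curl_exec($ch);
        $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

        $retryable = ($status === 403 || $status >= 500);
        if (!$retryable || $attempt >= $maxRetries) {
            return array($status, $body);
        }

        // Wait 2^attempt seconds plus up to one second of random jitter.
        $delay = pow(2, $attempt) + mt_rand(0, 1000) / 1000;
        usleep((int) ($delay * 1000000));
    }
}

// Usage: list($status, $body) = executeWithBackoff($ch);
```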

@PVince81 PVince81 referenced this issue Feb 19, 2016

Merged

Fix GDrive handling of office files #22405


@PVince81 PVince81 added this to the 9.1-next milestone Feb 19, 2016


Member

PVince81 commented Feb 19, 2016

@davitol this is what you observed yesterday


Contributor

davitol commented Feb 19, 2016

@PVince81 thanks 😺


Member

PVince81 commented Feb 29, 2016

Setting this to critical, it has been observed already 3-4 times in different environments.


Member

LukasReschke commented Feb 29, 2016

Considering that the limit is 1000 requests per 100 seconds per user, we probably need some change detection here. Otherwise this seems like something that one can easily fall into again.


thefirstofthe300 commented Mar 14, 2016

I too am experiencing this issue on OC 9. I am attempting to upload a music library to ownCloud so I have a lot of little files being synced. If you need any more logs, I will happily provide them.



Spacefish commented Mar 24, 2016

I have the same problem; it should be fixed somehow. Maybe we can cache the API calls, especially for the single files? Or introduce some sort of rate limit counter which postpones API calls that are not absolutely necessary?



Member

PVince81 commented Mar 29, 2016

I think there is already some caching inside the library; I remember seeing some code that "remembers" calls made to the same URLs. Not sure if that works though.


JDrewes commented Apr 16, 2016

I am experiencing issues involving the 403 User Rate Limit Exceeded message.
My problem is that, as far as I know, part of my GDrive has been copied to the ownCloud server (and is accessible there through the Files web front-end), but of my many files and folders, only 2 empty folders have actually made their way to my hard drive through my ownCloud client (Linux, 2.1.1).
Also, even in the web front-end of my ownCloud server, many of the files in the Google Drive folder are marked as "Pending", even after more than a day has passed. I would like to help get Google Drive sync to work - what kind of information can I provide?

ownCloud is 9.0.1 running on Debian 7.
I use a free personal Google Drive account.


Member

PVince81 commented Apr 18, 2016

I'm not sure, but from what I've heard, in some setups people seem to hit the limit less often.
And some people have reported hitting the limit more quickly before realizing they hadn't set up their GDrive app properly. So if you're sure that your GDrive app was configured properly on the GDrive side (API keys, etc.), then I'm not aware of any possible workaround at this time.

In theory one could add a few usleep(100) calls in the GDrive library to make it slower and less likely to run into limits, but it's really not a proper solution. I haven't tried it, just guessing.

The proper solution is to implement exponential backoff, which would use an adaptive sleep to wait longer and longer until the request goes through, retrying several times.
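
To illustrate the difference between the two ideas: a usleep() call throttles every request, whereas exponential backoff only waits when a request has actually failed. A minimal sketch of such a client-side throttle (illustrative only, not ownCloud code; the 10 requests/second default matches the limit mentioned above):

```php
<?php

/**
 * Naive client-side throttle: allow at most $maxPerSecond Drive API calls
 * per second by sleeping before a call when the previous one was too recent.
 */
class RequestThrottle
{
    private $maxPerSecond;
    private $lastRequest = 0.0;

    public function __construct($maxPerSecond = 10)
    {
        $this->maxPerSecond = $maxPerSecond;
    }

    public function waitForSlot()
    {
        $minInterval = 1.0 / $this->maxPerSecond;
        $elapsed = microtime(true) - $this->lastRequest;
        if ($elapsed < $minInterval) {
            usleep((int) (($minInterval - $elapsed) * 1000000));
        }
        $this->lastRequest = microtime(true);
    }
}

// $throttle = new RequestThrottle(10);
// $throttle->waitForSlot(); // call before each Drive API request
```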


Contributor

guruz commented Apr 18, 2016

(I haven't checked how our GDrive stuff works; I don't even know how often we "list" the remote directory.)

Maybe it would be interesting to incorporate this into our caching or even the ETag logic: https://developers.google.com/drive/v2/web/manage-changes#retrieving_changes
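
A sketch of what polling the changes feed could look like with the Drive v2 PHP client (this assumes the generated Google_Service_Drive classes and an already authorized Google_Client; exact method names can differ between library versions):

```php
<?php

require 'vendor/autoload.php';

/**
 * Poll the Drive v2 changes feed instead of re-listing every folder.
 * $pageToken is whatever was stored from the previous poll (null at first).
 * Returns the list of changes plus the largest change id seen, which can be
 * persisted and passed as startChangeId on the next poll.
 */
function fetchDriveChanges(Google_Client $client, $pageToken = null)
{
    $service = new Google_Service_Drive($client);

    $params = array('maxResults' => 1000);
    if ($pageToken !== null) {
        $params['pageToken'] = $pageToken;
    }

    $changes = array();
    do {
        // 'list' is reserved in PHP, so the generated client calls this listChanges().
        $result = $service->changes->listChanges($params);
        foreach ($result->getItems() as $change) {
            $changes[] = $change;
        }
        $params['pageToken'] = $result->getNextPageToken();
    } while (!empty($params['pageToken']));

    return array($changes, $result->getLargestChangeId());
}
```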


Member

PVince81 commented Apr 18, 2016

@guruz yeah, that would be part of looking into update detection: #11797


Member

PVince81 commented Jun 1, 2016

Good news! We've updated the Google SDK library and, from grepping the code, I saw that there are parts that will automatically do the exponential backoff!

* A task runner with exponential backoff support.


Member

PVince81 commented Jun 1, 2016

From looking at the code it seems that it should already work without any additional configs, so I'm going to close this.

If you are able to, please try the 9.1beta1 build that contains this update and let me know if you're still getting "403 User Rate Limit Exceeded" as I'm not able to reproduce this locally.

CC @davitol @SergioBertolinSG


Member

PVince81 commented Jun 2, 2016

Kudos to @altyr for submitting the library update PR #24516 😄


stevenmcastano commented Aug 2, 2016

I'm on Ubuntu 14.04 with PHP 5.5 and ownCloud 9.1, and I can confirm this behavior still exists.



Member

PVince81 commented Aug 8, 2016

@stevenbuehner are you seeing the message "403 User Rate Limit Exceeded"? How often does it happen? I suppose that the exponential backoff algorithm might give up after retrying several times.


Contributor

stevenbuehner commented Aug 8, 2016

I guess @PVince81, you are referring to @stevenmcastano's post?


stevenmcastano commented Aug 9, 2016

@stevenbuehner, that's what I thought @PVince81 was doing... but I wasn't sure so I stayed quiet for a bit.

As for the 403 issues, yes... I'm seeing them almost constantly. I can't get the desktop client to sync much at all without it showing an internal server error and stopping while the server logs show the user rate limit error.

Also, in the web portal itself, most of the documents in there show up as "Pending" with no date or size. The only ones that seem to reflect real size and date info are the same ones that the desktop client was able to download.



Member

PVince81 commented Aug 9, 2016

@stevenbuehner yes sorry, not sure why you have been randomly chosen by autocomplete.


Member

PVince81 commented Aug 9, 2016

@stevenmcastano okay, then let's reopen this ticket.
So far the exponential backoff logic is implemented in the library itself, but maybe it needs some tweaking, as I saw it has parameters.

How many big/small files are you syncing?

@PVince81 PVince81 reopened this Aug 9, 2016

@PVince81 PVince81 modified the milestones: 9.1.1, 9.1 Aug 9, 2016


stevenmcastano commented Aug 9, 2016

Most of them are fairly small: some tiny spreadsheets and Word documents. I'd say there are about 225 files total, with maybe 1 or 2 of them being between 75 and 100 MB, some big Photoshop .PSD files.

It hasn't even gotten to the big files yet... it's synced about 40 or so files, and every time I try to pause the sync and then resume it to grab some more, by the time it verifies the files it already has, it hits the user rate limit before it can download any more.


@PVince81 PVince81 modified the milestones: 9.1.2, 9.1.1 Sep 21, 2016

@guruz guruz referenced this issue in owncloud/client Oct 2, 2016

Closed

Do not stop sync when error occurs #5187


Trefex commented Oct 11, 2016

Would it be possible to return a different error code to the client in addition to trying to fix the rate limit issue?


Member

PVince81 commented Oct 11, 2016

You are probably referring to owncloud/client#5187 (comment).
In the case of a PROPFIND, a 503 would be returned.

In the case of PUT or any write operation, we could change the WebDAV layer to translate the exception to another failure code. @ogoffart any suggestion?


ogoffart commented Oct 11, 2016

The problem is that a 503 for PUT will stop the sync.
I guess the 502 error code might work better in this case if it only affects one file.

An alternative is to distinguish codes that should block the sync using another header or something similar.

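
A sketch of the mapping being discussed (a hypothetical helper, not actual ownCloud or SabreDAV code): 503 for read-style requests such as PROPFIND so the whole sync backs off, 502 for a PUT so only that one file fails instead of the whole sync run.

```php
<?php

/**
 * Hypothetical helper: pick the HTTP status the WebDAV layer should answer
 * with when the external storage reports a rate limit error.
 */
function rateLimitStatusForMethod($method)
{
    if (strtoupper($method) === 'PUT') {
        return 502; // fail only this file, the client keeps syncing the rest
    }
    return 503; // service temporarily unavailable, the client retries later
}
```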

@PVince81 PVince81 modified the milestones: 9.1.3, 9.1.2 Oct 20, 2016

@PVince81 PVince81 self-assigned this Nov 21, 2016

@PVince81 PVince81 modified the milestones: 9.1.4, 9.1.3 Nov 30, 2016


Member

PVince81 commented Nov 30, 2016

I need to build a good test case where this issue is reproducible every time. Any suggestions?


Trefex commented Nov 30, 2016

@PVince81 upload a lot of files to Google Drive and sync everything to your local machine.

I guess that should do it. A lot of small files, perhaps 100,000 files or so.



cellycell commented Dec 13, 2016

I store my entire music library in Google Drive (~65 GB).

This includes mixtapes and other stuff you cannot find on Spotify and the like. I also tend to keep high-quality files like 320 kbps MP3s and FLACs/ALACs.

I tried to use Drive as my MASTER copy, but the Google Drive client, along with syncing between NTFS and HFS+, has sometimes mangled it and created duplicates, doesn't sync files, etc.

I'm paying for the 1 TB premium plan from Google.

I just got this error and I assumed I would; I have moved 500 GB of data through Google's network, so they have to have some sort of internal gauge for consumers, correct?


@PVince81 PVince81 modified the milestones: 9.1.5, 9.1.4 Feb 6, 2017


DejaVu commented Mar 7, 2017

Another way I've found to put this to the test is to upload/sync a folder with 1,000 files of any size, then rename them locally and resync.

The expected behaviour is that the remote files are simply renamed, and that procedure is indeed initiated correctly.

However, renaming at the rate GoodSync does on Google Drive, although quick (and we all prefer quick), hits these 403 user rate limits, and none of the files that are already uploaded get renamed either.


Member

PVince81 commented Mar 7, 2017

I suspect that GoodSync doesn't rename the way the desktop client renames. Instead of a single MOVE it might be doing a copy of every file first. Or maybe it does a MKCOL on the new folder and then recursively moves every file there instead of doing it in one operation. Looking at the web server access log should tell.

In general I'd expect a simple WebDAV MOVE not to cause any rate limit issue.
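
For comparison, a rename done the way the desktop client does it is a single WebDAV MOVE request, roughly like this (the server URL and credentials are placeholders):

```php
<?php

// Rename /old-name.txt to /new-name.txt on an ownCloud WebDAV endpoint with
// one MOVE request (placeholder host, user and password).
$base = 'https://cloud.example.com/remote.php/webdav';

$ch = curl_init($base . '/old-name.txt');
curl_setopt_array($ch, array(
    CURLOPT_CUSTOMREQUEST  => 'MOVE',
    CURLOPT_HTTPHEADER     => array('Destination: ' . $base . '/new-name.txt'),
    CURLOPT_USERPWD        => 'user:password',
    CURLOPT_RETURNTRANSFER => true,
));
curl_exec($ch);
echo curl_getinfo($ch, CURLINFO_HTTP_CODE); // 201 or 204 on success
curl_close($ch);
```

A client that instead copies and re-uploads every file will show up in the access log as a long series of GET/PUT requests, which is exactly the kind of traffic that runs into the rate limit.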


Member

PVince81 commented Mar 29, 2017

Here you go, a one-liner to enable retries in the GDrive lib: #27530
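
For reference only, since the actual change is in #27530: enabling retries in the 1.x-style Google API PHP client would look roughly like the line below. The setClassConfig() call and the 'retries' key are assumptions about that library version, not a quote from the PR.

```php
<?php

// Assumption: the task runner's retry count is exposed as class config.
// With retries > 0 the library's built-in exponential backoff kicks in
// instead of failing immediately on "403 User Rate Limit Exceeded".
$client = new Google_Client();
$client->setClassConfig('Google_Task_Runner', 'retries', 5);
```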


Member

PVince81 commented Mar 29, 2017

Please, everyone, help test this and let us know if it solved your problem.

You might still bump into API limits from time to time, but it shouldn't be as bad as before.


Member

PVince81 commented Apr 3, 2017

This will be in 9.1.5
