This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

flux hitting Google API rate limits when working with GCR #1016

Closed
crw opened this issue Mar 20, 2018 · 8 comments

Comments

@crw

crw commented Mar 20, 2018

Google Cloud API seems to have a basic rate limit of 20 reqs/sec: https://cloud.google.com/compute/docs/api-rate-limits

If a user has more than 20 images in their GCR registry, then this rate limit can be triggered. Is there any way to back off, or have preconfigured rate limit values for different image registries so that users do not have to manually configure this setting?

@squaremo
Member

> Is there any way to back off, or have preconfigured rate limit values for different image registries so that users do not have to manually configure this setting?

For the record, these are the settings in question (from https://github.com/weaveworks/flux/blob/master/site/daemon.md):

| Flag | Default | Description |
|------|---------|-------------|
| `--registry-rps` | 200 | maximum registry requests per second per host |
| `--registry-burst` | 125 | maximum number of warmer connections to remote and memcache |

We could tune them down for everyone by giving them more conservative defaults. As a beer-coaster calculation: fetching image metadata from scratch takes (number of distinct images) × (average number of tags per image) requests. Typical numbers would be 100 and (with much more variation here) 100, so about 10,000 requests. At 20 rps it would take about 8 minutes to fill the DB. That seems acceptable.
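The arithmetic above can be sketched in Go as follows. This is only an illustration: `fillTime` is a made-up helper, not something in Flux, and it assumes exactly one registry request per (image, tag) pair with the rate limit as the only bottleneck.

```go
package main

import (
	"fmt"
	"time"
)

// fillTime estimates how long a cold fill of the image metadata DB takes
// when every (image, tag) pair costs one registry request and requests
// are capped at rps per second. Hypothetical helper, for illustration only.
func fillTime(images, tagsPerImage int, rps float64) time.Duration {
	requests := float64(images * tagsPerImage)
	return time.Duration(requests / rps * float64(time.Second))
}

func main() {
	fmt.Println(fillTime(100, 100, 20))  // 8m20s at the 20 rps limit
	fmt.Println(fillTime(100, 100, 200)) // 50s at the 200 rps default quoted above
}
```

With more tags per image the fill time grows linearly, which is where most of the variation would come from.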

@squaremo
Member

> That seems acceptable.

... by which I mean, acceptable if you are on GCP and can't make it go faster :) I think it'd be better to tune it down just for GCP, and to do that, we'll have to alter generated config or something else.
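One way "tune it down just for GCP" might look is a per-host default, sketched below. Everything here is an assumption for illustration: `rpsForHost` is a hypothetical function, the value 20 is taken from the limit mentioned at the top of this issue, and this is not how Flux is known to configure its registry client.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultRPS mirrors the --registry-rps default quoted earlier in the thread.
const defaultRPS = 200

// gcrRPS is an assumed conservative value for GCR hosts.
const gcrRPS = 20

// rpsForHost (hypothetical) picks a requests-per-second limit from the
// registry hostname, so GCR users would not have to set the flag by hand.
func rpsForHost(host string) int {
	if host == "gcr.io" || strings.HasSuffix(host, ".gcr.io") {
		return gcrRPS
	}
	return defaultRPS
}

func main() {
	for _, h := range []string{"gcr.io", "eu.gcr.io", "quay.io"} {
		fmt.Println(h, rpsForHost(h))
	}
}
```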

> Is there any way to back off,

We probably do get specific status codes when throttled, so this may be a possibility depending on how well (or if at all) the docker distribution lib exposes those.
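As a rough sketch of what backing off could look like with plain `net/http`: the code below assumes the registry signals throttling with HTTP 429 (Too Many Requests), which is not confirmed in this thread, and it does not go through the docker distribution library at all. `getWithBackoff` and `nextWait` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// nextWait doubles the current wait, capped at limit.
func nextWait(wait, limit time.Duration) time.Duration {
	if wait*2 > limit {
		return limit
	}
	return wait * 2
}

// getWithBackoff retries a GET while the server answers 429, sleeping
// between attempts and doubling the sleep each time (exponential backoff).
// The 429 status code is an assumption about how throttling is reported.
func getWithBackoff(client *http.Client, url string, attempts int, wait, limit time.Duration) (*http.Response, error) {
	for i := 1; ; i++ {
		resp, err := client.Get(url)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusTooManyRequests {
			return resp, nil
		}
		resp.Body.Close()
		if i >= attempts {
			return nil, fmt.Errorf("still throttled after %d attempts", attempts)
		}
		time.Sleep(wait)
		wait = nextWait(wait, limit)
	}
}

func main() {
	// Stand-in for a registry: throttle the first two requests, then succeed.
	var calls int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&calls, 1) <= 2 {
			w.WriteHeader(http.StatusTooManyRequests)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	resp, err := getWithBackoff(srv.Client(), srv.URL, 5, 10*time.Millisecond, time.Second)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println(resp.StatusCode, "after", atomic.LoadInt32(&calls), "requests") // 200 after 3 requests
}
```

A real client would probably also honour a `Retry-After` header if the registry sends one.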

@rade
Contributor

rade commented Mar 23, 2018

> At 20 rps it would take about 8 minutes to fill the DB. That seems acceptable.

Doesn't that mean it would take that long for the likes of 'Deploy Status' in Weave Cloud to fully populate?

> back off

That would be neat.

@squaremo
Member

> Doesn't that mean it would take that long for the likes of 'Deploy Status' in Weave Cloud to fully populate?

Yep, but whatcha gonna do.

@rade
Contributor

rade commented Mar 23, 2018

Out of interest, what actually happens when the rate limit is reached? Shouldn't we still get a success at roughly the rate limit? Or to put it another way, what does rate limiting at the flux end actually achieve?

@squaremo
Copy link
Member

> Or to put it another way, what does rate limiting at the flux end actually achieve?

In the first instance, it's being a good API user. But the other side of that coin is that some registries will blacklist your IP if you are continually bouncing off their throttling.
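To make "rate limiting at the flux end" concrete, here is a minimal token-bucket sketch. It is a generic illustration, not Flux's implementation: the `bucket` type is made up, its `burst` is the bucket capacity (which is not necessarily what `--registry-burst` controls), and the clock is passed in as an offset so the example stays deterministic.

```go
package main

import (
	"fmt"
	"time"
)

// bucket is a minimal token bucket: it refills at rps tokens per second
// up to burst, and each request spends one token.
type bucket struct {
	rps, burst, tokens float64
	last               time.Duration // time of the previous call, since creation
}

func newBucket(rps float64, burst int) *bucket {
	return &bucket{rps: rps, burst: float64(burst), tokens: float64(burst)}
}

// allow reports whether a request may be sent at the given offset from
// the bucket's creation. Requests that are refused never reach the
// registry, so the client does not bounce off the server-side throttle.
func (b *bucket) allow(at time.Duration) bool {
	b.tokens += (at - b.last).Seconds() * b.rps
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = at
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	b := newBucket(20, 5) // illustrative numbers: 20 rps, capacity 5
	sent := 0
	// Offer 100 requests spread evenly over one simulated second.
	for i := 0; i < 100; i++ {
		if b.allow(time.Duration(i) * 10 * time.Millisecond) {
			sent++
		}
	}
	fmt.Println("sent", sent, "of 100 offered requests")
}
```

Only roughly the initial burst plus one second's worth of refill gets through; the rest would have to wait or be retried later.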

@squaremo
Member

Giving the argument --registry-cache-expiry a higher value will also cut down on requests, since it will keep records around longer. If you don't care about being sensitive to tags being updated, you could set this to 24h or more.
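The effect of a longer expiry on request volume can be sketched with a toy TTL cache. This is an assumption-laden illustration rather than Flux's cache: the `cache` type, the 5-minute polling interval, the image name, and the two expiry values are all invented for the example.

```go
package main

import (
	"fmt"
	"time"
)

type entry struct {
	value   string
	fetched time.Duration // when the record was fetched, on a simulated clock
}

// cache serves records younger than expiry and refetches older ones.
type cache struct {
	expiry  time.Duration
	entries map[string]entry
	fetches int // number of lookups that actually reached the registry
}

func newCache(expiry time.Duration) *cache {
	return &cache{expiry: expiry, entries: map[string]entry{}}
}

// get returns the record for key at simulated time now, calling fetch
// only on a miss or once the cached record has expired.
func (c *cache) get(key string, now time.Duration, fetch func(string) string) string {
	if e, ok := c.entries[key]; ok && now-e.fetched < c.expiry {
		return e.value
	}
	c.fetches++
	v := fetch(key)
	c.entries[key] = entry{value: v, fetched: now}
	return v
}

// fetchesPerDay looks up one tag every 5 minutes for 24 simulated hours
// and returns how many of those lookups hit the registry.
func fetchesPerDay(expiry time.Duration) int {
	c := newCache(expiry)
	lookup := func(tag string) string { return "metadata for " + tag }
	for now := time.Duration(0); now < 24*time.Hour; now += 5 * time.Minute {
		c.get("example/image:latest", now, lookup)
	}
	return c.fetches
}

func main() {
	fmt.Println(fetchesPerDay(time.Hour))      // 24
	fmt.Println(fetchesPerDay(24 * time.Hour)) // 1
}
```

The trade-off is the one described above: with a 24h expiry, a tag that is re-pushed may not be noticed for up to a day.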

@stefanprodan
Member

This has been fixed by #1354 and #1538. The next Flux release will include those fixes.
