
flux hitting Google API rate limits when working with GCR #1016

Closed
crw opened this issue Mar 20, 2018 · 8 comments

@crw commented Mar 20, 2018

Google Cloud API seems to have a basic rate limit of 20 reqs/sec: https://cloud.google.com/compute/docs/api-rate-limits

If a user has more than 20 images in their GCR registry, then this rate limit can be triggered. Is there any way to back off, or have preconfigured rate limit values for different image registries so that users do not have to manually configure this setting?

@squaremo (Member) commented Mar 23, 2018

> Is there any way to back off, or have preconfigured rate limit values for different image registries so that users do not have to manually configure this setting?

For the record, these are the settings in question (from https://github.com/weaveworks/flux/blob/master/site/daemon.md)

--registry-rps	200	maximum registry requests per second per host
--registry-burst	125	maximum number of warmer connections to remote and memcache

We could tune them down for everyone by giving them more conservative defaults. As a beer-coaster calculation: to fetch image metadata from scratch needs (distinct images) × (average number of tags per image) requests. Typical numbers would be 100 images and (with much more variation here) 100 tags, so about 10,000 requests. At 20 rps it would take a little over 8 minutes to fill the DB. That seems acceptable.
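That arithmetic can be sanity-checked with a quick sketch. The image and tag counts below are the illustrative figures from the comment, not measured values:

```go
package main

import "fmt"

func main() {
	// Cold-cache cost: (distinct images) x (average tags per image)
	// requests, using the illustrative numbers from the comment above.
	images, tagsPerImage := 100, 100
	requests := images * tagsPerImage // 10,000 requests

	rps := 20
	seconds := requests / rps // 500 s
	fmt.Printf("%d requests at %d rps ~ %.1f minutes\n",
		requests, rps, float64(seconds)/60)
}
```

which works out to roughly 8.3 minutes, matching the estimate above.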

@squaremo (Member) commented Mar 23, 2018

> That seems acceptable.

... by which I mean, acceptable if you are on GCP and can't make it go faster :) I think it'd be better to tune it down just for GCP, and to do that, we'll have to alter generated config or something else.

> Is there any way to back off,

We probably do get specific status codes when throttled, so this may be a possibility depending on how well (or if at all) the docker distribution lib exposes those.
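One possible shape for that back-off, assuming the client can see the HTTP status on a throttled response (a sketch, not Flux's actual registry client; `fetchWithBackoff` and `demo` are hypothetical names):

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchWithBackoff retries a GET when the server answers
// 429 Too Many Requests, doubling the wait between attempts.
func fetchWithBackoff(client *http.Client, url string, maxRetries int) (*http.Response, error) {
	backoff := 10 * time.Millisecond
	for attempt := 0; ; attempt++ {
		resp, err := client.Get(url)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusTooManyRequests || attempt >= maxRetries {
			return resp, nil
		}
		resp.Body.Close()
		time.Sleep(backoff)
		backoff *= 2 // exponential back-off
	}
}

// demo stands in for a throttling registry: it returns 429 twice,
// then succeeds. It reports the final status and total calls made.
func demo() (status, calls int) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		calls++
		if calls <= 2 {
			w.WriteHeader(http.StatusTooManyRequests)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	resp, err := fetchWithBackoff(srv.Client(), srv.URL, 5)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	return resp.StatusCode, calls
}

func main() {
	status, calls := demo()
	fmt.Println(status, calls) // 200 3
}
```

In production code the retry loop would also honour a `Retry-After` header when the registry sends one, and add jitter so many clients don't retry in lock-step.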

@rade (Contributor) commented Mar 23, 2018

> At 20 rps it would take about 8 minutes to fill the DB. That seems acceptable.

Doesn't that mean it would take that long for the likes of 'Deploy Status' in Weave Cloud to fully populate?

> back off

That would be neat.

@squaremo (Member) commented Mar 23, 2018

> Doesn't that mean it would take that long for the likes of 'Deploy Status' in Weave Cloud to fully populate?

Yep, but whatcha gonna do.

@rade (Contributor) commented Mar 23, 2018

Out of interest, what actually happens when the rate limit is reached? Shouldn't we still get a success at roughly the rate limit? Or to put it another way, what does rate limiting at the flux end actually achieve?

@squaremo (Member) commented Mar 23, 2018

> Or to put it another way, what does rate limiting at the flux end actually achieve?

In the first instance, it's being a good API user. But the other side of that coin is that some registries will blacklist your IP if you are continually bouncing off their throttling.

@squaremo (Member) commented Jun 18, 2018

Giving the argument --registry-cache-expiry a higher value will also cut down on requests, since it will keep records around longer. If you don't care about being sensitive to tags being updated, you could set this to 24h or more.
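Putting the flags from this thread together, a daemon invocation tuned for a rate-limited registry like GCR might look like this (a hypothetical example; the flag names are from the daemon docs linked above, the values are illustrative, not recommendations):

```shell
# Conservative settings for a registry that throttles at ~20 req/s:
# stay under the limit and keep cached tag records for a full day.
fluxd \
  --registry-rps=20 \
  --registry-burst=10 \
  --registry-cache-expiry=24h
```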

@stefanprodan (Member) commented Dec 7, 2018

This has been fixed by #1354 and #1538. The next Flux release will include those fixes.
