
adds support for fine grained service limits. #1291

Merged
merged 1 commit into spinnaker:master from configurable_rate_limiting Dec 15, 2016

Conversation


@cfieber cfieber commented Dec 9, 2016

A service limit is a key to a Double value, for example a rateLimit for an API.

Limits can be configured at a fine-grained level (per implementation and account) and fall back through
several levels of defaults (implementation default, account default, cloud provider default, global default).

Configuration looks like:

```yaml
serviceLimits:
  defaults:
    rateLimit: 10

  cloudProviderOverrides:
    aws:
      rateLimit: 15

  accountOverrides:
    test:
      rateLimit: 5
    prod:
      rateLimit: 100

  implementationLimits:

    AmazonEC2:
      defaults:
        rateLimit: 200
      accountOverrides:
        prod:
          rateLimit: 500

    AmazonElasticLoadBalancing:
      defaults:
        rateLimit: 10
```
In this example:

* requesting the rateLimit for AmazonElasticLoadBalancing for any account would return 10
* requesting the rateLimit for AmazonEC2 for the prod account would return 500, and for any other account 200
* requesting the rateLimit for AmazonCloudWatch for the prod account would return 100, for test 5, and for any other account 15
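
A minimal sketch of that fallback order (hypothetical class and method names, not the PR's actual API) could look like:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the fallback resolution described above; not the
// PR's actual classes. Lookup order, most specific first: implementation
// account override, implementation default, account override, cloud
// provider override, global default.
class ServiceLimitResolver {
    static class Limits {
        final Map<String, Double> defaults = new HashMap<>();
        final Map<String, Map<String, Double>> accountOverrides = new HashMap<>();
    }

    final Map<String, Double> defaults = new HashMap<>();
    final Map<String, Map<String, Double>> cloudProviderOverrides = new HashMap<>();
    final Map<String, Map<String, Double>> accountOverrides = new HashMap<>();
    final Map<String, Limits> implementationLimits = new HashMap<>();

    Double resolve(String limit, String implementation, String cloudProvider, String account) {
        Limits impl = implementationLimits.get(implementation);
        if (impl != null) {
            Map<String, Double> implAcct = impl.accountOverrides.get(account);
            if (implAcct != null && implAcct.containsKey(limit)) return implAcct.get(limit);
            if (impl.defaults.containsKey(limit)) return impl.defaults.get(limit);
        }
        Map<String, Double> acct = accountOverrides.get(account);
        if (acct != null && acct.containsKey(limit)) return acct.get(limit);
        Map<String, Double> provider = cloudProviderOverrides.get(cloudProvider);
        if (provider != null && provider.containsKey(limit)) return provider.get(limit);
        return defaults.get(limit); // global default; null if entirely unconfigured
    }
}
```

Note that in this ordering an implementation-level default wins over a top-level account override, which is why AmazonElasticLoadBalancing returns 10 for every account in the example.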

This configuration should extend to support per-caching-agent configuration (e.g. pollInterval) and can be applied to other cloud provider API clients as well.

The initial application of this configuration is to configure a rateLimit (maximum requests per second) for the various Amazon API clients returned by AmazonClientProvider. If unconfigured, a default of 10 requests per second is used.

The rate limits in AmazonClientProvider are scoped to a client type (e.g. AmazonEC2) in a specific account and region.
They are applied globally to all clients requested from AmazonClientProvider and enforced by a request handler that acquires a permit from a Guava RateLimiter before request execution.
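
The permit-acquisition pattern can be illustrated with a stand-in limiter. This is an illustration of the idea only, not Guava's actual RateLimiter, which smooths bursts with a more sophisticated algorithm:

```java
// Stand-in illustration of the "acquire a permit before request execution"
// pattern described above. It simply spaces permits so that at most
// permitsPerSecond are handed out per second; a request handler would call
// acquire() before letting each request proceed.
class SimpleRateLimiter {
    private final long intervalNanos;
    private long nextFreeNanos = System.nanoTime();

    SimpleRateLimiter(double permitsPerSecond) {
        this.intervalNanos = (long) (1_000_000_000L / permitsPerSecond);
    }

    // Blocks until a permit is available, then consumes it.
    synchronized void acquire() {
        long now = System.nanoTime();
        long waitNanos = nextFreeNanos - now;
        nextFreeNanos = Math.max(now, nextFreeNanos) + intervalNanos;
        if (waitNanos > 0) {
            try {
                Thread.sleep(waitNanos / 1_000_000, (int) (waitNanos % 1_000_000));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

In the change itself this role is played by Guava's com.google.common.util.concurrent.RateLimiter, with acquisition happening in an AWS SDK request handler before each outbound call.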

Additionally, this change updates AmazonClientProvider to keep a cache of the API clients it has created (again keyed by client type, account, and region) instead of creating a new SDK client for each call.
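
The caching behavior can be sketched roughly as follows (ClientCache and getClient are hypothetical names for illustration, not the PR's actual API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Rough sketch of the client caching described above: one client is kept
// per (client type, account, region) key, and the factory only runs on a
// cache miss, so repeated requests reuse the same SDK client.
class ClientCache<T> {
    private final Map<String, T> clients = new ConcurrentHashMap<>();

    T getClient(Class<?> clientType, String account, String region, Supplier<T> factory) {
        String key = clientType.getName() + ":" + account + ":" + region;
        return clients.computeIfAbsent(key, k -> factory.get());
    }
}
```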

closes #1284

@andrewbackes PTAL (In playing with my prototype implementation and working through some scenarios I ended up getting this close enough that I just polished it off)
@spinnaker/netflix-reviewers PTAL
@spinnaker/google-reviewers FYI (config mechanism may be useful in the near term if you have any similar API throttling issues to contend with, and I expect to extend this into the caching agent scheduler as a followup)

I'm still doing some testing around this, so it's not up for merge yet.

venturaville commented Dec 13, 2016

I like the idea of it. Most of the rate limiting problems I ran into were limited to particular parts of the API.

@ewiseblatt ewiseblatt left a comment

I'm not sure I fully understand this. I only see a way to specify and query what the limit is, not the application of monitoring and enforcement. In the code you removed, you had rate limiting (though it looked local to the method, not global to the process).

It looks like the intent is to centralize it so it is available in the core, but I don't see where you actually use it. I assume this is an enabling PR for a followup where you apply it.

You specify rates by "account", but I suspect from the implementation here that this is only going to be by "account within the process". I would expect that rate limiting by account would be global to the system, applied across replicas. I'm not sure what problem you are solving. I can imagine wanting to rate limit by account within a process (e.g. you are facing starvation), but if you are hitting up against quota limits or other server/provider needs, then process-level limiting might not be sufficient.

cfieber commented Dec 13, 2016

@ewiseblatt Rate limiting is applied to all AWS SDK clients created in AmazonClientProvider (https://github.com/spinnaker/clouddriver/pull/1291/files#diff-7f74150448d21001ad0596173027f1cdR486).

Agreed that it is not perfect from a multi-node perspective, but I think we can mostly ignore that problem because we coordinate workers across the nodes, so there should only ever be one running instance of a particular caching agent doing the particular rate-limited API calls. If that turns out not to be the case, we could look at a redis-backed rate limiter to allocate tickets across multiple processes.

ewiseblatt commented Dec 13, 2016 via email

@jlarocque88
Hi,

Was just wondering if this might get merged soon; I think it may help with an ExceedingThrottle problem I'm having.

Thanks.

The merged commit also refactors AmazonClientProvider to extract a couple of smaller, more specialized classes (AwsSdkClientSupplier, ProxyHandlerBuilder) and delegates the complexity of reflection and dynamic proxy creation to them.
@cfieber cfieber merged commit 8d1b349 into spinnaker:master Dec 15, 2016
@cfieber cfieber deleted the configurable_rate_limiting branch December 15, 2016 23:41
@cfieber cfieber restored the configurable_rate_limiting branch January 3, 2017 22:35
ttomsu pushed a commit to ttomsu/clouddriver that referenced this pull request Mar 11, 2020