
adds support for fine grained service limits. #1291

Merged
merged 1 commit into spinnaker:master from configurable_rate_limiting Dec 15, 2016

Conversation


@cfieber cfieber commented Dec 9, 2016

A service limit is a key to a Double value, for example a rateLimit for an API.

Limits can be configured at a fine-grained level (per implementation and account) and fall back through
several levels of defaults (implementation default, account default, cloud provider default, global default).

Configuration looks like:

```yaml
serviceLimits:
  defaults:
    rateLimit: 10

  cloudProviderOverrides:
    aws:
      rateLimit: 15

  accountOverrides:
    test:
      rateLimit: 5
    prod:
      rateLimit: 100

  implementationLimits:

    AmazonEC2:
      defaults:
        rateLimit: 200
      accountOverrides:
        prod:
          rateLimit: 500

    AmazonElasticLoadBalancing:
      defaults:
        rateLimit: 10
```
In this example:

* requesting the rateLimit for AmazonElasticLoadBalancing for any account would return 10
* requesting the rateLimit for AmazonEC2 for the prod account would return 500, and for any other account 200
* requesting the rateLimit for AmazonCloudWatch for the prod account would return 100, for test 5, and for any other account 15
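
A minimal sketch of that fallback order (hypothetical class and method names, not the PR's actual API) could look like:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the fallback resolution described above; not the
// PR's actual classes. Lookup order, most specific first: implementation
// account override, implementation default, account override, cloud
// provider override, global default.
class ServiceLimitResolver {
    static class Limits {
        final Map<String, Double> defaults = new HashMap<>();
        final Map<String, Map<String, Double>> accountOverrides = new HashMap<>();
    }

    final Map<String, Double> defaults = new HashMap<>();
    final Map<String, Map<String, Double>> cloudProviderOverrides = new HashMap<>();
    final Map<String, Map<String, Double>> accountOverrides = new HashMap<>();
    final Map<String, Limits> implementationLimits = new HashMap<>();

    Double resolve(String limit, String implementation, String cloudProvider, String account) {
        Limits impl = implementationLimits.get(implementation);
        if (impl != null) {
            Map<String, Double> implAcct = impl.accountOverrides.get(account);
            if (implAcct != null && implAcct.containsKey(limit)) return implAcct.get(limit);
            if (impl.defaults.containsKey(limit)) return impl.defaults.get(limit);
        }
        Map<String, Double> acct = accountOverrides.get(account);
        if (acct != null && acct.containsKey(limit)) return acct.get(limit);
        Map<String, Double> provider = cloudProviderOverrides.get(cloudProvider);
        if (provider != null && provider.containsKey(limit)) return provider.get(limit);
        return defaults.get(limit); // global default; null if entirely unconfigured
    }
}
```

Note that in this ordering an implementation-level default wins over a top-level account override, which is why AmazonElasticLoadBalancing returns 10 for every account in the example.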

This configuration should extend to support per-caching-agent configuration (e.g. pollInterval) and can be applied to other cloud provider API clients as well.

The initial application of this configuration is to configure a rateLimit (maximum requests per second) for the various Amazon API clients returned by AmazonClientProvider. If unconfigured, a default of 10 requests per second is used.

The rate limits in AmazonClientProvider are scoped to a client type (e.g. AmazonEC2) in a specific account and region.
They are applied globally to all clients requested from AmazonClientProvider and enforced by a request handler that acquires a permit from a Guava RateLimiter before request execution.
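
The permit-acquisition pattern can be illustrated with a stand-in limiter. This is an illustration of the idea only, not Guava's actual RateLimiter, which smooths bursts with a more sophisticated algorithm:

```java
// Stand-in illustration of the "acquire a permit before request execution"
// pattern described above. It simply spaces permits so that at most
// permitsPerSecond are handed out per second; a request handler would call
// acquire() before letting each request proceed.
class SimpleRateLimiter {
    private final long intervalNanos;
    private long nextFreeNanos = System.nanoTime();

    SimpleRateLimiter(double permitsPerSecond) {
        this.intervalNanos = (long) (1_000_000_000L / permitsPerSecond);
    }

    // Blocks until a permit is available, then consumes it.
    synchronized void acquire() {
        long now = System.nanoTime();
        long waitNanos = nextFreeNanos - now;
        nextFreeNanos = Math.max(now, nextFreeNanos) + intervalNanos;
        if (waitNanos > 0) {
            try {
                Thread.sleep(waitNanos / 1_000_000, (int) (waitNanos % 1_000_000));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

In the change itself this role is played by Guava's com.google.common.util.concurrent.RateLimiter, with acquisition happening in an AWS SDK request handler before each outbound call.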

Additionally, this change updates AmazonClientProvider to keep a cache of the API clients it has created (again keyed by client type, account, and region) instead of creating a new SDK client for each call.
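
The caching behavior can be sketched roughly as follows (ClientCache and getClient are hypothetical names for illustration, not the PR's actual API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Rough sketch of the client caching described above: one client is kept
// per (client type, account, region) key, and the factory only runs on a
// cache miss, so repeated requests reuse the same SDK client.
class ClientCache<T> {
    private final Map<String, T> clients = new ConcurrentHashMap<>();

    T getClient(Class<?> clientType, String account, String region, Supplier<T> factory) {
        String key = clientType.getName() + ":" + account + ":" + region;
        return clients.computeIfAbsent(key, k -> factory.get());
    }
}
```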

closes #1284

@andrewbackes PTAL (In playing with my prototype implementation and working through some scenarios I ended up getting this close enough that I just polished it off)
@spinnaker/netflix-reviewers PTAL
@spinnaker/google-reviewers FYI (config mechanism may be useful in the near term if you have any similar API throttling issues to contend with, and I expect to extend this into the caching agent scheduler as a followup)

I'm still doing some testing around this, so it's not up for merge yet.

venturaville commented Dec 13, 2016

I like the idea of it. Most of the rate limiting problems I ran into were limited to particular parts of the API.

@ewiseblatt ewiseblatt left a comment

I'm not sure I fully understand this. I only see a way to specify and query what the limit is, not the application of monitoring and enforcement. In the code you removed, you had rate limiting (though it looked local to the method, not global to the process).

It looks like the intent is to centralize it so it is available in the core, but I don't see where you actually use it. I assume this is an enabling PR for a followup where you apply it.

You specify rates by "account", but I suspect from the implementation here that this is only going to be by "account within the process". I would expect that rate limiting by account would be global to the system, applied across replicas. I'm not sure what problem you are solving. I can imagine wanting to rate limit by account within a process (e.g. you are facing starvation), but if you are hitting up against quota limits or other server/provider needs, then process-level limiting might not be sufficient.

cfieber commented Dec 13, 2016

@ewiseblatt Rate limiting is applied to all AWS SDK clients created in AmazonClientProvider (https://github.com/spinnaker/clouddriver/pull/1291/files#diff-7f74150448d21001ad0596173027f1cdR486).

Agreed that it is not perfect from a multi-node perspective, but I think we can mostly ignore that problem because we coordinate workers across the nodes, so there should only ever be one running instance of a particular caching agent doing the particular rate-limited API calls. If that turns out not to be the case, we could look at a redis-backed rate limiter to allocate tickets across multiple processes.

ewiseblatt commented Dec 13, 2016 via email

@jlarocque88
Hi,

Was just wondering if this might get merged soon; I think it may help with an ExceedingThrottle problem I'm having.

Thanks.

The merged commit also refactors AmazonClientProvider to extract a couple of smaller, more specialized classes (AwsSdkClientSupplier, ProxyHandlerBuilder) and delegates the complexity of reflection and dynamic proxy creation to them.
@cfieber cfieber merged commit 8d1b349 into spinnaker:master Dec 15, 2016
@cfieber cfieber deleted the configurable_rate_limiting branch December 15, 2016 23:41
@cfieber cfieber restored the configurable_rate_limiting branch January 3, 2017 22:35
ttomsu pushed a commit to ttomsu/clouddriver that referenced this pull request Mar 11, 2020