Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support retry strategy on Lambda LogRetention #8257

Closed
jaapvanblaaderen opened this issue May 28, 2020 · 9 comments · Fixed by #8258
Closed

Support retry strategy on Lambda LogRetention #8257

jaapvanblaaderen opened this issue May 28, 2020 · 9 comments · Fixed by #8258
Assignees
Labels
@aws-cdk/aws-lambda Related to AWS Lambda effort/small Small work item – less than a day of effort feature-request A feature should be added or improved. good first issue Related to contributions. See CONTRIBUTING.md in-progress This issue is being actively worked on.

Comments

@jaapvanblaaderen
Copy link
Contributor

Deployment of one of our CDK projects randomly fails with rate exceeded errors. These errors occur when CDK creates LogRetention resources related to the Lambda functions we have.

Reproduction Steps

The issue occurs when deploying multiple CDK stacks that contain quite some Lamba's with log retention resources.

I created a small test project to reproduce the issue: https://github.com/jaapvanblaaderen/log-retention-rate-limit With this simple setup, I wasn't able to reproduce the issue when deploying a few stacks sequentially (which is what we use in our actual project). The issue can however be observed when deploying the stacks in parallel.

Error Log

128/101 | 9:04:29 AM | CREATE_IN_PROGRESS   | Custom::LogRetention        | hello_5/LogRetention (hello5LogRetention5D258C6A) Resource creation Initiated
 129/101 | 9:04:29 AM | CREATE_FAILED        | Custom::LogRetention        | hello_5/LogRetention (hello5LogRetention5D258C6A) Failed to create resource. Rate exceeded
	new LogRetention (/repos/logretention-rate-limit/node_modules/@aws-cdk/aws-lambda/lib/log-retention.ts:67:22)
	\_ new Function (/repos/logretention-rate-limit/node_modules/@aws-cdk/aws-lambda/lib/function.ts:537:28)
	\_ new LogRetentionRateLimitStack (/repos/logretention-rate-limit/lib/log-retention-rate-limit-stack.ts:17:18)
	\_ Object.<anonymous> (/repos/logretention-rate-limit/bin/log-retention-rate-limit.ts:8:3)
	\_ Module._compile (internal/modules/cjs/loader.js:1151:30)
	\_ Module.m._compile (/repos/logretention-rate-limit/node_modules/ts-node/src/index.ts:858:23)
	\_ Module._extensions..js (internal/modules/cjs/loader.js:1171:10)
	\_ Object.require.extensions.<computed> [as .ts] (/repos/logretention-rate-limit/node_modules/ts-node/src/index.ts:861:12)
	\_ Module.load (internal/modules/cjs/loader.js:1000:32)
	\_ Function.Module._load (internal/modules/cjs/loader.js:899:14)
	\_ Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
	\_ main (/repos/logretention-rate-limit/node_modules/ts-node/src/bin.ts:227:14)
	\_ Object.<anonymous> (/repos/logretention-rate-limit/node_modules/ts-node/src/bin.ts:513:3)
	\_ Module._compile (internal/modules/cjs/loader.js:1151:30)
	\_ Object.Module._extensions..js (internal/modules/cjs/loader.js:1171:10)
	\_ Module.load (internal/modules/cjs/loader.js:1000:32)
	\_ Function.Module._load (internal/modules/cjs/loader.js:899:14)
	\_ Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
	\_ /usr/local/lib/node_modules/npm/node_modules/libnpx/index.js:268:14

Environment

  • CLI Version: 1.41.0 (build 9e071d2)
  • Framework Version: 1.41.0
  • OS: OSX 10.14

Analysis

It fails when creating CloudWatch log groups. The issue could be fixed by relaxing the retry options for the CloudWatch SDK instance, I tested this locally by changing it to:

const cloudwatchlogs = new AWS.CloudWatchLogs({ apiVersion: '2014-03-28', maxRetries: 6, retryDelayOptions: { base: 300 }});

Another solution might be increasing a service limit. Unfortunately, I have no clue which rate limit is being hit here. It's not clear from the documentation:


This is 🐛 Bug Report

@jaapvanblaaderen jaapvanblaaderen added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels May 28, 2020
jaapvanblaaderen added a commit to jaapvanblaaderen/aws-cdk that referenced this issue May 28, 2020
This prevents throttling issues on stacks with a lot of Lambdas.

fixes aws#8257
@jaapvanblaaderen
Copy link
Contributor Author

Added PR with a change that fixes the issue. Code was inspired by a similar fix in: https://github.com/aws/aws-cdk/pull/2053/files

Not sure if this is the right approach though. Can imagine this can be better managed in one central location and/or be configurable.

@SomayaB SomayaB added @aws-cdk/aws-lambda Related to AWS Lambda in-progress This issue is being actively worked on. labels Jun 1, 2020
@nija-at nija-at added feature-request A feature should be added or improved. and removed bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jun 2, 2020
@nija-at nija-at changed the title Rate exceeded errors occur when log retention resources are created as part of creating Lambda functions Support retry strategy on Lambda LogRetention Jun 2, 2020
@nija-at
Copy link
Contributor

nija-at commented Jun 2, 2020

Re-classified this as a feature request.

@nija-at nija-at added effort/small Small work item – less than a day of effort good first issue Related to contributions. See CONTRIBUTING.md labels Jun 2, 2020
jaapvanblaaderen added a commit to jaapvanblaaderen/aws-cdk that referenced this issue Jun 8, 2020
jaapvanblaaderen added a commit to jaapvanblaaderen/aws-cdk that referenced this issue Jun 8, 2020
jaapvanblaaderen added a commit to jaapvanblaaderen/aws-cdk that referenced this issue Jun 11, 2020
jaapvanblaaderen added a commit to jaapvanblaaderen/aws-cdk that referenced this issue Jun 11, 2020
jaapvanblaaderen added a commit to jaapvanblaaderen/aws-cdk that referenced this issue Jun 12, 2020
@mergify mergify bot closed this as completed in #8258 Jun 12, 2020
mergify bot pushed a commit that referenced this issue Jun 12, 2020
…8258)

This prevents throttling issues on stacks with a lot of Lambdas.

fixes #8257

Implemented configurable `maxRetries` and `base` properties as part of the LogRetentionRetryOptions. The AWS SDK also supports specifying a customBackoff function. I skipped that as it's hard to implement in the current setup (impossible to provide a callback function in the event JSON).

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
@georstoy
Copy link

georstoy commented Oct 1, 2020

This is my config for logRetentionRetryOptions.
Selection_318

I'm still testing but it seems to fix the 'Rate exceeded' error.
Selection_315

@jaapvanblaaderen thank you for developing this feature!

@michchan
Copy link

This is my config for logRetentionRetryOptions.
Selection_318

I'm still testing but it seems to fix the 'Rate exceeded' error.
Selection_315

@jaapvanblaaderen thank you for developing this feature!

Mine seems fixed too. Thanks @georstoy !

@aman-saraf
Copy link

@jaapvanblaaderen . Thanks for implementing the retry mechanism on the logs retention. Keep up the good work mate !!

@kernwig
Copy link

kernwig commented Sep 5, 2023

For anyone who finds themselves here, this fix was broken in the SDK v3 upgrade and just fixed again in PR #26858, released in v2.94.0.

@scott-korin-ps
Copy link

scott-korin-ps commented Oct 24, 2023

For anyone who finds themselves here, this fix was broken in the SDK v3 upgrade and just fixed again in PR #26858, released in v2.94.0.

I'm getting warning messages that logRetentionRetryOptions is deprecated. We added this option to our lambdas because we are getting the rate exceeded error during the creation of LogRetention resources. We are using aws-cdk 2.96.2

Is this a different option we are supposed to set, or is this just supposed to be resolved without any additional changes?

@jaapvanblaaderen
Copy link
Contributor Author

For anyone who finds themselves here, this fix was broken in the SDK v3 upgrade and just fixed again in PR #26858, released in v2.94.0.

I'm getting warning messages that logRetentionRetryOptions is deprecated. We added this option to our lambdas because we are getting the rate exceeded error during the creation of LogRetention resources. We are using aws-cdk 2.96.2

Is this a different option we are supposed to set, or is this just supposed to be resolved without any additional changes?

Are you sure? As far as I know only the logRetentionRetryOptions.base property is deprecated, logRetentionRetryOptions.maxRetries should still be configurable (and used).

@scott-korin-ps
Copy link

scott-korin-ps commented Oct 24, 2023

For anyone who finds themselves here, this fix was broken in the SDK v3 upgrade and just fixed again in PR #26858, released in v2.94.0.

I'm getting warning messages that logRetentionRetryOptions is deprecated. We added this option to our lambdas because we are getting the rate exceeded error during the creation of LogRetention resources. We are using aws-cdk 2.96.2
Is this a different option we are supposed to set, or is this just supposed to be resolved without any additional changes?

Are you sure? As far as I know only the logRetentionRetryOptions.base property is deprecated, logRetentionRetryOptions.maxRetries should still be configurable (and used).

"aws-cdk-lib.aws_logs.LogRetentionRetryOptions#base is deprecated"

LOL, gotta work on that reading comprehension. Never mind. Thanks for the correction!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-lambda Related to AWS Lambda effort/small Small work item – less than a day of effort feature-request A feature should be added or improved. good first issue Related to contributions. See CONTRIBUTING.md in-progress This issue is being actively worked on.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants