New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ResourceLimitExceeded for cloudwatchLog event #5554

Merged
merged 2 commits into from Jan 7, 2019

Conversation

Projects
None yet
4 participants
@rdsedmundo
Copy link
Contributor

rdsedmundo commented Dec 2, 2018

What did you implement:

Closes #3447

How did you implement it:

As per CloudFormation nature, it won't delete a resource before trying to create a new one firstly, and the problem come up once it tries to create a new subscription filter, as it will hit CloudWatch's hard limit of 1 subscription filter per log group (https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html). The unique option to make it work is to manually delete the resource and then run the CF update again -- since the resource is gone CF is going to be able to create the new one, and also delete the old as its definition is not on template anymore. The idea is to delete the subscription filter whenever the destination function ARN is different from the one that was just configured at serverless.yml.

I haven't added UT at all because I'd like to know firstly if this implementation makes sense. It worked for all situations that were pointed out by @ztmdsbt on #3447.

How can we verify it:

  1. Create a project with serverless.yml with the following configuration: https://gist.github.com/rdsedmundo/556d8b6ed7fc119f13b8c54bf47595fa. handler.js content can be whatever, e.g just a hello world:
'use strict';

module.exports.hello = async (event, context) => {
  return { message: 'Go Serverless v1.0! Your function executed successfully!', event };
};
  1. Change cloudwatchLog log group interchangeably (set test-1 for the first function, then test-2 to the 3rd, etc), change function definitions orders (add hello first, then hello2 at the end, etc), create new functions at the beginning of the function definitions, at the middle, etc.
  2. If you're using the version from this PR you won't face the "ResourceLimitExceeded" error as we got on master (reference: https://serverless.com/framework/docs/providers/aws/events/cloudwatch-log#current-gotchas).

Todos:

  • Write tests
  • Write documentation
  • Fix linting errors
  • Make sure code coverage hasn't dropped
  • Provide verification config / commands / resources
  • Enable "Allow edits from maintainers" for this PR
  • Update the messages below

Is this ready for review?: YES
Is it a breaking change?: NO

@bryanrcampbell

This comment has been minimized.

Copy link

bryanrcampbell commented Dec 3, 2018

Thanks for doing this @rdsedmundo! This would solve an extremely irritating problem we have with cloudwatch log subscriptions.

@dschep
Copy link
Member

dschep left a comment

Mostly seems good. Few things:

  • what potential issues might there be with manually deleting the subscription before then updating the cfn stack?
  • see code comment
@rdsedmundo

This comment has been minimized.

Copy link
Contributor

rdsedmundo commented Dec 3, 2018

what potential issues might there be with manually deleting the subscription before then updating the cfn stack?

That's exactly what I was expecting to hear from the serverless team as a feedback.

From what I understand deleting outside CloudFormation is not ideal, but in the case of the subscription filters, when there's a change on the function target CloudFormation will remove the old definition from the template, so the script is removing it by hand, but this definition is not also on the template anymore in that circumstance.

From the tests that I run I saw no issue or inconsistency on the CloudFormation at the console. Run it multiple times and everything went fine.

It would be great to have an input of someone with better knowledge on that here.

@dschep
Copy link
Member

dschep left a comment

@horike37 isn't a fan of this (I'm inclined to agree with him. Deleting stuff out of band like this seems questionable). He suggests the following:

How about adding a flag to delete subscription filters outside CFN like sls deploy --force-update-loggroups? only when the flag enable, sls allows to delete the setting with aws-sdk

and don't forget to update https://github.com/serverless/serverless/blob/master/docs/providers/aws/events/cloudwatch-log.md#current-gotchas

@rdsedmundo

This comment has been minimized.

Copy link
Contributor

rdsedmundo commented Dec 7, 2018

Yeah, although on this situation it doesn't make difference as if the destinationArns are different the resource would have been deleted either way later, I agree that it's more prudent to let CloudFormation to handle that.

But instead of this CLI option, I'd prefer adding some extra configuration at the cloudWatchLog event, otherwise, I'd need to pass this option every time and it may get annoying. What do you think? I can also still keep with the check and provide a warning on a situation that probably an error will be triggered (in case of the flag not being provided and the deletion would have been necessary).

@dschep
Copy link
Member

dschep left a comment

Hmm. I like that, but it makes it extra clear to me that this isn't really behavior anyone would want to turn off. So as non-ideal it is to have aws-sdk calls outside of the CloudFormation deployment, cfn gives us little choice. As such, I'm fine with SDK calls, don't worry about making it an option.

I'd like a few changes however:

  1. The delete should not happen in compile step. This would cause sls package to perform the delete. It should only happen on sls deploy
  2. Remove the use of destructuring and ensure tests pass on node 4
  3. Remove the 'gotcha' note from the docs: https://github.com/serverless/serverless/blob/master/docs/providers/aws/events/cloudwatch-log.md#current-gotchas
cloudWatchLogs
.describeSubscriptionFilters({ logGroupName })
.promise()
.then(({ subscriptionFilters: [subscriptionFilter] }) => {

This comment has been minimized.

@dschep

dschep Dec 7, 2018

Member

Don't use destructuring as the framework still supports Node v4.


this.provider.getAccountId()
.then((accountId) => {
const { destinationArn: oldDestinationArn, filterName } = subscriptionFilter;

This comment has been minimized.

@dschep

dschep Dec 7, 2018

Member

here too

@rdsedmundo

This comment has been minimized.

Copy link
Contributor

rdsedmundo commented Dec 7, 2018

@dschep I submitted a new version without seeing your comments (was working on them while you're reviewing again). I'll address then later and squash the commits. Meanwhile, I added some UT tests. I'm more familiar using Jest so I might have not written some of them in the best way, please feel free to suggest improvements.

Could you guide me how to know if it's a sls:deploy or sls:package by the way? I'm not too much familiar with the framework codebase.

@dschep

This comment has been minimized.

Copy link
Member

dschep commented Dec 7, 2018

You shouldn't have to check if it's sls deploy explicitly. I recommend moving the check out of the compile functions & in to the deploy functions. Putting it in /lib/plugins/aws/deploy/lib/checkForChanges.js makes the most sense, I think.

@rdsedmundo

This comment has been minimized.

Copy link
Contributor

rdsedmundo commented Dec 9, 2018

@dschep Did you mean moving the whole code to the checkForChanges file and letting the original compile method as it was before on this file? Or what you meant was to use the deploy hooks instead of package?

@dschep

This comment has been minimized.

Copy link
Member

dschep commented Dec 10, 2018

The code checking if its need & then deleting the existing subscription should live in checkForChanges. You don't need to explicitly worry about deploy vs package. package calls the compile modules(this generates the cloudformation template). deploy and calls package before moving onto the deploy work, including checkForChanges.

@rdsedmundo rdsedmundo force-pushed the rdsedmundo:issues/3447 branch from a698877 to 73dab5e Dec 19, 2018

@rdsedmundo

This comment has been minimized.

Copy link
Contributor

rdsedmundo commented Dec 19, 2018

@dschep Changes done. I went ahead and excluded the old commits for avoiding noise on the PR. Let me know what you think.

@dschep
Copy link
Member

dschep left a comment

Sorry I wasn't super clear here, but I changed my mind on it being an option. Lets just always do it! I've updated the code/docs/tests to do that.

one last thought: what happens if it's already been deleted, should we catch that error and continue? That way if someone has scripted the deletion before sls deploy to work around this issue, this fix won't break their workflow.

it will throw when trying to get subscription filters of a log group that was just added
to the serverless.yml (therefore not created in AWS yet), we can safely ignore this error
*/
.catch(() => undefined);

This comment has been minimized.

@dschep

dschep Dec 21, 2018

Member

Oh, nevermind, this addresses my 'one last thought', right?

This comment has been minimized.

@rdsedmundo

rdsedmundo Dec 21, 2018

Contributor

Yes, that's right.

@dschep dschep dismissed their stale review Dec 21, 2018

looking good. just want final confirmation on one thing before approving.

@rdsedmundo

This comment has been minimized.

Copy link
Contributor

rdsedmundo commented Dec 21, 2018

By the way, the tests failed randomly it seems, AwsInvokeLocal, I don't think it's related to those changes.

@dschep

This comment has been minimized.

Copy link
Member

dschep commented Dec 21, 2018

Yeah. I don't worry too much about windows tests failures so long as most of the sub-builds pass.

@rdsedmundo

This comment has been minimized.

Copy link
Contributor

rdsedmundo commented Jan 2, 2019

Hey @dschep, happy new year! What's missing to this to get merged? I'm really looking forward to having this around for my projects here.

@dschep

This comment has been minimized.

Copy link
Member

dschep commented Jan 7, 2019

Me remembering to merge this is what's missing 😬

:shipit: !!

@dschep dschep merged commit 3489aed into serverless:master Jan 7, 2019

2 of 3 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.02%) to 90.799%
Details

@dschep dschep referenced this pull request Jan 9, 2019

Merged

v1.36.0 release! #5670

@shortjared shortjared added this to the 1.36.0 milestone Jan 9, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment