asg spin doctor #88
Comments
This can also occur when instances with non-compliant tags are terminated. In our policy, for example, we terminate instances with non-compliant tags once an hour. If the missing tag is the one that identifies the instance's owner, we have no way to email the owner, and the ASG keeps spinning. It would be nice to be able to trace an hourly instance scan back to the creator of the ASG, but at the very least we should suspend the ASGs in these cases as well. Not only applicable to this issue, but maybe custodian could hook into CloudTrail on all instance creation events and auto-tag the instances with who launched them? That person could then always be included in event notifications.
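To make the auto-tagging idea concrete, here is a minimal sketch of pulling the launcher out of a CloudTrail `RunInstances` event as delivered through CloudWatch Events. The function name `launcher_tags` and the tag key `CreatorName` are hypothetical, chosen only for illustration; this is not custodian code, and it only computes the tags rather than applying them.

```python
def launcher_tags(event, tag_key="CreatorName"):
    """Map each launched instance id to a tag recording who launched it.

    `event` is assumed to be a CloudWatch Events envelope whose `detail`
    holds a CloudTrail RunInstances record. `tag_key` is a hypothetical name.
    """
    detail = event.get("detail", {})
    if detail.get("eventName") != "RunInstances":
        return {}
    identity = detail.get("userIdentity", {})
    # Prefer the full ARN; fall back to the principal id if the ARN is absent.
    creator = identity.get("arn") or identity.get("principalId")
    if not creator:
        return {}
    # responseElements is null when the API call itself failed.
    response = detail.get("responseElements") or {}
    items = response.get("instancesSet", {}).get("items", [])
    return {
        item["instanceId"]: {tag_key: creator}
        for item in items
        if "instanceId" in item
    }


sample_event = {
    "detail": {
        "eventName": "RunInstances",
        "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"},
        "responseElements": {
            "instancesSet": {"items": [{"instanceId": "i-0abc"}]}
        },
    }
}
tags = launcher_tags(sample_event)
```

The returned mapping could then be handed to whatever tagging call is in use; keeping the extraction separate from the tagging makes it easy to test without AWS access.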
I added better support for ASG CWE rules, including state notifications, but I'm still a little unclear on what we should do as an action when we detect these. In some of the larger accounts, there would be thousands of events fired a day. We could try to batch and aggregate for notification. We could also resize down, but I'm hesitant to do that unless it's a structural issue with the launch config, since something like an ELB health check outage could be transient. Sounds like we need a filter on the structural issue, with resize-down and notify actions.
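As a rough illustration of the batch-and-aggregate option, the sketch below collapses a stream of failure events into one line per ASG and reason, so a single digest can replace thousands of individual notifications. The event shape (`asg` and `reason` keys) and the function name are assumptions made for the example.

```python
from collections import Counter


def summarize_events(events):
    """Collapse raw failure events into one row per (asg, reason) pair.

    Each event is assumed to be a dict with hypothetical `asg` and
    `reason` keys; real events would need to be mapped into this shape.
    """
    counts = Counter((e["asg"], e["reason"]) for e in events)
    return [
        {"asg": asg, "reason": reason, "count": n}
        for (asg, reason), n in sorted(counts.items())
    ]


raw_events = [
    {"asg": "web", "reason": "invalid-ami"},
    {"asg": "web", "reason": "invalid-ami"},
    {"asg": "api", "reason": "missing-elb"},
]
digest = summarize_events(raw_events)
```

Running this per time window (hourly, say) would produce one notification per window rather than one per event.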
to add some specifics of things I think would be useful... ASGs usually spin for a few different reasons. I'm sure there are others, but these come to mind:
It would be nice to be able to split them out to perform different actions based on the category of spin. For example, in the case of invalid configs, we can suspend and/or delete. In the case of continually failing health checks, I'd rather just notify since we've had cases where spinning was due to a network change, and we may not want to stop ASGs in that case. For the spinning due to instances being killed by other rules, I think we're solving that by putting those rules on the ASG configs. There may be some outliers, but that seems like less of an issue right now.
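The per-category handling described above could be sketched as a simple lookup from spin cause to actions. The category and action names here are hypothetical labels taken from this comment, not an existing custodian schema, and the choice of no action for the policy-kill case is an assumption based on it being handled elsewhere.

```python
from enum import Enum


class SpinCause(Enum):
    INVALID_CONFIG = "invalid-config"
    FAILING_HEALTH_CHECK = "failing-health-check"
    KILLED_BY_POLICY = "killed-by-policy"


# Hypothetical mapping: structural problems get suspended, possibly
# transient ones only notify, policy kills are handled on the ASG config.
ACTIONS = {
    SpinCause.INVALID_CONFIG: ("suspend", "notify"),
    SpinCause.FAILING_HEALTH_CHECK: ("notify",),
    SpinCause.KILLED_BY_POLICY: (),
}


def actions_for(cause):
    """Return the actions for a spin cause, defaulting to notify-only."""
    return ACTIONS.get(cause, ("notify",))
```

Defaulting unknown causes to notify-only keeps the destructive actions limited to cases that were classified explicitly.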
Addressed in #220: basically a filter for ASGs to detect invalid configs or missing ELBs.
resize/suspend/email/etc
More specifically, several accounts have autoscale groups that are spinning, i.e. repeatedly trying and failing to launch an instance, due to an invalid AMI, subnets, ELBs, etc.
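One way to put a working definition on "spinning" is to look at an ASG's recent scaling activities and check whether the latest several all failed. The sketch below assumes activity dicts carrying the `StartTime` and `StatusCode` keys found in the output of the Auto Scaling `DescribeScalingActivities` call; the function name and the threshold of three failures are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta


def is_spinning(activities, min_failures=3):
    """True if the most recent `min_failures` activities all failed.

    `min_failures` is a hypothetical threshold, not a custodian setting.
    """
    recent = sorted(activities, key=lambda a: a["StartTime"], reverse=True)
    recent = recent[:min_failures]
    return len(recent) == min_failures and all(
        a["StatusCode"] == "Failed" for a in recent
    )


start = datetime(2016, 1, 1, 12, 0)
failing = [
    {"StartTime": start + timedelta(minutes=i), "StatusCode": "Failed"}
    for i in range(3)
]
# A success newer than the failures means the group recovered.
recovered = failing + [
    {"StartTime": start + timedelta(minutes=10), "StatusCode": "Successful"}
]
```

This treats every activity alike; a stricter check might look only at launch activities or inspect the failure message to separate invalid configs from transient errors.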