fix: Align UnhealthyInstance alarm period with underlying metric period
GuUnhealthyInstancesAlarm is built on top of the UnHealthyHostCount metric, which the load balancer measures and publishes at 60-second intervals. With the alarm period set to 5 minutes, CloudWatch Alarms has to group the underlying metric into buckets of five data points and pick (in our case) the maximum value of each bucket. The result of this operation forms the basis for the set of `alarm data points` that the alarm uses to decide whether it is in an alarm state.

The results of this bucketing can be unstable, because CloudWatch Alarms evaluates on a rolling-window basis. This makes the triggering of the alarm itself unstable: it is prone to false recoveries shortly after the initial alarm, and to false alarms upon recovery.

Setting the alarm period to the same period as the underlying metric synchronises the two, and alarm conditions become much more stable.
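
The bucketing effect described above can be illustrated with a small sketch. This is a deliberately simplified model and not CloudWatch's actual evaluation algorithm: `bucketMax`, the sample values, and the window offsets are all hypothetical, chosen only to show that taking the maximum over five-sample buckets can yield different data points depending on where the window happens to start.

```typescript
// Simplified model only: this is NOT CloudWatch's actual evaluation algorithm.
// bucketMax is a hypothetical helper invented for this illustration.
function bucketMax(samples: number[], size: number, offset: number): number[] {
  const result: number[] = [];
  // Only complete buckets are emitted; a trailing partial bucket is dropped.
  for (let start = offset; start + size <= samples.length; start += size) {
    result.push(Math.max(...samples.slice(start, start + size)));
  }
  return result;
}

// One sample per minute; 1 means an unhealthy host was reported in that minute.
// The values are made up for the example.
const samples = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0];

// 5-minute period, window starting at minute 0: two breaching data points.
console.log(bucketMax(samples, 5, 0)); // [ 1, 1, 0 ]

// 5-minute period, window starting at minute 2: only one breaching data point.
console.log(bucketMax(samples, 5, 2)); // [ 1, 0 ]

// 1-minute period: the data points are the raw samples, so alignment
// no longer changes how many of them breach.
console.log(bucketMax(samples, 1, 0));
```

In this toy model the same two unhealthy minutes count as either one or two breaching data points at a 5-minute period, while at a 1-minute period they always count as exactly two.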