"Large shard size" Stack Monitoring rule is missing "Look at the average over X minutes" option #111889
Labels: bug (Fixes for quality problems that affect the customer experience), Feature:Stack Monitoring (SM alerting improvements), Team:Monitoring (Stack Monitoring team)
The documentation for the Stack Monitoring "Large shard size" alert says "The condition is met if an index’s average shard size is 55gb or higher in the last 5 minutes", but the parameter for specifying the time period is missing from the rule definition.
We don't want a single spike over 55gb for primary shard size to cause an alert.
Force merges can cause a shard to grow well beyond 50 GB for a short while (in some cases it may double in size), potentially triggering an alert that would be considered a false positive.
We want the alert to fire only when the shard size averaged over the last X minutes (default 15 minutes) exceeds 75gb.
This gives users an additional control point and avoids unnecessary noise at times.
This would be similar to the "Disk usage" rule.
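To make the request concrete, here is a minimal sketch of the windowed-average check described above. It is not taken from the Kibana rule implementation: the `ShardSizeSample` type, the `shouldFireLargeShardAlert` function, and all parameter names are hypothetical, and it assumes shard size samples are already available in memory with timestamps.

```typescript
// Hypothetical sketch: none of these names come from Kibana's rule code.
interface ShardSizeSample {
  timestampMs: number; // when the sample was taken (epoch milliseconds)
  sizeGb: number; // primary shard size at that time, in GB
}

// Returns true only when the mean shard size over the trailing window
// meets or exceeds the threshold, so a short spike is diluted by the average.
function shouldFireLargeShardAlert(
  samples: ShardSizeSample[],
  nowMs: number,
  thresholdGb: number,
  windowMinutes: number
): boolean {
  const windowStartMs = nowMs - windowMinutes * 60 * 1000;
  const inWindow = samples.filter(
    (s) => s.timestampMs > windowStartMs && s.timestampMs <= nowMs
  );
  if (inWindow.length === 0) {
    return false; // assumption: no data in the window means no alert
  }
  const meanGb =
    inWindow.reduce((sum, s) => sum + s.sizeGb, 0) / inWindow.length;
  return meanGb >= thresholdGb;
}

// Example: one sample per minute, with a single spike during a force merge.
const minuteMs = 60 * 1000;
const nowMs = 10 * minuteMs;
const spikySamples: ShardSizeSample[] = [40, 40, 80, 40, 40].map(
  (sizeGb, i) => ({ timestampMs: nowMs - i * minuteMs, sizeGb })
);
// Mean over the 5-minute window is 48 GB, below a 55 GB threshold.
console.log(shouldFireLargeShardAlert(spikySamples, nowMs, 55, 5)); // false
```

With a window of one minute, the same data would fire if evaluated at the moment of the 80 GB sample, which is the false positive this issue is about.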