From 332e2ca73ec7c58eb72192a5f382ae07af066077 Mon Sep 17 00:00:00 2001
From: sumoanema
Date: Wed, 6 Nov 2024 13:47:46 +0530
Subject: [PATCH 1/3] Removing monitors in kafka classic app which are not part of the monitors package but are part of documentation. Rearraging monitor list in the same order as present in json file

---
 .../containers-orchestration/kafka.md | 23 +++++++------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/docs/integrations/containers-orchestration/kafka.md b/docs/integrations/containers-orchestration/kafka.md
index 00156cd956..2f597962b4 100644
--- a/docs/integrations/containers-orchestration/kafka.md
+++ b/docs/integrations/containers-orchestration/kafka.md
@@ -728,31 +728,24 @@ Use this dashboard to:
 | Alert Name | Alert Description and conditions | Alert Condition | Recover Condition |
 |:---------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|:-------------------|
-| Kafka - High CPU on Broker node | This alert fires when we detect that the average CPU utilization for a broker node is high (`>=`85%) for an interval of 5 minutes. | | |
 | Kafka - High Broker Disk Utilization | This alert fires when we detect that a disk on a broker node is more than 85% full. | `>=`85 | < 85 |
+| Kafka - Failed Zookeeper connections | This alert fires when we detect Broker to Zookeeper connection failures | | |
+| Kafka - High Leader election rate | This alert fires when we detect high leader election rate. | | |
 | Kafka - Garbage collection | This alert fires when we detect that the average Garbage Collection time on a given Kafka broker node over a 5 minute interval is more than one second. | > = 1 | < 1 |
-| Kafka - High Broker Memory Utilization | This alert fires when the average memory utilization within a 5 minute interval for a given Kafka node is high (`>=`85%). | `>=` 85 | < 85 |
-| Kafka - Large number of broker errors | This alert fires when we detect that there are 5 or more errors on a Broker node within a time interval of 5 minutes. | | |
-| Kafka - Large number of broker warnings | This alert fires when we detect that there are 5 or more warnings on a Broker node within a time interval of 5 minutes. | | |
-| Kafka - Out of Sync Followers | | | |
-| Kafka - Unavailable Replicas | This alert when we detect that there are replicas that are unavailable. | | |
-| Kafka - Consumer Lag | This alert fires when we detect that a Kafka consumer has a 30 minutes and increasing lag | | |
+| Kafka - Offline Partitions | This alert fires when we detect offline partitions on a given Kafka broker. | | |
 | Kafka - Fatal Event on Broker | This alert fires when we detect a fatal operation on a Kafka broker node | `>=`1 | `<`1 |
-| Kafka - Multiple Errors on Broker | This alert fires when we detect five or more errors on a Kafka broker node in a 5 minute interval. | `>=`5 | `<`5 |
 | Kafka - Underreplicated Partitions | This alert fires when we detect underreplicated partitions on a given Kafka broker. | | |
-| Kafka - Offline Partitions | This alert fires when we detect offline partitions on a given Kafka broker. | | |
-| Kafka - High Leader election rate | This alert fires when we detect high leader election rate. | | |
-| Kafka - Failed Zookeeper connections | This alert fires when we detect Broker to Zookeeper connection failures | | |
-| Kafka - Replica Lag | This alert fires when we detect that a Kafka replica has a lag of over 30 minutes | | |
-| Kafka - Lower Producer-Consumer buffer time | This alert fires when we detect that there is only one hour of time remaining between earliest offset and consumer position. | | |
+| Kafka - Large number of broker errors | This alert fires when we detect that there are 5 or more errors on a Broker node within a time interval of 5 minutes. | | |
+| Kafka - High CPU on Broker node | This alert fires when we detect that the average CPU utilization for a broker node is high (`>=`85%) for an interval of 5 minutes. | | |
+| Kafka - Out of Sync Followers | | | |
+| Kafka - High Broker Memory Utilization | This alert fires when the average memory utilization within a 5 minute interval for a given Kafka node is high (`>=`85%). | `>=` 85 | < 85 |
+
 ## Kafka Metrics
 Here's a list of available Kafka metrics.
-d
-

From f7f20c486f551d64ce55542c4b9050b55ac1c3ac Mon Sep 17 00:00:00 2001
From: sumoanema
Date: Thu, 7 Nov 2024 10:30:22 +0530
Subject: [PATCH 2/3] Review comment implementation

---
 docs/integrations/containers-orchestration/kafka.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/integrations/containers-orchestration/kafka.md b/docs/integrations/containers-orchestration/kafka.md
index 2f597962b4..ed22a51b2e 100644
--- a/docs/integrations/containers-orchestration/kafka.md
+++ b/docs/integrations/containers-orchestration/kafka.md
@@ -737,7 +737,7 @@ Use this dashboard to:
 | Kafka - Underreplicated Partitions | This alert fires when we detect underreplicated partitions on a given Kafka broker. | | |
 | Kafka - Large number of broker errors | This alert fires when we detect that there are 5 or more errors on a Broker node within a time interval of 5 minutes. | | |
 | Kafka - High CPU on Broker node | This alert fires when we detect that the average CPU utilization for a broker node is high (`>=`85%) for an interval of 5 minutes. | | |
-| Kafka - Out of Sync Followers | | | |
+| Kafka - Out of Sync Followers | This alert fires when we detect that there are Out of Sync Followers within a time interval of 5 minutes. | | |
 | Kafka - High Broker Memory Utilization | This alert fires when the average memory utilization within a 5 minute interval for a given Kafka node is high (`>=`85%). | `>=` 85 | < 85 |

From f3fb08b253a8b99c4856cb7fd0e859d90eaa937a Mon Sep 17 00:00:00 2001
From: Alekh Nema <91047769+sumoanema@users.noreply.github.com>
Date: Thu, 7 Nov 2024 10:30:53 +0530
Subject: [PATCH 3/3] Apply suggestions from code review

Co-authored-by: John Pipkin (Sumo Logic)
---
 docs/integrations/containers-orchestration/kafka.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/integrations/containers-orchestration/kafka.md b/docs/integrations/containers-orchestration/kafka.md
index ed22a51b2e..61ed4e3225 100644
--- a/docs/integrations/containers-orchestration/kafka.md
+++ b/docs/integrations/containers-orchestration/kafka.md
@@ -729,7 +729,7 @@ Use this dashboard to:
 | Alert Name | Alert Description and conditions | Alert Condition | Recover Condition |
 |:---------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|:-------------------|
 | Kafka - High Broker Disk Utilization | This alert fires when we detect that a disk on a broker node is more than 85% full. | `>=`85 | < 85 |
-| Kafka - Failed Zookeeper connections | This alert fires when we detect Broker to Zookeeper connection failures | | |
+| Kafka - Failed Zookeeper connections | This alert fires when we detect Broker to Zookeeper connection failures. | | |
 | Kafka - High Leader election rate | This alert fires when we detect high leader election rate. | | |
 | Kafka - Garbage collection | This alert fires when we detect that the average Garbage Collection time on a given Kafka broker node over a 5 minute interval is more than one second. | > = 1 | < 1 |
 | Kafka - Offline Partitions | This alert fires when we detect offline partitions on a given Kafka broker. | | |