SumoLogic · amee-sumo · Dec 10, 2024 · Dec 8, 2024 · Dec 9, 2024 · Dec 9, 2024
@@ -15,6 +15,10 @@ RabbitMQ logs are sent to Sumo Logic through the OpenTelemetry [filelog receiver
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/RabbitMq-OpenTelemetry/RabbitMQ-Schematics.png' alt="Schematics" />
 
+:::info
+This app includes [built-in monitors](#rabbitmq-alerts). For details on creating custom monitors, refer to [Create monitors for RabbitMQ app](#create-monitors-for-rabbitmq-app).
+:::
+
 ## Fields creation in Sumo Logic for RabbitMQ
 
 Following are the [Fields](/docs/manage/fields/) which will be created as part of RabbitMQ App install if not already present.
@@ -230,3 +234,20 @@ The **RabbitMQ - Logs** dashboard gives you an at-a-glance view of error message
 The **RabbitMQ - Metrics** dashboard gives you an at-a-glance view of your RabbitMQ deployment across brokers, queue, exchange, consumer, and messages.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/RabbitMq-OpenTelemetry/RabbitMQ-Metrics.png' alt="RabbitMQ Metrics dashboards" />
+
+## Create monitors for RabbitMQ app
+
+import CreateMonitors from '../../../reuse/apps/create-monitors.md';
+
+<CreateMonitors/>
+
+### RabbitMQ alerts
+
+| Name | Description | Alert Condition | Recover Condition |
+|:--|:--|:--|:--|
+| `RabbitMQ - High Consumer Count` | This alert is triggered when the number of consumers in a queue is higher than the given value (Default 10000). | Count `>=` 10000 | Count `<` 10000 |
+| `RabbitMQ - High Message Queue Size` | This alert is triggered when the number of messages in a queue exceeds a given threshold (Default 10000), indicating potential consumer issues or message processing bottlenecks. | Count `>=` 10000 | Count `<` 10000 |
+| `RabbitMQ - High Messages Count` | This alert is triggered when the number of messages in a queue is higher than the given value (Default 10000). | Count `>=` 10000 | Count `<` 10000 |
+| `RabbitMQ - High Unacknowledged Messages` | This alert is triggered when there are too many unacknowledged messages (Default 5000), suggesting consumer processing issues. | Count `>=` 5000 | Count `<` 5000 |
+| `RabbitMQ - Node Down` | This alert is triggered when a node in the RabbitMQ cluster is down. | Count `>=` 1 | Count `<` 1 |
+| `RabbitMQ - Zero Consumers Alert` | This alert is triggered when a queue has no consumers, indicating potential service issues. | Count `<=` 0 | Count `>` 0 |
@@ -2,7 +2,7 @@
 id: cassandra-opentelemetry
 title: Cassandra - OpenTelemetry Collector
 sidebar_label: Cassandra - OTel Collector
-description: Learn about the Sumo Logic OpenTelemetry App for Cassandra.
+description: Learn about the Sumo Logic OpenTelemetry app for Cassandra.
 ---
 
 import useBaseUrl from '@docusaurus/useBaseUrl';
@@ -15,10 +15,14 @@ The [Cassandra](https://cassandra.apache.org/_/cassandra-basics.html) app is a l
 
 Cassandra logs are sent to Sumo Logic through OpenTelemetry [filelog receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/filelogreceiver) and cassandra metrics are sent to Sumo Logic using [JMX opentelemetry receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/jmxreceiver) with the `target_system` set as [`cassandra`](https://github.com/open-telemetry/opentelemetry-java-contrib/blob/main/jmx-metrics/docs/target-systems/cassandra.md).
 
-The app supports Logs from the open-source version of Cassandra. The App is tested on the 4.0.0 version of Cassandra.
+The app supports logs from the open-source version of Cassandra. The app is tested on the 4.0.0 version of Cassandra.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Cassandra-OpenTelemetry/Cassandra-Schematics.png' alt="Schematics" />
 
+:::info
+This app includes [built-in monitors](#cassandra-alerts). For details on creating custom monitors, refer to [Create monitors for Cassandra app](#create-monitors-for-cassandra-app).
+:::
+
 ## Fields creation in Sumo Logic for Cassandra
 
 Following are the [Fields](/docs/manage/fields/) which will be created as part of Cassandra App install if not already present:
@@ -36,17 +40,17 @@ Following are the [Fields](/docs/manage/fields/) which will be created as part o
 JMX receiver collects Cassandra metrics from Cassandra server as part of the OpenTelemetry Collector (OTC).
 
   1. Follow the instructions in [JMX - OpenTelemetry's prerequisites section](/docs/integrations/app-development/opentelemetry/jmx-opentelemetry/#prerequisites) to download the [JMX Metric Gatherer](https://github.com/open-telemetry/opentelemetry-java-contrib/blob/main/jmx-metrics/README.md). This gatherer is used by the [JMX Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/jmxreceiver#details).
-
   2. Set the JMX port as part of `JAVA_OPTS` for Tomcat startup. Usually, it is set in the `/etc/systemd/system/cassandra.service` or `C:\Program Files\apache-tomcat\bin\tomcat.bat` file.
 
       ```json
       JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=11099 -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.password.file=${CASSANDRA_CONF_DIR}/jmx.password -Dcom.sun.management.jmxremote.access.file=${CASSANDRA_CONF_DIR}/jmx.access"
       ```
 
 #### For log collection
-Cassandra has three main logs: system.log, debug.log, and gc.log which hold general logging messages, debugging logging messages, and java garbage collection logs respectively.
 
-These logs by default live in `${CASSANDRA_HOME}/logs`, but most Linux distributions relocate logs to `/var/log/cassandra`. Operators can tune this location as well as what levels are logged using the provided logback.xml file. For more details on Cassandra logs, see[ this](https://cassandra.apache.org/doc/latest/troubleshooting/reading_logs.html) link.
+Cassandra has three main logs: `system.log`, `debug.log`, and `gc.log`, which hold general logging messages, debugging logging messages, and Java garbage collection logs, respectively.
+
+These logs live in `${CASSANDRA_HOME}/logs` by default, but most Linux distributions relocate them to `/var/log/cassandra`. Operators can change this location, as well as which levels are logged, using the provided `logback.xml` file. For more details on Cassandra logs, see the [Cassandra documentation on reading logs](https://cassandra.apache.org/doc/latest/troubleshooting/reading_logs.html).
 
 import LogsCollectionPrereqisites from '../../../reuse/apps/logs-collection-prereqisites.md';
 
@@ -78,7 +82,7 @@ You can add any custom fields which you want to be tagged with the data ingested
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Cassandra-OpenTelemetry/Cassandra-YAML.png' style={{border:'1px solid gray'}} alt="YAML" />
 
-### Step 3: Send logs to Sumo
+### Step 3: Send logs to Sumo Logic
 
 import LogsIntro from '../../../reuse/apps/opentelemetry/send-logs-intro.md';
 
@@ -133,7 +137,7 @@ import LogsOutro from '../../../reuse/apps/opentelemetry/send-logs-outro.md';
 
 <LogsOutro/>
 
-## Sample log messages
+## Sample log message
 
 ```sql
   INFO [ScheduledTasks:1] 2023-01-08 09:18:47,347 StatusLogger.java:101 - system.schema_aggregates
@@ -176,7 +180,7 @@ import LogsOutro from '../../../reuse/apps/opentelemetry/send-logs-outro.md';
 }
 ```
 
-## Sample log queries 
+## Sample log query
 
 Following is a query from the Cassandra app's **Cassandra - Overview** dashboard Nodes Up panel:
 
@@ -191,6 +195,7 @@ Following is a query from the Cassandra app's **Cassandra - Overview** dashboard
 ```
 
 ## Sample metrics query
+
 Following is the query from Cassandra App's overview Dashboard's Number of Requests Panel:
 
 ```sql
@@ -205,20 +210,15 @@ The **Cassandra - Overview** dashboard provides an at-a-glance view of Cassandra
 
 Use this dashboard to:
 
-- Identify number of nodes which are up and down
-- Gain insights into Memory - Init, used, Max and committed
-- Gain insights into the error and warning logs by thread and Node activity
+- Identify the number of nodes that are up and down.
+- Gain insights into memory: init, used, max, and committed.
+- Gain insights into error and warning logs by thread and node activity.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Cassandra-OpenTelemetry/Cassandra-Overview.png' alt="Collector" />
 
 ### Cache Stats
 
-The **Cassandra - Cache Stats** dashboard provides insight into the database cache status, schedule, and items.
-
-Use this dashboard to:
-
-- Monitor Cache performance.
-- Identify Cache usage statistics.
+The **Cassandra - Cache Stats** dashboard provides insight into the database cache status, schedule, and items. Use this dashboard to monitor cache performance and identify cache usage statistics.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Cassandra-OpenTelemetry/Cassandra-Cache-Stats.png' alt="Cache Stats" />
 
@@ -246,36 +246,50 @@ Use this dashboard to:
 
 ### Memtable
 
-The **Cassandra - Memtable** dashboard provides insights into memtable statistics.
-
-Use this dashboard to:
-
-- Review flush activity and memtable status.
+The **Cassandra - Memtable** dashboard provides insights into memtable statistics. Use this dashboard to review flush activity and memtable status.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Cassandra-OpenTelemetry/Cassandra-Memtable.png' alt="Memtable" />
 
 ### Resource Usage
 
-The **Cassandra - Resource Usage** dashboard provides details of resource utilization across Cassandra clusters.
-
-Use this dashboard to:
-
-- Identify resource utilization. This can help you to determine whether resources are over-allocated or under-allocated.
+The **Cassandra - Resource Usage** dashboard provides details of resource utilization across Cassandra clusters. Use this dashboard to determine whether resources are over-allocated or under-allocated.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Cassandra-OpenTelemetry/Cassandra-Resource-Usage-Logs.png' alt="Resource Usage" />
 
 ### Compaction
 
 The **Cassandra - Compactions** dashboard provides insight into the completed and pending compaction tasks.
+
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Cassandra-OpenTelemetry/Cassandra-Compaction.png' alt="Compaction" />
 
 ### Requests
 
 The **Cassandra - Requests** dashboard provides insight into the number of request served, number of error request, and their distribution by status and operation. Also you can monitor the read and write latency of the cluster instance using this dashboard.
+
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Cassandra-OpenTelemetry/Cassandra-Requests.png' alt="Requests" />
 
 ### Storage
 
 The **Cassandra - Storage** dashboard provides insight into the current value of total hints of your Cassandra cluster along with storage managed by the cluster.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Cassandra-OpenTelemetry/Cassandra-Storage.png' alt="Storage" />
+
+## Create monitors for Cassandra app
+
+import CreateMonitors from '../../../reuse/apps/create-monitors.md';
+
+<CreateMonitors/>
+
+### Cassandra alerts
+
+| Name | Description | Alert Condition | Recover Condition |
+|:--|:--|:--|:--|
+| `Cassandra - Compaction Task Pending` | This alert is triggered when there are more than 15 pending compaction tasks. | Count `>=` 15 | Count `<` 15 |
+| `Cassandra - High Hints Backlog` | This alert is triggered when the number of in-progress hints exceeds the given value for 5 minutes. | Count `>=` 5000 | Count `<` 5000 |
+| `Cassandra - High Memory Usage` | This alert is triggered when memory used exceeds 85% of committed memory for more than 10 minutes. | Count `>=` 1 | Count `<` 1 |
+| `Cassandra - Node Down Alert` | This alert is triggered when a Cassandra node status changes to DOWN for more than 5 minutes. | Count `>=` 1 | Count `<` 1 |
+| `Cassandra - Operation Error Rate High` | This alert is triggered when the error rate of operations exceeds the given value (Default 5%) for 5 minutes. | Count `>` 5 | Count `<=` 5 |
+| `Cassandra - Range Query Latency High (99th Percentile)` | This alert is triggered when the 99th percentile of range query latency exceeds the given value (Default 2 seconds) for 5 minutes. | Count `>=` 2000000 | Count `<` 2000000 |
+| `Cassandra - Read Latency High (99th Percentile)` | This alert is triggered when the 99th percentile of read latency exceeds the given value (Default 500ms) for 5 minutes. | Count `>=` 500000 | Count `<` 500000 |
+| `Cassandra - Storage Growth Rate Abnormal` | This alert is triggered when the storage growth rate exceeds the given value (Default 25MB/minute) for 5 minutes. | Count `>=` 26214400 | Count `<` 26214400 |
+| `Cassandra - Write Latency High (99th Percentile)` | This alert is triggered when the 99th percentile of write latency exceeds the given value (Default 200ms) for 5 minutes. | Count `>=` 200000 | Count `<` 200000 |
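
The numeric thresholds in the Cassandra alerts table look like the human-readable defaults converted to base units. The sketch below checks that arithmetic under two assumptions that the table itself does not state: latencies are measured in microseconds, and storage growth is measured in bytes using binary megabytes (1 MB = 1024 × 1024 bytes). The helper names and dictionary keys are hypothetical and are not part of the Sumo Logic app.

```python
# Assumed units (not stated in the table): microseconds for latency, bytes for storage.
US_PER_MS = 1_000
US_PER_S = 1_000_000
BYTES_PER_MB = 1024 * 1024  # binary megabyte; this is what reproduces 26214400


def ms_to_us(milliseconds: int) -> int:
    """Convert milliseconds to microseconds."""
    return milliseconds * US_PER_MS


def seconds_to_us(seconds: int) -> int:
    """Convert seconds to microseconds."""
    return seconds * US_PER_S


def mb_to_bytes(megabytes: int) -> int:
    """Convert binary megabytes to bytes."""
    return megabytes * BYTES_PER_MB


# Hypothetical keys; values are the defaults quoted in the alert descriptions.
thresholds = {
    "range_query_latency_p99": seconds_to_us(2),    # Default 2 seconds
    "read_latency_p99": ms_to_us(500),              # Default 500ms
    "write_latency_p99": ms_to_us(200),             # Default 200ms
    "storage_growth_per_minute": mb_to_bytes(25),   # Default 25MB/minute
}

for name, value in thresholds.items():
    print(f"{name}: {value}")
```

If you customize one of these monitors, the same conversions may help when translating a target such as "1 second" into the value entered in the alert condition, provided the unit assumptions above hold for your metrics.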
@@ -19,17 +19,20 @@ Memcached logs are sent to Sumo Logic through the OpenTelemetry [filelog receive
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Memcached-OpenTelemetry/Memcached-Schematics.png' alt="Schematics" />
 
+:::info
+This app includes [built-in monitors](#memcached-alerts). For details on creating custom monitors, refer to [Create monitors for Memcached app](#create-monitors-for-memcached-app).
+:::
+
 ## Fields creation in Sumo Logic for Memcached
 
 Following are the [Fields](/docs/manage/fields/) which will be created as part of Memcached App install if not already present.
 
 - **`sumo.datasource`**. Has a fixed value of **memcached**.
-- **`db.system`**. Has a fixed value of **memcached**
-- **`deployment.environment`**. User configured. This is the deployment environment where the Memcache cluster resides. For example: dev, prod or qa.
+- **`db.system`**. Has a fixed value of **memcached**.
+- **`deployment.environment`**. User configured. This is the deployment environment where the Memcached cluster resides. For example: dev, prod, or qa.
 - **`db.cluster.name`**. User configured. Enter a name to identify this Memcached cluster. This cluster name will be shown in the Sumo Logic dashboards.
 - **`db.node.name`**. This has value of the FQDN of the machine where OpenTelemetry collector is collecting logs and metrics from.
 
-
 ## Prerequisites
 
 1. Configure logging in Memcached: By default, the installation of Memcached will not write any request logs to disk. To add a log file for Memcached, you can use the following syntax:
@@ -221,13 +224,12 @@ Following is the query from Errors panel of Memcached app's overview Dashboard:
 | sum(ERROR) as ERROR by _timeslice
 ```
 ## Sample metrics queries
-**Total Get**
 
-```
+```sql title="Total Get"
 sumo.datasource=memcached deployment.environment=* db.cluster.name=* db.node.name=* metric=memcached.commands command=get | sum
 ```
 
-## Viewing Memcached Dashboards
+## Viewing the Memcached dashboards
 
 ### Overview
 
@@ -237,7 +239,7 @@ The **Memcached - Overview** dashboard provides an at-a-glance view of the Memca
 
 ### Operations
 
-The **Memcached - Operations** Dashboard provides detailed analysis on connections, thread requested, network bytes, table size.
+The **Memcached - Operations** dashboard provides detailed analysis of connections, threads requested, network bytes, and table size.
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Memcached-OpenTelemetry/Memcached-Operations.png' alt="Memcached dashboards" />
 
@@ -247,7 +249,6 @@ The **Memcached - Command Stats** dashboard provides detailed insights into the
 
 <img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Memcached-OpenTelemetry/Memcached-Command-Stats.png' alt="Memcached dashboards" />
 
-
 ### Cache Information
 
 The **Memcached - Cache Information** dashboard provides insight into cache states, cache hit, and miss rate over time.
@@ -258,4 +259,21 @@ The **Memcached - Cache Information** dashboard provides insight into cache stat
 
 The **Memcached - Logs** dashboard helps you quickly analyze your Memcached error logs, commands executed, and objects stored.
 
-<img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Memcached-OpenTelemetry/Memcached-Logs.png' alt="Memcached dashboards" />
+<img src='https://sumologic-app-data-v2.s3.amazonaws.com/dashboards/Memcached-OpenTelemetry/Memcached-Logs.png' alt="Memcached dashboards" />
+
+
+## Create monitors for Memcached app
+
+import CreateMonitors from '../../../reuse/apps/create-monitors.md';
+
+<CreateMonitors/>
+
+### Memcached alerts
+
+| Name | Description | Alert Condition | Recover Condition |
+|:--|:--|:--|:--|
+| `Memcached - Cache Hit Ratio` | This alert is triggered when the cache hit ratio is less than 50%. The hit rate is one of the most important indicators of Memcached performance. A high hit rate means faster responses to your users. If the hit rate is falling, you need quick visibility into why. | Count `<=` 50% | Count `>` 50% |
+| `Memcached - Commands Error` | This alert is triggered when Memcached reports command errors. | Count `>` 0 | Count `<=` 0 |
+| `Memcached - Current Connections` | This alert is triggered when current connections to Memcached are zero. | Count `<=` 0 | Count `>` 0 |
+| `Memcached - High Memory Usage` | This alert is triggered when Memcached memory usage exceeds the given threshold (in GB). | Count `>` 5 | Count `<=` 5 |
+| `Memcached - High Number of Connections` | This alert is triggered when the number of current connections to Memcached exceeds the given threshold. | Count `>=` 1000 | Count `<` 1000 |
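
The `Memcached - Cache Hit Ratio` row above compares a hit ratio against 50%. As a rough illustration only, the sketch below derives a hit ratio from Memcached's `get_hits` and `get_misses` counters and applies the alert and recover conditions from the table. The function names are hypothetical, and the monitor's actual query is not shown in this document, so treat the calculation as an assumption about how such a ratio is commonly computed.

```python
from typing import Optional


def cache_hit_ratio(get_hits: int, get_misses: int) -> Optional[float]:
    """Return the hit ratio as a percentage, or None when there were no lookups."""
    total = get_hits + get_misses
    if total == 0:
        return None  # no GET traffic, so the ratio is undefined
    return 100.0 * get_hits / total


def is_alerting(ratio_percent: Optional[float], threshold: float = 50.0) -> bool:
    """Mirror the table: alert at <= threshold, recover at > threshold."""
    if ratio_percent is None:
        return False  # assumption: do not alert when there is no data
    return ratio_percent <= threshold


ratio = cache_hit_ratio(get_hits=300, get_misses=700)
print(ratio, is_alerting(ratio))
```

A cache serving 300 hits and 700 misses has a 30% hit ratio in this sketch, which would satisfy the alert condition; the same cache at 900 hits and 100 misses would not.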