Notifications and Alerts

SystemGuard manages notifications and alerts for system metrics. Users can view, create, edit, and delete alert rules, monitor active alerts, and check the status of the AlertManager service. The following sections describe each of these features.

View Rules

The View Rules section allows users to see the alert rules that are currently configured in the system. These rules define the conditions under which alerts are triggered. SystemGuard supports creating new alert rules based on custom system metrics.

Features:

  • List Existing Rules: View all alert rules currently configured in the system.
  • Create New Rules: Add new custom alert rules to monitor specific metrics or system behavior.
  • Edit/Delete Rules: Modify or remove existing rules based on the system’s requirements.

To view or create rules:

  1. Navigate to the Rules section in the SystemGuard settings.
  2. Review the list of current alert rules.
  3. Select Add Rule to create a new rule.
  4. Define the conditions and thresholds for the alert trigger.

Example Use Case:

You can create a rule to trigger an alert if CPU usage exceeds 80% for more than 5 minutes. Once this rule is activated, SystemGuard will continuously monitor the system and notify you if the conditions are met.
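The "threshold held for a duration" condition in this use case can be sketched in Python as follows. This is a minimal illustration of the logic only, not SystemGuard's implementation: the `ThresholdRule` class, its field names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdRule:
    """Hypothetical rule: fire when a metric stays above `threshold`
    for longer than `hold_seconds`."""
    name: str
    threshold: float        # e.g. 80.0 for 80% CPU usage
    hold_seconds: float     # e.g. 300 for 5 minutes
    breach_started: Optional[float] = None  # when the current breach began

    def observe(self, timestamp: float, value: float) -> bool:
        """Record one sample and return True if the alert should be firing."""
        if value <= self.threshold:
            # Condition cleared, so the hold timer starts over.
            self.breach_started = None
            return False
        if self.breach_started is None:
            self.breach_started = timestamp
        # "More than 5 minutes" is read as a strict comparison.
        return timestamp - self.breach_started > self.hold_seconds


rule = ThresholdRule(name="high_cpu", threshold=80.0, hold_seconds=300)

# Made-up samples, one per minute: (seconds since start, CPU usage in %).
samples = [(0, 85.0), (60, 90.0), (120, 70.0), (180, 88.0), (240, 91.0),
           (300, 95.0), (360, 93.0), (420, 89.0), (480, 92.0), (540, 90.0)]

for ts, cpu in samples:
    firing = rule.observe(ts, cpu)
    print(f"t={ts:>3}s cpu={cpu:.0f}% firing={firing}")
```

In this sketch the dip to 70% at 120 seconds resets the timer, so the alert fires only once usage has stayed above 80% for more than 300 seconds counted from the sample at 180 seconds.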

Active Alerts

The Active Alerts section provides a real-time overview of all the alerts that are currently active in the system. These alerts are triggered based on the conditions defined in the alert rules.

Features:

  • View All Active Alerts: Display a list of all alerts that have been triggered and are currently unresolved.
  • Alert Details: For each active alert, view detailed information including:
    • Alert Name: The identifier for the alert.
    • Triggered Rule: The rule that caused the alert to activate.
    • Severity Level: The level of urgency (e.g., critical, warning).
    • Timestamp: The time when the alert was triggered.
  • Acknowledge/Resolve Alerts: Users can acknowledge alerts to mark them as noticed or resolve them if the issue has been addressed.

To view active alerts:

  1. Navigate to the Alerts section in the SystemGuard settings.
  2. Review the list of active alerts.
  3. Select an alert for more details or to acknowledge it.

Example Use Case:

If memory usage exceeds its configured threshold and triggers an alert, this section shows the alert, the time it was triggered, and the specific threshold that was breached.
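The alert details and the acknowledge/resolve actions described in this section can be modelled with a small record type. The sketch below is illustrative only: the `Alert` and `Status` names, the field names, and the example values are assumptions, not SystemGuard's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import List


class Status(Enum):
    ACTIVE = "active"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


@dataclass
class Alert:
    """Hypothetical alert record holding the details listed above."""
    name: str             # Alert Name
    triggered_rule: str   # Triggered Rule
    severity: str         # Severity Level, e.g. "critical" or "warning"
    timestamp: datetime   # time the alert was triggered
    status: Status = Status.ACTIVE

    def acknowledge(self) -> None:
        """Mark the alert as noticed; a resolved alert stays resolved."""
        if self.status is Status.ACTIVE:
            self.status = Status.ACKNOWLEDGED

    def resolve(self) -> None:
        """Mark the underlying issue as addressed."""
        self.status = Status.RESOLVED


def unresolved(alerts: List[Alert]) -> List[Alert]:
    """Return alerts that would still appear in an active-alerts list."""
    return [a for a in alerts if a.status is not Status.RESOLVED]


now = datetime.now(timezone.utc)
alerts = [
    Alert("HighMemoryUsage", "memory_usage > 90%", "critical", now),
    Alert("HighCpuUsage", "cpu_usage > 80% for 5m", "warning", now),
]

alerts[0].acknowledge()  # noticed, but still unresolved
alerts[1].resolve()      # issue addressed, drops out of the active list

for alert in unresolved(alerts):
    print(alert.name, alert.severity, alert.status.value)
```

Here an acknowledged alert remains in the unresolved list, which matches the description above: acknowledging marks an alert as noticed, while resolving removes it from the active view.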

AlertManager Status

The AlertManager Status section provides details about the current state of the AlertManager service, which is responsible for handling the alerts generated by Prometheus. The status page helps users understand whether the alerting system is functioning as expected.

Features:

  • Service Health Check: Check whether the AlertManager service is running and reporting no errors.
  • Recent Alerts: View a log of recently handled alerts.
  • Service Restart: If the AlertManager service is not running or has encountered issues, users can restart the service directly from this page.

To view the AlertManager Status:

  1. Go to the AlertManager Status section in the SystemGuard settings.
  2. Review the status and any issues with the service.
  3. Take necessary action, such as restarting the service, if required.

Example Use Case:

If alerts are not being processed or sent out properly, you can visit the AlertManager Status page to verify that the service is operational and troubleshoot any issues.
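A health check of this kind can also be run from a script. The sketch below assumes a standard Prometheus Alertmanager, which serves a `/-/healthy` endpoint, listening on its default port 9093. Whether SystemGuard's status page uses this endpoint, and the address of your deployment, are assumptions; the function names are hypothetical.

```python
import urllib.error
import urllib.request


def health_url(base_url: str) -> str:
    """Build the Alertmanager health endpoint URL from a base URL."""
    return base_url.rstrip("/") + "/-/healthy"


def alertmanager_is_healthy(base_url: str = "http://localhost:9093",
                            timeout: float = 3.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200.

    The default base URL is an assumption (Alertmanager's default port);
    adjust it to match your deployment.
    """
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


if __name__ == "__main__":
    if alertmanager_is_healthy():
        print("AlertManager is responding")
    else:
        print("AlertManager is not reachable or reports an unhealthy state")
```

A `False` result only shows that the endpoint did not answer successfully. It does not say why, so the status page or the service logs are still the place to troubleshoot.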