In a MetalK8s cluster, the Prometheus service records real-time metrics in a time series database. Prometheus can query a list of data sources called "exporters" at a specific polling frequency, and aggregate this data across the various sources.
Prometheus uses a special language, Prometheus Query Language (PromQL), to write alerting and recording rules.
Alert rules enable a user to specify a condition that must occur before an external system like Slack is notified. For example, a MetalK8s administrator might want to raise an alert for any node that is unreachable for more than one minute.
Out of the box, MetalK8s ships with preconfigured alert rules, which are written as PromQL queries. The table below outlines all the preconfigured alert rules exposed from a newly deployed MetalK8s cluster.
To customize predefined alert rules, refer to :ref:`csc-prometheus-customization`.
To snapshot the database, you must first :ref:`enable the Prometheus admin API<csc-enable-prometheus-admin-api>`.
To generate a snapshot, use the :ref:`sosreport utility<sosreport-generate-report>` with the following options:
root@host # sosreport --batch --build -o metalk8s -kmetalk8s.prometheus-snapshot=True
The name of the generated archive is printed on the console output and
the Prometheus snapshot can be found under prometheus_snapshot
directory.
Warning
You must ensure you have sufficient disk space (at least the size
of the Prometheus volume) under /var/tmp
or change the archive
destination with --tmp-dir=<new_dest>
option.