Releases: aerospike/aerospike-monitoring
Aerospike Monitoring v3.6.0
Description & Highlights
- Aerospike Monitoring Stack version 3.6.0 includes a new use-case dashboard, the Secondary Index Summary View.
New features
- Added the Secondary Index Summary View use-case dashboard, which displays all batch transaction statistics. [OM-190]
- Attached the dashboards as a tar artefact to the release tag. [OM-191]
Aerospike Monitoring v3.5.0
New in this version
This version implements a variety of feature requests and improvements to enhance performance, usability, and reliability.
New features
- Added the Batch Transactions dashboard, which displays all batch transaction statistics. [OM-183]
- Added the Kubernetes View dashboard to show Kubernetes statistics while using Aerospike Database. [OM-185]
- Enhanced the Namespace View dashboard to show stripe metrics. [OM-182]
Aerospike Monitoring v3.4.0
Description & Highlights
- Aerospike Monitoring Stack version 3.4.0 includes 2 new dashboards: Graph Service functional metrics and Graph Service JVM metrics.
New features
- Added the Graph Service dashboard to display key functional metrics, including Graph Service Gremlin metrics. [OM-108]
- Added the Graph Service JVM dashboard to display key metrics specific to the Java virtual machine. [OM-170]
- Enhanced the Cluster View dashboard to include the newly added system metrics (memory and network) from Aerospike Prometheus Exporter. [OM-167]
- Enhanced the XDR View dashboard to include the newly added system metrics (network only) from Aerospike Prometheus Exporter. [OM-166]
Aerospike Monitoring v3.3.0
Description & Highlights
- Aerospike Monitoring Stack version 3.3.0 includes a new dashboard to compare multiple Aerospike clusters configured in the same datacenter or in different datacenters.
NOTE:
- The Datacenter comparison dashboard requires that Prometheus Exporter and all clusters are configured in a single Prometheus instance.
New features
- Added a dashboard to compare clusters in the same datacenter or in different datacenters, providing a side-by-side comparison of cluster data and namespace details. [OM-159]
Bug fixes
- Some panels in the Host and Node View dashboard were displaying labels incorrectly. [OM-160]
- Some labels were incorrect in the Host View dashboard. [OM-163]
Aerospike Monitoring v3.2.0
Description & Highlights
- Aerospike Monitoring Stack version 3.2.0 combines Prometheus Exporter metrics with relevant Aerospike metrics.
NOTE:
- The Host-related dashboards require Prometheus Node Exporter.
- To include the alert cluster label, you must add a cluster-name custom label to Prometheus Node Exporter.
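One way to attach such a label is through Prometheus file-based service discovery, where each target group carries its own labels. The sketch below is only an illustration: the label key `cluster_name`, the host addresses, and the cluster name are assumptions (Prometheus label names cannot contain hyphens, so the underscore form is used here), and 9100 is the Node Exporter default port.

```python
import json


def node_exporter_target_group(hosts, cluster_name, port=9100):
    """Build one Prometheus file_sd target group for Node Exporter hosts.

    The "cluster_name" label key is an assumption made for this sketch;
    use whatever key your alert rules and dashboards expect.
    """
    return {
        "targets": [f"{host}:{port}" for host in hosts],
        "labels": {"cluster_name": cluster_name},
    }


if __name__ == "__main__":
    # Hypothetical hosts and cluster name, for illustration only.
    groups = [node_exporter_target_group(["10.0.0.1", "10.0.0.2"], "prod-east")]
    # The printed JSON can be saved to a file referenced by file_sd_configs.
    print(json.dumps(groups, indent=2))
```

Every Node Exporter series scraped from these targets then carries the cluster label, so host alerts can be grouped by cluster alongside the Aerospike alerts.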
Features
- Combine Node Exporter and Aerospike metrics in a single dashboard. [OM-158]
- Supports analysis, issues, and activities like upgrade and node additions by showing critical Aerospike metrics alongside node and system metrics.
- Covers migrations, latencies, and XDR in the current release.
- Add additional Node Exporter alerts. [OM-157]
- Includes alerts related to memory, disk, network, and CPU utilization with customizable thresholds.
Aerospike Monitoring v3.1.0
Description & Highlights
- Aerospike Monitoring Stack version 3.1.0 includes new dashboards for Secret Agent and Host Details, as well as some critical bug fixes.
NOTE:
- The Host Details dashboard requires Prometheus Node Exporter.
Features
- [OM-149] - Added the new Secret Agent dashboard.
- Helps you monitor the Secret Agent component deployed alongside the Aerospike Server.
- Provides information about connections to the cloud, vault, and server.
- Displays errors related to fetching or serving secrets, including key errors, communication errors, and so on.
- [OM-153] - Added the new Host Details dashboard.
- Includes system metrics required to monitor the host running Aerospike Server.
- Utilizes metrics from Node Exporter running on the host and displays key metrics as state, aggregate, and historical.
- Provides metrics related to core, CPU utilization, network utilization, and errors.
- Includes alerts related to memory, disk, network, and CPU utilization with customizable thresholds.
- [OM-151] - Added support for microbenchmarks in Latency view.
Fixes
- [OM-150] - Cluster size was not always displayed correctly in cluster-view, rolling-restart, and multi-cluster view.
Aerospike Monitoring v3.0.0
Description & Highlights
- Aerospike Monitoring Stack version 3.0.0 is a major upgrade with improved dashboards designed to be forward and backward compatible with Aerospike versions.
- Establishes a consistent design pattern with status displayed at the top and detailed time ranges displayed below.
- All dashboards and alerts are forward compatible to 7.x versions and backwards compatible to 5.x versions.
- Removed unused and deprecated dashboards (alerts, exporter, and jobs).
Breaking Changes
- Alert severity is now a string, such as critical, error, warn, or info. The earlier number-based severity is deprecated.
- New alerts related to 7.0 metrics, connectors, and bug fixes are added with string severity only.
- Removed the 3 deprecated dashboards: alerts, exporter, and jobs.
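If you maintain custom alert rules that still use numeric severities, a small migration helper along these lines may be useful. This is a minimal sketch: the numeric-to-string mapping shown is purely hypothetical (these notes do not state which number corresponds to which name), and the rule is modelled as a plain dictionary with the usual `alert` and `labels` keys of a Prometheus alerting rule.

```python
from copy import deepcopy

# String severities named in the breaking-change note above.
VALID_SEVERITIES = {"critical", "error", "warn", "info"}

# Hypothetical mapping for illustration only; replace it with the numeric
# scheme your existing rules actually use.
HYPOTHETICAL_LEGACY_MAP = {"1": "critical", "2": "error", "3": "warn", "4": "info"}


def migrate_severity(rule, legacy_map):
    """Return a copy of an alert rule whose severity label is a string name."""
    migrated = deepcopy(rule)
    labels = migrated.setdefault("labels", {})
    severity = str(labels.get("severity", "")).strip()
    if severity in VALID_SEVERITIES:
        return migrated  # already uses a string severity
    if severity not in legacy_map:
        raise ValueError(
            f"unmapped severity {severity!r} in rule {rule.get('alert')!r}"
        )
    labels["severity"] = legacy_map[severity]
    return migrated


if __name__ == "__main__":
    # Hypothetical custom rule; the name and expression are placeholders.
    old_rule = {"alert": "MyCustomAlert", "expr": "up == 0", "labels": {"severity": 1}}
    print(migrate_severity(old_rule, HYPOTHETICAL_LEGACY_MAP))
```

Raising on an unmapped value, rather than guessing a default, keeps a rule from silently ending up with the wrong severity.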
Features
- [OM-116] - Add DynaTrace to the OTEL Examples.
- Added support documentation and example otel-collector configurations for integrating Aerospike metrics with DynaTrace.
- [OM-127] - Node View - Handle 7.0 metric changes.
- Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
- Data, index, and memory metrics are shown as minimum, average, and maximum values to make anomalies across namespaces easy to identify.
- [OM-128] - Namespace View - Handle 7.0 metric changes.
- Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
- Data, index, and memory metrics are shown as minimum, average, and maximum values to make anomalies across all nodes easy to identify.
- [OM-129] - Unique Data View - Handle 7.0 metric changes.
- Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
- Displays usage across clusters and historical usage by each cluster.
- Data is displayed in three layers: all clusters, a single cluster, and each namespace within a cluster.
- [OM-130] - Update Alert Rule to Handle 7.0 metric changes.
- Enhanced alerts to use 7.x metrics and marked previous alerts with the "pre7x" prefix.
- Alerts added, modified, or renamed:
- Modified: NamespaceDataCloseToStopWrites, LowDataAvailWarning, LowDataAvailCritical.
- Added: HighDataUseNamespaceWarning, HighDataUseNamespaceCritical.
- Renamed: pre7x_NamespaceSetQuotaWarning, pre7x_NamespaceSetQuotaAlertCritical.
- [OM-133] - Set Index - Handle 7.0 metric changes.
- Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
- Data metrics are shown as minimum, average, and maximum values to make anomalies across all nodes easy to identify.
- [OM-134] - All Flash - Handle 7.0 metrics.
- Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
- [OM-135] - Rolling Restart Dashboard - Handle 7.0 metric changes.
- Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
- Data and memory panels now show both top-k and bottom-k values, which highlight over-utilization and under-utilization.
- [OM-136] - Cluster view - Handle 7.0 metric changes.
- Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
- [OM-139] - Handle 7.0 - Multi cluster view.
- Revamped the dashboard to follow the 7.0 metrics theme: it displays the build version and alerts by severity, and splits data, index, and memory metrics into separate panels.
- Topology diagram now shows dashboards and connectors using different diagrams.
- [OM-144] - Remove deprecated job and alerts (old) dashboards.
- Removed the deprecated jobs, exporter, and alerts dashboards. The Alerts dashboard was replaced by the new Alerts View dashboard in a previous release.
Fixes
- [OM-140] - Standardized alert severity colors and fixed a bug so that the info alert count is now shown correctly.
- [OM-114] - AMS - Change ordering of memory free pct graph on Rolling Restart dashboard.
- [OM-74] - Avoid average function in namespace dashboard.
- [OM-105] - Monitoring dashboard "Namespace" does not show namespace level values.
- [OM-109] - Improve Dashboard Queries and Linting.
Aerospike Monitoring v2.8.0
Description
NOTE: The v2.8.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.
- This release includes 1 major feature: connector dashboards, alerts, and topology.
- Aerospike Monitoring Stack version 2.8.0 adds 2 dashboards, alerts, and bug fixes:
- 2 dashboards to monitor connectors and connector JVM metrics.
- Enhanced alerts to cover various aspects of Connector key metric thresholds and JVM health.
NOTE:
- Aerospike Prometheus exporter 1.13.0 or greater must be used to get the Aerospike 6.4 metrics.
- The Multi-Cluster View dashboard now requires the Diagram Panel plugin.
Features
- [OM-64] - Create predefined Prometheus alert rules for Connectors.
- This release includes 6 alerts covering the essential functional and process-health aspects of the connectors.
- Key alerts covered are connector-status, connector-request-lag, connector-request-errors, jvm heap, jvm cpu and jvm gc.
- [OM-56] - Connectors alerts & Dashboards
- Connector View dashboard, which helps monitor 6 connectors.
- Connectors supported are: xdr-proxy, kafka-outbound, pulsar-outbound, esp-outbound, elastic-search, and jms-outbound.
- Key metrics covered are: request lag, request error, success, skipped, connections, xdr record byte size, and so on.
- [OM-107] - Create a dashboard for a Connector(s)
- Connector JVM View dashboard, which helps monitor the JVM health of 6 connectors.
- Connectors supported are: xdr-proxy, kafka-outbound, pulsar-outbound, esp-outbound, elastic-search, and jms-outbound.
- Key metrics covered are: uptime, cpu, memory, threads, files, classes, and buffers.
- The Multi-Cluster View dashboard is enhanced to display the Aerospike Server topology using the cluster-name and xdr dc configurations.
NOTE: To view the data replication topology in the Multi-Cluster View, the cluster-name is mandatory, and each destination cluster-name must be configured as the name of the dc in the xdr section of the Aerospike Server configuration.
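The naming requirement in the note above can be illustrated with a short sketch. It is not the dashboard's actual implementation; it only shows why a dc name that does not match any cluster-name cannot be drawn as a replication link. The function name, the input shape, and the cluster names are hypothetical.

```python
def replication_edges(clusters):
    """Derive replication links from cluster names and XDR dc names.

    `clusters` maps each cluster-name to the list of dc names configured in
    its xdr section. A link can only be resolved when the dc name equals the
    cluster-name of another known cluster.
    """
    edges = []
    unresolved = []
    for source, dc_names in clusters.items():
        for dc in dc_names:
            if dc in clusters:
                edges.append((source, dc))
            else:
                unresolved.append((source, dc))
    return edges, unresolved


if __name__ == "__main__":
    # Hypothetical clusters; "backup" matches no cluster-name, so its link
    # cannot be placed in the topology.
    clusters = {"east": ["west"], "west": ["east", "backup"]}
    edges, unresolved = replication_edges(clusters)
    print("links:", edges)
    print("unresolved dc names:", unresolved)
```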
Fixes
- [OM-122] - Avoid duplicate defrag metric values on the namespace dashboard.
- [OM-113] - Namespace view dashboard - average objects per sprig stat.
- [OM-120] - Add high-water mark breached to the Rolling Restart dashboard.
Aerospike Monitoring v2.7.0
Description
NOTE: The v2.7.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.
- This release includes 2 major features - Enhanced Alerts and All Flash use-case dashboard.
- Aerospike Monitoring Stack version 2.7.0 adds a new dashboard and bug fixes:
- All Flash dashboard, covering key metrics to monitor when using flash storage for both the primary index and the secondary index (sindex).
- Enhanced alerts covering various aspects of server metrics. This release covers alerts on namespaces, XDR, latencies, best-practice checks, node-exporter, and so on.
NOTE:
- Aerospike Prometheus exporter 1.13.0 or greater must be used to get the Aerospike 6.4 metrics.
Features
- [OM-104] - Add new XDR bytes-shipped metrics to dashboards.
- Displays bytes-shipped as both a stat and a time series, which helps monitor replication progress.
- [OM-98] - Observability & Management Alerts - Enhance / enrich prometheus alerts from ACMS.
- This release includes 40 alerts covering various metrics of Aerospike Server, some key areas are:
- Namespaces, latencies, data replication (xdr), set, node-exporter, flash, best-practice checks, and so on.
- [OM-93] - Use-case Dashboard: all-flash.
- This release introduces a new use-case dashboard that focuses mainly on key metrics and alerts related to flash usage.
- Some key metrics are average objects per sprig, index-pressure, primary index flash, and secondary index flash.
- [OM-48] - Use-case Dashboard Organization & Naming.
- Added brief descriptions on each dashboard and updated tags to identify each dashboard easily.
- [OM-111] - Observability dashboard unit tests.
- Created a framework to test our dashboards automatically, including panels, expressions/queries, layout, and expression results.
- [OM-103] - Add user stat related alerts.
- Added user-stat-specific alerts covering connections, connection churn, and so on.
- [OM-101] - Add warning for best practice failures.
- Alerts if best practices are not followed when setting up the Aerospike server. The server sends this flag after a series of checks.
- [OM-102] - Add warning for node-exporter not being present.
- As a precursor to integrating node-exporter metrics into the Aerospike Monitoring Stack, this alert raises a warning in the Alerts View dashboard when node-exporter is not configured.
Aerospike Monitoring v2.6.1
Description
- The v2.6.1 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.
- Aerospike Monitoring Stack version 2.6.1 adds bug fixes.
NOTE:
- Aerospike Prometheus exporter 1.12.0 or greater must be used to get the Aerospike 6.3 metrics.
- Deprecated
- Existing Alerts dashboard is deprecated and will be removed in future releases.
- Existing Jobs dashboard is deprecated and will be removed in future releases.
Fixes
- [OM-100] - Issues in Multi-cluster view dashboard
-- Corrected label and unit in XDR panel.
-- Corrected links from XDR and Latencies to respective dashboards (instead of cluster-view).
-- Added an alert-severity based filter.
- Issues in Alerts view
-- Panel colors are corrected according to the severity types.
- Issues in Unique Data view
-- Unique data bytes were not shown correctly when custom labels were enabled in the configuration.
-- Added historical time-series for the unique data-bytes data point.