
Add more Raft status metrics for Prometheus and Grafana dashboard #386

Merged: bootjp merged 5 commits into main from copilot/add-raft-status-metrics on Mar 21, 2026

Copilot AI commented Mar 21, 2026

Expands the existing Prometheus metrics exporter with additional Raft status gauges sourced from the already-polled raft.Raft.Stats() map — no extra RPC calls.
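
As a rough illustration of how gauges could be derived from an already-polled stats map, here is a minimal sketch. It assumes the map has the `map[string]string` shape returned by hashicorp/raft's `Stats()` and that the five new gauges read the keys `term`, `last_log_index`, `last_snapshot_index`, `fsm_pending`, and `num_peers`. The helper name `parseRaftStats` is hypothetical and is not taken from this PR.

```go
package main

import (
	"fmt"
	"strconv"
)

// raftStatKeys lists the stats keys assumed to back the new gauges.
var raftStatKeys = []string{
	"term",
	"last_log_index",
	"last_snapshot_index",
	"fsm_pending",
	"num_peers",
}

// parseRaftStats (hypothetical helper) converts the selected string values
// to float64. Missing or non-numeric entries are skipped instead of being
// reported as zero, so a parse failure never looks like a real reading.
func parseRaftStats(stats map[string]string) map[string]float64 {
	out := make(map[string]float64, len(raftStatKeys))
	for _, key := range raftStatKeys {
		raw, ok := stats[key]
		if !ok {
			continue
		}
		v, err := strconv.ParseFloat(raw, 64)
		if err != nil {
			continue
		}
		out[key] = v
	}
	return out
}

func main() {
	// Hand-written sample standing in for one polled stats map.
	sample := map[string]string{
		"state":               "Leader",
		"term":                "3",
		"last_log_index":      "120",
		"last_snapshot_index": "100",
		"fsm_pending":         "0",
		"num_peers":           "2",
	}
	parsed := parseRaftStats(sample)
	for _, key := range raftStatKeys {
		if v, ok := parsed[key]; ok {
			fmt.Printf("%s=%v\n", key, v)
		}
	}
}
```

Because the values come from a map the exporter already holds, a helper like this adds only string parsing per poll and no network traffic.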

New metrics

| Metric | Description |
| --- | --- |
| elastickv_raft_term | Current Raft term; increments on each leader election |
| elastickv_raft_last_log_index | Index of the last written log entry; useful for comparing replication lag across nodes |
| elastickv_raft_last_snapshot_index | Index of the most recent snapshot |
| elastickv_raft_fsm_pending | Commands queued to the FSM but not yet applied; a backpressure indicator |
| elastickv_raft_num_peers | Number of other voting members in the cluster, excluding this node |

All metrics carry the same group, node_id, and node_address labels as existing Raft gauges.
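
To make the label set concrete, the sketch below renders one sample line in the Prometheus text exposition format using only the standard library. The `labels` type, the `formatSample` helper, and the label values are hypothetical; the actual exporter presumably uses a Prometheus client library rather than hand-formatting.

```go
package main

import (
	"fmt"
	"strconv"
)

// labels holds the three label values shared by every Raft gauge.
type labels struct {
	Group       string
	NodeID      string
	NodeAddress string
}

// formatSample (hypothetical helper) renders one gauge sample as a text
// exposition line. %q is used for quoting, which matches the Prometheus
// escaping rules for plain ASCII values like the ones shown here.
func formatSample(name string, l labels, value float64) string {
	return fmt.Sprintf("%s{group=%q,node_id=%q,node_address=%q} %s",
		name, l.Group, l.NodeID, l.NodeAddress,
		strconv.FormatFloat(value, 'g', -1, 64))
}

func main() {
	// Illustrative label values, not taken from a real cluster.
	l := labels{Group: "1", NodeID: "n1", NodeAddress: "127.0.0.1:50051"}
	fmt.Println(formatSample("elastickv_raft_term", l, 3))
	// prints: elastickv_raft_term{group="1",node_id="n1",node_address="127.0.0.1:50051"} 3
}
```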

Grafana dashboard

A new dashboard monitoring/grafana/dashboards/elastickv-raft-status.json is included for visualising Raft health at a glance:

| Row | Panels |
| --- | --- |
| Cluster Overview | Raft Term · Voting Peers · Member Count · FSM Pending (stat, color-coded) |
| Leader & State | Current Leader per Group · Node Local State · Cluster Members (tables) |
| Index Progress | last log / commit / applied / snapshot index over time (time series) |
| Replication Health | Replication Lag (last_log_index − applied_index) · Last Leader Contact in seconds (time series) |

The dashboard includes a group template variable for filtering by Raft group and auto-refreshes every 10 seconds.
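
The Replication Lag panel is described as `last_log_index − applied_index`. The sketch below works through that subtraction on unsigned indices. The function name `replicationLag` and the clamp at zero are assumptions added for illustration (the two values may be sampled at slightly different moments); the dashboard itself presumably computes the difference in a query expression.

```go
package main

import "fmt"

// replicationLag returns lastLogIndex - appliedIndex, clamped at zero so
// that an applied index sampled slightly later than the log index cannot
// underflow the unsigned subtraction.
func replicationLag(lastLogIndex, appliedIndex uint64) uint64 {
	if appliedIndex >= lastLogIndex {
		return 0
	}
	return lastLogIndex - appliedIndex
}

func main() {
	// Illustrative values: 120 entries written, 117 applied to the FSM.
	fmt.Println(replicationLag(120, 117)) // prints 3
	// A fully caught-up node reports zero lag.
	fmt.Println(replicationLag(120, 120)) // prints 0
}
```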

Original prompt

This section details the original issue you should resolve

<issue_title>Add metrics output for prometheus</issue_title>
<issue_description></issue_description>

<agent_instructions>add more raft status metrics</agent_instructions>

Comments on the Issue (you are @copilot in this section)



Copilot AI changed the title from "[WIP] Add metrics output for prometheus" to "Add more Raft status metrics for Prometheus" Mar 21, 2026
Copilot AI requested a review from bootjp March 21, 2026 09:57

bootjp commented Mar 21, 2026

@copilot fix the golangci-lint issues and add the documentation the AI requires

Copilot AI commented Mar 21, 2026

@copilot fix the golangci-lint issues and add the documentation the AI requires

Fixed in 71eedf1:

  • golangci-lint: Removed extra alignment spaces from the RaftMetrics struct fields (the gci formatter requires standard gofmt alignment, not manual column alignment).
  • Doc comments: Added a type-level doc comment to RaftMetrics and per-field comments describing each gauge's purpose, including all five new fields (term, lastLogIndex, lastSnapshotIndex, fsmPending, numPeers).

golangci-lint run ./monitoring/... now reports 0 issues.

Copilot AI requested a review from bootjp March 21, 2026 10:09

bootjp commented Mar 21, 2026

@copilot Please create a Grafana dashboard JSON for checking the Raft status

Copilot AI commented Mar 21, 2026

@copilot Please create a Grafana dashboard JSON for checking the Raft status

Added monitoring/grafana/dashboards/elastickv-raft-status.json in commit 7755874.

The dashboard has 4 rows:

| Row | Panels |
| --- | --- |
| Cluster Overview | Raft Term · Voting Peers · Member Count · FSM Pending (stat, color-coded) |
| Leader & State | Current Leader per Group · Node Local State · Cluster Members (tables) |
| Index Progress | last log / commit / applied / snapshot index over time (time series) |
| Replication Health | Replication Lag (last log − applied entries) · Last Leader Contact in seconds (time series) |

A group template variable is included so you can filter by Raft group, and the dashboard auto-refreshes every 10 seconds.

Copilot AI changed the title from "Add more Raft status metrics for Prometheus" to "Add more Raft status metrics for Prometheus and Grafana dashboard" Mar 21, 2026
bootjp marked this pull request as ready for review March 21, 2026 11:08
bootjp enabled auto-merge March 21, 2026 11:08
bootjp merged commit ffa29f1 into main Mar 21, 2026
7 checks passed
bootjp deleted the copilot/add-raft-status-metrics branch March 21, 2026 11:10

Development

Successfully merging this pull request may close these issues.

Add metrics output for prometheus

2 participants