Ensures prometheus metrics associated with a deleted node are no longer reported.

[ upstream commit e9f97cd ]

When a node is deleted from a cluster, metrics associated with that node
are still exported to Prometheus. Rather than requiring an agent restart,
we want to delete these metrics dynamically when a node is removed from the cluster.

This PR ensures node_connectivity_status and node_connectivity_latency
no longer report metrics for nodes that are no longer present on the
cluster.
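
For readers unfamiliar with the mechanism, here is a minimal sketch of the idea, not the code in this PR. It assumes a toy in-memory gaugeVec standing in for the Prometheus client type, a hypothetical deleteNodeSeries hook, and a hypothetical target_node_name label; only the DeletePartialMatch method name comes from the diff.

```go
package main

import "fmt"

// labels stands in for prometheus.Labels so the sketch stays stdlib-only.
type labels map[string]string

// sample is one exported series: a label set and its current value.
type sample struct {
	labels labels
	value  float64
}

// gaugeVec is a toy in-memory gauge vector, not the real client type.
type gaugeVec struct {
	samples []sample
}

// Set appends a series; deduplication is left out for brevity.
func (g *gaugeVec) Set(l labels, v float64) {
	g.samples = append(g.samples, sample{labels: l, value: v})
}

// DeletePartialMatch drops every series whose labels contain all pairs in
// match and returns how many series were dropped.
func (g *gaugeVec) DeletePartialMatch(match labels) int {
	kept := g.samples[:0]
	deleted := 0
	for _, s := range g.samples {
		if contains(s.labels, match) {
			deleted++
			continue
		}
		kept = append(kept, s)
	}
	g.samples = kept
	return deleted
}

// contains reports whether have includes every key/value pair in want.
func contains(have, want labels) bool {
	for k, v := range want {
		if got, ok := have[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// deleteNodeSeries is a hypothetical hook called when a node leaves the
// cluster. The "target_node_name" label is an assumption for illustration.
func deleteNodeSeries(nodeName string, vecs ...*gaugeVec) int {
	total := 0
	for _, v := range vecs {
		total += v.DeletePartialMatch(labels{"target_node_name": nodeName})
	}
	return total
}

func main() {
	status, latency := &gaugeVec{}, &gaugeVec{}
	status.Set(labels{"target_node_name": "node-a", "type": "endpoint"}, 1)
	status.Set(labels{"target_node_name": "node-b", "type": "endpoint"}, 1)
	latency.Set(labels{"target_node_name": "node-a", "protocol": "icmp"}, 0.002)

	removed := deleteNodeSeries("node-a", status, latency)
	fmt.Println(removed, len(status.samples), len(latency.samples)) // prints: 2 1 0
}
```

Matching on a partial label set means the caller only needs the node name and does not have to enumerate every label combination that was ever reported for that node.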

[ Backporter's notes: Original PR was adapted! ]

The original PR depends mainly on two other PRs that haven't been
backported and are fairly substantial.
Given this, I've opted to adapt the original implementation to surface the
fix while minimizing impact, with two updates:
1. As of this release, pkg/metrics/interfaces has not yet introduced the
   pkg/metrics/metric wrappers. Hence deletableVec was adapted to use the
   current implementation. (Referring to commit: 84ea383)
2. pkg/node/manager/manager was adapted to provide for metrics deletion when a
   node is deleted. A subsequent PR refactored the manager metrics structure,
   which the original PR used. (Referring to commit: c49ef45)
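
The first adaptation can be pictured roughly as follows. This is a sketch under assumptions, not the backported code: Labels and Gauge are local stand-ins for prometheus.Labels and prometheus.Gauge, Reset is assumed to be part of deletableVec because the no-op type implements it, and forgetNode with its target_node_name label is hypothetical.

```go
package main

import "fmt"

// Labels and Gauge are local stand-ins for prometheus.Labels and
// prometheus.Gauge, so the sketch needs no external module.
type Labels map[string]string

type Gauge interface {
	Set(float64)
}

// deletableVec groups the deletion methods. Reset is assumed to belong here.
type deletableVec interface {
	Delete(ll Labels) bool
	DeleteLabelValues(lvs ...string) bool
	DeletePartialMatch(labels Labels) int
	Reset()
}

// GaugeVec embeds deletableVec, so every gauge vector can delete series.
type GaugeVec interface {
	deletableVec
	WithLabelValues(lvls ...string) Gauge
}

type noOpGauge struct{}

func (noOpGauge) Set(float64) {}

// gaugeVec is the no-op implementation used while a metric is disabled.
type gaugeVec struct{}

func (*gaugeVec) Delete(Labels) bool               { return false }
func (*gaugeVec) DeleteLabelValues(...string) bool { return false }
func (*gaugeVec) DeletePartialMatch(Labels) int    { return 0 }
func (*gaugeVec) Reset()                           {}
func (*gaugeVec) WithLabelValues(...string) Gauge  { return noOpGauge{} }

// The assignment doubles as a compile-time check that *gaugeVec
// satisfies GaugeVec.
var NoOpGaugeVec GaugeVec = &gaugeVec{}

// NodeConnectivityStatus starts out as the no-op vector.
var NodeConnectivityStatus = NoOpGaugeVec

// forgetNode is a hypothetical caller; the label name is an assumption.
func forgetNode(vec GaugeVec, name string) int {
	return vec.DeletePartialMatch(Labels{"target_node_name": name})
}

func main() {
	// Callers can delete unconditionally, even when the metric is disabled.
	fmt.Println(forgetNode(NodeConnectivityStatus, "node-a")) // prints: 0
}
```

Folding the deletion methods into GaugeVec itself removes the need for a separate deletable no-op type, which appears to be why the diff drops NoOpGaugeDeletableVec.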

Signed-off-by: Fernand Galiana <fernand.galiana@isovalent.com>
derailed committed Nov 6, 2023
1 parent 07bf097 commit 4b9c985
Showing 2 changed files with 17 additions and 26 deletions.
39 changes: 15 additions & 24 deletions pkg/metrics/interfaces.go
@@ -27,13 +27,13 @@ type CounterVec interface {
 }

 type GaugeVec interface {
+    deletableVec
+
     WithLabelValues(lvls ...string) prometheus.Gauge
     prometheus.Collector
 }

-type DeletableGaugeVec interface {
-    GaugeVec
-
+type deletableVec interface {
     Delete(ll prometheus.Labels) bool
     DeleteLabelValues(lvs ...string) bool
     DeletePartialMatch(labels prometheus.Labels) int
@@ -44,13 +44,12 @@ var (
     NoOpMetric    prometheus.Metric    = &metric{}
     NoOpCollector prometheus.Collector = &collector{}

-    NoOpCounter           prometheus.Counter     = &counter{NoOpMetric, NoOpCollector}
-    NoOpCounterVec        CounterVec             = &counterVec{NoOpCollector}
-    NoOpObserver          prometheus.Observer    = &observer{}
-    NoOpObserverVec       prometheus.ObserverVec = &observerVec{NoOpCollector}
-    NoOpGauge             prometheus.Gauge       = &gauge{NoOpMetric, NoOpCollector}
-    NoOpGaugeVec          GaugeVec               = &gaugeVec{NoOpCollector}
-    NoOpGaugeDeletableVec DeletableGaugeVec      = &gaugeDeletableVec{gaugeVec{NoOpCollector}}
+    NoOpCounter     prometheus.Counter     = &counter{NoOpMetric, NoOpCollector}
+    NoOpCounterVec  CounterVec             = &counterVec{NoOpCollector}
+    NoOpObserver    prometheus.Observer    = &observer{}
+    NoOpObserverVec prometheus.ObserverVec = &observerVec{NoOpCollector}
+    NoOpGauge       prometheus.Gauge       = &gauge{NoOpMetric, NoOpCollector}
+    NoOpGaugeVec    GaugeVec               = &gaugeVec{NoOpCollector}
 )

 // Metric
@@ -136,28 +135,20 @@ func (g *gauge) SetToCurrentTime() {}

// GaugeVec

type gaugeDeletableVec struct {
gaugeVec
type gaugeVec struct {
prometheus.Collector
}

func (*gaugeDeletableVec) Delete(ll prometheus.Labels) bool {
func (*gaugeVec) Delete(ll prometheus.Labels) bool {
return false
}

func (*gaugeDeletableVec) DeleteLabelValues(lvs ...string) bool {
func (*gaugeVec) DeleteLabelValues(lvs ...string) bool {
return false
}

func (*gaugeDeletableVec) DeletePartialMatch(labels prometheus.Labels) int {
func (*gaugeVec) DeletePartialMatch(labels prometheus.Labels) int {
return 0
}

func (*gaugeDeletableVec) Reset() {}

type gaugeVec struct {
prometheus.Collector
}

func (*gaugeVec) Reset() {}
func (gv *gaugeVec) WithLabelValues(lvls ...string) prometheus.Gauge {
return NoOpGauge
}
4 changes: 2 additions & 2 deletions pkg/metrics/metrics.go
@@ -229,11 +229,11 @@ var (

     // NodeConnectivityStatus is the connectivity status between local node to
     // other node intra or inter cluster.
-    NodeConnectivityStatus = NoOpGaugeDeletableVec
+    NodeConnectivityStatus = NoOpGaugeVec

     // NodeConnectivityLatency is the connectivity latency between local node to
     // other node intra or inter cluster.
-    NodeConnectivityLatency = NoOpGaugeDeletableVec
+    NodeConnectivityLatency = NoOpGaugeVec

     // Endpoint
