Commit
health/server: Fix stale references to old nodes during health probe
[ upstream commit 7c7b723 ]

Given the order of operations in prober.OnIdle, it is possible for the health probe to hold stale references to deleted nodes. When that occurs, node connectivity metrics which were previously deleted [1] would be brought back, causing confusion.

If users defined alerts on the node connectivity health check metrics (see example below), those alerts would erroneously trigger because the old nodes would appear in the metric labels as a failing health check.

Example given deletion of "kind-worker2" node:

```
cilium_node_connectivity_status source_cluster="kind-kind" source_node_name="kind-worker" target_cluster="kind-kind" target_node_name="kind-control-plane" target_node_type="remote_intra_cluster" type="endpoint" 1.000000
cilium_node_connectivity_status source_cluster="kind-kind" source_node_name="kind-worker" target_cluster="kind-kind" target_node_name="kind-control-plane" target_node_type="remote_intra_cluster" type="node" 1.000000
cilium_node_connectivity_status source_cluster="kind-kind" source_node_name="kind-worker" target_cluster="kind-kind" target_node_name="kind-worker" target_node_type="local_node" type="endpoint" 1.000000
cilium_node_connectivity_status source_cluster="kind-kind" source_node_name="kind-worker" target_cluster="kind-kind" target_node_name="kind-worker" target_node_type="local_node" type="node" 1.000000
cilium_node_connectivity_status source_cluster="kind-kind" source_node_name="kind-worker" target_cluster="kind-kind" target_node_name="kind-worker2" target_node_type="remote_intra_cluster" type="endpoint" 0.000000
```

Fixes: d9e1ff8 ("cilium-health: Remove unnecessary goroutine")

[1]: e9f97cd ("Ensures prometheus metrics associated with a deleted node are no longer reported.")

Signed-off-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
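
The failure mode described in this message can be illustrated with a small, self-contained Go sketch. It is not the Cilium code: `nodeRegistry`, `probeResult`, `runProbe`, and `pruneStale` are hypothetical names invented for illustration, and the sketch assumes only that a probe round works from a node list captured before a deletion took effect. It shows one generic way to avoid reporting a deleted node, namely filtering results against the current node set before publishing them; the actual fix in the commit may work differently.

```go
package main

import (
	"fmt"
	"sort"
)

// nodeRegistry is a hypothetical stand-in for the set of nodes the
// health server currently knows about.
type nodeRegistry map[string]struct{}

// probeResult is a hypothetical per-node outcome: 1 means reachable,
// 0 means unreachable (mirroring the metric values in the example).
type probeResult map[string]float64

// runProbe simulates a probe round over a snapshot of node names that
// was captured earlier. Nodes absent from reachable fail the probe.
func runProbe(snapshot []string, reachable map[string]bool) probeResult {
	res := make(probeResult, len(snapshot))
	for _, name := range snapshot {
		if reachable[name] {
			res[name] = 1
		} else {
			res[name] = 0
		}
	}
	return res
}

// pruneStale drops results for nodes that are no longer registered, so
// a node deleted during the round does not reappear in the output.
func pruneStale(res probeResult, current nodeRegistry) probeResult {
	out := make(probeResult, len(res))
	for name, status := range res {
		if _, ok := current[name]; ok {
			out[name] = status
		}
	}
	return out
}

// sortedNames returns the node names in a result in stable order.
func sortedNames(res probeResult) []string {
	names := make([]string, 0, len(res))
	for name := range res {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	registry := nodeRegistry{
		"kind-control-plane": {},
		"kind-worker":        {},
		"kind-worker2":       {},
	}
	// Snapshot taken before the deletion below.
	snapshot := []string{"kind-control-plane", "kind-worker", "kind-worker2"}

	// kind-worker2 is deleted while the probe round is in flight.
	delete(registry, "kind-worker2")
	reachable := map[string]bool{"kind-control-plane": true, "kind-worker": true}

	raw := runProbe(snapshot, reachable)
	fmt.Println("unfiltered:", sortedNames(raw))
	fmt.Println("filtered:", sortedNames(pruneStale(raw, registry)))
}
```

In the unfiltered result, the deleted node is still present with a failing status, which is the condition that would trip an alert on the connectivity metric. Filtering at publish time is only one possible design; reordering the steps so the node list is refreshed before results are recorded is another.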