Unregister prom gauges when recycling cluster watcher
Fixes #11839

When `restartClusterWatcher` fails to connect to the target cluster for
whatever reason, the function gets called again 10s later and tries to
register the same Prometheus metrics without unregistering them first,
which generates warnings.

The problem lies in `NewRemoteClusterServiceWatcher`, which instantiates
the remote kube-api client and registers the metrics, returning a nil
object if the client can't connect. `cleanupWorkers` at the beginning of
`restartClusterWatcher` won't unregister those metrics because of that
nil object.

This fix reorders `NewRemoteClusterServiceWatcher` so that an object is
returned even when the connection check fails, allowing cleanup to be
performed on that object.
alpeb committed Jan 3, 2024
1 parent 311e7c9 commit 999eff3
Showing 1 changed file with 9 additions and 6 deletions.
15 changes: 9 additions & 6 deletions multicluster/service-mirror/cluster_watcher.go
@@ -173,10 +173,6 @@ func NewRemoteClusterServiceWatcher(
 	if err != nil {
 		return nil, fmt.Errorf("cannot initialize api for target cluster %s: %w", clusterName, err)
 	}
-	_, err = remoteAPI.Client.Discovery().ServerVersion()
-	if err != nil {
-		return nil, fmt.Errorf("cannot connect to api for target cluster %s: %w", clusterName, err)
-	}
 
 	// Create k8s event recorder
 	eventBroadcaster := record.NewBroadcaster()
@@ -188,7 +184,7 @@ func NewRemoteClusterServiceWatcher(
 	})
 
 	stopper := make(chan struct{})
-	return &RemoteClusterServiceWatcher{
+	rcsw := RemoteClusterServiceWatcher{
 		serviceMirrorNamespace: serviceMirrorNamespace,
 		link: link,
 		remoteAPIClient: remoteAPI,
@@ -207,7 +203,14 @@ func NewRemoteClusterServiceWatcher(
 		headlessServicesEnabled: enableHeadlessSvc,
 		// always instantiate the gatewayAlive=true to prevent unexpected service fail fast
 		gatewayAlive: true,
-	}, nil
+	}
+
+	_, err = remoteAPI.Client.Discovery().ServerVersion()
+	if err != nil {
+		return &rcsw, fmt.Errorf("cannot connect to api for target cluster %s: %w", clusterName, err)
+	}
+
+	return &rcsw, nil
 }
 
 func (rcsw *RemoteClusterServiceWatcher) mirroredResourceName(remoteName string) string {
