
[Autodiscover Kubernetes] Leader election never stops instances after lease lock is lost #38543

Closed
constanca-m opened this issue Mar 22, 2024 · 1 comment · Fixed by elastic/elastic-agent#4542
Assignees: constanca-m
Labels: bug, Team:Cloudnative-Monitoring



constanca-m commented Mar 22, 2024

This issue is just to put together everything we found out so far about the problems surrounding our current implementation of leader election, and to help track the tasks we need to accomplish before calling it solved.

This issue was first reported here: #34998

Description

We can see from the logs that the lease renewal is failing:

E0403 16:29:14.243178      13 leaderelection.go:330] error retrieving resource lock kube-system/metricbeat-cluster-leader: leases.coordination.k8s.io "metricbeat-cluster-leader" is forbidden: User "system:serviceaccount:kube-system:metricbeat" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-system"
...
I0403 16:29:16.238161      13 leaderelection.go:283] failed to renew lease kube-system/metricbeat-cluster-leader: timed out waiting for the condition
E0403 16:29:16.238229      13 leaderelection.go:306] Failed to release lock: resource name may not be empty
..."message":"leader election lock LOST, id beats-leader-kind-worker"...

However, this should not be a reason for the previous metricbeat lease holder to keep reporting metrics. The expected behavior is: as soon as the holder loses the lock, whether or not the renewal succeeded, that metricbeat instance should stop reporting metrics.

Since we can see the message leader election lock LOST, id beats-leader-kind-worker in the logs, we know that at least this callback is being called correctly:

OnStoppedLeading: func() {
    logger.Debugf("leader election lock LOST, id %v", id)
    eventID := fmt.Sprintf("%v-%v", metaUID, time.Now().UnixNano())
    stopLeading(uuid.String(), eventID)
},
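
For context, this callback is registered with client-go's leaderelection package (the same package that produced the leaderelection.go lines in the logs above). A minimal sketch of how such a setup is wired, with illustrative timing values and helper names rather than the exact Beats code:

import (
    "context"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/leaderelection"
    "k8s.io/client-go/tools/leaderelection/resourcelock"
)

// runLeaderElection is a sketch only, not the actual Beats implementation.
func runLeaderElection(ctx context.Context, client kubernetes.Interface, id string) error {
    lock := &resourcelock.LeaseLock{
        LeaseMeta: metav1.ObjectMeta{
            Name:      "metricbeat-cluster-leader", // lease name seen in the logs above
            Namespace: "kube-system",
        },
        Client:     client.CoordinationV1(),
        LockConfig: resourcelock.ResourceLockConfig{Identity: id},
    }

    cfg := leaderelection.LeaderElectionConfig{
        Lock:          lock,
        LeaseDuration: 15 * time.Second, // illustrative values, not the Beats defaults necessarily
        RenewDeadline: 10 * time.Second,
        RetryPeriod:   2 * time.Second,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: func(ctx context.Context) {
                // emit the autodiscover "start" event here
            },
            OnStoppedLeading: func() {
                // emit the autodiscover "stop" event here
            },
        },
    }

    le, err := leaderelection.NewLeaderElector(cfg)
    if err != nil {
        return err
    }

    // Run blocks until ctx is cancelled or the lease is lost; it does not rejoin the election.
    le.Run(ctx)
    return nil
}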

Why is this problem happening then?

The reason for the duplicated metrics is actually quite simple.

The leader, once it starts, emits an event with the flag start set to true:

event := bus.Event{
    "start":    true,
    "provider": uuid,
    "id":       eventID,
    "unique":   "true",
}

This event is then captured in this part of the code:

if _, ok := event["start"]; ok {
    // if updated is true, we don't want to set it back to false
    if a.handleStart(event) {
        updated = true
    }
}

And this handleStart function initializes the right configuration in a.configs here:

if updated {
    a.configs[eventID] = newCfg
}

So now we know updated is set to true and we need to reload our autodiscover configuration.

When we handle stop events we should do the same. However, there is a problem: when stopLeading() is called, we generate a new event ID:

OnStoppedLeading: func() {
    logger.Debugf("leader election lock LOST, id %v", id)
    eventID := fmt.Sprintf("%v-%v", metaUID, time.Now().UnixNano())
    stopLeading(uuid.String(), eventID)
},
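
For reference, stopLeading presumably emits the mirror of the start event shown above; a sketch, assuming the same bus.Event shape (the exact field set in Beats may differ):

// Sketch: the stop event emitted by stopLeading, assumed symmetric to the start event.
event := bus.Event{
    "stop":     true,
    "provider": uuid,
    "id":       eventID,
    "unique":   "true",
}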

But the configuration was saved in autodiscover under the original start event ID. So when we handle the stop event, we check whether a configuration is stored under the (new) event ID before updating the autodiscover settings:

if len(a.configs[eventID]) > 0 {
    a.logger.Debugf("Stopping %d configs", len(a.configs[eventID]))
    updated = true
}

Because the stop event carries a brand new event ID, nothing is found there, updated stays false, and our metricbeat instance never stops reporting metrics...

Originally posted by @constanca-m in #34998 (comment)
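
One possible direction for a fix (a sketch only, not necessarily what elastic/elastic-agent#4542 ended up doing) is to generate the event ID once, when leadership is gained, and reuse it when leadership is lost, so that the stop handler finds the entry in a.configs. This assumes a startLeading counterpart to the stopLeading call shown above:

// Sketch: share the event ID between the start and stop callbacks.
var leaderEventID string

callbacks := leaderelection.LeaderCallbacks{
    OnStartedLeading: func(ctx context.Context) {
        leaderEventID = fmt.Sprintf("%v-%v", metaUID, time.Now().UnixNano())
        logger.Debugf("leader election lock GAINED, id %v", id)
        startLeading(uuid.String(), leaderEventID)
    },
    OnStoppedLeading: func() {
        logger.Debugf("leader election lock LOST, id %v", id)
        // Reusing the ID from OnStartedLeading means a.configs[eventID]
        // is found and the running configuration is actually stopped.
        stopLeading(uuid.String(), leaderEventID)
    },
}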

Consequences

Currently, we use the client-go leaderelection Run function for leader election, and its documentation says this:

// Run starts the leader election loop. Run will not return
// before leader election loop is stopped by ctx or it has
// stopped holding the leader lease

This means we will never have the same leader twice, because once an instance loses the lease it stops running the election loop and is never reelected!

So we have a problem. Example:

  1. We have two nodes, node-1 and node-2.
  2. node-1 is the first leader.
  3. node-1 loses the lock, so it stops running.
  4. node-2 gets elected.
  5. A lease renewal fails with a timeout (for example, because the RoleBinding gets deleted). node-2 loses the lease and stops running.
  6. Who's going to be leader now? There are no more instances running to report the metrics...

I tried this with a unit test, attempting to renew the lease around 20 times, and I could see that leader election had stopped, just like in the example above.

So I believe the current implementation (the official one, not the one from this branch) has two problems:

  1. A metricbeat instance never stops reporting metrics, even after losing the lease! This causes duplicated documents.
  2. A metricbeat instance can never be reelected as the leader.
    • This doesn't necessarily cause a problem in the current implementation, since our previous leader instances never stop.

I think we need to consider alternatives to leader election, or find a way to make it run again (one possible approach is sketched below), because otherwise we are forcing users to delete pods so that the instances start again.

Originally posted by @constanca-m in #38471 (comment)
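
On the "make it run again" point: one common pattern (again only a sketch, reusing the cfg from the earlier sketch, and not necessarily the approach the fix takes) is to recreate the elector and call Run again whenever the lease is lost, until the surrounding context is cancelled:

// Sketch: rejoin the leader election after losing the lease.
for {
    le, err := leaderelection.NewLeaderElector(cfg)
    if err != nil {
        return err
    }
    le.Run(ctx) // returns when the lease is lost or ctx is done
    if ctx.Err() != nil {
        return ctx.Err()
    }
    logger.Debugf("lost leadership, rejoining leader election")
}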

How to test

Issue #34998 already mentioned how to reproduce.

Here is an alternative way to do it:

Create a two-node cluster. You can do it this way:
$ kind create cluster --config kind-config.yaml

And kind-config.yaml:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    kubeadmConfigPatches:
      - |
        kind: KubeProxyConfiguration
        metricsBindAddress: "0.0.0.0"
  - role: worker

Deploy metricbeat on Kubernetes.

Update the lease object like this, so that a lease renewal failure occurs.

Depending on your current holder, you might have to update holderIdentity:

apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: metricbeat-cluster-leader
  namespace: kube-system
spec:
  holderIdentity: beats-leader-metricbeat-worker
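
One way to apply that change, assuming the manifest above is saved as lease.yaml (kubectl edit or kubectl patch on the Lease works just as well):

$ kubectl -n kube-system apply -f lease.yaml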

You should see both hosts reporting metrics now.

Tasks

constanca-m self-assigned this Mar 22, 2024
constanca-m added the Team:Cloudnative-Monitoring label Mar 22, 2024
MichaelKatsoulis commented:

I checked elastic-agent and the behaviour seems a bit different: if the leader loses the lease, it stops collecting the cluster metrics, but it is also removed from the leader election process, and then it cannot take the lease again.
