[8.13](backport #38471) [Metricbeat][Autodiscover Kubernetes] Fix multiple instances reporting same metrics #38761
Proposed commit message
If the holder of the lease changes when using metricbeat autodiscover, multiple hosts end up reporting the same metrics.
Please read issue #38543 for a more detailed description.
This only affects metrics that are unique cluster-wide, like KSM metrics.
Checklist
`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`
How to test this PR locally
If you want to see this fix in action, follow the section Results below.
Edit: here are more detailed steps for this:
Create a two-node cluster. You can do it this way. And `kind-config.yaml`:
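The original command and config are not reproduced on this page. As a minimal sketch, assuming one control-plane node and one worker (matching the two metricbeat instances in the Results section), the cluster could be created like this. The file contents are an assumption, not the exact config used for this PR:

```shell
# Hypothetical kind-config.yaml: one control-plane node and one worker node.
cat > kind-config.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
EOF

# Create the cluster only when kind is installed on this machine.
if command -v kind >/dev/null 2>&1; then
  kind create cluster --config kind-config.yaml
fi
```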
Build a docker image from this branch, in a place where you have the metricbeat binary. So build metricbeat with `GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build` in the metricbeat directory.
You can use this Dockerfile to create the docker image:
Then run:
And then upload it to the kind nodes:
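The Dockerfile and commands from the original description are not reproduced on this page. The following is only a sketch of one way to do these steps, assuming the freshly built binary is copied over the one in an official metricbeat image. The base image tag, the Dockerfile name, and the image name `metricbeat-lease-fix:dev` are all hypothetical:

```shell
IMAGE=metricbeat-lease-fix:dev           # hypothetical image name
DOCKERFILE=Dockerfile.metricbeat-test    # hypothetical file name, avoids clobbering an existing Dockerfile

# Assumption: overwrite the binary shipped in an official image with the local build.
cat > "$DOCKERFILE" <<'EOF'
FROM docker.elastic.co/beats/metricbeat:8.13.0
COPY metricbeat /usr/share/metricbeat/metricbeat
EOF

# Build only when the binary from `go build` is present and docker is installed.
if [ -x ./metricbeat ] && command -v docker >/dev/null 2>&1; then
  docker build -f "$DOCKERFILE" -t "$IMAGE" .
  # Copy the image into every node of the kind cluster.
  if command -v kind >/dev/null 2>&1; then
    kind load docker-image "$IMAGE"
  fi
fi
```

Because the image only exists on the kind nodes, the manifest should not try to pull it from a registry; an explicit tag (not `latest`) or `imagePullPolicy: IfNotPresent` is the usual way to ensure that.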
Deploy the metricbeat manifest with this image.
I am using this manifest, which only has `state_node` enabled, and a few other metricsets. Don't forget to set up your ES outputs if you are not using the Elastic Stack.
Then deploy the manifest.
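A possible way to deploy and check the rollout, as a sketch. The manifest file name is hypothetical, and the namespace, DaemonSet name, and pod label are assumptions based on the reference metricbeat Kubernetes manifest; adjust them to your own manifest:

```shell
MANIFEST=metricbeat-kubernetes.yaml   # hypothetical file name
NAMESPACE=kube-system                 # assumption: namespace used by the manifest

# Apply the manifest only if it exists and kubectl is installed.
if [ -f "$MANIFEST" ] && command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f "$MANIFEST"
  # Wait for one metricbeat pod per node, then list them with their nodes.
  kubectl -n "$NAMESPACE" rollout status daemonset/metricbeat --timeout=120s
  kubectl -n "$NAMESPACE" get pods -l k8s-app=metricbeat -o wide
fi
```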
Update the lease object like this, so that a lease renewal failure occurs.
Depending on your current holder, you might have to update `holderIdentity`:
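The original command is not reproduced on this page. One way to force a holder change is to patch the lease directly, sketched below. The lease name `metricbeat-cluster-leader` and the namespace are assumptions (check your autodiscover `leader_lease` setting), and `example-holder` is a hypothetical identity that must differ from the current holder:

```shell
LEASE=metricbeat-cluster-leader   # assumption: lease name used by the autodiscover provider
NAMESPACE=kube-system             # assumption: namespace where metricbeat runs
NEW_HOLDER=example-holder         # hypothetical identity, different from the current holder

# Merge patch that overwrites only spec.holderIdentity.
PATCH=$(printf '{"spec":{"holderIdentity":"%s"}}' "$NEW_HOLDER")

if command -v kubectl >/dev/null 2>&1; then
  # Show the current holder, then replace it so the leader's next renewal fails.
  kubectl -n "$NAMESPACE" get lease "$LEASE" -o jsonpath='{.spec.holderIdentity}'
  kubectl -n "$NAMESPACE" patch lease "$LEASE" --type=merge -p "$PATCH"
fi
```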
Related issues
Closes #34998.
Relates #38543.
Results
Lease belongs to the `control-plane` metricbeat instance:
Logs from the leader, the `control-plane` metricbeat instance:
Change the leader. You can modify the lease like this, which will cause a failure on lease renewal.
Now we see in the logs from the previous leader, the `control-plane` metricbeat instance:
And in the logs from the new leader, the `worker` metricbeat instance:
Results in Discover: metrics are reported from just one `host.name`. Check by comparing the number of documents before and after the lease holder changed: they remain the same, so we no longer have duplicates as before:
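The same check can be done outside Kibana by counting documents per `host.name` in Elasticsearch. This is only a sketch: the `metricbeat-*` index pattern and the `state_node` filter are assumptions about the setup described above, and authentication options are left out. Set `ES_URL` to your cluster address to actually run the query:

```shell
ES_URL=${ES_URL:-}   # e.g. the address of your Elasticsearch; left empty, the query is skipped

# Count state_node documents per reporting host. After the fix, only one
# host.name bucket should keep growing once the lease holder has changed.
QUERY='{
  "size": 0,
  "query": { "term": { "metricset.name": "state_node" } },
  "aggs": { "hosts": { "terms": { "field": "host.name" } } }
}'

if [ -n "$ES_URL" ] && command -v curl >/dev/null 2>&1; then
  curl -s -H 'Content-Type: application/json' \
    "$ES_URL/metricbeat-*/_search" -d "$QUERY"
fi
```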
This is an automatic backport of pull request #38471 done by [Mergify](https://mergify.com).