Alert on node and machine IP mismatch #12333

Open
embik opened this issue Jun 2, 2023 · 1 comment
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
sig/cluster-management: Denotes a PR or issue as being assigned to SIG Cluster Management.

embik (Member) commented Jun 2, 2023

Description of the feature you would like to add / User story

As a KKP admin
I would like to be alerted of primary IP mismatches between Node objects and the actual machine/VM
in order to be aware that a node changed IP addresses.

Solution details

  • Raise a Prometheus alert if the Node and Machine status disagree on the primary IP of the node.
  • Use Prometheus metrics exposed by kube-state-metrics and machine-controller for that; consider implementing the metrics upstream if they are not available yet.
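
The comparison such an alert would encode can be sketched in a few lines of Python. This is only an illustration of the check, not the proposed implementation: the function name `find_ip_mismatches`, the `IPMismatch` type, and the two name-to-IP mappings are hypothetical, and it assumes the primary IP of each Node and Machine has already been extracted from whatever metrics end up being available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPMismatch:
    """One node whose Node and Machine status report different primary IPs."""
    node: str
    node_ip: str
    machine_ip: str


def find_ip_mismatches(node_ips, machine_ips):
    """Return mismatches between Node and Machine primary IPs.

    Both arguments are hypothetical inputs: dicts mapping a node name to
    its primary IP as a string. Names present in only one mapping are
    skipped, since there is nothing to compare them against.
    """
    mismatches = []
    # Only names known to both sides can disagree; sort for stable output.
    for name in sorted(node_ips.keys() & machine_ips.keys()):
        if node_ips[name] != machine_ips[name]:
            mismatches.append(
                IPMismatch(name, node_ips[name], machine_ips[name])
            )
    return mismatches
```

In a real setup the same condition would presumably be expressed as a Prometheus alerting rule that joins the two metrics on the node name, so that a non-empty result fires the alert.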

Alternative approaches

Use cases

When the underlying infrastructure changes a VM's IP address (e.g. by handing out a different address via DHCP), node-to-node communication in Kubernetes breaks down. We also need to document this, as per kubermatic/docs#1444. If it still happens, we should raise an alert so that admins are aware and can investigate.

Additional information

embik added the kind/feature and sig/cluster-management labels Jun 2, 2023
embik added this to the KKP 2.24 milestone Jun 2, 2023
embik modified the milestones: KKP 2.24, KKP 2.25 Nov 3, 2023
embik modified the milestones: KKP 2.25, KKP 2.26 Mar 11, 2024
csengerszabo (Contributor) commented

/milestone clear

kubermatic-bot removed this from the KKP 2.26 milestone Aug 22, 2024