Delete Nodes by NodeRef and compare node and machine names when getting a Hetzner node #813

Merged
merged 5 commits into from
Aug 27, 2020

Conversation

xmudrii
Member

@xmudrii xmudrii commented Aug 23, 2020

What this PR does / why we need it:

This PR attempts to fix a bug that occurs when a Machine is rolled out and the new Machine has the same IP addresses as the old one. In that case, the machine-controller can set the NodeOwner label on the old Node object to the UID of the new Machine. That can break the logic for removing the old Node object, because it depends on the NodeOwner label holding the UID of the Machine that's being deleted.
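To make the failure mode concrete, here is a minimal sketch of how an IP-only lookup can relabel the old Node. The `Node` and `Machine` types, the `findNodeByIP` helper, and the `example.io/node-owner` label key are simplified stand-ins invented for illustration, not the actual machine-controller code.

```go
package main

import "fmt"

// Node is a simplified, hypothetical stand-in for a Kubernetes Node object.
type Node struct {
	Name      string
	Addresses []string
	Labels    map[string]string
}

// Machine is a simplified, hypothetical stand-in for a Machine object.
type Machine struct {
	Name      string
	UID       string
	Addresses []string
}

// nodeOwnerLabel is a placeholder key, assumed for this sketch only.
const nodeOwnerLabel = "example.io/node-owner"

// findNodeByIP returns the first node that shares any address with the
// machine. It mimics a lookup that only compares IP addresses.
func findNodeByIP(nodes []*Node, m Machine) *Node {
	for _, n := range nodes {
		for _, na := range n.Addresses {
			for _, ma := range m.Addresses {
				if na == ma {
					return n
				}
			}
		}
	}
	return nil
}

func main() {
	// The old Node still exists and is owned by the old Machine.
	oldNode := &Node{
		Name:      "worker-old",
		Addresses: []string{"10.0.0.5"},
		Labels:    map[string]string{nodeOwnerLabel: "uid-old"},
	}
	nodes := []*Node{oldNode}

	// The new Machine received the same IP address as the old one.
	newMachine := Machine{Name: "worker-new", UID: "uid-new", Addresses: []string{"10.0.0.5"}}

	// An IP-only match adopts the old Node and overwrites its owner label.
	if n := findNodeByIP(nodes, newMachine); n != nil {
		n.Labels[nodeOwnerLabel] = newMachine.UID
	}
	fmt.Println(oldNode.Name, "is now owned by", oldNode.Labels[nodeOwnerLabel])
}
```

After this runs, a deletion that looks for Nodes labelled with the old Machine's UID no longer finds `worker-old`.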

There are two fixes:

  • The getNode function, which is used when adding the NodeRef, now confirms that the instance and node names match for Hetzner Machines/instances. Previously, we only checked IP addresses, so if a new Machine has the same addresses as the old one, it can adopt a node that's about to be deleted.
    • This assumes the name of the Node object will always be the same as the Machine name.
      • That does not hold on every provider, which is why the check is limited to Hetzner. For example, on AWS, the Machine name is based on the MachineDeployment name, while the Node name is the internal DNS name.
    • On providers with a CCM (internal or external), that's not a problem because the CCM is responsible for setting the ProviderID on the Node object. If the ProviderID is present, the getNode function uses it instead of IP addresses.
  • The deleteNodeForMachine function now uses the NodeRef if it's present. We already use the NodeRef for everything else when deleting a machine, such as evicting the node and deleting the cloud provider instance.
    • The old logic (based on the NodeOwner label) is kept in place, but it's used only as a fallback if the NodeRef is not present.
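The two fixes above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code from this PR: the types, the `getNodeByIPAndName` and `nodesToDelete` helpers, and the label key are hypothetical, the NodeRef is reduced to a plain node name, and the ProviderID path is left out.

```go
package main

import "fmt"

// Node and Machine are simplified, hypothetical stand-ins for the real objects.
type Node struct {
	Name      string
	Addresses []string
	Labels    map[string]string
}

type Machine struct {
	Name      string
	UID       string
	Addresses []string
	NodeRef   string // name of the referenced Node; empty if not set
}

// nodeOwnerLabel is a placeholder key, assumed for this sketch only.
const nodeOwnerLabel = "example.io/node-owner"

// sharesAddress reports whether the node and machine have a common address.
func sharesAddress(n *Node, m Machine) bool {
	for _, na := range n.Addresses {
		for _, ma := range m.Addresses {
			if na == ma {
				return true
			}
		}
	}
	return false
}

// getNodeByIPAndName matches on addresses and additionally requires the
// node name to equal the machine name (the Hetzner-style check).
func getNodeByIPAndName(nodes []*Node, m Machine) *Node {
	for _, n := range nodes {
		if sharesAddress(n, m) && n.Name == m.Name {
			return n
		}
	}
	return nil
}

// nodesToDelete prefers the NodeRef and falls back to the owner label
// only when the NodeRef is not set.
func nodesToDelete(nodes []*Node, m Machine) []*Node {
	var out []*Node
	for _, n := range nodes {
		if m.NodeRef != "" {
			if n.Name == m.NodeRef {
				out = append(out, n)
			}
			continue
		}
		if n.Labels[nodeOwnerLabel] == m.UID {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// The old Node carries a wrong owner label pointing at the new Machine.
	oldNode := &Node{
		Name:      "worker-old",
		Addresses: []string{"10.0.0.5"},
		Labels:    map[string]string{nodeOwnerLabel: "uid-new"},
	}
	nodes := []*Node{oldNode}

	oldMachine := Machine{Name: "worker-old", UID: "uid-old", NodeRef: "worker-old"}
	newMachine := Machine{Name: "worker-new", UID: "uid-new", Addresses: []string{"10.0.0.5"}}

	fmt.Println("adopted by new machine:", getNodeByIPAndName(nodes, newMachine) != nil)
	fmt.Println("nodes to delete for old machine:", len(nodesToDelete(nodes, oldMachine)))
}
```

With the name check, the new Machine no longer adopts `worker-old` even though the addresses match, and the old Machine still finds its Node through the NodeRef even when the owner label is wrong.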

Optional Release Note:

* Delete the Node object by NodeRef if it's present
* Compare node and instance names when applying the NodeOwner label for Hetzner machines/instances

@kubermatic-bot kubermatic-bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 23, 2020
@xmudrii xmudrii changed the title RFC: Delete Nodes by NodeRef [WIP] RFC: Delete Nodes by NodeRef Aug 23, 2020
@kubermatic-bot kubermatic-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 23, 2020
@kubermatic-bot kubermatic-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 24, 2020
@xmudrii xmudrii changed the title [WIP] RFC: Delete Nodes by NodeRef RFC: Delete Nodes by NodeRef and compare node and machine names when getting a Hetzner node Aug 24, 2020
@kubermatic-bot kubermatic-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 24, 2020
@xmudrii
Member Author

xmudrii commented Aug 24, 2020

/assign @kron4eg @xrstf

@xmudrii
Member Author

xmudrii commented Aug 25, 2020

/retest

@kubermatic-bot kubermatic-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 26, 2020
@xmudrii xmudrii changed the title RFC: Delete Nodes by NodeRef and compare node and machine names when getting a Hetzner node Delete Nodes by NodeRef and compare node and machine names when getting a Hetzner node Aug 26, 2020
@irozzo-1A
Contributor

/approve

@kubermatic-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: irozzo-1A, xmudrii

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@irozzo-1A
Contributor

/lgtm

@kubermatic-bot kubermatic-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 26, 2020
@kubermatic-bot
Contributor

LGTM label has been added.

Git tree hash: 875072471baa1beb0bda5c2628e99d3abcea4841

@xmudrii
Member Author

xmudrii commented Aug 27, 2020

/hold cancel

@kubermatic-bot kubermatic-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 27, 2020
@kubermatic-bot kubermatic-bot merged commit 01f2d47 into kubermatic:master Aug 27, 2020
@xmudrii xmudrii deleted the noderef-deletion branch August 27, 2020 12:53
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

5 participants