
🐛 Don't try to resolve machine on delete if cluster not ready #2006

Merged: 1 commit merged into kubernetes-sigs:main on Apr 11, 2024

Conversation

@mdbooth (Contributor) commented Apr 9, 2024

This fixes a bug where if we created a machine for a cluster which never became ready, we would never be able to 'resolve' the machine and therefore never delete it. We address this situation in several layers:

Firstly, we move the point in the machine controller at which we add the finalizer. We don't add the finalizer until we're writing resolved, so this situation can never occur for newly created machines. This makes sense because we don't create resources until we've observed that both the finalizer has been added and resolved is up to date, so we don't need the finalizer to protect resources which can't have been created yet.
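
As a rough Go sketch of that ordering (illustrative only: resolveMachineResources, the scope variable, and the exact finalizer constant are assumptions, not the code as merged):

// Sketch: the finalizer is only added in the same reconcile that writes
// 'resolved'. Until both are persisted, no OpenStack resources have been
// created, so there is nothing for the finalizer to protect.
changed := false
if controllerutil.AddFinalizer(openStackMachine, infrav1.MachineFinalizer) {
    changed = true
}
resolvedChanged, err := resolveMachineResources(scope, openStackMachine) // hypothetical helper
if err != nil {
    return ctrl.Result{}, err
}
if changed || resolvedChanged {
    // Persist the finalizer and resolved resources, then requeue before
    // creating anything.
    return ctrl.Result{}, nil
}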

Secondly, we shortcut the delete flow if the cluster is not ready. This is safe for the same reason as above, but is only relevant to machines created before v0.10.

Lastly, we surface and restrict the circumstances in which 'Resolved' is required on delete at all. On closer inspection, it is only required in the very specific circumstance that the machine has volumes defined and we are deleting it without the machine ever having been created. To make this more obvious we split volume deletion out of DeleteInstance and only resolve the machine spec when it is actually required.
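
Put together, and again with illustrative names only (clusterReady, getInstanceStatus, machineHasVolumes, and the exact resolveMachineResources/DeleteVolumes call sites are assumptions, not the merged code), the delete path now has roughly this shape:

// Sketch of the delete flow described above.
if !clusterReady(openStackCluster) {
    // Shortcut: the cluster never became ready, so nothing can have been
    // created for this machine (only relevant to pre-v0.10 machines).
    controllerutil.RemoveFinalizer(openStackMachine, infrav1.MachineFinalizer)
    return ctrl.Result{}, nil
}
instanceStatus, err := getInstanceStatus(openStackMachine)
if err != nil {
    return ctrl.Result{}, err
}
if instanceStatus == nil && machineHasVolumes(openStackMachine) {
    // The only case where 'Resolved' is still needed on delete: the instance
    // was never created, but volumes may have been, so resolve and delete them.
    if _, err := resolveMachineResources(scope, openStackMachine); err != nil {
        return ctrl.Result{}, err
    }
    if err := computeService.DeleteVolumes(openStackMachine); err != nil {
        return ctrl.Result{}, err
    }
}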

Two other factors make this change larger than it might otherwise be:

We hit a cyclomatic complexity limit in reconcileDelete(), requiring a refactor.

We remove the DeleteInstance tests, which, after separating out DeleteVolumes, are quite trivial, and replace them with a much more comprehensive set of tests for reconcileDelete.

Fixes #1792

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 9, 2024
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 9, 2024

netlify bot commented Apr 9, 2024

Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!

Name | Link
🔨 Latest commit | 9ebb706
🔍 Latest deploy log | https://app.netlify.com/sites/kubernetes-sigs-cluster-api-openstack/deploys/66155057c519b800082af585
😎 Deploy Preview | https://deploy-preview-2006--kubernetes-sigs-cluster-api-openstack.netlify.app

@dulek (Contributor) left a comment

Bunch of questions inline, but in general it makes a lot of sense to me.

if instanceStatus != nil {
// If no instance was created we currently need to check for orphaned
// volumes. This requires resolving the instance spec.
// TODO: write volumes to status resources on creation so this is no longer required.
Contributor:

Will we? You can still orphan a volume by crashing after a Cinder call but before saving the ID into the status. Then on the next reconcile you can discover the Machine is being deleted, so no reconcile will "pick up" the created volume. This has a super low probability, yet we've hit this in Kuryr all the time. ;)

Contributor Author (@mdbooth):

The bastion flow here is not as robust as the machine flow. However, this was the case before and I haven't tried as hard to fix it. The reason is that the bastion delete flow is extremely janky and over-complicated, and what I really want to do is re-design it.

As long as I didn't regress it here I'm ok with not fixing all the edge cases.

Contributor Author (@mdbooth):

Incidentally, notice that previously, because we only ran any of this when instanceStatus != nil, we would always leak volumes in this case. With this change we no longer leak volumes. Updating volumes to use the same pattern as ports will make this simpler, but that's for another time.
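
A simplified before/after sketch of what this reply describes (hypothetical helper names, not the merged code):

// Before: all cleanup, including volumes, was gated on an existing instance,
// so the volumes of a never-created instance were leaked.
if instanceStatus != nil {
    deleteInstanceAndVolumes()
}

// After: volumes are cleaned up even when no instance exists.
if instanceStatus != nil {
    deleteInstance()
} else if machineHasVolumes(openStackMachine) {
    // Resolve the spec only because it's needed to find the volumes.
    deleteVolumes()
}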

Contributor:

I'm just commenting on a comment here: I don't think we'll be able to get rid of DeleteVolumes() even after volume IDs are saved into the status.

Contributor Author (@mdbooth):

Well... kinda. I eventually want to move them into resources, and at that point they'll go into the Adopt() flow.

Comment on lines +367 to +371
// XXX(mdbooth): This looks wrong to me. Surely we should only ever
// disassociate the floating IP here. I would expect the API server
// floating IP to be created and deleted with the cluster. And if the
// delete behaviour is correct, we leak it if the instance was
// previously deleted.
Contributor:

There's an assumption that if APIServerFloatingIP was nil, then the FIP was created by CAPO [1], but the problem is that this doesn't seem to look up that specific FIP; it looks up all the FIPs of the instance. If someone attached an additional FIP to the instance, e.g. for debugging purposes, it will be deleted too. This is horribly wrong to me.

Also, there's no need to detach the FIP at all. If we delete the instance and the port is deleted, the FIP will be detached.

[1]

// If not specified, a new floatingIP is allocated.

Contributor Author (@mdbooth):

Right. Notice that this is just code motion, though. I'm pretty sure it's broken in exactly the same way it was broken before; it's just slightly easier to see now.

If I haven't regressed this I don't want to fix it in this PR. I added the comment because I noticed it while I was moving it.

Comment on lines +589 to +592
userData, err := r.getBootstrapData(ctx, machine, openStackMachine)
if err != nil {
    return ctrl.Result{}, err
}
Contributor:

If this fails we still don't create the machine, so why move it down here?

Contributor Author (@mdbooth):

This still needs to live in the 'create transaction'. There's no point putting it before setting the finalizer because it won't be used. We also don't expect this to fail at this point, because we've already checked that the bootstrap data is set, so if it does fail it's presumably just a low-probability ephemeral failure that could happen anywhere.

Here I've just moved it to be local to where the data is actually used.
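
For illustration (simplified; the CreateInstance call site and instanceSpec are assumptions, not quoted from the PR), the fetch now sits immediately before the data is consumed inside the create step:

// Sketch: bootstrap data is fetched only once we are committed to creating
// the instance, right where it is used.
userData, err := r.getBootstrapData(ctx, machine, openStackMachine)
if err != nil {
    return ctrl.Result{}, err
}
instanceStatus, err := computeService.CreateInstance(openStackMachine, instanceSpec, userData) // illustrative
if err != nil {
    return ctrl.Result{}, err
}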

Comment on lines +572 to +582
// We requeue if we either added the finalizer or resolved machine
// resources. This means that we never create any resources unless we
// have observed that the finalizer and resolved machine resources were
// successfully written in a previous transaction. This in turn means
// that in the delete path we can be sure that if there are no resolved
// resources then no resources were created.
if changed {
    scope.Logger().V(6).Info("Machine resources updated, requeuing")
    return ctrl.Result{}, nil
}

Contributor:

Oh, this is a nice pattern.

Contributor Author (@mdbooth):

This was always the goal of this pattern! It's only the upgrade case which makes it at all complicated, and at some point we should be able to delete most of the complications.

@lentzi90 (Contributor) left a comment

Thank you for this!
I did a rerun of my previous test (where I accidentally used a non-existing externalNetworkID). Now it works as I expected. The cluster enters the Failed phase and can easily be deleted 🙂
/approve

@k8s-ci-robot commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lentzi90

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 10, 2024
@mdbooth (Contributor Author) commented Apr 11, 2024

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 11, 2024
@huxcrux (Contributor) commented Apr 11, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 11, 2024
@k8s-ci-robot k8s-ci-robot merged commit 7925a4b into kubernetes-sigs:main Apr 11, 2024
9 checks passed
@nguyenhuukhoi (Contributor):

Thank you.

@EmilienM EmilienM deleted the issue1792 branch April 22, 2024 12:32