
Conversation

@mlavacca
Contributor

What this PR does / why we need it:
When a machine is deleted, the machine controller now waits for the volumeAttachments deletion before deleting the node.

Which issue(s) this PR fixes (optional, in fixes #<issue number> format, will close the issue(s) when PR gets merged):
Fixes #1189

Special notes for your reviewer:

Optional Release Note:

NONE

@kubermatic-bot added release-note-none, dco-signoff: yes, sig/cluster-management, and size/M labels Feb 16, 2022
@moadqassem
Member

@mlavacca This seems quite strange and I have two questions:
1- It seems that the issue is related to vSphere, not the other cloud providers, so why are you doing this at a global (machine) level?
2- If the node gets deleted before the volume, what error do you actually get? This is quite strange and sounds like an issue in the CSI driver itself, not the machine-controller. Usually, when a node doesn't exist, the CCM/CSI (or whichever controller) should delete its resources. IIRC, it was possible to delete the volumeAttachment objects, after which the PVs are released and the PVCs get deleted, via a ticker service.

In general I don't think it is the machine-controller's job to remove volumes; that's why we use the CSI driver :-). If this is indeed an issue in the driver, I would strongly recommend leaving a finalizer on the machine object: during the cleanup, check the node name under the machine status, remove the volumeAttachments, and then remove the finalizer from the machine.

@mlavacca
Contributor Author

mlavacca commented Feb 17, 2022

@moadqassem

2- If the node got deleted before the volume what is the error that you get actually? Because this is quite strange and an issue in the CSI driver itself not machine controller.

Yes, there is an issue in the vSphere CSI driver, and the aim of this PR is to mitigate it. If the node is deleted before the CSI driver has time to delete the volumeAttachments, they hang around forever, and new pods that try to mount the associated volumes cannot do so because of the stale volumeAttachments.

In general I don't think it is the machine-controller's job to remove volumes; that's why we use the CSI driver :-). If this is indeed an issue in the driver, I would strongly recommend leaving a finalizer on the machine object: during the cleanup, check the node name under the machine status, remove the volumeAttachments, and then remove the finalizer from the machine.

I'm not removing any volumeAttachment; this code introduces a wait to make sure that all the volumeAttachments are correctly deleted by the CSI driver. For this reason, I do not see why it would be a problem to have this behavior for all the cloud providers.
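To make the intent concrete, the wait can be pictured roughly like the following sketch. This is not the PR's literal code: it assumes a controller-runtime client, and the polling intervals are arbitrary.

package example

import (
	"context"
	"time"

	storagev1 "k8s.io/api/storage/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	ctrlruntimeclient "sigs.k8s.io/controller-runtime/pkg/client"
)

// waitForVolumeAttachmentsDeletion polls until no VolumeAttachment in the
// cluster references nodeName anymore (i.e. the CSI driver has finished
// detaching), or until the timeout expires.
func waitForVolumeAttachmentsDeletion(ctx context.Context, c ctrlruntimeclient.Client, nodeName string) error {
	return wait.PollImmediate(10*time.Second, 5*time.Minute, func() (bool, error) {
		vaList := &storagev1.VolumeAttachmentList{}
		if err := c.List(ctx, vaList); err != nil {
			return false, err
		}
		for _, va := range vaList.Items {
			if va.Spec.NodeName == nodeName {
				// A volume is still attached: the node must not be deleted yet.
				return false, nil
			}
		}
		return true, nil
	})
}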

BTW, something didn't work as expected during the e2e tests; I'm going to investigate and debug them.

@moadqassem
Member

moadqassem commented Feb 17, 2022

I'm not removing any volumeAttachment

That's the thing. I mean, we can just simply remove those as part of the cleanup process. In other words, when cleanup is called, check the volumeAttachments and remove the ones that reference the machine's node. This way you will not block the node draining/termination, and you keep the mitigation local, where it should be, until the VMware folks resolve the issue:

kubernetes-sigs/vsphere-csi-driver#359

P.S.: it was created almost 2 years ago, but it became active again 3 weeks ago, so 🤞
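For illustration, the suggested cleanup would amount to something like this hypothetical helper (same assumptions and imports as the sketch above, plus apierrors for k8s.io/apimachinery/pkg/api/errors; the node name would come from the machine's status):

// cleanupVolumeAttachments deletes every VolumeAttachment referencing the
// machine's node instead of waiting for the CSI driver to collect them.
func cleanupVolumeAttachments(ctx context.Context, c ctrlruntimeclient.Client, nodeName string) error {
	vaList := &storagev1.VolumeAttachmentList{}
	if err := c.List(ctx, vaList); err != nil {
		return err
	}
	for i := range vaList.Items {
		if vaList.Items[i].Spec.NodeName != nodeName {
			continue
		}
		if err := c.Delete(ctx, &vaList.Items[i]); err != nil && !apierrors.IsNotFound(err) {
			return err
		}
	}
	return nil
}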

@mlavacca
Contributor Author

mlavacca commented Feb 17, 2022

I'm not removing any volumeAttachment

That's the thing. I mean, we can just simply remove those as part of the cleanup process. In other words, when cleanup is called, check the volumeAttachments and remove the ones that reference the machine's node. This way you will not block the node draining/termination, and you keep the mitigation local, where it should be, until the VMware folks resolve the issue:

kubernetes-sigs/vsphere-csi-driver#359

P.S.: it was created almost 2 years ago, but it became active again 3 weeks ago, so 🤞

The problem is that the volumeAttachments are managed by the CSI driver; if you delete them, they are automatically recreated. Furthermore, they have a finalizer (external-attacher/csi-vsphere-vmware-com) that is added/removed by the CSI driver, so I don't see a correct way to manually clean them up.
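As an illustration of that last point (same assumptions as the earlier sketches, plus fmt and types for k8s.io/apimachinery/pkg/types; vaName is a placeholder), a deleted volumeAttachment just lingers with a deletionTimestamp set until the driver removes the finalizer:

// checkVolumeAttachment shows why a plain Delete is not enough: while the
// external-attacher finalizer is present, the object is only marked for
// deletion and remains in the cluster.
func checkVolumeAttachment(ctx context.Context, c ctrlruntimeclient.Client, vaName string) error {
	// VolumeAttachments are cluster-scoped, so only a name is needed.
	va := &storagev1.VolumeAttachment{}
	if err := c.Get(ctx, types.NamespacedName{Name: vaName}, va); err != nil {
		return err
	}
	fmt.Printf("deletionTimestamp=%v finalizers=%v\n", va.DeletionTimestamp, va.Finalizers)
	return nil
}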

func (r *Reconciler) deleteNodeForMachine(ctx context.Context, machine *clusterv1alpha1.Machine) (*reconcile.Result, error) {
	// List all the volumeAttachments in the cluster; we must be sure that all
	// of them will be deleted before deleting the node.
	volumeAttachments := &storagev1.VolumeAttachmentList{}
Member

Can we just abstract this behaviour into a method and only run it if the cloud provider is vSphere?

}

-	return nil, r.deleteNodeForMachine(ctx, machine)
+	return r.deleteNodeForMachine(ctx, machine)
Member

Just call this method here when the cloud provider is vSphere and the volumeAttachments are gone.

@kubermatic-bot added size/L and removed size/M labels Feb 23, 2022
@mlavacca
Contributor Author

/retest

@mlavacca
Contributor Author

/test pull-machine-controller-e2e-gce

@mlavacca requested a review from moadqassem February 23, 2022 14:04
@mlavacca
Contributor Author

@moadqassem I implemented your suggested solution, PTAL

@mlavacca
Contributor Author

/retest

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
@mlavacca force-pushed the wait-volumeattachments-deletion branch from e73baa6 to 6393218 February 28, 2022 20:51
@mlavacca
Contributor Author

mlavacca commented Mar 1, 2022

Code has been improved to handle both the node rollout and the node deletion on a single-node cluster. @moadqassem PTAL

ErrorQueueLen = 1000
)

type NodeVolumeAttachmentsCleanup struct {
Member

I thought that we were not going to need this and would just change how the node is drained. For example, if there is a volume still attached, then don't remove the CSI driver pod, and once no volumeAttachment is left, remove the pod.

Contributor Author

Yes, and this is an implementation of that behavior. If there are volumeAttachments (see the sketch after this list):

  1. Cordon the old node.
  2. Delete all pods using volumes attached to the old node.
  3. Wait for the CSI driver to collect the volumeAttachments.
  4. Drain the old node.
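Concretely, that sequence looks roughly like the sketch below (same assumptions as the earlier snippets; corev1 is k8s.io/api/core/v1, and podsUsingAttachedVolumes and drainNode are hypothetical helpers, not functions from this PR):

// evictNodeWithVolumeAttachments sketches the four steps above.
func evictNodeWithVolumeAttachments(ctx context.Context, c ctrlruntimeclient.Client, node *corev1.Node) error {
	// 1. Cordon the old node so no new pods are scheduled onto it.
	original := node.DeepCopy()
	node.Spec.Unschedulable = true
	if err := c.Patch(ctx, node, ctrlruntimeclient.MergeFrom(original)); err != nil {
		return err
	}
	// 2. Delete the pods using volumes attached to the old node, so the CSI
	// driver can detach those volumes (podsUsingAttachedVolumes is hypothetical).
	pods, err := podsUsingAttachedVolumes(ctx, c, node.Name)
	if err != nil {
		return err
	}
	for i := range pods {
		if err := c.Delete(ctx, &pods[i]); err != nil && !apierrors.IsNotFound(err) {
			return err
		}
	}
	// 3. Wait for the CSI driver to collect the volumeAttachments (helper
	// from the first sketch).
	if err := waitForVolumeAttachmentsDeletion(ctx, c, node.Name); err != nil {
		return err
	}
	// 4. Only now drain (and eventually delete) the old node (drainNode is hypothetical).
	return drainNode(ctx, c, node)
}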

Member

Sure, but can't we just fix this in the NodeEviction.Run method? Why don't we handle the case over there, instead of having it here? The code is ultimately quite similar; the only difference is the deletion criteria.

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
@kubermatic-bot added size/XL and removed size/L labels Mar 9, 2022
mlavacca added 2 commits March 9, 2022 11:17
Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
@moadqassem
Member

/retest

@moadqassem
Member

/approve
/lgtm

@kubermatic-bot added the lgtm label Mar 10, 2022
@kubermatic-bot
Contributor

LGTM label has been added.

Git tree hash: 67d0192b6ca5112a0f5afb02d39ab85add9c1667

@kubermatic-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mlavacca, moadqassem

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubermatic-bot added the approved label Mar 10, 2022
@kubermatic-bot merged commit e35da15 into kubermatic:master Mar 10, 2022
@moadqassem
Member

/cherry-pick release/v1.36

@kubermatic-bot
Contributor

@moadqassem: #1190 failed to apply on top of branch "release/v1.36":

Applying: Waiting for volumeAttachments deletion
Applying: volumeAttachments check only for vSphere
Applying: ClusterRole updated
Applying: yaml linter fixed
Applying: VolumeAttachments correctly handled
Using index info to reconstruct a base tree...
M	cmd/machine-controller/main.go
M	pkg/controller/machine/machine_controller.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/controller/machine/machine_controller.go
Auto-merging cmd/machine-controller/main.go
CONFLICT (content): Merge conflict in cmd/machine-controller/main.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0005 VolumeAttachments correctly handled
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release/v1.36

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

moadqassem pushed a commit to moadqassem/machine-controller that referenced this pull request Mar 11, 2022
* Waiting for volumeAttachments deletion

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* volumeAttachments check only for vSphere

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* ClusterRole updated

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* yaml linter fixed

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* VolumeAttachments correctly handled

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* Code factorized

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* renaming

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* fix yamllint

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* Logic applied only to vSphere

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
(cherry picked from commit e35da15)
moadqassem added a commit that referenced this pull request Mar 11, 2022
* Waiting for volumeAttachments deletion

* volumeAttachments check only for vSphere

* ClusterRole updated

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
(cherry picked from commit e35da15)
moadqassem pushed a commit to kubermatic-bot/machine-controller that referenced this pull request Mar 14, 2022
Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
(cherry picked from commit e35da15)
moadqassem pushed a commit that referenced this pull request Mar 15, 2022
* Fix wrong CPU config

Signed-off-by: Helene Durand <helene@kubermatic.com>

* fix vSphere tests

Signed-off-by: Moath Qasim <moad.qassem@gmail.com>

* Waiting for volumeAttachments deletion (#1190)

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
(cherry picked from commit e35da15)
moadqassem pushed a commit to moadqassem/machine-controller that referenced this pull request Mar 15, 2022
* Waiting for volumeAttachments deletion

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* volumeAttachments check only for vSphere

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* ClusterRole updated

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* yaml linter fixed

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* VolumeAttachments correctly handled

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* Code factorized

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* renaming

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* fix yamllint

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* Logic applied only to vSphere

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
(cherry picked from commit e35da15)
@moadqassem
Member

/cherry-pick release/v1.42

@kubermatic-bot
Contributor

@moadqassem: new pull request created: #1212

In response to this:

/cherry-pick release/v1.42

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

LittleFox94 pushed a commit to anexia-it/machine-controller that referenced this pull request Mar 16, 2022
* Waiting for volumeAttachments deletion

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* volumeAttachments check only for vSphere

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* ClusterRole updated

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* yaml linter fixed

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* VolumeAttachments correctly handled

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* Code factorized

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* renaming

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* fix yamllint

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* Logic applied only to vSphere

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
@kron4eg
Member

kron4eg commented Apr 20, 2022

/cherrypick release/v1.43

@kubermatic-bot
Contributor

@kron4eg: new pull request created: #1256

In response to this:

/cherrypick release/v1.43

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kron4eg
Member

kron4eg commented Apr 20, 2022

/cherrypick release/v1.37

@kubermatic-bot
Contributor

@kron4eg: #1190 failed to apply on top of branch "release/v1.37":

Applying: Waiting for volumeAttachments deletion
Applying: volumeAttachments check only for vSphere
Applying: ClusterRole updated
Applying: yaml linter fixed
Applying: VolumeAttachments correctly handled
Using index info to reconstruct a base tree...
M	cmd/machine-controller/main.go
M	pkg/controller/machine/machine_controller.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/controller/machine/machine_controller.go
Auto-merging cmd/machine-controller/main.go
CONFLICT (content): Merge conflict in cmd/machine-controller/main.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0005 VolumeAttachments correctly handled
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release/v1.37

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kron4eg pushed a commit to kron4eg/machine-controller that referenced this pull request Apr 20, 2022
* Waiting for volumeAttachments deletion

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* volumeAttachments check only for vSphere

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* ClusterRole updated

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* yaml linter fixed

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* VolumeAttachments correctly handled

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* Code factorized

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* renaming

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* fix yamllint

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* Logic applied only to vSphere

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>
kubermatic-bot pushed a commit that referenced this pull request Apr 21, 2022
* Waiting for volumeAttachments deletion (#1190)

* Waiting for volumeAttachments deletion

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* volumeAttachments check only for vSphere

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* ClusterRole updated

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* yaml linter fixed

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* VolumeAttachments correctly handled

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* Code factorized

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* renaming

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* fix yamllint

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* Logic applied only to vSphere

Signed-off-by: Mattia Lavacca <lavacca.mattia@gmail.com>

* disable vSphere tests (#1172)

Signed-off-by: Moath Qasim <moad.qassem@gmail.com>

* enable vSphere tests (#1180)

* enable vSphere tests

Signed-off-by: Moath Qasim <moad.qassem@gmail.com>

# Conflicts:
#	go.sum

* refactor vSphere datastore cluster

Signed-off-by: Moath Qasim <moad.qassem@gmail.com>

* refactor vSphere tests

Signed-off-by: Moath Qasim <moad.qassem@gmail.com>

* enable vsphere test

Signed-off-by: Moath Qasim <moad.qassem@gmail.com>

* debug vsphere datastore test

Signed-off-by: Moath Qasim <moad.qassem@gmail.com>

* debug vsphere datastore test

Signed-off-by: Moath Qasim <moad.qassem@gmail.com>

Co-authored-by: Mattia Lavacca <lavacca.mattia@gmail.com>
Co-authored-by: Moath Qasim <moad.qassem@gmail.com>