
Node states do not change to cordoned and removing during scale down when drain on delete is enabled #5220

Closed
anupama2501 opened this issue Feb 25, 2022 · 4 comments

@anupama2501

Setup

Rancher version: 2.6-head 8c785a1
Browser type & version: Chrome

Describe the bug
Node status should change from removing >> cordoned >> deleted when the drain on delete option is enabled on the node driver pools. Instead, the node status changed from active >> deleted, and the node was deleted in both the cluster management page and the cluster explorer >> cluster >> nodes page.

To Reproduce

  • Create an RKE2 node driver cluster with 3 node pools, one per role: 3 worker, 3 etcd, and 2 control plane nodes.
  • Enable the drain on delete feature for each of the node pools (see the sketch after this list).
  • From cluster management >> machine pools >> select the worker node w1 >> scale down.
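
For reference, the same setting can be checked outside the UI. A minimal sketch, assuming a downstream cluster named test-cluster whose provisioning.cattle.io/v1 object lives in the fleet-default namespace of the Rancher local cluster; the drainBeforeDelete field name is an assumption based on the machine pool spec, not confirmed in this issue:

```bash
# Run against the Rancher local (management) cluster.
# Print each machine pool's name alongside its drain-before-delete flag.
# "test-cluster" and the drainBeforeDelete field name are assumptions:
kubectl get clusters.provisioning.cattle.io test-cluster -n fleet-default \
  -o jsonpath='{range .spec.rkeConfig.machinePools[*]}{.name}{"\t"}{.drainBeforeDelete}{"\n"}{end}'
```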

Result
The node's state changes directly from active to deleted, and the node is removed.

Expected Result
From either the cluster management page or the explorer >> cluster >> nodes page:
The node's state is expected to change from active >> removing >> cordoned >> deleted.
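
As a cross-check outside the UI, the cordoned stage should also be visible on the kube node itself. A minimal sketch, assuming a kubeconfig pointed at the downstream cluster:

```bash
# Watch node states during the scale-down; a node that is being drained
# first reports STATUS "Ready,SchedulingDisabled" (i.e. cordoned), then
# drops out of the list once it is deleted:
kubectl get nodes --watch
```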

Additional context
Related issue: rancher/rancher#36782

@richard-cox
Member

Given that we've restricted the cordoning / draining state to the kube node list (cluster explorer / cluster / node page), those states should only show up there.

I saw some interesting behaviour when enabling Drain Before Delete for the machine pool containing the worker machines, though. It seemed to replace them sequentially, one by one. After waiting for this process to finish (all three replaced), I could scale down one of the new machines (via the machine's action menu on the right, not the pool scale buttons):

  1. The machine went into a Deleting state and its associated kube node in the cluster explorer / cluster / node page went into a Cordoned state
  2. After about 10 seconds, both the machine and the kube node were removed
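
For reference, both states can be watched outside the UI as well. A rough sketch, assuming the CAPI Machine objects live in the fleet-default namespace of the Rancher local cluster:

```bash
# From the Rancher local cluster: the Machine's PHASE column shows
# "Deleting" while its kube node is being drained and removed:
kubectl get machines.cluster.x-k8s.io -n fleet-default --watch
```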

This behaviour seems correct. It doesn't quite go through all the stages in the issue description, but some may not be visible due to speed (deleting --> deleted --> removed).

@anupama2501 Could the issue you're seeing be related to scaling down nodes that weren't recreated following the change to the machine pool's Drain Before Delete setting?

@anupama2501
Author

anupama2501 commented Mar 10, 2022

Hi @richard-cox, thank you for the detailed write-up.

I saw some interesting behaviour when enabling Drain Before Delete for the machine pool containing the worker machines, though. It seemed to replace them sequentially, one by one. After waiting for this process to finish (all three replaced), I could scale down one of the new machines (via the machine's action menu on the right, not the pool scale buttons)

This is the expected behavior, per this comment: rancher/rancher#35274 (comment)

Could the issue you're seeing be related to scaling down nodes that weren't recreated following the change to the machine pool's Drain Before Delete setting?

Could you elaborate on the "following the change" part?

I retried the scenario with the drain on delete option enabled while creating the cluster. I do see the nodes go into a cordoned state, and the machines were then scaled down.

@richard-cox
Member

@anupama2501 I wondered if the original issue of the nodes not showing as cordoned might be due to:

  • creating the cluster first without drain on delete enabled
  • enabling drain on delete after the cluster has come up
  • (nodes will start to be recreated sequentially)
  • manually scaling down a specific node that has not yet been recreated

If you're seeing this working now when drain on delete is set at cluster creation, then all is good. Also, thanks for clarifying the expected behaviour on changing a deployment; very helpful.

@anupama2501
Author

Closing, as I see the expected behavior noted in this comment: #5220 (comment)
