
Ability to scale down individual node(s) for RKE2-provisioned clusters #4446

Closed · Tracked by #3346 · Fixed by #5200
snasovich opened this issue Oct 25, 2021 · 13 comments

@snasovich (Contributor) commented Oct 25, 2021

Detailed Description
For RKE1-provisioned clusters, there is currently an option to scale down specific node(s).
As a dropdown action for a single node:
[screenshot]
As an action button for multiple selected nodes:
[screenshot]

The same functionality should be available for RKE2-provisioned clusters.

Context
This is needed for RKE2 provisioning parity with RKE1.

Additional Details
It should be possible to achieve this by setting the cluster.k8s.io/delete-machine annotation on the node(s) to be scaled down, then calling the back-end to update the affected node pool(s) with the appropriate node count for each pool.

Also, it looks like the RKE1 case may allow invalid deletion requests (such as scaling down the only control plane node). It would be nice to avoid such issues in the RKE2 implementation. For example, I managed to break my RKE1 cluster by attempting to scale down the only CP node and then scaling it back up (interestingly, it was stuck at "Waiting for node to be removed from cluster" yet remained operational until I attempted to scale the node pool back up).
[screenshot]

snasovich added this to the v2.6.3 milestone Oct 25, 2021
@gaktive (Member) commented Nov 5, 2021

Per @vincent99, this should be relatively easy: it should be a matter of selecting each of the nodes and setting the annotation on them. This would then allow scaling to happen as expected.

@gaktive (Member) commented Nov 5, 2021

@snasovich we'll push this to 2.6.4 for now, but if you do need this, Vince does have some capacity.

@richard-cox (Member) commented

@snasovich To confirm: which steve/norman resource should the cluster.k8s.io/delete-machine annotation be set on (v1/management.cattle.io.nodes, v3/nodes, etc.)?

I've tried setting the annotation to "true" on cluster.x-k8s.io.machine and then scaling the pool via the normal scaling call on the cluster.x-k8s.io.machinedeployment.

@snasovich (Contributor, Author) commented

@thedadams, could you please help answer Richard's question above?

@thedadams commented

@richard-cox Sorry, there was a typo in Sergey's original message. The annotation does go on the cluster.x-k8s.io.machine object, but the annotation is cluster.x-k8s.io/delete-machine.
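
For reference, here's a minimal kubectl sketch of that flow run against Rancher's management (local) cluster. This is an illustration under stated assumptions, not the dashboard's actual code: the machine and pool names are made up, and fleet-default is assumed as the namespace holding the downstream cluster's CAPI objects.

```sh
# 1. Mark the specific machine for removal. Cluster API treats any
#    non-empty value of this annotation as "prefer deleting this machine
#    when scaling down". (All names below are placeholders.)
kubectl annotate machines.cluster.x-k8s.io -n fleet-default \
  my-cluster-pool1-abc123-xyz45 cluster.x-k8s.io/delete-machine=true

# 2. Scale the owning MachineDeployment down by one; CAPI then removes
#    the annotated machine instead of picking one arbitrarily.
kubectl scale machinedeployments.cluster.x-k8s.io -n fleet-default \
  my-cluster-pool1 --replicas=2
```

Presumably the dashboard issues the equivalent requests through the Steve API; the important part is the ordering: annotate first, then drop the replica count.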

@Auston-Ivison-Suse commented Mar 1, 2022

Further Testing
Rancher setup: v2.6-head (e93a53c)
Downstream cluster: RKE2 on EC2, k8s v1.21.9+rke2r1

Previously Failed Test Cases:

  • In an RKE2 cluster with 1 etcd, 1 control plane, and 2 worker nodes, scale down the etcd nodes. The user should NOT be able to scale down etcd nodes (note the behavior when there is only 1 etcd node). This now passes.

Further testing
Why are we given the option to delete a single node from the kebab menu next to an individual node?

Doing this and then attempting to scale back up leaves no node reference when bringing up another node.

To Repeat This Issue

  1. Navigate to Cluster Management and open the cluster's machine pools
  2. Click the kebab menu on a single-node cluster component (e.g. etcd)
  3. Press Delete

Relevant Screenshots
[screenshot: KebabMenu]
[screenshot: MachinePools]

@Auston-Ivison-Suse commented
@richard-cox do you think the last comment is an issue?

@richard-cox (Member) commented

> @richard-cox do you think the last comment is an issue?

This should be fine. From my understanding, a deleted machine comes back (the pool replaces it), so deletion may be helpful if that instance is misbehaving, whereas a scaled-down machine will never come back; that removal is permanent.
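
To make the delete-vs-scale-down distinction concrete, a hypothetical sketch (same placeholder names and namespace assumption as the earlier snippet):

```sh
# "Delete" from the kebab menu: the MachineDeployment's replica count is
# unchanged, so CAPI provisions a replacement machine automatically,
# which is useful for recycling a misbehaving instance.
kubectl delete machines.cluster.x-k8s.io -n fleet-default \
  my-cluster-pool1-abc123-xyz45

# "Scale down": the replica count itself is decremented (via the earlier
# annotate-then-scale flow), so no replacement is ever created.
```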

@Auston-Ivison-Suse commented
Setup For Testing
Rancher setup: v2.6-head (c49139d)
Downstream cluster: RKE2 on EC2, k8s v1.21.9+rke2r1

Failed Test Cases:

  • In an RKE2 cluster with 1 etcd, 1 control plane, and 2 worker nodes, scale down the etcd nodes. The user should NOT be able to scale down etcd nodes (note the behavior when there is only 1 etcd node).

Debugging

So the option to scale down etcd is available even when there is only a single etcd node.

The moment you scale the node down, you get the following error:

Could not scrape join URL from periodic output (exit code: 0, length: 0) for machine auston-rke2-auston-etcd-645f8895b4-w4ppf Over the cluster.

It appears the node still exists in the node driver's machine provider, so it wasn't fully deleted, but within Rancher the deleting node hangs.
You can also edit the config and bring up another etcd node; this appears to remove the etcd node from Rancher, but the deleted etcd node still exists within the machine provider.
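
If it helps when reproducing this, the stuck machine should be observable from the management cluster (a hypothetical check; the namespace is assumed and the output is illustrative):

```sh
# The machine hangs in the Deleting phase rather than going away.
kubectl get machines.cluster.x-k8s.io -n fleet-default
# NAME                                       PHASE
# auston-rke2-auston-etcd-645f8895b4-w4ppf   Deleting
```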

Screenshots

[screenshot: DeletingEtcd.png]

[screenshot: DeletingEtcdError.png]

@Auston-Ivison-Suse commented

Moving to Done, seeing as @richard-cox confirms the expected behavior was seen in my testing and the previously failed test case now passes.

@jtravee commented Mar 16, 2022

Confirmed with @catherineluse and @gaktive to add the release note label.
