Skip to content

Managed Cluster Topology and cluster-autoscaler #8217

Closed
kubernetes/autoscaler
#8126
@MaxFedotov

Description

@MaxFedotov

Hi @elmiko and @sbueringer.

Want to go back to this issue after #7990 was merged.
If I understand correctly, now it is possible to add a new machineDeployment to managed Cluster and specify cluster-autoscaler annotations in spec like:

    workers:
      machineDeployments:
      - class: default-worker
        metadata:
          annotations:
            cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "4"
            cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "2"
        name: md-1

and defaulting webhook will set replicas field of created machineDeployment to a value specified in cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size annotation.

But if I will update this field, for example, set cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "3" in Cluster spec, defaulting webhook will not update replicas field, due to this check

func calculateMachineDeploymentReplicas(ctx context.Context, oldMD *MachineDeployment, newMD *MachineDeployment, dryRun bool) (int32, error) {
	// If replicas is already set => Keep the current value.
	if newMD.Spec.Replicas != nil {
		return *newMD.Spec.Replicas, nil
	}

And this completely prevents an ability to use cluster-autoscaler annotations when using managed Cluster topologies and machineDeployment is already created. Because if machineDeployments are managed by topology controller and workers .machineDeployments .[].metadata .annotations are updated it generates a Patch which includes only updated fields. So a user doesn't have any ability to set newMD.Spec.Replicas to nil value.

Maybe it is possible to modify this check and exclude machineDeployments created by topology controller from it?

Thanks!

/kind feature

Activity

added
kind/featureCategorizes issue or PR as related to a new feature.
needs-triageIndicates an issue or PR lacks a `triage/foo` label and requires one.
on Mar 2, 2023
sbueringer

sbueringer commented on Mar 2, 2023

@sbueringer
Member

But if I will update this field, for example, set cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "3" in Cluster spec, defaulting webhook will not update replicas field, due to this check

In my opinion this is intended behavior and a limitation/feature of the autoscaler.

If you use a MachineDeployment without ClusterClass you would get the same effect.

Please also see my comment here: #7293 (comment)

elmiko

elmiko commented on Mar 2, 2023

@elmiko
Contributor

If you use a MachineDeployment without ClusterClass you would get the same effect.

i guess my question here is, if using MachineDeployment without ClusterClass, i could update the min size annotation and then update the replicas to the new value. how would i do that with a ClusterClass managing the MachineDeployment?

sbueringer

sbueringer commented on Mar 2, 2023

@sbueringer
Member

You would update the annotation, then set the replica field in Cluster.spec.topology and then unset it again.

sbueringer

sbueringer commented on Mar 2, 2023

@sbueringer
Member

Not ideal, but I think it's better then if we try to fixup the replica field on the MD based on the annotation and thus break the functionality that autoscaler doesn't act when replicas is outside of the range

sbueringer

sbueringer commented on Mar 2, 2023

@sbueringer
Member

or instead of setting / unsetting the replica field in Cluster.spec.topology you can also only set it directly on the MD

elmiko

elmiko commented on Mar 2, 2023

@elmiko
Contributor

You would update the annotation, then set the replica field in Cluster.spec.topology and then unset it again.

i thought we had to leave this field unset if we are using with the autoscaler? (sorry if i've got the details wrong here)

Not ideal, but I think it's better then if we try to fixup the replica field on the MD based on the annotation and thus break the functionality that autoscaler doesn't act when replicas is outside of the range

this is my concern too

or instead of setting / unsetting the replica field in Cluster.spec.topology you can also only set it directly on the MD

ack, i'm curious to hear what @MaxFedotov has to say as he is using this feature much more than i am currently.

sbueringer

sbueringer commented on Mar 2, 2023

@sbueringer
Member

You would update the annotation, then set the replica field in Cluster.spec.topology and then unset it again.

i thought we had to leave this field unset if we are using with the autoscaler? (sorry if i've got the details wrong here)

It has to be unset for autoscaler to be able to control the field continuously (otherwise they both would try to set the field continuously). But if the goal is basically to temporarily take control of the field to bring it in the min/max range you can set it and directly unset it afterwards

elmiko

elmiko commented on Mar 2, 2023

@elmiko
Contributor

But if the goal is basically to temporarily take control of the field to bring it in the min/max range you can set it and directly unset it afterwards

that sounds like the user can at least work around this issue manually

sbueringer

sbueringer commented on Mar 2, 2023

@sbueringer
Member

I think it's definitely not worse than with the standalone MachineDeployment.

In both cases they:

  1. change the min-size annotation
  2. set the replica field to bring it in range (with Cluster topology they can either do it also directly on the MD or by setting and unsetting via Cluster.spec.topology)
MaxFedotov

MaxFedotov commented on Mar 2, 2023

@MaxFedotov
ContributorAuthor

Not ideal, but I think it's better then if we try to fixup the replica field on the MD based on the annotation and thus break the functionality that autoscaler doesn't act when replicas is outside of the range

Seems like I don't understand what can go wrong in this case :( If we will set replicas on MD equal to autoscaler min annotation, how it can be outside of a range?

I think it's definitely not worse than with the standalone MachineDeployment.

I have to argue about it. In the case of standalone MD, a user needs a single atomic update operation (set annotations, remove replicas). In the case of a Cluster Topology user needs at least 2 operations (but really it will be 3 - update\get\update), which can be hard to make atomically.

sbueringer

sbueringer commented on Mar 2, 2023

@sbueringer
Member

Seems like I don't understand what can go wrong in this case :( If we will set replicas on MD equal to autoscaler min annotation, how it can be outside of a range?

Autoscaler today intentionally does not act when the value of the replica field is outside of the min/max range. If we want to preserve this behavior we can't implement something that always changes the replica field to be inside of the range.

I think it's definitely not worse than with the standalone MachineDeployment.

I have to argue about it. In the case of standalone MD, a user needs a single atomic update operation (set annotations, remove replicas). In the case of a Cluster Topology user needs at least 2 operations (but really it will be 3 - update\get\update), which can be hard to make atomically.

That's a valid point

MaxFedotov

MaxFedotov commented on Mar 3, 2023

@MaxFedotov
ContributorAuthor

Autoscaler today intentionally does not act when the value of the replica field is outside of the min/max range. If we want to preserve this behavior we can't implement something that always changes the replica field to be inside of the range.

But if the user specified intentionally using cluster autoscaler annotation that he wants min number of nodes in nodegroup to be equal to some number, wouldn't it be the correct behavior to set a current number of replicas to this number (because autoscaler won't do it, it will only ensure that the number of replicas won't drop below this number)?

40 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Labels

kind/featureCategorizes issue or PR as related to a new feature.needs-triageIndicates an issue or PR lacks a `triage/foo` label and requires one.priority/backlogHigher priority than priority/awaiting-more-evidence.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

    Development

    Participants

    @elmiko@sbueringer@fabriziopandini@MaxFedotov@k8s-ci-robot

    Issue actions

      Managed Cluster Topology and cluster-autoscaler · Issue #8217 · kubernetes-sigs/cluster-api