
Min at zero - design doc #57

Merged
merged 1 commit into kubernetes:master May 11, 2017

Conversation

mwielgus
Contributor

Ref: #43
cc: @MaciekPytel

@mwielgus added this to the CA-0.6 milestone May 11, 2017
@mwielgus requested a review from MaciekPytel May 11, 2017 10:23
@k8s-ci-robot added the cncf-cla: yes label May 11, 2017

# Introduction

One of the common requests for Cluster Autoscaler (for example: [1], [2]) is the ability to scale some node groups to zero. This would definitely be a very useful feature but the implementation is somehow problematic in ScaleUP due to couple reasons:
Contributor

Either actually link examples or remove 'for example' part.

Contributor

s/somehow/somewhat

Contributor Author

Done.



* [P1] There is no live example of what a new node would look like if the currently zero-sized node group was expanded. The node shape is defined as:
Contributor

Can we use some other letter for numbering problems (maybe I for 'issue')? I think most people interpret P1 as priority.

Contributor Author

Done.

* [P3] There is no live example of what DaemonSets would be run on the new node.

In general, the above can be summarized as follows: the full definition of a new node needs to be somehow known before the node is actually created, in order to decide whether the creation of a new node from a particular node group makes sense or not. Scale down has no issues with min@0.
Design
Contributor

/s/Design/# Design

Contributor Author

Done.


Problems P1, P1A, P1B, P1C, P2, P3 needs to be solved. The primary focus is to create a solution for GCE/GKE but the proposed option should be generic enough to allow to expand this feature to other cloud providers if found needed and business-justified.
Contributor

s/needs/need

```
custom-<cpu_count>-<memory_in_mb>
```

So it also quite easy to get all of capacity information from it.
Contributor

s/it/it is
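
For illustration only, a small Go sketch of extracting capacity from a machine type name in that format; the helper name and error handling are assumptions, not from the doc:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCustomMachineType pulls CPU count and memory (MB) out of a GCE
// custom machine type name, e.g. "custom-4-16384". Hypothetical helper.
func parseCustomMachineType(machineType string) (cpu, memMB int64, err error) {
	parts := strings.Split(machineType, "-")
	if len(parts) != 3 || parts[0] != "custom" {
		return 0, 0, fmt.Errorf("unexpected machine type: %q", machineType)
	}
	if cpu, err = strconv.ParseInt(parts[1], 10, 64); err != nil {
		return 0, 0, err
	}
	if memMB, err = strconv.ParseInt(parts[2], 10, 64); err != nil {
		return 0, 0, err
	}
	return cpu, memMB, nil
}

func main() {
	cpu, mem, _ := parseCustomMachineType("custom-4-16384")
	fmt.Printf("capacity: %d vCPU, %d MB\n", cpu, mem)
}
```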


### [P1B] - Node allocatable

In GKE 1.5.6 allocatable for new nodes is equal to capacity. For simplicity we could assume that the new node will have the 90% (or -0.1cpu/-200mb) of capacity. Being wrong or underestimating here is not fatal, most users will probably be OK with this. Once some nodes are present we will have more precise estimates. The worst thing that can happen is that the scale up may not be triggered if the request is exactly at the node capacity - system pods.
Contributor

That is not true on non-GKE cluster on GKE. We could at least mention that fact and put some sort of TODO to revisit this.

Contributor Author

added.
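
For illustration, a minimal Go sketch of the 90%-of-capacity heuristic from the excerpt above, using resource.Quantity; the helper name is made up and the fixed -0.1cpu/-200mb variant is left out:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// estimateAllocatable assumes a brand-new node keeps roughly 90% of its
// capacity available for pods (hypothetical helper, per the heuristic above).
func estimateAllocatable(capacity resource.Quantity) resource.Quantity {
	return *resource.NewMilliQuantity(capacity.MilliValue()*90/100, capacity.Format)
}

func main() {
	cpu := resource.MustParse("4")       // 4 vCPU capacity
	mem := resource.MustParse("16384Mi") // 16 GiB capacity
	allocCPU := estimateAllocatable(cpu)
	allocMem := estimateAllocatable(mem)
	fmt.Printf("estimated allocatable cpu: %s, memory: %s\n",
		allocCPU.String(), allocMem.String())
}
```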


# Solution

Given all of the information above, it should be relatively simple to write such a module, given access to the GCP API and the Kubernetes API server. We will expand the NodeGroup interface (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/cloud_provider.go#L40) with a method EstimateNodeShape, taking no parameters and returning NodeInfo (containing api.Node and all pods running by default on the node) or an error if unable to do so.
Contributor

Update the method name

Contributor Author

Done.
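
A rough Go sketch of the proposed extension, trimmed down for illustration: the NodeInfo struct here is a simplified stand-in for the scheduler type, only a few existing NodeGroup methods are shown, and the method keeps the name used in the doc text (which the review above asks to change):

```go
// Illustrative package; trimmed-down versions of the real types.
package cloudprovider

import apiv1 "k8s.io/api/core/v1"

// NodeInfo bundles an example node with the pods (e.g. DaemonSet pods)
// that would run on it by default. Simplified stand-in for the
// scheduler's NodeInfo type.
type NodeInfo struct {
	Node *apiv1.Node
	Pods []*apiv1.Pod
}

// NodeGroup shows only a few of the existing interface methods plus the
// proposed one.
type NodeGroup interface {
	Id() string
	MinSize() int
	TargetSize() (int, error)

	// EstimateNodeShape returns a template describing what a new node in
	// this group would look like, or an error if no template can be built.
	EstimateNodeShape() (*NodeInfo, error)
}
```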


only if the current size of the node group is 0 or all of the nodes are unready/broken. Otherwise CA will try to estimate the shape of the node using live examples to avoid repeating any mis-estimation errors.
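
Continuing the same illustrative types, a sketch of that selection logic (the function name and signature are made up):

```go
// nodeInfoForGroup prefers a live example node and falls back to the
// group's template only when the group has no usable nodes, to avoid
// repeating any mis-estimation errors.
func nodeInfoForGroup(group NodeGroup, readyNodes []*apiv1.Node) (*NodeInfo, error) {
	if len(readyNodes) > 0 {
		// Build the estimate from a real, ready node.
		// (Pods running on it by default are omitted here for brevity.)
		return &NodeInfo{Node: readyNodes[0]}, nil
	}
	// Size 0, or all nodes unready/broken: fall back to the template.
	return group.EstimateNodeShape()
}
```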

The EstimateNodeShape will also be run on CA startup to ensure that CA is able to build an example for the node pool should the node group min size was set to 0.
Contributor

should the node group min size was set to 0 - that is not grammatical

Contributor Author

Done.
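
And a sketch of the startup check described above, again using the illustrative types (the real wiring in CA would differ; assumes import "fmt"):

```go
// validateTemplatesOnStartup makes sure a template can be built for every
// node group that is allowed to scale to zero, so min@0 groups never become
// un-expandable.
func validateTemplatesOnStartup(groups []NodeGroup) error {
	for _, group := range groups {
		if group.MinSize() > 0 {
			continue // a live example node will always be available
		}
		if _, err := group.EstimateNodeShape(); err != nil {
			return fmt.Errorf("cannot build node template for group %s: %v", group.Id(), err)
		}
	}
	return nil
}
```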


# Design

Problems P1, P1A, P1B, P1C, P2, P3 needs to be solved. The primary focus is to create a solution for GCE/GKE but the proposed option should be generic enough to allow to expand this feature to other cloud providers.
Contributor

s/P1/1, etc

Contributor

s/primary focus/primary focus of this document

```
custom-<cpu_count>-<memory_in_mb>
```

So it ia also quite easy to get all of the capacity information from it.
Contributor

s/ia/is

@MaciekPytel
Contributor

/lgtm

@k8s-ci-robot added the lgtm label May 11, 2017
@mwielgus mwielgus merged commit e17f350 into kubernetes:master May 11, 2017
frobware pushed a commit to frobware/autoscaler that referenced this pull request Mar 19, 2019

UPSTREAM: <carry>: openshift: Rework TestNodeGroupResize
yaroslava-serdiuk pushed a commit to yaroslava-serdiuk/autoscaler that referenced this pull request Feb 22, 2024
Labels
area/cluster-autoscaler
area/provider/gcp (Issues or PRs related to gcp provider)
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
documentation
lgtm ("Looks good to me", indicates that a PR is ready to be merged.)