Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

📖 add proposal for opt-in autoscaling from zero #4283

Merged

Conversation

elmiko
Copy link
Contributor

@elmiko elmiko commented Mar 10, 2021

What this PR does / why we need it:

This PR adds a proposal which describes a mechanism to enable scale from zero operation with the cluster autoscaler.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 10, 2021
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 10, 2021
@elmiko elmiko force-pushed the add-scale-from-zero-proposal branch 2 times, most recently from 1208a3b to 997583a Compare March 10, 2021 17:39
@fabriziopandini
Copy link
Member

/milestone v0.4.0

@k8s-ci-robot k8s-ci-robot added this to the v0.4.0 milestone Mar 16, 2021
@vincepri vincepri modified the milestones: v0.4.0, v0.4.x Mar 22, 2021
@CecileRobertMichon CecileRobertMichon modified the milestones: v0.4.x, v0.4 Mar 22, 2021
@elmiko elmiko force-pushed the add-scale-from-zero-proposal branch from 997583a to 37e9800 Compare March 31, 2021 20:29
@elmiko
Copy link
Contributor Author

elmiko commented Mar 31, 2021

updated to address comments

  • spelling fixes
  • added mention of MachinePools to non-goals/future work
  • removed some ambiguity around descriptions
  • changed name on annotation to include autoscaler

the more i work on the PoC, the more i am thinking it might be appropriate to have the cluster autoscaler retrieve the infrastructure ref instead of reconciling the resource hints into the status on machinesets and machinedeployments.

@elmiko elmiko force-pushed the add-scale-from-zero-proposal branch from 37e9800 to add9087 Compare April 1, 2021 13:08
@enxebre
Copy link
Member

enxebre commented Apr 8, 2021

this seems reasonable to me overall.

@elmiko
Copy link
Contributor Author

elmiko commented Apr 16, 2021

just curious if there are any further thoughts about the cluster-autoscaler directly inspecting the infrastructure templates instead of reconciling that information into the Machine[Set|Deployment]?

@elmiko
Copy link
Contributor Author

elmiko commented May 4, 2021

since it seems like there is no further discussion on the implementation details, i am going to make some changes here to reflect what i am seeing in testing. specifically i would like to not have the machine[sets|deployments] be reconciled and just leave the hinting information as an optional field in the infrastructure ref template.

@elmiko
Copy link
Contributor Author

elmiko commented Jul 7, 2021

no update here, i still need to fixup this enhancement to be in line with my proof of concept. after i have the PoC working well, i would like to demo for the group. hopefully i can finish this work in the next few weeks.

@fabriziopandini
Copy link
Member

/lgtm
thank you really much for the hard work and the perseverance in moving forward this effort. Really appreciated!

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 23, 2021
@elmiko
Copy link
Contributor Author

elmiko commented Sep 23, 2021

update

  • add more descriptive users to user stories
  • change type on ResourceName and ResourceList to better reflect their provenance

i think the only outstanding discussion is #4283 (review) , unless there is a blocker there i think i've addressed all the concerns.

@elmiko
Copy link
Contributor Author

elmiko commented Sep 29, 2021

update

  • removed autoscaler prefix from API spec
  • modified annotations to follow suggestions from review

@elmiko
Copy link
Contributor Author

elmiko commented Sep 29, 2021

update

  • changed GPU API reference to include type
  • added gpu-type and gpu-count annotations

@elmiko elmiko force-pushed the add-scale-from-zero-proposal branch from 1bb2009 to 69d4703 Compare October 6, 2021 16:34
@elmiko elmiko force-pushed the add-scale-from-zero-proposal branch from 69d4703 to 287efed Compare October 6, 2021 16:37
@elmiko
Copy link
Contributor Author

elmiko commented Oct 6, 2021

update

  • change annotations to use capacity.cluster-autoscaler.kubernetes.io prefix
  • removed extra scoping on capacity status object

Copy link
Member

@vincepri vincepri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 6, 2021
@randomvariable
Copy link
Member

/lgtm

@sbueringer
Copy link
Member

As discussed in the CAPI meeting today, lazy consensus until ~Friday

@sbueringer
Copy link
Member

/lgtm

1 similar comment
@fabriziopandini
Copy link
Member

/lgtm

@vincepri
Copy link
Member

vincepri commented Oct 9, 2021

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 9, 2021
@k8s-ci-robot k8s-ci-robot merged commit ee1704a into kubernetes-sigs:main Oct 9, 2021
@elmiko elmiko deleted the add-scale-from-zero-proposal branch October 11, 2021 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet