Vertical Scaling of Pods #21

Closed
13 of 18 tasks
MikeSpreitzer opened this issue Jul 12, 2016 · 69 comments
Assignees
Labels
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
  • sig/autoscaling: Categorizes an issue or PR as relevant to SIG Autoscaling.
  • sig/node: Categorizes an issue or PR as relevant to SIG Node.
  • stage/beta: Denotes an issue tracking an enhancement targeted for Beta status.
  • tracked/no: Denotes an enhancement issue is NOT actively being tracked by the Release Team.

Comments

@MikeSpreitzer
Member

MikeSpreitzer commented Jul 12, 2016

Description

Make it possible to vary the resource limits on a pod over its lifetime. In particular, this is valuable for pets (i.e., pods that are very costly to destroy and re-create).

This was discussed in the Node SIG meeting of 12 July 2016, where it was noted that this is a big cross-cutting issue and that @ncdc might be an appropriate owner.
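
To make the goal concrete, here is a minimal illustrative sketch (the pod name, image, and resource values are placeholders, not part of any agreed design): a running pod like the one below, whose limits a user or controller later wants to raise, say limits.cpu from 500m to 1, with the kubelet applying the change in place rather than requiring the pod to be deleted and recreated.

```yaml
# Illustrative sketch only: a running pod whose resources we would later
# like to change in place. All names and values are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: 250m
        memory: 128Mi
      limits:
        cpu: 500m      # the value we would later want to raise without a restart
        memory: 256Mi
```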

Progress Tracker

  • Before Alpha
    • Design Approval
    • Write (code + tests + docs) then get them merged. ALL-PR-NUMBERS
      • Code needs to be disabled by default. Verified by code OWNERS
      • Minimal testing
      • Minimal docs
        • cc @kubernetes/docs on docs PR
        • cc @kubernetes/feature-reviewers on this issue to get approval before checking this off
        • New apis: Glossary Section Item in the docs repo: kubernetes/kubernetes.github.io
      • Update release notes
  • Before Beta
    • Testing is sufficient for beta
    • User docs with tutorials
      • Updated walkthrough / tutorial in the docs repo: kubernetes/kubernetes.github.io
      • cc @kubernetes/docs on docs PR
      • cc @kubernetes/feature-reviewers on this issue to get approval before checking this off
    • Thorough API review
      • cc @kubernetes/api
  • Before Stable
    • docs/proposals/foo.md moved to docs/design/foo.md
      • cc @kubernetes/feature-reviewers on this issue to get approval before checking this off
    • Soak, load testing
    • detailed user docs and examples
      • cc @kubernetes/docs
      • cc @kubernetes/feature-reviewers on this issue to get approval before checking this off

FEATURE_STATUS is used for feature tracking and to be updated by @kubernetes/feature-reviewers.
FEATURE_STATUS: IN_DEVELOPMENT

More advice:

Design

  • Once you get LGTM from a @kubernetes/feature-reviewers member, you can check this checkbox, and the reviewer will apply the "design-complete" label.

Coding

  • Use as many PRs as you need. Write tests in the same or different PRs, as is convenient for you.
  • As each PR is merged, add a comment to this issue referencing the PRs. Code goes in the http://github.com/kubernetes/kubernetes repository,
    and sometimes http://github.com/kubernetes/contrib, or other repos.
  • When you are done with the code, apply the "code-complete" label.
  • When the feature has user docs, please add a comment mentioning @kubernetes/feature-reviewers and they will
    check that the code matches the proposed feature and design, and that everything is done, and that there is adequate
    testing. They won't do detailed code review: that already happened when your PRs were reviewed.
    When that is done, you can check this box and the reviewer will apply the "code-complete" label.

Docs

  • Write user docs and get them merged in.
  • User docs go into http://github.com/kubernetes/kubernetes.github.io.
  • When the feature has user docs, please add a comment mentioning @kubernetes/docs.
  • When you get LGTM, you can check this checkbox, and the reviewer will apply the "docs-complete" label.
@erictune
Member

@MikeSpreitzer Did you reach any kind of consensus within Node SIG about how to solve this issue? Do you have a group of people who are ready to start coding up that agreed-upon thing? If not, it might be a bit early to open an issue.

@MikeSpreitzer
Member Author

I am new here and was told this has been a long-standing desire, some work has already been accomplished, and some planning has already been done. I asked how to pull all the existing thinking together and organize to get onto a path to making it happen, and was told to start here.

@davidopp
Member

davidopp commented Jul 12, 2016

@timothysc
Member

@idvoretskyi idvoretskyi modified the milestone: v1.4 Jul 18, 2016
@idvoretskyi idvoretskyi added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Aug 4, 2016
@alex-mohr alex-mohr modified the milestones: v1.5, v1.4 Aug 17, 2016
@alex-mohr

Doesn't appear to have had any traction in 1.4, so pushing to 1.5 -- chime in if that's incorrect.

@ConnorDoyle
Member

Hi @MikeSpreitzer, we (Intel) would like to help out with this in the 1.5 timeline. Can we start by listing the goals/requirements as currently understood, maybe in a shared doc?

This feature is quite large. Previous discussions suggest breaking it down into phases.

It seems like some dependencies can be broken off and parallelized, for example enabling in-place update for compressible resources.

@MikeSpreitzer
Member Author

Clearly this is not going to land in 1.4. Yes, let's start by breaking this big thing down into phases and pieces. Would someone with more background on this like to take a whack at that?

@davidopp
Member

davidopp commented Aug 24, 2016

A design doc would be a good start, but even before that, we need some open-ended discussion about what the goals and requirements are. Maybe we should discuss this in kubernetes/kubernetes#10782? That discussion is more than a year old, but I'd like to avoid opening another issue (and the issues repo is definitely not the right place for design discussions).

@fgrzadkowski

@idvoretskyi
Member

@MikeSpreitzer @kubernetes/sig-node can you clarify the actual status of the feature?

@fgrzadkowski

@kubernetes/autoscaling

@fgrzadkowski

Btw, I think this feature should be discussed and sponsored by sig-autoscaling, not sig-node. Obviously there are a number of features/changes needed at the node level to make this work correctly, but I strongly believe we should keep it within the aforementioned SIG. Any thoughts on that?

@idvoretskyi
Member

@fgrzadkowski if you, as a SIG Autoscaling lead, would like to sponsor the feature, I have no objections.

@idvoretskyi idvoretskyi added the sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. label Oct 13, 2016
@idvoretskyi
Member

@kubernetes/sig-node @dchen1107 are you going to cooperate with @kubernetes/autoscaling on this feature work, or would you prefer SIG Autoscaling to work on it alone?

@idvoretskyi
Member

@MikeSpreitzer @kubernetes/autoscaling any updates on this feature?

@DirectXMan12
Contributor

cc @derekwaynecarr

From the autoscaling side, we're blocked on the node changes. From the node side, I think that needs an interface in the CRI to vary resource limits, which we might not see for a while.

@idvoretskyi
Member

@MikeSpreitzer Does this feature target alpha for 1.5?

@fgrzadkowski

This feature will not land in 1.5. Removing milestone.

We will be working on this feature for 1.6. Reassigning to folks who are already working on a design and will pursue implementation in Q1.

@fgrzadkowski fgrzadkowski modified the milestones: next-milestone, v1.5 Nov 17, 2016
@davidopp
Member

IIUC, the prerequisites for at least part of this are:

  1. historical data from Infrastore
  2. kubelet in-place resource update

I'm not sure what your plan is for (1), but from my last chat with Dawn, (2) wouldn't be feasible to begin implementing before Q2. (It's not a trivial feature.)

cc/ @dchen1107

@fgrzadkowski

As explained in kubernetes/kubernetes#10782 (comment):

  • We don't need in-place update for the MVP of the vertical pod autoscaler. We can just be more conservative and recreate pods via deployments (see the sketch after this list).
  • Infrastore would be useful, but for the MVP we can aggregate this data in the VPA controller (if we don't have Infrastore by then) or read it from a monitoring pipeline.
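
For illustration, here is a rough sketch of what the recreate-based MVP described above eventually looked like as a VPA custom resource. The exact API had not been designed at the time of this comment; the object and target names are placeholders, and the fields follow the VPA CRD as later published in kubernetes/autoscaler.

```yaml
# Sketch of the recreate-based approach: the VPA updater evicts pods whose
# requests drift too far from the recommendation, and the owning Deployment
# recreates them with updated resources. Names are placeholders.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Recreate"   # apply recommendations by evicting and recreating pods
```

In this mode no in-place update is needed; the design relies only on the Deployment's existing recreate machinery.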

@k8s-ci-robot k8s-ci-robot removed the stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status label Aug 4, 2018
@justaugustus justaugustus added the tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team label Aug 4, 2018
@zparnold
Member

Hey there! @MikeSpreitzer I'm the docs wrangler for this release. Is there any chance I could have you open a docs PR against the release-1.12 branch as a placeholder? That would give us more confidence in the feature shipping in this release and give me something to work with when we start doing reviews/edits. Thanks! If this feature does not require docs, could you please update the features tracking spreadsheet to reflect that?

@jimangel
Member

@mwielgus @kgrygiel Bump for docs ☝️

@justaugustus
Member

@mwielgus @kgrygiel --
Any update on docs status for this feature? Are we still planning to land it for 1.12?
At this point, code freeze is upon us, and docs are due on 9/7 (2 days).
If we don't hear anything back regarding this feature ASAP, we'll need to remove it from the milestone.

cc: @zparnold @jimangel @tfogo

@mwielgus
Contributor

mwielgus commented Sep 7, 2018

This is landing around 1.12; however, it is launching as an independent add-on and is not included in the 1.12 Kubernetes release. SIG Architecture decided at the beginning of this cycle to keep the VPA API as a CRD and thus not bind it to any particular Kubernetes release.

@justaugustus
Member

Thanks for the update!

@karthickrajamani

karthickrajamani commented Sep 7, 2018

@justaugustus, so can we continue to use this issue for tracking live, in-place vertical scaling (kubernetes/community#1719), which is what it was originally created for by Mike, given that VPA is not bound to a particular K8S release?

@justaugustus
Member

@karthickrajamani -- yep, it's fine to keep tracking here.

@zparnold
Member

@mwielgus Are we going to have some documentation for this feature before the 1.12 release date? Since it's independent, I'm willing to let it not be counted in this release, as long as it gets some attention before it's officially released as an add-on. Does that sound good?

@mwielgus
Contributor

@zparnold There will be no extra documentation to include in 1.12.

@zparnold
Member

zparnold commented Sep 12, 2018 via email

@claurence

Kubernetes 1.13 is going to be a 'stable' release since the cycle is only 10 weeks. We are discouraging big alpha features and will only consider adding this one if you have a high level of confidence it will make code slush by 11/09. Are there plans for this enhancement to graduate to alpha/beta/stable within the 1.13 release cycle? If not, can you please remove it from the 1.12 milestone or add it to 1.13?

We are also now encouraging every new enhancement to align with a KEP. If a KEP has been created, please link to it in the original post. Please take the opportunity to develop a KEP.

@kacole2 kacole2 added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team and removed tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels Oct 8, 2018
@kacole2
Member

kacole2 commented Oct 8, 2018

Hi. Following up from @claurence.
This enhancement has been tracked before, so we'd like to check in and see if there are any plans for this to graduate stages in Kubernetes 1.13. This release is targeted to be more ‘stable’ and will have an aggressive timeline. Please only include this enhancement if there is a high level of confidence it will meet the following deadlines:

  • Docs (open placeholder PRs): 11/8
  • Code Slush: 11/9
  • Code Freeze Begins: 11/15
  • Docs Complete and Reviewed: 11/27

Please take a moment to update the milestones on your original post for future tracking, and ping @kacole2 if this needs to be included in the 1.13 Enhancements Tracking Sheet.

Thanks!

@mwielgus mwielgus removed this from the v1.12 milestone Oct 11, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 9, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 7, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

ingvagabund pushed a commit to ingvagabund/enhancements that referenced this issue Apr 2, 2020
brahmaroutu pushed a commit to brahmaroutu/enhancements that referenced this issue Aug 4, 2020
@sftim
Contributor

sftim commented Jul 28, 2023

See kubernetes/kubernetes#116214
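
kubernetes/kubernetes#116214 tracks in-place pod resize, which eventually shipped (alpha in 1.27) roughly along these lines: each container can declare a resizePolicy, and a resize is applied by patching the pod's resources while it runs. The snippet below is a hedged sketch assuming that feature as later released; consult the current Kubernetes docs for the exact API, and note that all names and values here are placeholders.

```yaml
# Sketch of a container spec under in-place pod resize (as the feature later
# shipped, behind the InPlacePodVerticalScaling feature gate). resizePolicy
# tells the kubelet whether a resource change requires a container restart.
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo
spec:
  containers:
  - name: app
    image: nginx
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired      # CPU changes applied without restarting
    - resourceName: memory
      restartPolicy: RestartContainer # memory changes restart the container
    resources:
      requests:
        cpu: 250m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 256Mi
```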
