
In-Place Update of Pod Resources #1287

Open
24 of 26 tasks
vinaykul opened this issue Oct 8, 2019 · 177 comments
Assignees
Labels
kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lead-opted-in Denotes that an issue has been opted in to a release sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status
Milestone

Comments

@vinaykul
Contributor

vinaykul commented Oct 8, 2019

Enhancement Description

Please keep this description up to date. This will help the Enhancement Team efficiently track the evolution of the enhancement.

  1. Identify CRI changes needed for UpdateContainerResources API, define response message for UpdateContainerResources

    • Extend UpdateContainerResources API to return info such as ‘not supported’, ‘not enough memory’, ‘successful’, ‘pending page evictions’ etc.
    • Define expected behavior for runtime when UpdateContainerResources is invoked. Define timeout duration of the CRI call.
      • Resolution: Separate KEP for CRI changes.
        • Discussed draft CRI changes with SIG-Node on Oct 22, and we agreed to do this as an incremental change outside the scope of this KEP, in a new mini-KEP. It does not block implementation of this KEP.
  2. Define behavior when multiple containers are being resized, and UpdateContainerResources fails for one or more containers.

    • One Possible solution:
      • Do not update Status.Resources.Limits if the UpdateContainerResources API fails, and keep retrying until it succeeds (see the sketch after this list).
  3. Check with API reviewers whether we can keep maps instead of a list of named sub-objects for ResizePolicy.

    • After discussion with @liggitt , we are going to use list of named subobjects for extensibility.
  4. Can we find a more intuitive name for ResizePolicy?

  5. Can we use ResourceVersion to figure out the ordering of Pod resize requests?

  6. Do we need to add back the ‘RestartPod’ resize policy? Is there a strong use-case for it?

    • Resolution: No.
      • Discussed with SIG-Node on Oct 15th, not adding RestartPod policy for simplicity, will revisit if we encounter problems.
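
A minimal sketch of the retry approach described in item 2 above. This is illustrative only, not kubelet code: updateContainerResources is a hypothetical stand-in for the CRI call, and the status update is represented by a print statement.

package main

import (
	"errors"
	"fmt"
	"time"
)

// updateContainerResources is a hypothetical stand-in for the CRI
// UpdateContainerResources call; here it always rejects the request.
func updateContainerResources(containerID string, cpuLimit, memLimitBytes int64) error {
	return errors.New("not enough memory")
}

// resizeContainer keeps retrying the runtime update and only records the new
// limits in status (represented by the final print) once the runtime has
// actually applied them, which is the behavior proposed in item 2.
func resizeContainer(containerID string, cpuLimit, memLimitBytes int64, attempts int) error {
	for i := 0; i < attempts; i++ {
		if err := updateContainerResources(containerID, cpuLimit, memLimitBytes); err != nil {
			fmt.Printf("resize of %s failed (%v), retrying\n", containerID, err)
			time.Sleep(100 * time.Millisecond)
			continue
		}
		fmt.Printf("resize of %s applied, updating Status.Resources.Limits\n", containerID)
		return nil
	}
	return fmt.Errorf("resize of %s not applied after %d attempts", containerID, attempts)
}

func main() {
	_ = resizeContainer("c1", 500, 256<<20, 3)
}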

Alpha Feature Code Issues:
These are items and issues discovered during code review that need further discussion and must be addressed before Beta.

  1. Can we figure out GetPodQOS differently once it is determined on pod create? See In-place Pod Vertical Scaling feature kubernetes#102884 (comment)
  2. How do we deal with a pod that requests 1m/1m cpu requests/limits. See In-place Pod Vertical Scaling feature kubernetes#102884 (comment)
  3. Add internal representation of ContainerStatus.Resources in kubeContainer. Convert it to ContainerStatus.Resources in kubelet_pods generate functions. See In-place Pod Vertical Scaling feature kubernetes#102884 (comment) and In-place Pod Vertical Scaling feature kubernetes#102884 (comment) and In-place Pod Vertical Scaling feature kubernetes#102884 (comment)
  4. Can we get rid of resize mutex? Is there a better way to handle resize retries? See In-place Pod Vertical Scaling feature kubernetes#102884 (comment)
  5. Can we recover from resize checkpoint store failures? See In-place Pod Vertical Scaling feature kubernetes#102884 (comment)
  6. CRI clarification for ContainerStatus.Resources and how to handle runtimes that don't support it. See In-place Pod Vertical Scaling feature kubernetes#102884 (comment)
  7. Add real values to dockershim test for ContainerStatus.Resources In-place Pod Vertical Scaling feature kubernetes#102884 (comment)
    • Resolution: Not required due to dockershim deprecation.
  8. Change PodStatus.Resources from v1.ResourceRequirements to *v1.ResourceRequirements (see the sketch after this list)
    • Resolution: Fixed
  9. Address all places in the code that have 'TODO(vinaykul)'
  10. The current implementation does not work with the node topology manager enabled. This limitation is not captured in the KEP. Add this to the release documentation for alpha; we will address it in beta. See In-place Pod Vertical Scaling feature kubernetes#102884 (comment)
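
For item 8, the motivation for the pointer type is that nil can represent "resources not reported yet", which a zero-valued struct cannot. A small sketch under that assumption (the local containerStatus type is illustrative; the real field lives on the v1 API types):

package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// containerStatus is an illustrative stand-in for the API type: a pointer lets
// "resources not reported yet" (nil) be distinguished from an empty value.
type containerStatus struct {
	Name      string
	Resources *v1.ResourceRequirements
}

func main() {
	unreported := containerStatus{Name: "app"} // nil: runtime has not reported yet
	reported := containerStatus{
		Name: "app",
		Resources: &v1.ResourceRequirements{
			Limits: v1.ResourceList{v1.ResourceCPU: resource.MustParse("500m")},
		},
	}
	for _, cs := range []containerStatus{unreported, reported} {
		if cs.Resources == nil {
			fmt.Printf("%s: resources not reported\n", cs.Name)
			continue
		}
		fmt.Printf("%s: cpu limit %s\n", cs.Name, cs.Resources.Limits.Cpu())
	}
}
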
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Oct 8, 2019
@vinaykul
Contributor Author

vinaykul commented Oct 8, 2019

/assign @vinaykul

@jeremyrickard
Contributor

jeremyrickard commented Oct 9, 2019

👋 Hey there @vinaykul. I'm a shadow on the 1.17 Release Team, working on Enhancements. We're tracking issues for the 1.17 release and I wanted to reach out and ask whether we should track this (or, more specifically, the In-Place Update of Pod Resources feature) for 1.17?

The current release schedule is:

Monday, September 23 - Release Cycle Begins
Tuesday, October 15, EOD PST - Enhancements Freeze
Thursday, November 14, EOD PST - Code Freeze
Tuesday, November 22 - Docs must be completed and reviewed
Monday, December 9 - Kubernetes 1.17.0 Released

We're only 5 days away from the Enhancements Freeze, so if you intend to graduate this capability in the 1.17 release, here are the requirements that you'll need to satisfy:

  • KEP must be merged in implementable state
  • KEP must define graduation criteria
  • KEP must have a test plan defined

Thanks @vinaykul

@vinaykul
Contributor Author

  • KEP must be merged in implementable state
  • KEP must define graduation criteria
  • KEP must have a test plan defined

Hi @jeremyrickard, I'll do my best to get this KEP to an implementable state by next Tuesday, but it looks like a stretch at this point - the major item is to complete the API review with @thockin, and that depends on his availability.

The actual code changes are not that big. Nevertheless, the safe option would be to track this for the 1.18.0 release; I'll update you by next Monday.

CC: @dashpole @derekwaynecarr @dchen1107

@mrbobbytables mrbobbytables added sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Oct 14, 2019
@k8s-ci-robot k8s-ci-robot removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 14, 2019
@mrbobbytables mrbobbytables added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team and removed tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team labels Oct 14, 2019
@mrbobbytables mrbobbytables added this to the v1.17 milestone Oct 14, 2019
@vinaykul
Contributor Author

@jeremyrickard @mrbobbytables This KEP will take some more discussion - key thing is API review. It does not look like @thockin or another API reviewer is available soon. Could we please track this KEP for v1.18?
Thanks,

@jeremyrickard
Contributor

/milestone v1.18

@k8s-ci-robot k8s-ci-robot modified the milestones: v1.17, v1.18 Oct 14, 2019
@jeremyrickard jeremyrickard added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team and removed tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels Oct 14, 2019
@vinaykul
Contributor Author

@PatrickLang Here's a first stab at the proposed CRI change to allow UpdateContainerResources to work with Windows. Please take a look; let's discuss in tomorrow's SIG meeting.

root@skibum:~/km16/staging/src/k8s.io/cri-api# git diff --cached .
diff --git a/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto b/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto
index 0290d0f..b05bb56 100644
--- a/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto
+++ b/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto
@@ -924,14 +924,33 @@ message ContainerStatusResponse {
     map<string, string> info = 2;
 }
 
+// ContainerResources holds the fields representing a container's resource limits
+message ContainerResources {
+    // Resource configuration specific to Linux container.
+    LinuxContainerResources linux = 1;
+    // Resource configuration specific to Windows container.
+    WindowsContainerResources windows = 2;
+}
+
 message UpdateContainerResourcesRequest {
     // ID of the container to update.
     string container_id = 1;
-    // Resource configuration specific to Linux containers.
+    // Resource configuration specific to Linux container.
     LinuxContainerResources linux = 2;
+    // Resource configuration specific to Windows container.
+    WindowsContainerResources windows = 3;
 }
 
-message UpdateContainerResourcesResponse {}
+message UpdateContainerResourcesResponse {
+    // ID of the container that was updated.
+    string container_id = 1;
+    // Resource configuration currently applied to the Linux container.
+    LinuxContainerResources linux = 2;
+    // Resource configuration currently applied to the Windows container.
+    WindowsContainerResources windows = 3;
+    // Error message if UpdateContainerResources fails in the runtime.
+    string error_message = 4;
+}
 
 message ExecSyncRequest {
     // ID of the container.
diff --git a/staging/src/k8s.io/cri-api/pkg/apis/services.go b/staging/src/k8s.io/cri-api/pkg/apis/services.go
index 9a22ecb..9f1d893 100644
--- a/staging/src/k8s.io/cri-api/pkg/apis/services.go
+++ b/staging/src/k8s.io/cri-api/pkg/apis/services.go
@@ -44,7 +44,7 @@ type ContainerManager interface {
        // ContainerStatus returns the status of the container.
        ContainerStatus(containerID string) (*runtimeapi.ContainerStatus, error)
        // UpdateContainerResources updates the cgroup resources for the container.
-       UpdateContainerResources(containerID string, resources *runtimeapi.LinuxContainerResources) error
+       UpdateContainerResources(containerID string, resources *runtimeapi.ContainerResources) error
        // ExecSync executes a command in the container, and returns the stdout output.
        // If command exits with a non-zero exit code, an error is returned.
        ExecSync(containerID string, cmd []string, timeout time.Duration) (stdout []byte, stderr []byte, err error)
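
For illustration, a caller of the proposed interface might populate the new wrapper roughly as follows. This is a sketch against the draft above, not merged code: ContainerResources does not exist in the released CRI, and the package layout and resource values are placeholders.

package kuberuntime // illustrative; not the actual kubelet package layout

import (
	cri "k8s.io/cri-api/pkg/apis"
	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1alpha2"
)

// resize sketches a kubelet-side caller of the proposed interface: populate
// only the field matching the container's platform (Linux here) in the new
// ContainerResources wrapper, then invoke the updated method.
func resize(cm cri.ContainerManager, containerID string) error {
	resources := &runtimeapi.ContainerResources{
		Linux: &runtimeapi.LinuxContainerResources{
			CpuPeriod:          100000,            // 100ms scheduling period
			CpuQuota:           50000,             // 0.5 CPU
			MemoryLimitInBytes: 256 * 1024 * 1024, // 256Mi
		},
	}
	return cm.UpdateContainerResources(containerID, resources)
}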

@dashpole
Contributor

dashpole commented Oct 24, 2019

@vinaykul It looks like since the above PR was merged, this was removed from the API review queue. I believe you need to open a new PR that moves the state to implementable, and then add the API-review label to get it back in the queue and get a reviewer.

Edit: you should also include any other changes (e.g. windows CRI changes) required to move the feature to implementable in the PR as well.

@vinaykul
Contributor Author

@vinaykul It looks like since the above PR was merged, this was removed from the API review queue. I believe you need to open a new PR that moves the state to implementable, and then add the API-review label to get it back in the queue and get a reviewer.

Edit: you should also include any other changes (e.g. windows CRI changes) required to move the feature to implementable in the PR as well.

@dashpole Thanks!

I've started a provisional mini-KEP for the CRI changes per our discussion last week (Dawn mentioned last week that we should take that up separately). IMHO the CRI changes do not block the implementation of this KEP, as they are between the kubelet and the runtime, and the user is not affected by them.

In a second commit to the same PR, I've addressed another key issue (update API failure handling) and requested a change to move the primary KEP to implementable.

With this, everything is in one place, and we can use it for API review.

@palnabarun
Member

palnabarun commented Jan 13, 2020

Hey there @vinaykul -- 1.18 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating to alpha in 1.18?

The current release schedule is:

  • Monday, January 6th - Release Cycle Begins
  • Tuesday, January 28th EOD PST - Enhancements Freeze
  • Thursday, March 5th, EOD PST - Code Freeze
  • Monday, March 16th - Docs must be completed and reviewed
  • Tuesday, March 24th - Kubernetes 1.18.0 Released

To be included in the release,

  1. The KEP PR must be merged
  2. The KEP must be in an implementable state
  3. The KEP must have test plans and graduation criteria.

If you would like to include this enhancement, once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍

We'll be tracking enhancements here: http://bit.ly/k8s-1-18-enhancements

Thanks! :)

@vinaykul
Contributor Author

Hey there @vinaykul -- 1.18 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating to alpha in 1.18?

The current release schedule is:

  • Monday, January 6th - Release Cycle Begins
  • Tuesday, January 28th EOD PST - Enhancements Freeze
  • Thursday, March 5th, EOD PST - Code Freeze
  • Monday, March 16th - Docs must be completed and reviewed
  • Tuesday, March 24th - Kubernetes 1.18.0 Released

To be included in the release,

  1. The KEP PR must be merged
  2. The KEP must be in an implementable state
  3. The KEP must have test plans and graduation criteria.

If you would like to include this enhancement, once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍

We'll be tracking enhancements here: http://bit.ly/k8s-1-18-enhancements

Thanks! :)

@palnabarun Yes, I'm planning to work towards alpha code targets for this feature in 1.18. I've updated the KEP adding test plan and graduation criteria sections that I will be reviewing with SIG-Node this week and hope to get it implementable before Jan 28. I'll update this thread if anything changes.

@palnabarun
Member

Thank you @vinaykul for the updates. :)

@palnabarun
Member

/stage alpha

@k8s-ci-robot k8s-ci-robot added the stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status label Jan 14, 2020
@palnabarun
Member

/milestone v1.18

@palnabarun palnabarun removed the tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team label Jan 14, 2020
@npolshakova

npolshakova commented Sep 22, 2023

Hello @vinaykul @Jeffwan 👋, 1.29 Enhancements team here!

Just checking in as we approach enhancements freeze on 01:00 UTC, Friday, 6th October, 2023.

This enhancement is targeting stage alpha for 1.29 (correct me if otherwise).

Here's where this enhancement currently stands:

  • KEP readme using the latest template has been merged into the k/enhancements repo.
  • KEP status is marked as implementable for latest-milestone: 1.29. KEPs targeting stable will need to be marked as implemented after code PRs are merged and the feature gates are removed.
  • KEP readme has up-to-date graduation criteria
  • KEP has a production readiness review that has been completed and merged into k/enhancements. (For more information on the PRR process, check here).

For this KEP, we would just need to update the following:

  • Update the latest-milestone to 1.29
  • As this enhancement just changed its milestone to 1.29, there isn't a need for a new PRR unless there are major changes to the KEP questionnaire. It looks like there is some ongoing discussion, so this may require a re-review depending on what changes you intend to make.

The status of this enhancement is marked as at risk for enhancement freeze. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

@Jeffwan
Contributor

Jeffwan commented Sep 22, 2023

@npolshakova Thanks for the update; I will file a PR to make the KEP update. At the same time, I had some discussion with @LingyanYin and built up an issue/bug list for v1.29. @LingyanYin, please help reorganize the contents and publish a list here for the community to discuss and collaborate on.

@LingyanYin

@npolshakova Thanks for the update; I will file a PR to make the KEP update. At the same time, I had some discussion with @LingyanYin and built up an issue/bug list for v1.29. @LingyanYin, please help reorganize the contents and publish a list here for the community to discuss and collaborate on.

sure. Will do

@LingyanYin

I collected the whole list of features & bugs; we should discuss which ones are beta-blockers.
@Jeffwan
In-place VPA issue list as of today:

feature list:

@Karthik-K-N

I collected the whole list of features & bugs; we should discuss which ones are beta-blockers. @Jeffwan In-place VPA issue list as of today:

* [ ]  [pod_status_manager_state: checkpoint is corrupted kubernetes#117589](https://github.com/kubernetes/kubernetes/issues/117589)
  **related  #PR** [Fix: do not assign an empty value to the resource (CPU or memory) if it's not defined in the container kubernetes#117615](https://github.com/kubernetes/kubernetes/pull/117615)

* [ ]  [[FG:InPlacePodVerticalScaling] Pod Resize - long delay in updating `apiPodStatus.Resources` kubernetes#112264](https://github.com/kubernetes/kubernetes/issues/112264)
  **related #PR** [Enhance InPlacePodVerticalScaling performance kubernetes#120432](https://github.com/kubernetes/kubernetes/pull/120432)

* [ ]  [InPlace VPA: wrong CRI updates after lack of resources limits  kubernetes#120709](https://github.com/kubernetes/kubernetes/issues/120709)

* [ ]  [K8s fails to throttle pod so that it follows resized CPU limit kubernetes#118371](https://github.com/kubernetes/kubernetes/issues/118371)

* [ ]  [Container resize policy feature introduced in v1.27 doesn't interrupt CrashLoopBackOff kubernetes#119838](https://github.com/kubernetes/kubernetes/issues/119838)

* [ ]  [In place update trigger container restart when upgrade k8s cluster kubernetes#119187](https://github.com/kubernetes/kubernetes/issues/119187)

* [ ]  [Kubelet failed to start after rebooting all nodes with InPlacePodVerticalScaling feature gate enabled kubernetes#119029](https://github.com/kubernetes/kubernetes/issues/119029)

* [ ]  [warning or decline pod creation with RestartContainer resizePolicy  if the restartPolicy is Never kubernetes#118674](https://github.com/kubernetes/kubernetes/issues/118674)

* [ ]  [[FG:InPlacePodVerticalScaling] Scheduler should use max(Spec.Resources.Requests, Status.Resources.Requests) instead of Status.AllocatedResources kubernetes#117765](https://github.com/kubernetes/kubernetes/issues/117765)

* [ ]  [In place pod resizing should be designed into the kubelet config state loop, not alongside it kubernetes#116971](https://github.com/kubernetes/kubernetes/issues/116971)

* [ ]  [Components in the kubelet are incorrectly pulling state from "desired" instead of the pod worker's "actual" kubernetes#116970](https://github.com/kubernetes/kubernetes/issues/116970)

* [ ]  [Configuring resizePolicy for in-place-pod-vpa has no effect kubernetes#116890](https://github.com/kubernetes/kubernetes/issues/116890)

* [ ]  [In-Place Update of Pod Resources - resizing of pod may race with other pod updates kubernetes#116826](https://github.com/kubernetes/kubernetes/issues/116826)

* [ ]  [Scheduler focused in-place pod resize test is flaky and attempts negative resource values kubernetes#116415](https://github.com/kubernetes/kubernetes/issues/116415)

* [ ]  [Failure cluster [86bd7641...] InPlacePodVerticalScaling: Guaranteed QoS pod, one container - decrease CPU & increase memory kubernetes#116175](https://github.com/kubernetes/kubernetes/issues/116175)

* [ ]  [kubelet's calculation of whether a container has changed can cause cluster-wide outages kubernetes#63814](https://github.com/kubernetes/kubernetes/issues/63814)

feature list:

* [ ]  [[FG:InPlacePodVerticalScaling] Implement version skew handling for in-place pod resize kubernetes#117767](https://github.com/kubernetes/kubernetes/issues/117767)

* [ ]  [Let K8s workload(deployment,sts,etc.) support In-place Pod Vertical Scaling feature kubernetes#116214](https://github.com/kubernetes/kubernetes/issues/116214)

* [ ]  [[FG:InPlacePodVerticalScaling] If pod resize request exceeds node allocatable, fail it in admission handler kubernetes#114203](https://github.com/kubernetes/kubernetes/issues/114203)

* [ ]  [[FG:InPlacePodVerticalScaling] Get MemoryRequest from ContainerStatus for cgroupv2 kubernetes#114202](https://github.com/kubernetes/kubernetes/issues/114202)

* [ ]  [[FG:InPlacePodVerticalScaling] Verify extended resources are correctly reported in ContainerStatus Resources kubernetes#114159](https://github.com/kubernetes/kubernetes/issues/114159)

* [ ]  [[FG:InPlacePodVerticalScaling] Handle pod CPU resize where caller requests CPU value of 1m kubernetes#114123](https://github.com/kubernetes/kubernetes/issues/114123)

* [ ]  [[FG:InPlacePodVerticalScaling] Tracking issue for TODO - future removal of pull-kubernetes-e2e-inplace-pod-resize-containerd-main-v2 job kubernetes#113838](https://github.com/kubernetes/kubernetes/issues/113838)

* [ ]  [[FG:InPlacePodVerticalScaling] Add Pod Resize Node E2E test using framework in test/e2e_node  kubernetes#111877](https://github.com/kubernetes/kubernetes/issues/111877)

* [ ]  [[FG:InPlacePodVerticalScaling] Pod Resize E2E test - add test for scheduler, fix code review items kubernetes#110490](https://github.com/kubernetes/kubernetes/issues/110490)

* [ ]  [[FG:InPlacePodVerticalScaling] Add E2E test case to revert resource resize patch kubernetes#109905](https://github.com/kubernetes/kubernetes/issues/109905)

@LingyanYin, Vinay has listed a few issues here: https://github.com/vinaykul/kubernetes/wiki/In-Place-Pod-Vertical-Scaling-Issues-and-Status. Have you considered them as well?

@npolshakova

Hi @LingyanYin, just checking in once more as we approach the 1.29 enhancement freeze deadline this week on 01:00 UTC, Friday, 6th October, 2023. The status of this enhancement is marked as at risk for enhancement freeze.

Please update the latest-milestone in the KEP yaml to 1.29. If there are major changes to the KEP you will also need a new production readiness review.

Let me know if I missed anything. Thanks!

@Jeffwan
Contributor

Jeffwan commented Oct 3, 2023

#4267 @npolshakova Here's the PR to update the latest-milestone to v1.29. We will spend some time in the near term addressing the above issues as much as possible.

@deads2k
Contributor

deads2k commented Oct 3, 2023

#4267 @npolshakova Here's the PR to update the latest-milestone to v1.29. We will spend some time in the near term addressing the above issues as much as possible.

Is this staying in alpha or moving to beta? To move to beta there are PRR sections missing https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/1287-in-place-update-pod-resources/README.md?plain=1#L988, the stage needs updating, and a PRR update is expected https://github.com/kubernetes/enhancements/blob/master/keps/prod-readiness/sig-node/1287.yaml

@SergeyKanzhelev
Member

/unassign @vinaykul

@SergeyKanzhelev
Member

This is staying in alpha as of the last discussions

@npolshakova

With KEP PR #4267 approved, the enhancement is ready for the enhancements freeze. The status is now marked as tracked for enhancement freeze for 1.29. 🚀 Thank you!

@drewhagen
Member

Hello @vinaykul @Jeffwan 👋, v1.29 Docs Shadow here.
Does this enhancement work planned for v1.29 require any new docs or modification to existing docs?
If so, please follow the steps here to open a PR against the dev-1.29 branch in the k/website repo. This PR can be just a placeholder at this time and must be created before Thursday, 19 October 2023.
Also, take a look at Documenting for a release to familiarize yourself with the docs requirements for the release.
Thank you!

@npolshakova

Hi again @vinaykul @Jeffwan @LingyanYin 👋, 1.29 Enhancements team here! Just checking in as we approach code freeze at 01:00 UTC, Wednesday 1st November 2023.

Here's where this enhancement currently stands:

  • All PRs to the Kubernetes repo that are related to your enhancement are linked in the above issue description (for tracking purposes).
  • All PR/s are ready to be merged (they have approved and lgtm labels applied) by the code freeze deadline. This includes tests.

It looks like https://github.com/vinaykul/kubernetes/wiki/In-Place-Pod-Vertical-Scaling-Issues-and-Status tracks the alpha blocker issues. It looks like these PRs have already merged:

Are there additional code related PRs that need to be merged for 1.29? It looks like kubernetes/kubernetes#112599 is mentioned in the issue description and kubernetes/kubernetes#121218 has been opened recently.

Also, please let me know if there are other PRs in k/k we should be tracking for this KEP.
As always, we are here to help if any questions come up. Thanks!

@vinaykul
Contributor Author

@npolshakova Two related PRs that are low-risk fixes were recently merged and should be tracked for release purposes (I have updated the description above).
kubernetes/kubernetes#119665
kubernetes/kubernetes#118768

PR kubernetes/kubernetes#117615 looks ready to merge with additional review. cc: @mrunalp @Random-Liu

@Jeffwan may have additional PRs on the way. Jiaxin please LMK the key ones that I need to look at and I'll do my best to find the time to review.

@drewhagen
Member

Hi @vinaykul @Jeffwan! The deadline to open a placeholder PR against k/website for required documentation is Thursday, 19 October. Could you please update me on the status of docs for this enhancement? Thank you!

@Barakmor1

Hey @Huang-Wei @SergeyKanzhelev @liggitt

The Pod Scheduling Readiness feature empowers users to implement their custom resource quotas.
In-place-update-pod-resources should align with Pod Scheduling Readiness enabling users to define
and apply their specific resourceQuota implementations.

There is a need to incorporate the ability to add a scaling readiness gate, acting as a finalizer/scheduling gate. This enables users to dynamically remove it using their own controller, ensuring the validity of newly allocated resources.
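
For reference, the Pod Scheduling Readiness mechanism this proposal builds on looks roughly like the sketch below. Only corev1.PodSchedulingGate is an existing API type; the gate name and the idea of using a gate for resize/quota readiness are hypothetical.

package gateexample

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newGatedPod returns a pod that stays unschedulable until a controller
// removes the (hypothetical) resize-quota scheduling gate.
func newGatedPod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "example"},
		Spec: corev1.PodSpec{
			SchedulingGates: []corev1.PodSchedulingGate{
				{Name: "example.com/resize-quota-check"},
			},
			Containers: []corev1.Container{
				{Name: "app", Image: "registry.k8s.io/pause:3.9"},
			},
		},
	}
}

// clearGate is what a custom quota controller would do after validating the
// requested resources; once no gates remain, the scheduler can place the pod.
func clearGate(pod *corev1.Pod) {
	pod.Spec.SchedulingGates = nil
}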

@a-mccarthy

Hi @vinaykul @Jeffwan @LingyanYin, 👋 from the v1.29 Release Team-Communications! We would like to check if you have any plans to publish a blog for this KEP regarding new features, removals, and deprecations for this release.

If so, you need to open a PR placeholder in the website repository.
The deadline will be on Tuesday 14th November 2023 (after the Docs deadline PR ready for review)

Here is the 1.29 calendar

@Jeffwan
Contributor

Jeffwan commented Oct 26, 2023

@drewhagen Yeah, I confirmed with you offline and there won't be a docs change yet. @a-mccarthy Not yet either; the current fixes are not enough to publish a blog yet.

@npolshakova

npolshakova commented Oct 30, 2023

Thanks! With those PRs merged and the issue description updated this is tracked for code freeze for 1.29 🚀

@vinaykul
Contributor Author

vinaykul commented Oct 31, 2023

Please also track PRs:
kubernetes/kubernetes#117615 (merged)
kubernetes/kubernetes#112599 (expecting to merge today 🎉 )
Description has been updated.

@Barakmor1

Barakmor1 commented Nov 1, 2023

Hey @Huang-Wei @SergeyKanzhelev @liggitt

The Pod Scheduling Readiness feature empowers users to implement their custom resource quotas. In-place-update-pod-resources should align with Pod Scheduling Readiness enabling users to define and apply their specific resourceQuota implementations.

There is a need to incorporate the ability to add a scaling readiness gate, acting as a finalizer/scheduling gate. This enables users to dynamically remove it using their own controller, ensuring the validity of newly allocated resources.

I already have opened an issue about this:
#4304
Shouldn't this be part of this issue?
@vinaykul @liggitt @deads2k @Jeffwan

@Jeffwan
Contributor

Jeffwan commented Nov 1, 2023

@npolshakova I would really like kubernetes/kubernetes#120432 to be included in v1.29; it's pending review and needs approval. This resolves a critical performance issue. Can we have an exception for this one?

/cc @vinaykul Can you help take a look at this link?
