
feat: check only controller ref to decide if a pod is replicated #5507

Conversation

vadasambar
Member

@vadasambar vadasambar commented Feb 14, 2023

This PR is a follow-up to #5419 (comment)

This PR is still WIP. I will mention the reviewers here once I am done. Although it might not be in the best shape as a WIP PR, if the reviewers have any feedback, I would love to have it. 🙏

Signed-off-by: vadasambar surajrbanakar@gmail.com

What type of PR is this?

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #5387

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Added: New flag `--allow-scale-down-on-custom-controller-owned-pods`. If this flag is set to true, cluster-autoscaler doesn't block node scale-down if a pod owned by a custom controller is running on the node.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

TBD


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 14, 2023
@k8s-ci-robot k8s-ci-robot added area/cluster-autoscaler size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 14, 2023
// Using names like FooController is discouraged
// https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#naming-conventions
// vadasambar: I am using it here just because `FooController` is easier to understand than, say, `FooSet`
OwnerReferences: GenerateOwnerReferences("Foo", "FooController", "apps/v1", ""),
Member Author

@vadasambar vadasambar Feb 21, 2023

Any other suggestions in place of FooController are welcome.
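For context, a minimal sketch of what a helper along these lines might look like (the real GenerateOwnerReferences in the test utilities may differ; the values are illustrative):

package test

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

// GenerateOwnerReferences builds a single controller owner reference for a test pod.
// Sketch only; not necessarily the exact helper used by cluster-autoscaler's tests.
func GenerateOwnerReferences(name, kind, api string, uid types.UID) []metav1.OwnerReference {
	isController := true
	return []metav1.OwnerReference{{
		Name:       name, // e.g. "Foo"
		Kind:       kind, // e.g. "FooController" (a made-up custom controller kind)
		APIVersion: api,  // e.g. "apps/v1"
		UID:        uid,
		Controller: &isController, // the controller ref check keys off this field
	}}
}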

@x13n
Member

x13n commented Feb 28, 2023

/assign

I would preserve the existing behavior for known controllers - perhaps behind a flag - so that there's no regression. We can eventually delete the flag after a few releases, but I'd like to avoid breaking existing users.

@vadasambar
Member Author

vadasambar commented Mar 1, 2023

/assign

I would preserve the existing behavior for known controllers - perhaps behind a flag - so that there's no regression. We can eventually delete the flag after a few releases, but I'd like to avoid breaking existing users.

Makes sense to me (don't want to surprise users with a new default behavior). Let me think about this.
Thank you for the feedback (I always get to learn something because of your feedback).

@vadasambar
Member Author

I had a few hiccups trying to get a custom CA running properly on GKE (all good now). I want to test this PR manually on a GKE cluster. The rough idea is something like this:

  • add a debug log to show that a pod with an owner reference was skipped during scale-down (see the sketch after this list)
  • create a custom CA in a GKE cluster with the PR image tag and the --v flag set to a log level high enough to show the owner-reference debug log
  • schedule a workload with N replicas that needs a new node, e.g., set pod anti-affinity so that N-1 nodes scale out to N nodes
  • run a pod with an owner reference
  • delete the N-replica workload
  • tail the cluster-autoscaler logs to make sure it skips the pod with the owner reference
  • if I see the owner-reference log, screenshot it and paste it in the PR
  • change the PR from draft to ready for review
  • ask for review
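A minimal sketch of the kind of debug log mentioned in the first step (assuming klog, which cluster-autoscaler uses; the message, verbosity level, and surrounding check are illustrative, not the PR's exact code):

// Illustrative only: log when a pod is not treated as blocking for scale-down
// because it has a controller owner reference.
if ref := ControllerRef(pod); ref != nil {
	klog.V(5).Infof("pod %s/%s is controlled by %s %s; not blocking scale-down",
		pod.Namespace, pod.Name, ref.Kind, ref.Name)
}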

@x13n
Member

x13n commented Mar 3, 2023

Rather than on logs, I'd just rely on behavior. In this case you could do something like this:

  • Scale cluster to N nodes using N replica workload you described.
  • Put static pods on N-1 nodes and one custom-controlled pod on the last node.
  • Delete the workload.
  • Observe whether the node with the custom-controlled pod gets scaled down.

Note: N=2 might be problematic due to system workloads, but N=3 should work. Alternatively, you can use a separate nodepool to put all non-DS system workloads there using taints/tolerations, so that they don't interfere with the test.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 6, 2023
maxFreeDifferenceRatio = flag.Float64("max-free-difference-ratio", config.DefaultMaxFreeDifferenceRatio, "Maximum difference in free resources between two similar node groups to be considered for balancing. Value is a ratio of the smaller node group's free resource.")
maxAllocatableDifferenceRatio = flag.Float64("max-allocatable-difference-ratio", config.DefaultMaxAllocatableDifferenceRatio, "Maximum difference in allocatable resources between two similar node groups to be considered for balancing. Value is a ratio of the smaller node group's allocatable resource.")
forceDaemonSets = flag.Bool("force-ds", false, "Blocks scale-up of node groups too small for all suitable Daemon Sets pods.")
allowScaleDownOnCustomControllerOwnedPods = flag.Bool("allow-scale-down-on-custom-controller-owned-pods", false, "Don't block node scale-down if a pod owned by a custom controller is running on the node.")
Member Author

Better naming suggestions are welcome (the flag name feels a little too long).

Member Author

The default is set to false right now to preserve backwards compatibility. This flag would be set to true in the future (we might remove it completely and make true the default behavior).

Contributor

@vadasambar Just an idea, maybe you could follow the skip-nodes-with-* scheme, for instance,
skip-nodes-with-custom-controller-pods?

Member Author

@gregth thank you for the suggestion. I thought the new flag name skip-nodes-with-custom-controller-pods wouldn't tell the user whether the skipping happens during scale-up or scale-down, but it looks like we already use the skip-nodes-with-* naming for skipping things when scaling down nodes:

skipNodesWithSystemPods = flag.Bool("skip-nodes-with-system-pods", true, "If true cluster autoscaler will never delete nodes with pods from kube-system (except for DaemonSet or mirror pods)")
skipNodesWithLocalStorage = flag.Bool("skip-nodes-with-local-storage", true, "If true cluster autoscaler will never delete nodes with pods with local storage, e.g. EmptyDir or HostPath")

Making the naming consistent makes sense to me. I will change it to skip-nodes-with-custom-controller-pods. Thank you.

P.S.: If you have any other suggestions, I would love to know.

} else {
replicated = true
} else {
checkReferences := listers != nil
Member Author

The code in the else part is just a copy-paste of our current code. I have moved a couple of variable declarations into the else part so that it's easier to remove in the future.
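Roughly, the shape of the change under discussion (a sketch, not the verbatim diff; the flag name matches what it was called at this point in the review):

if allowScaleDownOnCustomControllerOwnedPods {
	// New behavior: any pod with a controller owner reference counts as replicated,
	// regardless of whether CA knows the controller kind.
	if ControllerRef(pod) != nil {
		replicated = true
	}
} else {
	// Legacy behavior: the existing per-controller checks, with declarations such as
	// `checkReferences := listers != nil` moved inside this branch so the whole block
	// is easy to delete in one piece later.
}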

@k8s-ci-robot k8s-ci-robot added the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Mar 6, 2023
@vadasambar
Member Author

vadasambar commented Mar 8, 2023

Keywords which can automatically close issues and at(@) or hashtag(#) mentions are not allowed in commit messages.

The list of commits with invalid commit messages:

* [18902e8](https://github.com/kubernetes/autoscaler/commits/18902e8fd48a0f789608b69f42d25d02ab140406) fix: remove `@` in `@vadasambar`

📝

P.S.: Fixed

@vadasambar vadasambar force-pushed the feature/5387/allow-scale-down-with-custom-controller-pods-2 branch from b718749 to 13921ae Compare March 8, 2023 05:27
@k8s-ci-robot k8s-ci-robot removed do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 8, 2023
@vadasambar
Member Author

vadasambar commented Mar 9, 2023

Testing

Created a new nodepool in GKE with taints (pool-1)
[screenshot]

Set the flag to true:
[screenshot]

Scaled up the pool-1 nodepool using the following workload:

# +kubectl
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-scale-up-with-pod-anti-affinity
  namespace: default
spec:
  selector:
    matchLabels:
      app: node-scale-up-with-pod-anti-affinity
  replicas: 1
  template:
    metadata:
      labels:
        app: node-scale-up-with-pod-anti-affinity
    spec:
      nodeSelector:
        cloud.google.com/gke-nodepool: pool-1
      tolerations:
        - effect: NoSchedule
          key: test
          operator: Equal
          value: "true"
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - node-scale-up-with-pod-anti-affinity
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: node-scale-up-with-pod-anti-affinity
        image: registry.k8s.io/pause:2.0

Deployed the test pod with a custom controller owner reference:

apiVersion: v1
kind: Pod
metadata:
  name: custom-controller-owned-pod
  ownerReferences:
  - apiVersion: foos
    kind: FooSet
    controller: true
    name: custom-controller-owned-pod-56fdfd787b
    uid: 1c6544a7-12e7-426c-bd2d-7ac858d18d7d 
spec:
  nodeName: gke-cluster-1-pool-1-b66d130e-9mrw # name of the node that was created to accommodate the scale-up workload
  tolerations:
    - effect: NoSchedule 
      key: test  # pool-1 nodepool is tainted with this taint
      operator: Equal
      value: "true"
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent

Removed the scale-up workload to see how CA behaves. Sure enough, it removes the pod with the custom controller owner reference:
[screenshot]

You can see some debug logs like `inside new loop` and `allowScaleDownOnCustomController...`. These were added to debug a problem where CA didn't skip the custom-controller-owned pod. It turns out I had forgotten to add `controller: true` in the owner reference.
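For anyone hitting the same issue: the controller ref check only sees owner references with controller: true. Assuming the drain code resolves the controller owner the usual apimachinery way (something like metav1.GetControllerOf), a minimal illustration:

package main

import (
	"fmt"

	apiv1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	isController := true
	pod := &apiv1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name: "custom-controller-owned-pod",
			OwnerReferences: []metav1.OwnerReference{{
				APIVersion: "foos",
				Kind:       "FooSet",
				Name:       "custom-controller-owned-pod-56fdfd787b",
				Controller: &isController, // without this, GetControllerOf returns nil
			}},
		},
	}
	// GetControllerOf returns the owner reference whose Controller field is true, or nil.
	fmt.Println(metav1.GetControllerOf(pod) != nil) // true; false if Controller is left unset
}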

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 11, 2023
@vadasambar vadasambar force-pushed the feature/5387/allow-scale-down-with-custom-controller-pods-2 branch from 3d24178 to 88c4a2d Compare March 13, 2023 04:24
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 13, 2023
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 15, 2023
maxNodeGroupBinpackingDuration = flag.Duration("max-nodegroup-binpacking-duration", 10*time.Second, "Maximum time that will be spent in binpacking simulation for each NodeGroup.")
skipNodesWithSystemPods = flag.Bool("skip-nodes-with-system-pods", true, "If true cluster autoscaler will never delete nodes with pods from kube-system (except for DaemonSet or mirror pods)")
skipNodesWithLocalStorage = flag.Bool("skip-nodes-with-local-storage", true, "If true cluster autoscaler will never delete nodes with pods with local storage, e.g. EmptyDir or HostPath")
scaleDownNodesWithCustomControllerPods = flag.Bool("scale-down-nodes-with-custom-controller-pods", false, "If true cluster autoscaler will delete nodes with pods owned by custom controllers")
Member Author

This is the new flag. It looks like adding it changed the whitespace alignment in every flag definition.

NodeGroupBackoffResetTimeout: *nodeGroupBackoffResetTimeout,
MaxScaleDownParallelism: *maxScaleDownParallelismFlag,
MaxDrainParallelism: *maxDrainParallelismFlag,
GceExpanderEphemeralStorageSupport: *gceExpanderEphemeralStorageSupport,
Member Author

Same here. Extra space was added because I added ScaleDownNodesWithCustomControllerPods on line 332 below.

@@ -231,6 +158,104 @@ func GetPodsForDeletionOnNodeDrain(
return pods, daemonSetPods, nil, nil
}

func legacyCheckForReplicatedPods(listers kube_util.ListerRegistry, pod *apiv1.Pod, minReplica int32) (replicated bool, isDaemonSetPod bool, blockingPod *BlockingPod, err error) {
Member Author

The name legacyCheckForReplicatedPods doesn't account for checking for daemon set pods or blocking pods, but legacyCheckForReplicatedAndDaemonSetAndBlockingPods feels too long, so I have kept it as is for now.

Member Author

@vadasambar vadasambar Mar 15, 2023

Note that the return values (replicated bool, isDaemonSetPod bool, blockingPod *BlockingPod, err error) are named. I did this because I think it improves readability when you hover over the function call in an IDE, e.g.,
[screenshot]

This might not be the best approach since it can confuse someone looking at lines 162 and 165 below (the variables are already defined because they are named return values). Open to other ideas here.
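A minimal, generic illustration of the named-returns point (not the actual drain code):

package main

import "fmt"

// With named return values, hovering over checkPod in an IDE shows what each
// returned bool means instead of just (bool, bool, error).
func checkPod(hasControllerRef, isMirror bool) (replicated bool, isDaemonSetPod bool, err error) {
	replicated = hasControllerRef
	isDaemonSetPod = false
	if isMirror {
		err = fmt.Errorf("mirror pods are handled separately")
	}
	// The named values are already declared, so a bare `return` would also work here,
	// which is the potentially confusing part mentioned above.
	return replicated, isDaemonSetPod, err
}

func main() {
	r, d, err := checkPod(true, false)
	fmt.Println(r, d, err) // true false <nil>
}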

checkReferences := listers != nil
isDaemonSetPod = false

controllerRef := ControllerRef(pod)
Member Author

@vadasambar vadasambar Mar 15, 2023

This is the same code as our current code here, abstracted into a function. I have made some minor changes:

  1. Change the return values to include replicated and isDaemonSetPod. Also, remove []*apiv1.Pod since we don't need to return it from this function; returning the pod list is done in the else part here.
  2. Don't append to the daemonSetPods slice here. Instead, just return isDaemonSetPod and let the calling function do the appending (sketched below), because I thought it was cleaner and we need the appending even if we remove this function in the future.
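So the call site ends up looking roughly like this (a sketch of the shape described in the two points above, not the verbatim diff):

replicated, isDaemonSetPod, blockingPod, err := legacyCheckForReplicatedPods(listers, pod, minReplica)
if err != nil {
	return []*apiv1.Pod{}, []*apiv1.Pod{}, blockingPod, err
}
if isDaemonSetPod {
	// the caller, not legacyCheckForReplicatedPods, appends to daemonSetPods
	daemonSetPods = append(daemonSetPods, pod)
}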

for _, test := range tests {
// run all tests for scaleDownNodesWithCustomControllerPods=false
// TODO(vadasambar): remove this when we get rid of scaleDownNodesWithCustomControllerPods
for _, test := range append(tests, testOpts{
Member Author

I am doing the append in the for loop declaration itself. This might not be the best approach. I didn't want to create two variables called testsWithScaleDownNodesWithCustomControllerPodsDisabled and testsWithScaleDownNodesWithCustomControllerPodsEnabled. Any better suggestions are welcome.

Member

What is wrong with being explicit? I'd do something along the lines of:

  1. Make custom controller pods flag a part of the test case struct.
  2. Define a list of shared test cases.
  3. Define a list of flag-disabled test cases (copy the shared ones and add the extra one)
  4. Define a list of flag-enabled test cases (copy the shared ones, flip the flag on all and add the extra one)
  5. Define the body of the test once and run it for a combined list (flag-enabled + flag-disabled).

WDYT?

Member

Alternatively, just copy all test cases and have a single list in the first place :)

Member Author

Thank you for the suggestion. I like the idea of putting the flag on the test struct and combining all tests into one. Let me try it out.


}

// run all tests for scaleDownNodesWithCustomControllerPods=true
for _, test := range append(tests, testOpts{
Member Author

@vadasambar vadasambar Mar 15, 2023

Note that I am running the same tests twice. The code here is practically the same as the code beyond the line above. The only difference is that scaleDownNodesWithCustomControllerPods is set to false here and set to true in the test cases beyond the above line.

I did think of abstracting the duplicate code into a separate function, but I am not so sure after thinking about the comment here, since abstracting the tests into a function would add another level of indirection. I am a little on the fence about doing that now. Any suggestions here are welcome.

@vadasambar vadasambar requested review from x13n and removed request for gregth March 15, 2023 06:26
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 20, 2023
SkipNodesWithSystemPods: true,
SkipNodesWithLocalStorage: true,
MinReplicaCount: 0,
SkipNodesWithCustomControllerPods: true,
Member Author

I have set SkipNodesWithCustomControllerPods: true to preserve the current behavior of these tests.

if err != nil {
return []*apiv1.Pod{}, []*apiv1.Pod{}, blockingPod, err
}
} else {
if controllerRef != nil {
Member

Looks like controllerRef is not used anywhere else, so maybe just directly call ControllerRef() here?

Member Author

Updated. Thank you for pointing this out.

if len(pods) != len(test.expectPods) {
t.Fatalf("Wrong pod list content: %v", test.description)
}
for i := range tests {
Member

My suggestion was to duplicate the test cases and flip the flag on the copy to ensure the new flag doesn't affect the vast majority of test cases.

Member Author

@vadasambar vadasambar Mar 20, 2023

I think there's another problem with the code: the test cases for flag=false are missing.

My suggestion was to duplicate the test cases and flip the flag on the copy to ensure the new flag doesn't affect the vast majority of test cases.

I see. Just to confirm my understanding: you mean something like this, right?

tests := []testOpts{
	// *flag: true test cases*
	{
		// shared test case 1
		flag: true,
	},
	{
		// shared test case 2
		flag: true,
	},
	{
		// additional test case 1
		flag: true,
	},
	// *flag: false test cases*
	{
		// shared test case 1
		flag: false,
	},
	{
		// shared test case 2
		flag: false,
	},
	{
		// additional test case 1
		flag: false,
	},
}

for _, test := range tests {
	// execute test case
}

shared test case 1, shared test case 2 and additional test case 1 are duplicated with some minor tweaks (if required) and flipped flags.

Member Author

If my understanding is correct, one problem I see is that we would have to change the names of the duplicated test cases slightly to differentiate them from the original test cases, so that it's easy to identify which test case failed.

Member Author

Which I think should be fine, since it would be a one-time thing for now.

But if we have a similar flag in the future, say --skip-nodes-with-ignore-local-vol-storage-annotation, we would need 4 sets of test cases ([current-flag=true, future-flag=true], [true, false], [false, false], [false, true]) which are almost identical, with slight differences. This will get difficult to maintain as we go forward. One argument here is that we can deal with it once we get to it.

Member Author

I think we might need a more future-proof solution here but it would also increase the code complexity.

Member Author

Maybe I can duplicate the test cases for now, and there can be another issue to work through a better solution for the test-case duplication/maintainability problem.

Member Author

@vadasambar vadasambar Mar 21, 2023

we would have to change name of the duplicated test cases

This can be solved by adding the flag to the error log output here so that the error output includes the flag value. Something like this:

drain_test.go:955: Custom-controller-managed non-blocking pod: unexpected non-error, skipNodesWithCustomControllerPods: false
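i.e. something along these lines in the failure path (hypothetical sketch; the actual assertion and field names in drain_test.go may differ):

// Hypothetical: include the flag value in the failure message so the flag=true
// and flag=false runs of the same shared test case can be told apart.
if err == nil {
	t.Fatalf("%s: unexpected non-error, skipNodesWithCustomControllerPods: %v",
		test.description, test.skipNodesWithCustomControllerPods)
}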

Member Author

Not sure if this is what you wanted to tell me. This could be a possible solution:

sharedTests := []testOpts{
	{
		// shared test case 1
	},
	{
		// shared test case 2
	},
}

allTests := []testOpts{}
for _, sharedTest := range sharedTests {
	sharedTest.skipNodesWithCustomControllerPods = true
	allTests = append(allTests, sharedTest)
	sharedTest.skipNodesWithCustomControllerPods = false
	allTests = append(allTests, sharedTest)
}

allTests = append(allTests, testOpts{
	// additional test case 1
	skipNodesWithCustomControllerPods: true,
})

allTests = append(allTests, testOpts{
	// additional test case 1
	skipNodesWithCustomControllerPods: false,
})

// if a similar flag is added in the future, the same pattern repeats
for _, sharedTest := range sharedTests {
	sharedTest.newFlagInFuture = true
	allTests = append(allTests, sharedTest)
	sharedTest.newFlagInFuture = false
	allTests = append(allTests, sharedTest)
}

allTests = append(allTests, testOpts{
	// additional test case 2 for new flag in future
	newFlagInFuture: false,
})

allTests = append(allTests, testOpts{
	// additional test case 2 for new flag in future
	newFlagInFuture: true,
})

for _, test := range allTests {
	// execute test case
}

Member Author

I have updated this PR with a reference implementation based on the above idea. Happy to change it based on your review comments.

Member

LGTM, thanks!

for _, sharedTest := range sharedTests {
// to execute the same shared tests for when the skipNodesWithCustomControllerPods flag is true
// and when the flag is false
sharedTest.skipNodesWithCustomControllerPods = true
Member

nit: You could update sharedTest.description to append a flag here, so it only affects the shared tests.

Member Author

Updated. 👍

@x13n
Member

x13n commented Mar 21, 2023

Thanks for the changes! The code looks good to me now, can you just squash the commits before merging?

// make sure you shallow copy the test like this
// before you modify it
// (so that modifying one test doesn't affect another)
enabledTest := sharedTest
Member Author

@vadasambar vadasambar Mar 22, 2023

Note that I am creating a shallow copy of sharedTest so that modifying it for the first append doesn't affect the second one. For example, without the copy:

for _, sharedTest := range sharedTests {
	sharedTest.skipNodesWithCustomControllerPods = true
	sharedTest.description = fmt.Sprintf("%s with skipNodesWithCustomControllerPods: %v", 
		sharedTest.description, sharedTest.skipNodesWithCustomControllerPods)
	allTests = append(allTests, sharedTest)

	sharedTest.skipNodesWithCustomControllerPods = false
	sharedTest.description = fmt.Sprintf("%s with skipNodesWithCustomControllerPods: %v", 
		sharedTest.description, sharedTest.skipNodesWithCustomControllerPods)
	allTests = append(allTests, sharedTest)
}

fmt.Println("allTests[0]", allTests[0])
fmt.Println("allTests[1]", allTests[1])

Prints

allTests[0] {RC-managed pod with skipNodesWithCustomControllerPods:true [0xc00018ad80] [] [0xc000498c60] [] false [0xc00018ad80] [] <nil> true}
allTests[1] {RC-managed pod with skipNodesWithCustomControllerPods:true with skipNodesWithCustomControllerPods:false [0xc00018ad80] [] [0xc000498c60] [] false [0xc00018ad80] [] <nil> false}

Note the doubled-up description and the last bool (which is our flag).

To overcome this, I use a shallow copy.
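A sketch of the loop with the shallow copy applied (same idea as the snippet above, just copying before mutating):

for _, sharedTest := range sharedTests {
	// shallow copy, so mutating one variant doesn't leak into the other
	enabledTest := sharedTest
	enabledTest.skipNodesWithCustomControllerPods = true
	enabledTest.description = fmt.Sprintf("%s with skipNodesWithCustomControllerPods: %v",
		enabledTest.description, enabledTest.skipNodesWithCustomControllerPods)
	allTests = append(allTests, enabledTest)

	disabledTest := sharedTest
	disabledTest.skipNodesWithCustomControllerPods = false
	disabledTest.description = fmt.Sprintf("%s with skipNodesWithCustomControllerPods: %v",
		disabledTest.description, disabledTest.skipNodesWithCustomControllerPods)
	allTests = append(allTests, disabledTest)
}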

Member Author

Output when we use the shallow copy:

allTests[0] {RC-managed pod with skipNodesWithCustomControllerPods:true [0xc000202d80] [] [0xc0004b54a0] [] false [0xc000202d80] [] <nil> true}
allTests[1] {RC-managed pod with skipNodesWithCustomControllerPods:false [0xc000202d80] [] [0xc0004b54a0] [] false [0xc000202d80] [] <nil> false}

Signed-off-by: vadasambar <surajrbanakar@gmail.com>
(cherry picked from commit 144a64a)

fix: set `replicated` to true if controller ref is set to `true`
- forgot to add this in the last commit

Signed-off-by: vadasambar <surajrbanakar@gmail.com>
(cherry picked from commit f8f4582)

fix: remove `checkReferences`
- not needed anymore
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

(cherry picked from commit 5df6e31)

test(drain): add test for custom controller pod
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

feat: add flag to allow scale down on custom controller pods
- set to `false` by default
- `false` will be set to `true` by default in the future
- right now, we want to ensure backwards compatibility and make the feature available if the flag is explicitly set to `true`
- TODO: this code might need some unit tests. Look into adding unit tests.
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: remove `at` symbol in prefix of `vadasambar`
- to keep it consistent with previous such mentions in the code
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test(utils): run all drain tests twice
- once for  `allowScaleDownOnCustomControllerOwnedPods=false`
- and once for `allowScaleDownOnCustomControllerOwnedPods=true`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs(utils): add description for `testOpts` struct
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: update FAQ with info about `allow-scale-down-on-custom-controller-owned-pods` flag
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: rename `allow-scale-down-on-custom-controller-owned-pods` -> `skip-nodes-with-custom-controller-pods`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: rename `allowScaleDownOnCustomControllerOwnedPods` -> `skipNodesWithCustomControllerPods`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

test(utils/drain): fix failing tests
- refactor code to add custom controller pod test
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: fix long code comments
- clean-up print statements
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: move `expectFatal` right above where it is used
- makes the code easier to read
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: fix code comment wording
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: address PR comments
- abstract legacy code to check for replicated pods into a separate function so that it's easier to remove in the future
- fix param info in the FAQ.md
- simplify tests and remove the global variable used in the tests
- rename `--skip-nodes-with-custom-controller-pods` -> `--scale-down-nodes-with-custom-controller-pods`
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: rename flag `--scale-down-nodes-with-custom-controller-pods` -> `--skip-nodes-with-custom-controller-pods`
- refactor tests
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

docs: update flag info
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

fix: forgot to change flag name on a line in the code
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: use `ControllerRef()` directly instead of `controllerRef`
- we don't need an extra variable
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: create tests consolidated test cases
- from looping over and tweaking shared test cases
- so that we don't have to duplicate shared test cases
Signed-off-by: vadasambar <surajrbanakar@gmail.com>

refactor: append test flag to shared test description
- so that the failed test is easy to identify
- shallow copy tests and add comments so that others do the same
Signed-off-by: vadasambar <surajrbanakar@gmail.com>
@vadasambar vadasambar force-pushed the feature/5387/allow-scale-down-with-custom-controller-pods-2 branch from dee5eae to ff6fe58 Compare March 22, 2023 05:21
@vadasambar
Member Author

Thanks for the changes! The code looks good to me now, can you just squash the commits before merging?

Thank you for bearing with me. I have squashed the commits and added a comment.

@vadasambar
Member Author

@x13n unless you have any other comments, I think I'm done from my side.

@x13n
Member

x13n commented Mar 24, 2023

Great, thank you for the changes!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 24, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vadasambar, x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 24, 2023
@k8s-ci-robot k8s-ci-robot merged commit b8ba233 into kubernetes:master Mar 24, 2023