
[WIP] Allow Node strategies to run with informers #488

Closed
wants to merge 7 commits

Conversation

@damemi (Contributor) commented Jan 25, 2021

/kind feature

@k8s-ci-robot added the do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.), kind/feature (Categorizes issue or PR as related to a new feature.), cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.), and size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) labels on Jan 25, 2021
@k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: damemi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved (Indicates a PR has been approved by an approver from all required OWNERS files.) label on Jan 25, 2021
@ingvagabund (Contributor) left a comment

As long as an operator is used to deploy both instances of the descheduler (one per mode) and the same config file is used (each mode just filters out what's not relevant to it), this change should be essentially transparent to a user. Without an operator, one will need to maintain both a CronJob and a Deployment.

In any case there will be two descheduler instances running, which might interfere with each other. With strategies no longer running in sequence, we need to revisit the code for potential races. We should also improve some error messages where an eviction fails (e.g. due to a non-existent pod). We might also need a mechanism that makes sure the two mode instances operate over separate sets of pods (e.g. label selector filtering) to minimize interference.

)

// StrategyFunction defines the function signature for each strategy's main function
type StrategyFunction func(
Contributor:

I'd rather keep this type private until we discuss how to refactor the way strategies are initialized.

sharedInformerFactory informers.SharedInformerFactory
stopChannel chan struct{}

f StrategyFunction
Contributor:

s/f/func to avoid a one-letter variable name (at least two letters, to make searching for it easy please :))

c.nodes = nodes
c.queue.Add(workQueueKey)
},
UpdateFunc: func(old, new interface{}) {
Contributor:

Can you add a comment saying DeleteFunc is not needed since pods are automatically evicted (or similar)?

@damemi (Contributor, Author) commented Jan 26, 2021

Without an operator, one will need to maintain both a CronJob and a Deployment. In any case there will be two descheduler instances running, which might interfere with each other.

@ingvagabund this does not require 2 descheduler instances. The informed strategies are spun off into separate, non-blocking goroutines while the main wait loop handles the iterative strategies. This is done with 1 descheduler, run as a deployment with deschedulingInterval.
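
To make that concrete, here is a minimal sketch of the single-process layout described above, assuming hypothetical names (strategyFunc, runDescheduler, and the informed/iterative maps) rather than this PR's actual types:

```go
package descheduler

import (
	"context"
	"log"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// strategyFunc is a hypothetical stand-in for a strategy's entry point.
type strategyFunc func(ctx context.Context)

// runDescheduler launches informed strategies in their own goroutines while
// the main loop keeps running the iterative strategies every interval.
func runDescheduler(ctx context.Context, informed, iterative map[string]strategyFunc, interval time.Duration) {
	// Informed strategies react to Node events on their own; they must not
	// block the periodic loop, so each gets its own goroutine.
	for name, run := range informed {
		go func(name string, run strategyFunc) {
			log.Printf("starting informed strategy %s", name)
			run(ctx)
		}(name, run)
	}

	// Iterative strategies keep the existing behaviour: one sequential pass
	// per deschedulingInterval until the context is cancelled.
	wait.NonSlidingUntil(func() {
		for name, run := range iterative {
			log.Printf("running iterative strategy %s", name)
			run(ctx)
		}
	}, interval, ctx.Done())
}
```

The point is only the shape: event-driven strategies live in goroutines, while the interval loop stays as it is today.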

@ingvagabund (Contributor) commented:

This is done with 1 descheduler, run as a deployment with deschedulingInterval.

I see. So you suggest dropping the cron job and moving back to the original way of deploying the descheduler (or keeping the cron job but providing a Deployment as well as the main manifest). I recall it was more practical to use a CronJob than to have a descheduler instance blocked in time.Sleep for deschedulingInterval. A Deployment makes more sense now.

Also, the Kubelet reports node status every 10 seconds by default, so there are going to be a lot of "empty" iterations. It might make sense to add a check to UpdateFunc for each strategy to compare what actually changed, so a strategy does not have to run every 10 seconds, going through all pods and nodes, just because a node changed e.g. its labels.
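
For illustration, such a per-strategy check could look roughly like the sketch below; the predicate and handler names are hypothetical, and the sketch also carries the comment requested in the inline review above about why DeleteFunc is omitted:

```go
package strategies

import (
	v1 "k8s.io/api/core/v1"
	apiequality "k8s.io/apimachinery/pkg/api/equality"
	"k8s.io/client-go/tools/cache"
)

// taintsChanged is a per-strategy relevance check for a taints-based strategy.
func taintsChanged(oldNode, newNode *v1.Node) bool {
	return !apiequality.Semantic.DeepEqual(oldNode.Spec.Taints, newNode.Spec.Taints)
}

// labelsChanged is a per-strategy relevance check for an affinity-based strategy.
func labelsChanged(oldNode, newNode *v1.Node) bool {
	return !apiequality.Semantic.DeepEqual(oldNode.Labels, newNode.Labels)
}

// filteredNodeEventHandler enqueues work only when the strategy-specific
// predicate says something relevant changed, so the 10s status heartbeats
// are dropped here. DeleteFunc is intentionally omitted: pods on a deleted
// node are evicted by the cluster anyway, so there is nothing to deschedule.
func filteredNodeEventHandler(relevant func(oldNode, newNode *v1.Node) bool, enqueue func()) cache.ResourceEventHandler {
	return cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) { enqueue() },
		UpdateFunc: func(oldObj, newObj interface{}) {
			oldNode, okOld := oldObj.(*v1.Node)
			newNode, okNew := newObj.(*v1.Node)
			if okOld && okNew && relevant(oldNode, newNode) {
				enqueue()
			}
		},
	}
}
```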

@lixiang233 (Contributor) left a comment

It seems every informed strategy has its own StrategyController, so we may run several strategies at the same time, and this may cause data synchronization problems. I think we can keep only one StrategyController and work queue. When we get an event, the EventHandler should decide which strategy (or strategies) should be triggered based on the event's type and add the strategy's name to the work queue; the worker will then process them one by one.
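
A rough sketch of that single-controller idea, with hypothetical names (strategyRunner, enqueueFor, runWorker), just to show one queue keyed by strategy name:

```go
package strategies

import (
	"context"

	"k8s.io/client-go/util/workqueue"
)

// strategyRunner is a hypothetical per-strategy entry point.
type strategyRunner func(ctx context.Context)

// strategyController sketches the single-controller design: one work queue
// keyed by strategy name, shared by all informed strategies.
type strategyController struct {
	queue      workqueue.RateLimitingInterface
	strategies map[string]strategyRunner // registered informed strategies
}

// enqueueFor is called from the informer event handlers; the handler decides
// which strategies care about the event and adds their names to the queue.
func (c *strategyController) enqueueFor(names ...string) {
	for _, name := range names {
		c.queue.Add(name)
	}
}

// runWorker drains the queue one strategy at a time.
func (c *strategyController) runWorker(ctx context.Context) {
	for {
		key, shutdown := c.queue.Get()
		if shutdown {
			return
		}
		if run, ok := c.strategies[key.(string)]; ok {
			run(ctx)
		}
		c.queue.Done(key)
	}
}
```

Because the worker pops one strategy name at a time, event-triggered strategies never run concurrently, which is what avoids the data synchronization problems mentioned above.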

@damemi (Contributor, Author) commented Jan 27, 2021

keeping the cron job but providing a Deployment as well as the main manifest

Yeah, this is my thinking. There are advantages and disadvantages to either way of running the descheduler, so there's no reason not to provide both.

it might make sense to add a check to UpdateFunc for each strategy to compare what actually changed so a strategy does not have to be run every 10 seconds

That is a good idea. This is why I did not make the eventHandler a method of StrategyController: the idea is that each informed strategy may want to implement its own logic for filtering out no-op events (or modifying the event before operating on it).
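
One way to express that ownership, purely as a sketch with hypothetical names, is a small interface that each informed strategy implements:

```go
package strategies

import "k8s.io/client-go/tools/cache"

// informedStrategy sketches the idea that each informed strategy owns its
// event filtering: the controller wires up the informer, the strategy
// decides which events are worth a run. Names here are illustrative only.
type informedStrategy interface {
	// Name identifies the strategy, e.g. as a work queue key.
	Name() string
	// EventHandler returns the strategy's own handler, which can drop
	// no-op events before anything is enqueued.
	EventHandler(enqueue func(strategyName string)) cache.ResourceEventHandler
}
```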

It seems every informed strategy has its own StrategyController, so we may run several strategies at the same time, and this may cause data synchronization problems. I think we can keep only one StrategyController and work queue. When we get an event, the EventHandler should decide which strategy (or strategies) should be triggered based on the event's type and add the strategy's name to the work queue; the worker will then process them one by one.

This is interesting, and sort of goes along with what @ingvagabund suggested above.

I thought it made sense (at least from a code organization standpoint) to have individual strategies responsible for their own EventHandlers, but I see your point about data synchronization. A central event handler "registry" which then processes the informed strategies serially would be safer. I would probably like to add some way to pass the event itself to each strategy, where the strategy's code can decide whether to run or ignore it.

This brings up another data synchronization point: what if an event comes in right around the same time as a deschedulingInterval tick? We shouldn't run the periodic strategies at the same time as the informed strategies.

Maybe this could be solved with some kind of mutex? That way, whichever of the StrategyController or the wait loop is triggered first holds the lock, and we don't hit multithreaded data synchronization issues.
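
A minimal sketch of that mutex idea, assuming hypothetical wrapper functions around the two entry points (not the PR's actual code):

```go
package strategies

import (
	"context"
	"sync"
)

// runLock is taken by whichever side fires first, the deschedulingInterval
// loop or the event-driven StrategyController, so they never run at once.
var runLock sync.Mutex

// runIntervalPass is a hypothetical wrapper around the periodic pass.
func runIntervalPass(ctx context.Context, runAll func(ctx context.Context)) {
	runLock.Lock()
	defer runLock.Unlock()
	runAll(ctx) // existing sequential pass over all strategies
}

// runInformedPass is a hypothetical wrapper used by the controller worker.
func runInformedPass(ctx context.Context, run func(ctx context.Context)) {
	runLock.Lock()
	defer runLock.Unlock()
	run(ctx) // single event-triggered strategy
}
```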

I think this design (of running informed and default strategies in the same descheduler instance) is critical for both usability and further development of reactive descheduling. So I would like to make sure that this can run smoothly.

return c
}

func nodeEventHandler(c *StrategyController) cache.ResourceEventHandler {
Contributor:

Should this event handler be moved to types.go or somewhere else? It looks like a common event handler and will be used by other strategies. If it's a custom event handler, a name like nodeAffinityNodeEventHandler would be better.

Contributor (Author):

It is a common handler, but only between the node strategies (taints+affinity). I actually considered putting these strategies into their own subpackage (like pkg/descheduler/strategies/node/) but that seems a bit overcomplicated.

func nodeEventHandler(c *StrategyController) cache.ResourceEventHandler {
return cache.ResourceEventHandlerFuncs{
AddFunc: func(obj interface{}) {
nodes, err := nodeutil.ReadyNodes(c.ctx, c.client, c.sharedInformerFactory.Core().V1().Nodes(), c.nodeSelector)
Contributor:

If strategies run serially, the time between listing nodes and running a strategy may be long; if another event comes in during this period, we'll list nodes again, but actually we only need to do this once. So, as discussed in #469, listing and filtering nodes could be done in each strategy; we could also customize the nodeSelector for each strategy by doing so.
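
As a sketch of that refactor, each strategy could list and filter its own nodes with its own selector, reusing the ReadyNodes call shown in the diff context above; the import paths and the strategy signature here are assumptions for illustration only:

```go
package strategies

import (
	"context"

	"k8s.io/client-go/informers"
	clientset "k8s.io/client-go/kubernetes"
	"k8s.io/klog/v2"

	// Assumed import path for the descheduler's node utilities.
	nodeutil "sigs.k8s.io/descheduler/pkg/descheduler/node"
)

// runNodeAffinityStrategy sketches per-strategy node listing: the strategy
// owns its nodeSelector and re-lists ready nodes when it actually runs, so
// it never works from a stale list captured at event time.
func runNodeAffinityStrategy(ctx context.Context, client clientset.Interface,
	sharedInformerFactory informers.SharedInformerFactory, nodeSelector string) {
	nodes, err := nodeutil.ReadyNodes(ctx, client, sharedInformerFactory.Core().V1().Nodes(), nodeSelector)
	if err != nil {
		klog.ErrorS(err, "failed to list ready nodes")
		return
	}
	// ... evaluate node affinity against the freshly listed nodes ...
	_ = nodes
}
```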

Contributor (Author):

if another event comes in during this period, we'll list nodes again, but actually we only need to do this once

I'm not sure, because the point of reacting to Node events as they happen is to get a real-time updated view of the cluster to operate on. So each event should trigger a re-list so the strategy doesn't have old data.

I do agree, though, with refactoring this a bit as was previously mentioned.

Contributor:

All the "reactive" strategies listening to node informers need to take into account strategies like LowNodeUtilization. LowNodeUtilization needs to take the entire cluster (or a reasonable subset of it) into account. So if the strategies run in the wrong order, e.g. LowNodeUtilization followed by PodLifeTime removing many old pods, the earlier LowNodeUtilization run might effectively be nullified and the overall resource consumption made worse. So far, the order of strategies has been more or less hardcoded (depending on the map iterator). We might compute some static impact/score of each strategy on overall resource utilization and run the strategies in some "practical" order, starting with the strategies that change utilization the most and ending with those that change it the least (just a thought, it might not be possible).
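
If such a static ordering were ever tried, it could be as simple as sorting the strategies by an assigned score before each pass; the type and the notion of a score below are purely illustrative:

```go
package strategies

import "sort"

// scoredStrategy pairs a strategy name with a hypothetical static
// "utilization impact" score.
type scoredStrategy struct {
	name  string
	score int // higher = changes cluster utilization more
}

// orderByImpact returns strategy names sorted from highest to lowest
// impact, so e.g. PodLifeTime could run before LowNodeUtilization.
func orderByImpact(strategies []scoredStrategy) []string {
	sort.Slice(strategies, func(i, j int) bool {
		return strategies[i].score > strategies[j].score
	})
	names := make([]string, 0, len(strategies))
	for _, s := range strategies {
		names = append(names, s.name)
	}
	return names
}
```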

Contributor (Author):

I doubt that LowNodeUtilization can easily be converted to be reactive. That is the point of the design I proposed here: we can begin to convert some of the strategies while keeping the hardcoded order of the others.

I think @lixiang233's idea of a single strategy controller with a master registry of event handlers would solve any ordering issues. That, combined with a mutex, ensures we aren't running two strategies at the same time.

For example, with an interval of 60 minutes we could have:

0 min   - interval run of strategies
12 min  - StrategyController runs NodeAffinity
60 min  - interval run
119 min - StrategyController run (takes the mutex lock)
121 min - interval run (was blocked waiting for the mutex)
181 min - interval run

And since each periodic interval does a node re-list, those runs are working with an up-to-date list each time, and any results from NodeAffinity/NodeTaints won't interfere. So really, we are just triggering a descheduling run at dynamic intervals along with the hardcoded/cron intervals.

@seanmalloy (Member) commented:

/cc

@k8s-ci-robot added the needs-rebase (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) label on Apr 6, 2021
@k8s-ci-robot (Contributor) commented:

@damemi: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@denkensk (Member) commented Jun 1, 2021

/cc

@k8s-triage-robot commented:

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale (Denotes an issue or PR has remained open with no activity and has become stale.) label on Jan 10, 2022
@k8s-ci-robot (Contributor) commented:

@JaneLiuL: GitHub didn't allow me to request PR reviews from the following users: JaneLiuL.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@JaneLiuL (Member) commented:

@damemi very good PR! Very happy to see that the descheduler supports both Default and Informer modes.
I can help split this into several PRs if you don't mind :)

@Dentrax (Contributor) commented Jan 20, 2022

Hey @damemi, according to issue #696, we can pick up this PR if you show us what kind of things need to be done. 🙏 We can either work on this PR or create a new carry PR by cherry-picking your commits. Can you please enlighten us on the next steps?

cc @developer-guy @eminaktas @yasintahaerol @necatican @f9n

@k8s-ci-robot removed the cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.) label on Jan 20, 2022
@k8s-ci-robot (Contributor) commented:

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot added the cncf-cla: no (Indicates the PR's author has not signed the CNCF CLA.) label on Jan 20, 2022
@k8s-ci-robot (Contributor) commented:

@damemi: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-descheduler-test-e2e-k8s-master-1-21 | f01c473 | link | | /test pull-descheduler-test-e2e-k8s-master-1-21 |
| pull-descheduler-helm-test | f01c473 | link | | /test pull-descheduler-helm-test |
| pull-descheduler-test-e2e-k8s-master-1-22 | f01c473 | link | | /test pull-descheduler-test-e2e-k8s-master-1-22 |
| pull-descheduler-test-e2e-k8s-1-21-1-21 | f01c473 | link | | /test pull-descheduler-test-e2e-k8s-1-21-1-21 |
| pull-descheduler-unit-test-master-master | f01c473 | link | true | /test pull-descheduler-unit-test-master-master |
| pull-descheduler-test-e2e-k8s-master-1-23 | f01c473 | link | true | /test pull-descheduler-test-e2e-k8s-master-1-23 |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-triage-robot commented:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten (Denotes an issue or PR that has aged beyond stale and will be auto-closed.) label and removed the lifecycle/stale (Denotes an issue or PR has remained open with no activity and has become stale.) label on Mar 5, 2022
@k8s-triage-robot commented:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot (Contributor) commented:

@k8s-triage-robot: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
  • approved (Indicates a PR has been approved by an approver from all required OWNERS files.)
  • cncf-cla: no (Indicates the PR's author has not signed the CNCF CLA.)
  • do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.)
  • kind/feature (Categorizes issue or PR as related to a new feature.)
  • lifecycle/rotten (Denotes an issue or PR that has aged beyond stale and will be auto-closed.)
  • needs-rebase (Indicates a PR cannot be merged because it has merge conflicts with HEAD.)
  • size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.)

9 participants