scheduler: framework: initialize indexers in scheduler core with non-nil map #110663

Merged
merged 1 commit into from Jul 19, 2022

Conversation

fromanirh
Contributor

@fromanirh fromanirh commented Jun 20, 2022

What type of PR is this?

/kind bug

What this PR does / why we need it:

Allow scheduler plugins to add their indexers, if they want to.

Which issue(s) this PR fixes:

Fixes #110660

Special notes for your reviewer:

Not sure it was intentional for the scheduler framework to prevent plugins from adding indexers. Please check the linked issue for more context.

Does this PR introduce a user-facing change?

For scheduler plugin developers: the scheduler framework's shared PodInformer is now initialized with empty indexers. This enables scheduler plugins to add their extra indexers. Note that only non-conflict indexers are allowed to be added.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 20, 2022
@k8s-ci-robot
Contributor

k8s-ci-robot commented Jun 20, 2022

@fromanirh: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Jun 20, 2022
@fromanirh
Contributor Author

fromanirh commented Jun 20, 2022

/sig scheduling

@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 20, 2022
@k8s-ci-robot k8s-ci-robot requested review from damemi and Huang-Wei Jun 20, 2022
Contributor

@damemi damemi left a comment

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Jun 21, 2022
@ahg-g
Member

ahg-g commented Jun 21, 2022

thanks, I think it is reasonable to allow adding indexers. Can a test be added to verify that this actually fixes it?

@fromanirh
Contributor Author

fromanirh commented Jun 21, 2022

thanks, I think it is reasonable to allow adding indexers. Can a test be added to verify that this actually fixes it?

thanks for the comment. Good point, I'll think about a suitable test.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 21, 2022
@Huang-Wei
Member

Huang-Wei commented Jun 21, 2022

@fromanirh Thanks for bringing this up.

Not sure it was intentional for the scheduler framework to prevent plugins from adding indexers. Please check the linked issue for more context.

No particular reason. Historically, the core scheduler doesn't leverage the indexer so it's never initialized.

Please note that we need to add a caveat in the code, as well as in this PR, stating that registering indexers under a duplicate key (one that is already registered) causes a conflict error:

func (c *threadSafeMap) AddIndexers(newIndexers Indexers) error {
	c.lock.Lock()
	defer c.lock.Unlock()
	if len(c.items) > 0 {
		return fmt.Errorf("cannot add indexers to running index")
	}
	oldKeys := sets.StringKeySet(c.indexers)
	newKeys := sets.StringKeySet(newIndexers)
	if oldKeys.HasAny(newKeys.List()...) {
		return fmt.Errorf("indexer conflict: %v", oldKeys.Intersection(newKeys))
	}
	for k, v := range newIndexers {
		c.indexers[k] = v
	}
	return nil
}
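
To make the conflict rule concrete, here is a minimal, self-contained sketch of the same key check. It is not the client-go code: `IndexFunc` and `Indexers` are simplified stand-ins for the client-go types of the same name, and `addIndexers` and `byName` are hypothetical helpers introduced only for illustration. The lock and the "running index" check are left out.

```go
package main

import (
	"fmt"
	"sort"
)

// IndexFunc and Indexers are simplified stand-ins (assumption) for the
// client-go types with the same names.
type IndexFunc func(obj interface{}) ([]string, error)

type Indexers map[string]IndexFunc

// addIndexers (hypothetical) merges newIndexers into existing. It rejects the
// whole batch if any name is already registered, mirroring the key check in
// the snippet above.
func addIndexers(existing, newIndexers Indexers) error {
	var conflicts []string
	for name := range newIndexers {
		if _, ok := existing[name]; ok {
			conflicts = append(conflicts, name)
		}
	}
	if len(conflicts) > 0 {
		sort.Strings(conflicts)
		return fmt.Errorf("indexer conflict: %v", conflicts)
	}
	for name, fn := range newIndexers {
		existing[name] = fn
	}
	return nil
}

// byName (hypothetical) indexes an object by its string representation.
func byName(obj interface{}) ([]string, error) {
	return []string{fmt.Sprint(obj)}, nil
}

func main() {
	registered := Indexers{} // non-nil and empty, so entries can be added

	// The first registration under "byName" succeeds.
	fmt.Println(addIndexers(registered, Indexers{"byName": byName}))
	// A second registration under the same key is rejected.
	fmt.Println(addIndexers(registered, Indexers{"byName": byName}))
}
```

In this sketch a plugin that picks an index name already taken gets an error back rather than silently overwriting the earlier indexer, which is the behavior the caveat is meant to document.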

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 22, 2022
@fromanirh fromanirh changed the title sched: schedfwk: init indexers with non-nil map scheduler: framework: initialize indexers in scheduler core with non-nil map Jun 22, 2022
@fromanirh
Contributor Author

fromanirh commented Jun 22, 2022

/retest

@fromanirh
Contributor Author

fromanirh commented Jun 22, 2022

Please note that we need to add a caveat in the code, as well as in this PR, stating that registering indexers under a duplicate key (one that is already registered) causes a conflict error:

This is the only comment not explicitly addressed (though the added unit tests cover it to some extent). What would be the best way to convey this information?

@Huang-Wei
Member

Huang-Wei commented Jun 22, 2022

What could be the best way to convey this information?

In the release note, document it like "Scheduler framework's shared PodInformer is now initialized with an indexer. Note that only non-conflict indexers are allowed to be added." And also add it as a comment in the code.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Jun 22, 2022
@fromanirh
Contributor Author

fromanirh commented Jun 22, 2022

What could be the best way to convey this information?

In the release note, document it like "Scheduler framework's shared PodInformer is now initialized with an indexer. Note that only non-conflict indexers are allowed to be added." And also add it as a comment in the code.

Done, thanks!

@fromanirh
Contributor Author

fromanirh commented Jun 22, 2022

@Huang-Wei just asking: is this fix suitable for backporting down to 1.23? I think it is small and safe, with a high enough ROI.

@Huang-Wei
Member

Huang-Wei commented Jun 22, 2022

/retest

@Huang-Wei
Member

Huang-Wei commented Jun 24, 2022

is this fix suitable for backporting down to 1.23?

We only backport critical bug fixes :(

@sftim
Contributor

sftim commented Jun 25, 2022

For the changelog, please can we write this from the cluster operator's point of view?

For example, explain that we've fixed a crash that happened with some kinds of scheduling plugin.

@fromanirh
Contributor Author

fromanirh commented Jun 26, 2022

For the changelog, please can we write this from the cluster operator's point of view?

For example, explain that we've fixed a crash that happened with some kinds of scheduling plugin.

I'll review the changelog, but this change is expected to have no impact from the cluster operator's PoV, only from the scheduler plugin developer's PoV. Without this change, scheduler plugins could not use indexers, so they had to use workarounds. That is expected to affect the efficiency of the plugins, not their stability.

@sftim
Contributor

sftim commented Jun 26, 2022

There's another outcome: a plugin built to assume that this bug is fixed / enhancement is made will not work in an earlier Kubernetes version.

Maybe we should consider a backport?

@ahg-g
Member

ahg-g commented Jun 27, 2022

There's another outcome: a plugin built to assume that this bug is fixed / enhancement is made will not work in an earlier Kubernetes version.

Maybe we should consider a backport?

Plugins are generally versioned and released in lockstep with k8s; I don't think this is a candidate for backport

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 13, 2022
Using a nil map to initialize the pod indexers will
cause a runtime failure when trying to add indexers
in a scheduler plugin.
We use an empty map to enable scheduler plugins
to add their indexers.

Signed-off-by: Francesco Romani <fromani@redhat.com>
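
The difference between a nil map and an empty map can be shown in isolation. The sketch below is a standalone illustration, not scheduler code: `indexFunc` and `tryRegister` are hypothetical names. It relies only on the Go rule that writing to a nil map panics, while writing to an empty, non-nil map succeeds. The assumption is that the write `c.indexers[k] = v` in the `AddIndexers` snippet quoted earlier in this thread is where the nil map fails.

```go
package main

import "fmt"

// indexFunc is a simplified stand-in (assumption) for an indexer function.
type indexFunc func(obj interface{}) ([]string, error)

// tryRegister (hypothetical) writes one entry into m and reports whether the
// write panicked. The deferred recover turns the panic into a return value.
func tryRegister(m map[string]indexFunc, name string) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	m[name] = func(obj interface{}) ([]string, error) { return nil, nil }
	return false
}

func main() {
	var nilIndexers map[string]indexFunc    // nil: reads work, writes panic
	emptyIndexers := map[string]indexFunc{} // empty but non-nil: writes work

	fmt.Println("nil map panicked:", tryRegister(nilIndexers, "example"))
	fmt.Println("empty map panicked:", tryRegister(emptyIndexers, "example"))
}
```

Reading from a nil map is safe in Go, which would explain why the problem only surfaces once a plugin tries to add an indexer.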
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 13, 2022
@Huang-Wei
Member

Huang-Wei commented Jul 18, 2022

/retest

@Huang-Wei
Member

Huang-Wei commented Jul 18, 2022

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 18, 2022
@k8s-ci-robot
Contributor

k8s-ci-robot commented Jul 18, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fromanirh, Huang-Wei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 18, 2022
@k8s-ci-robot k8s-ci-robot merged commit b52705a into kubernetes:master Jul 19, 2022
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.25 milestone Jul 19, 2022
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

the scheduler framework does not allow adding indexers in scheduler plugins
6 participants