Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Init cache with assigned non-terminated pods before scheduling #45453

Merged
merged 3 commits into from
May 10, 2017

Conversation

k82cn
Copy link
Member

@k82cn k82cn commented May 7, 2017

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #45220

Release note:

The fix makes scheduling go routine waiting for cache (e.g. Pod) to be synced.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 7, 2017
@k8s-github-robot k8s-github-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels May 7, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@k82cn
Copy link
Member Author

k82cn commented May 7, 2017

/cc @bsalamat

@k8s-ci-robot k8s-ci-robot requested a review from bsalamat May 7, 2017 05:47
@k82cn
Copy link
Member Author

k82cn commented May 7, 2017

I'm going to add some test for it.

@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels May 7, 2017
@wanghaoran1988
Copy link
Contributor

why not wait scheduledPodPopulator.hasSynced ?

@k82cn
Copy link
Member Author

k82cn commented May 7, 2017

why not wait scheduledPodPopulator.hasSynced ?

reasonable, let me check the detail of Informer.

@k8s-github-robot k8s-github-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 7, 2017
@BlueBlue-Lee
Copy link

5908108#diff-77bdee21550bc698bfa654de11c64324L455

I think it will never return because the corresponding controller is not started.

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels May 8, 2017
@k82cn
Copy link
Member Author

k82cn commented May 8, 2017

5908108#diff-77bdee21550bc698bfa654de11c64324L455
I think it will never return because the corresponding controller is not started.

Thanks for the comments, addressed.

@k82cn
Copy link
Member Author

k82cn commented May 8, 2017

Tested in local cluster by following steps:

  1. Start cluster with ./hack/local-cluster-up.sh (8 cpu, 2G mem)
  2. Submit 10 pods (1 cpu, 300m mem): 7 pod will be running and 3 pods will be pending (dns consumed 260m cpu)
  3. Keep restarting "hyperkube scheduler":
    • without fix, it may report OutOfCpu
    • with fix, not status changed (7 running, 3 pending)

The reason is that we did not wait for cache to be synced by before scheduling, so the running pods may not be added into cache when scheduling. We did not need to wait for unassigned tasks, scheduler will get it one by one to dispatch.

/cc @bsalamat @wojtek-t

@k8s-ci-robot k8s-ci-robot requested a review from wojtek-t May 8, 2017 05:23
@k82cn k82cn changed the title [WIP] Init cache with assigned non-terminated pods before scheduling Init cache with assigned non-terminated pods before scheduling May 8, 2017
@bsalamat
Copy link
Member

bsalamat commented May 8, 2017

/lgtm

Thanks for the fix, Klaus!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 8, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bsalamat, k82cn
We suggest the following additional approver: wojtek-t

Assign the PR to them by writing /assign @wojtek-t in a comment when ready.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@wojtek-t wojtek-t assigned ncdc, timothysc, wojtek-t and bsalamat and unassigned bsalamat May 8, 2017
@timothysc timothysc added kind/bug Categorizes issue or PR as related to a bug. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. labels May 8, 2017
@timothysc timothysc added this to the v1.7 milestone May 8, 2017
@timothysc
Copy link
Member

@k82cn please add a release note on this one.

@ncdc any comments? It looks good to me, re: prevents a startup race, but does this fix need to propagate to the kcm?

@k82cn
Copy link
Member Author

k82cn commented May 9, 2017

please add a release note on this one.

Done

re: prevents a startup race, but does this fix need to propagate to the kcm?

I can do some check today and append comments here.

@k82cn
Copy link
Member Author

k82cn commented May 9, 2017

@ncdc any comments? It looks good to me, re: prevents a startup race, but does this fix need to propagate to the kcm?

@ncdc / @timothysc , just check the kcm: each controller call cache.WaitForCacheSync in Run() (cronjob sync it by List).

@wanghaoran1988
Copy link
Contributor

wanghaoran1988 commented May 9, 2017

@k82cn Scheduler seems use the sharedInformerFactory for other resources like nodes,configMap,pv, except for the pod, here is the code for pods: scheduledPodIndexer, c.scheduledPodPopulator = cache.NewIndexerInformer, it's a not registered informer of the sharedInformerFactory, so I think your code sharedInformerFactory,WaitForCacheSync will not sync the scheduled pod and notScheduledPod.
Also, your code fix part of the problem, you synced the other resources except pods, I am not clear why the scheduledPodIndexer and scheduledPodPopulator not using the sharedInformer, maybe we have other concerns ? Correct me if I am wrong.
/cc @timothysc @davidopp

@k82cn
Copy link
Member Author

k82cn commented May 9, 2017

scheduledPodIndexer, c.scheduledPodPopulator = cache.NewIndexerInformer, it's a not registered informer of the sharedInformerFactory,

Good point! I'll add logic to handle it. We did not add it into informer factory because the ListerWatcher is customized (un-assigned, non-terminated).

@k8s-github-robot k8s-github-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels May 9, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 9, 2017
@timothysc timothysc added approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels May 10, 2017
@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 45453, 45307, 44987)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scheduler schedules pod to node which has not enough resources
10 participants