Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancements to scheduler priority functions #2906

Merged
merged 8 commits into from
Jan 15, 2015

Conversation

abhgupta
Copy link

  • Modified the existing spreading priority to consider service pods explicitly
  • Added a new priority function to spread pods across zones

@abhgupta
Copy link
Author

@bgrant0607 @davidopp @mikedanese please review
This is the follow up to the scheduler configurability changes - we can now spread pods based on service selector labels and zones.

This will need to be reconciled with #2825
Depending upon which one gets merged first, the other PR will need to be rebased.

@mikedanese
Copy link
Member

This can be refactored into an algorithm provider to reconcile with #2825. It should be pretty straight forward.

// spreads pods belonging to the same service across minions
factory.AddPriority("ServiceSpreadPriority", algorithm.NewServiceSpreadPriority(factory.ServiceLister), 1)
// spreads pods belonging to the same service across minions in different zones
factory.AddPriority("ZoneSpreadPriority", algorithm.NewZoneSpreadPriority(factory.ServiceLister, "zone"), 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there precedence for reserving a specific label key? If we do start treating certain labels specially we might want to reference them from a const where the special behavior is documented (possibly like we do for ports in pkg/master/ports/ports.go) as well as documenting this behavior somewhere like the labels doc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't really give much thought on how this was supposed to be done, so I just left it for the review. Important thing is that this label is applicable to minions only and should not be user modifiable (ones who create pods/replication controllers/services). This is something that the admin/installer is supposed to deal with and hence reserving a label for this purpose on the minions should be reasonable.

The label key should be configurable but does not need to be documented in a generic manner, since this does not apply to labels generically.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing is, we wouldn't want this "zone" label to be hardcoded (even as a const) inside factory.go. After the plugin refactoring, this should be made a const inside the algorithm provider that defines the predicates/priorities.

If the above is acceptable, I'd say we let this be for now and change once the plugin refactoring PR merges.

@abhgupta abhgupta force-pushed the abhgupta-dev branch 2 times, most recently from 4946746 to 49d49d5 Compare December 15, 2014 18:45
@abhgupta
Copy link
Author

@davidopp Made changes based on initial review feedback - please take another look when you get a chance.

@abhgupta abhgupta force-pushed the abhgupta-dev branch 5 times, most recently from 8363041 to 4f86075 Compare December 22, 2014 23:52
@abhgupta
Copy link
Author

@davidopp @bgrant0607 @mikedanese PTAL
I have implemented a few predicates and priority functions that capture most of the common use cases that we have. Most of these have to do with multiple nested levels of infrastructure topology (regions/zones/racks). Affinity and anti-affinity (spread) are now handled for service pods.

@abhgupta abhgupta force-pushed the abhgupta-dev branch 5 times, most recently from d61b909 to 0d028ae Compare December 23, 2014 02:02
@abhgupta
Copy link
Author

This PR is now ready for review.


podCounts := map[string]int{}
numServicePods := len(pods)
if numServicePods > 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check is not necessary. Iteration over an empty slice is a no-op.

// Fit is defined based on whether the minion has the specified label values as the pod being scheduled
// Alternately, if the pod does not specify any/all labels, the other pods in the service are looked at
factory.RegisterFitPredicate("ServiceAffinity", algorithm.NewServiceAffinityPredicate(factory.PodLister, factory.ServiceLister, factory.MinionLister, []string{"region"})),
// Fit is defined based on the presence/absence of a label on a minion, regardless of value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is specifically using the "region" label, the comment should probably mention that.

@abhgupta
Copy link
Author

abhgupta commented Jan 8, 2015

@davidopp PTAL. I have made changes to incorporate all the feedback.

@abhgupta abhgupta force-pushed the abhgupta-dev branch 2 times, most recently from 4c5c4a5 to 3da3229 Compare January 8, 2015 22:23
@abhgupta
Copy link
Author

@davidopp This PR is now ready for a final review.

@davidopp
Copy link
Member

Very sorry for the delay. Will take a look this evening.

return antiAffinity.CalculateAntiAffinityPriority
}

func (s *ServiceAntiAffinity) CalculateAntiAffinityPriority(pod api.Pod, podLister PodLister, minionLister MinionLister) (HostPriorityList, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment explaining how this priority function works.

@davidopp
Copy link
Member

Other than the comments I made, LGTM!

@abhgupta
Copy link
Author

@davidopp @bgrant0607 This PR should now be ready for merge.

@bgrant0607
Copy link
Member

@davidopp Please merge this if you're happy with it.

davidopp added a commit that referenced this pull request Jan 15, 2015
Enhancements to scheduler priority functions
@davidopp davidopp merged commit 2675cfa into kubernetes:master Jan 15, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants