More pod-affinity code cleanup and prepare for parallelization #29109
Conversation
(Force-pushed from 9e3b270 to 1c8162e.)
```diff
 // AnyPodMatchesPodAffinityTerm checks if any of given pods can match the specific podAffinityTerm.
-func (checker *PodAffinityChecker) AnyPodMatchesPodAffinityTerm(pod *api.Pod, allPods []*api.Pod, node *api.Node, podAffinityTerm api.PodAffinityTerm) (bool, error) {
-	// TODO: Do we really need any pod matching, or all pods matching? I think the latter.
+func (checker *PodAffinityChecker) AnyPodMatchesPodAffinityTerm(pod *api.Pod, allPods []*api.Pod, node *api.Node, podAffinityTerm api.PodAffinityTerm) (bool, bool, error) {
```
This additionally returns whether any pod matching the given term exists, to avoid checking it again in the next function.
(Deleted earlier comment.) Ah, I think I understand. matchingPodExists means "matching pod exists anywhere" while the first return value means "matching pod exists on a node that matches the topology key" ?
If so, please add a comment like
// First return value indicates whether a matching pod exists on a node that matches the topology key,
// while the second return value indicates whether a matching pod exists anywhere.
Yes, exactly; I will extend the comment as you suggested.
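The two-boolean return being discussed can be sketched as follows. This is a simplified, hypothetical model, not the actual scheduler code: `Pod`, `Node`, `anyPodMatchesTerm`, and `labelsMatch` are stand-ins, and real label selectors are richer than exact-match maps.

```go
package main

import "fmt"

// Simplified stand-ins for the scheduler types (hypothetical).
type Pod struct {
	Name   string
	Labels map[string]string
	Node   string // name of the node the pod runs on
}

type Node struct {
	Name   string
	Labels map[string]string
}

// labelsMatch is a stand-in for label-selector matching (exact match only).
func labelsMatch(labels, selector map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// anyPodMatchesTerm mirrors the discussed contract: the first return value
// indicates whether a matching pod exists on a node that matches the topology
// key, while the second indicates whether a matching pod exists anywhere.
func anyPodMatchesTerm(allPods []Pod, nodes map[string]Node, targetNode Node, selector map[string]string, topologyKey string) (bool, bool) {
	matchesOnTopology := false
	matchesAnywhere := false
	for _, p := range allPods {
		if !labelsMatch(p.Labels, selector) {
			continue
		}
		matchesAnywhere = true
		podNode, ok := nodes[p.Node]
		if ok && podNode.Labels[topologyKey] == targetNode.Labels[topologyKey] {
			matchesOnTopology = true
		}
	}
	return matchesOnTopology, matchesAnywhere
}

func main() {
	nodes := map[string]Node{
		"n1": {Name: "n1", Labels: map[string]string{"zone": "a"}},
		"n2": {Name: "n2", Labels: map[string]string{"zone": "b"}},
	}
	pods := []Pod{{Name: "storage-0", Labels: map[string]string{"app": "storage"}, Node: "n2"}}
	// A matching pod exists, but in a different zone than the target node.
	onTopo, anywhere := anyPodMatchesTerm(pods, nodes, nodes["n1"], map[string]string{"app": "storage"}, "zone")
	fmt.Println(onTopo, anywhere)
}
```

Returning both booleans in one pass avoids re-scanning `allPods` in the caller, which matters for the parallelization this PR is preparing for.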
(Force-pushed from 1c8162e to 0c08ddc.)
```go
// AnyPodMatchesPodAffinityTerm checks if any of given pods can match the specific podAffinityTerm.
func (checker *PodAffinityChecker) AnyPodMatchesPodAffinityTerm(pod *api.Pod, allPods []*api.Pod, node *api.Node, podAffinityTerm api.PodAffinityTerm) (bool, error) {
	// TODO: Do we really need any pod matching, or all pods matching? I think the latter.
```
Why do you think it should be all pods matching?
When called from NodeMatchesHardPodAffinity() it is enforcing a rule like "only put the new pod on a node that is already running a pod from the storage service, since the new pod needs very low-latency communication with the storage service." You don't need all the pods on the node to match, just one, for the node to be accepted.
When called from NodeMatchesHardPodAntiAffinity() it is enforcing a rule like "do not put the new pod on a node that is already running (some pod that the new pod would conflict with)." You don't need all the pods on the node to match, just one, for the node to be rejected.
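The "just one matching pod is enough" semantics described above could be sketched like this. The helper names (`nodeAllowedByHardAffinity`, `nodeAllowedByHardAntiAffinity`, `podMatches`) are hypothetical illustrations, not the real predicate functions:

```go
package main

import "fmt"

// podMatches is a stand-in for label-selector matching (hypothetical).
func podMatches(podLabels, selector map[string]string) bool {
	for k, v := range selector {
		if podLabels[k] != v {
			return false
		}
	}
	return true
}

// nodeAllowedByHardAffinity accepts the node if at least one pod on it
// matches the selector; all pods matching is not required.
func nodeAllowedByHardAffinity(podsOnNode []map[string]string, selector map[string]string) bool {
	for _, labels := range podsOnNode {
		if podMatches(labels, selector) {
			return true // one match is enough to accept
		}
	}
	return false
}

// nodeAllowedByHardAntiAffinity rejects the node if at least one pod on it
// matches the selector.
func nodeAllowedByHardAntiAffinity(podsOnNode []map[string]string, selector map[string]string) bool {
	for _, labels := range podsOnNode {
		if podMatches(labels, selector) {
			return false // one conflicting pod is enough to reject
		}
	}
	return true
}

func main() {
	podsOnNode := []map[string]string{
		{"app": "storage"},
		{"app": "web"},
	}
	sel := map[string]string{"app": "storage"}
	fmt.Println(nodeAllowedByHardAffinity(podsOnNode, sel))     // accepted: one storage pod present
	fmt.Println(nodeAllowedByHardAntiAffinity(podsOnNode, sel)) // rejected: one conflicting pod present
}
```

Note the symmetry: both predicates short-circuit on the first matching pod, which is what makes an "any pod matches" primitive the natural building block for both.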
I definitely agree that for "Anti-Affinity" it should be "some pod".
However, for "Affinity" I'm not that convinced. In the service case you described above, a request to the service may be redirected to any of its pods. So my feeling is that it should be "close to all pods".
OK, that's a valid point, let me refine my example.
- "Run on the same node" but where the two pods communicate via a hard-coded host port or have some out-of-band way to find each other and use localhost to communicate when they discover they share a node
- "Run in same zone" where we assume service affinity (Proposal: Service connection affinity #15675) has been implemented (service routes you to instance in your same zone, if one exists)
I don't understand the "close to all pods" use case. Can you give an example?
OK, so what I had in mind is:
- assume we have a service with 2 underlying pods: 1 in zone A and 1 in zone B
- if the pod we are scheduling is going to talk to this service, its connections will be forwarded to a random one of those 2 pods (we don't have service affinity, right?)
- ideally, we would like to be close to both pods, because we don't know which one we will be talking to
However, in the example above that is impossible, so maybe it is enough to be close to "any of those"?
Looks good except for a few comments.
(Force-pushed from 0c08ddc to 00761aa.)
(Force-pushed from 00761aa to fad876b.)
LGTM
GCE e2e build/test passed for commit fad876b.
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
GCE e2e build/test passed for commit fad876b.
Automatic merge from submit-queue
Ref #26144