New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix an issue in inter-pod affinity predicate that cause affinity to self being processed incorrectly #62591

Merged
merged 2 commits into from Apr 17, 2018

Conversation

@bsalamat
Contributor

bsalamat commented Apr 15, 2018

What this PR does / why we need it:
Fixes the anti-affinity issue explained in #62232.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #62232

Special notes for your reviewer:

Release note:

Fix an issue in inter-pod affinity predicate that cause affinity to self being processed incorrectly

/sig scheduling

if term.TopologyKey == kubeletapis.LabelHostname {
pods = nodeInfo.Pods()
affinityTermPropertiesMatch := podMatchesAffinityTermProperties(targetPod, props)
if !affinityTermPropertiesMatch {

This comment has been minimized.

@ravisantoshgudimetla

ravisantoshgudimetla Apr 16, 2018

Contributor

I think it would be better if we add logging here because we are going to return false before check. For example by the time we reach here, we did not check the toplogyKey, so returning false may be not clear to someone who is debugging it.

This comment has been minimized.

@bsalamat

bsalamat Apr 16, 2018

Contributor

I am not sure what you mean by "return false before check". We do check affinity term namespace and selector and if they don't match the pod, we return false here.

This comment has been minimized.

@ravisantoshgudimetla

ravisantoshgudimetla Apr 17, 2018

Contributor

I am not sure what you mean by "return false before check".

Sorry for the confusion - We have 3 return values(topologies, namespace and selector, error). The false we are returning here is for namespace and selector check but we are returning false for topologies as well, which we did not check yet. So, I think logging that information might be helpful while debugging.

This comment has been minimized.

@bsalamat

bsalamat Apr 17, 2018

Contributor

To be clear, the first return value is for checking "namespaces AND selectors AND topologies". So, it makes sense to return false when namespaces and selectors are not a match.

This comment has been minimized.

@ravisantoshgudimetla

ravisantoshgudimetla Apr 17, 2018

Contributor

Sorry got confused here. I thought first return value is for topologyKey and we check this using NodesHaveSameTopologyKey, second is for namespaces AND selectors.

This comment has been minimized.

@bsalamat

bsalamat Apr 17, 2018

Contributor

You have all the rights to be confused! The logic is quite complex. I tried to simplify as much as I could, but most of the complexity is imposed by the API and can hardly be avoided in the implementation.

return ErrPodAntiAffinityRulesNotMatch, nil
if !matchFound && len(affinityTerms) > 0 {
// We have not been able to find any matches for the pod's affinity rules.
// This pod may the first pod in a series that have affinity to themselves. In order

This comment has been minimized.

@ravisantoshgudimetla

ravisantoshgudimetla Apr 16, 2018

Contributor

nit: s/may/ may be

This comment has been minimized.

@bsalamat

bsalamat Apr 16, 2018

Contributor

Done.

return false, false, fmt.Errorf("empty topologyKey is not allowed except for PreferredDuringScheduling pod anti-affinity")
}
if !priorityutil.NodesHaveSameTopologyKey(nodeInfo.Node(), targetPodNode, term.TopologyKey) {
return false, affinityTermPropertiesMatch, nil

This comment has been minimized.

@dixudx

dixudx Apr 16, 2018

Member

Since L1181 will return directly when affinityTermPropertiesMatch is false, here affinityTermPropertiesMatch will always be true, right ?

If so, I'd prefer to explicitly use true.

This comment has been minimized.

@bsalamat

bsalamat Apr 16, 2018

Contributor

You are right. I was not sure if using true would be more readable or the variable. I changed these to true.

}
}
return false, matchingPodExists, nil
return true, affinityTermPropertiesMatch, nil

This comment has been minimized.

@dixudx

dixudx Apr 16, 2018

Member

Ditto.

}
// Check if pod matches its own affinity properties (namespace and label selector).
if !targetPodMatchesAffinityOfPod(pod, pod) {
glog.V(10).Infof("Cannot schedule pod %+v onto node %v, because of PodAffinity",

This comment has been minimized.

@dixudx

dixudx Apr 16, 2018

Member

Ditto.

This comment has been minimized.

@bsalamat

bsalamat Apr 16, 2018

Contributor

What is this "ditto" for?

This comment has been minimized.

@wgliang

wgliang Apr 17, 2018

Member

His meaning may be that the log information is exactly the same as above.

if term.TopologyKey == kubeletapis.LabelHostname {
pods = nodeInfo.Pods()
affinityTermPropertiesMatch := podMatchesAffinityTermProperties(targetPod, props)
if !affinityTermPropertiesMatch {

This comment has been minimized.

@bsalamat

bsalamat Apr 16, 2018

Contributor

I am not sure what you mean by "return false before check". We do check affinity term namespace and selector and if they don't match the pod, we return false here.

return false, false, fmt.Errorf("empty topologyKey is not allowed except for PreferredDuringScheduling pod anti-affinity")
}
if !priorityutil.NodesHaveSameTopologyKey(nodeInfo.Node(), targetPodNode, term.TopologyKey) {
return false, affinityTermPropertiesMatch, nil

This comment has been minimized.

@bsalamat

bsalamat Apr 16, 2018

Contributor

You are right. I was not sure if using true would be more readable or the variable. I changed these to true.

return ErrPodAntiAffinityRulesNotMatch, nil
if !matchFound && len(affinityTerms) > 0 {
// We have not been able to find any matches for the pod's affinity rules.
// This pod may the first pod in a series that have affinity to themselves. In order

This comment has been minimized.

@bsalamat

bsalamat Apr 16, 2018

Contributor

Done.

}
// Check if pod matches its own affinity properties (namespace and label selector).
if !targetPodMatchesAffinityOfPod(pod, pod) {
glog.V(10).Infof("Cannot schedule pod %+v onto node %v, because of PodAffinity",

This comment has been minimized.

@bsalamat

bsalamat Apr 16, 2018

Contributor

What is this "ditto" for?

@ravisantoshgudimetla

LGTM. I would wait if @dixudx has any other comments as lgtm would initiate merge.

@dixudx

This comment has been minimized.

Member

dixudx commented Apr 17, 2018

/lgtm

@k8s-ci-robot

This comment has been minimized.

Contributor

k8s-ci-robot commented Apr 17, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, dixudx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bsalamat

This comment has been minimized.

Contributor

bsalamat commented Apr 17, 2018

/retest

@k8s-merge-robot

This comment has been minimized.

Contributor

k8s-merge-robot commented Apr 17, 2018

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-ci-robot

This comment has been minimized.

Contributor

k8s-ci-robot commented Apr 17, 2018

@bsalamat: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-e2e-gce-device-plugin-gpu 4f2155a link /test pull-kubernetes-e2e-gce-device-plugin-gpu

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-merge-robot

This comment has been minimized.

Contributor

k8s-merge-robot commented Apr 17, 2018

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-merge-robot k8s-merge-robot merged commit 7966265 into kubernetes:master Apr 17, 2018

13 of 15 checks passed

pull-kubernetes-e2e-gce-device-plugin-gpu Job failed.
Details
Submit Queue Required Github CI test is not green: pull-kubernetes-e2e-gce
Details
cla/linuxfoundation bsalamat authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-cross Skipped
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gke Skipped
pull-kubernetes-e2e-kops-aws Job succeeded.
Details
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce Job succeeded.
Details
pull-kubernetes-local-e2e Skipped
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment