SchedulerPredicates [Serial] validates resource limits of pods that are allowed to run - possibly invalid test #87191

Open
ingvagabund opened this issue Jan 14, 2020 · 4 comments · May be fixed by #87242

@ingvagabund commented Jan 14, 2020

What happened:

I am questioning the validity of the test. The comments in test/e2e/scheduling/predicates.go for validates resource limits of pods that are allowed to run say:

// This test verifies we don't allow scheduling of pods in a way that sum of
// limits of pods is greater than machines capacity.
// It assumes that cluster add-on pods stay stable and cannot be run in parallel
// with any other test that touches Nodes or Pods.
// It is so because we need to have precise control on what's running in the cluster.
// Test scenario:
// 1. Find the amount CPU resources on each node.
// 2. Create one pod with affinity to each node that uses 70% of the node CPU.
// 3. Wait for the pods to be scheduled.
// 4. Create another pod with no affinity to any node that need 50% of the largest node CPU.
// 5. Make sure this additional pod is not scheduled.
/*
	Release : v1.9
	Testname: Scheduler, resource limits
	Description: Scheduling Pods MUST fail if the resource limits exceed Machine capacity.
*/
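
For concreteness, here is a minimal sketch of the kind of filler pod step 2 describes. This is my own illustration, not the e2e test's code; the function name, the image, and the way the pod is pinned to the node are assumptions (the real test may also set requests on the filler pods and uses node affinity rather than NodeName):

package sketch

import (
    v1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/resource"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// fillerPod sketches the step-2 pod: pinned to nodeName and declaring a CPU
// limit sized to roughly 70% of that node's allocatable millicores.
func fillerPod(nodeName string, allocatableMilliCPU int64) *v1.Pod {
    return &v1.Pod{
        ObjectMeta: metav1.ObjectMeta{Name: "filler-pod-" + nodeName},
        Spec: v1.PodSpec{
            NodeName: nodeName, // simplest way to pin the pod for the sketch
            Containers: []v1.Container{{
                Name:  "pause",
                Image: "k8s.gcr.io/pause:3.1",
                Resources: v1.ResourceRequirements{
                    Limits: v1.ResourceList{
                        v1.ResourceCPU: *resource.NewMilliQuantity(allocatableMilliCPU*70/100, resource.DecimalSI),
                    },
                },
            }},
        },
    }
}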

The test scenario as described sounds reasonable. The only nitpick I have is the very first sentence, "This test verifies we don't allow scheduling of pods in a way that sum of limits of pods is greater than machines capacity". A pod's resource limits are ignored in predicates; they only take effect in the scoring phase, when priorities are computed. Also, priorities are computed only when there are at least two candidate nodes. Revisiting step 5 with that in mind, setting resource limits cannot determine whether the additional pod gets scheduled. If there are no candidate nodes, the pod stays pending and the test passes; if there is exactly one candidate node, the limits are ignored, the pod gets scheduled, and the test fails. If there are at least two candidate nodes, one of them always gets picked, so the additional pod gets scheduled no matter what the resource limits field says.
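
To make the candidate-node argument concrete, here is a simplified sketch of the host-picking behaviour described above. It is my own illustration, not the actual scheduler code:

package sketch

import "fmt"

// pickHost illustrates why the ResourceLimits score cannot influence the
// outcome here: with zero feasible nodes the pod stays pending, with exactly
// one the scores are never consulted, and with two or more some node always
// wins, so the pod is scheduled regardless of its limits.
func pickHost(feasibleNodes []string, scores map[string]int64) (string, error) {
    switch len(feasibleNodes) {
    case 0:
        return "", fmt.Errorf("no nodes available to schedule pod")
    case 1:
        return feasibleNodes[0], nil // scores, and therefore limits, are ignored
    }
    // Two or more candidates: the highest-scoring node is picked, so some node
    // is always chosen and the pod is scheduled.
    best := feasibleNodes[0]
    for _, node := range feasibleNodes[1:] {
        if scores[node] > scores[best] {
            best = node
        }
    }
    return best, nil
}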

In the framework:

  • resource limits are accessed through the ResourceLimits.PostFilter method [1]
  • ResourceLimits.PostFilter is used to perform pre-scoring [2]
  • in the scoring phase, the score is computed through the ResourceLimits.Score method [3]

The test alternates between passing and failing: it passes when the predicates produce an empty list of candidate nodes and fails when the list is non-empty. In either case, the pod's resource limits are ineffective.

[1]

func getResourceLimits(pod *v1.Pod) *schedulernodeinfo.Resource {
    result := &schedulernodeinfo.Resource{}
    for _, container := range pod.Spec.Containers {
        result.Add(container.Resources.Limits)
    }
    // take max_resource(sum_pod, any_init_container)
    for _, container := range pod.Spec.InitContainers {
        result.SetMaxResource(container.Resources.Limits)
    }
    return result
}

[2] https://kubernetes.io/docs/concepts/configuration/scheduling-framework/#post-filter
[3]
func (rl *ResourceLimits) Score(ctx context.Context, state *framework.CycleState, pod *v1.Pod, nodeName string) (int64, *framework.Status) {
    nodeInfo, err := rl.handle.SnapshotSharedLister().NodeInfos().Get(nodeName)
    if err != nil || nodeInfo.Node() == nil {
        return 0, framework.NewStatus(framework.Error, fmt.Sprintf("getting node %q from Snapshot: %v, node is nil: %v", nodeName, err, nodeInfo.Node() == nil))
    }
    allocatableResources := nodeInfo.AllocatableResource()
    podLimits, err := getPodResource(state)
    if err != nil {
        return 0, framework.NewStatus(framework.Error, err.Error())
    }
    cpuScore := computeScore(podLimits.MilliCPU, allocatableResources.MilliCPU)
    memScore := computeScore(podLimits.Memory, allocatableResources.Memory)
    score := int64(0)
    if cpuScore == 1 || memScore == 1 {
        score = 1
    }
    return score, nil
}
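
For completeness, the computeScore helper called above effectively reduces to the following per-resource check (paraphrased from memory, not copied verbatim from the plugin source): a node only earns a point for a resource when the pod declares a non-zero limit for it and the node's allocatable capacity covers it.

// Paraphrased per-resource scoring of the ResourceLimits plugin: 1 when the
// pod declares a non-zero limit and the node's allocatable capacity covers it,
// 0 otherwise. Score then collapses the CPU and memory results into 0 or 1.
func computeScore(limit, allocatable int64) int64 {
    if limit != 0 && allocatable >= limit {
        return 1
    }
    return 0
}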

What you expected to happen:

Either have another node schedule the additional pod, validating that all saturated nodes (with filler-pods-* scheduled) get a lower score (0 in the case of the ResourceLimits plugin), or invalidate the test by removing it.

Can anyone check the test's validity and confirm or disprove my statement about the ineffectiveness of the resource limits?

How to reproduce it (as minimally and precisely as possible):
Make sure there is at least one node with non-zero free CPU resources before running step 5.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): master HEAD
  • Cloud provider or hardware configuration: N/A
  • OS (e.g: cat /etc/os-release): N/A
  • Kernel (e.g. uname -a): N/A
  • Install tools: N/A
  • Network plugin and version (if this is a network-related bug): N/A
  • Others: N/A

@ingvagabund commented Jan 14, 2020

@bsalamat I have noticed you refactored the test in #53169. Can you still recall the original intent of the test? I can't grasp how the limits are actually taken into account.

@ahg-g commented Jan 14, 2020

I agree, I think the test is wrong; the scheduler doesn't make fit decisions based on resource limits. In the test, the pod that gets created at the end should have Requests set.

Do you mind fixing the test and updating the test comment, please?
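
For illustration, a minimal sketch of what that change could look like. The function name, image, and the choice to keep the limit equal to the request are my assumptions; the actual change may differ (see #87242). With a request set, the fit check (PodFitsResources / the NodeResourcesFit plugin) can actually reject the pod on the saturated nodes:

package sketch

import (
    v1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/resource"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// finalPod sketches the suggested change to the step-4 pod: it requests 50% of
// the largest node's CPU instead of only declaring a limit.
func finalPod(largestNodeMilliCPU int64) *v1.Pod {
    cpu := *resource.NewMilliQuantity(largestNodeMilliCPU/2, resource.DecimalSI)
    return &v1.Pod{
        ObjectMeta: metav1.ObjectMeta{Name: "additional-pod"},
        Spec: v1.PodSpec{
            Containers: []v1.Container{{
                Name:  "pause",
                Image: "k8s.gcr.io/pause:3.1",
                Resources: v1.ResourceRequirements{
                    Requests: v1.ResourceList{v1.ResourceCPU: cpu}, // now considered by the fit check
                    Limits:   v1.ResourceList{v1.ResourceCPU: cpu},
                },
            }},
        },
    }
}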

@ingvagabund commented Jan 14, 2020

Do you mind fixing the test and updating the test comment, please?

On it
/assign
