Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add more scheduler benchmark testcases #69898

Merged
merged 1 commit into from
Oct 24, 2018
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
128 changes: 118 additions & 10 deletions test/integration/scheduler_perf/scheduler_bench_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,10 @@ import (
"github.com/golang/glog"
)

var (
defaultNodeStrategy = &testutils.TrivialNodePrepareStrategy{}
)

// BenchmarkScheduling benchmarks the scheduling rate when the cluster has
// various quantities of nodes and scheduled pods.
func BenchmarkScheduling(b *testing.B) {
Expand All @@ -45,41 +49,90 @@ func BenchmarkScheduling(b *testing.B) {
for _, test := range tests {
name := fmt.Sprintf("%vNodes/%vPods", test.nodes, test.existingPods)
b.Run(name, func(b *testing.B) {
benchmarkScheduling(test.nodes, test.existingPods, test.minPods, setupStrategy, testStrategy, b)
benchmarkScheduling(test.nodes, test.existingPods, test.minPods, defaultNodeStrategy, setupStrategy, testStrategy, b)
})
}
}

// BenchmarkSchedulingAntiAffinity benchmarks the scheduling rate of pods with
// BenchmarkSchedulingPodAntiAffinity benchmarks the scheduling rate of pods with
// PodAntiAffinity rules when the cluster has various quantities of nodes and
// scheduled pods.
func BenchmarkSchedulingAntiAffinity(b *testing.B) {
func BenchmarkSchedulingPodAntiAffinity(b *testing.B) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure where the code for the perf dashboard lives, but it is worth checking to make sure that renaming this function does not affect the dashboard.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. From below snapshot, it's very likely it's running -bench=. option:

image

@wojtek-t @shyamjvs If function BenchmarkSchedulingAntiAffinity is renamed to BenchmarkSchedulingPodAntiAffinity, I guess old datapoints would stop growing and become stale? And maybe we need to do some trick (manually rename the local dataset file) to make old data connect with new data, so that the graph is still consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with perf-dash.
@krzysied - can you please suggest something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess old datapoints would stop growing and become stale

Yes. The new datapoints will by assigned to new function name. The old one will remain unchanged.

And maybe we need to do some trick (manually rename the local dataset file) to make old data connect with new data, so that the graph is still consistent.

Can be done. The question is whether we need this.
Regarding benchmarks, Perfdash presents 100 newest results. Assuming that there is one test every hour, after a week there will be only results with the new function name available.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @krzysied @wojtek-t , then don't bother to "migrate" old dataset :)

tests := []struct{ nodes, existingPods, minPods int }{
{nodes: 500, existingPods: 250, minPods: 250},
{nodes: 500, existingPods: 5000, minPods: 250},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that you are at it, could you add nodes: 1000, existing: 1000, minPods: 500 test here and in the next test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bsalamat to confirm, would you like the following tests to be added this 1000-node test entry?

  • BenchmarkSchedulingPodAntiAffinity
  • BenchmarkSchedulingPodAffinity
  • BenchmarkSchedulingNodeAffinity

And after that, our cases would be like this:

  • BenchmarkScheduling (have no affinity)

    {nodes: 100, existingPods: 0, minPods: 100},
    {nodes: 100, existingPods: 1000, minPods: 100},
    {nodes: 1000, existingPods: 0, minPods: 100},
    {nodes: 1000, existingPods: 1000, minPods: 100},
  • BenchmarkSchedulingPodAntiAffinity

    {nodes: 500, existingPods: 250, minPods: 250},
    {nodes: 500, existingPods: 5000, minPods: 250},
    {nodes: 1000, existingPods: 1000, minPods: 500},
  • BenchmarkSchedulingPodAffinity

    {nodes: 500, existingPods: 250, minPods: 250},
    {nodes: 500, existingPods: 5000, minPods: 250},
    {nodes: 1000, existingPods: 1000, minPods: 500},
  • BenchmarkSchedulingNodeAffinity

    {nodes: 500, existingPods: 250, minPods: 250},
    {nodes: 500, existingPods: 5000, minPods: 250},
    {nodes: 1000, existingPods: 1000, minPods: 500},

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @bsalamat ^^

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable from my side :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Huang-Wei Looks good for now. I want to be able to try a larger number of nodes, but it has the risk of timing out in our CI/CD. Let's go with 1000 for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks @resouer @bsalamat .

{nodes: 1000, existingPods: 1000, minPods: 500},
}
// The setup strategy creates pods with no affinity rules.
setupStrategy := testutils.NewSimpleWithControllerCreatePodStrategy("setup")
// The test strategy creates pods with anti-affinity for each other.
testBasePod := makeBasePodWithAntiAffinity(
testBasePod := makeBasePodWithPodAntiAffinity(
map[string]string{"name": "test", "color": "green"},
map[string]string{"color": "green"})
// The test strategy creates pods with anti-affinity for each other.
testStrategy := testutils.NewCustomCreatePodStrategy(testBasePod)
for _, test := range tests {
name := fmt.Sprintf("%vNodes/%vPods", test.nodes, test.existingPods)
b.Run(name, func(b *testing.B) {
benchmarkScheduling(test.nodes, test.existingPods, test.minPods, defaultNodeStrategy, setupStrategy, testStrategy, b)
})
}
}

// BenchmarkSchedulingPodAffinity benchmarks the scheduling rate of pods with
// PodAffinity rules when the cluster has various quantities of nodes and
// scheduled pods.
func BenchmarkSchedulingPodAffinity(b *testing.B) {
tests := []struct{ nodes, existingPods, minPods int }{
{nodes: 500, existingPods: 250, minPods: 250},
{nodes: 500, existingPods: 5000, minPods: 250},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit curious why we had different number of nodes and pods in BenchmarkScheduling VS BenchmarkSchedulingPodAffinity/AntiAffinity from the beginning? It should've been a good chance to do cross comparison.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the past, we could hardly go beyond 500 nodes for inter-pod affinity/anti-affinity benchmarks without having test timeouts. With the improved performance of the feature, I think we should be able to run the test for 1000 node clusters with no problem.

{nodes: 1000, existingPods: 1000, minPods: 500},
}
// The setup strategy creates pods with no affinity rules.
setupStrategy := testutils.NewSimpleWithControllerCreatePodStrategy("setup")
testBasePod := makeBasePodWithPodAffinity(
map[string]string{"foo": ""},
map[string]string{"foo": ""},
)
// The test strategy creates pods with affinity for each other.
testStrategy := testutils.NewCustomCreatePodStrategy(testBasePod)
nodeStrategy := testutils.NewLabelNodePrepareStrategy(apis.LabelZoneFailureDomain, "zone1")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just put this simple scenario as a start.

But feel free to comment if we'd like a more typical scenario like "all nodes are split into X regions, Y zones", etc.

for _, test := range tests {
name := fmt.Sprintf("%vNodes/%vPods", test.nodes, test.existingPods)
b.Run(name, func(b *testing.B) {
benchmarkScheduling(test.nodes, test.existingPods, test.minPods, setupStrategy, testStrategy, b)
benchmarkScheduling(test.nodes, test.existingPods, test.minPods, nodeStrategy, setupStrategy, testStrategy, b)
})
}
}

// BenchmarkSchedulingNodeAffinity benchmarks the scheduling rate of pods with
// NodeAffinity rules when the cluster has various quantities of nodes and
// scheduled pods.
func BenchmarkSchedulingNodeAffinity(b *testing.B) {
tests := []struct{ nodes, existingPods, minPods int }{
{nodes: 500, existingPods: 250, minPods: 250},
{nodes: 500, existingPods: 5000, minPods: 250},
{nodes: 1000, existingPods: 1000, minPods: 500},
}
// The setup strategy creates pods with no affinity rules.
setupStrategy := testutils.NewSimpleWithControllerCreatePodStrategy("setup")
testBasePod := makeBasePodWithNodeAffinity(apis.LabelZoneFailureDomain, []string{"zone1", "zone2"})
// The test strategy creates pods with node-affinity for each other.
testStrategy := testutils.NewCustomCreatePodStrategy(testBasePod)
nodeStrategy := testutils.NewLabelNodePrepareStrategy(apis.LabelZoneFailureDomain, "zone1")
for _, test := range tests {
name := fmt.Sprintf("%vNodes/%vPods", test.nodes, test.existingPods)
b.Run(name, func(b *testing.B) {
benchmarkScheduling(test.nodes, test.existingPods, test.minPods, nodeStrategy, setupStrategy, testStrategy, b)
})
}
}

// makeBasePodWithAntiAffinity creates a Pod object to be used as a template.
// makeBasePodWithPodAntiAffinity creates a Pod object to be used as a template.
// The Pod has a PodAntiAffinity requirement against pods with the given labels.
func makeBasePodWithAntiAffinity(podLabels, affinityLabels map[string]string) *v1.Pod {
func makeBasePodWithPodAntiAffinity(podLabels, affinityLabels map[string]string) *v1.Pod {
basePod := &v1.Pod{
ObjectMeta: metav1.ObjectMeta{
GenerateName: "affinity-pod-",
GenerateName: "anit-affinity-pod-",
Labels: podLabels,
},
Spec: testutils.MakePodSpec(),
Expand All @@ -99,11 +152,66 @@ func makeBasePodWithAntiAffinity(podLabels, affinityLabels map[string]string) *v
return basePod
}

// makeBasePodWithPodAffinity creates a Pod object to be used as a template.
// The Pod has a PodAffinity requirement against pods with the given labels.
func makeBasePodWithPodAffinity(podLabels, affinityZoneLabels map[string]string) *v1.Pod {
basePod := &v1.Pod{
ObjectMeta: metav1.ObjectMeta{
GenerateName: "affinity-pod-",
Labels: podLabels,
},
Spec: testutils.MakePodSpec(),
}
basePod.Spec.Affinity = &v1.Affinity{
PodAffinity: &v1.PodAffinity{
RequiredDuringSchedulingIgnoredDuringExecution: []v1.PodAffinityTerm{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar like https://github.com/kubernetes/kubernetes/pull/69898/files#r225712449, it's the most basic case. We can extend it to have multiple terms, like region in X, zone in Y, etc.

{
LabelSelector: &metav1.LabelSelector{
MatchLabels: affinityZoneLabels,
},
TopologyKey: apis.LabelZoneFailureDomain,
},
},
},
}
return basePod
}

// makeBasePodWithNodeAffinity creates a Pod object to be used as a template.
// The Pod has a NodeAffinity requirement against nodes with the given expressions.
func makeBasePodWithNodeAffinity(key string, vals []string) *v1.Pod {
basePod := &v1.Pod{
ObjectMeta: metav1.ObjectMeta{
GenerateName: "node-affinity-",
},
Spec: testutils.MakePodSpec(),
}
basePod.Spec.Affinity = &v1.Affinity{
NodeAffinity: &v1.NodeAffinity{
RequiredDuringSchedulingIgnoredDuringExecution: &v1.NodeSelector{
NodeSelectorTerms: []v1.NodeSelectorTerm{
{
MatchExpressions: []v1.NodeSelectorRequirement{
{
Key: key,
Operator: v1.NodeSelectorOpIn,
Values: vals,
},
},
},
},
},
},
}
return basePod
}

// benchmarkScheduling benchmarks scheduling rate with specific number of nodes
// and specific number of pods already scheduled.
// This will schedule numExistingPods pods before the benchmark starts, and at
// least minPods pods during the benchmark.
func benchmarkScheduling(numNodes, numExistingPods, minPods int,
nodeStrategy testutils.PrepareNodeStrategy,
setupPodStrategy, testPodStrategy testutils.TestPodCreateStrategy,
b *testing.B) {
if b.N < minPods {
Expand All @@ -115,7 +223,7 @@ func benchmarkScheduling(numNodes, numExistingPods, minPods int,

nodePreparer := framework.NewIntegrationTestNodePreparer(
c,
[]testutils.CountToStrategy{{Count: numNodes, Strategy: &testutils.TrivialNodePrepareStrategy{}}},
[]testutils.CountToStrategy{{Count: numNodes, Strategy: nodeStrategy}},
"scheduler-perf-",
)
if err := nodePreparer.PrepareNodes(); err != nil {
Expand Down