
Introduce a churnOp to scheduler perf testing framework #98900

Merged: 1 commit merged into kubernetes:master on Mar 11, 2021

Conversation

@Huang-Wei (Member) commented Feb 9, 2021

What type of PR is this?

/kind feature
/sig scheduling

What this PR does / why we need it:

Introduce a churn operator to the scheduler perf testing framework. The churn operator provides configurable options to continuously churn the cluster by creating and recreating API objects. The new fields are explained below:

  • opcode: the value must be churn
  • templatePaths: an array specifying spec files for different API objects. For instance:
    templatePaths:
    - config/churn/node-default.yaml
    - config/churn/service-default.yaml
  • mode: two modes are supported for now:
    • recreate: creates N objects and then deletes the just-created N objects. N is specified by number
    • create: continuously create objects until a threshold is met. The threshold is specified by number
  • number: for mode create, the number of objects to create before the op stops (0 means keep creating until the benchmark finishes); for mode recreate, the N in each create-N-objects-then-delete-N-objects cycle, which repeats until the benchmark finishes. A combined example is sketched below.
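
Putting these fields together, a churn op entry in a workload config might look like the sketch below. This is illustrative only: the intervalMilliseconds field is taken from the review discussion further down, and the exact values are arbitrary.

  - opcode: churn
    mode: recreate
    number: 1
    intervalMilliseconds: 1000
    templatePaths:
    - config/churn/node-default.yaml
    - config/churn/service-default.yaml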

Which issue(s) this PR fixes:

Fixes the third issue listed in #98898.

Special notes for your reviewer:

Some examples:

  • templatePaths = [churn/node.yaml, churn/service.yaml], mode = recreate, number = 1
    [DEBUG] create name = node-churn-rgx94
    [DEBUG] create name = service-churn-vjcp7
    [DEBUG] create name = node-churn-ztvgz
    [DEBUG] create name = service-churn-c4tnc
    [DEBUG] delete name = node-churn-rgx94
    [DEBUG] delete name = service-churn-vjcp7
    [DEBUG] delete name = node-churn-ztvgz
    [DEBUG] delete name = service-churn-c4tnc
    [DEBUG] create name = node-churn-82lp8
    [DEBUG] create name = service-churn-nmz8l
    [DEBUG] create name = node-churn-lggct
    [DEBUG] create name = service-churn-r6cfr
    [DEBUG] delete name = node-churn-82lp8
    [DEBUG] delete name = service-churn-nmz8l
    [DEBUG] delete name = node-churn-lggct
    [DEBUG] delete name = service-churn-r6cfr
    [DEBUG] create name = node-churn-rzdzv
    [DEBUG] create name = service-churn-g7dgf
    # benchmark test stops.
    
  • templatePaths = [churn/node.yaml, churn/service.yaml], mode = recreate, number = 3
    [DEBUG] create name = node-churn-52ptg
    [DEBUG] create name = service-churn-nxhqj
    [DEBUG] create name = node-churn-tmvwz
    [DEBUG] create name = service-churn-hxnnz
    [DEBUG] create name = node-churn-lj9w5
    [DEBUG] create name = service-churn-kc5cd
    [DEBUG] delete name = node-churn-52ptg
    [DEBUG] delete name = service-churn-nxhqj
    [DEBUG] delete name = node-churn-tmvwz
    [DEBUG] delete name = service-churn-hxnnz
    [DEBUG] delete name = node-churn-lj9w5
    [DEBUG] delete name = service-churn-kc5cd
    # benchmark test stops.
    
  • templatePaths = [churn/node.yaml, churn/service.yaml], mode = create, number = 0 (0 = infinite)
    [DEBUG] create name = node-churn-vkvlt
    [DEBUG] create name = service-churn-tbkvf
    [DEBUG] create name = node-churn-hpfp8
    [DEBUG] create name = service-churn-c5xl5
    [DEBUG] create name = node-churn-lddzf
    [DEBUG] create name = service-churn-czxfw
    [DEBUG] create name = node-churn-pnf6x
    [DEBUG] create name = service-churn-fpd97
    [DEBUG] create name = node-churn-mjp4w
    [DEBUG] create name = service-churn-gpssn
    # benchmark test stops.
    
  • templatePaths = [churn/node.yaml, churn/service.yaml], mode = create, number = 2
    [DEBUG] create name = node-churn-txcbv
    [DEBUG] create name = service-churn-wlpwj
    [DEBUG] create name = node-churn-fq4xd
    [DEBUG] create name = service-churn-vfv2s
    # benchmark test has not stopped yet
    # no further output from the churn operator
    # benchmark test stops.
    

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 9, 2021
@Huang-Wei (Member Author):

/assign @adtac

@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Feb 9, 2021
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Huang-Wei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 9, 2021
@alculquicondor (Member):

Why does creating a service generate requeuing? Is this because of a non-default plugin?

If so, this requeuing event won't be very helpful in the future. Rather, we can test the most critical paths: node creation, tolerations change, pod affinity, etc.

@Huang-Wei (Member Author):

Why does creating a service generate requeuing? Is this because of a non-default plugin?

Our current logic moves pods anyway upon Service events, so a Service generator can be an easy way to simulate churn.

If so, this requeuing event won't be very helpful in the future.

Probably, especially if we remove the service event handlers when ServiceAffinity gets deprecated.

Rather, we can test the most critical paths: node creation, tolerations change, pod affinity, etc.

We should definitely look into introducing those events, as well as Volume events, but again, a Service generator is a good start.

@alculquicondor (Member):

Probably, esp. if we remove the service event handlers when ServiceAffinity gets deprecated

Not just that, and it's happening soon regardless. With your change, since the default plugins don't include ServiceAffinity, we would effectively remove any reactions to service events.

So, what I'm saying is that this test is not very future proof.

@Huang-Wei Huang-Wei force-pushed the churn-cluster-op branch 2 times, most recently from 6397b83 to b553c83 on February 11, 2021 05:17
@Huang-Wei (Member Author):

@alculquicondor I updated the PR to support adding/deleting API objects in an interleaved fashion. The API objects include Nodes/Pods/Services for now.

@@ -43,6 +44,7 @@ const (
configFile = "config/performance-config.yaml"
createNodesOpcode = "createNodes"
createPodsOpcode = "createPods"
churnOpcode = "churn"
Member:

let's name this in a way that test configurations are more readable. Perhaps continuousRecreates or noiseRecreates.

Member Author:

We used churn a lot and it's the most readable wording to me :) But I'm not a naming expert, @ahg-g @adtac any naming thoughts?

Member:

I prefer churn too. Maybe "backgroundChurn" to emphasise that this is non-blocking? would also somewhat cover @alculquicondor's suggestion

Member:

  - opcode: churn
    serviceTemplatePath: config/churn/service-default.yaml
    intervalMilliseconds: 1000

this doesn't make sense to me. It almost feels like networking churn. Maybe recreatesChurn

Member Author:

The pure-service churn is an example of evaluating how Service events may or may not churn the cluster. I removed it to postpone the debate over how practical it could be - I'd like to reach a consensus on what a churn op looks like first; in follow-up PRs we can add/discuss test cases on demand, as well as how aggressively a test case can churn the cluster.

The current logic creates & deletes customizable API objects in an interleaved fashion. Other options are:

  • create & delete N API objects at a time, interleaved
  • keep creating API objects (without deleting them)

@alculquicondor @adtac WDYT?

Member:

can we have something like this:

  - opcode: apiChurn
    mode: recreate
    serviceTemplatePath: config/churn/service-default.yaml
    intervalMilliseconds: 1000

Note that I'm not referring to the specific test case, just how the test description and configuration looks like.

Member Author:

SG. So we have different churn modes/patterns to choose from.

  • recreate (or continuous-create)
  • ephemeral-create (create & delete a single object)
  • round-robin-create (create N objects, and then delete N objects, N can be configurable)
  • and probably incorporate update events

BTW: when you say recreate, do you mean continuously creating API objects?

@@ -449,3 +449,32 @@
initNodes: 5000
initPods: 2000
measurePods: 5000

- name: PendingPodsWithMixedChurn
Member:

I have another idea, which perhaps is more realistic and something we would like tested.

Create pods first (they are unschedulable). Then create Nodes (perhaps have a ticker for it). We would have to combine that with pending pods that are unschedulable for different reasons (like waiting for pod affinity).

@adtac didn't we have a test similar to what I'm describing?

Member:

Another very important one:
New nodes show up, but with not-ready taint, then the taint is removed.

Member Author:

Create pods first (they are unschedulable). Then create Nodes (perhaps have a ticker for it).

SG. However, in that case we need to enable both SkipWaitToCompletion and collectMetrics, which is not supported yet:

// CollectMetrics and SkipWaitToCompletion can never be true at the
// same time, so if we're here, it means that all pods have been
// scheduled.

Member:

didn't we have a test similar to what I'm describing?

no, I don't think we did. Multiple createNodes in one workload was one of my motivations for the scheduler_perf rewrite, but I never got around to opening a PR with such a workload even though I tested it locally :/

However, in that case we need to enable both SkipWaitToCompletion and collectMetrics

True. Unfortunately, adding support for specifying both parameters seems to be quite difficult because of how metrics/throughput collection works.

The node taint one is a good idea 👍 but, of course, it requires a modifyNodes op or something like that.

@Huang-Wei (Member Author):

/retest

test/integration/scheduler_perf/scheduler_perf_test.go: two outdated review threads, resolved.
countParam: $initPods
podTemplatePath: config/pod-high-priority-large-cpu.yaml
skipWaitToCompletion: true
- opcode: churn
Member:

should the churn pods be in the same namespace as the createPods op? I can't remember if that makes a difference in the scheduler

Member Author:

I assume users can provide a namespace name in the configuration if they think the namespace matters.


@Huang-Wei (Member Author):

/retest

@Huang-Wei (Member Author) commented Mar 2, 2021

Following up on #98900 (comment), I rewrote this PR to provide different modes (currently create and recreate) for creating/recreating API objects, and introduced a dynamic client so that templatePaths can accept arbitrary spec files.

@alculquicondor @adtac Please see the updated description for more details.

barrierOpcode = "barrier"

Recreate = "recreate"
Member:

these ones deserve a comment

Member Author:

done.

b.Fatalf("op %d: unable to parse the %v-th template path: %v", opIndex, i, err)
}
// Obtain GVR.
mapping, err := restMapper.RESTMapping(gvk.GroupKind(), gvk.Version)
Member:

Interesting. I didn't know this was possible.
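
For context, the pattern under discussion (decode an arbitrary template into an unstructured object, map its GVK to a GVR via the RESTMapper, then create it through the dynamic client) can be sketched as follows. This is a hypothetical helper for illustration, not code from this PR:

import (
	"context"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	yamlserializer "k8s.io/apimachinery/pkg/runtime/serializer/yaml"
	"k8s.io/client-go/dynamic"
)

// createFromTemplate is a hypothetical helper (not from this PR): it decodes a
// YAML spec into an Unstructured object, resolves the object's GVK to a GVR
// through the RESTMapper, and creates it with the dynamic client.
func createFromTemplate(ctx context.Context, dynClient dynamic.Interface, restMapper meta.RESTMapper, specYAML []byte, namespace string) error {
	obj := &unstructured.Unstructured{}
	// Decode the YAML and recover the object's GroupVersionKind.
	_, gvk, err := yamlserializer.NewDecodingSerializer(unstructured.UnstructuredJSONScheme).Decode(specYAML, nil, obj)
	if err != nil {
		return err
	}
	// Map the kind (e.g. "Service") to its REST resource (e.g. "services").
	mapping, err := restMapper.RESTMapping(gvk.GroupKind(), gvk.Version)
	if err != nil {
		return err
	}
	// Namespaced resources need a namespace; cluster-scoped ones (e.g. Node) do not.
	if mapping.Scope.Name() == meta.RESTScopeNameNamespace {
		_, err = dynClient.Resource(mapping.Resource).Namespace(namespace).Create(ctx, obj, metav1.CreateOptions{})
	} else {
		_, err = dynClient.Resource(mapping.Resource).Create(ctx, obj, metav1.CreateOptions{})
	}
	return err
}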

client := clientset.NewForConfigOrDie(cfg)
dynClient := dynamic.NewForConfigOrDie(cfg)
cachedClient := cacheddiscovery.NewMemCacheClient(client.Discovery())
restMapper := restmapper.NewDeferredDiscoveryRESTMapper(cachedClient)
Member:

super nit: calculate restMapper outside of this function, as only one test cares about it.

Member Author:

done.

@alculquicondor (Member) left a comment:

good for squash to me.
Anything to add @adtac?

barrierOpcode = "barrier"

// Two modes supported in "churn" operator.
Member:

nit: add empty line, as this comment refers to more than one constant

Member Author:

done.

- support two modes: recreate and create
- use DynamicClient to create API objects
@Huang-Wei (Member Author):

/retest

@Huang-Wei (Member Author):

/priority important-soon
/triage accepted

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 3, 2021
@adtac (Member) left a comment:

/lgtm
/hold for nit since this is approved, feel free to remove and merge
sorry I couldn't review this earlier, I was busy with the code freeze :) we're still within the test freeze, so this PR should be fine for 1.21

@@ -116,7 +116,7 @@ type testConfig struct {

// getBaseConfig returns baseConfig after initializing number of nodes and pods.
func getBaseConfig(nodes int, pods int) *testConfig {
destroyFunc, podInformer, clientset := mustSetupScheduler()
destroyFunc, podInformer, clientset, _ := mustSetupScheduler()
Member:

nit: client

Member Author:

IMO there is not much difference among the names "client", "cs" and "clientset"; they're usually used interchangeably. As this file is deprecated, I left the name as is. In the places that jointly use a clientset and a dynamic clientset, I renamed them to client and dynamicClient.

So I'm going to:
/hold cancel

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Mar 10, 2021
@ahg-g (Member) commented Mar 10, 2021

/milestone v1.21

This PR improves testing, hence eligible for the milestone.

@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Mar 10, 2021
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 10, 2021
@Huang-Wei (Member Author):

/retest

@k8s-ci-robot (Contributor) commented Mar 11, 2021

@Huang-Wei: The following test failed, say /retest to rerun all failed tests:

Test name: pull-kubernetes-bazel-build
Commit: 10676015aad2f00c39ac09604070e0d129a4bc07
Rerun command: /test pull-kubernetes-bazel-build

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@Huang-Wei (Member Author):

/retest

@k8s-ci-robot k8s-ci-robot merged commit 823fa75 into kubernetes:master Mar 11, 2021
@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Mar 29, 2021