Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add job load test: create multiple jobs based on the number of nodes #1998

Merged
merged 2 commits into from Mar 17, 2022

Conversation

alculquicondor
Copy link
Member

@alculquicondor alculquicondor commented Mar 1, 2022

What type of PR is this?

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #799

Special notes for your reviewer:

It was not trivial to gather metrics from kcm, so I'm leaving that as a follow up.

Because of this, I increased the resolution of WaitForJobsToFinish gather interval.

Does this PR introduce a user-facing change?:

Add a load test for the Job controller

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 1, 2022
@wojtek-t
Copy link
Member

wojtek-t commented Mar 2, 2022

/assign @marseel

@alculquicondor
Copy link
Member Author

Thanks, but no need to review just yet. I'm still trying to get it running :)

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 14, 2022
@alculquicondor alculquicondor changed the title WIP Gather metrics from the job controller Create multiple jobs based on the number of nodes Mar 14, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 14, 2022
@alculquicondor alculquicondor changed the title Create multiple jobs based on the number of nodes job load test: create multiple jobs based on the number of nodes Mar 14, 2022
@alculquicondor alculquicondor changed the title job load test: create multiple jobs based on the number of nodes Add job load test: create multiple jobs based on the number of nodes Mar 14, 2022
@alculquicondor
Copy link
Member Author

This is now ready for review @jprzychodzen

@alculquicondor alculquicondor force-pushed the job-metrics branch 2 times, most recently from 89cf0ed to 096dd71 Compare March 14, 2022 17:41
@alculquicondor
Copy link
Member Author

/retest

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 14, 2022
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 14, 2022
Separate Indexed from NonIndexed to compare.
@alculquicondor
Copy link
Member Author

or perhaps @marseel

@@ -0,0 +1 @@
MODE: Indexed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, newline

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to remove the file instead.

@@ -1,12 +1,22 @@
{{$mode := (DefaultParam .MODE "Indexed")}}
{{$pods_per_node_per_size := 20}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not 10 as default and guard it begin parameter?

30 is a suggested number of Pods per Node in highly scalable cluster. As there are 3 sizes (small/medium/large), each would get 10 pods per node.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Updated.

{{$medium_jobs_count := DivideInt $total_pods_per_size $medium_job_size}}
{{$large_job_size := 400}}
{{$large_jobs_count := DivideInt $total_pods_per_size $large_job_size}}

name: batch

namespace:
number: 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We support only 3000 Pods per Namespace, see [1]. This means that we need single Namespace per 100 Nodes (or 3000 Pods)

[1]https://github.com/kubernetes/community/blob/master/sig-scalability/configs-and-limits/thresholds.md

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the template to have parameters similar to load.

qpsLoad:
qps: 1
qps: 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you want to use so low constant value?

For CL2 test we are using --env=CL2_LOAD_TEST_THROUGHPUT=50 to determine QPS in saturation tunning test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a parameter for it.

And added parameters for nodes per namespace and throughput
@jprzychodzen
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 17, 2022
@marseel
Copy link
Member

marseel commented Mar 17, 2022

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, jprzychodzen, marseel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 17, 2022
@k8s-ci-robot k8s-ci-robot merged commit 4d7fdf3 into kubernetes:master Mar 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Load test: Test run-to-completion workflow for Job objects
6 participants