Direct Scheduler throughput e2e test-suite #2681

Open
hakuna-matatah opened this issue May 14, 2024 · 2 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@hakuna-matatah
Contributor

hakuna-matatah commented May 14, 2024

What would you like to be added:
We need a test suite that measures scheduler throughput directly, without depending on KCM QPS limits.

Why is this needed:

  • Today our load test only exercises scheduler throughput at a very small scale (max-pods is 5k), and it is also limited by KCM throughput, because the current test suite depends on the deployment/replicaset controllers to create pods. That prevents us from knowing the actual throughput the scheduler can achieve today.

  • Also, some K8s users create pods directly and rely only on the scheduler, without going through KCM-driven wrappers such as the deployment/job/replicaset controllers. We need a test suite that shows the limits of scheduler throughput independent of KCM bottlenecks/regressions.

  • We could run this test suite periodically in test-infra to detect regressions in maximum scheduler throughput in our e2e tests.
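To make the idea concrete, here is a minimal sketch (not the proposed test suite itself) of the two pieces such a test would need: standalone Pod objects that have no Deployment/ReplicaSet owner, so KCM is not in the creation path, and a throughput figure derived from when each pod was scheduled. The function names, the `sched-throughput` namespace, the `direct-pod-N` naming, and the pause image tag are all assumptions for illustration. Throughput is approximated here as pods scheduled divided by the time between the first and last scheduling event.

```python
from datetime import datetime, timedelta, timezone


def bare_pod_manifest(index, namespace="sched-throughput"):
    """Build a standalone Pod manifest with no owning controller.

    The namespace, name prefix, and image are illustrative assumptions.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"direct-pod-{index}", "namespace": namespace},
        "spec": {
            "containers": [
                {"name": "pause", "image": "registry.k8s.io/pause:3.9"}
            ]
        },
    }


def scheduling_throughput(scheduled_times):
    """Approximate pods scheduled per second.

    Computed as the number of pods divided by the time between the first
    and last scheduling timestamp. Returns 0.0 when the window cannot be
    measured (fewer than two timestamps, or all timestamps identical).
    """
    if len(scheduled_times) < 2:
        return 0.0
    window = (max(scheduled_times) - min(scheduled_times)).total_seconds()
    if window <= 0:
        return 0.0
    return len(scheduled_times) / window


if __name__ == "__main__":
    # Hypothetical data: 11 pods scheduled 0.5 seconds apart.
    start = datetime(2024, 5, 14, tzinfo=timezone.utc)
    times = [start + timedelta(seconds=0.5 * i) for i in range(11)]
    pods = [bare_pod_manifest(i) for i in range(len(times))]
    print(len(pods), "manifests,", scheduling_throughput(times), "pods/s")
```

In a real run the timestamps would come from the cluster rather than being synthesized, for example from the `lastTransitionTime` of each pod's `PodScheduled` condition, and the manifests would be submitted by whatever API client the test framework already uses.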

hakuna-matatah added the kind/feature label May 14, 2024
@hakuna-matatah
Contributor Author

hakuna-matatah commented May 14, 2024

I discussed this in the sig-scalability meeting here and got consensus.

Following the sig-scalability discussion, it also appears that sig-scheduling folks need a way to catch scheduler throughput regressions. For context, see https://kubernetes.slack.com/archives/C09QZTRH7/p1715262959575039

/cc @wojtek-t @marseel

@hakuna-matatah
Contributor Author

/cc @dims @mengqiy @shyamjvs - fyi
