This repository has been archived by the owner on Apr 18, 2023. It is now read-only.

Benchmark ResourceRequest objects in cluster #3

Closed
ghost opened this issue Feb 8, 2022 · 6 comments


ghost commented Feb 8, 2022

etcd publishes guidance on the approximate maximum amount of data it can handle for a given cluster size / hardware: https://etcd.io/docs/v3.5/op-guide/hardware/#example-hardware-configurations

Benchmark the number of resolved ResourceRequest objects a cluster can support before performance is negatively impacted, and document the results here. We are specifically interested in the total size of data in the cluster, so make sure to benchmark with many 1.5MB ResourceRequests.
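For a starting point before any benchmarking, a back-of-the-envelope estimate (assuming etcd's documented default backend quota of 2 GiB, configurable via `--quota-backend-bytes`) gives a rough upper bound on how many 1.5MB objects could fit:

```python
# Rough capacity estimate: how many 1.5 MB resolved ResourceRequests fit
# under etcd's default backend quota. Assumes the documented 2 GiB default
# for --quota-backend-bytes; real capacity is lower because etcd also
# stores every other object in the cluster plus MVCC history until
# compaction runs.
ETCD_DEFAULT_QUOTA_BYTES = 2 * 1024**3        # 2 GiB default quota
RESOURCE_REQUEST_BYTES = 1_500_000            # one 1.5 MB ResourceRequest

max_objects = ETCD_DEFAULT_QUOTA_BYTES // RESOURCE_REQUEST_BYTES
print(max_objects)  # → 1431, an upper bound ignoring all other cluster state
```

The benchmark should tell us how far below this theoretical ceiling performance actually degrades.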


ghost commented Feb 8, 2022

Write scripts that:

  • Create a lot of big resolved ResourceRequest objects
  • Create a lot of unresolved ResourceRequest objects that resolve successfully
  • Create a lot of unresolved ResourceRequest objects that are invalid / fail to resolve
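A minimal sketch of the first script, generating padded manifests to apply with kubectl. The apiVersion, kind, and `status.data` field are assumptions for illustration; the real schema comes from the Tekton Resolution CRD:

```python
import json

def make_resource_request(name: str, payload_bytes: int) -> dict:
    """Build a hypothetical resolved ResourceRequest manifest padded to
    roughly payload_bytes of resolved data. Field names are illustrative;
    check the Tekton Resolution CRD for the actual schema."""
    return {
        "apiVersion": "resolution.tekton.dev/v1alpha1",  # assumed group/version
        "kind": "ResourceRequest",
        "metadata": {"name": name},
        "status": {
            # pad the resolved data to the target size to stress etcd
            "data": "x" * payload_bytes,
        },
    }

def generate_manifests(count: int, payload_bytes: int = 1_500_000) -> list:
    """Generate `count` manifests, each carrying ~1.5 MB of resolved data."""
    return [make_resource_request(f"bench-{i}", payload_bytes) for i in range(count)]

if __name__ == "__main__":
    manifests = generate_manifests(3)
    # each serialized object is slightly larger than its payload
    print(len(json.dumps(manifests[0])))
```

The other two scripts would follow the same shape, but emit unresolved requests (no `status.data`) with valid and deliberately invalid resolver parameters respectively.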

Things to look into:

  • Effect on overall cluster health, e.g. the latency and failure rate of requests to the kube-apiserver
  • Effect on memory consumption: lister caches of the resolvers / ResourceRequest reconciler
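For the latency / failure-rate side of the list above, one approach is to scrape the kube-apiserver's `/metrics` endpoint and track the standard `apiserver_request_total` and `apiserver_request_duration_seconds` metrics. A minimal parser for the Prometheus text format, shown against sample data (a real setup would use the prometheus_client library or PromQL instead):

```python
def sum_counter(metrics_text: str, name: str, **labels: str) -> float:
    """Sum all samples of a Prometheus counter whose labels match.
    Minimal parser for the text exposition format, for illustration only."""
    total = 0.0
    for line in metrics_text.splitlines():
        if not line.startswith(name + "{"):
            continue
        label_blob, value = line.rsplit(" ", 1)
        if all(f'{k}="{v}"' in label_blob for k, v in labels.items()):
            total += float(value)
    return total

# Hypothetical sample scraped from the kube-apiserver /metrics endpoint.
sample = """\
apiserver_request_total{code="200",verb="GET"} 120
apiserver_request_total{code="500",verb="GET"} 3
apiserver_request_total{code="200",verb="POST"} 40
"""
errors = sum_counter(sample, "apiserver_request_total", code="500")
all_requests = sum_counter(sample, "apiserver_request_total")
print(errors / all_requests)  # failure rate before/during/after the benchmark
```

Comparing these numbers before, during, and after the benchmark run would show whether heavy ResourceRequest load degrades the apiserver.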


ghost commented Mar 8, 2022

See The Four Golden Signals for an idea of where to start when monitoring the effects and outcomes of heavy Tekton Resolution usage. The signals are: Latency, Traffic, Errors, and Saturation.
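Mapped onto the standard kube-apiserver and etcd metrics, the four signals might look like the following PromQL sketches (metric names are the standard ones exposed by those components; the label selectors and windows are illustrative):

```
# Latency: p99 apiserver request duration
histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket[5m])) by (le, verb))

# Traffic: apiserver request rate
sum(rate(apiserver_request_total[5m])) by (verb)

# Errors: fraction of 5xx responses
sum(rate(apiserver_request_total{code=~"5.."}[5m])) / sum(rate(apiserver_request_total[5m]))

# Saturation: etcd database size relative to its quota
etcd_mvcc_db_total_size_in_bytes / etcd_server_quota_backend_bytes
```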

@tekton-robot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot

@tekton-robot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
