
Check watch terminations from clusters loader tests #2054

Open
wojtek-t opened this issue Apr 28, 2022 · 7 comments
Labels

  • good first issue: Denotes an issue ready for a new contributor, according to the "help wanted" guidelines.
  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/feature: Categorizes issue or PR as related to a new feature.

Comments

@wojtek-t (Member) commented Apr 28, 2022

One of the important metrics that may suggest overload of the control plane is the number of watches closed by kube-apiserver because they don't keep up (or because the watch cache itself is not keeping up).
We want to add a check to our tests that validates that this metric is not too high.

Metrics to exercise:

The easiest way to do it is probably to add it to a Prometheus-based measurement, but @marseel should confirm.
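
For concreteness: kube-apiserver does export a counter of watchers it force-closes because they cannot keep up, `apiserver_terminated_watchers_total`. Whether that is the metric the (elided) list above refers to is an assumption; under that assumption, the check boils down to a query like this sketch:

```yaml
# Sketch only: assumes apiserver_terminated_watchers_total is the metric
# meant by the elided list above. It counts watchers closed by
# kube-apiserver because they (or the watch cache) could not keep up.
queries:
- name: TerminatedWatchers
  query: sum(increase(apiserver_terminated_watchers_total[5m]))
```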

@wojtek-t added the help wanted, kind/feature, and good first issue labels on Apr 28, 2022
@marseel (Member) commented Apr 28, 2022

I believe the easiest way is to add a new Prometheus query to GenericPrometheusQuery instead of creating a new measurement.
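
For illustration, a hypothetical test-config fragment showing what such a query could look like when attached to the GenericPrometheusQuery measurement. The measurement exists in clusterloader2, but the exact parameter names, the `%v` duration placeholder, and the threshold value below are assumptions from memory, not verified against the code:

```yaml
# Hypothetical fragment; parameter names and threshold are assumptions.
- Identifier: WatchTerminations
  Method: GenericPrometheusQuery
  Params:
    action: start              # paired with a later "action: gather" step
    metricName: WatchTerminations
    metricVersion: v1
    unit: count
    queries:
    - name: TerminatedWatchers
      # %v is assumed to expand to the measurement window.
      query: sum(increase(apiserver_terminated_watchers_total[%v]))
      threshold: 100           # illustrative value only
```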

@anshulinteg

/assign

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Aug 14, 2022
@marseel (Member) commented Aug 17, 2022

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label on Aug 17, 2022
@bouaouda-achraf (Contributor)

/assign

@wojtek-t (Member, Author)

The only missing thing now is to enable the new measurement in our tests, right?

@marseel (Member) commented Nov 16, 2022

Enabling the measurement, and then, based on the results, possibly adding alerting to it; a sketch of what that could look like follows.
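
Concretely, "enabling" here presumably means referencing the measurement from the test configs, and "alerting" means treating threshold violations as test failures once sane limits are known from real runs. A hypothetical gather step, mirroring the start step sketched earlier; the enableViolations parameter name is an assumption:

```yaml
# Hypothetical: gather results at the end of the test and (once thresholds
# are tuned from observed results) turn violations into test failures.
- Identifier: WatchTerminations
  Method: GenericPrometheusQuery
  Params:
    action: gather
    enableViolations: true     # param name is an assumption
```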
