Improve new control plane dashboards #206

Open
metalmatze opened this issue May 17, 2019 · 2 comments

@metalmatze
Member

commented May 17, 2019

With #205 merged we have a few new dashboards for the control plane (apiserver, scheduler, proxy, kubelet).

Here are a few TODOs outlined for the future:

  • We should unclutter the names and separate components and workload dashboards more.
  • We should make sure that component alerts are represented in dashboards. Example: KubeAPIErrorsHigh needs to be visible in the apiserver dashboard, reusing the same recording rule.
  • Reuse more recording rules for control plane dashboards (lots of similar queries across dashboards).
  • Go metrics about components should probably be separated. Either own dashboard or no need at all? Let's discuss.
  • Add more from the discussion below
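
To illustrate the recording-rule reuse mentioned in the list above, here is a minimal sketch of the idea: one recorded series feeds both the alert and the dashboard panel, so the two cannot drift apart. Everything named here is an assumption for illustration only: the record name, the metric name in the expression, the threshold, and the helper functions are hypothetical and not taken from the mixin.

```python
# Sketch only: the record name, expression, and threshold below are
# hypothetical placeholders, not the mixin's actual rules.
RECORD = "cluster:apiserver_request_errors:ratio_rate5m"  # assumed name
EXPR = (
    'sum(rate(apiserver_request_total{code=~"5.."}[5m]))'
    " / sum(rate(apiserver_request_total[5m]))"
)  # assumed metric name


def recording_rule():
    """Prometheus recording rule that precomputes the error ratio once."""
    return {"record": RECORD, "expr": EXPR}


def alert_rule(threshold=0.03):
    """Alert that reads the recorded series instead of repeating the query."""
    return {
        "alert": "KubeAPIErrorsHigh",
        "expr": f"{RECORD} > {threshold}",
        "for": "10m",
        "labels": {"severity": "critical"},
    }


def dashboard_target():
    """Grafana panel target that plots the same recorded series."""
    return {"expr": RECORD, "legendFormat": "error ratio"}


if __name__ == "__main__":
    for rule in (recording_rule(), alert_rule(), dashboard_target()):
        print(rule)
```

With this shape, changing the expression in one place updates what the alert fires on and what the apiserver dashboard shows at the same time.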

/cc @povilasv @brancz

@brancz

Member

commented May 17, 2019

The differentiation between incoming and outgoing HTTP requests is not super obvious right now; I think explicitly labeling those would be good.

@povilasv

Contributor

commented May 17, 2019

> Go metrics about components should probably be separated. Either own dashboard or no need at all? Let's discuss.

I find those really useful when you are debugging a control plane failure, OOMs / crashloops etc.

The way I envisioned this is that SREs get a single view where they look for symptoms, like "CPU is going through the roof and we are getting tons of requests" type of deal.

Agreed on all other points.
