Dashboard improvements #4

Closed
7 tasks
vlerenc opened this issue Sep 22, 2020 · 3 comments
Labels
area/monitoring Monitoring (including availability monitoring and alerting) related kind/enhancement Enhancement, improvement, extension status/closed Issue is closed (either delivered or triaged)

Comments

@vlerenc
Member

vlerenc commented Sep 22, 2020

Some weeks ago, we compiled a list of dashboard improvements that seem sensible:

  • Let's add a KCM dashboard (we have CCM, MCM, etc. but no KCM dashboard so far)
  • Users could get access to K8s DaemonSets/StatefulSets/Deployments/Pods (filtered by shoot cluster resources only)
  • Users should get access to the CCM, MCM, and new KCM dashboards (ideally without resource usage, even if the dashboards then look empty; CCM and possibly KCM only look empty because insightful metrics of the kind the MCM dashboard features are not yet shown there)
  • Users should get access to the CoreDNS dashboard (as-is)
  • The nodes dashboard should also show inode usage, since it is an eviction-relevant metric
  • Prometheus lacks CPU throttling information for the kubelet and the container runtime (see also gardener#2800, "Node details dashboard has missing metrics with containerd"). Once we have it, we should add it to the system components part of the nodes dashboard
  • Ideally, the pods dashboard should feature CPU throttling information as well
@vlerenc
Member Author

vlerenc commented Sep 22, 2020

It may make sense to combine this ticket with gardener/gardener#2815.

@vlerenc vlerenc added area/monitoring Monitoring (including availability monitoring and alerting) related kind/enhancement Enhancement, improvement, extension labels Sep 28, 2020
@wyb1

wyb1 commented Nov 2, 2020

Copied the task list since I cannot check the boxes:

  • Let's add a KCM dashboard (we have CCM, MCM, etc. but no KCM dashboard so far) -> dashboard exists now
  • Users should get access to the CCM, MCM, and new KCM dashboards (ideally without resource usage, even if the dashboards then look empty; CCM and possibly KCM only look empty because insightful metrics of the kind the MCM dashboard features are not yet shown there)
    • CCM
    • KCM
    • MCM -> configure extensions, see here
  • Users should get access to the CoreDNS dashboard (as-is) -> users have access
  • The nodes dashboard should also show inode usage, since it is an eviction-relevant metric -> check cAdvisor metrics and research possible metrics
  • Prometheus lacks CPU throttling information for the kubelet and the container runtime (see also gardener#2800, "Node details dashboard has missing metrics with containerd"). Once we have it, we should add it to the system components part of the nodes dashboard -> check for existing metrics and research what is available
    • Ideally, the pods dashboard should feature CPU throttling information as well
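
As a starting point for the metric research on the last two items, here is a minimal sketch of candidate PromQL expressions, built as strings in Python. It assumes the standard cAdvisor counters `container_cpu_cfs_throttled_periods_total` and `container_cpu_cfs_periods_total` and the node-exporter gauges `node_filesystem_files_free` and `node_filesystem_files` are scraped; whether they are available in this setup, and whether they cover the kubelet and container runtime themselves rather than only pod containers, is not confirmed here. The function names are hypothetical.

```python
def cpu_throttling_ratio_query(window: str = "5m") -> str:
    """Fraction of CFS periods in which a container was throttled.

    Assumes cAdvisor container metrics are scraped (not verified for this setup).
    """
    by = "sum by (namespace, pod, container)"
    throttled = f"{by} (rate(container_cpu_cfs_throttled_periods_total[{window}]))"
    total = f"{by} (rate(container_cpu_cfs_periods_total[{window}]))"
    return f"{throttled} / {total}"


def inodes_free_ratio_query(mountpoint: str = "/") -> str:
    """Fraction of free inodes on a node filesystem.

    Assumes node-exporter filesystem metrics are scraped (not verified for this setup).
    """
    selector = f'{{mountpoint="{mountpoint}"}}'
    return f"node_filesystem_files_free{selector} / node_filesystem_files{selector}"


if __name__ == "__main__":
    print(cpu_throttling_ratio_query())
    print(inodes_free_ratio_query())
```

The inode ratio can be compared against the kubelet's `nodefs.inodesFree` eviction threshold configured for the cluster, which is what makes it eviction-relevant.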

Currently out of scope for monitoring:

  • Users could get access to K8s DaemonSets/StatefulSets/Deployments/Pods (filtered by shoot cluster resources only) -> Could be interesting in the future for a user-facing monitoring solution. The current focus is only on components managed by Gardener and its extensions.

@gardener-robot gardener-robot added the lifecycle/stale Nobody worked on this for 6 months (will further age) label Sep 22, 2021
@gardener-robot gardener-robot added lifecycle/rotten Nobody worked on this for 12 months (final aging stage) and removed lifecycle/stale Nobody worked on this for 6 months (will further age) labels Mar 24, 2022
@wyb1 wyb1 removed the lifecycle/rotten Nobody worked on this for 12 months (final aging stage) label Jun 3, 2022
@wyb1

wyb1 commented Jun 3, 2022

Many of these items have been completed. If new improvements are required, we can create a new issue.

@wyb1 wyb1 closed this as completed Jun 3, 2022
@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Jun 3, 2022

3 participants