[GEP-19] Add sidecar to Plutono for fetching dashboard ConfigMaps dynamically #9624

Merged on Apr 25, 2024 (11 commits)

Conversation

rfranzke
Member

@rfranzke rfranzke commented Apr 19, 2024

How to categorize this PR?

/area monitoring
/kind enhancement

What this PR does / why we need it:
This PR implements this aspect of GEP-19. Today, when Plutono is deployed, all dashboards must be collected up front and mounted into the pod. With this PR, dashboards can be added or removed dynamically, without restarting the pod, by labelling the ConfigMaps that contain the dashboard JSON documents with `dashboard.monitoring.gardener.cloud/{garden,seed,shoot}=true`.
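To illustrate the labelling contract described above, here is a minimal sketch of the selection rule in Go. Only the label key comes from this PR; the helper name `isDashboardFor` is hypothetical, and the real selection is performed by the sidecar's label selector rather than by code like this.

```go
package main

import "fmt"

// dashboardLabelPrefix is the label key prefix named in the PR description.
const dashboardLabelPrefix = "dashboard.monitoring.gardener.cloud/"

// isDashboardFor reports whether a ConfigMap carrying the given labels is
// marked as a dashboard for the given Plutono instance ("garden", "seed" or
// "shoot"). Hypothetical helper, for illustration only.
func isDashboardFor(labels map[string]string, target string) bool {
	return labels[dashboardLabelPrefix+target] == "true"
}

func main() {
	labels := map[string]string{"dashboard.monitoring.gardener.cloud/shoot": "true"}
	fmt.Println(isDashboardFor(labels, "shoot")) // true
	fmt.Println(isDashboardFor(labels, "seed"))  // false
}
```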

The approach is to use https://github.com/kiwigrid/k8s-sidecar, a sidecar that watches for such labelled ConfigMaps and writes the JSON documents to files in a shared folder, from which the Plutono container picks up the dashboards. To make Plutono reload the dashboards when something changes, the sidecar calls the respective "reload" admin API.
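The two steps the sidecar performs (write the JSON document to the shared folder, then call the reload API) can be sketched roughly as follows. This is not the k8s-sidecar implementation. The endpoint path is an assumption: it is the dashboard provisioning reload path of Grafana, from which Plutono is forked. The base URL, user name, password, and helper names are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"path/filepath"
)

// reloadPath is an assumption: Grafana's admin endpoint for reloading
// provisioned dashboards, presumed to be unchanged in Plutono.
const reloadPath = "/api/admin/provisioning/dashboards/reload"

// writeDashboard stores one dashboard JSON document in the shared folder
// and returns the path of the written file.
func writeDashboard(dir, name string, doc []byte) (string, error) {
	path := filepath.Join(dir, name)
	return path, os.WriteFile(path, doc, 0o644)
}

// newReloadRequest builds the POST request, authenticated via basic auth,
// that asks Plutono to re-read the dashboard files.
func newReloadRequest(baseURL, user, password string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+reloadPath, nil)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(user, password)
	return req, nil
}

func main() {
	// A temporary directory stands in for the folder shared between containers.
	dir, err := os.MkdirTemp("", "dashboards")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	if _, err := writeDashboard(dir, "example.json", []byte(`{"title":"example"}`)); err != nil {
		panic(err)
	}

	// Hypothetical address and credentials; the request is built but not sent.
	req, err := newReloadRequest("http://localhost:3000", "admin", "secret")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

The request is only constructed here so that the sketch runs without a Plutono instance; sending it would be a call to `http.DefaultClient.Do(req)`.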

Calling this API requires credentials, and the API was effectively disabled previously. Now, we enable it and generate a basic auth secret which is auto-rotated every 30d. The API is only accessible from within the cluster and is blocked at the Ingress. In addition, no other pod can talk to Plutono (enforced via network policies). Hence, access to this API is effectively only possible from within the pod (by the sidecar) and for human operators using port-forwarding.

Which issue(s) this PR fixes:
Part of #9065

Special notes for your reviewer:
/cc @ScheererJ
FYI @istvanballok @rickardsjp @vicwicker

Note

You can test it locally by running `make kind-operator-up operator-up` and then `kubectl apply -f example/operator/20-garden.yaml` for Gardens, or `make kind-up gardener-up` for seeds, or an additional `kubectl apply -f example/provider-local/shoot.yaml` for Shoots.

Release note:

Today's method of providing Plutono dashboards for garden or shoot clusters is deprecated and will be removed in a future release. Migrate to the new approach (see [this document](https://github.com/gardener/gardener/blob/master/docs/extensions/logging-and-monitoring.md#plutono-dashboards) for details).

Contributor

gardener-prow bot commented Apr 19, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@gardener-prow gardener-prow bot requested a review from ScheererJ April 19, 2024 13:29
@gardener-prow gardener-prow bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. area/monitoring Monitoring (including availability monitoring and alerting) related kind/enhancement Enhancement, improvement, extension cla: yes Indicates the PR's author has signed the cla-assistant.io CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 19, 2024
@rfranzke rfranzke marked this pull request as ready for review April 22, 2024 12:13
@gardener-prow gardener-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 22, 2024
@ialidzhikov
Member

/assign

Member

@ialidzhikov ialidzhikov left a comment

We started a review together with @dimitar-kostadinov. So far, we have reviewed 5 of the 10 commits.

docs/concepts/operator.md: 2 review comments (outdated, resolved)
docs/extensions/logging-and-monitoring.md: 4 review comments (outdated, resolved)
Member

@ialidzhikov ialidzhikov left a comment

/lgtm

@gardener-prow gardener-prow bot added the lgtm Indicates that a PR is ready to be merged. label Apr 23, 2024
Contributor

gardener-prow bot commented Apr 23, 2024

LGTM label has been added.

Git tree hash: 10ea8b1ba35590734d854323a937b18b85e28cde

@rfranzke
Member Author

/approve

Contributor

gardener-prow bot commented Apr 24, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rfranzke

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gardener-prow gardener-prow bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 24, 2024
Previously, there were two "folders" for dashboards, "garden" and "global".
We don't follow this pattern for the seed or shoot Plutono, so let's also drop it for the garden Plutono for simplicity.
Now, all dashboards are added to the same folder.

When the sidecar updates the dashboards in the file system, it needs to call a Plutono API to make Plutono reload the dashboards. This call is only available in the admin API, and credentials are needed to call it.
The admin API was disabled previously. Now, we enable it and generate a basic auth secret which is auto-rotated every `30d`. The API is only accessible from within the cluster and is blocked at the Ingress. In addition, no other pod can talk to Plutono (enforced via network policies). Hence, access to this API is effectively only possible from within the pod (by the sidecar) and for human operators using port-forwarding.

The sidecar is a controller that watches `configmaps`, so it needs the permissions to do so.

The dashboard `ConfigMap` created by Gardener will be picked up by the sidecar and no longer directly mounted into the pod (next commit).
For a few releases, when computing this `ConfigMap`, the old (now deprecated) extensions contract for providing dashboards will still be respected, i.e., such dashboards will be collected and added to this `ConfigMap`.
Once an extension switches to the new approach, this no longer happens for its dashboards; instead, the sidecar picks them up directly.
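The transitional behaviour described above, where dashboards provided via the deprecated extension contract are still folded into the Gardener-managed `ConfigMap`, could look roughly like the following sketch. The function name, the file names, and the collision rule (built-in dashboards win) are assumptions made for illustration and are not taken from the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// collectDashboards merges dashboards shipped by Gardener (builtIn) with those
// still provided through the deprecated extension contract (legacy) into the
// data of a single ConfigMap. On a name collision the built-in dashboard wins;
// this rule is an assumption of the sketch.
func collectDashboards(builtIn, legacy map[string]string) map[string]string {
	out := make(map[string]string, len(builtIn)+len(legacy))
	for name, doc := range legacy {
		out[name] = doc
	}
	for name, doc := range builtIn {
		out[name] = doc
	}
	return out
}

func main() {
	data := collectDashboards(
		map[string]string{"overview.json": `{"title":"Overview"}`},
		map[string]string{"extension.json": `{"title":"Extension"}`},
	)

	// Sort the keys so the output is deterministic.
	names := make([]string, 0, len(data))
	for name := range data {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println(names) // [extension.json overview.json]
}
```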
@gardener-prow gardener-prow bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 25, 2024
Contributor

gardener-prow bot commented Apr 25, 2024

New changes are detected. LGTM label has been removed.

@gardener-prow gardener-prow bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 25, 2024
@rfranzke rfranzke added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. lgtm Indicates that a PR is ready to be merged. labels Apr 25, 2024
@gardener-prow gardener-prow bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 25, 2024
@plkokanov
Contributor

/test pull-gardener-e2e-kind-ha-single-zone

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/monitoring Monitoring (including availability monitoring and alerting) related cla: yes Indicates the PR's author has signed the cla-assistant.io CLA. kind/enhancement Enhancement, improvement, extension lgtm Indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

3 participants