Duplicate S3 collector and make it run inside the seed-ctrl-mgr #9765

Merged
merged 4 commits into from May 10, 2022

Conversation

xrstf
Contributor

@xrstf xrstf commented May 9, 2022

What does this PR do / Why do we need it:
The S3-Exporter is only here to produce Prometheus metrics tailored towards KKP's etcd backups (which is the reason we're not just using a generic S3 exporter). It was added in #1482.

The exporter has a couple of issues:

  • It is packaged as a Helm chart that strongly depends on our minio Helm chart, because the Minio chart is creating the kube-system/s3-credentials secret, which is consumed by the S3-Exporter. So even though the exporter code itself is fairly generic, its chart is only useful if you also a) installed Minio and b) installed Minio using our Helm chart.
  • It cannot deal with the modern etcd backup/restore controllers, which allow the admin to define multiple backup destinations.
  • It must be installed "manually" using the KKP installer, which will also set up Minio on the seed. If an admin doesn't want to use Minio at all (but still wants to use a single backup location), they must install the exporter even more manually.

In the future we want to reconcile more things automatically on seeds (like #9748), so it would be nice if the tiny S3 exporter were also managed automatically and learned how to deal with multiple backup destinations. This is what this PR achieves.

This PR duplicates the old S3 collector because the new implementation is built around having access to the S3 bucket via the credentials configured in KKP (as opposed to relying on the s3-credentials Secret). The old backup configuration doesn't allow specifying credentials (again, it relies on this Secret created by our Minio chart), so the new exporter cannot handle the old configuration and the old exporter cannot handle the new configuration.

Until we deprecate the old backup configuration, it's simply easier to just have this new collector run side-by-side and deal with the new configuration (etcd backup/restore) only.
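To illustrate why only the new configuration is handled: a minimal sketch, assuming a simplified data model in which each named backup destination may carry its own credentials reference. The types `SecretRef` and `Destination` and the function `usableDestinations` are hypothetical and only stand in for the real KKP types; they are not taken from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// SecretRef is a hypothetical stand-in for a reference to a Kubernetes Secret.
type SecretRef struct {
	Namespace string
	Name      string
}

// Destination is a hypothetical, simplified backup destination.
// Credentials is nil when the destination does not specify any.
type Destination struct {
	Endpoint    string
	Bucket      string
	Credentials *SecretRef
}

// usableDestinations returns the sorted names of all destinations that
// bring their own credentials. Destinations without credentials are
// skipped, mirroring the idea that a collector built around configured
// credentials cannot serve a configuration that has none.
func usableDestinations(dests map[string]Destination) []string {
	names := make([]string, 0, len(dests))
	for name, d := range dests {
		if d.Credentials == nil {
			continue
		}
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	dests := map[string]Destination{
		"minio": {
			Endpoint:    "minio.example.invalid:9000",
			Bucket:      "etcd-backups",
			Credentials: &SecretRef{Namespace: "kube-system", Name: "minio-backup-creds"},
		},
		"legacy": {
			Endpoint: "minio.example.invalid:9000",
			Bucket:   "old-backups",
		},
	}

	for _, name := range usableDestinations(dests) {
		d := dests[name]
		fmt.Printf("collect from %s (bucket %s) using secret %s/%s\n",
			name, d.Bucket, d.Credentials.Namespace, d.Credentials.Name)
	}
}
```

In this sketch the `legacy` entry is skipped, which is the side-by-side situation described above: anything without per-destination credentials stays with the old exporter.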

The new exporter also runs as part of the seed-ctrl-mgr. There is little to gain by splitting it apart into a dedicated binary (or at least I cannot see the advantage) and since the seed-ctrl-mgr already manages the backups, it makes sense IMHO to let it also handle the metrics.

Once we remove the old backup configuration stuff, we can rm -rf charts/s3-exporter cmd/s3-exporter. At that point we can then also finally get rid of the s3-credentials secret (which I have hated for a long time).

Does this PR introduce a user-facing change?:

The seed-controller-manager is now providing Prometheus metrics regarding etcd backups (only for the new etcd backup/restore controllers).
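As a rough idea of what per-destination backup metrics could look like, here is a minimal sketch that renders one gauge in the Prometheus text exposition format using only the standard library. The metric name `example_etcd_backup_object_count`, the labels `destination` and `cluster`, and the `BackupStats` type are all hypothetical; they are not the names the seed-controller-manager actually exposes, and a real collector would more likely use the Prometheus Go client library.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// BackupStats is a hypothetical summary of the backups found for one
// cluster in one backup destination.
type BackupStats struct {
	Destination string
	Cluster     string
	ObjectCount int
}

// labelEscaper escapes backslashes, double quotes and newlines, which is
// what the Prometheus text format requires inside label values.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// render writes one gauge sample per stats entry, sorted by destination
// and cluster so that the output is deterministic.
func render(stats []BackupStats) string {
	sorted := append([]BackupStats(nil), stats...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Destination != sorted[j].Destination {
			return sorted[i].Destination < sorted[j].Destination
		}
		return sorted[i].Cluster < sorted[j].Cluster
	})

	var b strings.Builder
	b.WriteString("# TYPE example_etcd_backup_object_count gauge\n")
	for _, s := range sorted {
		fmt.Fprintf(&b, "example_etcd_backup_object_count{destination=\"%s\",cluster=\"%s\"} %d\n",
			labelEscaper.Replace(s.Destination), labelEscaper.Replace(s.Cluster), s.ObjectCount)
	}
	return b.String()
}

func main() {
	fmt.Print(render([]BackupStats{
		{Destination: "s3-offsite", Cluster: "abc123", ObjectCount: 20},
		{Destination: "minio", Cluster: "abc123", ObjectCount: 7},
	}))
}
```

Keeping the destination as a label is one way a single collector can cover several backup destinations without one metric name per destination.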

@kubermatic-bot kubermatic-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. labels May 9, 2022
@kubermatic-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: xrstf

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubermatic-bot kubermatic-bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 9, 2022
@xrstf xrstf changed the title WIP - duplicate S3 collector and make it run inside the seed-ctrl-mgr Duplicate S3 collector and make it run inside the seed-ctrl-mgr May 9, 2022
@kubermatic-bot kubermatic-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 9, 2022
@kubermatic-bot kubermatic-bot added the lgtm Indicates that a PR is ready to be merged. label May 10, 2022
@kubermatic-bot
Contributor

LGTM label has been added.

Git tree hash: 07c32a7296f3ca2968cbd46b0a2a0bccf99e7717

@embik
Member

embik commented May 10, 2022

Any thoughts on mentioning the s3-exporter by name in the release note? The relationship isn't clear to an outsider IMHO.

@xrstf
Contributor Author

xrstf commented May 10, 2022

/retest

@kubermatic-triage-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs

Review the full test history

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@xrstf
Contributor Author

xrstf commented May 10, 2022

/hold

temporarily

@kubermatic-bot kubermatic-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 10, 2022
@xrstf
Contributor Author

xrstf commented May 10, 2022

/hold cancel

@kubermatic-bot kubermatic-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 10, 2022
@kubermatic-bot kubermatic-bot merged commit 3bf72e9 into kubermatic:master May 10, 2022
@xrstf xrstf deleted the absorb-the-s3-exporter branch May 10, 2022 17:01
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

4 participants