Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added optional thanos sidecar in Prometheus to ease integration of external Thanos in the stack #12425

Merged

Conversation

dharapvj
Copy link
Contributor

@dharapvj dharapvj commented Jun 30, 2023

What this PR does / why we need it:
With #11424, we removed Thanos support from our Seed monitoring. Some of the arguments for removal are important and valid. But currently, the Cortex based monitoring stack is not available for seed. So, as of now, anyone who has KKP 2.22+ installation does not have seed level metrics beyond 5 days. This makes support really difficult.

We tried to add back the Thanos helm chart by ourselves but it still needs Prometheus sidecar to push metrics to Thanos. Alternative way of using Thanos receiver is overly complicated and resource hungry.

In this PR - we propose to bring back ONLY the Thanos sidecar in prometheus and corresponding alerts so that external Thanos helm chart can get the data fed in. By default the sidecar will be kept disabled.

Which issue(s) this PR fixes:

None

What type of PR is this?

/kind feature
/kind regression

Special notes for your reviewer:

Does this PR introduce a user-facing change? Then add your Release Note here:

Made Prometheus helm chart extensible so that external metric storage solutions like Thanos can be easily integrated for seed long-term monitoring.

Documentation:

https://github.com/kubermatic/docs/pull/1496

@kubermatic-bot kubermatic-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. docs/provided Denotes a PR that has a valid documentation reference. kind/feature Categorizes issue or PR as related to a new feature. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. kind/regression Categorizes issue or PR as related to a regression from a prior release. sig/app-management Denotes a PR or issue as being assigned to SIG App Management. labels Jun 30, 2023
@dharapvj dharapvj requested a review from wurbanski June 30, 2023 06:47
@kubermatic-bot kubermatic-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 30, 2023
@wurbanski
Copy link
Contributor

Sorry for not responding earlier, but I'm wondering if explicitly adding Thanos sidecar is the best approach here - wouldn't it be better to provide documentation to enable Thanos integration using the mechanisms already in the chart? e.g. show how to add correct labels, use correct config instead of the if in Prometheus container and eventually allowing arbitrary sidecar in Prometheus?

@wurbanski wurbanski changed the title Added optional thanos sidecar in Prometheus to ease integration of exteranl Thanos in the stack Added optional thanos sidecar in Prometheus to ease integration of external Thanos in the stack Jul 3, 2023
@dharapvj
Copy link
Contributor Author

dharapvj commented Jul 4, 2023

wouldn't it be better to provide documentation to enable Thanos integration using the mechanisms already in the chart? e.g. show how to add correct labels, use correct config instead of the if in Prometheus container and eventually allowing arbitrary sidecar in Prometheus?

May be I could not see it how. The issue that I see is.. we need to start prometheus differently passing additional parameters etc when we start with Thanos sidecar. Can we discuss and then do the PR correctly?

@kubermatic-bot kubermatic-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 11, 2023
@kubermatic-bot kubermatic-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 11, 2023
@dharapvj
Copy link
Contributor Author

This PR is now ready for re-review.

Have added sidecarContainers, podLabels, minor fix for extraArgs param, unconditional enabling of web-admin-api (which remains accessible only within the cluster) to allow me plug-in thanos.

And removed thanos scraping and alert rules. Which I add myself in my own installation via values.yaml customization

@wurbanski
Copy link
Contributor

This looks much cleaner to me, thank you!

/lgtm

@kubermatic-bot kubermatic-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 11, 2023
@kubermatic-bot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 3daa8871d6b84bd4def29a9788f23762c8f3bd32

@dharapvj
Copy link
Contributor Author

/cherrypick release/v2.22
/cherrypick release/v2.23

@kubermatic-bot
Copy link
Contributor

@dharapvj: once the present PR merges, I will cherry-pick it on top of release/v2.22 in a new PR and assign it to you.

In response to this:

/cherrypick release/v2.22
/cherrypick release/v2.23

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dharapvj
Copy link
Contributor Author

@wurbanski - somehow LGTM did not work.. you will need to re-approve it.

@wurbanski
Copy link
Contributor

/approve

@kubermatic-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dharapvj, wurbanski

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubermatic-bot kubermatic-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 12, 2023
@kubermatic-bot kubermatic-bot merged commit f896294 into kubermatic:main Jul 12, 2023
5 checks passed
@kubermatic-bot kubermatic-bot added this to the KKP 2.24 milestone Jul 12, 2023
@kubermatic-bot
Copy link
Contributor

@dharapvj: new pull request created: #12468

In response to this:

/cherrypick release/v2.22
/cherrypick release/v2.23

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kubermatic-bot
Copy link
Contributor

@dharapvj: new pull request created: #12469

In response to this:

/cherrypick release/v2.22
/cherrypick release/v2.23

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@xmudrii xmudrii added the backport-needed Denotes a PR or issue that has not been fully backported. label Jul 17, 2023
@xmudrii
Copy link
Member

xmudrii commented Jul 17, 2023

@dharapvj Can you please update the release note to match the latest changes made to the PR? Please do the same for cherry-pick PRs.

@dharapvj
Copy link
Contributor Author

dharapvj commented Jul 19, 2023

@xmudrii - so do I edit the release note directly? or in the description of this issue? Since I have not done direct editing of release-notes before.. I thought I would check with you.

I have edited the note above.. please check if that is sufficient. I will fix similarly for cherry-picked PRs as well

@dharapvj
Copy link
Contributor Author

dharapvj commented Jul 19, 2023

I do not have access on cherry picked PRs to edit description :(

Maybe @wurbanski you can? Also can you approve those PRs please?

@xmudrii
Copy link
Member

xmudrii commented Jul 19, 2023

so do I edit the release note directly? or in the description of this issue?

You did it correctly -- the release note is in the PR description in a code block marked with release-note. To change the release note, it's sufficient just to change that block in the PR description.

I do not have access on cherry picked PRs to edit description :(

I went ahead and copied the release note from this PR. For the future reference, you can easily change release notes using the release-note-edit command: #12468 (comment)

For example:

/release-note-edit
```release-note
New release note
```

@ahmedwaleedmalik
Copy link
Member

ahmedwaleedmalik commented Jul 21, 2023

What's up with the release note on this PR? @dharapvj

New release note

@ahmedwaleedmalik
Copy link
Member

ahmedwaleedmalik commented Jul 21, 2023

The docs comment is wrong as well. It should have just been NONE instead of

- NONE (no documentation needed for this PR)

This gets wrongly labeled with docs/provided when you need it as docs/none.

@dharapvj
Copy link
Contributor Author

dharapvj commented Jul 25, 2023

What's up with the release note on this PR?

Ha ha... marco was showing me how to edit release notes.. in the process here edited existing releases with sample values 🤣

I have re-fixed the notes now.

The docs comment is wrong as well.

I have also updated the docs link with docs PR now.

@embik embik added backport-complete Denotes a PR or issue which has been fully backported to all required release branches. and removed backport-needed Denotes a PR or issue that has not been fully backported. labels Nov 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. backport-complete Denotes a PR or issue which has been fully backported to all required release branches. dco-signoff: yes Denotes that all commits in the pull request have the valid DCO signoff message. docs/provided Denotes a PR that has a valid documentation reference. kind/feature Categorizes issue or PR as related to a new feature. kind/regression Categorizes issue or PR as related to a regression from a prior release. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/app-management Denotes a PR or issue as being assigned to SIG App Management. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants