
Use managed Prometheus with in-cluster Alertmanager #6

Open
razvan-moj opened this issue Mar 1, 2022 · 6 comments
Labels
proposed Proposed by community, to be reviewed by service team

Comments

@razvan-moj

Prometheus (as deployed by the commonly used operator chart) is difficult to maintain and a resource hog; because of that, AMP is very attractive. Alertmanager, though, works fine for our purposes, and we have

$ kubectl get prometheusrule -A -ojson | jq -r '.items[].spec.groups[].rules[].alert' | wc -l
    4449

alerts defined, by teams which can each access only one namespace (https://github.com/ministryofjustice/cloud-platform-environments/search?q=prometheusrule), so there is no shared visibility. Alerts go directly to e.g. Slack, with each team controlling its own channel and webhooks.
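
For context, the rules teams apply in their namespaces look roughly like this (the namespace, rule name and threshold below are illustrative, not copied from the real repo):

$ kubectl apply -n team-a -f - <<'EOF'
# Illustrative PrometheusRule only; real rules live in each team's own
# namespace in cloud-platform-environments. Names and values are made up.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: team-a-alerts
spec:
  groups:
    - name: team-a
      rules:
        - alert: PodRestartingTooOften
          expr: increase(kube_pod_container_status_restarts_total{namespace="team-a"}[1h]) > 5
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: Pods in team-a are restarting too often
EOF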

It would be ideal for us to use the managed Prometheus but keep alert definitions and Alertmanager as they are right now (presumably, AMP would need a configuration option to reach the cluster's Alertmanager).

@ampabhi-aws
Contributor

Hey @razvan-moj,

Thank you for posting this issue! I would love to better understand this use case and have a few follow-up questions for you.

  1. Have you seen our Slack integration via SNS blog post? It walks through how to integrate AMP's Alertmanager with Slack so that you get a configuration paradigm very similar to the native Slack receiver. The blog can be found here. Would this solve your problem and allow you to use the AMP Alertmanager? If not, we'd love to learn a bit more as to why!

  2. We are looking to eventually add support for the native Slack receiver. If AMP's Alertmanager had native Slack receiver support, would that enable you to use the AMP Alertmanager? If not, we'd love to learn a little bit more about what's driving the use case!

Thank you for submitting this feature proposal! Excited to learn a bit more about the use case from you.
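
For reference, the SNS-based setup from the blog post mentioned in question 1 boils down to an AMP alert manager definition roughly like the following; the topic ARN, account ID, region and workspace ID are placeholders, and the exact CLI flags should be checked against the current AWS CLI docs:

$ cat <<'EOF' > alertmanager.yaml
# Sketch of an AMP alert manager definition that sends alerts to an SNS
# topic, which can then forward to Slack (for example via a Lambda
# subscriber). ARNs and region are placeholders.
alertmanager_config: |
  route:
    receiver: default
  receivers:
    - name: default
      sns_configs:
        - topic_arn: arn:aws:sns:eu-west-2:111122223333:amp-alerts
          sigv4:
            region: eu-west-2
EOF
$ aws amp create-alert-manager-definition \
    --workspace-id ws-EXAMPLE111 \
    --data file://alertmanager.yaml   # may need fileb:// or base64, depending on CLI version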

@razvan-moj
Author

Hey @ampabhi-aws !

  1. Have you seen our Slack integration via SNS blog post? It walks through how to integrate AMP's Alertmanager with Slack so that you get a configuration paradigm very similar to the native Slack receiver. The blog can be found here. Would this solve your problem and allow you to use the AMP Alertmanager? If not, we'd love to learn a bit more as to why!

We have many (4,000-odd, as listed above) rules and alerts defined by users in their namespaces; those rules are picked up once the .yaml is applied. Users do not have access to the AWS API. To change their setup to AMP, it seems we would need to change their workflow from kubectl apply to terraform apply and figure out a way to validate edits so they don't, for example, overwrite each other, which is currently handled by namespace isolation and a validation webhook. This work would not be needed if we could just configure the AWS Prometheus to read PrometheusRules from the namespaces and hook into the in-cluster Alertmanager.
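
Roughly, the workflow change we would need to push onto users looks like this (workspace ID, namespace and file names are made up):

# Today: each team applies PrometheusRules in its own namespace; RBAC and
# a validation webhook keep teams from overwriting each other's rules.
$ kubectl apply -n team-a -f prometheus-rules.yaml

# With AMP as it stands: rules have to go through the AWS API instead, and
# the data is a plain Prometheus rules file, not the PrometheusRule CRD
# (exact flags per the AWS CLI docs):
$ aws amp create-rule-groups-namespace \
    --workspace-id ws-EXAMPLE111 \
    --name team-a \
    --data file://team-a-rules.yaml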

  2. We are looking to eventually add support for the native Slack receiver. If AMP's Alertmanager had native Slack receiver support, would that enable you to use the AMP Alertmanager? If not, we'd love to learn a little bit more about what's driving the use case!

I don't think this is a problem for us. If we decide to use the AWS API to interact with Prometheus, we can also allow SNS, and our users are familiar with how it works.

@ampabhi-aws
Contributor

Thanks for the clarification @razvan-moj!

If we provided you a way to use CRDs to configure the rules directly via the Kubernetes APIs, and similarly configure your AMP alert manager via those CRDs, would that alleviate the need for an in-cluster Alertmanager? Trying to get a better sense of whether the problem is (1) the AMP Alertmanager is inconvenient to use, or (2) it doesn't fit the use case at all.

@ampabhi-aws added the proposed (Proposed by community, to be reviewed by service team) label on Mar 28, 2022
@razvan-moj
Author

a way to use CRDs to configure the rules directly via the Kubernetes APIs

A definite +1 for this!

@jeromeinsf

@ampabhi-aws what are the next steps here?
