This repository has been archived by the owner on Jan 19, 2024. It is now read-only.

Allow connection to external prometheus server (same as with prometheus-sli-service) #53

Closed
1 of 4 tasks
christian-kreuzberger-dtx opened this issue Jun 5, 2020 · 20 comments · Fixed by #117

@christian-kreuzberger-dtx
Contributor

christian-kreuzberger-dtx commented Jun 5, 2020

Right now, the prometheus-service installs a separate Prometheus instance (in namespace monitoring), while keptn-contrib/prometheus-sli-service uses a secret to connect to a Prometheus instance running anywhere.

Goal:

User Story:

From #49 (keptn configure monitoring prometheus --project=...):
As a user, I would like to configure an (external) Prometheus with Keptn to be able to add scrape jobs with the Keptn CLI or API.

Definition of Done

  • (1) prometheus-service no longer installs a separate Prometheus instance
  • (2) prometheus-service allows connecting to an external Prometheus instance (via the same means as prometheus-sli-service does)
  • (3) Tutorials need to be updated such that Prometheus is installed independently from Keptn and prometheus-service
  • (4) Docs need to be updated such that Prometheus is installed independently from Keptn and prometheus-service
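
As a rough illustration of item (2), the sketch below builds an instant-query request against an external Prometheus using connection details that would, in a real deployment, come from a Kubernetes secret. The `promConfig` type, its field names, and the `PROMETHEUS_URL`, `PROMETHEUS_USER`, and `PROMETHEUS_PASSWORD` environment variables are assumptions made for this example and are not taken from prometheus-sli-service; only the `/api/v1/query` endpoint is part of the Prometheus HTTP API.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strings"
	"time"
)

// promConfig holds connection details for an external Prometheus.
// The type and its field names are hypothetical.
type promConfig struct {
	URL      string // base URL, e.g. http://prometheus.example.com:9090
	User     string // optional basic-auth user
	Password string // optional basic-auth password
}

// buildQueryRequest prepares an instant query (GET /api/v1/query) and adds
// basic-auth credentials when a user is configured.
func buildQueryRequest(cfg promConfig, query string) (*http.Request, error) {
	u, err := url.Parse(cfg.URL)
	if err != nil {
		return nil, err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/api/v1/query"
	q := url.Values{}
	q.Set("query", query)
	u.RawQuery = q.Encode()

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	if cfg.User != "" {
		req.SetBasicAuth(cfg.User, cfg.Password)
	}
	return req, nil
}

func main() {
	// Environment variable names are assumptions for this sketch.
	cfg := promConfig{
		URL:      os.Getenv("PROMETHEUS_URL"),
		User:     os.Getenv("PROMETHEUS_USER"),
		Password: os.Getenv("PROMETHEUS_PASSWORD"),
	}
	if cfg.URL == "" {
		fmt.Println("PROMETHEUS_URL is not set; nothing to query")
		return
	}
	req, err := buildQueryRequest(cfg, "up")
	if err != nil {
		fmt.Fprintln(os.Stderr, "building request failed:", err)
		os.Exit(1)
	}
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, "query failed:", err)
		os.Exit(1)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```

Keeping the request construction separate from the HTTP call makes the URL and credential handling easy to check without a running Prometheus.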
@checkelmann

Assign it to me

@johannes-b
Collaborator

johannes-b commented Aug 3, 2020

I would like to help you and provide a list of tasks to do. Let's focus on Definition of Done - No.1 first.

  • Remove the function installPrometheus and all its helper functions:
    • CreateOrUpdatePrometheusNamespace,
    • CreateOrUpdatePrometheusConfigMap,
    • CreateOrUpdatePrometheusClusterRole,
    • CreateOrUpdatePrometheusDeployment
  • Remove the function: installPrometheusAlertManager and all its helper functions:
    • CreateOrUpdateAlertManagerConfigMap,
    • CreateOrUpdateAlertManagerTemplatesConfigMap,
    • CreateOrUpdateAlertManagerDeployment,
    • CreateOrUpdateAlertManagerService.
  • In the then branch of this if statement, add a message like the following:
Prometheus is not installed on cluster
# ATTENTION # ------------------------------------------------------------------------------------
The behavior has changed and Prometheus will NOT be installed automatically.
If you want to roll-out the Prometheus: 
1.) Please follow, e.g., the instructions as provided here: https://xyz.com
2.) Then, re-deploy the prometheus-service: 
    kubectl apply -f https://raw.githubusercontent.com/keptn-contrib/prometheus-service/<VERSION>/deploy/service.yaml
--------------------------------------------------------------------------------------------------

@johannes-b
Collaborator

I can take care of updating the keptn.sh/docs.

Please find a preview of the docs here: https://deploy-preview-617--keptn.netlify.app/docs/0.7.x/monitoring/prometheus/install/

@checkelmann

Hi,

just an update:
Part 1 is done.

Now it comes to the tricky part:
The Prometheus Operator installation uses secrets instead of regular ConfigMaps for its configuration. The Alertmanager and Grafana configuration remain the same. The installation namespace may also differ and no longer be "monitoring".

This results in the following sub-tasks:

  • Adjust the service.yaml to configure the Prometheus namespace as an environment variable (scanning the whole K8s cluster for the installation would need too many RBAC permissions for the service itself).
  • Adjust the RBAC rules to allow the service to read and write secrets within the Prometheus namespace (currently hard-coded).
  • Adjust the RBAC rules to allow the service account to manage ConfigMaps etc. within the Prometheus namespace (currently hard-coded).
  • Rewrite all functions in the utils\prometheus.go helper to use secrets for the Prometheus configuration.
  • Remove the delete-Prometheus-pod function, because with the operator Prometheus has a built-in configuration reload mechanism.
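
The first sub-task could look roughly like the sketch below: the service reads the Prometheus namespace from an environment variable and falls back to "monitoring" when it is unset. The variable name `PROMETHEUS_NS` and the function name are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"os"
)

// defaultPrometheusNamespace is the namespace used so far by prometheus-service.
const defaultPrometheusNamespace = "monitoring"

// prometheusNamespace returns the namespace to look for Prometheus in.
// The lookup function is injected so the logic can be exercised without
// touching the process environment; PROMETHEUS_NS is a hypothetical name.
func prometheusNamespace(lookup func(string) (string, bool)) string {
	if ns, ok := lookup("PROMETHEUS_NS"); ok && ns != "" {
		return ns
	}
	return defaultPrometheusNamespace
}

func main() {
	fmt.Println("Prometheus namespace:", prometheusNamespace(os.LookupEnv))
}
```

In service.yaml, the variable would be set in the container's `env` section, which keeps the RBAC rules scoped to a single, known namespace.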

Right now I'm really short on time, and I apologize for that. Does anyone have some time to help me get started with the implementation?

I have a fork of this project in my personal GitHub account at https://github.com/checkelmann/prometheus-service

@checkelmann

Another update:

It looks like adding the scrape config via prometheus.yaml is no longer needed.
The operator now uses CRDs to configure the monitoring of Services and Pods:
https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/design.md

I'm still investigating how to do this.

@johannes-b
Collaborator

Thanks for the detailed analysis and the breakdown into sub-tasks.
Am I right that the Prometheus Operator is the preferred way to go now?

I'm happy to help to dig into this issue.

@checkelmann

Hi,

unfortunately I can not work on this issue at the moment. Feel free to reassign to another contributor.
I'm sorry for that.

Also, I have not had a chance to dig any deeper into this issue, as I did not receive an answer from the Prometheus community on the best practices for setting up Prometheus in K8s (Operator approach etc.).

Christian

@johannes-b johannes-b added the help wanted Extra attention is needed label Oct 1, 2020
@johannes-b
Collaborator

@checkelmann Thanks for letting us know!
I pulled it back to the backlog.

But can you please file a PR against the feature branch feature/53/connect-to-external-prometheus (https://github.com/keptn-contrib/prometheus-service/tree/feature/53/connect-to-external-prometheus) to not lose your efforts?

@anukul

anukul commented Oct 2, 2020

@johannes-b Hi! I would like to try this.

@christian-kreuzberger-dtx
Contributor Author

Thanks @anukul, I've re-assigned it to you.

@checkelmann

> @checkelmann Thanks for letting us know!
> I pulled it back to the backlog.
>
> But can you please file a PR against the feature branch feature/53/connect-to-external-prometheus (https://github.com/keptn-contrib/prometheus-service/tree/feature/53/connect-to-external-prometheus) to not lose your efforts?

Here you go #64

@johannes-b johannes-b linked a pull request Oct 2, 2020 that will close this issue
@johannes-b
Collaborator

@anukul please take the PR from @checkelmann (#64) into account.

@jetzlstorfer
Member

Any updates on this, @anukul?

@jetzlstorfer
Member

If @anukul is not working on it anymore, we should reassign it

@anukul

anukul commented Dec 1, 2020

@jetzlstorfer Hi! Sorry, I've not been able to find time for this.

@anukul anukul removed their assignment Dec 1, 2020
@christian-kreuzberger-dtx
Contributor Author

Would it make sense to leave the installation of Prometheus to the user by default, as detailed here: https://www.magalix.com/blog/monitoring-of-kubernetes-cluster-through-prometheus-and-grafana ?

I don't see any added value in us installing and maintaining a Prometheus instance when this can easily be done by the end user with a couple of commands:

kubectl create namespace prometheus
helm install prometheus stable/prometheus-operator --namespace prometheus

@jaybatra26

Hi! Can I take this up as part of the LFX program? I have knowledge of observability and of how to register Prometheus metrics.

@christian-kreuzberger-dtx
Contributor Author

Applications for this issue will be handled via the LFX mentorship program. Please apply there.

@imrajdas
Contributor

imrajdas commented Mar 8, 2021

Hi @jetzlstorfer, can you assign this issue to me? I will be working on it as part of the LFX mentorship program.

@jetzlstorfer
Member

Here is the PR for updating the Prometheus tutorial: keptn/tutorials#148
