ADR required for monitoring setup #8

Closed
HumairAK opened this issue Dec 9, 2020 · 8 comments · Fixed by #13

@HumairAK
Member

HumairAK commented Dec 9, 2020

We have made some progress setting up monitoring and have already created issues in the appropriate repos for service monitors, Prometheus/Grafana deployments, etc. What's lacking is a document that adds context to our setup and future plans. For this, we should prepare an ADR that outlines our monitoring architecture.

@HumairAK
Member Author

@hemajv see if the ADR doc for GitHub alerts can be incorporated here

@4n4nd

4n4nd commented Dec 11, 2020

We have decided (relevant issue) to keep a single Prometheus instance that monitors all the other component namespaces, so we now need to decide where to keep our ServiceMonitor/PodMonitor resources.

For the ODH components, these resources will probably go in the upstream odh-manifests, but we will still need to add the following overlay to the upstream monitors:

spec:
  namespaceSelector:
    matchNames:
      - opf-observatorium # The namespace where the component is deployed

@HumairAK
Member Author

If the service monitors and pod monitors are to go in the same namespace as Prometheus, then I think the monitoring folder makes sense. We can put them in the base, since they aren't environment specific, and then inherit them in all the overlays. wdyt?
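
A minimal sketch of the base/overlay layout described above, assuming a kustomize-style `monitoring` folder. The file paths, the `servicemonitor.yaml` file name, and the use of `opf-stage` as the Prometheus namespace are assumptions for illustration, not taken from the repo. The two documents below would live in separate `kustomization.yaml` files.

```yaml
# monitoring/base/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - servicemonitor.yaml  # assumed file name holding the shared monitors
---
# monitoring/overlays/stage/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Assumed: the namespace where Prometheus runs in this environment
namespace: opf-stage
resources:
  - ../../base  # inherit the environment-independent monitors
```

With this layout, each environment overlay only sets what differs (here, the target namespace), while the monitor definitions stay in one place.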

@4n4nd

4n4nd commented Dec 14, 2020

@HumairAK @anishasthana I was thinking of keeping just one ServiceMonitor resource that looks into multiple namespaces. Maybe we can make this change upstream and only overlay the list of namespaces in the operatefirst overlays?

@HumairAK
Member Author

I like the idea of having one ServiceMonitor for ODH.

It would be nice if we could specify a list of namespaces as a parameter in the kfdef that would be populated in the servicemonitor so we don't need to do an override.
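
To make the idea concrete, here is a rough sketch of what such a KfDef entry might look like. The parameter name `monitored_namespaces`, the application name, the manifest path, and the repo URI placeholder are all hypothetical. KfDef parameters are plain name/value strings, so a namespace list would have to be encoded somehow (comma-separated here), and the upstream manifests would need to be changed to expand it into the ServiceMonitor; no such support is implied to exist today.

```yaml
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: opendatahub
spec:
  applications:
    - name: monitoring  # hypothetical application name
      kustomizeConfig:
        parameters:
          # Hypothetical parameter; not an existing odh-manifests option
          - name: monitored_namespaces
            value: "opf-observatorium,opf-jupyterhub"
        repoRef:
          name: manifests
          path: monitoring  # hypothetical path within the manifests repo
  repos:
    - name: manifests
      uri: "<odh-manifests tarball URL>"  # placeholder, fill in for your setup
```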

@anishasthana
Member

I am not opposed to it, but we should ask the monitoring team what they think. I wonder if it's abusing the idea of service monitors.

@4n4nd

4n4nd commented Dec 16, 2020

I tested a ServiceMonitor that monitors different services across multiple namespaces:

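---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: multiservice-monitor
  labels:
    k8s-app: prometheus
  namespace: opf-stage
spec:
  # Scrape either of these named service ports
  endpoints:
    - port: metrics
    - port: 8080-tcp
  # Namespaces in which to discover services
  namespaceSelector:
    matchNames:
      - opf-observatorium
      - opf-jupyterhub
  # Empty selector: match all services in the namespaces above
  selector: {}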

This works well. I think we should try to get something like this pushed upstream and only set the namespaceSelector as an overlay on the operatefirst side.
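
One possible way to set only the namespace list on the operatefirst side is a JSON patch in the overlay's `kustomization.yaml`. This is a sketch under assumptions: the `../../base` path is hypothetical, and the target name `multiservice-monitor` is borrowed from the test resource above, so it would need to match whatever name the upstream monitor ends up with.

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base  # hypothetical path to the upstream/base monitors
patches:
  - target:
      group: monitoring.coreos.com
      version: v1
      kind: ServiceMonitor
      name: multiservice-monitor  # assumed; must match the upstream name
    patch: |-
      # "add" on an object member sets it, replacing any existing value
      - op: add
        path: /spec/namespaceSelector
        value:
          matchNames:
            - opf-observatorium
            - opf-jupyterhub
```

Using `add` on `/spec/namespaceSelector` rather than `replace` on the nested list means the patch still applies if the upstream monitor ships without a `namespaceSelector`.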

4n4nd mentioned this issue Jan 7, 2021
@hemajv
Member

hemajv commented Jan 7, 2021

> @hemajv see if the ADR doc for GitHub alerts can be incorporated here

I think it would be easier to have a separate ADR for this. I created #16 to add the ADR for our alerting setup.

tumido added this to Backlog in Master Board via automation Feb 3, 2021
tumido moved this from Backlog to January 2021 in Master Board Feb 3, 2021