Observability tools (#1563)
- add [Prometheus](https://github.com/prometheus/prometheus) & [Grafana](https://github.com/grafana/grafana) for custom metrics and visualization (/metrics endpoints and anything else we might want to add).
- add [netdata](https://github.com/netdata/netdata) for infrastructure monitoring and alerts (Redis, Postgres, containers, and Prometheus metrics).
- configure netdata to collect Postgres, Redis, and container metrics.
- configure Prometheus to scrape itself, backend, and inference-server.
- optional env var `NETDATA_CLAIM_TOKEN` to claim the agent to [netdata cloud](https://www.netdata.cloud/), which makes it easier to work with infra and to send alerts to Discord etc. I work there, so I'm pretty sure I can get us a free sponsored space that might be useful. Not trying to sell anything here, just noting a potentially useful overlap given I work there :)
- add an initial, fairly dummy FastAPI custom dashboard in `docker/grafana/dashboards`. The idea is that we can save dashboards as code in there (**NOTE**: needs much more work; anyone can add or improve dashboards in follow-on PRs, my PromQL skills are not great).
- add observability tools to the `observability` docker compose profile (**NOTE**: not sure what the best approach is here; would need some input from others more familiar with the Docker setup).
- add Grafana on port 2000 instead of 3000, since the app itself is on 3000.
- add a README.md under each `/docker` folder.
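
For reference, a rough sketch of what the scrape setup described above (Prometheus scraping itself, backend, and inference-server) might look like in `docker/prometheus/prometheus.yml`. The job names, the `backend:8080` and `inference-server:8000` targets, and the 15s interval are assumptions for illustration and are not taken from this commit; only Prometheus listening on port 9090 matches the compose service added here.

```yaml
# Hypothetical prometheus.yml sketch; hostnames, ports, and interval are assumptions.
global:
  scrape_interval: 15s

scrape_configs:
  # Prometheus scraping its own /metrics endpoint
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

  # Backend API (assumed compose hostname and port)
  - job_name: backend
    metrics_path: /metrics
    static_configs:
      - targets: ["backend:8080"]

  # Inference server (assumed compose hostname and port)
  - job_name: inference-server
    metrics_path: /metrics
    static_configs:
      - targets: ["inference-server:8000"]
```

Using compose service names as targets relies on the containers sharing a Docker network, which is the default when they live in the same compose file.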
andrewm4894 committed Feb 15, 2023
1 parent acb4052 commit 34d400f
Showing 11 changed files with 694 additions and 0 deletions.
68 changes: 68 additions & 0 deletions docker-compose.yaml
@@ -199,3 +199,71 @@ services:
    deploy:
      replicas: 1
    profiles: ["inference"]

  prometheus:
    image: prom/prometheus
    container_name: prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
    ports:
      - 9090:9090
    restart: unless-stopped
    volumes:
      - ${PWD}/docker/prometheus:/etc/prometheus
      - prom_data:/prometheus
    profiles: ["observability"]

  grafana:
    image: grafana/grafana
    container_name: grafana
    ports:
      - 2000:2000
    restart: unless-stopped
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=grafana
      - GF_SERVER_HTTP_PORT=2000
    volumes:
      - ${PWD}/docker/grafana/datasources:/etc/grafana/provisioning/datasources
      - ${PWD}/docker/grafana/dashboards/dashboard.yaml:/etc/grafana/provisioning/dashboards/main.yaml
      - ${PWD}/docker/grafana/dashboards:/var/lib/grafana/dashboards
    profiles: ["observability"]

  netdata:
    image: netdata/netdata
    container_name: netdata
    pid: host
    hostname: oasst-netdata
    ports:
      - 19999:19999
    restart: unless-stopped
    cap_add:
      - SYS_PTRACE
      - SYS_ADMIN
    security_opt:
      - apparmor:unconfined
    volumes:
      - netdataconfig:/etc/netdata
      - netdatalib:/var/lib/netdata
      - netdatacache:/var/cache/netdata
      - /etc/passwd:/host/etc/passwd:ro
      - /etc/group:/host/etc/group:ro
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /etc/os-release:/host/etc/os-release:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ${PWD}/docker/netdata/go.d/redis.conf:/etc/netdata/go.d/redis.conf
      - ${PWD}/docker/netdata/go.d/postgres.conf:/etc/netdata/go.d/postgres.conf
      - ${PWD}/docker/netdata/go.d/prometheus.conf:/etc/netdata/go.d/prometheus.conf
    environment:
      # useful if you want to claim monitoring agents into https://www.netdata.cloud/
      # else ignore or leave blank to just use local netdata dashboards at localhost:19999
      - NETDATA_CLAIM_TOKEN=${NETDATA_CLAIM_TOKEN:-}
      - NETDATA_CLAIM_URL=https://app.netdata.cloud
    profiles: ["observability"]

volumes:
  prom_data:
  netdataconfig:
  netdatalib:
  netdatacache:
14 changes: 14 additions & 0 deletions docker/grafana/README.md
@@ -0,0 +1,14 @@
# Grafana

[Grafana](https://github.com/grafana/grafana) is used to visualize custom
observability metrics and much more.

This folder contains various configuration files for Grafana.

- [`./dashboards/dashboard.yaml`](./dashboards/dashboard.yaml) - Used to tell
Grafana where some pre-configured dashboards live.
- [`./dashboards/fastapi-backend.json`](./dashboards/fastapi-backend.json) - A
JSON representation of a saved Grafana dashboard focusing on some high-level
API endpoint metrics.
- [`./datasources/datasource.yml`](./datasources/datasource.yml) - A config file
to set up Grafana to read from the local Prometheus source.
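
As an illustration only, a datasource provisioning file pointing Grafana at the local Prometheus could look roughly like the sketch below. The datasource name, `access: proxy`, and `isDefault: true` are assumptions and not necessarily what `datasource.yml` in this folder contains; the URL assumes the `prometheus` container on port 9090 from `docker-compose.yaml`.

```yaml
# Hypothetical datasource.yml sketch; the name and flags are assumptions.
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    # Grafana's backend proxies the queries, so the URL is resolved
    # inside the Docker network rather than from the browser.
    access: proxy
    url: http://prometheus:9090
    isDefault: true
```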
12 changes: 12 additions & 0 deletions docker/grafana/dashboards/dashboard.yaml
@@ -0,0 +1,12 @@
apiVersion: 1

providers:
  - name: "Dashboard provider"
    orgId: 1
    type: file
    disableDeletion: false
    updateIntervalSeconds: 10
    allowUiUpdates: false
    options:
      path: /var/lib/grafana/dashboards
      foldersFromFilesStructure: true
