A monitoring system built with Go v1.26.
The OpenAPI specification is located under `openapi/`.
- `.docker` holds Docker related files
- `kubernetes` holds Kubernetes related files
- `migrations` holds database migrations
- `bruno` holds the Bruno collections for the API client
- `cmd` holds the application entry point
- `openapi` holds OpenAPI documentation
- `internal` holds the project logic
- `pkg` holds shared code and libraries
- `test` holds integration tests
Inspired by Clean Architecture and Hexagonal Architecture.
- `cmd`, holds the application entry points
- `internal`, holds the project logic
- `pkg`, holds shared libraries that are not specific to any project
The internal directory is organized as follows:
- `app`, holds the application logic (adapters), like repositories, handlers, middlewares, commands
- `core`, holds the domain logic, separated into domain (models) and services
The `usecases` directory is responsible for combining different domain areas and business rules.
Spin up containers:

```sh
make docker/up
```

Run the REST API:

```sh
make go/rest/run
```

Run migrations:

```sh
make db/migrate/up
```

Destroy containers:

```sh
make docker/down
```

Start minikube and create the cerberus namespace:

```sh
make k8s/setup
```

Build the application and migration Docker images inside minikube's Docker daemon:

```sh
make k8s/build
```

Deploy all services (PostgreSQL, Vault, Tempo, Grafana, App) via Helm:

```sh
make k8s/deploy
```

Check deployment status:

```sh
make k8s/status
```

Use minikube service to access NodePort services:

```sh
minikube service cerberus-app -n cerberus
minikube service cerberus-grafana -n cerberus
```

Undeploy everything:

```sh
make k8s/undeploy
```

Each service has its own Helm chart under .kubernetes/:
| Chart | Service | Ports |
|---|---|---|
| postgres/ | PostgreSQL 17.5 | 5432 |
| vault/ | Hashicorp Vault 1.21 | 8200 |
| tempo/ | Grafana Tempo | 3200, 4317, 4318 |
| grafana/ | Grafana 11.6.0 | 3000 |
| app/ | Cerberus REST API | 4000, 4010 |
All services run in the `cerberus` namespace.
Cerberus ships with a full OpenTelemetry stack: the application exports traces and metrics to an OTel Collector, which fans them out to Tempo (traces) and Prometheus (metrics). Grafana provides a unified UI over both.
| Component | Role | Local port |
|---|---|---|
| OTel Collector | Receives OTLP from the app, routes to backends | 4317 (gRPC), 4318 (HTTP) |
| Grafana Tempo | Distributed trace storage & query | 3200 |
| Prometheus | Metrics storage & query | 9090 |
| Grafana | Dashboards, trace & metric exploration | 3000 |
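The fan-out described above is wired up in the collector's pipeline configuration. The following is a hedged sketch of what such a config looks like, not the repository's actual file; the `tempo:4317` endpoint and the `8889` scrape port are assumptions:

```yaml
# Hypothetical OTel Collector config illustrating the routing above.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  otlp/tempo:
    endpoint: tempo:4317   # assumed in-cluster Tempo address
    tls:
      insecure: true
  prometheus:
    endpoint: 0.0.0.0:8889 # assumed port Prometheus scrapes

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo]
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```

One OTLP receiver feeds two pipelines, so the application only ever needs to know the collector's address.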
The application must point at the collector. In `config.yaml`:

```yaml
collector:
  host: "localhost:4317"
  probability: 1.0 # sample rate — lower in production (e.g. 0.05)
  metricInterval: "30s"
```

Grafana — http://localhost:3000
Grafana is pre-configured with two datasources (no login required in the default dev setup):
| Datasource | UID | What it shows |
|---|---|---|
| Prometheus | prometheus | HTTP request rates, durations, cache hit/miss, active requests |
| Tempo | tempo | Distributed traces, per-request spans |
Explore traces
- Open Explore (compass icon in the left sidebar).
- Select the Tempo datasource.
- Use Search to filter by service name (cerberus), HTTP method, status code, or trace duration.
- Click any trace to open the flame graph and see every span — HTTP server, database queries, cache lookups.
Explore metrics
- Open Explore and select the Prometheus datasource.
- Useful metric names to start with:
| Metric | Description |
|---|---|
| cerberus_http_server_request_total | Total HTTP requests (by method, path, status) |
| cerberus_http_server_request_duration_seconds | Request latency histogram |
| cerberus_http_server_active_requests | In-flight requests gauge |
| cerberus_cache_hit_total | In-memory (L1) cache hits |
| cerberus_cache_miss_total | In-memory (L1) cache misses |
| cerberus_cache_distributed_hit_total | Redis (L2) cache hits |
| cerberus_cache_distributed_miss_total | Redis (L2) cache misses |
| cerberus_cache_size | Current number of entries in the in-memory cache |
| cerberus_http_client_request_total | Outgoing HTTP requests to downstream services |
Example PromQL — HTTP error rate over the last 5 minutes:

```promql
sum(rate(cerberus_http_server_request_total{status=~"5.."}[5m]))
/
sum(rate(cerberus_http_server_request_total[5m]))
```

Example PromQL — p99 request latency:

```promql
histogram_quantile(0.99,
  sum by (le) (rate(cerberus_http_server_request_duration_seconds_bucket[5m]))
)
```
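Expressions like these can also be precomputed server-side as Prometheus recording rules so dashboards stay cheap. A hedged sketch (the group and rule names are hypothetical, not part of the repository):

```yaml
# Hypothetical Prometheus rule file precomputing the error rate above.
groups:
  - name: cerberus-http
    rules:
      - record: cerberus:http_error_rate:ratio_rate5m
        expr: |
          sum(rate(cerberus_http_server_request_total{status=~"5.."}[5m]))
          /
          sum(rate(cerberus_http_server_request_total[5m]))
```

Grafana panels can then query `cerberus:http_error_rate:ratio_rate5m` directly instead of re-evaluating the full ratio on every refresh.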
Correlating traces and metrics
Grafana links traces to metrics automatically when both datasources are configured. In a Prometheus panel, click a data point and choose View in Tempo to jump directly to traces from that time window.
Prometheus — http://localhost:9090
Use the Prometheus UI to run ad-hoc PromQL queries or check scrape targets:
- Status → Targets — confirms the OTel Collector scrape is UP.
- Graph — run any PromQL expression directly.
Tempo — http://localhost:3200
Tempo exposes an HTTP API for direct trace lookup when needed:
```sh
# Fetch a trace by ID
curl http://localhost:3200/api/traces/<trace-id>
```