feat(rds): built-in database observability via sidecar exporters#85
feat(rds): built-in database observability via sidecar exporters#85
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (22)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
Adds first-class observability for managed databases by introducing an optional metrics_enabled flag that provisions a Prometheus exporter sidecar (Postgres/MySQL) and persists exporter metadata in the database record.
Changes:
- Extend the database domain model, API requests, repository schema/queries, and service workflows to support metrics sidecars (
metrics_enabled,metrics_port,exporter_container_id). - Launch and clean up exporter containers during DB create/restore/replica/delete flows.
- Update tests and Swagger/docs/ADR to reflect the new API and behavior.
Reviewed changes
Copilot reviewed 19 out of 22 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/sdk/volume.go | Removes trailing whitespace at EOF (no functional change). |
| internal/workers/database_failover_worker_test.go | Updates mocked DatabaseService method signatures to include metricsEnabled. |
| internal/repositories/postgres/migrations/097_add_database_metrics.up.sql | Adds metrics_enabled, metrics_port, exporter_container_id columns to databases. |
| internal/repositories/postgres/migrations/097_add_database_metrics.down.sql | Drops the newly added database metrics columns. |
| internal/repositories/postgres/database_repo_unit_test.go | Updates repo unit tests for new columns and scan/insert/update behavior. |
| internal/repositories/postgres/database_repo.go | Persists and reads metrics fields in SQL queries and scanning. |
| internal/handlers/database_handler_test.go | Adds handler test coverage for metrics_enabled and updates mocks. |
| internal/handlers/database_handler.go | Plumbs metrics_enabled from requests into service calls for create/restore. |
| internal/csi/driver.go | Formatting/alignment only. |
| internal/core/services/volume_unit_test.go | Whitespace/alignment only. |
| internal/core/services/database_unit_test.go | Adds tests around snapshots/restore and metrics sidecar provisioning. |
| internal/core/services/database_test.go | Updates service tests for new metricsEnabled parameter. |
| internal/core/services/database.go | Implements metrics sidecar launch/cleanup and persists new metadata fields. |
| internal/core/services/benchmarks_test.go | Updates service construction parameters for benchmarks. |
| internal/core/ports/database.go | Extends DatabaseService interface to include metricsEnabled on create/restore. |
| internal/core/domain/database.go | Adds metrics fields to the Database domain model and JSON shape. |
| docs/swagger/swagger.yaml | Documents new response/request fields (metrics_enabled, metrics_port). |
| docs/swagger/swagger.json | Regenerates Swagger JSON with new fields. |
| docs/swagger/docs.go | Regenerates swagger docs template with new fields. |
| docs/database.md | Documents managed DB observability feature and the new fields. |
| docs/adr/ADR-019-database-metrics-sidecar.md | Adds ADR describing the sidecar exporter design and tradeoffs. |
| cmd/csi-driver/main.go | Formatting/alignment only. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func (s *DatabaseService) getExporterConfig(engine domain.DatabaseEngine, dbIP, username, password, dbName string) (string, []string, string) { | ||
| switch engine { | ||
| case domain.EnginePostgres: | ||
| // postgres-exporter uses DATA_SOURCE_NAME | ||
| dsn := fmt.Sprintf("postgresql://%s:%s@%s:5432/%s?sslmode=disable", username, password, dbIP, dbName) | ||
| return "prometheuscommunity/postgres-exporter", []string{"DATA_SOURCE_NAME=" + dsn}, "9187" | ||
| case domain.EngineMySQL: | ||
| // mysqld-exporter uses DATA_SOURCE_NAME | ||
| dsn := fmt.Sprintf("%s:%s@(%s:3306)/%s", username, password, dbIP, dbName) | ||
| return "prom/mysqld-exporter", []string{"DATA_SOURCE_NAME=" + dsn}, "9104" | ||
| } |
There was a problem hiding this comment.
getExporterConfig builds DATA_SOURCE_NAME by interpolating username/password/dbName directly into DSNs. Passwords generated by util.GenerateRandomPassword include characters like @, :, #, etc., which will break both the Postgres URL DSN and the MySQL DSN unless properly escaped/encoded. Encode credentials (and dbName) when constructing the DSN (e.g., URL-encode for Postgres URLs, and apply the appropriate escaping for mysqld-exporter’s DSN format) to ensure exporters can connect reliably.
| expCID, expPorts, err := s.compute.LaunchInstanceWithOptions(ctx, ports.CreateInstanceOptions{ | ||
| Name: exporterName, | ||
| ImageName: exporterImage, | ||
| Ports: []string{"0:" + exporterPort}, |
There was a problem hiding this comment.
The exporter sidecar is launched with a host port mapping (Ports: []string{"0:" + exporterPort}), and the Docker compute backend binds published ports to 0.0.0.0. Since Prometheus exporters are typically unauthenticated, this makes the metrics endpoint broadly reachable on the host network and increases data exposure risk. Consider avoiding host port publishing (scrape over the container/VPC network instead), or add a mechanism to bind to an internal-only interface / enforce network policy so metrics aren’t exposed publicly by default.
| Ports: []string{"0:" + exporterPort}, | |
| Ports: []string{exporterPort}, |
| if metricsEnabled { | ||
| dbIP, err := s.compute.GetInstanceIP(ctx, containerID) | ||
| if err == nil { | ||
| exporterImage, exporterEnv, exporterPort := s.getExporterConfig(dbEngine, dbIP, username, password, name) | ||
| exporterName := fmt.Sprintf("cloud-db-exporter-%s-%s", name, db.ID.String()[:8]) | ||
|
|
||
| expCID, expPorts, err := s.compute.LaunchInstanceWithOptions(ctx, ports.CreateInstanceOptions{ | ||
| Name: exporterName, | ||
| ImageName: exporterImage, | ||
| Ports: []string{"0:" + exporterPort}, | ||
| NetworkID: networkID, | ||
| Env: exporterEnv, | ||
| }) | ||
| if err == nil { | ||
| db.ExporterContainerID = expCID | ||
| expHostPort, err := s.parseAllocatedPort(expPorts, exporterPort) | ||
| if err == nil && expHostPort != 0 { | ||
| db.MetricsPort = expHostPort | ||
| } else { | ||
| expHostPort, _ = s.compute.GetInstancePort(ctx, expCID, exporterPort) | ||
| db.MetricsPort = expHostPort | ||
| } | ||
| } else { | ||
| s.logger.Warn("failed to launch metrics exporter", "database_id", db.ID, "error", err) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
When metricsEnabled is true, failures to fetch the DB container IP or to resolve the exporter’s published port are silently ignored (and db.MetricsEnabled remains true). This can persist a database record that claims metrics are enabled while exporter_container_id/metrics_port are unset (or zero), which is confusing for API consumers and hard to operate. Consider logging when GetInstanceIP fails, and either (a) fail the request, or (b) record metrics as disabled / add an explicit exporter status field so the persisted state reflects what was actually provisioned.
| - **MySQL**: Uses `mysqld-exporter` (port 9104). | ||
|
|
||
| #### Scraping & Monitoring | ||
| Once enabled, the exporter's port is mapped to a host port (available in the `metrics_port` field of the database object). These endpoints are automatically registered with the platform's central Prometheus instance for dashboarding and alerting. |
There was a problem hiding this comment.
This section states that metrics endpoints are “automatically registered with the platform's central Prometheus instance”, but there’s no corresponding registration logic in this PR (the only behavior added is launching an exporter and storing metrics_port). Either adjust the documentation to reflect the current behavior (expose metrics_port for external scraping) or add the actual Prometheus registration mechanism.
| Once enabled, the exporter's port is mapped to a host port (available in the `metrics_port` field of the database object). These endpoints are automatically registered with the platform's central Prometheus instance for dashboarding and alerting. | |
| Once enabled, the exporter's port is mapped to a host port and exposed via the `metrics_port` field of the database object. Operators should configure their monitoring stack (for example, a central Prometheus instance) to scrape this host port for dashboarding and alerting. |
This PR introduces native engine observability for managed databases by automatically provisioning sidecar exporters (PostgreSQL and MySQL) alongside database instances.
Key Features:
metrics_enabledflag during provisioning launches a Prometheus-compatible exporter.postgres-exporter(Port 9187).mysqld-exporter(Port 9104).Implementation Details: