Skip to content

feat(rds): built-in database observability via sidecar exporters#85

Merged
poyrazK merged 7 commits intomainfrom
feature/rds-observability-sidecar
Mar 2, 2026
Merged

feat(rds): built-in database observability via sidecar exporters#85
poyrazK merged 7 commits intomainfrom
feature/rds-observability-sidecar

Conversation

@poyrazK
Copy link
Copy Markdown
Owner

@poyrazK poyrazK commented Mar 2, 2026

This PR introduces native engine observability for managed databases by automatically provisioning sidecar exporters (PostgreSQL and MySQL) alongside database instances.

Key Features:

  • Sidecar Injection: Optional metrics_enabled flag during provisioning launches a Prometheus-compatible exporter.
  • Engine Support:
    • PostgreSQL: Uses postgres-exporter (Port 9187).
    • MySQL: Uses mysqld-exporter (Port 9104).
  • Automated Lifecycle: Exporter containers are automatically terminated and cleaned up when the database is deleted.
  • VPC Integration: Exporters securely communicate with database engines over internal VPC networks.

Implementation Details:

  • 7 Atomic Commits: Covering domain, schema, repository, service logic, API handlers, tests, and documentation.
  • ADR-019: Documented the architectural sidecar pattern.
  • Verification: Service unit tests verified locally.

Copilot AI review requested due to automatic review settings March 2, 2026 13:55
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 2, 2026

Warning

Rate limit exceeded

@poyrazK has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 13 minutes and 37 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 145d5f3 and fe651ed.

📒 Files selected for processing (22)
  • cmd/csi-driver/main.go
  • docs/adr/ADR-019-database-metrics-sidecar.md
  • docs/database.md
  • docs/swagger/docs.go
  • docs/swagger/swagger.json
  • docs/swagger/swagger.yaml
  • internal/core/domain/database.go
  • internal/core/ports/database.go
  • internal/core/services/benchmarks_test.go
  • internal/core/services/database.go
  • internal/core/services/database_test.go
  • internal/core/services/database_unit_test.go
  • internal/core/services/volume_unit_test.go
  • internal/csi/driver.go
  • internal/handlers/database_handler.go
  • internal/handlers/database_handler_test.go
  • internal/repositories/postgres/database_repo.go
  • internal/repositories/postgres/database_repo_unit_test.go
  • internal/repositories/postgres/migrations/097_add_database_metrics.down.sql
  • internal/repositories/postgres/migrations/097_add_database_metrics.up.sql
  • internal/workers/database_failover_worker_test.go
  • pkg/sdk/volume.go
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/rds-observability-sidecar

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@poyrazK poyrazK merged commit 14a4eeb into main Mar 2, 2026
43 checks passed
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds first-class observability for managed databases by introducing an optional metrics_enabled flag that provisions a Prometheus exporter sidecar (Postgres/MySQL) and persists exporter metadata in the database record.

Changes:

  • Extend the database domain model, API requests, repository schema/queries, and service workflows to support metrics sidecars (metrics_enabled, metrics_port, exporter_container_id).
  • Launch and clean up exporter containers during DB create/restore/replica/delete flows.
  • Update tests and Swagger/docs/ADR to reflect the new API and behavior.

Reviewed changes

Copilot reviewed 19 out of 22 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pkg/sdk/volume.go Removes trailing whitespace at EOF (no functional change).
internal/workers/database_failover_worker_test.go Updates mocked DatabaseService method signatures to include metricsEnabled.
internal/repositories/postgres/migrations/097_add_database_metrics.up.sql Adds metrics_enabled, metrics_port, exporter_container_id columns to databases.
internal/repositories/postgres/migrations/097_add_database_metrics.down.sql Drops the newly added database metrics columns.
internal/repositories/postgres/database_repo_unit_test.go Updates repo unit tests for new columns and scan/insert/update behavior.
internal/repositories/postgres/database_repo.go Persists and reads metrics fields in SQL queries and scanning.
internal/handlers/database_handler_test.go Adds handler test coverage for metrics_enabled and updates mocks.
internal/handlers/database_handler.go Plumbs metrics_enabled from requests into service calls for create/restore.
internal/csi/driver.go Formatting/alignment only.
internal/core/services/volume_unit_test.go Whitespace/alignment only.
internal/core/services/database_unit_test.go Adds tests around snapshots/restore and metrics sidecar provisioning.
internal/core/services/database_test.go Updates service tests for new metricsEnabled parameter.
internal/core/services/database.go Implements metrics sidecar launch/cleanup and persists new metadata fields.
internal/core/services/benchmarks_test.go Updates service construction parameters for benchmarks.
internal/core/ports/database.go Extends DatabaseService interface to include metricsEnabled on create/restore.
internal/core/domain/database.go Adds metrics fields to the Database domain model and JSON shape.
docs/swagger/swagger.yaml Documents new response/request fields (metrics_enabled, metrics_port).
docs/swagger/swagger.json Regenerates Swagger JSON with new fields.
docs/swagger/docs.go Regenerates swagger docs template with new fields.
docs/database.md Documents managed DB observability feature and the new fields.
docs/adr/ADR-019-database-metrics-sidecar.md Adds ADR describing the sidecar exporter design and tradeoffs.
cmd/csi-driver/main.go Formatting/alignment only.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +365 to +375
func (s *DatabaseService) getExporterConfig(engine domain.DatabaseEngine, dbIP, username, password, dbName string) (string, []string, string) {
switch engine {
case domain.EnginePostgres:
// postgres-exporter uses DATA_SOURCE_NAME
dsn := fmt.Sprintf("postgresql://%s:%s@%s:5432/%s?sslmode=disable", username, password, dbIP, dbName)
return "prometheuscommunity/postgres-exporter", []string{"DATA_SOURCE_NAME=" + dsn}, "9187"
case domain.EngineMySQL:
// mysqld-exporter uses DATA_SOURCE_NAME
dsn := fmt.Sprintf("%s:%s@(%s:3306)/%s", username, password, dbIP, dbName)
return "prom/mysqld-exporter", []string{"DATA_SOURCE_NAME=" + dsn}, "9104"
}
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getExporterConfig builds DATA_SOURCE_NAME by interpolating username/password/dbName directly into DSNs. Passwords generated by util.GenerateRandomPassword include characters like @, :, #, etc., which will break both the Postgres URL DSN and the MySQL DSN unless properly escaped/encoded. Encode credentials (and dbName) when constructing the DSN (e.g., URL-encode for Postgres URLs, and apply the appropriate escaping for mysqld-exporter’s DSN format) to ensure exporters can connect reliably.

Copilot uses AI. Check for mistakes.
expCID, expPorts, err := s.compute.LaunchInstanceWithOptions(ctx, ports.CreateInstanceOptions{
Name: exporterName,
ImageName: exporterImage,
Ports: []string{"0:" + exporterPort},
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exporter sidecar is launched with a host port mapping (Ports: []string{"0:" + exporterPort}), and the Docker compute backend binds published ports to 0.0.0.0. Since Prometheus exporters are typically unauthenticated, this makes the metrics endpoint broadly reachable on the host network and increases data exposure risk. Consider avoiding host port publishing (scrape over the container/VPC network instead), or add a mechanism to bind to an internal-only interface / enforce network policy so metrics aren’t exposed publicly by default.

Suggested change
Ports: []string{"0:" + exporterPort},
Ports: []string{exporterPort},

Copilot uses AI. Check for mistakes.
Comment on lines +148 to +174
if metricsEnabled {
dbIP, err := s.compute.GetInstanceIP(ctx, containerID)
if err == nil {
exporterImage, exporterEnv, exporterPort := s.getExporterConfig(dbEngine, dbIP, username, password, name)
exporterName := fmt.Sprintf("cloud-db-exporter-%s-%s", name, db.ID.String()[:8])

expCID, expPorts, err := s.compute.LaunchInstanceWithOptions(ctx, ports.CreateInstanceOptions{
Name: exporterName,
ImageName: exporterImage,
Ports: []string{"0:" + exporterPort},
NetworkID: networkID,
Env: exporterEnv,
})
if err == nil {
db.ExporterContainerID = expCID
expHostPort, err := s.parseAllocatedPort(expPorts, exporterPort)
if err == nil && expHostPort != 0 {
db.MetricsPort = expHostPort
} else {
expHostPort, _ = s.compute.GetInstancePort(ctx, expCID, exporterPort)
db.MetricsPort = expHostPort
}
} else {
s.logger.Warn("failed to launch metrics exporter", "database_id", db.ID, "error", err)
}
}
}
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When metricsEnabled is true, failures to fetch the DB container IP or to resolve the exporter’s published port are silently ignored (and db.MetricsEnabled remains true). This can persist a database record that claims metrics are enabled while exporter_container_id/metrics_port are unset (or zero), which is confusing for API consumers and hard to operate. Consider logging when GetInstanceIP fails, and either (a) fail the request, or (b) record metrics as disabled / add an explicit exporter status field so the persisted state reflects what was actually provisioned.

Copilot uses AI. Check for mistakes.
Comment thread docs/database.md
- **MySQL**: Uses `mysqld-exporter` (port 9104).

#### Scraping & Monitoring
Once enabled, the exporter's port is mapped to a host port (available in the `metrics_port` field of the database object). These endpoints are automatically registered with the platform's central Prometheus instance for dashboarding and alerting.
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section states that metrics endpoints are “automatically registered with the platform's central Prometheus instance”, but there’s no corresponding registration logic in this PR (the only behavior added is launching an exporter and storing metrics_port). Either adjust the documentation to reflect the current behavior (expose metrics_port for external scraping) or add the actual Prometheus registration mechanism.

Suggested change
Once enabled, the exporter's port is mapped to a host port (available in the `metrics_port` field of the database object). These endpoints are automatically registered with the platform's central Prometheus instance for dashboarding and alerting.
Once enabled, the exporter's port is mapped to a host port and exposed via the `metrics_port` field of the database object. Operators should configure their monitoring stack (for example, a central Prometheus instance) to scrape this host port for dashboarding and alerting.

Copilot uses AI. Check for mistakes.
@poyrazK poyrazK deleted the feature/rds-observability-sidecar branch April 12, 2026 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants