Skip to content

feat: managed database credential security with HashiCorp Vault and automated rotation#96

Merged
poyrazK merged 52 commits intomainfrom
feature/database-vault-security
Apr 8, 2026
Merged

feat: managed database credential security with HashiCorp Vault and automated rotation#96
poyrazK merged 52 commits intomainfrom
feature/database-vault-security

Conversation

@poyrazK
Copy link
Copy Markdown
Owner

@poyrazK poyrazK commented Mar 7, 2026

Implementation of Managed Database Credential Security with HashiCorp Vault

This PR introduces enterprise-grade secret management for our Managed Database (RDS) service by integrating HashiCorp Vault. It also adds an automated credential rotation feature.

Key Changes

1. HashiCorp Vault Integration

  • Developed a new VaultAdapter implementing the SecretsManager interface.
  • Database passwords are now stored securely in Vault's KV v2 engine instead of plain text in the primary metadata store.
  • Updated DatabaseService to orchestrate credential storage during creation, retrieval, and restoration.
  • Ensured read replicas fetch primary passwords from Vault to establish replication.

2. Credential Rotation (Day 2 Ops)

  • Added POST /databases/:id/rotate-credentials endpoint.
  • Implemented rotation workflow:
    • Generate new 16-char secure password.
    • Update password in Vault.
    • Execute ALTER USER inside the database container (PostgreSQL/MySQL).
    • Automatically reload associated sidecars (e.g., PgBouncer).

3. Schema & Repository Updates

  • Added migration to include credential_path in the databases table.
  • Updated DatabaseRepository to handle the new field while maintaining fallback compatibility.

4. Comprehensive Testing

  • Added unit tests for the Vault adapter, service logic, and handler.
  • Enhanced E2E test suite with a full credential rotation test case.

Verification

  • All unit tests passed (go test ./internal/...)
  • E2E rotation test passed (go test -v tests/database_e2e_test.go)
  • Swagger documentation regenerated.
  • Documentation updated in docs/database.md and docs/guides/rds.md.

Commit History

The changes have been broken down into 17 logical commits for easier review.

Summary by CodeRabbit

  • New Features

    • POST /databases/{id}/rotate-credentials endpoint, CLI command, and SDK method to rotate DB credentials
    • HashiCorp Vault (KV v2) integration for secure credential storage and automated credential rotation; pooler (PgBouncer) restarts during rotation
  • Documentation

    • Guides, API reference, Swagger, and README updated with Vault integration and rotation workflow
  • Database

    • Schema extended to reference Vault paths (credential_path) and new metrics fields (metrics_enabled, metrics_port, exporter id)

Copilot AI review requested due to automatic review settings March 7, 2026 12:04
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds HashiCorp Vault KV v2 as a SecretsManager, wires Vault into DatabaseService for credential storage and rotation, exposes POST /databases/:id/rotate-credentials (API, handler, CLI, SDK), persists credential_path (migration/repo), and updates config, tests, and docs.

Changes

Cohort / File(s) Summary
Documentation
README.md, docs/FEATURES.md, docs/api-reference.md, docs/database.md, docs/guides/rds.md
Documented Vault integration, credential rotation workflow, new API/CLI/SDK entry, and new DB schema columns (credential_path, metrics_enabled, metrics_port, exporter_container_id).
API Specs / Swagger
docs/swagger/docs.go, docs/swagger/swagger.json, docs/swagger/swagger.yaml
Added POST /databases/{id}/rotate-credentials operation with Idempotency-Key header and APIKeyAuth; updated OpenAPI/Swagger artefacts.
Vault Adapter & NoOp
internal/adapters/vault/adapter.go, internal/adapters/vault/adapter_test.go, internal/adapters/vault/noop.go
New VaultAdapter (KV v2 semantics) implementing SecretsManager with unit tests and a NoOpSecretsManager for dev/testing.
Core Ports, Domain & Service
internal/core/ports/secrets_manager.go, internal/core/ports/database.go, internal/core/domain/database.go, internal/core/services/database.go, internal/core/services/*_test.go
Introduced SecretsManager interface; added Database.CredentialPath; injected SecretsManager and VaultMountPath into DatabaseService; implemented Vault-backed credential lifecycle and RotateCredentials (idempotency via rotationCache); extensive service logic and tests updated.
Dependency Wiring & Config
internal/api/setup/dependencies.go, internal/platform/config.go, internal/api/setup/router.go
New Vault config fields (VaultAddress, VaultToken, VaultMountPath); conditional secrets manager init (NoOp vs Vault) with startup Ping; pass secrets and mount path into service params; registered rotate-credentials route.
HTTP Layer, Handler Tests & CLI/SDK
internal/handlers/database_handler.go, internal/handlers/database_handler_test.go, cmd/cloud/db.go, pkg/sdk/database.go
Added handler and tests for POST /databases/:id/rotate-credentials (reads Idempotency-Key); CLI db rotate-credentials and SDK client method added.
Persistence & Migrations
internal/repositories/postgres/database_repo.go, internal/repositories/postgres/database_repo_unit_test.go, internal/repositories/postgres/migrations/099_add_database_credential_path.up.sql
Persisted and selected credential_path in repo queries; added migration to add credential_path column; updated repo unit tests.
E2E & Worker Tests
tests/database_e2e_test.go, internal/workers/database_failover_worker_test.go
Added E2E rotate-credentials step; updated mocks/tests to include RotateCredentials in workers and handler tests.
Module dependencies
go.mod
Added github.com/hashicorp/vault/api v1.22.0 and adjusted indirect dependencies required by Vault client.

Sequence Diagram

sequenceDiagram
    actor Client
    participant API as API Handler
    participant Service as DatabaseService
    participant Vault as Vault (SecretsManager)
    participant DB as Database Container
    participant Pooler as Pooler Sidecar

    Client->>API: POST /databases/:id/rotate-credentials (Idempotency-Key)
    API->>Service: RotateCredentials(ctx, dbID, idempotencyKey)

    Service->>Service: Generate new password
    Service->>Vault: StoreSecret(vault_path, {"password": new})
    Vault-->>Service: 200 OK

    Service->>DB: ALTER USER ... WITH PASSWORD '<new>'
    DB-->>Service: OK

    Service->>Pooler: Restart (if present)
    Pooler-->>Service: OK

    Service->>Service: Emit audit/event logs
    Service-->>API: success
    API-->>Client: HTTP 200 {message: "database credentials rotated successfully"}
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I nibbled docs and found the key,

I hopped to Vault and hid it free,
Passwords twirl, rotations sing,
Sidecars wake and connections cling,
A cozy burrow for every DB.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 19.05% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main feature added: HashiCorp Vault integration for database credential security with automated rotation capability.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/database-vault-security

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docs/guides/rds.md (1)

57-71: ⚠️ Potential issue | 🟡 Minor

Duplicate section numbering.

Both "High Availability & Replication" (Line 57) and "Delete a Database" (Line 71) are numbered as section 5. The "Delete a Database" section should be renumbered to 6.

📝 Proposed fix
-### 5. Delete a Database
+### 6. Delete a Database
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/rds.md` around lines 57 - 71, The "Delete a Database" heading is
misnumbered as section 5; update the markdown heading for that section title
(currently "### 5. Delete a Database") to "### 6. Delete a Database" so
numbering follows "### 5. High Availability & Replication"; locate the "Delete a
Database" header in docs/guides/rds.md (heading text "Delete a Database") and
change its numeric prefix from 5 to 6.
internal/core/services/database.go (1)

108-125: ⚠️ Potential issue | 🟠 Major

Clean up Vault state on early provisioning failures.

After line 110 succeeds, any error before performProvisioningRollback runs—such as resolveVpcNetwork or CreateVolume—returns immediately and leaves an orphaned secret behind. The same ordering is repeated in the replica and restore flows below.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/services/database.go` around lines 108 - 125, After storing the
password via s.secrets.StoreSecret (vaultPath from s.getVaultPath and
db.CredentialPath) ensure any early return paths before
performProvisioningRollback remove the orphaned secret: on errors returned by
s.resolveVpcNetwork and s.volumeSvc.CreateVolume (and the analogous points in
the replica and restore flows) call the secrets removal API (e.g.
s.secrets.DeleteSecret(ctx, vaultPath) or the project’s equivalent) before
returning the wrapped error so the Vault state is cleaned up; make sure to
perform this cleanup in every early-exit branch that occurs after successful
StoreSecret and before performProvisioningRollback is invoked.
🧹 Nitpick comments (6)
docs/api-reference.md (1)

597-600: Consider adding response example for consistency.

Other endpoints in this document include response examples. For completeness, consider adding the expected response:

📝 Suggested documentation enhancement
 ### POST /databases/:id/rotate-credentials
 Regenerate the database password and update it in both the database engine and Vault.
 - **Security**: Ensures credentials are never stored in plain text in the primary metadata store.
 - **Workflow**: Automated update of database users and sidecar (pooler) reloads.
+
+**Response:**
+```json
+{
+  "message": "database credentials rotated successfully"
+}
+```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/api-reference.md` around lines 597 - 600, Add a response example for the
POST /databases/:id/rotate-credentials endpoint to match other endpoints:
include a small JSON success body (e.g., { "message": "database credentials
rotated successfully" }) and label it as a response example under the endpoint
documentation so readers see the expected API response format for
rotate-credentials.
internal/adapters/vault/adapter_test.go (1)

15-69: Prefer a table-driven layout for these adapter cases.

StoreSecret, GetSecret, DeleteSecret, and Ping are all the same test shape with different setup/assertions. Folding them into a small table-driven suite will make it easier to add the missing failure-path cases for the Vault adapter. As per coding guidelines, "Use table-driven tests in test files".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/adapters/vault/adapter_test.go` around lines 15 - 69, The
TestVaultAdapter test should be converted to a table-driven subtest loop: define
a slice of test cases (with fields like name, method to call
(StoreSecret/GetSecret/DeleteSecret/Ping), input path, payload where needed,
expected status or expected value, and whether an error is expected) and iterate
with for _, tc := range cases { t.Run(tc.name, func(t *testing.T) { ... }) },
invoking the appropriate adapter methods (adapter.StoreSecret,
adapter.GetSecret, adapter.DeleteSecret, adapter.Ping) based on tc.name or a
function field and performing the corresponding assertions
(assert.NoError/assert.Error, and for GetSecret assert returned data contains
expected inner "data" map and password). Keep the existing mock server setup and
re-use ctx and adapter variables, and ensure the GetSecret case checks nested
"data" as before; structure will make it easy to add failure-path cases later.
internal/handlers/database_handler_test.go (1)

557-574: Add failure-path coverage and make this table-driven.

This new endpoint test only checks the happy path. For a credential-rotation route, please also cover invalid UUID and service-error responses, and fold the cases into a table-driven test like the rest of the handler surface should be moving toward. As per coding guidelines, "Use table-driven tests in test files".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/handlers/database_handler_test.go` around lines 557 - 574, Convert
TestDatabaseHandlerRotateCredentials into a table-driven test that iterates
cases for success, invalid UUID, and service error; for each case build the
request path, set up the mock service expectation on svc.RotateCredentials (for
success return nil, for service error return a specific error, and for invalid
UUID do not expect a call), call r.ServeHTTP and assert the expected status code
and response body substring (e.g., 200 + "database credentials rotated
successfully" for success, 400 for invalid UUID, 500 + error message for service
error). Keep the test using existing helpers setupDatabaseHandlerTest,
r.POST(..., handler.RotateCredentials) and reference the test name
TestDatabaseHandlerRotateCredentials, the handler method RotateCredentials, and
the mock expectation svc.On("RotateCredentials", ...) when creating cases.
Ensure t.Parallel() remains and defer svc.AssertExpectations(t) still runs.
docs/swagger/docs.go (1)

1382-1418: Use a concrete response schema for credential rotation.

This documents the success body as an anonymous map[string]string, which makes the contract vague for SDK generation and leaves it unclear what fields are safe to rely on for a security-sensitive operation. Please point the swagger annotation at a named response type, or the standard response envelope, so clients get a stable schema.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/swagger/docs.go` around lines 1382 - 1418, The OpenAPI entry for the
POST "/databases/{id}/rotate-credentials" operation uses an anonymous
map[string]string response which is too vague; change the responses.200.schema
to reference a concrete named response model (e.g. "$ref":
"#/definitions/RotateCredentialsResponse" or the existing standard envelope
type) instead of the anonymous "type: object/additionalProperties" block, and
add or reuse the corresponding definition in the same docs/swagger/docs.go (or
definitions section) so SDKs get a stable schema for the rotated credentials
payload.
internal/api/setup/dependencies.go (1)

264-268: Vault connectivity is not verified at startup.

The NewVaultAdapter constructor only creates the client but doesn't verify connectivity (see internal/adapters/vault/adapter.go:19-34). If Vault is unreachable or misconfigured (e.g., empty token from config), the error will only surface during the first secret operation, potentially causing database creation failures at runtime.

Consider calling vaultSvc.Ping(ctx) after initialization to fail fast if Vault is unavailable:

♻️ Proposed enhancement
 	vaultSvc, err := vault.NewVaultAdapter(c.Config.VaultAddress, c.Config.VaultToken, c.Logger)
 	if err != nil {
 		return nil, nil, fmt.Errorf("failed to init vault adapter: %w", err)
 	}
+
+	// Verify Vault connectivity at startup
+	ctx := context.Background()
+	if err := vaultSvc.Ping(ctx); err != nil {
+		c.Logger.Warn("vault health check failed - credential operations may fail", "error", err)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/api/setup/dependencies.go` around lines 264 - 268, NewVaultAdapter
currently only constructs the Vault client but doesn't verify connectivity, so
network/token issues surface later; after creating vaultSvc in the
initialization flow (where NewVaultAdapter is called) call vaultSvc.Ping(ctx)
and return an error if Ping fails (use the same wrapped fmt.Errorf pattern) so
the process fails fast on Vault connectivity/token problems; reference
NewVaultAdapter and the Ping method on the returned vaultSvc to locate where to
add the check.
internal/platform/config.go (1)

82-84: Consider validating Vault configuration in production.

VaultToken defaults to an empty string with no validation. When Vault is required for database credential management, an empty token will cause runtime failures during secret operations rather than failing fast at startup. Consider logging a warning or returning an error in InitServices if Vault is enabled but the token is missing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/platform/config.go` around lines 82 - 84, The Vault configuration
allows VaultToken to be empty (set via getEnv) which can cause runtime failures;
update InitServices to validate Vault settings (VaultAddress, VaultToken,
VaultMountPath) when Vault-based secret management is enabled: check that
VaultToken is non-empty and VaultAddress/MountPath are set, and if the token is
missing either log a clear warning and fail fast or return an error from
InitServices so startup aborts rather than failing later during secret ops;
reference the VaultToken/VaultAddress/VaultMountPath fields and the getEnv usage
to locate the config and add the validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/FEATURES.md`:
- Line 127: The docs overstate credential rotation reliability because the
rotation flow in internal/core/services/database.go updates Vault before
executing the container ALTER USER and has no compensating write if the exec
fails; fix by making the workflow atomic or documenting the failure mode: either
(A) change the flow in the rotation routine (e.g.,
RotateCredentials/rotateDatabaseCredentials) to run the ALTER USER inside the DB
container first and only persist the new secret to Vault on success, or (B) keep
the current order but implement a compensating rollback in the same routine that
writes the old secret back to Vault if the container exec returns an error;
alternatively, if you choose not to change behavior, update FEATURE.md to
explicitly call out the non-atomic failure mode and the possibility of Vault/db
drift.
- Line 127: Update the "Credential Rotation" feature blurb in FEATURES.md to
narrow the claim: replace "Sidecars are automatically reloaded" with a precise
statement that only the PgBouncer/pooler sidecar is restarted (and is recreated
rather than hot-reloaded) when present, and that exporter sidecars are not
reloaded as part of the credential rotation flow; keep the rest of the sentence
about regenerating passwords, updating Vault, and executing ALTER USER intact.

In `@docs/guides/rds.md`:
- Around line 49-55: The docs reference a non-existent CLI command; add a new
subcommand "rotate-credentials" to the DB commands in cmd/cloud/db.go (e.g.,
dbCmd) with usage "rotate-credentials <id>" that calls the existing API endpoint
POST /databases/{id}/rotate-credentials via the project’s API client (use the
existing client method for rotating credentials or add one that wraps the HTTP
POST), handle errors and print success output; alternatively, if you prefer not
to implement the CLI, update docs/guides/rds.md to remove the CLI invocation and
instead document calling the API endpoint POST
/databases/{id}/rotate-credentials (show example curl) and fix the numbering by
renumbering "Delete a Database" to section 6.

In `@docs/swagger/docs.go`:
- Around line 7061-7063: The public Database schema currently exposes
"credential_path" — remove this property from the external API schema and ensure
the internal secret path remains only on the persistence/domain model; update
the swagger/docs.go Database schema to drop "credential_path" and instead map
domain.Database to a public DTO (e.g., PublicDatabase or DatabaseDTO) that omits
credential_path, and adjust any serialization/mapping code that converts
domain.Database to the API model so the secret path is not returned to clients.

In `@docs/swagger/swagger.json`:
- Around line 7053-7055: The public domain.Database schema currently includes
the credential_path property which leaks internal Vault locators; remove
credential_path from the public schema (the domain.Database definition) and
instead store it only server-side or in a privileged/internal schema (e.g.,
domain.DatabaseInternal or AdminDatabase) used by admin endpoints and internal
services; update any model/serialization code that references credential_path to
read it from the new internal/config location and ensure public API docs and
generated clients no longer include credential_path.
- Around line 1374-1408: The rotate-credentials POST currently can cause
double-rotation on retries; add support for an Idempotency-Key header and
propagate it through the request handling pipeline to prevent duplicate
rotations: update the OpenAPI operation for "/databases/{id}/rotate-credentials"
to include a new header parameter "Idempotency-Key" (in: header, type: string,
required: false), then wire that header into the upstream handler/service that
implements the rotation logic (the code path handling rotateCredentials /
rotate-credentials) so the service can detect and dedupe requests before
regenerating the password and reloading sidecars; ensure the header is
checked/stored by the rotation service to make the operation idempotent.

In `@docs/swagger/swagger.yaml`:
- Around line 2963-2969: The OpenAPI response for RotateCredentials currently
describes a bare map but the handler returns an httputil.Response wrapper via
httputil.Success(...); update the 200 response schema for RotateCredentials to
match httputil.Response (include top-level properties like code (integer/string
as used), message (string) and data), and make data the object that contains the
original map[string]string shape (e.g., data: type: object,
additionalProperties: { type: string }); ensure the schema names reference the
handler/function RotateCredentials and the wrapper type
httputil.Response/httputil.Success so generated clients will read the message
from data.

In `@go.mod`:
- Around line 133-142: The dependency github.com/hashicorp/vault/api is
currently marked as indirect but is directly imported by your code (the import
of "github.com/hashicorp/vault/api" in the Vault adapter), so run go mod tidy to
update go.mod and move github.com/hashicorp/vault/api v1.22.0 from an indirect
to a direct dependency; ensure the import in the Vault adapter remains unchanged
and re-run your build/tests to verify the module file was updated correctly.

In `@internal/adapters/vault/adapter_test.go`:
- Around line 17-37: The mock HTTP handler inside the httptest.NewServer closure
currently discards errors from json.NewEncoder(w).Encode(...) (in the handler
for "/v1/secret/data/test" and "/v1/sys/health"); change both occurrences to
capture the error and fail the test (e.g., if err :=
json.NewEncoder(w).Encode(...); err != nil { t.Fatalf("json encode failed: %v",
err) }) by closing over the test *testing.T, so encoding failures aren’t
ignored. While here, consider refactoring the handler into a table-driven test
(parameterizing path, method, response body/status) instead of hardcoding each
case to simplify adding scenarios and assertions.

In `@internal/adapters/vault/adapter.go`:
- Around line 36-66: The context passed into StoreSecret, GetSecret,
DeleteSecret and Ping is not used; update the Vault calls to their context-aware
variants so cancellations/timeouts propagate: in StoreSecret replace
a.client.Logical().Write with a.client.Logical().WriteWithContext(ctx, ...), in
GetSecret replace Read with ReadWithContext(ctx, ...), in DeleteSecret replace
Delete with DeleteWithContext(ctx, ...), and in Ping replace
a.client.Sys().Health() with a.client.Sys().HealthWithContext(ctx, ...); keep
existing error wrapping and return behavior unchanged.
- Around line 36-63: GetSecret currently returns the KV v2 envelope and none of
the Vault calls use the provided context; update StoreSecret, GetSecret, and
DeleteSecret to call a.client.Logical().WriteWithContext(ctx,...),
ReadWithContext(ctx,...), and DeleteWithContext(ctx,...) respectively, and
implement KV v2 semantics: when writing wrap the payload as
map[string]interface{}{"data": data} in StoreSecret, and when reading unwrap
secret.Data["data"] (assert/convert to map[string]interface{}) and return that
inner map (error if missing), while DeleteSecret should simply call
DeleteWithContext with ctx and return errors unchanged.

In `@internal/api/setup/router.go`:
- Line 366: The route registration uses a non-existent constant
domain.PermissionLBRead which breaks the build; update the permission constant
reference in the lbGroup.GET call to the correct identifier
domain.PermissionLbRead (i.e. change PermissionLBRead -> PermissionLbRead) so
httputil.Permission(svcs.RBAC, domain.PermissionLbRead) is used; search for
other usages of PermissionLBRead and replace them with PermissionLbRead to
ensure the code compiles.

In `@internal/core/services/database_vault_test.go`:
- Around line 129-156: The test ends without verifying mock expectations, so add
explicit assertions after the call to svc.CreateDatabase to ensure all mocked
interactions occurred: call AssertExpectations(t) on mockVolumeSvc, mockSecrets,
mockCompute, mockRepo, mockEventSvc, and mockAuditSvc (or use a single
mock.AssertExpectations(t) equivalent) to fail the test if any expected mock
call (e.g., CreateVolume, StoreSecret, LaunchInstanceWithOptions, GetInstanceIP,
Create, RecordEvent, Log) was not invoked.
- Around line 98-105: The test RotateCredentials_VaultFailure sets expectations
on mockRepo.GetByID and mockSecrets.StoreSecret but doesn't verify them; after
calling svc.RotateCredentials and asserting the error, call the mock expectation
checks (e.g., mockRepo.AssertExpectations(t) and
mockSecrets.AssertExpectations(t) or AssertCalled/AssertCalled with GetByID and
StoreSecret) to ensure those mocked methods were actually invoked as expected
for the failure path.

In `@internal/core/services/database.go`:
- Around line 619-627: The pooler restart block currently swallows errors (uses
`_ = ...`) which can leave metadata inconsistent; update the logic around
db.PoolerContainerID so you check and handle errors from
s.compute.DeleteInstance, s.compute.GetInstanceIP, s.resolveVpcNetwork and
especially s.provisionPoolerSidecar instead of ignoring them: call
DeleteInstance and return or log on error, capture errors from GetInstanceIP and
resolveVpcNetwork and propagate if they fail, and only call s.repo.Update(ctx,
db) when provisionPoolerSidecar succeeds; if provisionPoolerSidecar returns an
error, surface that error (return it or wrap and return) so callers know the
pooler restart failed and metadata isn’t left inconsistent.
- Around line 580-604: Currently the code stores the new password to Vault
before executing the DB ALTER and even reads back that just-stored value for
MySQL auth, which can leave Vault and the DB out of sync; change the flow so you
first read the existing credential (use s.secrets.GetSecret on db.CredentialPath
or s.getVaultPath(db.ID) to obtain currentPassword), then run the container
Exec/ALTER USER command using that currentPassword (handle both
domain.EnginePostgres and domain.EngineMySQL branches), and only after the Exec
returns success call s.secrets.StoreSecret to persist newPassword; additionally,
if StoreSecret fails after a successful DB change, attempt to roll Vault back by
restoring the old secret (or return a clear error if rollback fails) so the
secret and DB remain consistent.
- Around line 592-593: The Postgres branch (case domain.EnginePostgres)
currently hard-codes the psql user ("psql -U postgres") which can fail when the
container's superuser is configured as db.Username; update the constructed
command (cmd) to use the provisioned admin user variable (db.Username) instead
of the literal "postgres" so the ALTER USER command runs as the configured admin
(keep using db.Username and newPassword in the fmt.Sprintf invocation). Ensure
you update the string in the cmd assignment that builds the psql invocation.

In `@internal/platform/config.go`:
- Line 72: The environment variable name passed to getEnv for the
ObjectStorageNodes config is mistyped as "OBJECT_STORAGE_Nodes" causing
inconsistent casing; update the call that sets ObjectStorageNodes to use the
correct SCREAMING_SNAKE_CASE name "OBJECT_STORAGE_NODES" so it matches other env
vars and the expected key, i.e., change the getEnv argument associated with the
ObjectStorageNodes field to "OBJECT_STORAGE_NODES".

---

Outside diff comments:
In `@docs/guides/rds.md`:
- Around line 57-71: The "Delete a Database" heading is misnumbered as section
5; update the markdown heading for that section title (currently "### 5. Delete
a Database") to "### 6. Delete a Database" so numbering follows "### 5. High
Availability & Replication"; locate the "Delete a Database" header in
docs/guides/rds.md (heading text "Delete a Database") and change its numeric
prefix from 5 to 6.

In `@internal/core/services/database.go`:
- Around line 108-125: After storing the password via s.secrets.StoreSecret
(vaultPath from s.getVaultPath and db.CredentialPath) ensure any early return
paths before performProvisioningRollback remove the orphaned secret: on errors
returned by s.resolveVpcNetwork and s.volumeSvc.CreateVolume (and the analogous
points in the replica and restore flows) call the secrets removal API (e.g.
s.secrets.DeleteSecret(ctx, vaultPath) or the project’s equivalent) before
returning the wrapped error so the Vault state is cleaned up; make sure to
perform this cleanup in every early-exit branch that occurs after successful
StoreSecret and before performProvisioningRollback is invoked.

---

Nitpick comments:
In `@docs/api-reference.md`:
- Around line 597-600: Add a response example for the POST
/databases/:id/rotate-credentials endpoint to match other endpoints: include a
small JSON success body (e.g., { "message": "database credentials rotated
successfully" }) and label it as a response example under the endpoint
documentation so readers see the expected API response format for
rotate-credentials.

In `@docs/swagger/docs.go`:
- Around line 1382-1418: The OpenAPI entry for the POST
"/databases/{id}/rotate-credentials" operation uses an anonymous
map[string]string response which is too vague; change the responses.200.schema
to reference a concrete named response model (e.g. "$ref":
"#/definitions/RotateCredentialsResponse" or the existing standard envelope
type) instead of the anonymous "type: object/additionalProperties" block, and
add or reuse the corresponding definition in the same docs/swagger/docs.go (or
definitions section) so SDKs get a stable schema for the rotated credentials
payload.

In `@internal/adapters/vault/adapter_test.go`:
- Around line 15-69: The TestVaultAdapter test should be converted to a
table-driven subtest loop: define a slice of test cases (with fields like name,
method to call (StoreSecret/GetSecret/DeleteSecret/Ping), input path, payload
where needed, expected status or expected value, and whether an error is
expected) and iterate with for _, tc := range cases { t.Run(tc.name, func(t
*testing.T) { ... }) }, invoking the appropriate adapter methods
(adapter.StoreSecret, adapter.GetSecret, adapter.DeleteSecret, adapter.Ping)
based on tc.name or a function field and performing the corresponding assertions
(assert.NoError/assert.Error, and for GetSecret assert returned data contains
expected inner "data" map and password). Keep the existing mock server setup and
re-use ctx and adapter variables, and ensure the GetSecret case checks nested
"data" as before; structure will make it easy to add failure-path cases later.

In `@internal/api/setup/dependencies.go`:
- Around line 264-268: NewVaultAdapter currently only constructs the Vault
client but doesn't verify connectivity, so network/token issues surface later;
after creating vaultSvc in the initialization flow (where NewVaultAdapter is
called) call vaultSvc.Ping(ctx) and return an error if Ping fails (use the same
wrapped fmt.Errorf pattern) so the process fails fast on Vault
connectivity/token problems; reference NewVaultAdapter and the Ping method on
the returned vaultSvc to locate where to add the check.

In `@internal/handlers/database_handler_test.go`:
- Around line 557-574: Convert TestDatabaseHandlerRotateCredentials into a
table-driven test that iterates cases for success, invalid UUID, and service
error; for each case build the request path, set up the mock service expectation
on svc.RotateCredentials (for success return nil, for service error return a
specific error, and for invalid UUID do not expect a call), call r.ServeHTTP and
assert the expected status code and response body substring (e.g., 200 +
"database credentials rotated successfully" for success, 400 for invalid UUID,
500 + error message for service error). Keep the test using existing helpers
setupDatabaseHandlerTest, r.POST(..., handler.RotateCredentials) and reference
the test name TestDatabaseHandlerRotateCredentials, the handler method
RotateCredentials, and the mock expectation svc.On("RotateCredentials", ...)
when creating cases. Ensure t.Parallel() remains and defer
svc.AssertExpectations(t) still runs.

In `@internal/platform/config.go`:
- Around line 82-84: The Vault configuration allows VaultToken to be empty (set
via getEnv) which can cause runtime failures; update InitServices to validate
Vault settings (VaultAddress, VaultToken, VaultMountPath) when Vault-based
secret management is enabled: check that VaultToken is non-empty and
VaultAddress/MountPath are set, and if the token is missing either log a clear
warning and fail fast or return an error from InitServices so startup aborts
rather than failing later during secret ops; reference the
VaultToken/VaultAddress/VaultMountPath fields and the getEnv usage to locate the
config and add the validation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ba25f82c-6996-4262-bc14-389e77c1e629

📥 Commits

Reviewing files that changed from the base of the PR and between a2f5d6e and fbfdd26.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (25)
  • README.md
  • docs/FEATURES.md
  • docs/api-reference.md
  • docs/database.md
  • docs/guides/rds.md
  • docs/swagger/docs.go
  • docs/swagger/swagger.json
  • docs/swagger/swagger.yaml
  • go.mod
  • internal/adapters/vault/adapter.go
  • internal/adapters/vault/adapter_test.go
  • internal/api/setup/dependencies.go
  • internal/api/setup/router.go
  • internal/core/domain/database.go
  • internal/core/ports/database.go
  • internal/core/ports/secrets_manager.go
  • internal/core/services/database.go
  • internal/core/services/database_vault_test.go
  • internal/handlers/database_handler.go
  • internal/handlers/database_handler_test.go
  • internal/platform/config.go
  • internal/repositories/postgres/database_repo.go
  • internal/repositories/postgres/database_repo_unit_test.go
  • internal/repositories/postgres/migrations/099_add_database_credential_path.up.sql
  • tests/database_e2e_test.go

Comment thread docs/FEATURES.md Outdated
Comment thread docs/guides/rds.md
Comment thread docs/swagger/docs.go Outdated
Comment thread docs/swagger/swagger.json
Comment on lines +1374 to +1408
"/databases/{id}/rotate-credentials": {
"post": {
"security": [
{
"APIKeyAuth": []
}
],
"description": "Regenerates the database password, updates it in the database and secrets manager",
"produces": [
"application/json"
],
"tags": [
"databases"
],
"summary": "Rotate database credentials",
"parameters": [
{
"type": "string",
"description": "Database ID",
"name": "id",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Add idempotency to credential rotation.

This POST rotates the password and reloads sidecars, so a retry after a timeout can rotate again and immediately invalidate the just-issued credential. Expose an Idempotency-Key on this endpoint and wire it through the upstream handler/service before regenerating Swagger.

Suggested contract change
                 "parameters": [
                     {
                         "type": "string",
                         "description": "Database ID",
                         "name": "id",
                         "in": "path",
                         "required": true
+                    },
+                    {
+                        "type": "string",
+                        "description": "Idempotency key",
+                        "name": "Idempotency-Key",
+                        "in": "header"
                     }
                 ],
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"/databases/{id}/rotate-credentials": {
"post": {
"security": [
{
"APIKeyAuth": []
}
],
"description": "Regenerates the database password, updates it in the database and secrets manager",
"produces": [
"application/json"
],
"tags": [
"databases"
],
"summary": "Rotate database credentials",
"parameters": [
{
"type": "string",
"description": "Database ID",
"name": "id",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
"/databases/{id}/rotate-credentials": {
"post": {
"security": [
{
"APIKeyAuth": []
}
],
"description": "Regenerates the database password, updates it in the database and secrets manager",
"produces": [
"application/json"
],
"tags": [
"databases"
],
"summary": "Rotate database credentials",
"parameters": [
{
"type": "string",
"description": "Database ID",
"name": "id",
"in": "path",
"required": true
},
{
"type": "string",
"description": "Idempotency key",
"name": "Idempotency-Key",
"in": "header"
}
],
"responses": {
"200": {
"description": "OK",
"schema": {
"type": "object",
"additionalProperties": {
"type": "string"
}
}
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/swagger/swagger.json` around lines 1374 - 1408, The rotate-credentials
POST currently can cause double-rotation on retries; add support for an
Idempotency-Key header and propagate it through the request handling pipeline to
prevent duplicate rotations: update the OpenAPI operation for
"/databases/{id}/rotate-credentials" to include a new header parameter
"Idempotency-Key" (in: header, type: string, required: false), then wire that
header into the upstream handler/service that implements the rotation logic (the
code path handling rotateCredentials / rotate-credentials) so the service can
detect and dedupe requests before regenerating the password and reloading
sidecars; ensure the header is checked/stored by the rotation service to make
the operation idempotent.

Comment thread docs/swagger/swagger.json Outdated
Comment thread internal/core/services/database_vault_test.go
Comment thread internal/core/services/database.go Outdated
Comment thread internal/core/services/database.go Outdated
Comment thread internal/core/services/database.go
Comment thread internal/platform/config.go Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR integrates HashiCorp Vault for secure managed database credential management, replaces plain-text password storage with Vault KV references, and adds a new POST /databases/:id/rotate-credentials endpoint for zero-downtime credential rotation.

Changes:

  • New VaultAdapter implementing the SecretsManager interface; DatabaseService updated to store/retrieve/delete credentials from Vault during create, replica, modify, restore, and delete operations
  • New RotateCredentials service method and corresponding HTTP handler/route that generates a new password, updates Vault, runs ALTER USER in the container, and restarts PgBouncer sidecars
  • Schema migration adding credential_path column to databases table; repository queries updated accordingly; tests extended for new field

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 16 comments.

Show a summary per file
File Description
internal/adapters/vault/adapter.go New Vault adapter implementing SecretsManager
internal/adapters/vault/adapter_test.go Tests for Vault adapter using mock HTTP server
internal/core/ports/secrets_manager.go New SecretsManager interface
internal/core/ports/database.go Adds RotateCredentials to DatabaseService interface
internal/core/domain/database.go Adds CredentialPath field to Database struct
internal/core/services/database.go Core Vault integration and rotation logic
internal/core/services/database_vault_test.go Tests for new Vault-integrated service methods
internal/handlers/database_handler.go New RotateCredentials handler
internal/handlers/database_handler_test.go Handler test for credential rotation
internal/api/setup/router.go Registers rotate-credentials route; incidental PermissionLBRead change
internal/api/setup/dependencies.go Wires VaultAdapter into DatabaseService
internal/platform/config.go Adds Vault config fields; accidentally changes OBJECT_STORAGE_NODES key
internal/repositories/postgres/database_repo.go SQL queries updated with credential_path column
internal/repositories/postgres/database_repo_unit_test.go Repo tests updated for new field
internal/repositories/postgres/migrations/099_add_database_credential_path.up.sql Migration adding credential_path column
tests/database_e2e_test.go E2E test for credential rotation
go.mod / go.sum Adds hashicorp/vault/api dependency
docs/* Documentation updates for Vault integration and credential rotation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +82 to +83
VaultAddress: getEnv("VAULT_ADDR", "http://localhost:8200"),
VaultToken: getEnv("VAULT_TOKEN", ""),
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The VaultToken is stored as a plain string in the Config struct and loaded directly from the VAULT_TOKEN environment variable. The default is an empty string, meaning if VAULT_TOKEN is not set, the service will initialize a Vault client with no token and silently proceed. All subsequent Vault operations will fail at runtime. There is no startup validation that a non-empty token is configured (e.g., by returning an error from NewConfig or NewVaultAdapter when the token is empty). At minimum, a warning should be logged if the token is empty.

Copilot uses AI. Check for mistakes.
Comment thread internal/core/services/database.go Outdated
Comment on lines +108 to +113
// Store password in Vault
vaultPath := s.getVaultPath(db.ID)
if err := s.secrets.StoreSecret(ctx, vaultPath, map[string]interface{}{"password": password}); err != nil {
return nil, errors.Wrap(errors.Internal, "failed to store database credentials in vault", err)
}
db.CredentialPath = vaultPath
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When LaunchInstanceWithOptions fails after the Vault secret has been stored, the code calls cleanupVolumeQuietly but does not clean up the Vault secret stored just before. Only performProvisioningRollback (which handles failure cases where db.CredentialPath is set) cleans up Vault, but on the LaunchInstanceWithOptions failure path, rollback is bypassed. This leaves an orphaned Vault secret if container launch fails after Vault storage succeeds. The Vault secret should be deleted in this error path too.

Copilot uses AI. Check for mistakes.
Comment thread internal/api/setup/router.go Outdated
lbGroup.POST("", httputil.Permission(svcs.RBAC, domain.PermissionLbCreate), handlers.LB.Create)
lbGroup.GET("", httputil.Permission(svcs.RBAC, domain.PermissionLbRead), handlers.LB.List)
lbGroup.GET("/:id", httputil.Permission(svcs.RBAC, domain.PermissionLbRead), handlers.LB.Get)
lbGroup.GET("/:id", httputil.Permission(svcs.RBAC, domain.PermissionLBRead), handlers.LB.Get)
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The domain.PermissionLBRead constant referenced here does not exist. The correct constant defined in internal/core/domain/rbac.go is PermissionLbRead (lowercase "b"). This will cause a compilation error, breaking the entire application.

Suggested change
lbGroup.GET("/:id", httputil.Permission(svcs.RBAC, domain.PermissionLBRead), handlers.LB.Get)
lbGroup.GET("/:id", httputil.Permission(svcs.RBAC, domain.PermissionLbRead), handlers.LB.Get)

Copilot uses AI. Check for mistakes.
Comment thread internal/platform/config.go Outdated
ObjectStorageMode: getEnv("OBJECT_STORAGE_MODE", "local"),

ObjectStorageNodes: getEnv("OBJECT_STORAGE_NODES", ""),
ObjectStorageNodes: getEnv("OBJECT_STORAGE_Nodes", ""),
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The environment variable key has been changed from OBJECT_STORAGE_NODES (uppercase) to OBJECT_STORAGE_Nodes (mixed case). Linux environment variables are case-sensitive, so any deployment that sets OBJECT_STORAGE_NODES will find the value empty, breaking the object storage node configuration. This was presumably an accidental change introduced alongside the Vault config additions.

Suggested change
ObjectStorageNodes: getEnv("OBJECT_STORAGE_Nodes", ""),
ObjectStorageNodes: getEnv("OBJECT_STORAGE_NODES", ""),

Copilot uses AI. Check for mistakes.
Comment on lines +592 to +604
case domain.EnginePostgres:
cmd = []string{"psql", "-U", "postgres", "-c", fmt.Sprintf("ALTER USER %s WITH PASSWORD '%s';", db.Username, newPassword)}
case domain.EngineMySQL:
// For MySQL rotation, we need the root password to run the ALTER USER command.
// Since we don't have it easily available, we'll try using the current user's password if it's stored.
currentPassword := db.Password
if db.CredentialPath != "" {
secret, _ := s.secrets.GetSecret(ctx, db.CredentialPath)
if p, ok := secret["password"].(string); ok {
currentPassword = p
}
}
cmd = []string{"mysql", "-u", "root", "-p" + currentPassword, "-e", fmt.Sprintf("ALTER USER '%s'@'%%' IDENTIFIED BY '%s';", db.Username, newPassword)}
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The password generated by util.GenerateRandomPassword can contain single-quote characters ('), as the character set includes !@#$%^&*()-_=+ (see pkg/util/password.go:10). Single quotes are used as delimiters in both the Postgres and MySQL ALTER USER SQL strings:

  • Postgres: ALTER USER %s WITH PASSWORD '%s'
  • MySQL: ALTER USER '%s'@'%%' IDENTIFIED BY '%s'

A generated password containing a single quote would break the SQL syntax, causing the rotation command to fail or, in the worst case, allowing SQL injection within the container exec context. Parameterized queries or proper SQL escaping should be used instead. For the MySQL case, note that mysql -e does not support parameterized queries, so the password should be escaped or the MySQL client should be invoked differently (e.g., via stdin or a config file).

Copilot uses AI. Check for mistakes.
Comment on lines +594 to +604
case domain.EngineMySQL:
// For MySQL rotation, we need the root password to run the ALTER USER command.
// Since we don't have it easily available, we'll try using the current user's password if it's stored.
currentPassword := db.Password
if db.CredentialPath != "" {
secret, _ := s.secrets.GetSecret(ctx, db.CredentialPath)
if p, ok := secret["password"].(string); ok {
currentPassword = p
}
}
cmd = []string{"mysql", "-u", "root", "-p" + currentPassword, "-e", fmt.Sprintf("ALTER USER '%s'@'%%' IDENTIFIED BY '%s';", db.Username, newPassword)}
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the MySQL branch of RotateCredentials, the code tries to retrieve the current (old) password from Vault at line 599 to authenticate to MySQL. However, at this point the new password has already been saved to Vault at line 585. Therefore, GetSecret will return the newly stored password, not the old one. The MySQL ALTER USER command will then be executed using the new password as authentication credentials, but the database still has the old password, causing the authentication to fail and leaving the system in an inconsistent state (Vault has the new password but the database still has the old one).

The old password should be fetched from Vault before storing the new one, or the legacy db.Password field should be used as the fallback for the auth credential (recognizing the limitations noted in the code comment). Additionally, the error from GetSecret is silently ignored with _, _ which can lead to a nil pointer panic when dereferencing secret["password"] if secret is nil.

Copilot uses AI. Check for mistakes.
Comment thread internal/core/services/database.go Outdated
Comment on lines +108 to +113
// Store password in Vault
vaultPath := s.getVaultPath(db.ID)
if err := s.secrets.StoreSecret(ctx, vaultPath, map[string]interface{}{"password": password}); err != nil {
return nil, errors.Wrap(errors.Internal, "failed to store database credentials in vault", err)
}
db.CredentialPath = vaultPath
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing TestDatabaseServiceUnitExtended in internal/core/services/database_unit_test.go creates a DatabaseService without providing a Secrets mock (line 67-77). Since CreateDatabase, CreateReplica, RestoreDatabase, DeleteDatabase, GetConnectionString, and ModifyDatabase now unconditionally call methods on s.secrets, this will cause a nil pointer panic at runtime when those tests run. The Secrets dependency must be added to the service construction in this test file to keep the existing tests passing.

Copilot uses AI. Check for mistakes.
Comment thread internal/core/domain/database.go Outdated
PoolingEnabled bool `json:"pooling_enabled"`
PoolingPort int `json:"pooling_port,omitempty"`
PoolerContainerID string `json:"-"`
CredentialPath string `json:"credential_path,omitempty"`
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CredentialPath field is serialized to the JSON API response (json:"credential_path,omitempty"). This exposes internal Vault path structure (e.g., secret/data/thecloud/rds/<uuid>/credentials) to API consumers, which leaks infrastructure details. Similar sensitive internal fields like ExporterContainerID, PoolerContainerID, and Password are tagged with json:"-" to prevent serialization. The CredentialPath field should use the same json:"-" tag.

Suggested change
CredentialPath string `json:"credential_path,omitempty"`
CredentialPath string `json:"-"`

Copilot uses AI. Check for mistakes.
Comment thread internal/adapters/vault/adapter.go Outdated
Comment on lines +36 to +71
func (a *VaultAdapter) StoreSecret(ctx context.Context, path string, data map[string]interface{}) error {
_, err := a.client.Logical().Write(path, data)
if err != nil {
return fmt.Errorf("failed to write secret to vault at %s: %w", path, err)
}
return nil
}

func (a *VaultAdapter) GetSecret(ctx context.Context, path string) (map[string]interface{}, error) {
secret, err := a.client.Logical().Read(path)
if err != nil {
return nil, fmt.Errorf("failed to read secret from vault at %s: %w", path, err)
}

if secret == nil || secret.Data == nil {
return nil, fmt.Errorf("secret not found at path: %s", path)
}

return secret.Data, nil
}

func (a *VaultAdapter) DeleteSecret(ctx context.Context, path string) error {
_, err := a.client.Logical().Delete(path)
if err != nil {
return fmt.Errorf("failed to delete secret from vault at %s: %w", path, err)
}
return nil
}

func (a *VaultAdapter) Ping(ctx context.Context) error {
_, err := a.client.Sys().Health()
if err != nil {
return fmt.Errorf("vault health check failed: %w", err)
}
return nil
}
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All methods in the VaultAdapter (StoreSecret, GetSecret, DeleteSecret, Ping) accept a context.Context parameter but do not pass it to the underlying Vault API client calls. The hashicorp/vault/api package provides context-aware variants for all operations (e.g., a.client.Logical().WriteWithContext(ctx, path, data), ReadWithContext, DeleteWithContext, a.client.Sys().HealthWithContext(ctx)). Without context propagation, request cancellations, deadlines, and timeouts set by callers will be silently ignored, leading to goroutine leaks or unresponsive requests.

Copilot uses AI. Check for mistakes.
Comment thread internal/core/services/database.go Outdated
Comment on lines +585 to +611
if err := s.secrets.StoreSecret(ctx, vaultPath, map[string]interface{}{"password": newPassword}); err != nil {
return errors.Wrap(errors.Internal, "failed to store new credentials in vault", err)
}

// 2. Execute ALTER USER in container
var cmd []string
switch db.Engine {
case domain.EnginePostgres:
cmd = []string{"psql", "-U", "postgres", "-c", fmt.Sprintf("ALTER USER %s WITH PASSWORD '%s';", db.Username, newPassword)}
case domain.EngineMySQL:
// For MySQL rotation, we need the root password to run the ALTER USER command.
// Since we don't have it easily available, we'll try using the current user's password if it's stored.
currentPassword := db.Password
if db.CredentialPath != "" {
secret, _ := s.secrets.GetSecret(ctx, db.CredentialPath)
if p, ok := secret["password"].(string); ok {
currentPassword = p
}
}
cmd = []string{"mysql", "-u", "root", "-p" + currentPassword, "-e", fmt.Sprintf("ALTER USER '%s'@'%%' IDENTIFIED BY '%s';", db.Username, newPassword)}
default:
return errors.New(errors.Internal, "unsupported engine for credential rotation")
}

if _, err := s.compute.Exec(ctx, db.ContainerID, cmd); err != nil {
return errors.Wrap(errors.Internal, "failed to execute password rotation in container", err)
}
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RotateCredentials function does not rollback the Vault secret if the Exec command fails (line 609). The operation order is: (1) store new password in Vault, (2) execute ALTER USER in container, (3) update DB record. If step 2 fails, Vault now holds the new password but the database engine still has the old one. This creates an inconsistency: subsequent reads of the credentials from Vault will return the new password, but it will be rejected by the database. A rollback to the old password in Vault should be performed if the Exec call fails, or the old password should be fetched and restored.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings March 7, 2026 12:59
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 26 out of 27 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (3)

internal/api/setup/router.go:366

  • The PermissionLBRead constant does not exist in internal/core/domain/rbac.go. The correct constant name is PermissionLbRead (lowercase 'b'). This will cause a compilation failure. The original code used PermissionLbRead (the diff confirms this is a change from PermissionLbRead to PermissionLBRead), so this is a regression introduced by this PR.
		lbGroup.GET("/:id", httputil.Permission(svcs.RBAC, domain.PermissionLbRead), handlers.LB.Get)

internal/core/services/database.go:140

  • The Vault secret is stored at line 110 before the VPC network resolution and volume creation. If resolveVpcNetwork (line 117-119) or CreateVolume (line 122-124) fails, the function returns early without cleaning up the already-written Vault secret, leaving an orphaned secret. Similarly, if LaunchInstanceWithOptions fails (line 137-140), cleanupVolumeQuietly is called but the Vault secret is not deleted.

The performProvisioningRollback does handle Vault cleanup, but it's not called in these early failure paths. The Vault secret creation should either happen after the volume/container are ready, or the early return paths should also call secrets.DeleteSecret(ctx, vaultPath) to clean up.

	// Store password in Vault
	vaultPath := s.getVaultPath(db.ID)
	if err := s.secrets.StoreSecret(ctx, vaultPath, map[string]interface{}{"password": password}); err != nil {
		return nil, errors.Wrap(errors.Internal, "failed to store database credentials in vault", err)
	}
	db.CredentialPath = vaultPath

	imageName, env, defaultPort := s.getEngineConfig(dbEngine, req.Version, username, password, req.Name, db.Role, "")

	networkID, err := s.resolveVpcNetwork(ctx, req.VpcID)
	if err != nil {
		return nil, err
	}

	vol, err := s.volumeSvc.CreateVolume(ctx, fmt.Sprintf("db-vol-%s", db.ID.String()[:8]), req.AllocatedStorage)
	if err != nil {
		return nil, errors.Wrap(errors.Internal, "failed to create persistent volume", err)
	}

	containerID, allocatedPorts, err := s.compute.LaunchInstanceWithOptions(ctx, ports.CreateInstanceOptions{
		Name:        fmt.Sprintf("cloud-db-%s-%s", req.Name, db.ID.String()[:8]),
		ImageName:   imageName,
		Ports:       []string{"0:" + defaultPort},
		NetworkID:   networkID,
		VolumeBinds: []string{fmt.Sprintf("%s:%s", s.getBackendVolName(vol), s.getMountPath(dbEngine))},
		Env:         env,
		Cmd:         s.buildEngineCmd(dbEngine, req.Parameters),
	})

	if err != nil {
		s.cleanupVolumeQuietly(ctx, vol.ID.String())
		return nil, errors.Wrap(errors.Internal, "failed to launch database container", err)
	}

internal/core/services/database.go:544

  • Same orphaned Vault secret issue as in CreateDatabase. The Vault secret is stored at line 516 before the snapshot restore and container launch. If RestoreSnapshot fails (line 521-523) or LaunchInstanceWithOptions fails (line 542-544), the function returns without cleaning up the already-written Vault secret. The failure path on line 543 calls cleanupVolumeQuietly but does not call secrets.DeleteSecret, so the orphaned Vault secret is never removed.
	// Store password in Vault
	vaultPath := s.getVaultPath(db.ID)
	if err := s.secrets.StoreSecret(ctx, vaultPath, map[string]interface{}{"password": password}); err != nil {
		return nil, errors.Wrap(errors.Internal, "failed to store restored database credentials in vault", err)
	}
	db.CredentialPath = vaultPath

	vol, err := s.snapshotSvc.RestoreSnapshot(ctx, req.SnapshotID, fmt.Sprintf("db-vol-%s", db.ID.String()[:8]))
	if err != nil {
		return nil, err
	}
	
	// Re-verify volume size matches DB record (snapshot restore might resize)
	if vol.SizeGB > db.AllocatedStorage {
		db.AllocatedStorage = vol.SizeGB
	}

	networkID, _ := s.resolveVpcNetwork(ctx, req.VpcID)
	imageName, env, defaultPort := s.getEngineConfig(dbEngine, req.Version, username, password, req.NewName, domain.RolePrimary, "")
	
	containerID, allocatedPorts, err := s.compute.LaunchInstanceWithOptions(ctx, ports.CreateInstanceOptions{
		Name:        fmt.Sprintf("cloud-db-%s-%s", req.NewName, db.ID.String()[:8]),
		ImageName:   imageName,
		Ports:       []string{"0:" + defaultPort},
		NetworkID:   networkID,
		VolumeBinds: []string{fmt.Sprintf("%s:%s", s.getBackendVolName(vol), s.getMountPath(dbEngine))},
		Env:         env,
	})
	if err != nil {
		s.cleanupVolumeQuietly(ctx, vol.ID.String())
		return nil, err

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/adapters/vault/adapter.go Outdated
Comment on lines +38 to +55
func (a *Adapter) StoreSecret(_ context.Context, path string, data map[string]interface{}) error {
_, err := a.client.Logical().Write(path, data)
if err != nil {
return fmt.Errorf("failed to store secret in vault: %w", err)
}
return nil
}

// GetSecret retrieves a secret from the specified path.
func (a *Adapter) GetSecret(_ context.Context, path string) (map[string]interface{}, error) {
secret, err := a.client.Logical().Read(path)
if err != nil {
return nil, fmt.Errorf("failed to read secret from vault: %w", err)
}
if secret == nil || secret.Data == nil {
return nil, fmt.Errorf("secret not found at path: %s", path)
}
return secret.Data, nil
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The StoreSecret method writes data directly to the Vault KV v2 path without wrapping it in the required {"data": {...}} envelope. When using the Logical().Write() API with a KV v2 path (e.g., secret/data/thecloud/rds/...), the payload must be {"data": {"password": "..."}}. Without this wrapping, the password won't be stored correctly.

Additionally, GetSecret returns secret.Data from a KV v2 read, which will be structured as {"data": {"password": "..."}, "metadata": {...}}. All callers in database.go access the password as secret["password"].(string) (e.g., lines 199, 307, 446, 600), but the correct access for KV v2 is secret["data"].(map[string]interface{})["password"].(string).

The adapter test (adapter_test.go) confirms the KV v2 nesting by accessing data["data"].(map[string]interface{})["password"], which means the callers in database.go are inconsistent with the actual Vault response structure.

To fix: either (1) use the Vault SDK's KV v2 helper (client.KVv2(mount).Put/Get) which handles wrapping/unwrapping automatically, or (2) manually wrap the data in StoreSecret and unwrap in GetSecret.

Copilot uses AI. Check for mistakes.
Comment thread internal/core/services/database.go Outdated
Comment on lines +599 to +601
secret, _ := s.secrets.GetSecret(ctx, db.CredentialPath)
if p, ok := secret["password"].(string); ok {
currentPassword = p
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GetSecret call on line 599 ignores both the error and the nil-dereference risk. If GetSecret returns an error, secret will be nil, and secret["password"] on line 600 would panic with a nil map dereference. The error should be checked before accessing secret. Consider using the same pattern as line 196-201 which properly handles the nil/error case.

Suggested change
secret, _ := s.secrets.GetSecret(ctx, db.CredentialPath)
if p, ok := secret["password"].(string); ok {
currentPassword = p
secret, err := s.secrets.GetSecret(ctx, db.CredentialPath)
if err != nil {
slog.WarnContext(ctx, "failed to retrieve current password from vault; falling back to stored password",
"error", err,
"credentialPath", db.CredentialPath,
)
} else if secret != nil {
if p, ok := secret["password"].(string); ok && p != "" {
currentPassword = p
}

Copilot uses AI. Check for mistakes.
Comment thread go.mod Outdated
github.com/hashicorp/go-secure-stdlib/strutil v0.1.2 // indirect
github.com/hashicorp/go-sockaddr v1.0.7 // indirect
github.com/hashicorp/hcl v1.0.1-vault-7 // indirect
github.com/hashicorp/vault/api v1.22.0 // indirect
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The github.com/hashicorp/vault/api v1.22.0 dependency is listed as // indirect in go.mod, but it is directly imported and used in internal/adapters/vault/adapter.go. Direct dependencies should not be marked as // indirect. Running go mod tidy would move it to the direct dependencies section.

Copilot uses AI. Check for mistakes.
Comment on lines +6 to +10
"net/http"
"net/http/httptest"
"testing"

"log/slog"
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "log/slog" import is separated from the other standard library imports ("context", "encoding/json", "net/http", etc.) and placed in the same block as the third-party imports. In idiomatic Go, all standard library imports should be in the same group. Running goimports would fix this automatically.

Suggested change
"net/http"
"net/http/httptest"
"testing"
"log/slog"
"log/slog"
"net/http"
"net/http/httptest"
"testing"

Copilot uses AI. Check for mistakes.
Comment thread docs/guides/rds.md
cloud db rotate-credentials <id>
```

### 5. High Availability & Replication
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The env var name MYSQL_MASTER_HOST was changed to PRIMARY_HOST in getEngineConfig (database.go line 839), but the documentation in docs/guides/rds.md still refers to MYSQL_MASTER_HOST. Note that the documentation update at line 81 (saying "MYSQL_MASTER_HOST for MySQL") is now out of date and should be updated to say PRIMARY_HOST for both engines.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings April 7, 2026 17:56
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 32 out of 33 changed files in this pull request and generated 10 comments.

Comments suppressed due to low confidence (2)

internal/repositories/postgres/database_repo.go:72

  • ListReplicas selects tenant_id but doesn’t constrain by it. With the repository moving to tenant scoping elsewhere, this method can return replicas across tenants if a primary_id is known. Filter by tenant_id from context (or make this an explicitly internal/admin method) to keep access boundaries consistent.
func (r *DatabaseRepository) ListReplicas(ctx context.Context, primaryID uuid.UUID) ([]*domain.Database, error) {
	query := `
		SELECT id, user_id, tenant_id, name, engine, version, status, role, primary_id, vpc_id, COALESCE(container_id, ''), port, username, password, created_at, updated_at, allocated_storage, parameters, metrics_enabled, COALESCE(metrics_port, 0), COALESCE(exporter_container_id, ''), pooling_enabled, COALESCE(pooling_port, 0), COALESCE(pooler_container_id, ''), COALESCE(credential_path, '')
		FROM databases
		WHERE primary_id = $1
		ORDER BY created_at DESC
	`

internal/core/services/database.go:443

  • PromoteToPrimary now enforces RBAC using UserIDFromContext(ctx). Internal/background workflows (e.g., the database failover worker) typically run without an end-user context, so this is likely to start failing with “failed to get user” / forbidden. Consider using a dedicated system principal in context for workers, or providing an internal promotion path that bypasses end-user RBAC while still being restricted to trusted callers.
func (s *DatabaseService) PromoteToPrimary(ctx context.Context, id uuid.UUID) error {
	userID := appcontext.UserIDFromContext(ctx)
	if err := s.rbacSvc.Authorize(ctx, userID, domain.PermissionDBUpdate, id.String()); err != nil {
		return err
	}
	db, err := s.repo.GetByID(ctx, id)
	if err != nil {
		return err
	}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 28 to 34
query := `
INSERT INTO databases (id, user_id, name, engine, version, status, role, primary_id, vpc_id, container_id, port, username, password, created_at, updated_at, allocated_storage, parameters, metrics_enabled, metrics_port, exporter_container_id, pooling_enabled, pooling_port, pooler_container_id)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23)
INSERT INTO databases (id, user_id, tenant_id, name, engine, version, status, role, primary_id, vpc_id, container_id, port, username, password, created_at, updated_at, allocated_storage, parameters, metrics_enabled, metrics_port, exporter_container_id, pooling_enabled, pooling_port, pooler_container_id, credential_path)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25)
`
_, err := r.db.Exec(ctx, query,
db.ID, db.UserID, db.Name, db.Engine, db.Version, db.Status, db.Role, db.PrimaryID, db.VpcID, db.ContainerID, db.Port, db.Username, db.Password, db.CreatedAt, db.UpdatedAt, db.AllocatedStorage, db.Parameters, db.MetricsEnabled, db.MetricsPort, db.ExporterContainerID, db.PoolingEnabled, db.PoolingPort, db.PoolerContainerID,
db.ID, db.UserID, db.TenantID, db.Name, db.Engine, db.Version, db.Status, db.Role, db.PrimaryID, db.VpcID, db.ContainerID, db.Port, db.Username, db.Password, db.CreatedAt, db.UpdatedAt, db.AllocatedStorage, db.Parameters, db.MetricsEnabled, db.MetricsPort, db.ExporterContainerID, db.PoolingEnabled, db.PoolingPort, db.PoolerContainerID, db.CredentialPath,
)
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tenant_id is now required in INSERT/queries, but the current migrations for databases don’t add a tenant_id column (the original create-table migration only has user_id). This will cause runtime SQL errors on create/read/update/delete. Add a migration to introduce tenant_id (with appropriate FK/index/backfill) or keep repository filtering on user_id until the schema is updated.

Copilot uses AI. Check for mistakes.
Comment on lines 51 to +59
func (r *DatabaseRepository) List(ctx context.Context) ([]*domain.Database, error) {
userID := appcontext.UserIDFromContext(ctx)
tenantID := appcontext.TenantIDFromContext(ctx)
query := `
SELECT id, user_id, name, engine, version, status, role, primary_id, vpc_id, COALESCE(container_id, ''), port, username, password, created_at, updated_at, allocated_storage, parameters, metrics_enabled, COALESCE(metrics_port, 0), COALESCE(exporter_container_id, ''), pooling_enabled, COALESCE(pooling_port, 0), COALESCE(pooler_container_id, '')
SELECT id, user_id, tenant_id, name, engine, version, status, role, primary_id, vpc_id, COALESCE(container_id, ''), port, username, password, created_at, updated_at, allocated_storage, parameters, metrics_enabled, COALESCE(metrics_port, 0), COALESCE(exporter_container_id, ''), pooling_enabled, COALESCE(pooling_port, 0), COALESCE(pooler_container_id, ''), COALESCE(credential_path, '')
FROM databases
WHERE user_id = $1
WHERE tenant_id = $1
ORDER BY created_at DESC
`
rows, err := r.db.Query(ctx, query, userID)
rows, err := r.db.Query(ctx, query, tenantID)
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List() is now tenant-scoped via TenantIDFromContext(ctx). Background components like DatabaseFailoverWorker call repo.List(ctx) with a plain context that does not include a tenant id, which will effectively return no databases and disable failover checks. Consider adding a non-tenant-scoped/internal listing method for workers, or ensure worker contexts carry the correct tenant (and adjust RBAC/service calls accordingly).

Copilot uses AI. Check for mistakes.
Comment on lines +2 to +5
ALTER TABLE databases ADD COLUMN credential_path VARCHAR(255);

-- +goose Down
ALTER TABLE databases DROP COLUMN credential_path;
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This migration adds credential_path but does not add the tenant_id column that the updated DatabaseRepository now reads/writes. Without a tenant_id migration (plus backfill/index), database CRUD will fail at runtime. Consider adding the tenant_id migration here (or as a separate migration in the same PR) before switching repository queries to tenant scoping.

Suggested change
ALTER TABLE databases ADD COLUMN credential_path VARCHAR(255);
-- +goose Down
ALTER TABLE databases DROP COLUMN credential_path;
ALTER TABLE databases
ADD COLUMN credential_path VARCHAR(255),
ADD COLUMN tenant_id VARCHAR(255);
CREATE INDEX idx_databases_tenant_id ON databases (tenant_id);
-- +goose Down
DROP INDEX idx_databases_tenant_id;
ALTER TABLE databases
DROP COLUMN tenant_id,
DROP COLUMN credential_path;

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,5 @@
-- +goose Up
ALTER TABLE databases ADD COLUMN credential_path VARCHAR(255);
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

credential_path is constrained to VARCHAR(255). Vault KV v2 paths can exceed 255 characters in practice (mount + nested paths + UUIDs + prefixes), and truncation would be disastrous. Prefer TEXT (as used for other path-like fields) or increase the limit and enforce a length check in code.

Suggested change
ALTER TABLE databases ADD COLUMN credential_path VARCHAR(255);
ALTER TABLE databases ADD COLUMN credential_path TEXT;

Copilot uses AI. Check for mistakes.
Comment on lines 79 to +103
// NewDatabaseService constructs a DatabaseService with its dependencies.
func NewDatabaseService(params DatabaseServiceParams) *DatabaseService {
return &DatabaseService{
repo: params.Repo,
compute: params.Compute,
vpcRepo: params.VpcRepo,
volumeSvc: params.VolumeSvc,
snapshotSvc: params.SnapshotSvc,
snapshotRepo: params.SnapshotRepo,
eventSvc: params.EventSvc,
auditSvc: params.AuditSvc,
logger: params.Logger,
repo: params.Repo,
rbacSvc: params.RBAC,
compute: params.Compute,
vpcRepo: params.VpcRepo,
volumeSvc: params.VolumeSvc,
snapshotSvc: params.SnapshotSvc,
snapshotRepo: params.SnapshotRepo,
eventSvc: params.EventSvc,
auditSvc: params.AuditSvc,
secrets: params.Secrets,
logger: params.Logger,
vaultMountPath: params.VaultMountPath,
rotationCache: make(map[string]bool),
}
}

func (s *DatabaseService) CreateDatabase(ctx context.Context, req ports.CreateDatabaseRequest) (*domain.Database, error) {
userID := appcontext.UserIDFromContext(ctx)
tenantID := appcontext.TenantIDFromContext(ctx)
if err := s.rbacSvc.Authorize(ctx, userID, domain.PermissionDBCreate, "*"); err != nil {
return nil, err
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DatabaseService now unconditionally calls s.rbacSvc.Authorize(...) in multiple methods, but NewDatabaseService does not guard against RBAC being nil. This can panic in benchmarks/integration setups that construct the service without RBAC. Consider requiring RBAC (fail fast during construction) or providing a no-op RBAC implementation for non-API contexts/tests.

Copilot uses AI. Check for mistakes.
Comment on lines +529 to 542
func (s *DatabaseService) RotateCredentials(ctx context.Context, id uuid.UUID, idempotencyKey string) error {
if idempotencyKey != "" {
s.rotationMu.Lock()
if s.rotationCache[idempotencyKey] {
s.rotationMu.Unlock()
return nil // Already rotated
}
s.rotationMu.Unlock()
}
dbEngine := domain.DatabaseEngine(req.Engine)
password, _ := util.GenerateRandomPassword(16)
username := s.getDefaultUsername(dbEngine)
db := s.initialDatabaseRecord(userID, req.NewName, dbEngine, req.Version, username, password, req.VpcID)

// Preserving user requested storage if larger than snapshot
db.AllocatedStorage = req.AllocatedStorage
if snap.SizeGB > db.AllocatedStorage {
db.AllocatedStorage = snap.SizeGB

db, err := s.repo.GetByID(ctx, id)
if err != nil {
return err
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RotateCredentials is missing an RBAC authorization check, unlike other service methods (Create/Modify/Get/Delete/etc.). Even though the HTTP route has RBAC middleware, this method can be called from non-HTTP contexts and would bypass authorization. Add Authorize(ctx, userID, PermissionDBUpdate, id.String()) (or similar) at the start of this method.

Copilot uses AI. Check for mistakes.
Comment on lines +55 to +58
// Simple in-memory idempotency cache for rotation
rotationCache map[string]bool
rotationMu sync.Mutex
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The in-memory idempotency cache is (1) unbounded (no TTL/eviction), (2) keyed only by Idempotency-Key (so the same key across different DBs will incorrectly short-circuit rotation), and (3) not concurrency-safe for duplicate requests (two concurrent requests with the same key can both pass the initial check and rotate twice). Consider using a TTL cache keyed by {dbID}:{idempotencyKey} and a singleflight/lock-per-key approach, or persist idempotency in the database.

Copilot uses AI. Check for mistakes.
Comment on lines +577 to +593
// 2. Update in Vault ONLY after DB success
vaultPath := db.CredentialPath
if vaultPath == "" {
vaultPath = s.getVaultPath(db.ID)
}
if err := s.secrets.StoreSecret(ctx, vaultPath, map[string]interface{}{"password": newPassword}); err != nil {
// Note: At this point DB password is changed but Vault update failed.
// The system is out of sync. Fallback password in DB record remains old.
return errors.Wrap(errors.Internal, "database password updated but failed to store in vault", err)
}

// 3. Update DB record if needed (metadata or path)
db.CredentialPath = vaultPath
if err := s.repo.Update(ctx, db); err != nil {
return err
}

Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a successful rotation you update Vault and the DB record’s CredentialPath, but you never update db.Password to newPassword. This breaks the documented “fallback compatibility”: if Vault is unavailable later, GetConnectionString will fall back to the stale password and clients won’t be able to connect. Either stop relying on db.Password as a fallback after rotation, or update it (and persist it) when rotation succeeds.

Copilot uses AI. Check for mistakes.
Comment on lines +69 to +75
// DeleteSecret removes a secret from the specified path.
func (a *Adapter) DeleteSecret(ctx context.Context, path string) error {
_, err := a.client.Logical().DeleteWithContext(ctx, path)
if err != nil {
return fmt.Errorf("failed to delete secret from vault: %w", err)
}
return nil
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Vault KV v2, deleting a secret’s data vs. deleting its metadata are different endpoints. Logical().DeleteWithContext(ctx, path) against a /data/... path typically only deletes the latest version and leaves metadata/old versions behind. If the intent is to fully remove credentials when a database is deleted, implement KV v2 deletion semantics explicitly (e.g., use the metadata endpoint or a dedicated KV v2 helper) and document which behavior you want.

Copilot uses AI. Check for mistakes.
Comment thread docs/api-reference.md
Comment on lines +716 to +724
**Response Example**:
```json
{
"message": "database credentials rotated successfully",
"data": {
"message": "database credentials rotated successfully"
}
}
```
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The response example shows a top-level message alongside data.message, but the API’s httputil.Success wrapper only emits data (and meta). The example should reflect the actual envelope (e.g., {"data": {"message": ...}, "meta": ...}) to avoid confusing SDK/CLI consumers.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings April 7, 2026 18:19
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 34 out of 35 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +530 to +536
if idempotencyKey != "" {
s.rotationMu.Lock()
if s.rotationCache[idempotencyKey] {
s.rotationMu.Unlock()
return nil // Already rotated
}
s.rotationMu.Unlock()
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idempotency cache is keyed only by idempotencyKey, so reusing the same key across different databases will incorrectly short-circuit rotations. Also, the map has no eviction/TTL, so it can grow unbounded in a long-running process. Consider keying by {dbID}:{idempotencyKey} and persisting idempotency state (or at least adding TTL/size bounds).

Copilot uses AI. Check for mistakes.
Comment on lines +588 to +589
// 3. Update DB record if needed (metadata or path)
db.CredentialPath = vaultPath
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a successful rotation, the service updates Vault and credential_path but never updates db.Password to newPassword before calling repo.Update. Since GetConnectionString falls back to db.Password when Vault is unavailable, a Vault outage after rotation would return a connection string with a stale password. Update the stored fallback password (or explicitly clear it and adjust fallback behavior) when rotation succeeds.

Suggested change
// 3. Update DB record if needed (metadata or path)
db.CredentialPath = vaultPath
// 3. Update DB record if needed (metadata, path, and fallback password)
db.CredentialPath = vaultPath
db.Password = newPassword

Copilot uses AI. Check for mistakes.
Comment thread docs/api-reference.md

### POST /databases/:id/rotate-credentials
Regenerate the database password and update it in both the database engine and Vault.
- **Security**: Ensures credentials are never stored in plain text in the primary metadata store.
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bullet claims credentials are “never stored in plain text in the primary metadata store,” but the service/repo still persist the password column for fallback/legacy compatibility (also described elsewhere in the docs). Please adjust this statement to avoid promising a property the implementation doesn’t provide.

Suggested change
- **Security**: Ensures credentials are never stored in plain text in the primary metadata store.
- **Security**: Rotates credentials and synchronizes the updated password with the database engine and Vault.

Copilot uses AI. Check for mistakes.
Comment thread docs/database.md
port INT,
username VARCHAR(255),
password TEXT,
credential_path TEXT,
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Schema docs list credential_path TEXT, but the actual migration adds it as VARCHAR(255) (see 099_add_database_credential_path.up.sql). Please align the documentation with the real column type (or change the migration to TEXT if long Vault paths should be supported).

Suggested change
credential_path TEXT,
credential_path VARCHAR(255),

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings April 7, 2026 18:54
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 37 out of 38 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 80 to 85
func (r *DatabaseRepository) scanDatabase(row pgx.Row) (*domain.Database, error) {
var db domain.Database
var engine, status, role string
err := row.Scan(
&db.ID, &db.UserID, &db.Name, &engine, &db.Version, &status, &role, &db.PrimaryID, &db.VpcID, &db.ContainerID, &db.Port, &db.Username, &db.Password, &db.CreatedAt, &db.UpdatedAt, &db.AllocatedStorage, &db.Parameters, &db.MetricsEnabled, &db.MetricsPort, &db.ExporterContainerID, &db.PoolingEnabled, &db.PoolingPort, &db.PoolerContainerID,
&db.ID, &db.UserID, &db.TenantID, &db.Name, &engine, &db.Version, &status, &role, &db.PrimaryID, &db.VpcID, &db.ContainerID, &db.Port, &db.Username, &db.Password, &db.CreatedAt, &db.UpdatedAt, &db.AllocatedStorage, &db.Parameters, &db.MetricsEnabled, &db.MetricsPort, &db.ExporterContainerID, &db.PoolingEnabled, &db.PoolingPort, &db.PoolerContainerID, &db.CredentialPath,
)
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tenant_id is selected as a nullable column but scanned into db.TenantID (a non-nullable uuid.UUID). If any row has tenant_id = NULL (allowed by the migration), Scan will error and break listing/scanning. Consider making tenant_id NOT NULL at the DB level, or selecting COALESCE(tenant_id, '00000000-0000-0000-0000-000000000000'), or changing the domain field to a nullable type (e.g., *uuid.UUID/uuid.NullUUID).

Copilot uses AI. Check for mistakes.
Comment on lines +529 to 537
func (s *DatabaseService) RotateCredentials(ctx context.Context, id uuid.UUID, idempotencyKey string) error {
if idempotencyKey != "" {
s.rotationMu.Lock()
if s.rotationCache[idempotencyKey] {
s.rotationMu.Unlock()
return nil // Already rotated
}
s.rotationMu.Unlock()
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The in-memory idempotency cache for rotations is keyed only by Idempotency-Key, so the same key reused for a different database ID would incorrectly short-circuit rotation. Also, the map has no TTL/eviction and can grow unbounded over time. Consider scoping the cache key by database ID and adding expiry/size bounds (or persisting idempotency in the DB).

Copilot uses AI. Check for mistakes.
Comment on lines +577 to 614
// 2. Update in Vault ONLY after DB success
vaultPath := db.CredentialPath
if vaultPath == "" {
vaultPath = s.getVaultPath(db.ID)
}
if err := s.secrets.StoreSecret(ctx, vaultPath, map[string]interface{}{"password": newPassword}); err != nil {
// Note: At this point DB password is changed but Vault update failed.
// The system is out of sync. Fallback password in DB record remains old.
return errors.Wrap(errors.Internal, "database password updated but failed to store in vault", err)
}

// 3. Update DB record if needed (metadata or path)
db.CredentialPath = vaultPath
if err := s.repo.Update(ctx, db); err != nil {
return err
}

// 4. If pooler is enabled, restart it to pick up new credentials
if db.PoolerContainerID != "" {
if err := s.compute.DeleteInstance(ctx, db.PoolerContainerID); err != nil {
s.logger.Warn("failed to delete old pooler during rotation", "pooler_id", db.PoolerContainerID, "error", err)
}
db.PoolerContainerID = ""
dbIP, err := s.compute.GetInstanceIP(ctx, db.ContainerID)
if err != nil {
return s.performProvisioningRollback(ctx, db, vol.ID.String(), errors.Wrap(errors.Internal, "failed to get restored database IP", err))
return errors.Wrap(errors.Internal, "failed to get database IP for pooler restart", err)
}
if err := s.provisionSidecars(ctx, db, dbEngine, dbIP, username, password, networkID); err != nil {
return s.performProvisioningRollback(ctx, db, vol.ID.String(), err)
networkID, err := s.resolveVpcNetwork(ctx, db.VpcID)
if err != nil {
return errors.Wrap(errors.Internal, "failed to resolve network for pooler restart", err)
}
if err := s.provisionPoolerSidecar(ctx, db, db.Engine, dbIP, db.Username, newPassword, networkID); err != nil {
return errors.Wrap(errors.Internal, "failed to provision new pooler sidecar during rotation", err)
}
if err := s.repo.Update(ctx, db); err != nil {
return err
}
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a successful rotation, the DB metadata record keeps db.Password unchanged (only CredentialPath is updated). This means your fallback path (GetConnectionString uses db.Password when Vault read fails) will return an invalid password post-rotation, and other components (e.g., metrics exporter) may continue using stale credentials. Consider updating the stored password alongside Vault (if fallback is required), and restarting/reprovisioning any credential-dependent sidecars (exporter as well as pooler) to avoid breaking metrics/ops after rotation.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +19
-- +goose Up
-- Add tenant_id to remaining resource tables

-- 1. databases
ALTER TABLE databases ADD COLUMN IF NOT EXISTS tenant_id UUID REFERENCES tenants(id) ON DELETE SET NULL;
CREATE INDEX IF NOT EXISTS idx_databases_tenant ON databases(tenant_id);

-- 2. usage_records
ALTER TABLE usage_records ADD COLUMN IF NOT EXISTS tenant_id UUID REFERENCES tenants(id) ON DELETE SET NULL;
CREATE INDEX IF NOT EXISTS idx_usage_records_tenant ON usage_records(tenant_id);

-- 3. api_keys
ALTER TABLE api_keys ADD COLUMN IF NOT EXISTS tenant_id UUID REFERENCES tenants(id) ON DELETE SET NULL;
CREATE INDEX IF NOT EXISTS idx_api_keys_tenant ON api_keys(tenant_id);

-- Backfill existing data using default_tenant_id from users table
UPDATE databases d SET tenant_id = u.default_tenant_id FROM users u WHERE d.user_id = u.id AND d.tenant_id IS NULL;
UPDATE usage_records ur SET tenant_id = u.default_tenant_id FROM users u WHERE ur.user_id = u.id AND ur.tenant_id IS NULL;
UPDATE api_keys ak SET tenant_id = u.default_tenant_id FROM users u WHERE ak.user_id = u.id AND ak.tenant_id IS NULL;
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is described as Vault integration + credential rotation, but this migration also introduces tenant_id to databases, usage_records, and api_keys (with backfills and new indexes). If this is intentional scope expansion, it should be called out in the PR description/release notes; otherwise consider moving it to a separate PR to reduce risk and review surface area.

Copilot uses AI. Check for mistakes.
Comment on lines 39 to 85
@@ -76,6 +79,9 @@ func NewConfig() (*Config, error) {
FirecrackerKernel: getEnv("FIRECRACKER_KERNEL", "/var/lib/thecloud/vmlinux"),
FirecrackerRootfs: getEnv("FIRECRACKER_ROOTFS", "/var/lib/thecloud/rootfs.ext4"),
FirecrackerMockMode: getEnv("FIRECRACKER_MOCK_MODE", "false") == "true",
VaultAddress: getEnv("VAULT_ADDR", "http://localhost:8200"),
VaultToken: getEnv("VAULT_TOKEN", ""),
VaultMountPath: getEnv("VAULT_MOUNT_PATH", "secret/data/thecloud/rds"),
}, nil
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VAULT_MOUNT_PATH is named like a mount (e.g., secret) but the default includes the KV v2 /data/ segment (secret/data/...), and DatabaseService.getVaultPath concatenates it directly. This is easy to misconfigure (setting the mount to secret would generate non-KV-v2 paths and break reads/writes). Consider renaming this setting to something like VAULT_KV_V2_PATH_PREFIX or having the adapter build the correct KV v2 API path from a true mount + logical secret name.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings April 7, 2026 20:05
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 42 out of 43 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +154 to +162
db := s.initialDatabaseRecord(userID, name, primary.Engine, primary.Version, primary.Username, password, primary.VpcID)
db.TenantID = tenantID
db.Role = domain.RoleReplica
db.PrimaryID = &primaryID
db.AllocatedStorage = primary.AllocatedStorage
db.MetricsEnabled = primary.MetricsEnabled
db.PoolingEnabled = primary.PoolingEnabled

func (s *DatabaseService) validateCreationRequest(req ports.CreateDatabaseRequest, engine domain.DatabaseEngine) error {
if !s.isValidEngine(engine) {
return errors.New(errors.InvalidInput, "unsupported database engine")
}
if req.AllocatedStorage < 10 {
return errors.New(errors.InvalidInput, "allocated storage must be at least 10GB")
}
if req.PoolingEnabled && engine != domain.EnginePostgres {
return errors.New(errors.InvalidInput, "connection pooling is currently only supported for PostgreSQL")
}
return nil
return s.provisionDatabase(ctx, db, password, primary.Parameters, primaryIP, "DATABASE_REPLICA_CREATE")
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateReplica provisions/launches using primary.Parameters, but never assigns them to db.Parameters before persisting. This can result in parameters being inserted as NULL (bypassing the DB default) and later failing to scan into map[string]string on reads, and also means replicas won't persist the inherited config. Set db.Parameters = primary.Parameters (or an empty map) before repo.Create.

Copilot uses AI. Check for mistakes.
Comment on lines +181 to +190
db := s.initialDatabaseRecord(userID, req.NewName, dbEngine, req.Version, username, password, req.VpcID)
db.TenantID = tenantID
db.Role = domain.RoleReplica
db.PrimaryID = &primaryID
db.AllocatedStorage = primary.AllocatedStorage
db.MetricsEnabled = primary.MetricsEnabled
db.PoolingEnabled = primary.PoolingEnabled

imageName, env, defaultPort := s.getEngineConfig(primary.Engine, primary.Version, primary.Username, primary.Password, name, db.Role, primaryIP)
db.AllocatedStorage = req.AllocatedStorage
if snap.SizeGB > db.AllocatedStorage {
db.AllocatedStorage = snap.SizeGB
}
db.MetricsEnabled = req.MetricsEnabled
db.PoolingEnabled = req.PoolingEnabled

Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RestoreDatabase creates a DB record but does not set db.Role (primary) or db.Parameters before provisioning/persisting. This can store an empty role and NULL parameters, breaking subsequent reads and making restored instances inconsistent with regular creates. Explicitly set db.Role = domain.RolePrimary and assign db.Parameters = req.Parameters (or {}) before calling finalizeProvisioning.

Copilot uses AI. Check for mistakes.
Comment on lines +537 to 550
func (s *DatabaseService) RotateCredentials(ctx context.Context, id uuid.UUID, idempotencyKey string) error {
if idempotencyKey != "" {
s.rotationMu.Lock()
if s.rotationCache[idempotencyKey] {
s.rotationMu.Unlock()
return nil // Already rotated
}
s.rotationMu.Unlock()
}

snap, err := s.snapshotSvc.GetSnapshot(ctx, req.SnapshotID)
db, err := s.repo.GetByID(ctx, id)
if err != nil {
return nil, err
return err
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RotateCredentials bypasses RBAC authorization checks that other service methods enforce (Create/Modify/Delete/etc.). Even though the HTTP route has middleware, this service method can be called from other code paths (workers/tests) without protection. Add an Authorize call for domain.PermissionDBUpdate on the database ID, consistent with other DatabaseService methods.

Copilot uses AI. Check for mistakes.
Comment on lines +538 to 545
if idempotencyKey != "" {
s.rotationMu.Lock()
if s.rotationCache[idempotencyKey] {
s.rotationMu.Unlock()
return nil // Already rotated
}
s.rotationMu.Unlock()
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idempotency handling is not safe under concurrent requests: the code checks rotationCache[idempotencyKey] and unlocks before any “in-progress” marker is set, so two requests with the same key can both rotate credentials concurrently. Consider using singleflight.Group, or store a state (in-progress/done) under the lock and keep it until completion.

Copilot uses AI. Check for mistakes.
Comment on lines +624 to +627
if idempotencyKey != "" {
s.rotationMu.Lock()
s.rotationCache[idempotencyKey] = true
s.rotationMu.Unlock()
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rotationCache grows without bounds and has no TTL/eviction, which can lead to unbounded memory growth if clients send unique Idempotency-Key values. Use a bounded cache (LRU) and/or TTL-based eviction, or persist idempotency state in the datastore with an expiration.

Copilot uses AI. Check for mistakes.
Comment on lines +596 to +597
// 3. Update DB record if needed (metadata or path)
db.CredentialPath = vaultPath
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a successful rotation, the database metadata password field is never updated (only Vault is updated and CredentialPath is saved). Since GetConnectionString falls back to db.Password when Vault is unavailable, this makes the fallback incorrect after rotation. Either update the stored fallback password on success, or remove/disable fallback usage once Vault-backed credentials are enabled.

Suggested change
// 3. Update DB record if needed (metadata or path)
db.CredentialPath = vaultPath
// 3. Update DB record so metadata stays consistent with the rotated credential.
db.CredentialPath = vaultPath
db.Password = newPassword

Copilot uses AI. Check for mistakes.
Comment thread docs/database.md
port INT,
username VARCHAR(255),
password TEXT,
credential_path TEXT,
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documented table schema uses credential_path TEXT, but the migration adds credential_path VARCHAR(255). Please align the documentation with the actual schema (or adjust the migration) to avoid confusion for operators and contributors.

Suggested change
credential_path TEXT,
credential_path VARCHAR(255),

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings April 7, 2026 20:30
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 42 out of 43 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +541 to +547
func (s *DatabaseService) RotateCredentials(ctx context.Context, id uuid.UUID, idempotencyKey string) error {
if idempotencyKey != "" {
s.rotationMu.Lock()
if s.rotationCache[idempotencyKey] {
s.rotationMu.Unlock()
return nil // Already rotated
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RotateCredentials bypasses RBAC authorization checks in the service layer (unlike the other DatabaseService methods). Even though the HTTP route is wrapped with permission middleware, this method can be called from internal code/worker paths and should enforce tenant/user authorization consistently.

Copilot uses AI. Check for mistakes.
Comment on lines +54 to +58
vaultMountPath string
// Simple in-memory idempotency cache for rotation
rotationCache map[string]bool
rotationMu sync.Mutex
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The in-memory idempotency cache (rotationCache) grows unbounded and has no TTL/eviction. In a long-running API process, unique Idempotency-Key values can accumulate indefinitely and increase memory usage.

Copilot uses AI. Check for mistakes.
var cmd []string
switch db.Engine {
case domain.EnginePostgres:
cmd = []string{"psql", "-h", "127.0.0.1", "-U", db.Username, "-d", "postgres", "-c", fmt.Sprintf("ALTER USER %s WITH PASSWORD '%s';", db.Username, newPassword)}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Postgres rotation command uses psql -h 127.0.0.1 -U <user> without providing a password (e.g., via PGPASSWORD) and forces TCP auth. With the official Postgres image defaults, this typically fails with “no password supplied”. Consider connecting via the local socket (omit -h) or passing credentials explicitly for non-interactive exec.

Suggested change
cmd = []string{"psql", "-h", "127.0.0.1", "-U", db.Username, "-d", "postgres", "-c", fmt.Sprintf("ALTER USER %s WITH PASSWORD '%s';", db.Username, newPassword)}
cmd = []string{"psql", "-U", db.Username, "-d", "postgres", "-c", fmt.Sprintf("ALTER USER %s WITH PASSWORD '%s';", db.Username, newPassword)}

Copilot uses AI. Check for mistakes.
if err := s.resolveDatabasePort(ctx, db, allocatedPorts, defaultPort); err != nil {
return s.performProvisioningRollback(ctx, db, vol.ID.String(), errors.Wrap(errors.Internal, "failed to resolve restored database port", err))
// 3. Update DB record if needed (metadata or path)
db.CredentialPath = vaultPath
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a successful rotation, the metadata fallback password (db.Password) is not updated before calling repo.Update. If Vault later becomes unavailable, GetConnectionString() will fall back to the stale password and return an invalid connection string. If the password field is intentionally kept for fallback/legacy compatibility, it should be updated to the new password on success.

Suggested change
db.CredentialPath = vaultPath
db.CredentialPath = vaultPath
db.Password = newPassword

Copilot uses AI. Check for mistakes.
env := []string{"MYSQL_ROOT_PASSWORD=" + password, "MYSQL_DATABASE=" + name}
if role == domain.RoleReplica {
env = append(env, "MYSQL_MASTER_HOST="+primaryIP)
env = append(env, "PRIMARY_HOST="+primaryIP)
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MySQL replica containers are configured with PRIMARY_HOST but the docs mention MYSQL_MASTER_HOST for MySQL replication. If replication setup relies on the documented variable, this will break replica provisioning; consider setting the expected MySQL env var (or both) and keeping docs/code consistent.

Suggested change
env = append(env, "PRIMARY_HOST="+primaryIP)
env = append(env,
"PRIMARY_HOST="+primaryIP,
"MYSQL_MASTER_HOST="+primaryIP,
)

Copilot uses AI. Check for mistakes.
Comment on lines +29 to +37
func (m *NoOpSecretsManager) GetSecret(ctx context.Context, path string) (map[string]interface{}, error) {
m.mu.RLock()
defer m.mu.RUnlock()
data, ok := m.secrets[path]
if !ok {
return nil, fmt.Errorf("secret not found at path: %s", path)
}
return data, nil
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NoOpSecretsManager.GetSecret returns the internal stored map by reference. Callers can mutate it outside the lock, which can cause data races and corrupt the in-memory store. Consider returning a shallow copy (and optionally copying nested values) to keep the store’s internal state immutable from the caller’s perspective.

Copilot uses AI. Check for mistakes.
@poyrazK poyrazK merged commit b03ffc5 into main Apr 8, 2026
47 of 51 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants