feat(rds): implement persistent volume integration for managed databases#76
Conversation
📝 Walkthrough

Injects a VolumeService into the DatabaseService, provisions and binds persistent volumes for databases and replicas, cleans up volumes on failures and deletions, and extends unit and end-to-end tests to cover the volume lifecycle.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    Client->>API: POST /databases
    API->>DatabaseService: CreateDatabase(req)
    DatabaseService->>VolumeSvc: CreateVolume(db-id, 10GB, params)
    VolumeSvc->>VolumeRepo: PersistVolume(...)
    VolumeRepo-->>VolumeSvc: volume
    VolumeSvc-->>DatabaseService: volume
    DatabaseService->>DatabaseRepo: PersistDatabase(with volume)
    DatabaseRepo-->>DatabaseService: database
    DatabaseService-->>API: 201 Created
    Client->>API: DELETE /databases/{id}
    API->>DatabaseService: DeleteDatabase(id)
    DatabaseService->>VolumeSvc: ListVolumes(filter db-id)
    VolumeSvc->>VolumeRepo: QueryVolumes(...)
    VolumeRepo-->>VolumeSvc: volumes
    VolumeSvc-->>DatabaseService: volumes
    DatabaseService->>VolumeSvc: DeleteVolume(volume-id) [*for each*]
    VolumeSvc->>VolumeRepo: RemoveVolume(...)
    VolumeRepo-->>VolumeSvc: removed
    DatabaseService->>DatabaseRepo: DeleteDatabase(id)
    DatabaseRepo-->>DatabaseService: deleted
    DatabaseService-->>API: 200 OK
```

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 4
🧹 Nitpick comments (2)
internal/core/services/database.go (1)
379-395: Volume lookup by prefix scan is acceptable, but consider tracking the volume ID.

The current approach iterates over all volumes to find the matching one by name prefix. While this works, it could become inefficient at scale. Consider storing the volume ID in the database record for direct lookup in a future iteration.

Additionally, the `break` on line 392 means only the first matching volume is deleted. If there's any chance of multiple volumes matching (unlikely with current naming), this could leave orphaned volumes.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/core/services/database.go` around lines 379 - 395, The current cleanup scans all volumes via s.volumeSvc.ListVolumes and deletes only the first match (strings.HasPrefix check) which is inefficient and may leave orphaned volumes; update the code to persist the volume ID on database creation (add a volume_id field to the DB record) so future deletes can call s.volumeSvc.DeleteVolume(ctx, volumeID) directly using that stored ID, and in the short term remove the premature break in the loop in the cleanup block (the loop that calls s.volumeSvc.DeleteVolume for matching v.ID) so all matching volumes are deleted if a stored volume ID isn't yet available; refer to s.volumeSvc.ListVolumes, s.volumeSvc.DeleteVolume, db.ID and db.Role to locate the relevant code paths to change.

tests/database_persistence_e2e_test.go (1)
79-80: Fixed sleep may cause flaky tests in CI.

A 1-second sleep might be insufficient under load or excessive in fast environments. Consider polling with a timeout instead for more reliable async cleanup verification.

Example polling approach

```go
// Poll for volume cleanup with timeout
deadline := time.Now().Add(5 * time.Second)
for time.Now().Before(deadline) {
	respVols := getRequest(t, client, testutil.TestBaseURL+"/volumes", token)
	// ... check if volume is gone ...
	if !found {
		break
	}
	time.Sleep(200 * time.Millisecond)
}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/database_persistence_e2e_test.go` around lines 79 - 80, Replace the fixed time.Sleep(1 * time.Second) in tests/database_persistence_e2e_test.go with a polling loop that repeatedly calls getRequest(t, client, testutil.TestBaseURL+"/volumes", token) until the desired cleanup condition (volume not found) is met or a timeout elapses; implement a short interval between polls (e.g., ~200ms) and a reasonable overall deadline (e.g., ~5s) so the test waits only as long as needed and fails if cleanup doesn’t happen within the timeout.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/core/services/database.go`:
- Around line 167-172: Replace the magic literal 10 when creating the replica
volume with the shared volume-size constant used elsewhere (the same one used
for the primary DB volume) instead of a hardcoded number; update the call to
s.volumeSvc.CreateVolume(ctx, volName, <sharedVolumeSizeConstant>) so volName
and the CreateVolume invocation remain, but use the existing constant (e.g.,
DefaultDBVolumeSizeGB or the project’s primary volume-size constant) for the
replica size.
- Around line 80-85: Replace the magic number 10 used when creating the DB
persistent volume with a package-level named constant; add a constant (e.g.
DefaultDatabaseVolumeGB = 10) at the top of the package and update the call to
s.volumeSvc.CreateVolume(ctx, volName, 10) to use DefaultDatabaseVolumeGB
instead, referencing the existing symbols volName, s.volumeSvc.CreateVolume and
db.ID to locate the change.
In `@tests/database_persistence_e2e_test.go`:
- Line 100: The assertion uses fmt.Sprintf unnecessarily; update the call to
assert.False to pass the format string and argument directly instead of wrapping
them in fmt.Sprintf. Locate the assert.False invocation (the one checking found
with the message "Volume with prefix %s should have been deleted") and change it
to assert.False(t, found, "Volume with prefix %s should have been deleted",
expectedPrefix) so the test framework handles formatting.
- Line 70: Replace the call that builds the assertion message with fmt.Sprintf
by passing the format string and argument directly to assert.True; specifically,
update the assert.True(t, found, fmt.Sprintf("Expected volume with prefix %s not
found", expectedPrefix)) call to assert.True(t, found, "Expected volume with
prefix %s not found", expectedPrefix) so the message is passed as the
assertion's format string (affecting the assert.True usage with found and
expectedPrefix).
---
Nitpick comments:
In `@internal/core/services/database.go`:
- Around line 379-395: The current cleanup scans all volumes via
s.volumeSvc.ListVolumes and deletes only the first match (strings.HasPrefix
check) which is inefficient and may leave orphaned volumes; update the code to
persist the volume ID on database creation (add a volume_id field to the DB
record) so future deletes can call s.volumeSvc.DeleteVolume(ctx, volumeID)
directly using that stored ID, and in the short term remove the premature break
in the loop in the cleanup block (the loop that calls s.volumeSvc.DeleteVolume
for matching v.ID) so all matching volumes are deleted if a stored volume ID
isn't yet available; refer to s.volumeSvc.ListVolumes, s.volumeSvc.DeleteVolume,
db.ID and db.Role to locate the relevant code paths to change.
In `@tests/database_persistence_e2e_test.go`:
- Around line 79-80: Replace the fixed time.Sleep(1 * time.Second) in
tests/database_persistence_e2e_test.go with a polling loop that repeatedly calls
getRequest(t, client, testutil.TestBaseURL+"/volumes", token) until the desired
cleanup condition (volume not found) is met or a timeout elapses; implement a
short interval between polls (e.g., ~200ms) and a reasonable overall deadline
(e.g., ~5s) so the test waits only as long as needed and fails if cleanup
doesn’t happen within the timeout.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- internal/api/setup/dependencies.go
- internal/core/services/database.go
- internal/core/services/database_test.go
- internal/core/services/database_unit_test.go
- tests/database_persistence_e2e_test.go
```go
// Create persistent volume for the database
volName := fmt.Sprintf("db-vol-%s", db.ID.String()[:8])
vol, err := s.volumeSvc.CreateVolume(ctx, volName, 10) // 10GB default
if err != nil {
	return nil, errors.Wrap(errors.Internal, "failed to create persistent volume", err)
}
```
Use a named constant for the default volume size.
The literal 10 for the volume size in GB should be a named constant for clarity and maintainability.
As per coding guidelines: "Do not use magic numbers - use named constants instead."
Proposed fix

Add a constant at package level:

```go
const (
	// DefaultDatabaseVolumeGB is the default size in GB for database persistent volumes.
	DefaultDatabaseVolumeGB = 10
)
```

Then update the calls:

```diff
- vol, err := s.volumeSvc.CreateVolume(ctx, volName, 10) // 10GB default
+ vol, err := s.volumeSvc.CreateVolume(ctx, volName, DefaultDatabaseVolumeGB)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/core/services/database.go` around lines 80 - 85, Replace the magic
number 10 used when creating the DB persistent volume with a package-level named
constant; add a constant (e.g. DefaultDatabaseVolumeGB = 10) at the top of the
package and update the call to s.volumeSvc.CreateVolume(ctx, volName, 10) to use
DefaultDatabaseVolumeGB instead, referencing the existing symbols volName,
s.volumeSvc.CreateVolume and db.ID to locate the change.
```go
// Create persistent volume for the replica
volName := fmt.Sprintf("db-replica-vol-%s", db.ID.String()[:8])
vol, err := s.volumeSvc.CreateVolume(ctx, volName, 10) // 10GB default
if err != nil {
	return nil, errors.Wrap(errors.Internal, "failed to create persistent volume for replica", err)
}
```
Same magic number issue for replica volume size.
Apply the same constant here for consistency.
Proposed fix

```diff
- vol, err := s.volumeSvc.CreateVolume(ctx, volName, 10) // 10GB default
+ vol, err := s.volumeSvc.CreateVolume(ctx, volName, DefaultDatabaseVolumeGB)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
// Create persistent volume for the replica
volName := fmt.Sprintf("db-replica-vol-%s", db.ID.String()[:8])
vol, err := s.volumeSvc.CreateVolume(ctx, volName, DefaultDatabaseVolumeGB)
if err != nil {
	return nil, errors.Wrap(errors.Internal, "failed to create persistent volume for replica", err)
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/core/services/database.go` around lines 167 - 172, Replace the magic
literal 10 when creating the replica volume with the shared volume-size constant
used elsewhere (the same one used for the primary DB volume) instead of a
hardcoded number; update the call to s.volumeSvc.CreateVolume(ctx, volName,
<sharedVolumeSizeConstant>) so volName and the CreateVolume invocation remain,
but use the existing constant (e.g., DefaultDBVolumeSizeGB or the project’s
primary volume-size constant) for the replica size.
Pull request overview
Adds persistent storage for managed databases by provisioning a VolumeService-backed volume per database/replica and mounting it into the DB runtime, plus updates wiring and tests to validate provisioning and cleanup.
Changes:
- Provision a 10GB volume during DB/replica creation and pass it via `VolumeBinds` to the compute backend.
- Inject `VolumeService` into `DatabaseService` initialization.
- Add/adjust E2E + unit/integration tests around the volume lifecycle.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/core/services/database.go | Creates per-DB volumes, mounts them on container start, and attempts volume cleanup on delete. |
| internal/api/setup/dependencies.go | Wires VolumeService into DatabaseService constructor params. |
| internal/core/services/database_unit_test.go | Updates unit tests to mock volume provisioning/deletion behavior. |
| internal/core/services/database_test.go | Updates integration test setup to provide a real VolumeService dependency. |
| tests/database_persistence_e2e_test.go | Adds E2E coverage asserting volume creation and cleanup around DB lifecycle. |
```diff
 		s.logger.Warn("failed to remove database container", "container_id", db.ContainerID, "error", err)
 	}
 }

-	// 2. Delete from repo
+	// 2. Clean up volume
```
If compute.DeleteInstance fails, the method still proceeds to volume deletion. That can delete the underlying storage while the DB container is potentially still running, risking data loss/corruption. Consider returning an error (or at least skipping volume deletion) when container removal fails, and/or ensure the instance is stopped/removed before decommissioning the volume.
```go
}

backendVolName := "thecloud-vol-" + vol.ID.String()[:8]
if vol.BackendPath != "" {
```
vol.BackendPath is being used as the Docker bind source. With the current LVM storage backend, BackendPath is a block device path like /dev/<vg>/<lv> (not a directory), and Docker bind-mounting that onto /var/lib/postgresql/data will fail. Consider using the backend name (e.g. thecloud-vol-<id> / a named volume) when compute.Type()=="docker", or ensure the storage backend returns a mounted directory path suitable for Docker binds.
```diff
- if vol.BackendPath != "" {
+ if s.compute.Type() != "docker" && vol.BackendPath != "" {
```
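One possible shape for this branch, sketched as a standalone helper — the `bindSource` name and signature are assumptions, not project API; only the `thecloud-vol-` naming comes from the diff:

```go
package main

import (
	"fmt"
	"strings"
)

// bindSource picks a Docker-compatible mount source: block-device backend
// paths (e.g. LVM /dev/<vg>/<lv>) cannot be bind-mounted as a data
// directory, so fall back to a named volume for the docker backend.
func bindSource(backendPath, namedVolume, computeType string) string {
	if computeType == "docker" && strings.HasPrefix(backendPath, "/dev/") {
		return namedVolume // let Docker manage a named volume instead
	}
	if backendPath != "" {
		return backendPath
	}
	return namedVolume
}

func main() {
	fmt.Println(bindSource("/dev/vg0/lv7", "thecloud-vol-ab12cd34", "docker"))   // thecloud-vol-ab12cd34
	fmt.Println(bindSource("/mnt/vols/ab12cd34", "thecloud-vol-ab12cd34", "docker")) // /mnt/vols/ab12cd34
}
```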
```go
}

backendVolName := "thecloud-vol-" + vol.ID.String()[:8]
if vol.BackendPath != "" {
```
Same issue as primary creation: using vol.BackendPath as the bind source will break Docker when the storage backend returns a block device path (e.g. LVM /dev/<vg>/<lv>). Prefer a Docker-compatible source (named volume or mounted directory) or branch on compute.Type()/storage backend type.
```diff
- if vol.BackendPath != "" {
+ if vol.BackendPath != "" && !strings.HasPrefix(vol.BackendPath, "/dev/") {
```
```go
// We try to find the volume by the expected name
vols, err := s.volumeSvc.ListVolumes(ctx)
if err == nil {
	expectedPrefix := fmt.Sprintf("db-vol-%s", db.ID.String()[:8])
	if db.Role == domain.RoleReplica {
		expectedPrefix = fmt.Sprintf("db-replica-vol-%s", db.ID.String()[:8])
	}
	for _, v := range vols {
		if strings.HasPrefix(v.Name, expectedPrefix) {
			if err := s.volumeSvc.DeleteVolume(ctx, v.ID.String()); err != nil {
				s.logger.Warn("failed to delete database volume", "volume_id", v.ID, "error", err)
			}
			break
		}
	}
}
```
Volume cleanup currently does a full ListVolumes() and deletes the first entry with a matching prefix. This is O(n) per delete and can delete the wrong volume if any other volume happens to share the prefix. Since the expected volume name is deterministic, delete by exact name via volumeSvc.DeleteVolume(ctx, expectedName) (or GetVolume by name) and log/handle the ListVolumes error path explicitly.
```go
// Delete the expected volume directly by deterministic name instead of scanning all volumes
expectedName := fmt.Sprintf("db-vol-%s", db.ID.String())
if db.Role == domain.RoleReplica {
	expectedName = fmt.Sprintf("db-replica-vol-%s", db.ID.String())
}
if err := s.volumeSvc.DeleteVolume(ctx, expectedName); err != nil {
	s.logger.Warn("failed to delete database volume", "volume_name", expectedName, "error", err)
}
```
```go
// Wait a bit for async cleanup if any
time.Sleep(1 * time.Second)

// Verify Volume is gone
respVols := getRequest(t, client, testutil.TestBaseURL+"/volumes", token)
defer func() { _ = respVols.Body.Close() }()
require.Equal(t, http.StatusOK, respVols.StatusCode)

var volsRes struct {
	Data []domain.Volume `json:"data"`
}
require.NoError(t, json.NewDecoder(respVols.Body).Decode(&volsRes))

found := false
expectedPrefix := fmt.Sprintf("db-vol-%s", dbID[:8])
for _, v := range volsRes.Data {
	if strings.HasPrefix(v.Name, expectedPrefix) {
		found = true
		break
	}
}
assert.False(t, found, fmt.Sprintf("Volume with prefix %s should have been deleted", expectedPrefix))
```
The fixed time.Sleep(1 * time.Second) makes this E2E test both slower and potentially flaky if volume cleanup takes longer than expected. Prefer polling /volumes until the expected volume disappears (with a reasonable timeout), similar to other E2E tests that retry for eventual consistency.
```go
// Verify Volume is eventually gone (async cleanup)
expectedPrefix := fmt.Sprintf("db-vol-%s", dbID[:8])
deadline := time.Now().Add(30 * time.Second)
for {
	respVols := getRequest(t, client, testutil.TestBaseURL+"/volumes", token)
	gone := func() bool {
		defer func() { _ = respVols.Body.Close() }()
		require.Equal(t, http.StatusOK, respVols.StatusCode)
		var volsRes struct {
			Data []domain.Volume `json:"data"`
		}
		require.NoError(t, json.NewDecoder(respVols.Body).Decode(&volsRes))
		for _, v := range volsRes.Data {
			if strings.HasPrefix(v.Name, expectedPrefix) {
				return false
			}
		}
		return true
	}()
	if gone {
		// Volume has been cleaned up
		break
	}
	if time.Now().After(deadline) {
		t.Fatalf("Volume with prefix %s should have been deleted within timeout", expectedPrefix)
	}
	time.Sleep(500 * time.Millisecond)
}
```
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/database_persistence_e2e_test.go`:
- Line 31: Replace the magic number modulus in the dbName generation with a
named constant (e.g., const dbSuffixMod = 1000 or better const dbPrefixLen = 8
if switching to hex/UUID) and ensure any slicing of dbID (dbID[:8]) is guarded
by computing min(len(dbID), dbPrefixLen) or by using a helper GetPrefix(dbID
string, n int) to safely return the first n characters; update all occurrences
(dbName creation and the dbID[:8] uses) to use the constant/helper so you avoid
magic numbers and prevent panics if ID format changes.
- Around line 79-85: Replace the fixed time.Sleep with bounded polling that
repeatedly calls getRequest(t, client, testutil.TestBaseURL+"/volumes", token)
(the same logic around respVols and respVols.Body.Close()) until the expected
cleanup condition is observed or a timeout elapses; implement this either using
require.Eventually or a manual loop with time.Ticker and time.After (e.g., poll
every 100–200ms up to a 5s timeout), and then assert the final HTTP status/body
indicates the Volume is gone. Apply the same change to the second occurrence
handling the later cleanup check.
- Line 41: The defer statements currently discarding Close() errors (e.g., defer
func() { _ = resp.Body.Close() }()) should be replaced with deferred closures
that check the error and report it to the test harness instead of ignoring it;
locate the places where the http response variable resp is closed (the deferred
Close calls around resp) and change each to something that captures err :=
resp.Body.Close() and calls t.Fatalf/t.Fatalf-like assertion (or
require.NoError(t, err)) so test failures are surfaced rather than swallowed.
- Around line 34-39: The test CreateDatabase_ProvisionsVolume should be
converted to a table-driven subtest: replace the hard-coded payload using
"engine":"postgres" with a slice of test cases (e.g., cases with engine
"postgres"/version "16" and engine "mysql"/appropriate version) and iterate over
them, running t.Run for each case (use a descriptive name like
fmt.Sprintf("CreateDatabase_ProvisionsVolume/%s", tc.engine)); inside the loop
build the payload map from tc.engine and tc.version, execute the existing
assertions for provisioning and volume lifecycle, and ensure any cleanup runs
per-case so both Postgres and MySQL paths are validated.
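The prefix-guard idea from the first prompt above can be sketched as a small helper; the `safePrefix` and `dbPrefixLen` names follow the prompt's suggestions, everything else is illustrative:

```go
package main

import "fmt"

const dbPrefixLen = 8 // named constant replacing the magic 8 in dbID[:8]

// safePrefix returns at most n leading characters of s, avoiding the
// slice-bounds panic that dbID[:8] would hit if the ID format shrinks.
func safePrefix(s string, n int) string {
	if len(s) < n {
		return s
	}
	return s[:n]
}

func main() {
	fmt.Println(safePrefix("0b7e3f21-9c4d", dbPrefixLen)) // 0b7e3f21
	fmt.Println(safePrefix("abc", dbPrefixLen))           // abc
}
```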
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
```go
t.Run(fmt.Sprintf("Engine/%s", tc.engine), func(t *testing.T) {
	var dbID string
	dbName := fmt.Sprintf("e2e-persistent-%s-%d", tc.engine, time.Now().UnixNano()%dbSuffixMod)
```
This E2E test only deletes the database in a later subtest; if the create subtest fails early (e.g., require aborts before dbID is set), the database/volume can be left behind and interfere with subsequent runs. Consider registering a t.Cleanup (once dbID is known) to best-effort delete the DB/volume even when assertions fail.
```go
backendVolName := "thecloud-vol-" + vol.ID.String()[:8]
if vol.BackendPath != "" {
	backendVolName = vol.BackendPath
}

mountPath := "/var/lib/postgresql/data"
if dbEngine == domain.EngineMySQL {
	mountPath = "/var/lib/mysql"
}
volumeBinds := []string{fmt.Sprintf("%s:%s", backendVolName, mountPath)}
```
backendVolName is used as the source for Docker HostConfig.Binds, but vol.BackendPath can be a block-device path (e.g. LVM returns /dev/<vg>/<lv>), which Docker bind-mount cannot mount onto a data directory like /var/lib/postgresql/data. Additionally, the current noop storage backend returns a constant path, which would cause multiple DBs to share the same underlying mount. Consider aligning the storage backend contract with the Docker compute backend (e.g., return a unique host directory or Docker volume name for Docker, and only return /dev/... paths for hypervisors; or introduce a dedicated Docker storage backend / device-mount support).
```diff
 // Create persistent volume for the database
 volName := fmt.Sprintf("db-vol-%s", db.ID.String()[:8])
 vol, err := s.volumeSvc.CreateVolume(ctx, volName, 10) // 10GB default
 if err != nil {
 	return nil, errors.Wrap(errors.Internal, "failed to create persistent volume", err)
 }

 backendVolName := "thecloud-vol-" + vol.ID.String()[:8]
 if vol.BackendPath != "" {
 	backendVolName = vol.BackendPath
 }

 mountPath := "/var/lib/postgresql/data"
 if dbEngine == domain.EngineMySQL {
 	mountPath = "/var/lib/mysql"
 }
 volumeBinds := []string{fmt.Sprintf("%s:%s", backendVolName, mountPath)}

 dockerName := fmt.Sprintf("cloud-db-%s-%s", name, db.ID.String()[:8])
 containerID, allocatedPorts, err := s.compute.LaunchInstanceWithOptions(ctx, ports.CreateInstanceOptions{
 	Name:        dockerName,
 	ImageName:   imageName,
 	Ports:       []string{"0:" + defaultPort},
 	NetworkID:   networkID,
-	VolumeBinds: nil,
+	VolumeBinds: volumeBinds,
 	Env:         env,
 	Cmd:         nil,
 })
```
A persistent volume is created and mounted, but the volume record is never marked as IN-USE / associated with the running database container. As a result, users can delete the volume via the Volume API while the DB is running (since DeleteVolume only blocks IN-USE), risking data loss. After a successful launch, consider updating the volume metadata (status + attachment info) and clearing it on DB delete, likely via a VolumeService method rather than leaving it AVAILABLE.
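A minimal sketch of the proposed status/attachment update, assuming hypothetical `Status`/`AttachedTo` fields and status strings — the project's real volume model may differ:

```go
package main

import (
	"errors"
	"fmt"
)

// Volume models just the fields this comment is about.
type Volume struct {
	Status     string
	AttachedTo string
}

// attach marks the volume IN-USE so DeleteVolume (which blocks IN-USE
// volumes) protects a database's live storage.
func attach(v *Volume, containerID string) error {
	if v.Status != "AVAILABLE" {
		return errors.New("volume not available")
	}
	v.Status = "IN-USE"
	v.AttachedTo = containerID
	return nil
}

// detach clears the attachment on DB delete, making the volume deletable.
func detach(v *Volume) {
	v.Status = "AVAILABLE"
	v.AttachedTo = ""
}

func main() {
	v := &Volume{Status: "AVAILABLE"}
	_ = attach(v, "container-123")
	fmt.Println(v.Status) // IN-USE
	detach(v)
	fmt.Println(v.Status) // AVAILABLE
}
```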
```go
if err != nil {
	_ = s.volumeSvc.DeleteVolume(ctx, vol.ID.String())
	s.logger.Error("failed to launch database container", "error", err)
	return nil, errors.Wrap(errors.Internal, "failed to launch database container", err)
```
On container launch failure, the rollback deletes the volume but ignores any error. If deletion fails, the DB create request returns an error while leaving an orphaned volume behind with no logging, which will be hard to diagnose operationally. Consider logging a warning (or returning a combined error) when volume cleanup fails in rollback paths.
```go
backendVolName := "thecloud-vol-" + vol.ID.String()[:8]
if vol.BackendPath != "" {
	backendVolName = vol.BackendPath
}

mountPath := "/var/lib/postgresql/data"
if db.Engine == domain.EngineMySQL {
	mountPath = "/var/lib/mysql"
}
volumeBinds := []string{fmt.Sprintf("%s:%s", backendVolName, mountPath)}
```
Replica volume binding has the same backend-path mismatch risk as primary: vol.BackendPath may be a block device path (LVM) which Docker cannot bind-mount to a directory, and with noop storage it may not be unique. This can prevent replica containers from starting or cause replicas to share storage unexpectedly. Consider making the bind source a Docker-compatible mountpoint/volume name for the Docker backend (or adding device-mount support).
```go
// Use a Docker-compatible named volume instead of the raw backend path to
// avoid binding block devices or non-unique backend paths.
dockerVolumeName := "thecloud-vol-" + vol.ID.String()
mountPath := "/var/lib/postgresql/data"
if db.Engine == domain.EngineMySQL {
	mountPath = "/var/lib/mysql"
}
volumeBinds := []string{fmt.Sprintf("%s:%s", dockerVolumeName, mountPath)}
```
```go
volRepo := postgres.NewVolumeRepository(db)
storage := noop.NewNoopStorageBackend()
volumeSvc := services.NewVolumeService(volRepo, storage, eventSvc, auditSvc, slog.Default())
```
NoopStorageBackend.CreateVolume returns a constant backend path ("vol-1"), so wiring DatabaseService tests to NewNoopStorageBackend() means every DB volume bind will point at the same underlying Docker volume name. This can cause cross-test contamination and make DB lifecycle tests flaky/incorrect. Consider using a storage backend stub that returns a unique backend name per call (e.g., based on the requested name), or adjusting the noop backend to return the provided backend name.
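A per-name stub backend could be as small as this sketch; the method shape and path scheme are assumptions, not the project's actual storage interface:

```go
package main

import "fmt"

// stubStorage is a test storage backend whose CreateVolume returns a
// unique backend path derived from the requested name, avoiding the
// shared "vol-1" path of the noop backend.
type stubStorage struct{}

func (stubStorage) CreateVolume(name string, sizeGB int) (string, error) {
	return "/tmp/stub-vols/" + name, nil // unique per requested name
}

func main() {
	s := stubStorage{}
	a, _ := s.CreateVolume("db-vol-ab12cd34", 10)
	b, _ := s.CreateVolume("db-vol-ef56ab78", 10)
	fmt.Println(a != b) // true: no cross-test contamination
}
```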
```go
for _, tc := range testCases {
	t.Run(fmt.Sprintf("Engine/%s", tc.engine), func(t *testing.T) {
		var dbID string
		dbName := fmt.Sprintf("e2e-persistent-%s-%d", tc.engine, time.Now().UnixNano()%dbSuffixMod)
```
dbName uses time.Now().UnixNano()%dbSuffixMod with dbSuffixMod=1000, which increases the chance of collisions across repeated/parallel CI runs (only 1000 possible suffixes). Consider using the full UnixNano() value or a UUID-based suffix so test resources are reliably unique.
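A collision-resistant suffix could be generated like this sketch; the helper name and byte length are illustrative:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// uniqueSuffix returns a random hex string, giving far more entropy than
// time.Now().UnixNano()%1000 (only 1000 possible values).
func uniqueSuffix(nBytes int) string {
	b := make([]byte, nBytes)
	if _, err := rand.Read(b); err != nil {
		panic(err) // crypto/rand should not fail on supported platforms
	}
	return hex.EncodeToString(b)
}

func main() {
	dbName := fmt.Sprintf("e2e-persistent-%s-%s", "postgres", uniqueSuffix(4))
	fmt.Println(len(dbName) > 0, len(uniqueSuffix(4))) // true 8
}
```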
🧹 Nitpick comments (1)
tests/database_persistence_e2e_test.go (1)
40-141: Consider adding replica volume lifecycle coverage.

The test validates primary database volume provisioning and cleanup effectively. However, the PR objectives mention that replicas also utilize persistent volumes. Consider extending the test to verify that:

- Creating a replica provisions a volume with the `db-replica-vol-` prefix
- Deleting a replica cleans up its associated volume
This would ensure complete E2E coverage of the persistence feature for both primary and replica databases.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/database_persistence_e2e_test.go` around lines 40 - 141, Test lacks replica volume lifecycle checks; extend TestDatabasePersistenceE2E to create a replica for the created database, assert a volume exists with the "db-replica-vol-" prefix, then delete that replica and poll until the replica volume is removed. Use the same helpers used for primaries (postRequest, getRequest, deleteRequest) to call the replica creation and deletion endpoints, derive expectedPrefix with fmt.Sprintf("db-replica-vol-%s", safePrefix(replicaID, dbPrefixLen)), and reuse the existing volume-list assertions and require.Eventually polling logic to verify provisioning and cleanup. Ensure you capture and store the replica ID (similar to dbID) from the replica creation response and defer closeBody on all responses.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/database_persistence_e2e_test.go`:
- Around line 40-141: Test lacks replica volume lifecycle checks; extend
TestDatabasePersistenceE2E to create a replica for the created database, assert
a volume exists with the "db-replica-vol-" prefix, then delete that replica and
poll until the replica volume is removed. Use the same helpers used for
primaries (postRequest, getRequest, deleteRequest) to call the replica creation
and deletion endpoints, derive expectedPrefix with
fmt.Sprintf("db-replica-vol-%s", safePrefix(replicaID, dbPrefixLen)), and reuse
the existing volume-list assertions and require.Eventually polling logic to
verify provisioning and cleanup. Ensure you capture and store the replica ID
(similar to dbID) from the replica creation response and defer closeBody on all
responses.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- docs/FEATURES.md
- docs/adr/ADR-015-database-data-persistence.md
- docs/database.md
- tests/database_persistence_e2e_test.go
✅ Files skipped from review due to trivial changes (1)
- docs/adr/ADR-015-database-data-persistence.md
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
```diff
 // NewDatabaseService constructs a DatabaseService with its dependencies.
 func NewDatabaseService(params DatabaseServiceParams) *DatabaseService {
 	return &DatabaseService{
-		repo:     params.Repo,
-		compute:  params.Compute,
-		vpcRepo:  params.VpcRepo,
-		eventSvc: params.EventSvc,
-		auditSvc: params.AuditSvc,
-		logger:   params.Logger,
+		repo:      params.Repo,
+		compute:   params.Compute,
+		vpcRepo:   params.VpcRepo,
+		volumeSvc: params.VolumeSvc,
+		eventSvc:  params.EventSvc,
+		auditSvc:  params.AuditSvc,
+		logger:    params.Logger,
 	}
```
NewDatabaseService accepts VolumeSvc but doesn’t validate it. If a caller constructs the service without VolumeSvc (zero value nil) and then calls CreateDatabase/DeleteDatabase, this will panic at runtime. Consider failing fast in the constructor (return error or panic with a clear message) or defaulting to a no-op implementation to make the dependency requirement explicit.
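The fail-fast constructor idea, sketched with trimmed-down hypothetical types — the real `DatabaseServiceParams` has more fields, and the project may prefer panicking over returning an error:

```go
package main

import (
	"errors"
	"fmt"
)

type VolumeService struct{}

type DatabaseServiceParams struct {
	VolumeSvc *VolumeService
	// ... other dependencies elided ...
}

type DatabaseService struct {
	volumeSvc *VolumeService
}

// NewDatabaseService rejects a nil VolumeSvc at construction time, turning
// a latent runtime panic into an explicit, early error.
func NewDatabaseService(p DatabaseServiceParams) (*DatabaseService, error) {
	if p.VolumeSvc == nil {
		return nil, errors.New("database service: VolumeSvc is required")
	}
	return &DatabaseService{volumeSvc: p.VolumeSvc}, nil
}

func main() {
	_, err := NewDatabaseService(DatabaseServiceParams{})
	fmt.Println(err != nil) // true: nil dependency rejected at construction
}
```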
```go
require.NoError(t, json.NewDecoder(resp.Body).Decode(&res))
dbID = res.Data.ID.String()
assert.NotEmpty(t, dbID)
```
|
In this E2E, the database cleanup only happens in the dedicated DeleteDatabase_CleansUpVolume subtest. If CreateDatabase_ProvisionsVolume fails after the API creates the DB but before dbID is set/usable (e.g., decode error), the DB/volume can be left behind and affect later runs. Consider registering a t.Cleanup (or defer) right after successfully decoding/setting dbID to best-effort delete the DB even when intermediate assertions fail.
```go
t.Cleanup(func() {
	if dbID == "" {
		return
	}
	respDel := deleteRequest(t, client, testutil.TestBaseURL+"/databases/"+dbID, token)
	closeBody(t, respDel)
})
```
```go
if vol.BackendPath != "" {
	backendVolName = vol.BackendPath
}
```
backendVolName prefers vol.BackendPath when present. With the current NoopStorageBackend, CreateVolume always returns the constant path "vol-1", so every database would bind-mount the same underlying volume, causing cross-database data corruption in dev/test. Consider using the backend volume name you generated (thecloud-vol-<idprefix>) as the bind source (or make BackendPath unique per volume / return the backendName) so each DB gets an isolated mount source.
```diff
-if vol.BackendPath != "" {
-	backendVolName = vol.BackendPath
-}
```
This PR implements data persistence for the Managed Database Service by integrating it with the VolumeService.
Changes:
Summary by CodeRabbit