
fix(delete,retention): clear DuckDB cache and free OS memory after execution#372

Merged
xe-nvdk merged 1 commit into main from fix/delete-retention-memory on Apr 8, 2026

Conversation

xe-nvdk (Member) commented on Apr 8, 2026

Summary

  • delete.go and retention.go were not clearing DuckDB's parquet metadata/data cache after completing file operations, causing memory to accumulate permanently and requiring container restarts to recover
  • Mirrors the cleanup pattern compaction already uses (ClearHTTPCache + FreeOSMemory)
  • FreeOSMemory runs in a goroutine to avoid blocking the HTTP response with a GC pause
  • Added ROW_GROUP_SIZE 122880 to delete COPY queries to match compaction's write buffering
  • Fixed stale "after compaction" log messages in ClearHTTPCache()
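The cleanup pattern and the ROW_GROUP_SIZE addition described above can be sketched as follows. This is a minimal illustration, not Arc's actual code: the query shape, paths, and predicate are assumed, and the real ClearHTTPCache body is internal to Arc and elided behind a callback.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// parquetRowGroupSize matches the value this PR adds to delete's COPY
// queries so file rewrites buffer the same amount as compaction does.
const parquetRowGroupSize = 122880

// rewriteQuery builds a DuckDB COPY statement that rewrites a parquet file
// minus the deleted rows. Paths and predicate are illustrative; Arc's
// actual query text is not shown in this PR.
func rewriteQuery(src, predicate, dst string) string {
	return fmt.Sprintf(
		"COPY (SELECT * FROM read_parquet('%s') WHERE NOT (%s)) TO '%s' (FORMAT PARQUET, ROW_GROUP_SIZE %d)",
		src, predicate, dst, parquetRowGroupSize,
	)
}

// afterFileOps mirrors the cleanup pattern described above: clear DuckDB's
// parquet metadata/data cache, then return freed pages to the OS in a
// goroutine so the HTTP response is not blocked by the GC pause.
func afterFileOps(clearCache func()) {
	clearCache()            // e.g. Arc's ClearHTTPCache()
	go debug.FreeOSMemory() // GC + release pages, off the request path
}

func main() {
	fmt.Println(rewriteQuery("data.parquet", "ts < 1700000000", "data.tmp.parquet"))
	afterFileOps(func() { /* ClearHTTPCache() would run here */ })
}
```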

Test plan

  • go build ./internal/... passes
  • Trigger a retention policy execution while watching docker stats — memory should return to baseline after completion
  • Trigger a delete operation and verify same memory behavior
  • Dry-run retention should NOT trigger cache clearing (guarded by !req.DryRun)
  • Partial delete failures (some files fail) should still clear cache

Context

Reported by a user running Arc in Docker with a Garage S3 backend, who observed memory climbing by several GB during each nightly retention run, with no recovery until the container was restarted. Reproducible on v26.03.1 and v26.04.1.

gemini-code-assist (Bot, Contributor) left a comment


Code Review

This pull request addresses a memory retention issue by clearing DuckDB's internal caches and triggering garbage collection after delete and retention operations, while also optimizing memory usage during file rewrites by setting a row group size. The review highlights that the current implementation misses several exit paths and dry-run scenarios where DuckDB caches are still populated, suggesting the use of a defer block to ensure comprehensive cache cleanup.

Comment thread: internal/api/delete.go (outdated)
Comment thread: internal/api/retention.go (outdated)

fix(delete,retention): clear DuckDB cache and free OS memory after execution

Neither the delete handler nor the retention handler cleared DuckDB's
parquet metadata/data cache after completing their file operations,
causing memory to accumulate and never be released — requiring container
restarts. Mirrors the cleanup pattern already used by compaction.

- Call ClearHTTPCache() after delete/retention completes (always, including
  dry runs and no-match paths — read_parquet populates cache regardless)
- Add freeOSMemoryThrottled() — debounced via atomic CAS, fires at most once
  every 30s in a goroutine to prevent GC storms under concurrent requests
- Add ROW_GROUP_SIZE 122880 constant (parquetRowGroupSize) to cap write
  buffering during file rewrites, matching compaction's row group size
- Fix stale "after compaction" log messages in ClearHTTPCache()
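The debounce the commit message describes can be sketched like this. The function name comes from the message itself, but the body below is an assumed implementation under stated constraints (at most one FreeOSMemory per 30s, race-safe via atomic compare-and-swap), not Arc's actual code.

```go
package main

import (
	"runtime/debug"
	"sync/atomic"
	"time"
)

// lastFreeNano records (in unix nanoseconds) when FreeOSMemory last fired.
var lastFreeNano atomic.Int64

const freeOSMemoryInterval = 30 * time.Second

// freeOSMemoryThrottled triggers debug.FreeOSMemory at most once per
// interval. The compare-and-swap ensures that under concurrent requests
// only one caller wins the slot, preventing GC storms.
func freeOSMemoryThrottled() {
	now := time.Now().UnixNano()
	last := lastFreeNano.Load()
	if now-last < int64(freeOSMemoryInterval) {
		return // ran recently; skip
	}
	if !lastFreeNano.CompareAndSwap(last, now) {
		return // another goroutine claimed this slot
	}
	go debug.FreeOSMemory() // off the request path
}

func main() {
	freeOSMemoryThrottled()
	freeOSMemoryThrottled() // debounced: no second run within 30s
}
```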
xe-nvdk force-pushed the fix/delete-retention-memory branch from bcaa2b0 to 9bb839f on April 8, 2026 15:12
xe-nvdk merged commit 9db0a33 into main on Apr 8, 2026
6 checks passed
