
fix(delete,retention): clear DuckDB cache and free OS memory after execution#372

Merged
xe-nvdk merged 1 commit into main from fix/delete-retention-memory on Apr 8, 2026

Conversation

xe-nvdk (Member) commented on Apr 8, 2026

Summary

  • delete.go and retention.go were not clearing DuckDB's parquet metadata/data cache after completing file operations, causing memory to accumulate permanently and requiring container restarts to recover
  • Mirrors the cleanup pattern compaction already uses (ClearHTTPCache + FreeOSMemory)
  • FreeOSMemory runs in a goroutine to avoid blocking the HTTP response with a GC pause
  • Added ROW_GROUP_SIZE 122880 to delete COPY queries to match compaction's write buffering
  • Fixed stale "after compaction" log messages in ClearHTTPCache()
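The cleanup pattern and the ROW_GROUP_SIZE addition described above can be sketched as follows. This is a minimal illustration, not Arc's actual code: the query shape, paths, and predicate are assumed, and the real ClearHTTPCache body is internal to Arc and elided behind a callback.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// parquetRowGroupSize matches the value this PR adds to delete's COPY
// queries so file rewrites buffer the same amount as compaction does.
const parquetRowGroupSize = 122880

// rewriteQuery builds a DuckDB COPY statement that rewrites a parquet file
// minus the deleted rows. Paths and predicate are illustrative; Arc's
// actual query text is not shown in this PR.
func rewriteQuery(src, predicate, dst string) string {
	return fmt.Sprintf(
		"COPY (SELECT * FROM read_parquet('%s') WHERE NOT (%s)) TO '%s' (FORMAT PARQUET, ROW_GROUP_SIZE %d)",
		src, predicate, dst, parquetRowGroupSize,
	)
}

// afterFileOps mirrors the cleanup pattern described above: clear DuckDB's
// parquet metadata/data cache, then return freed pages to the OS in a
// goroutine so the HTTP response is not blocked by the GC pause.
func afterFileOps(clearCache func()) {
	clearCache()            // e.g. Arc's ClearHTTPCache()
	go debug.FreeOSMemory() // GC + release pages, off the request path
}

func main() {
	fmt.Println(rewriteQuery("data.parquet", "ts < 1700000000", "data.tmp.parquet"))
	afterFileOps(func() { /* ClearHTTPCache() would run here */ })
}
```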

Test plan

  • go build ./internal/... passes
  • Trigger a retention policy execution while watching docker stats — memory should return to baseline after completion
  • Trigger a delete operation and verify same memory behavior
  • Dry-run retention should NOT trigger cache clearing (guarded by !req.DryRun)
  • Partial delete failures (some files fail) should still clear cache

Context

Reported by a user running Arc in Docker with a Garage S3 backend, who observed memory climbing by several GB during each nightly retention run, with no recovery until the container was restarted. Reproducible on v26.03.1 and v26.04.1.

gemini-code-assist (Bot, Contributor) left a comment


Code Review

This pull request addresses a memory retention issue by clearing DuckDB's internal caches and triggering garbage collection after delete and retention operations, while also optimizing memory usage during file rewrites by setting a row group size. The review highlights that the current implementation misses several exit paths and dry-run scenarios where DuckDB caches are still populated, suggesting the use of a defer block to ensure comprehensive cache cleanup.

Comment thread: internal/api/delete.go (outdated)
Comment thread: internal/api/retention.go (outdated)

fix(delete,retention): clear DuckDB cache and free OS memory after execution

Neither the delete handler nor the retention handler cleared DuckDB's
parquet metadata/data cache after completing their file operations,
causing memory to accumulate and never be released — requiring container
restarts. Mirrors the cleanup pattern already used by compaction.

- Call ClearHTTPCache() after delete/retention completes (always, including
  dry runs and no-match paths — read_parquet populates cache regardless)
- Add freeOSMemoryThrottled() — debounced via atomic CAS, fires at most once
  every 30s in a goroutine to prevent GC storms under concurrent requests
- Add ROW_GROUP_SIZE 122880 constant (parquetRowGroupSize) to cap write
  buffering during file rewrites, matching compaction's row group size
- Fix stale "after compaction" log messages in ClearHTTPCache()
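The debounce the commit message describes can be sketched like this. The function name comes from the message itself, but the body below is an assumed implementation under stated constraints (at most one FreeOSMemory per 30s, race-safe via atomic compare-and-swap), not Arc's actual code.

```go
package main

import (
	"runtime/debug"
	"sync/atomic"
	"time"
)

// lastFreeNano records (in unix nanoseconds) when FreeOSMemory last fired.
var lastFreeNano atomic.Int64

const freeOSMemoryInterval = 30 * time.Second

// freeOSMemoryThrottled triggers debug.FreeOSMemory at most once per
// interval. The compare-and-swap ensures that under concurrent requests
// only one caller wins the slot, preventing GC storms.
func freeOSMemoryThrottled() {
	now := time.Now().UnixNano()
	last := lastFreeNano.Load()
	if now-last < int64(freeOSMemoryInterval) {
		return // ran recently; skip
	}
	if !lastFreeNano.CompareAndSwap(last, now) {
		return // another goroutine claimed this slot
	}
	go debug.FreeOSMemory() // off the request path
}

func main() {
	freeOSMemoryThrottled()
	freeOSMemoryThrottled() // debounced: no second run within 30s
}
```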
xe-nvdk force-pushed the fix/delete-retention-memory branch from bcaa2b0 to 9bb839f on April 8, 2026 15:12
xe-nvdk merged commit 9db0a33 into main on Apr 8, 2026
6 checks passed
