Lite: fix compaction OOM by setting DuckDB temp_directory (#933) #935

Merged
erikdarlingdata merged 1 commit into dev from
feature/933-compaction-temp-directory
May 5, 2026

Conversation

@erikdarlingdata
Owner

Summary

  • The in-memory DuckDB connections used for parquet compaction had a 4 GB memory_limit pragma but no temp_directory, so the cap acted as a hard wall — DuckDB had nowhere to spill and OOM'd the moment it was hit.
  • Set temp_directory to <archive>/duckdb_tmp/ on both compaction connections (small-group and incremental pair-merge). Co-locating the spill directory with the archive keeps spill writes on the same volume as the parquet files; a minimal sketch of the setup follows below.
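
For illustration, a minimal sketch of the change on a DuckDB.NET in-memory connection. The archivePath variable and the exact connection wiring here are assumptions for the example, not the project's actual code:

```csharp
using System.IO;
using DuckDB.NET.Data;

// Hypothetical archive location; stands in for the monitor's real path.
var archivePath = @"C:\PerformanceMonitor\archive";
var tempDir = Path.Combine(archivePath, "duckdb_tmp");

using var connection = new DuckDBConnection("DataSource=:memory:");
connection.Open();

using var pragma = connection.CreateCommand();

// The original cap: on an in-memory database with no temp_directory,
// this acts as a hard wall, because there is nowhere to spill when hit.
pragma.CommandText = "PRAGMA memory_limit='4GB';";
pragma.ExecuteNonQuery();

// The fix: give DuckDB a spill location co-located with the archive,
// turning the cap into a spill threshold instead of an OOM.
pragma.CommandText = $"SET temp_directory='{tempDir}';";
pragma.ExecuteNonQuery();
```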

Verification

Tested end-to-end against 4 monitored SQL Servers under HammerDB load:

  • First 512 MB reset (single-file groups) — 26 groups compacted, no errors.
  • Second 512 MB reset (multi-file groups, the path that OOM'd in [BUG] Memory usage on client #933) — 21 groups merged with 2 source files each, completed in ~3.5s, duckdb_tmp/ cleaned up on connection close, no OOM.

Closes #933.

Test plan

  • Builds clean (0 errors / 0 warnings on incremental build)
  • First reset (single-file compaction) succeeds with new code path
  • Second reset (multi-file pair-merge) succeeds — the actual OOM-prone path from [BUG] Memory usage on client #933
  • duckdb_tmp/ directory created on first archive cycle and cleaned by DuckDB on connection close
  • No regression in collector health during archive cycles
  • Reporter (000al000) confirms fix on their 4-server environment

🤖 Generated with Claude Code

The in-memory DuckDB connections used for parquet compaction had a 4 GB
memory_limit pragma but no temp_directory, so the cap acted as a hard
wall — DuckDB had nowhere to spill and OOM'd the moment it was hit.

Co-locate the spill dir with the archive folder so the writes land on
the same volume as the parquet files. Verified end-to-end: 4-server
HammerDB load, second 512 MB reset triggered ArchiveAllAndResetAsync,
all 21 groups went through the multi-file pair-merge path with two
sources each, completed in ~3.5s with no OOM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@erikdarlingdata merged commit e0cd22b into dev on May 5, 2026
2 checks passed
MisterZeus pushed a commit to MisterZeus/PerformanceMonitor that referenced this pull request May 8, 2026
…don't OOM

erikdarlingdata#935 added temp_directory so DuckDB could spill, but on wider workloads
the working set still blew past the 4 GB cap before spill caught up
(reporter saw OOM at 3.7 GiB compacting 15 query_snapshots files).
Three knobs combined to feed that:

- memory_limit = 4 GB was too high — DuckDB held off spilling until late
- threads defaulted to N cores, multiplying per-thread row-group buffers
- ROW_GROUP_SIZE 122880 buffered up to 122k wide-VARCHAR rows per group

Drop memory_limit to 1 GB, cap threads to 2, and shrink ROW_GROUP_SIZE
to 8192. On 1.7 M rows of real query_stats data this drops peak working
set from 1236 MB → 166 MB (87% reduction) at a 31% wall-time cost.
Memory now plateaus instead of growing with row count, which is the
load-bearing change for issue erikdarlingdata#933.

Adds tools/CompactionRepro — a standalone reproducer that splits a real
monthly parquet file into N per-cycle-shaped chunks and runs the same
pair-merge logic with the tuning knobs exposed on the command line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
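
A rough sketch of the combined tuning that commit describes, again assuming DuckDB.NET; the input and output file names below are placeholders, not the repo's real identifiers:

```csharp
using DuckDB.NET.Data;

using var connection = new DuckDBConnection("DataSource=:memory:");
connection.Open();

using var cmd = connection.CreateCommand();

// Lower cap: DuckDB begins spilling far earlier instead of letting the
// working set climb toward 4 GB before spill catches up.
cmd.CommandText = "PRAGMA memory_limit='1GB';";
cmd.ExecuteNonQuery();

// Fewer threads: each thread buffers its own parquet row groups, so the
// default of N cores multiplies peak memory on wide machines.
cmd.CommandText = "PRAGMA threads=2;";
cmd.ExecuteNonQuery();

// Smaller row groups: the default of 122,880 rows per group can hold a
// lot of wide-VARCHAR data in memory before each flush; 8192 keeps that
// buffer small at some cost in wall time.
cmd.CommandText = @"
    COPY (SELECT * FROM read_parquet(['a.parquet', 'b.parquet']))
    TO 'merged.parquet' (FORMAT PARQUET, ROW_GROUP_SIZE 8192);";
cmd.ExecuteNonQuery();
```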
