Upgrade DuckDB to 1.5.0 + automatic parquet compaction #516

Merged
erikdarlingdata merged 1 commit into dev from feature/duckdb-1.5-compaction
Mar 11, 2026

@erikdarlingdata
Owner

Summary

  • DuckDB 1.5.0: Non-blocking checkpointing, free block reuse, 17% throughput improvement. Storage format upgrades transparently on first open.
  • Automatic parquet compaction: Per-cycle archive files are compacted into monthly files (YYYYMM_tablename.parquet) after each archive cycle. Strips the dead query_plan_text column from query_store_stats. Reduces the steady-state archive from thousands of files to ~75 (25 tables × 3 months).
  • Monthly retention: RetentionService switched from 90-day file deletion to 3-month monthly file deletion. Recognizes all naming formats.
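
To make the monthly naming concrete, here is a minimal Python sketch of how per-cycle files could be grouped under their `YYYYMM_tablename.parquet` target. It is not the ArchiveService code: `monthly_target`, `plan_compaction`, and the `(table, day, path)` input shape are hypothetical names chosen for illustration, and the file paths are made up.

```python
from collections import defaultdict
from datetime import date


def monthly_target(table, day):
    """Name of the monthly file a per-cycle file rolls up into."""
    # Follows the YYYYMM_tablename.parquet pattern from the summary.
    return f"{day:%Y%m}_{table}.parquet"


def plan_compaction(cycle_files):
    """Group per-cycle files by monthly target.

    cycle_files is an iterable of (table, day, path) tuples; this input
    shape is an assumption made for the sketch.
    """
    plan = defaultdict(list)
    for table, day, path in cycle_files:
        plan[monthly_target(table, day)].append(path)
    # Sort the sources so each plan is deterministic.
    return {target: sorted(paths) for target, paths in plan.items()}


if __name__ == "__main__":
    files = [
        ("wait_stats", date(2026, 3, 10), "a.parquet"),
        ("wait_stats", date(2026, 3, 11), "b.parquet"),
        ("wait_stats", date(2026, 2, 28), "c.parquet"),
    ]
    for target, sources in plan_compaction(files).items():
        print(target, sources)
```

With one target per table per month and three months retained, 25 tables give the ~75 files quoted above.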

Test plan

  • DuckDB 1.5.0 upgrade: zero build errors, storage format upgraded transparently, parquet files compatible
  • Compaction: v_wait_stats query dropped from 1.7s to 0.03s after consolidating 233 files to 19
  • Lite refresh stable at 1.3-1.5s over 30+ minutes of monitoring (down from 6-13s pre-optimization)
  • No slow query alerts in 5+ hours after compaction
  • DB file size stable at ~424MB via free block reuse (was growing unbounded on 1.4.4)
  • EXCLUDE column bug fixed: schema inspection before applying EXCLUDE prevents errors on already-stripped files
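
The EXCLUDE fix in the last item can be sketched as follows. This is an illustrative Python sketch, not the code in this PR: `build_compaction_sql` is a hypothetical helper, and it assumes the caller has already inspected the schema (for example with DuckDB's `DESCRIBE SELECT * FROM read_parquet(...)`) and passes in the column names it found. The point is that `* EXCLUDE (...)` is only emitted when the column is actually present, since DuckDB raises a binder error when an excluded column does not exist.

```python
def build_compaction_sql(source_files, target_file, columns,
                         dead_columns=("query_plan_text",)):
    """Build a DuckDB COPY statement merging source_files into target_file.

    columns: names found by schema inspection of the source files
    (assumed to be gathered by the caller before this is called).
    """
    # Only exclude dead columns that are really there; files that were
    # already stripped would otherwise make EXCLUDE fail.
    present = [c for c in dead_columns if c in columns]
    select_list = "*"
    if present:
        quoted = ", ".join(f'"{c}"' for c in present)
        select_list = f"* EXCLUDE ({quoted})"

    def lit(value):
        # Single-quoted SQL string literal with embedded quotes doubled.
        return "'" + value.replace("'", "''") + "'"

    file_list = ", ".join(lit(f) for f in source_files)
    return (
        f"COPY (SELECT {select_list} "
        f"FROM read_parquet([{file_list}], union_by_name = true)) "
        f"TO {lit(target_file)} (FORMAT PARQUET)"
    )


if __name__ == "__main__":
    print(build_compaction_sql(
        ["a.parquet", "b.parquet"],
        "202603_query_store_stats.parquet",
        {"query_id", "query_plan_text"},
    ))
```

`union_by_name = true` is used here so that a mix of stripped and unstripped files can be read together; whether the PR does the same is not stated.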

🤖 Generated with Claude Code

…ntion

DuckDB 1.5.0: non-blocking checkpointing, free block reuse, 17% throughput
improvement. Storage format v67→v68 upgrades transparently on first open.

ArchiveService: compact per-cycle parquet files into monthly files
(YYYYMM_tablename.parquet) after each archive cycle. Strips dead
query_plan_text column from query_store_stats during compaction. Uses
in-memory DuckDB connection — no contention with collectors. Reduces
steady-state archive from thousands of files to ~75 (25 tables × 3 months).

RetentionService: switch from 90-day file deletion to 3-month monthly
file deletion. Recognizes all naming formats (YYYYMM_, YYYYMMDD_, YYYY-MM_).
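
A minimal Python sketch of the retention check, assuming the three prefixes listed above and a "keep the current month plus the two before it" reading of the 3-month window. `file_month` and `is_expired` are hypothetical names, not RetentionService members, and the file names in the usage lines are made up.

```python
import re
from datetime import date

# Matches YYYYMM_, YYYYMMDD_ and YYYY-MM_ prefixes.
_PREFIX = re.compile(r"^(\d{4})-?(\d{2})(\d{2})?_")


def file_month(name):
    """Return (year, month) parsed from the file name prefix, or None."""
    m = _PREFIX.match(name)
    if not m:
        return None
    year, month = int(m.group(1)), int(m.group(2))
    if not 1 <= month <= 12:
        return None
    return year, month


def is_expired(name, today, keep_months=3):
    """True if the file's month falls outside the retention window."""
    ym = file_month(name)
    if ym is None:
        # Unrecognized names are left alone rather than deleted.
        return False
    age_in_months = (today.year - ym[0]) * 12 + (today.month - ym[1])
    return age_in_months >= keep_months


if __name__ == "__main__":
    today = date(2026, 3, 11)
    for name in ("202603_wait_stats.parquet",
                 "20260115_wait_stats.parquet",
                 "2025-12_wait_stats.parquet"):
        print(name, is_expired(name, today))
```

Comparing whole months rather than days means a file is never deleted partway through the month it belongs to.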

Benchmarked: v_wait_stats query dropped from 1.7s to 0.03s after compaction.
Lite refresh stable at 1.3-1.5s (down from 6-13s pre-optimization).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@erikdarlingdata erikdarlingdata merged commit 34aed9d into dev Mar 11, 2026
3 checks passed
@erikdarlingdata erikdarlingdata deleted the feature/duckdb-1.5-compaction branch April 9, 2026 00:32