[codex] Add durable cache spool by philcunliffe · Pull Request #1 · hyparam/hypaware

philcunliffe · 2026-05-21T16:37:16Z

Summary

Adds a durable JSONL spool in front of the HypAware Iceberg cache so live sources no longer commit every appendRows() call as a tiny Iceberg append.

What changed

appendRows() now durably appends batch envelopes to per-table JSONL spool files.
Iceberg flushes are triggered by a size threshold, a forced query refresh, or a sink export read.
Normal --refresh auto queries debounce cache flushes for 2 minutes and print a cache freshness message to stderr when pending rows are skipped.
--refresh always and hyp query refresh both force pending spool rows into Iceberg before the query runs.
Iceberg bytes_written accounting now uses snapshot metadata instead of scanning every file under data/.
Sink exporters force-flush pending rows before reading.
AI gateway shutdown now waits for async exchange finalizers so rows are not lost before query/export.
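
The spool and flush rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the Spool class, its method names, the envelope shape, the 1 MiB threshold, and the reading of "debounce" as "skip the flush if the last one was under 2 minutes ago" are all assumptions. Only the 2-minute window and the three flush triggers come from the description.

```javascript
// Hypothetical sketch: names, envelope shape, and FLUSH_BYTES are assumptions,
// not HypAware's actual implementation.
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

const FLUSH_BYTES = 1024 * 1024;   // assumed size threshold
const DEBOUNCE_MS = 2 * 60 * 1000; // 2-minute debounce from the PR description

class Spool {
  constructor(dir) {
    this.dir = dir;
    this.lastFlushMs = 0; // time of the last drain, in ms since epoch
    fs.mkdirSync(dir, { recursive: true });
  }

  // One JSONL spool file per table.
  fileFor(table) {
    return path.join(this.dir, `${table}.jsonl`);
  }

  // Append one batch envelope as a single JSON line and fsync before returning.
  appendRows(table, rows) {
    const line = JSON.stringify({ ts: Date.now(), rows }) + '\n';
    const fd = fs.openSync(this.fileFor(table), 'a');
    try {
      fs.writeSync(fd, line);
      fs.fsyncSync(fd);
    } finally {
      fs.closeSync(fd);
    }
  }

  // Size of the spool file, or 0 if nothing is pending.
  pendingBytes(table) {
    try {
      return fs.statSync(this.fileFor(table)).size;
    } catch {
      return 0;
    }
  }

  // reason: 'forced' (--refresh always / hyp query refresh),
  //         'export' (sink export read), or 'auto' (--refresh auto).
  shouldFlush(table, reason, nowMs = Date.now()) {
    const pending = this.pendingBytes(table);
    if (pending === 0) return false;
    if (reason === 'forced' || reason === 'export') return true;
    if (pending >= FLUSH_BYTES) return true;
    // Auto refresh: skip the flush while still inside the debounce window.
    return nowMs - this.lastFlushMs >= DEBOUNCE_MS;
  }

  // Read every pending row and clear the spool. The caller would commit the
  // rows as a single Iceberg append; a real implementation should delete the
  // spool file only after that commit succeeds.
  drain(table, nowMs = Date.now()) {
    const file = this.fileFor(table);
    if (!fs.existsSync(file)) return [];
    const rows = fs
      .readFileSync(file, 'utf8')
      .split('\n')
      .filter(Boolean)
      .flatMap((l) => JSON.parse(l).rows);
    fs.rmSync(file);
    this.lastFlushMs = nowMs;
    return rows;
  }
}
```

In this sketch a source that appends one row at a time pays for one small fsync per call, and the rows reach Iceberg together when drain() runs, so many appendRows() calls collapse into one table commit.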

Why

The previous path wrote directly to Iceberg on every cache append. Event sources like gascity and ai-gateway often append one row at a time, which produced many tiny Parquet files and metadata commits. The old byte accounting also scanned the full Iceberg data directory on every append, so writes slowed down as the file count grew.
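
As a rough illustration of the metadata-based accounting, the bytes added by the latest commit can be read from the current snapshot's summary rather than by walking data/. The field names below (current-snapshot-id, snapshots, summary, added-files-size) follow the Iceberg table spec, where summary values are strings and added-files-size is optional. The function name and the choice of that particular field are assumptions, not necessarily what this PR does.

```javascript
// Hypothetical sketch: reads bytes added by the current snapshot from parsed
// Iceberg table metadata JSON. Which summary field HypAware uses is an assumption.
function bytesAddedByCurrentSnapshot(metadata) {
  const snapshots = metadata.snapshots || [];
  const current = snapshots.find(
    (s) => s['snapshot-id'] === metadata['current-snapshot-id']
  );
  // Summary values are strings in the metadata file; the field is optional.
  const raw = current && current.summary
    ? current.summary['added-files-size']
    : undefined;
  const n = Number(raw);
  return Number.isFinite(n) ? n : 0;
}
```

The cost of this lookup depends on the number of snapshots in the metadata file, not on the number of data files on disk, which is the property the directory scan lacked.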

Validation

git diff --check
node --check on changed JS files
npm run smoke -- cache_spool_batching
npm run smoke -- cache_query_debounce
npm run smoke -- cache_roundtrip
npm run smoke -- ai_gateway_passthrough
npm run smoke -- gascity_attach_writes_partition
npm run smoke -- otel_listener_writes_rows
npm run smoke -- sink_export_driver
npm run smoke -- local_parquet_export
npm run smoke -- central_forward_outbox
npm run smoke -- blob_sink_parquet_local_fs
npm run smoke -- gateway_codex_capture
npm run smoke -- gateway_claude_capture

Add durable cache spool

1c47926

philcunliffe marked this pull request as ready for review May 21, 2026 16:37

philcunliffe merged commit c72ef1d into master May 21, 2026

This was referenced May 21, 2026

hy-2xeb: Install pipeline for GitHub-URL plugins (hy-gh-1) #9

Merged

Feature: tui-prompts #26

Merged

platypii deleted the codex/cache-jsonl-spool branch May 28, 2026 06:37

philcunliffe mentioned this pull request May 29, 2026

Feature: client-backfill #66

Merged

philcunliffe merged 1 commit into master from codex/cache-jsonl-spool
