[codex] Add durable cache spool #1

Merged: philcunliffe merged 1 commit into master from codex/cache-jsonl-spool on May 21, 2026

@philcunliffe
Contributor

Summary

Adds a durable JSONL spool in front of the HypAware Iceberg cache so live sources no longer commit every appendRows() call as a tiny Iceberg append.

What changed

  • appendRows() now durably appends batch envelopes to per-table JSONL spool files.
  • Iceberg flushes are triggered by a size threshold, a forced query refresh, or a sink export read.
  • Normal --refresh auto queries debounce cache flushes for 2 minutes and print a cache freshness message to stderr when pending rows are skipped.
  • Both --refresh always and hyp query refresh force pending spool rows into Iceberg before the query runs.
  • Iceberg bytes_written accounting now uses snapshot metadata instead of scanning every file under data/.
  • Sink exporters force-flush pending rows before reading.
  • AI gateway shutdown now waits for async exchange finalizers so rows are not lost before query/export.
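
To make the spool and flush rules above concrete, here is a minimal sketch rather than the code from this PR. The envelope shape, the file layout (`<table>.jsonl`), the 1 MiB threshold, the `reason` values, and the helper names `pendingSpoolBytes`, `shouldFlush`, and `drainSpool` are all assumptions made for illustration; only `appendRows()` and the 2-minute debounce come from the description.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Assumed values: the PR states a 2-minute debounce but not the size threshold.
const FLUSH_BYTES = 1024 * 1024;
const DEBOUNCE_MS = 2 * 60 * 1000;

// Hypothetical layout: one JSONL spool file per table.
const spoolFile = (spoolDir, table) => path.join(spoolDir, `${table}.jsonl`);

// Durably append one batch envelope (one JSON object per line) to the table's spool.
function appendRows(spoolDir, table, rows) {
  fs.mkdirSync(spoolDir, { recursive: true });
  const envelope = { table, appendedAt: Date.now(), rows };
  const fd = fs.openSync(spoolFile(spoolDir, table), 'a');
  try {
    fs.writeSync(fd, JSON.stringify(envelope) + '\n');
    fs.fsyncSync(fd); // force the line to disk before reporting success
  } finally {
    fs.closeSync(fd);
  }
}

// Size of the pending spool in bytes; a missing file means nothing is pending.
function pendingSpoolBytes(spoolDir, table) {
  try {
    return fs.statSync(spoolFile(spoolDir, table)).size;
  } catch (err) {
    if (err.code === 'ENOENT') return 0;
    throw err;
  }
}

// Flush policy: forced refreshes and export reads always flush, the size
// threshold flushes regardless of age, and auto queries flush only when the
// last flush is at least one debounce window old.
function shouldFlush({ pendingBytes, reason, lastFlushAt, now }) {
  if (pendingBytes === 0) return false;
  if (reason === 'force' || reason === 'export') return true;
  if (pendingBytes >= FLUSH_BYTES) return true;
  if (reason === 'auto-query') return now - lastFlushAt >= DEBOUNCE_MS;
  return false;
}

// Hand every spooled row to `commit` in one batch, then truncate the spool.
// `commit` stands in for the single Iceberg append.
function drainSpool(spoolDir, table, commit) {
  const file = spoolFile(spoolDir, table);
  if (pendingSpoolBytes(spoolDir, table) === 0) return 0;
  const rows = fs.readFileSync(file, 'utf8')
    .split('\n')
    .filter((line) => line.length > 0)
    .flatMap((line) => JSON.parse(line).rows);
  commit(rows);
  fs.truncateSync(file, 0); // truncate only after the commit returned
  return rows.length;
}
```

In this sketch a crash between `commit` and the truncate would replay the same rows on the next drain, so a real implementation would need some way to make the commit idempotent or to mark envelopes as flushed. How the PR handles that case is not stated here.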

Why

The previous path wrote directly to Iceberg for every cache append. Event sources like gascity and ai-gateway often append one row at a time, which produced many tiny Parquet files and metadata commits. The old byte accounting also scanned the full Iceberg data directory on every append, so writes slowed as the file count grew.
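
As an illustration of the accounting change, the sketch below sums bytes from snapshot summaries in an already-parsed Iceberg table metadata object instead of walking data/. It assumes the writer populates the optional `added-files-size` summary field; the function name `bytesWrittenFromSnapshots` is hypothetical, and which fields the PR actually reads is not stated in the description.

```javascript
// Sum the bytes added by each snapshot, using only table metadata.
// Cost scales with the number of snapshots, not the number of data files.
function bytesWrittenFromSnapshots(metadata) {
  let total = 0;
  for (const snapshot of metadata.snapshots ?? []) {
    // Summary values are strings in the metadata JSON, so convert explicitly.
    const added = Number(snapshot.summary?.['added-files-size'] ?? 0);
    if (Number.isFinite(added)) total += added;
  }
  return total;
}

// Usage with a hand-written metadata fragment (values are made up).
const exampleMetadata = {
  snapshots: [
    { 'snapshot-id': 1, summary: { operation: 'append', 'added-files-size': '2048' } },
    { 'snapshot-id': 2, summary: { operation: 'append', 'added-files-size': '4096' } },
    { 'snapshot-id': 3, summary: { operation: 'append' } }, // field absent: counts as 0
  ],
};
console.log(bytesWrittenFromSnapshots(exampleMetadata));
```

Snapshots without the field contribute nothing in this sketch, so the result is a lower bound if some writers omit it.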

Validation

  • git diff --check
  • node --check on changed JS files
  • npm run smoke -- cache_spool_batching
  • npm run smoke -- cache_query_debounce
  • npm run smoke -- cache_roundtrip
  • npm run smoke -- ai_gateway_passthrough
  • npm run smoke -- gascity_attach_writes_partition
  • npm run smoke -- otel_listener_writes_rows
  • npm run smoke -- sink_export_driver
  • npm run smoke -- local_parquet_export
  • npm run smoke -- central_forward_outbox
  • npm run smoke -- blob_sink_parquet_local_fs
  • npm run smoke -- gateway_codex_capture
  • npm run smoke -- gateway_claude_capture

@philcunliffe philcunliffe marked this pull request as ready for review May 21, 2026 16:37
@philcunliffe philcunliffe merged commit c72ef1d into master May 21, 2026
@platypii platypii deleted the codex/cache-jsonl-spool branch May 28, 2026 06:37