fix: address review findings round 3 (hy-ug1uj) by philcunliffe · Pull Request #53 · hyparam/hypaware

philcunliffe · 2026-05-27T01:58:48Z

Summary

Addresses all 5 review findings from PR Feature: cache-lifecycle #41 dual-review round 3
Source fix bead: hy-nsc6k
Review round: 3

Merge target

polecat/hy-ug1uj → integration/cache-lifecycle

🤖 Auto-merged by refinery

@typedef

…y-ug1uj) 1. Spool cleanup: remove malformed-file continue that caused permanent spool accumulation — files are now cleaned up regardless of malformed rows. 2. metadataVersion: propagate real version through commitRowStream instead of hardcoding empty string, fixing ExportMarker contract violation. 3. INTERNAL_FIELDS: wire up filtering in query-facing storage.readRows and storage.dataSourceForTable so _hyp_cache_row_id and _hyp_cache_batch_id are hidden from query output. 4. Replace @typedef with .d.ts interfaces in streaming-reader.js and format-iceberg/maintenance.js per CLAUDE.md. 5. Replace inline import() types with @import declarations in maintenance.js, dataset.js, and test files per CLAUDE.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ntionConfig Inline import('./maintenance.js').ExportRetentionConfig caused a typecheck failure because maintenance.js only consumes that type — it does not re-export it. Import from the canonical ./types.d.ts instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@typedef

* chore: seed integration/cache-lifecycle for feature flow * hy-arjts: Streaming spool reader (#42) * feat: streaming spool reader with batching, resume, and internal fields (hy-arjts) Replace the readFile-based spool flush with a streaming JSONL reader that handles large backlogs without loading them into memory. Key changes: - Stream complete lines only; partial trailing lines stay in the buffer - Batch rows by 128 MiB estimated payload or 100,000 rows (whichever first) - Decorate each row with _hyp_cache_row_id (SHA-256 dedup key) and _hyp_cache_batch_id (flush batch identifier) - Persist flush progress per rotated spool file so restarts resume from the last successfully flushed offset - Malformed JSON lines are counted and skipped without aborting the batch Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: remove dead batchByteLen and batchStartOffset (hy-arjts) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * hy-qyq1p: Partition layout + storage APIs (#43) * feat: partition layout + storage APIs (hy-qyq1p) Introduce per-source/day cache partitions with cursor-based epoch tracking: - appendRowsToPartition(dataset, segments, columns, rows) writes into the current epoch's Iceberg table and updates cursor.json - discoverCachePartitions(scope) enumerates logical partitions with scope filtering (datasets, date range) - resolveClientName() fallback chain for partition key derivation - client_name column added to ai_gateway_messages schema - Existing appendRows compat path unchanged; new APIs added alongside Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: partition storage APIs and client_name column (hy-qyq1p) Unit tests for: - appendRowsToPartition: single/multi-segment, epoch creation, cursor updates, empty rows short-circuit - discoverCachePartitions: empty cache, single/multi-date discovery, scope filtering by datasets/date/range - resolveClientName: full fallback chain including empty string skipping - cursor read/write roundtrip Update ai-gateway schema assertion for new client_name column. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Feature cache-lifecycle: Flush + append integration (hy-omtz3) (#44) * hy-omtz3: Flush + append integration Wire the streaming spool reader into the partition-aware flush pipeline so each batch from streamFlushFile is routed through appendRowsToPartition, landing rows in the correct per-source/day partition (client=X/date=Y). Cache storage read methods (tableExists, readRows, dataSourceForTable) now resolve through the epoch directory layer so queries work with both the old flat layout and the new partitioned layout. Update @hypaware/format-iceberg to stream rows through target-sized batch commits via commitRowStream instead of draining all rows into memory before a single commitBatch call. Smoke test covers batching, partition routing, and row count integrity. 10 new unit tests for resolvePartitionDate and resolvePartitionSegments. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: remove dead _spool access that broke typecheck The smoke test referenced kernel.storage._spool which does not exist on ExtendedQueryStorageService. The block was a no-op (declared flushCallCount was never read, the if-body was an empty comment), so removing it fixes CI typecheck without behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * hy-3u4uy: Legacy compat + migration (#46) - discoverCachePartitions detects legacy Iceberg tables (no cursor.json) and marks them with legacy: true; skips .retired/ directories - ai-gateway dataset discovers all partitions (legacy + new per-client/day) and unions them into a single data source for seamless query results - hyp query maintain [--force] migrates legacy partitions into the new per-source/day layout and retires old paths; idempotent - gateway_claude_capture and gateway_codex_capture smokes verify rows land in client=*/date=* partitions with client_name populated - Unit tests for legacy discovery, .retired skipping, and migration Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * hy-xg0v8: Maintenance engine + CLI + config (#48) * feat: cache maintenance engine — snapshot expiration, compaction, CLI, daemon integration (hy-xg0v8) Add maintainCache API with two maintenance operations: - Snapshot expiration: keeps current + min 10 recent + all within 24h - Compaction: streams rows into new epoch with row-id dedup when partition exceeds 32 data files, avg file size < 32 MiB, or metadata > 64 MiB New CLI commands: - hyp query maintain [dataset] [--dry-run] [--force] [--compact-only] [--expire-only] - Enhanced hyp query status with partition-level stats (files, snapshots, metadata bytes) Config surface at query.cache.maintenance with all tunable thresholds. Daemon runs bounded maintenance on a configurable interval (default 60 min, 30s budget per tick). Retired epochs get a 24h grace period before cleanup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: resolve 7 typecheck errors in cache maintenance and CLI Narrow JSDoc union type after error guard in runQueryMaintain, fix bigint/number array type in expireSnapshots, and use ColumnSpec['type'] instead of missing ColumnType export. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: narrow parsed union before property access in runQueryMaintain Move destructuring with JSDoc type assertion above the compactOnly/expireOnly guard so TS no longer sees the error branch of the union on lines 812-815. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * hy-jbxou: Export maintenance — snapshot expiration + compaction status (#50) * feat: export maintenance — snapshot expiration + compaction status (hy-jbxou) - Add format-iceberg/src/maintenance.js with snapshot expiration for blob-store-backed Iceberg export tables (same retention defaults as cache maintenance: keep current + 10 recent, expire after 24h). - Run best-effort snapshot expiration after each successful export commit in table-format.js. - Add `hyp sink maintain [instance] [--dry-run]` CLI command that discovers exported datasets and runs snapshot expiration. Reports `compaction_unsupported` because icebird V1 has no rewrite-data-files or delete-data-files primitive. - Store blobStore on ExtendedSinkHandle for table-format sinks so the CLI can access it. - Update iceberg_export_local_fs smoke: second batch export, maintenance assertions, verify compaction status and post-maintenance readability. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: BigInt serialization in commitRowStream + update smoke span count (hy-jbxou) commitRowStream used JSON.stringify to estimate batch byte size, which throws on rows containing BigInt values. Use a BigInt-safe replacer. Update the iceberg_export_local_fs smoke to expect 2 export_batch spans now that the test exercises two batches. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: resolve unbound `config` reference in exportDataset snapshot expiration `config` was defined in `buildSink()` scope but referenced inside the standalone `exportDataset()` function. Use `ctx.sinkInstanceConfig?.maintenance` which is the same value and is available via the ctx parameter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * hy-c7pml: Fix dual-review findings on PR #41 (cache-lifecycle) (#51) 1. Add smoke_name/smoke_step/DEV_RUN_ID observability attributes to cache_lifecycle_maintenance smoke, matching the cache_roundtrip pattern. 2. Convert compactPartition from unbounded in-memory collect to batched streaming writes (10k rows per batch), preventing OOM on large partitions. 3. Surface malformedCount from streaming-reader through spool flush pipeline; throw on corruption to restore the data-integrity invariant that the original readFlushChunks enforced. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cache): address all 7 review findings from PR #41 dual-review (hy-nsc6k) (#52) 1. sinkInstanceConfig.maintenance: extract maintenance config in buildSink and pass through to exportDataset via the untyped config object 2. retirePartition: move inside rows.length > 0 guard so empty legacy partitions are not irrecoverably retired 3. shutdown: track maintenanceInFlight promise and await it during shutdown to prevent orphan data files from abandoned mid-write operations 4. compactPartition: use inferColumnType from migrate.js instead of hardcoding all columns as STRING 5. spool malformed-line: replace throw with continue so remaining spool files are still processed (malformed files retained for inspection) 6. ExportMarker.dataFiles: accumulate dataFiles across batches in commitRowStream and propagate to the marker write 7. retention DEFAULTS: export SNAPSHOT_RETENTION_DEFAULTS from cache maintenance and import in format-iceberg to eliminate dual source of truth Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address review findings round 3 (hy-ug1uj) (#53) * fix: address all 5 review findings from PR #41 dual-review round 3 (hy-ug1uj) 1. Spool cleanup: remove malformed-file continue that caused permanent spool accumulation — files are now cleaned up regardless of malformed rows. 2. metadataVersion: propagate real version through commitRowStream instead of hardcoding empty string, fixing ExportMarker contract violation. 3. INTERNAL_FIELDS: wire up filtering in query-facing storage.readRows and storage.dataSourceForTable so _hyp_cache_row_id and _hyp_cache_batch_id are hidden from query output. 4. Replace @typedef with .d.ts interfaces in streaming-reader.js and format-iceberg/maintenance.js per CLAUDE.md. 5. Replace inline import() types with @import declarations in maintenance.js, dataset.js, and test files per CLAUDE.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(types): replace inline import() types with @import for ExportRetentionConfig Inline import('./maintenance.js').ExportRetentionConfig caused a typecheck failure because maintenance.js only consumes that type — it does not re-export it. Import from the canonical ./types.d.ts instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address round 4 dual-review findings (codex-driven) Fixes all 6 findings from the round 4 Claude + Codex dual review: - C1: compactPartition now updates cursor and retires old epoch on zero-row dedup - C2: appendRowsToPartition propagates return value from partition.js - X1: Internal field filtering now strips both columns and cells to maintain alignment - X2: createDataSource passes scope through to re-discovery to prevent data leaks - X3: Maintenance ticks guarded against overlap with inFlight flag - X4: Resume offset only advances after batch is durably flushed Co-Authored-By: Codex (gpt-5.3-codex) <noreply@openai.com> * fix: resolve typecheck and test failures from codex fix round - storage.appendRowsToPartition: keep Promise<void> return type to match kernel interface (span still records metrics internally) - storage.dataSourceForTable: internal field filtering now outputs AsyncCells (Record<string, AsyncCell>) not array, preserving type compatibility with squirreling's AsyncRow - cache-storage.test: remove return-value assertions (void), test filtering via column/resolved exclusion instead - ai-gateway-dataset.test: add required limit to QueryScope, use resolved-only assertion path Typecheck clean, 490/490 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: dry-run must not perform real migration or epoch cleanup - Skip migrateLegacyPartitions when --dry-run is set in hyp query maintain - Skip cleanRetiredEpochs when dryRun is true in maintainCache Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: feature-launch <feature-launch@gas.city> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Codex (gpt-5.3-codex) <noreply@openai.com>

philcunliffe and others added 2 commits May 26, 2026 18:56

philcunliffe merged commit 430d953 into integration/cache-lifecycle May 27, 2026
6 checks passed

philcunliffe deleted the polecat/hy-ug1uj branch May 27, 2026 02:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: address review findings round 3 (hy-ug1uj)#53

fix: address review findings round 3 (hy-ug1uj)#53
philcunliffe merged 2 commits into
integration/cache-lifecyclefrom
polecat/hy-ug1uj

philcunliffe commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

philcunliffe commented May 27, 2026

Summary

Merge target

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant