Conversation
…o-writeback branch The no-writeback branch in pushChanged called markSynced when nothing needed pushing, on the premise that "nothing to upload ⇒ local matches remote." That premise is wrong: the slow walk only checks each LOCAL file against the remote index, so remote-only entries are never visited. `pushed === 0` is therefore consistent with "local is missing N files that remote has" — exactly the second-repo `datastore setup` scenario from production. Marking the sidecar clean here let the next `pullChanged` fast-path past unfetched files and silently lose data. Drop the markSynced call from that branch entirely. The only legitimate sites are `pullChanged` end-of-walk (proven local matches remote by walk + downloads) and `pushChanged`'s writeback branch (proven by writing back the index that exactly describes local). Cost: at most one extra slow path on the next call when the cache is genuinely synced but `tryFastPushChanged` couldn't prove it. Not a regression on any case that was previously correct. Tests: replaced the swamp-club #168 "no-writeback records ETag/ generation" tests with #1225 regression tests that pin the new contract — seed remote with files local doesn't have, run pushChanged, assert sidecar does not exist, then assert a follow-up pullChanged actually downloads the missing files. Updated two s3 tests that used `pushChanged` against an empty cache to "establish a clean baseline" to use `pullChanged` priming instead (the legitimate baseline-setter). The TOCTOU invariant (no post-walk HEAD/getMetadata) is preserved. Bumped both manifests to 2026.04.28.1. Completes the markDirty hardening line started by f918a6e / swamp #1225. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Summary
This is a clean, correct fix for a real data-loss bug. The root cause analysis in the PR description is accurate: the pushChanged slow-walk only iterates local files against the remote index, so pushed === 0 is consistent with "local is missing N files the remote has." Calling markSynced in that branch let the next pullChanged fast-path past files that were never fetched.
Code changes (both S3 and GCS):
- The
indexGeneration/indexETagcapture frompullIndexinpushChangedis correctly dropped since the return value is no longer used — the discardedawaitis intentional and clean. - The else-branch
markSyncedcall is removed. The remaining two legitimate call sites remain unchanged:pullChangedend-of-walk (walk confirms all remote index entries are local) andpushChangedwriteback (we just wrote the index that exactly describes local). - The
indexMutatedpath is unaffected and still callsmarkSyncedcorrectly via the writeback branch.
Tests:
- Both
*_cache_sync_test.tsfiles replace the previous TOCTOU-race regression test with a #1225 regression test that sets up the failing scenario (remote has 2 files, local has 1), callspushChanged, and asserts the sidecar does not exist. - The symmetric positive assertion (a subsequent
pullChangedactually downloads the missing file rather than fast-pathing) is the right way to prove the end-to-end invariant holds. - Two S3 tests that used
pushChangedas a clean-baseline setter are correctly updated to usepullChangedinstead — the legitimate path to a verified-clean sidecar. - Tests use in-memory mock clients (no live cloud services),
finallyblocks clean up temp dirs, and no env vars are mutated.
CLAUDE.md compliance: no model/ files touched, no any types, named exports only, no npm dependency changes, manifests bumped to 2026.04.28.1.
Suggestions
- The 9–10-line comment block added above
return pushedis longer than the one-short-line max called out in CLAUDE.md. The WHY here genuinely is non-obvious (a production incident was needed to surface it), so some comment is warranted — but it could be trimmed to 2–3 lines pointing at the issue tracker reference. Not a blocker given the existing codebase already has similar multi-line comment blocks throughoutpullChanged.
There was a problem hiding this comment.
Adversarial Review
I traced every code path in both the GCS and S3 pushChanged methods, the markSynced / tryFastPullChanged / tryFastPushChanged fast-path machinery, and the test changes. I attempted to break this across logic/correctness, error handling, concurrency, data integrity, and API contract dimensions.
Critical / High
None found.
Medium
None found.
Low
- GCS test uses manual
Deno.statfor sidecar check while S3 test usesreadSidecarhelper (gcs_cache_sync_test.ts:1597-1604): The GCS test file doesn't have areadSidecarhelper, so the test does a manualDeno.stat+NotFoundcheck. This is functionally correct but is a minor asymmetry with the S3 test. Not a bug — just a note.
Analysis
The core fix is logically sound. The key asymmetry that caused the bug:
pullChangedwalks remote index entries → whenpulled === 0, every remote entry has been verified to exist locally.markSyncedis safe.pushChangedwalks local files → whenpushed === 0, every local file exists in the remote index. But remote-only entries are never visited, so the converse (every remote entry exists locally) is NOT proven.markSyncedwas recording a false "in-sync" state.
The scrubIndex / indexMutated path is unaffected — when indexMutated is true, the writeback branch fires, writes the index, and markSynced is called with the PutObject ETag/generation. That path is safe because the written-back index describes exactly what's local.
The two S3 tests that switched from pushChanged to pullChanged as a baseline-setter (s3_cache_sync_test.ts:1270, s3_cache_sync_test.ts:1548) are correct — pullChanged end-of-walk is now the only non-writeback path that legitimately sets the sidecar.
The regression tests are well-constructed: they set up the exact production scenario (remote has 2 files, local has 1), assert the sidecar is NOT written, and include a symmetric positive assertion that pullChanged afterward actually downloads the missing file. This is end-to-end proof that the fix works.
The unused indexGeneration / indexETag capture from pullIndex() is cleanly removed from pushChanged in both providers. The pullChanged method still captures and uses it correctly.
Verdict
PASS — Clean, well-reasoned fix for a real silent data-loss bug. The removal of markSynced from the no-writeback branch is correct, the performance trade-off (one extra slow path) is acknowledged and acceptable, and the tests thoroughly validate both the negative (sidecar not written) and positive (subsequent pull fetches missing files) cases.
…markSynced in pushChanged no-writeback (#112) PR #111 dropped the markSynced call from pushChanged's no-writeback branch unconditionally to fix the swamp-club#1225 data-loss scenario (reader's `datastore setup` against a non-empty bucket would mark the sidecar clean even though local was missing every remote-index entry, causing the next pullChanged to fast-path past unfetched files). That fix was too aggressive: it also broke the legitimate recovery path (swamp-uat tests/cli/datastore/sync_test.ts:418), where local cache fully matches remote, the sidecar was wiped — e.g. by an upgrade migration — and the next no-op push should regenerate it so subsequent fast-path syncs work. The two cases are indistinguishable from the per-LOCAL-file walk alone: `pushed === 0` only proves no local-only diffs to upload. To tell them apart we need a remote-side check. Restore the markSynced call in the no-writeback branch but gate it on `localHasAllRemoteEntries()` — a stat over every entry in the in-memory remote index, returning true iff each is present locally with matching size. Cheap (O(N) stats, only on the slow path that already paid for pullIndex(forceRemote)) and decisive: data-loss scenario stays unmarked, recovery scenario regenerates the sidecar. End-to-end verified against real GCS: - Writer pushes 19 files / 4 records, fresh reader's pull returns 4 records (data-loss fix preserved). - Sidecar deleted, no-op push regenerates it with localDirty=false (recovery scenario fixed). Closes the regression introduced by #111. Bumps both extensions to 2026.04.28.2.
Summary
Fixes a silent data-loss bug in
@swamp/s3-datastoreand@swamp/gcs-datastore(introduced in2026.04.25.2) where a second repo runningswamp datastore setupagainst a non-empty remote bucket ends up with only the migrated bundle file in its local cache, but a.datastore-sync-state.jsonsidecar markedlocalDirty: falseagainst the remote's current generation. The nextdatastore sync --pullhits the fast path and returns "Pulled 0 files" without downloading the actual data.Failure mode
In each extension's
pushChangedslow path, the no-writeback else branch calledmarkSynced(indexGeneration)whenpushed === 0. The premise — "nothing to upload ⇒ local matches remote" — is wrong: the slow walk only checks each LOCAL file against the remote index. Remote-index entries with no local counterpart are never visited, sopushed === 0is consistent with "local is missing N files that remote has." Marking the sidecar clean in this case let the nextpullChangedfast-path past unfetched files.The only legitimate
markSyncedcall sites are:pullChangedend-of-walkpushChangedwritebackpushChangedno-writeback (THIS)This PR drops the call from that no-writeback branch.
Cost of removal: at most one extra slow path on the next call (1 HEAD + 1 walk) when the cache happens to be genuinely synced but
tryFastPushChangedcouldn't prove it. Not a regression on any case that was previously correct.TOCTOU invariant preserved: no post-walk HEAD/getMetadata in this branch — the branch simply does no sidecar work at all now (swamp-club #168).
Bisect window
Failing UAT run: https://github.com/systeminit/swamp-uat/actions/runs/24994591142
tests/cli/datastore/sync_test.ts:30— sync pull retrieves data written by another repo via single-step workflow [gcs]tests/cli/datastore/sync_test.ts:107— sync pull with multi-step workflow retrieves all model outputs [gcs]tests/cli/datastore/sync_test.ts:308— sync fast path invalidates when a second writer pushes [s3]Tests
Replaced the swamp-club #168 "no-writeback records ETag/generation" tests in both
*_cache_sync_test.tsfiles with #1225 regression tests:pushChanged(), assertpushed === 0AND that the sidecar file does not exist.pullChanged()must actually download the missing remote file rather than fast-pathing past it.Also updated two s3 tests that relied on
pushChangedagainst an empty cache as a "clean baseline" — those now usepullChangedpriming (the legitimate baseline-setter via end-of-walk markSynced). The gcs equivalents already usedpullChangedpriming, so no change there.Manifests bumped to
2026.04.28.1.Completes the
markDirtyhardening line started by f918a6e / swamp #1225.Test plan
deno checkclean for both extensionsdeno lintclean for both extensionsdeno fmt --checkclean for both extensionsswamp-uat-gcs-datastore-testing: writer pushes 4 records; reader on a fresh repo runsdatastore setup+datastore sync --pulland sees 4 (was 0 pre-fix)🤖 Generated with Claude Code