fix: auto-recover stale sync state after crash #110
Merged
When a sync process dies mid-run (laptop sleep, OOM, SIGKILL, terminal close), sync_state.json keeps status: "syncing" forever. The next run would then throw "already syncing. Use --force to override" even though the filesystem lock had already detected the dead pid and taken over. The two concurrency gates disagreed.

Now the filesystem lock is the single source of truth: if acquireSyncLock returns, no other live process holds the lock, and a lingering "syncing" state is auto-recovered with a log line instead of a throw.

Also installs SIGINT/SIGTERM handlers that flip state to "failed" and release the lock before exiting, so cleanly-signaled terminations don't leave zombie state behind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
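For readers skimming the description, here is a minimal sketch of the recovery decision it describes. It is not the code from this PR: the SyncState shape, the "idle" status value, and the beginSync name are assumptions made for illustration. Only the "syncing" and "failed" status values and the recovery log text come from the PR text. The function is meant to be called only after the filesystem lock has been acquired.

```typescript
// Hypothetical state shape; only "syncing" and "failed" appear in the PR text.
type SyncStatus = "idle" | "syncing" | "failed";

interface SyncState {
  status: SyncStatus;
}

// Assumed to run only AFTER the filesystem lock was acquired, so no other
// live process can be mid-sync. A leftover "syncing" status therefore means
// the previous run crashed: log a recovery line and proceed instead of throwing.
function beginSync(prev: SyncState, log: (msg: string) => void): SyncState {
  if (prev.status === "syncing") {
    log("Recovered from crashed previous sync");
  }
  return { ...prev, status: "syncing" };
}
```

The point of the design is that the state file no longer acts as a second concurrency gate. It only records what happened, and the lock alone decides who may run.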
Summary
- When a sync process dies mid-run, sync_state.json kept status: "syncing" and the next run threw "already syncing. Use --force to override" even though the filesystem lock had already detected the dead pid and taken over. Fix: trust the lock as the single source of truth on concurrency. If acquireSyncLock returns, a lingering syncing state is evidence of a crashed previous run, so log a recovery line and proceed.
- Adds SIGINT/SIGTERM handlers for syncModel that flip state to failed and release the filesystem lock before exiting, so cleanly-signaled terminations don't leave zombie state behind (closes the window where a process killed between "set syncing" and the catch block leaves state wedged).
- 1.37.2.

Scope: no heartbeat field (explicitly marked optional in the spec). --force semantics unchanged: it still skips cursor-resume and lastSync-derived date filtering.

Test plan
- kill -9 mid-run, re-run without --force → expect the "Recovered from crashed previous sync" log line, then the run proceeds normally (previously threw)
- one sync run <platform> concurrently → second still throws SyncLockError (lock gate fires first, no regression)
- Ctrl-C mid-sync → state flips to failed, lock directory removed, next run starts clean without recovery message

🤖 Generated with Claude Code
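The signal-handling behavior in the Ctrl-C test case could look roughly like the sketch below. It is an illustration under assumptions, not the handlers from this PR: SignalSource, CleanupHooks, and installSignalHandlers are hypothetical names, and the exit codes follow the common 128 + signal number shell convention rather than anything stated here. The hooks are injected so the sketch can be exercised without touching real files or a real process.

```typescript
// Hypothetical minimal view of something that emits signals (e.g. Node's process).
interface SignalSource {
  on(signal: "SIGINT" | "SIGTERM", handler: () => void): unknown;
}

// Hypothetical hooks standing in for the real state-file and lock operations.
interface CleanupHooks {
  markFailed(reason: string): void; // e.g. write status: "failed" to sync_state.json
  releaseLock(): void;              // e.g. remove the lock directory
  exit(code: number): void;         // e.g. process.exit
}

function installSignalHandlers(source: SignalSource, hooks: CleanupHooks): void {
  let handled = false;
  const signals = ["SIGINT", "SIGTERM"] as const;
  for (const signal of signals) {
    source.on(signal, () => {
      if (handled) return; // ignore repeated signals once cleanup has started
      handled = true;
      try {
        hooks.markFailed(`interrupted by ${signal}`);
      } finally {
        // Release the lock even if writing the state failed.
        hooks.releaseLock();
      }
      // Assumed convention: 128 + signal number (SIGINT = 2, SIGTERM = 15).
      hooks.exit(signal === "SIGINT" ? 130 : 143);
    });
  }
}
```

In a Node program the source would presumably be process. A SIGKILL cannot be caught this way, which is why the lock-based recovery path is still needed for hard crashes.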