fix: auto-recover stale sync state after crash #110

Merged
moekatib merged 1 commit into main from fix/sync-crash-recovery on Apr 19, 2026

@moekatib

Summary

  • When a sync crashed mid-run (laptop sleep, OOM, SIGKILL), sync_state.json kept status: "syncing" and the next run threw "already syncing. Use --force to override" even though the filesystem lock had already detected the dead pid and taken over. Fix: trust the lock as the single source of truth on concurrency — if acquireSyncLock returns, a lingering syncing state is evidence of a crashed previous run, so log a recovery line and proceed.
  • Install SIGINT/SIGTERM handlers in syncModel that flip state to failed and release the filesystem lock before exiting, so cleanly-signaled terminations don't leave zombie state behind (closes the window where a process killed between "set syncing" and the catch block leaves state wedged).
  • Bump to 1.37.2.
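
The recovery rule in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `SyncState` shape, the status values other than "syncing" and "failed", and the `beginSync` helper are all assumptions, and the function is assumed to be called only after acquireSyncLock has returned.

```typescript
// Hypothetical shape: the real sync_state.json schema is not shown in this PR.
type SyncStatus = "idle" | "syncing" | "failed" | "completed";

interface SyncState {
  status: SyncStatus;
}

interface GateResult {
  state: SyncState; // state to persist for the new run
  logs: string[];   // lines the caller would print
}

// Hypothetical helper. Precondition: the filesystem lock is already held,
// so no live process can be syncing. A persisted "syncing" status therefore
// means the previous run crashed; log the recovery and proceed, never throw.
function beginSync(persisted: SyncState): GateResult {
  const logs: string[] = [];
  if (persisted.status === "syncing") {
    logs.push("Recovered from crashed previous sync");
  }
  return { state: { ...persisted, status: "syncing" }, logs };
}
```

The point of the design is that the state file no longer acts as a second concurrency gate: it only records what happened, while the lock alone decides whether a run may start.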

Scope: no heartbeat field (explicitly marked optional in the spec). --force semantics unchanged — it still skips cursor-resume and lastSync-derived date filtering.

Test plan

  • Start a sync, kill -9 mid-run, re-run without --force → expect "Recovered from crashed previous sync" log line and normal proceed (previously threw)
  • Run `one sync run <platform>` twice concurrently → second still throws SyncLockError (lock gate fires first, no regression)
  • Clean sync, re-run → no recovery log, normal sync
  • Ctrl-C mid-sync → state flips to failed, lock directory removed, next run starts clean without recovery message

🤖 Generated with Claude Code

When a sync process dies mid-run (laptop sleep, OOM, SIGKILL, terminal
close), sync_state.json keeps status: "syncing" forever. Before this
change, the next run threw "already syncing. Use --force to override"
even though the filesystem lock had already detected the dead pid and
taken over.

The two concurrency gates disagreed. Now the filesystem lock is the
single source of truth — if acquireSyncLock returns, no live process
holds the lock, and a lingering "syncing" state is auto-recovered with
a log line instead of a throw.
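
How the lock decides that "no live process holds the lock" is not shown here. One common approach, sketched below purely as an assumption, is to probe the recorded pid with signal 0, which checks for existence without delivering a signal. The `isProcessAlive` helper and the injected `KillFn` are hypothetical names.

```typescript
// Signature-compatible with Node's process.kill(pid, 0); injected so the
// sketch has no imports and can be exercised without real processes.
type KillFn = (pid: number, signal: 0) => void;

function isProcessAlive(pid: number, kill: KillFn): boolean {
  try {
    kill(pid, 0); // signal 0: existence check only, nothing is delivered
    return true;
  } catch (err) {
    const code = (err as { code?: string }).code;
    // EPERM: the process exists but belongs to another user, so it is alive.
    // ESRCH (or anything else): treat the lock holder as dead.
    return code === "EPERM";
  }
}
```

In Node this could be called as `isProcessAlive(pid, (p, s) => process.kill(p, s))`. Pid reuse after a reboot can produce false positives, which is one reason some lock implementations also record a start time.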

Also installs SIGINT/SIGTERM handlers that flip state to "failed" and
release the lock before exiting, so cleanly-signaled terminations
don't leave zombie state behind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@moekatib moekatib merged commit e19ac10 into main Apr 19, 2026
@moekatib moekatib deleted the fix/sync-crash-recovery branch April 19, 2026 05:51