Summary
Killing a long-running sync run (SIGTERM) while PGlite is mid-transaction leaves the embedded Postgres in an unrecoverable state: postmaster.pid contains a negative PID, and on next invocation mem doctor reports:
backend opens and applies schema ✗ Aborted()
No automatic recovery — user has to delete the pglite directory or hand-edit postmaster.pid to get back to a working state. Any unflushed data is lost regardless.
Repro
one sync run fathom --full-refresh (a profile with enrich so the run is slow)
kill -TERM <pid> mid-run (Ctrl-C is trapped by runner.ts signalCleanup → graceful; a real SIGTERM from a process manager is not).
Inspect ~/.one/mem/postmaster.pid — negative PID.
one --agent mem doctor → Aborted() on ensureSchema step.
Root cause (hypothesis)
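PGlite's WASM postmaster doesn't fully flush WAL/shm before the signal handler terminates the worker. The negative PID suggests the file was written with an incomplete value during teardown, though this needs checking: stock Postgres writes a negated PID to postmaster.pid for standalone (single-user) backends, so the negative value on its own may only indicate a lock file left behind by an unclean exit. signalCleanup does not currently call any graceful-shutdown hook from @electric-sql/pglite.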
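If the missing close is indeed the cause, the fix could look roughly like the sketch below: await the database's close() before exiting, with a timeout so a hung close never blocks the process manager. This is a minimal sketch, not the actual runner.ts code. `Closable`, `closeThenExit`, the 5000 ms default and the 0/1 exit codes are all hypothetical; the only thing assumed of PGlite is that its instance has an async close().

```typescript
// Hypothetical minimal shape of the database handle. The sketch relies
// only on an async close(), which a PGlite instance provides.
interface Closable {
  close(): Promise<void>;
}

// Close the database, then exit. If close() hangs or throws, exit anyway
// (with code 1) once timeoutMs has elapsed, so shutdown is never blocked.
async function closeThenExit(
  db: Closable,
  exit: (code: number) => void,
  timeoutMs = 5000,
): Promise<void> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"timeout">((resolve) => {
    timer = setTimeout(() => resolve("timeout"), timeoutMs);
  });

  let code = 0;
  try {
    const outcome = await Promise.race([
      db.close().then(() => "closed" as const),
      timeout,
    ]);
    if (outcome === "timeout") code = 1; // close() did not finish in time
  } catch {
    code = 1; // close() rejected
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
  exit(code);
}
```

In a Node entry point this might be wired up as `process.once("SIGTERM", () => void closeThenExit(db, process.exit))`, where `db` is the open PGlite instance. Passing `exit` in as a parameter keeps the function testable without terminating the test process.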
Expected
At minimum: detect corrupt-state (negative PID, failed schema apply) on startup and offer a safe recovery path — either reset the pglite dir (with confirmation) or point at a separate backup. Ideally: call a pglite graceful close before process.exit in signalCleanup.
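The startup check described above could be sketched as a pure function that takes the contents of postmaster.pid and the outcome of the schema step, and reports why the store looks corrupt. `diagnoseStartup`, `StartupDiagnosis` and the reason strings are hypothetical names for illustration. Treating a negative PID as a corruption signal is this issue's heuristic and is an assumption worth verifying against a healthy PGlite directory. The sketch relies only on Postgres writing the PID as the first line of postmaster.pid.

```typescript
// Hypothetical result type: whether recovery should be offered, and why.
interface StartupDiagnosis {
  corrupt: boolean;
  reasons: string[];
}

// pidFileContent: raw text of postmaster.pid, or null if the file is absent.
// schemaApplied: whether the ensureSchema step succeeded on this startup.
function diagnoseStartup(
  pidFileContent: string | null,
  schemaApplied: boolean,
): StartupDiagnosis {
  const reasons: string[] = [];

  if (pidFileContent !== null) {
    // The PID is the first line of postmaster.pid.
    const firstLine = (pidFileContent.split("\n", 1)[0] ?? "").trim();
    if (!/^-?\d+$/.test(firstLine)) {
      reasons.push("postmaster.pid does not start with a numeric PID");
    } else if (Number(firstLine) < 0) {
      // Assumption taken from this issue: a negative PID marks a bad state.
      reasons.push("postmaster.pid holds a negative PID");
    }
  }

  if (!schemaApplied) {
    reasons.push("schema apply failed");
  }

  return { corrupt: reasons.length > 0, reasons };
}
```

A caller such as mem doctor could print `reasons` and then ask for confirmation before resetting the pglite dir, rather than stopping at Aborted().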
Notes
Not blocking. Most users don't SIGTERM a sync.
Separate from PR feat: unified memory (v1.42.0) #125 (which introduced SIGINT/SIGTERM handling for sync state/lock cleanup — the PGlite layer needs its own graceful close).
Workaround: rm -rf ~/.one/mem/postmaster.pid (lose the db) or start fresh with rm -rf ~/.one/mem (lose all memories).