Add pre-flight safety checks to rollback script #172

Merged
raymondjacobson merged 6 commits into main from rollback-preflight-checks on Mar 31, 2026

raymondjacobson (Contributor) commented:

Prevents running rollback while the node is still up, which can corrupt postgres by starting a second instance on the same data directory.

Also renames the rollback binary destination from /bin/rollback to
/bin/rollback-bin in the Dockerfile to avoid Docker COPY creating a
directory instead of placing the file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
raymondjacobson force-pushed the rollback-preflight-checks branch from 3318e49 to 3dc5f52 on March 31, 2026 01:33
raymondjacobson and others added 5 commits March 30, 2026 18:35
The rollback is typically run as a separate container (docker run --rm)
sharing only the /data volume. Process-based checks like kill -0 and
fuser don't work across PID namespaces.

Replace with filesystem-level checks on the shared volume:
- postmaster.pid existence (hard stop, with escape hatch for stale files)
- CometBFT PebbleDB LOCK file presence
- Network checks (curl, pg_isready) kept as best-effort for --net=host

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
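
For illustration, the volume-level checks described in the commit above could look roughly like the sketch below. This is not the script from this PR. The directory layout (`postgres/postmaster.pid`, `cometbft/LOCK` under the data volume), the function name `preflight_check`, and the `ALLOW_STALE_PID` escape hatch are all hypothetical names chosen for the example. The commit also does not say whether the LOCK file is a hard stop or a warning; the sketch assumes a hard stop.

```shell
# Sketch only: paths and the ALLOW_STALE_PID variable are assumptions,
# not taken from the actual rollback script.
# Returns 0 when no sign of a running node is found on the shared volume,
# 1 otherwise.
preflight_check() {
    data_dir="${1:-/data}"
    pg_pid_file="$data_dir/postgres/postmaster.pid"   # hypothetical location
    comet_lock_file="$data_dir/cometbft/LOCK"         # hypothetical location

    # postgres keeps postmaster.pid in its data directory while it is running.
    # The file can also be left behind after a crash, hence the escape hatch.
    if [ -e "$pg_pid_file" ]; then
        if [ "${ALLOW_STALE_PID:-0}" = "1" ]; then
            echo "warning: $pg_pid_file exists; continuing (ALLOW_STALE_PID=1)" >&2
        else
            echo "error: $pg_pid_file exists; postgres may still be running" >&2
            return 1
        fi
    fi

    # Assumed here to be a hard stop as well.
    if [ -e "$comet_lock_file" ]; then
        echo "error: $comet_lock_file exists; the node may still be running" >&2
        return 1
    fi

    return 0
}

# Example use inside the rollback container:
#   preflight_check /data || exit 1
```

Because both checks only read files on the shared volume, they behave the same whether the rollback runs in the node's container or in a separate `docker run --rm` container.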
PREBUILT_ROLLBACK_BINARY was never set in the CI workflow, so the
Dockerfile COPY with an empty arg created /bin/rollback as a directory
instead of copying the binary — causing "Is a directory" at runtime.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
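
One way to catch this class of failure earlier is to validate the build argument before calling `docker build`. The sketch below is an assumption about how such a guard might be written, not part of this PR; the function name `require_prebuilt_binary` is hypothetical.

```shell
# Sketch only: a guard a CI step could run before the Docker build.
# Returns 0 only when the argument is non-empty and names a regular file.
require_prebuilt_binary() {
    path="${1:-}"
    if [ -z "$path" ]; then
        echo "error: PREBUILT_ROLLBACK_BINARY is empty" >&2
        return 1
    fi
    if [ ! -f "$path" ]; then
        echo "error: $path is not a regular file" >&2
        return 1
    fi
    return 0
}

# Example use in a CI step:
#   require_prebuilt_binary "$PREBUILT_ROLLBACK_BINARY" || exit 1
```

Failing in CI with an explicit message is easier to diagnose than an "Is a directory" error when the container starts.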
Now that CI properly builds and passes PREBUILT_ROLLBACK_BINARY, the
rename workaround is unnecessary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove network-based checks (curl, pg_isready) that don't work across
Docker containers. Keep only volume-level checks that reliably detect
a running node.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
raymondjacobson merged commit 99af214 into main on Mar 31, 2026
6 checks passed
raymondjacobson deleted the rollback-preflight-checks branch on March 31, 2026 06:48