Overview
When a PostgresCluster is bootstrapped from a dataSource.pgbackrest restore, postgres promotes from timeline 1 to timeline 2 and immediately tries to archive 00000002.history via archive_command. At that moment the pgBackRest stanza (archive.info) does not yet exist — the operator creates it later in its reconcile loop. pgBackRest's async archiver silently drops the push with error 103 and removes the spool entry. postgres considers the file archived and never retries. The history file is permanently absent from the archive.
Without 00000002.history in the archive, pg_rewind cannot reconstruct the full timeline chain during any future PITR restore and fails with:
could not find common ancestor of the source and target cluster's timelines
Replica pods remain stuck at 2/4 containers Ready indefinitely.
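For context, the archiving path involved can be inspected directly (this uses the ${PRIMARY} variable resolved in the repro steps below; the exact command string is PGO-managed and may differ by version):
# What postgres runs when it tries to archive 00000002.history:
kubectl exec -n <namespace> "${PRIMARY}" -c database -- \
  psql -Atc 'show archive_command'
# Typically something like: pgbackrest --stanza=db archive-push "%p"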
Environment
- Platform: GKE
- Platform Version: 1.35.3-gke.1522000
- PGO Image Tag: ubi9-5.8.7-0
- Postgres Version: 18
- Storage: pd-ssd (GKE standard SSD persistent disk)
Steps to Reproduce
REPRO
- Create a source PostgresCluster with a pgBackRest S3/object-store repo and wait for stanzaCreated: true
- Write some data and trigger a full backup; wait for the backup job to complete
- Create a second PostgresCluster with dataSource.pgbackrest pointing to the source cluster's repo (this triggers a bootstrap restore; a manifest sketch follows these steps)
- Wait for the restored cluster to reach Ready and stanzaCreated: true
- Check whether 00000002.history is present in the pgBackRest archive:
PRIMARY=$(kubectl get pod -n <namespace> \
-l postgres-operator.crunchydata.com/cluster=<restored-cluster>,postgres-operator.crunchydata.com/role=master \
-o jsonpath='{.items[0].metadata.name}')
kubectl exec -n <namespace> "${PRIMARY}" -c database -- \
pgbackrest --stanza=db archive-get "00000002.history" "/tmp/00000002.history.check"
echo "exit code: $?"
EXPECTED
archive-get exits 0 and downloads 00000002.history — the timeline history file is present in the archive.
ACTUAL
archive-get exits non-zero — 00000002.history is missing from the archive. This is a race condition; it does not reproduce on every run. When it does not reproduce, stanza-create happened to run before the async archiver background worker attempted the push.
The root cause sequence:
postgres restore completes
└─ postgres promotes TL1 → TL2
└─ archive_command: pgbackrest archive-push 00000002.history
└─ async mode: spool entry queued, exit 0 returned to postgres ✓
└─ background archiver runs: archive-push → ERROR 103 (archive.info missing)
└─ spool entry dropped — postgres never retries ✗
(later) reconcileStanzaCreate → stanza-create → archive.info created
└─ 00000002.history is permanently missing from the archive
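The "postgres never retries" half of this sequence can be confirmed from the archive_status directory, where a .done marker means archive_command returned 0 (the data directory path assumes PGO's /pgdata/pg<major> layout for Postgres 18):
kubectl exec -n <namespace> "${PRIMARY}" -c database -- \
  ls /pgdata/pg18/pg_wal/archive_status/
# 00000002.history.done   <- archived as far as postgres is concerned,
#                            even though the background push failed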
Once the history file is missing, any subsequent PITR restore of this cluster will leave all replicas permanently stuck:
could not find common ancestor of the source and target cluster's timelines
Replica pods never reach Ready (2/4 containers).
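The stuck state can be observed like this (label selector as in the repro commands above; the replica pod name is a placeholder):
# Affected replicas show 2/4 in the READY column:
kubectl get pods -n <namespace> \
  -l postgres-operator.crunchydata.com/cluster=<restored-cluster>
# The database container loops on the pg_rewind error:
kubectl logs -n <namespace> <stuck-replica-pod> -c database | \
  grep -i 'common ancestor'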
Logs
pgBackRest async archiver log on the primary pod (captured from /tmp/pgbackrest/archive-push-async.log):
-------------------PROCESS START-------------------
P00 INFO: archive-push async start
P00 INFO: push 1 WAL file(s) to archive: 00000002.history
P01 DETAIL: pushed WAL file '00000002.history' to the archive
(or in the failing case — error 103 below)
P01 ERROR: [103]: unable to open archive file '00000002.history' for write:
raised from remote-0 protocol on '...': archive.info does not exist
pgBackRest archive status confirming the file is missing:
kubectl exec -n <namespace> "${PRIMARY}" -c database -- \
pgbackrest --stanza=db info
# The archive section shows WAL starts at 000000020000000000000001,
# but 00000002.history is absent.
Additional Information
The bug is non-deterministic — it is a race between the async archiver background worker and reconcileStanzaCreate. When the cluster is under load or the object store is slow, the window widens and the bug reproduces more reliably.
Workaround: Set archive-async = n in the pgBackRest global configuration. This forces synchronous archiving so postgres retries on failure instead of silently dropping the spool entry. This has a performance cost for normal WAL archiving.
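One way to apply the workaround through the operator, since spec.backups.pgbackrest.global passes options through to pgBackRest (the cluster name is a placeholder):
kubectl patch postgrescluster <restored-cluster> -n <namespace> \
  --type merge \
  -p '{"spec":{"backups":{"pgbackrest":{"global":{"archive-async":"n"}}}}}'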
Proposed fix: After a successful stanza-create, immediately re-push any *.history files found in $PGDATA/pg_wal using --no-archive-async. This is idempotent — if the file is already in the archive the push exits 0; if it was dropped by the race it is recovered. The call site is reconcileStanzaCreate in internal/controller/postgrescluster/pgbackrest.go, immediately after StanzaCreateOrUpgrade returns success and before stanzaCreated: true is written to status.
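Until such a fix lands, the same recovery can be performed by hand; this is a sketch of the equivalent re-push (idempotent for the reason above; the data directory path assumes PGO's /pgdata/pg18 layout):
kubectl exec -n <namespace> "${PRIMARY}" -c database -- bash -c '
  shopt -s nullglob   # skip the loop entirely if no history files exist
  for f in /pgdata/pg18/pg_wal/*.history; do
    pgbackrest --stanza=db --no-archive-async archive-push "$f"
  done'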