standby node stuck in catchingup state when replication slot lost on primary node #1031
rhicks0614 asked this question in Q&A (Unanswered)
Hi,
I'm trying out pg_auto_failover version 2.1.2 with two nodes (postgresql-14).
I'm additionally setting max_slot_wal_keep_size, due to disk space limitations.
After stopping the standby node, the primary node went to "wait_primary" as expected.
Eventually the replication slot wal_status goes to "lost", due to max_slot_wal_keep_size being set:
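The slot state can be checked on the primary through the `pg_replication_slots` view, where `wal_status` is one of `reserved`, `extended`, `unreserved`, or `lost` (PostgreSQL 13 and later). The following is only a minimal sketch of that check, not output from this cluster: the helper name `lost_slots` and the sample slot names are hypothetical, and the rows are hard-coded instead of being fetched from a live server.

```python
# Query to run on the primary (PostgreSQL 13+), e.g. through psql or a driver.
SLOT_STATUS_SQL = "SELECT slot_name, wal_status FROM pg_replication_slots"


def lost_slots(rows):
    """Return the names of slots whose required WAL has already been removed.

    `rows` is an iterable of (slot_name, wal_status) tuples, i.e. the shape
    of the result of SLOT_STATUS_SQL.
    """
    return [name for name, status in rows if status == "lost"]


# Hypothetical sample rows standing in for a real query result.
sample_rows = [("example_slot_a", "lost"), ("example_slot_b", "reserved")]
print(lost_slots(sample_rows))
```

A slot reported as `lost` can no longer serve the WAL its standby needs, which is the condition described here.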
Now after restarting the standby, I observe it is stuck in the "catchingup" state per "pg_autoctl show state":
and in the postgres logs, the standby is continually trying to start streaming again, even though the WAL segment has been removed (lost):
Is there some way to make the standby give up on streaming and fall back to pg_basebackup to recover?
Thanks!