No callbacks on failed start & bringing down a VIP on a failed master #1418
Comments
I’m interested in learning more about these callbacks. Could you point me to the relevant documentation? We use vip-manager (https://github.com/cybertec-postgresql/vip-manager) to manage VIPs. It runs on each node and checks Consul (etcd is also supported) regularly. This may work better for your needs, as it does not depend on Patroni functioning correctly.
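The idea of deciding VIP ownership from the DCS rather than from Patroni itself can be sketched as a small decision function. This is only an illustration of the approach, not vip-manager's actual code; the function name `vip_action` and its parameters are hypothetical, and fetching the leader value from Consul or etcd is left to the caller.

```python
def vip_action(leader, node_name, vip_present):
    """Decide what to do with the VIP on this node.

    leader      -- current value of the leader key in the DCS, or None if
                   the key is missing or could not be read (assumption)
    node_name   -- name of the local node
    vip_present -- whether the VIP is currently configured locally
    """
    is_leader = leader is not None and leader == node_name
    if is_leader and not vip_present:
        return "add"
    if not is_leader and vip_present:
        # Also covers a crashed local Patroni: once another node holds the
        # leader key (or the key is gone), the VIP is dropped here.
        return "remove"
    return "none"
```

Because the decision depends only on the leader key, a node whose Patroni process has died would still release the VIP on the next check, as long as the checking process itself keeps running.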
Thank you all for the feedback. It felt odd to build layers upon layers of HA code, so we left out the Cybertec solution for now and used a custom script: https://github.com/daamien/ansible-role-patroni-vip. We will look into the Cybertec solution again.
Hi,
First and foremost: thank you for the software!
We have built a 3-node cluster with the following software on each node:
We use a VIP to direct connections to the master node. For this we use Patroni's callbacks.
We delete the VIP on:
We create the VIP on:
on_stop
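For readers unfamiliar with this setup, a callback-driven VIP script might look roughly like the sketch below. Patroni invokes a callback with three arguments: the action, the role, and the cluster name. Everything else here is an assumption for illustration: the address `192.0.2.10/24`, the interface `eth0`, the function names, and the exact mapping of events to add/remove, which is not the mapping used in the setup described above.

```python
#!/usr/bin/env python3
"""Hypothetical Patroni callback that adds or removes a VIP."""
import subprocess
import sys

# Assumed values for illustration only; not taken from the original setup.
VIP = "192.0.2.10/24"
IFACE = "eth0"


def vip_command(action, role):
    """Return the `ip` command for a callback, or None if nothing to do."""
    if action == "on_stop":
        verb = "del"
    elif action in ("on_start", "on_role_change"):
        # The leader role is reported as "master" or "primary" depending
        # on the Patroni version, so accept both.
        verb = "add" if role in ("master", "primary") else "del"
    else:
        return None
    return ["ip", "addr", verb, VIP, "dev", IFACE]


def run_callback(action, role, cluster):
    """Run the VIP change for one callback invocation."""
    cmd = vip_command(action, role)
    if cmd is not None:
        # check=False: removing an address that is already absent is not fatal.
        subprocess.run(cmd, check=False)


if __name__ == "__main__" and len(sys.argv) == 4:
    # Patroni passes: <action> <role> <cluster name>
    run_callback(*sys.argv[1:])
```

A script like this only runs when Patroni fires a callback, which is exactly the limitation discussed next: if no callback fires, the VIP stays where it is.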
One of our crash scenarios involves removing `$PGDATA/global/pg_control` on the master server. Patroni crashes with the following errors since the data from `pg_controldata` is not available:

The failure is handled correctly by Patroni: the master fails over and the crashed primary is marked as failed. But in that case the `on_start` callback is never fired, since PostgreSQL is never started as a standby. In our case this means the VIP is not brought down on the crashed primary.

Is there another way to achieve our goal (bringing down the VIP) with this kind of crash?
It feels like the failure could be handled more gracefully.
What do you think about adding an `on_failed` callback?

Thank you in advance for your feedback.
Benoit