Failover and recovery #3

Closed
paunin opened this issue Mar 31, 2017 · 18 comments

@paunin commented Mar 31, 2017

Hi Guys!

I'm looking for information about failover and recovery of nodes in a Postgres cluster in your solution; sorry, I'm not sure if it exists... could you please advise?

@jmccormick2001 (Contributor)

It's something we are definitely going to add. The containers we use under the hood for this support a watch/failover capability, and we will leverage some of that in what the operator will allow.
Current thinking is that something like pgo create watch mycluster (or similar) would let a user set up a failover watch on a cluster; ideally we would like to support different failover strategies, similar to the way we support different cluster strategies. But for sure, the operator will let you trigger a recovery. So stay tuned, it will happen in an upcoming release of the operator.

@thekalinga commented Apr 3, 2017

@jmccormick2001

Assuming that the master went down:

Isn't failover supposed to be automatic, either based on which replica has the most of the master's data, promoting it to master, or by automatically mounting the (previous) master's PVC on one of the replicas and promoting it to the new master?

My assumption was that the operator is supposed to watch over the instances and take care of master selection (on bootstrap and on master failure) and of coordination among the participating pods.

Please correct me in case I am missing anything.
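
For context, "promoting a replica to master" here maps onto standard PostgreSQL tooling, independent of whatever the operator ends up doing. A minimal sketch, assuming a streaming replica, a PostgreSQL 9.x-era setup, and example paths:

```bash
# On the candidate replica: confirm it is still a standby (in recovery).
psql -U postgres -c "SELECT pg_is_in_recovery();"   # 't' means standby

# Promote it: the server leaves recovery and starts accepting writes.
pg_ctl promote -D /pgdata

# The same query now returns 'f'; clients can be repointed at this node.
psql -U postgres -c "SELECT pg_is_in_recovery();"
```

The open question in this thread is who triggers that promotion (the operator automatically, or a user) and how the target replica is chosen.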

@paunin (Author) commented Apr 4, 2017

Well, @jmccormick2001, I was asking because I'm looking for an alternative solution and want to understand its benefits compared to mine. Please let me know whenever you have something ready... I'm all into this topic! Thanks :)

@jmccormick2001 (Contributor)

This is a good topic for sure and one I'm thinking about... some design ideas in my head include:

  • allow cluster 'watching', but don't mandate it
  • allow for a user-initiated failover, where the user can trigger the failover to whatever replica they choose
  • allow for a failover to a 'certain' replica using a configurable/programmable selection algorithm (based on metadata of a replica, replication status, or other criteria)
  • allow for a pre-hook and post-hook script to be executed when a failover is triggered
  • allow for continuous cluster watching after a failover has occurred (updating labels as required to enable proper client routing; see the sketch after this list)
  • allow for killing off stale replicas after a failover
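
To make the label-routing idea concrete: one common Kubernetes pattern (a sketch under assumed names such as mycluster and a role=master label, not necessarily what the operator will use) is to have the cluster's Service select the primary by label, so a failover only has to move that label to the promoted replica:

```bash
# Hypothetical label scheme: the primary Service selects pods with role=master.
kubectl get svc mycluster -o jsonpath='{.spec.selector}'
# e.g. {"app":"mycluster","role":"master"}

# After promoting mycluster-replica-1, reroute clients by relabeling the pods:
kubectl label pod mycluster-master    role=stale  --overwrite
kubectl label pod mycluster-replica-1 role=master --overwrite

# The Service endpoints now point at the new primary; clients need no changes.
kubectl get endpoints mycluster
```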

@paunin (Author) commented Apr 4, 2017

  • allow cluster 'watching', but don't mandate it

Usually you don't want a cluster without failover logic.

  • allow for a user-initiated failover, where the user can trigger the failover to whatever replica they choose

Switchover is the correct term here :P since we are not failing... and yeah, it's something I have not developed.

  • allow for a failover to a 'certain' replica using a configurable/programmable selection algorithm (based on metadata of a replica, replication status, or other criteria)

That might be solved by replica priority, plus quorum election of course to avoid split-brain... so priority should not mean much in case of network issues, but it will when the master is in bad health.

  • allow for a pre-hook and post-hook script to be executed when a failover is triggered

Any practical use for this one?

  • allow for continuous cluster watching after a failover has occurred (updating labels as required to enable proper client routing)

For that one I use pgpool ;) It does its job, excluding nodes from the list of backends based on different conditions and health checks against the Postgres servers (a minimal config sketch follows after these replies).

  • allow for killing off stale replicas after a failover

That I don't understand at all :(
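
Referring to the pgpool point above: pgpool-II's backend health checking is driven by a handful of pgpool.conf settings. A minimal sketch with illustrative host names, ports, and a hypothetical failover script path:

```bash
# Illustrative pgpool.conf excerpt: pgpool probes each backend and detaches it
# from the pool when health checks fail; failover_command runs on detach.
cat >> /etc/pgpool-II/pgpool.conf <<'EOF'
backend_hostname0 = 'pg-master'       # current primary
backend_port0     = 5432
backend_weight0   = 1

backend_hostname1 = 'pg-replica-1'    # read replica
backend_port1     = 5432
backend_weight1   = 1

health_check_period  = 10             # probe every 10 seconds
health_check_timeout = 5
health_check_user    = 'pgpool'
failover_command     = '/scripts/failover.sh %d %H'  # %d = failed node id, %H = new master host
EOF
```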

Sorry for going through the list of your ideas, but it looks like I was in that state half a year ago, and I can help you with some of them if you need help, of course :)
I like the idea of implementing a special API for DB objects in k8s, but I also think that you might want to segregate responsibilities:

  • reliable dockerized postgres cluster
  • management of the cluster using k8s stuff and facilities

PS
I would invest some time in the first one (build or adapt my images) while you could focus on the wrapper you are developing right now. Let me know if you are interested and want to build something a bit cooler than what we've done separately. Cheers!

@hartmut-pq

Hi everyone!
@jmccormick2001 the postgres-operator is looking very exciting, nice work and overall concept.
I was looking into failover/HA capabilities too, but couldn't really find out what the current status is.
Currently, if the cluster master dies, is the cluster simply down?

Keep up the good work!

@hartmut-pq

I was also looking at an alternative using stolon / etcd:
https://medium.com/@SergeyNuzhdin/how-to-deploy-ha-postgresql-cluster-on-kubernetes-3bf9ed60c64f

@jmccormick2001 (Contributor)

Thanks. Currently the master runs in a Deployment, which 'should' get rescheduled by Kube if the Kube node dies, and Kube 'should' also restart the pod if it dies, to keep the Deployment consistent. However, the gotcha here is: what if the master's data is somehow corrupted? In that case Kube will just restart the bad database over and over. What I'm considering is a more formal way to specify "I want to fail over to this replica"; I have some ideas on how this will work, but I want to give it some extra thought before I do the implementation. I also want a means of specifying a 'sync' replica that a user could specifically target as the failover target. This is definitely a high priority for an upcoming release, so stay tuned. There is also the case where a user might want an 'automated failover'; I'm thinking about that use case as well.
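
For background on the 'sync' replica idea: in plain PostgreSQL a synchronous standby is the one named in synchronous_standby_names, which guarantees it holds every committed transaction and therefore makes it a safe failover target. A minimal sketch with example names (the operator may surface this differently):

```bash
# On the primary: designate the standby whose application_name is 'replica1'
# as synchronous, then reload the configuration.
cat >> "$PGDATA/postgresql.conf" <<'EOF'
synchronous_standby_names = 'replica1'
synchronous_commit = on
EOF
pg_ctl reload -D "$PGDATA"

# The standby advertises that name via its primary_conninfo, e.g. (9.x recovery.conf):
#   primary_conninfo = 'host=pg-master application_name=replica1 ...'

# Verify which standby is currently synchronous:
psql -U postgres -c "SELECT application_name, sync_state FROM pg_stat_replication;"
```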

@hartmut-pq

Thanks for the immediate reply!
Simple use case: HA on AWS...
e.g. you're running a multi-AZ k8s cluster across eu-west-1a, eu-west-1b, and eu-west-1c,
and you want your pgcluster running on k8s to be HA in case one AZ goes down. If the node in the AZ hosting the pgcluster master goes down, the EBS volume, being AZ-restricted, isn't available to mount on any other node/pod (that AZ is down), so one of the replicas running in pods on nodes in another AZ should immediately and automatically take over.
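
To make the EBS constraint concrete (a sketch with example names; the zone label shown is the one Kubernetes applied to dynamically provisioned EBS volumes at the time):

```bash
# An EBS-backed PV carries a zone label and can only attach to nodes in that AZ.
kubectl get pv --show-labels
# ... failure-domain.beta.kubernetes.io/zone=eu-west-1a ...

# If every node in eu-west-1a is unavailable, the master's PVC cannot be
# re-mounted elsewhere; only a replica with its own volume in eu-west-1b/1c
# can take over. Check where the replicas are actually running:
kubectl get pods -o wide -l pg-cluster=mycluster   # hypothetical label
```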

@hartmut-pq commented Jul 25, 2017

That's not AWS-specific though... just in general: if the DB is the critical point of failure, which it is for most applications in some way, and you are aiming towards 100% uptime...

So in case of a sudden node death or downtime, the 1-2 minutes k8s needs to free the PVC, mount it on another node, and restart the pod somewhere else is quite a long downtime when there are replicas available that could take over...

@jmccormick2001 (Contributor)

Understood. There is definitely the case where users will want to orchestrate a failover onto a specific replica regardless of what Kube might do.

@hartmut-pq

Many thanks for your input! I don't want to push or anything, but do you have a rough timescale / idea of when some sort of automatic failover may get realised? May I help in any way?
I need to choose an option going forward for how to build/manage a pg cluster, and this operator seems like a pretty good candidate... :-)

@jmccormick2001 (Contributor)

No worries. The current roadmap/schedule is to release the 'policy' mechanism in the next week or so; this is a new feature that lets you apply SQL policies against clusters. Right after that is the failover work, so I'm shooting to have some form of a failover feature in about the 4-5 week time frame, hopefully earlier. Once I have some early work done on this, I'll reach out and see if you could do a sanity check on it.

@hartmut-pq

👍 Will try to keep track; happy to help with a sanity check.
One more question: is there any way to have automated backups? It's always full backups, right? What about incremental backups?

@jmccormick2001 (Contributor)

Right now there is only a full backup (using pg_basebackup). The other forms of backup and a scheduling capability are in the works too, but they will likely lag behind the failover and policy releases.
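
For reference, a full base backup with pg_basebackup looks roughly like this (example host, role, and paths; the operator's exact invocation may differ):

```bash
# Take a full physical backup of the primary over the replication protocol.
# -Fp: plain file layout, -Xs: stream the WAL needed for a consistent copy,
# -P: report progress. Requires a role with the REPLICATION privilege.
pg_basebackup -h pg-master -p 5432 -U replicator \
  -D /backups/mycluster/$(date +%Y-%m-%d) -Fp -Xs -P
```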

@hartmut-pq

That all sounds highly promising 👏 !

@jmccormick2001 (Contributor)

Manual failover is coded and will land in the upcoming 2.6 release.

@jmccormick2001 (Contributor)

Auto failover (first cut) will land in operator 3.1, soon to be released.
