Failover and recovery #3

Closed
paunin opened this issue Mar 31, 2017 · 18 comments

@paunin commented Mar 31, 2017

Hi Guys!

I'm looking for information about failover and recovery of nodes in a Postgres cluster in your solution; sorry, I'm not sure if it exists... could you please advise?

@jmccormick2001 (Contributor)

It's something we are definitely going to add. The containers we use under the hood for this support a watch/failover capability, and we will leverage some of that in what the operator will allow.
Current thinking is that something like pgo create watch mycluster (or similar) would let a user set up a failover watch on a cluster; ideally we would like to support different failover strategies, similar to the way we support different cluster strategies. But for sure, the operator will let you trigger a recovery. So stay tuned, it will happen in an upcoming release of the operator.

@thekalinga commented Apr 3, 2017

@jmccormick2001

Assuming that the master went down:

Isn't failover supposed to be automatic, either based on which replica has the most of the master's data, promoting it to master, or by automatically mounting the (previous) master's PVC on one of the replicas and promoting it to the new master?

My assumption was that the operator is supposed to watch over the instances and take care of master selection (on bootstrap and on master failure) and of coordination among the participating pods.

Please correct me in case I am missing anything.
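
For context, "promoting a replica to master" here maps onto standard PostgreSQL tooling, independent of whatever the operator ends up doing. A minimal sketch, assuming a streaming replica, a PostgreSQL 9.x-era setup, and example paths:

```bash
# On the candidate replica: confirm it is still a standby (in recovery).
psql -U postgres -c "SELECT pg_is_in_recovery();"   # 't' means standby

# Promote it: the server leaves recovery and starts accepting writes.
pg_ctl promote -D /pgdata

# The same query now returns 'f'; clients can be repointed at this node.
psql -U postgres -c "SELECT pg_is_in_recovery();"
```

The open question in this thread is who triggers that promotion (the operator automatically, or a user) and how the target replica is chosen.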

@paunin (Author) commented Apr 4, 2017

Well, @jmccormick2001, I was asking because I'm looking for an alternative solution and want to understand its benefits compared to mine. Please let me know whenever you have something ready... I'm all into this topic! Thanks :)

@jmccormick2001 (Contributor)

This is a good topic for sure and one I'm thinking about... some design ideas in my head include:

  • allow cluster 'watching', but don't mandate it
  • allow for a user-initiated failover, where the user can trigger the failover to whatever replica they choose
  • allow for a failover to a 'certain' replica using a configurable/programmable selection algorithm (based on metadata of a replica, replication status, or other criteria)
  • allow for a pre-hook and post-hook script to be executed when a failover is triggered
  • allow for continuous cluster watching after a failover has occurred (updating labels as required to enable proper client routing; see the sketch after this list)
  • allow for killing off stale replicas after a failover
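
To make the label-routing idea concrete: one common Kubernetes pattern (a sketch under assumed names such as mycluster and a role=master label, not necessarily what the operator will use) is to have the cluster's Service select the primary by label, so a failover only has to move that label to the promoted replica:

```bash
# Hypothetical label scheme: the primary Service selects pods with role=master.
kubectl get svc mycluster -o jsonpath='{.spec.selector}'
# e.g. {"app":"mycluster","role":"master"}

# After promoting mycluster-replica-1, reroute clients by relabeling the pods:
kubectl label pod mycluster-master    role=stale  --overwrite
kubectl label pod mycluster-replica-1 role=master --overwrite

# The Service endpoints now point at the new primary; clients need no changes.
kubectl get endpoints mycluster
```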

@paunin (Author) commented Apr 4, 2017

  • allow cluster 'watching', but don't mandate it

Usually you don't want a cluster without failover logic.

  • allow for a user-initiated failover, where the user can trigger the failover to whatever replica they choose

Switchover is the correct term here :P since we are not failing... and yeah, it's something I have not developed.

  • allow for a failover to a 'certain' replica using a configurable/programmable selection algorithm (based on metadata of a replica, replication status, or other criteria)

That might be solved by replica priority, plus quorum election of course to avoid split-brain... so priority should not mean much in case of network issues, but it will when the master is in bad health.

  • allow for a pre-hook and post-hook script to be executed when a failover is triggered

Any practical use for this one?

  • allow for continuous cluster watching after a failover has occurred (updating labels as required to enable proper client routing)

For that one I use pgpool ;) It does its job, excluding nodes from the list of backends based on different conditions and health checks against the Postgres servers (a minimal config sketch follows after these replies).

  • allow for killing off stale replicas after a failover

That I don't understand at all :(
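
Referring to the pgpool point above: pgpool-II's backend health checking is driven by a handful of pgpool.conf settings. A minimal sketch with illustrative host names, ports, and a hypothetical failover script path:

```bash
# Illustrative pgpool.conf excerpt: pgpool probes each backend and detaches it
# from the pool when health checks fail; failover_command runs on detach.
cat >> /etc/pgpool-II/pgpool.conf <<'EOF'
backend_hostname0 = 'pg-master'       # current primary
backend_port0     = 5432
backend_weight0   = 1

backend_hostname1 = 'pg-replica-1'    # read replica
backend_port1     = 5432
backend_weight1   = 1

health_check_period  = 10             # probe every 10 seconds
health_check_timeout = 5
health_check_user    = 'pgpool'
failover_command     = '/scripts/failover.sh %d %H'  # %d = failed node id, %H = new master host
EOF
```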

Sorry for going through the list of your ideas, but it looks like I was in that state half a year ago, and I can help you with some of them if you need help, of course :)
I like the idea of implementing a special API for DB objects in k8s, but I also think that you might want to segregate responsibilities:

  • reliable dockerized postgres cluster
  • management of the cluster using k8s stuff and facilities

PS
I would invest some time in the first one (build or adapt my images) while you could focus on the wrapper you are developing right now. Let me know if you are interested and want to build something a bit cooler than what we've done separately. Cheers!

@hartmut-pq

Hi everyone!
@jmccormick2001 the postgres-operator is looking very exciting, nice work and overall concept.
I was looking into failover/HA capabilities too, but couldn't really find out what the current status is.
Currently, if the cluster master dies, is the cluster simply down?

Keep up the good work!

@hartmut-pq

I was also looking at an alternative using stolon / etcd:
https://medium.com/@SergeyNuzhdin/how-to-deploy-ha-postgresql-cluster-on-kubernetes-3bf9ed60c64f

@jmccormick2001 (Contributor)

Thanks. Currently the master runs in a Deployment, which 'should' get rescheduled by Kube if the Kube node dies, and Kube 'should' also restart the pod if it dies, to keep the Deployment consistent. However, the gotcha here is: what if the master's data is somehow corrupted? In that case Kube will just restart the bad database over and over. What I'm considering is a more formal way to specify "I want to fail over to this replica"; I have some ideas on how this will work, but I want to give it some extra thought before I do the implementation. I also want a means of specifying a 'sync' replica that a user could specifically target as the failover target. This is definitely a high priority for an upcoming release, so stay tuned. There is also the case where a user might want an 'automated failover'; I'm thinking about that use case as well.
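
For background on the 'sync' replica idea: in plain PostgreSQL a synchronous standby is the one named in synchronous_standby_names, which guarantees it holds every committed transaction and therefore makes it a safe failover target. A minimal sketch with example names (the operator may surface this differently):

```bash
# On the primary: designate the standby whose application_name is 'replica1'
# as synchronous, then reload the configuration.
cat >> "$PGDATA/postgresql.conf" <<'EOF'
synchronous_standby_names = 'replica1'
synchronous_commit = on
EOF
pg_ctl reload -D "$PGDATA"

# The standby advertises that name via its primary_conninfo, e.g. (9.x recovery.conf):
#   primary_conninfo = 'host=pg-master application_name=replica1 ...'

# Verify which standby is currently synchronous:
psql -U postgres -c "SELECT application_name, sync_state FROM pg_stat_replication;"
```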

@hartmut-pq

Thanks for the immediate reply!
Simple use case: HA on AWS...
e.g. you're running a multi-AZ k8s cluster across eu-west-1a, eu-west-1b, and eu-west-1c,
and you want your pgcluster running on k8s to be HA in case one AZ goes down. If the node in the AZ hosting the pgcluster master goes down, the EBS volume, being AZ-restricted, isn't available to mount on any other node/pod (that AZ is down), so one of the replicas running in pods on nodes in another AZ should immediately and automatically take over.
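
To make the EBS constraint concrete (a sketch with example names; the zone label shown is the one Kubernetes applied to dynamically provisioned EBS volumes at the time):

```bash
# An EBS-backed PV carries a zone label and can only attach to nodes in that AZ.
kubectl get pv --show-labels
# ... failure-domain.beta.kubernetes.io/zone=eu-west-1a ...

# If every node in eu-west-1a is unavailable, the master's PVC cannot be
# re-mounted elsewhere; only a replica with its own volume in eu-west-1b/1c
# can take over. Check where the replicas are actually running:
kubectl get pods -o wide -l pg-cluster=mycluster   # hypothetical label
```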

@hartmut-pq commented Jul 25, 2017

That's not AWS-specific though... just in general: if the DB is the critical point of failure, which it is for most applications in some way, and you are aiming towards 100% uptime...

So in case of a sudden node death or downtime, the 1-2 minutes k8s needs to free the PVC, mount it on another node, and restart the pod somewhere else is quite a long downtime when there are replicas available that could take over...

@jmccormick2001 (Contributor)

Understood. There is definitely the case where users will want to orchestrate a failover onto a specific replica regardless of what Kube might do.

@hartmut-pq

Many thanks for your input! I don't want to push or anything, but do you have a rough timescale / idea of when some sort of automatic failover may get realised? May I help in any way?
I need to choose an option going forward for how to build/manage a pg cluster, and this operator seems like a pretty good candidate... :-)

@jmccormick2001 (Contributor)

No worries. The current roadmap/schedule is to release the 'policy' mechanism in the next week or so; this is a new feature that lets you apply SQL policies against clusters. Right after that is the failover work, so I'm shooting to have some form of a failover feature in about the 4-5 week time frame, hopefully earlier. Once I have some early work done on this, I'll reach out and see if you could do a sanity check on it.

@hartmut-pq

👍 Will try to keep track; happy to help with a sanity check.
One more question: is there any way to have automated backups? It's always full backups, right? What about incremental backups?

@jmccormick2001 (Contributor)

Right now there is only a full backup (using pg_basebackup). The other forms of backup and a scheduling capability are in the works too, but they will likely lag behind the failover and policy releases.
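
For reference, a full base backup with pg_basebackup looks roughly like this (example host, role, and paths; the operator's exact invocation may differ):

```bash
# Take a full physical backup of the primary over the replication protocol.
# -Fp: plain file layout, -Xs: stream the WAL needed for a consistent copy,
# -P: report progress. Requires a role with the REPLICATION privilege.
pg_basebackup -h pg-master -p 5432 -U replicator \
  -D /backups/mycluster/$(date +%Y-%m-%d) -Fp -Xs -P
```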

@hartmut-pq

That all sounds highly promising 👏 !

@jmccormick2001 (Contributor)

Manual failover is coded and will land in the upcoming 2.6 release.

@jmccormick2001 (Contributor)

Auto failover (first cut) will land in operator 3.1, soon to be released.
