orchestrator is able to recover from a set of failure scenarios. Notably, it can recover a failed master or a failed intermediate master.
orchestrator supports:
- Automated recovery (takes action on unexpected failures).
- Graceful, planned, master promotion.
- Manual recovery.
- Manual, forced/panic failovers.
To run any kind of failover, your topologies must support one of:
- Oracle GTID (with `MASTER_AUTO_POSITION=1`)
- MariaDB GTID
- Pseudo GTID
- Binlog Servers
See MySQL Configuration for more details.
Automated recovery is opt-in. Please consider recovery configuration.
Following failure detection, a recovery is composed of a sequence of events:
- Pre-recovery hooks (external processes execution)
- Healing of topology
- Post-recovery hooks
Where:
- Pre-recovery hooks are configured by the user.
  - Executed sequentially.
  - Failure of any of these hooks (nonzero exit code) aborts the failover.
- Topology healing is managed by `orchestrator` and is state-based rather than configuration-based. `orchestrator` tries to make the best out of a bad situation, taking into consideration the existing topology, versions, server configurations, etc.
- Post-recovery hooks are, again, configured by the user.
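Since a nonzero exit code from any pre-recovery hook aborts the failover, such a hook can double as a safety gate. A minimal sketch, assuming a hypothetical flag-file convention:

```bash
#!/bin/bash
# Hypothetical pre-recovery gate script, to be listed among the pre-recovery hooks.
# The flag-file path is made up for this example.
if [ -e /etc/orchestrator/failover-disabled ]; then
  echo "failover gate closed; aborting recovery" >&2
  exit 1  # any nonzero exit code aborts the failover
fi
exit 0
```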
The following highlights some of the complexity of a recovery.
A "simple" recovery case is that of a DeadIntermediateMaster. Its replicas are orphaned, but when
using GTID or Pseudo-GTID they can still be re-connected to the topology. We might choose to:
- Find a sibling of the dead intermediate master, and move orphaned replicas below said sibling
- Promote a replica from among the orphaned replicas, make it intermediate master of its siblings, then connect promoted replica up the topology
- Relocate all orphaned replicas
- Combine parts of the above
The exact implementation greatly depends on the topology setup (which instances have log-slave-updates? Are instances lagging? Do they have replication filters? Which versions of MySQL? etc.). It is very (very) likely your topology will support at least one of the above (in particular, matching-up the replicas is a trivial solution, unless replication filters are in place).
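For example, moving the orphaned replicas under a sibling can be done with orchestrator's generic relocation command; the host names below are hypothetical:

```bash
# Move all replicas of the dead intermediate master under one of its siblings.
# orchestrator picks the best available method (GTID, Pseudo-GTID, binlog servers, ...).
orchestrator-client -c relocate-replicas -i dead.intermediate.master:3306 -d sibling.host:3306
```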
Recovering from a dead master is a much more complex operation, for various reasons:
- There is an outage implied, and recovery is expected to be as fast as possible.
- Some servers may be lost in the process. `orchestrator` must determine which, if any.
- The state of the topology may be such that the user wishes to prevent the recovery.
- Master service discovery must take place: the app must be able to communicate with the new master (potentially be told that the master has changed).
- Find the best replica to promote.
  - A naive approach would be to pick the most up-to-date replica, but that may not always be the right choice.
  - It may so happen that the most up-to-date replica does not have the necessary configuration to act as master to the other replicas (e.g. binlog format, MySQL version, replication filters and more). By blindly promoting the most up-to-date replica one may lose replica capacity. `orchestrator` attempts to promote a replica that will retain the most serving capacity.
- Promote said replica, taking over its siblings.
- Bring siblings up to date.
- Possibly, do a 2nd-phase promotion; the user may have tagged specific servers to be promoted if possible (see the `register-candidate` command).
- Call upon hooks (read further).
Master service discovery is largely the user's responsibility to implement. Common solutions are:
- DNS based discovery; `orchestrator` will need to invoke a hook that modifies DNS entries.
- ZooKeeper/Consul KV/etcd/other key-value based discovery; `orchestrator` has built-in support for Consul KV, otherwise an external hook must update the KV stores.
- Proxy based discovery; `orchestrator` will call upon an external hook that modifies the proxy config, or will update Consul/ZooKeeper/etcd as above, which will itself trigger an update to the proxy configuration.
- Other solutions...
orchestrator attempts to be a generic solution and hence takes no stance on your service discovery method.
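To make this concrete, a KV-based discovery hook might look like the sketch below. It assumes the `ORC_*` environment variables orchestrator exports to hooks and a local Consul agent; the key layout is made up for this example:

```bash
#!/bin/bash
# Hypothetical post-master-failover hook: publish the new master to Consul KV.
# Assumes a Consul agent on 127.0.0.1:8500.
set -eu

cluster_alias="${ORC_FAILURE_CLUSTER_ALIAS}"
new_master="${ORC_SUCCESSOR_HOST}:${ORC_SUCCESSOR_PORT}"

# Apps (or consul-template) watch this key to learn the current master.
curl -fsS -X PUT --data "${new_master}" \
  "http://127.0.0.1:8500/v1/kv/mysql/master/${cluster_alias}"
```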
Opt-in: automated recovery may be applied to all (`"*"`) clusters or to specific ones.
Recovery follows detection, assuming nothing blocks the recovery (read further below).
For finer granularity, separate configuration applies to master recovery and to intermediate-master recovery. A detailed breakdown of recovery-related configuration follows.
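For illustration, an opt-in configuration might look like the sketch below (the cluster name patterns are hypothetical; consult the recovery configuration documentation for the authoritative settings):

```json
{
  "RecoverMasterClusterFilters": [
    "mycluster.example.com",
    "another-cluster-.*"
  ],
  "RecoverIntermediateMasterClusterFilters": [
    "*"
  ],
  "RecoveryPeriodBlockSeconds": 3600
}
```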
The analysis mechanism runs at all times, and checks periodically for failure/recovery scenarios. It will make an automated recovery for:
- An actionable type of scenario
- For an instance that is not downtimed
- For an instance belonging to a cluster for which recovery is explicitly enabled via configuration
- For an instance in a cluster that has not recently been recovered, unless such recent recoveries were acknowledged
- Where global recoveries are enabled
a.k.a. graceful master takeover
TL;DR use this for replacing a master in an orderly, planned operation.
Typically for purposes of upgrade, host maintenance, etc., you may wish to replace an existing master with another one. This is a graceful promotion (aka graceful master takeover). It's essentially a rotation of roles: an existing replica turns into a new master, the old master turns into a replica.
In a graceful takeover:
- Either the user or `orchestrator` picks an existing replica as the designated new master.
- `orchestrator` ensures the designated replica takes over its siblings as intermediate master.
- `orchestrator` turns the master `read-only` (possibly also `super-read-only`).
- `orchestrator` makes sure your designated server is caught up with replication.
- `orchestrator` promotes your designated server as the new master.
- `orchestrator` turns the promoted server writable.
- `orchestrator` demotes the old master and places it as a direct replica of the new master.
- If possible, `orchestrator` sets the replication user/password for the demoted master.
- In the `graceful-master-takeover-auto` variant (see following), `orchestrator` starts replication on the demoted master.
The operation can take a few seconds, during which time your app is expected to complain, seeing that the master is read-only.
orchestrator provides you with specialized hooks to run a graceful takeover:
- `PreGracefulTakeoverProcesses`
- `PostGracefulTakeoverProcesses`
These hooks run in addition to the standard hooks. orchestrator will run `PreGracefulTakeoverProcesses`, then go through a `DeadMaster` flow, running the normal pre- and post-hooks for `DeadMaster`, and at last follows up with `PostGracefulTakeoverProcesses`.
We find that some operations are similar between a graceful takeover and a real failover, and some are different. The `PreGracefulTakeoverProcesses` and `PostGracefulTakeoverProcesses` hooks can be used, for example, to silence alerts: you may want to disable the pager for the duration of a planned failover. Advanced usage may include stalling traffic at the proxy layer.
From within the normal pre- and post-failover processes, you may use the `{command}` placeholder or the `ORC_COMMAND` environment variable to check whether this is a graceful takeover: you will see the value `graceful-master-takeover`.
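For example, a hook might branch on that variable to skip paging during planned operations (a minimal sketch; the pager logic is hypothetical):

```bash
#!/bin/bash
# Inside a normal pre-/post-failover hook: detect a planned takeover.
if [ "${ORC_COMMAND:-}" = "graceful-master-takeover" ]; then
  echo "planned takeover; skipping pager"  # hypothetical: don't page on-call
  exit 0
fi
# ...real-failure handling continues here...
```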
There are two variations of graceful takeover:
`graceful-master-takeover`:
- The user must indicate the designated replica to be promoted,
- or set up the topology such that the master has only a single direct replica, which implicitly makes it the designated replica.
- The demoted master is placed as a replica to the new master, but `orchestrator` does not start replication on the server.
`graceful-master-takeover-auto`:
- The user may indicate the designated replica to be promoted, which `orchestrator` must respect,
- or the user may omit the identity of the designated replica, in which case `orchestrator` picks the best replica to promote.
- The demoted master is placed as a replica to the new master, and `orchestrator` starts replication on the server.
Invoke graceful takeover via:
- Command line; examples:
  - `orchestrator-client -c graceful-master-takeover -alias mycluster -d designated.master.to.promote:3306`: indicate designated replica; `orchestrator` does not start replication on the demoted master.
  - `orchestrator-client -c graceful-master-takeover-auto -alias mycluster -d designated.master.to.promote:3306`: indicate designated replica; `orchestrator` starts replication on the demoted master.
  - `orchestrator-client -c graceful-master-takeover-auto -alias mycluster`: let `orchestrator` choose the replica to promote; `orchestrator` starts replication on the demoted master.
- Web API; examples:
  - `/api/graceful-master-takeover/:clusterHint/:designatedHost/:designatedPort`: gracefully promote a new master (planned failover), indicating the designated master to promote.
  - `/api/graceful-master-takeover/:clusterHint`: gracefully promote a new master (planned failover). Designated server not indicated; works when the master has exactly one direct replica.
  - `/api/graceful-master-takeover-auto/:clusterHint`: gracefully promote a new master (planned failover). `orchestrator` picks the replica to promote and starts replication on the demoted master.
- Web interface: drag a direct replica of the master onto the left half of the master's box. The web interface uses the `graceful-master-takeover` variation; replication on the demoted master will not be started.
TL;DR use this when an instance is recognized as failed but auto-recovery is disabled or blocked.
You may choose to ask orchestrator to recover a failure by providing a specific instance that is failed.
This instance must be recognized as having failed. It is possible to request recovery for an instance that is downtimed (as this is a manual recovery, it overrides automatic assumptions).
Recover via:
- Command line: `orchestrator-client -c recover -i dead.instance.com:3306 --debug`
- Web API: `/api/recover/dead.instance.com/3306`
- Web: the instance is colored black; click the `Recover` button.
Manual recoveries don't block on `RecoveryPeriodBlockSeconds` (read more in the next section). They also override `RecoverMasterClusterFilters` and `RecoverIntermediateMasterClusterFilters`. Thus, a human can always invoke a recovery on demand. A recovery may only block on another recovery running at that time on the same database instance.
TL;DR force master failover right now regardless of what orchestrator thinks.
Perhaps orchestrator doesn't see that the instance has failed, or you have some app logic that requires the master to change right now, or perhaps the failure is of a type orchestrator is unsure about. You wish to kick off a master failover right now. You will run:
- Command line: `orchestrator-client -c force-master-failover --alias mycluster`, or `orchestrator-client -c force-master-failover -i instance.in.that.cluster`
- Web API: `/api/force-master-failover/mycluster`, or `/api/force-master-failover/instance.in.that.cluster/3306`
Recoveries are audited via:
- `/web/audit-recovery`
- `/api/audit-recovery`
- `/api/audit-recovery-steps/:uid`
Nuanced auditing and control are available via:
- `/api/blocked-recoveries`: see blocked recoveries.
- `/api/ack-recovery/cluster/:clusterHint`: acknowledge a recovery on a given cluster.
- `/api/ack-all-recoveries`: acknowledge all recoveries.
- `/api/disable-global-recoveries`: global switch to disable `orchestrator` from running any recoveries.
- `/api/enable-global-recoveries`: re-enable recoveries.
- `/api/check-global-recoveries`: check whether global recoveries are enabled.
Running manual recoveries (see next sections):
- `/api/recover/:host/:port`: recover a specific host, assuming `orchestrator` agrees there is a failure.
- `/api/recover-lite/:host/:port`: same, but without invoking external hooks (can be useful for testing).
- `/api/graceful-master-takeover/:clusterHint/:designatedHost/:designatedPort`: gracefully promote a new master (planned failover), indicating the designated master to promote.
- `/api/graceful-master-takeover/:clusterHint`: gracefully promote a new master (planned failover). Designated server not indicated; works when the master has exactly one direct replica.
- `/api/force-master-failover/:clusterHint`: panic; force master failover on the given cluster.
Some corresponding command line invocations:
- `orchestrator-client -c recover -i some.instance:3306`
- `orchestrator-client -c graceful-master-takeover -i some.instance.in.somecluster:3306`
- `orchestrator-client -c graceful-master-takeover -alias somecluster`
- `orchestrator-client -c force-master-takeover -alias somecluster`
- `orchestrator-client -c ack-cluster-recoveries -alias somecluster`
- `orchestrator-client -c ack-all-recoveries`
- `orchestrator-client -c disable-global-recoveries`
- `orchestrator-client -c enable-global-recoveries`
- `orchestrator-client -c check-global-recoveries`
orchestrator avoids flapping (cascading failures causing continuous outage and exhaustion of resources) by introducing a block period: on any given cluster, orchestrator will not kick in automated recovery within an interval smaller than said period, unless cleared to do so by a human.
The block period is indicated by `RecoveryPeriodBlockSeconds`. It only applies to recoveries on the same cluster; nothing prevents concurrent recoveries from running on different clusters.
Pending recoveries are unblocked either once `RecoveryPeriodBlockSeconds` has elapsed or once such a recovery has been acknowledged.
Acknowledging a recovery is possible either via the web API/interface (see the audit/recovery page) or via the command line interface (`orchestrator-client -c ack-cluster-recoveries -alias somealias`).
Note that manual recovery (e.g. `orchestrator-client -c recover` or `orchestrator-client -c force-master-failover`) ignores the blocking period.
Some servers are better candidates for promotion in the event of a failover; some servers aren't good picks. Examples:
- A server has weaker hardware configuration. You prefer to not promote it.
- A server is in a remote data center and you don't want to promote it.
- A server is used as your backup source and has LVM snapshots open at all times. You don't want to promote it.
- A server has a good setup and is ideal as candidate. You prefer to promote it.
- A server is OK, and you don't have any particular opinion.
You will announce your preference for a given server to orchestrator in the following way:
`orchestrator-client -c register-candidate -i ${::fqdn} --promotion-rule ${promotion_rule}`
Supported promotion rules are:
- `prefer`
- `neutral`
- `prefer_not`
- `must_not`
Promotion rules expire after an hour; that's the dynamic nature of orchestrator. You will want to set up a cron job that announces the promotion rule for a server:
`*/2 * * * * root "/usr/bin/perl -le 'sleep rand 10' && /usr/bin/orchestrator-client -c register-candidate -i this.hostname.com --promotion-rule prefer"`
This setup comes from production environments; the cron entries get updated by puppet to reflect the appropriate promotion rule. A server may have `prefer` at this time and `prefer_not` five minutes from now. Integrate your own service discovery method and your own scripting to provide up-to-date promotion rules.
All failure/recovery scenarios are analyzed; however, the downtime status of an instance is also taken into consideration. An instance can be downtimed (via `orchestrator-client -c begin-downtime`), and this is noted in the analysis summary. When considering automated recovery, downtimed servers are skipped.
Downtime was, in fact, explicitly created for this very purpose: it gives the DBA a way to suppress automated failover for a specific server.
Note that manual recovery (e.g. `orchestrator-client -c recover`) overrides downtime.
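As a sketch, a maintenance window might be wrapped like so (the host, duration, and flag spellings are illustrative; check `orchestrator-client` help for the exact options your version supports):

```bash
# Suppress automated failover for this server during planned maintenance:
orchestrator-client -c begin-downtime -i maintained.host.com:3306 -duration 30m -reason "planned maintenance" -owner dba
# ...perform maintenance...
orchestrator-client -c end-downtime -i maintained.host.com:3306
```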
orchestrator supports hooks: external scripts invoked throughout the recovery process. These are arrays of commands, invoked via shell (in particular bash). See hook configuration details in recovery configuration.
- `OnFailureDetectionProcesses`: described in failure detection.
- `PreGracefulTakeoverProcesses`: invoked on the `graceful-master-takeover` command, before the master turns `read-only`.
- `PreFailoverProcesses`
- `PostMasterFailoverProcesses`
- `PostIntermediateMasterFailoverProcesses`
- `PostFailoverProcesses`
- `PostUnsuccessfulFailoverProcesses`
- `PostGracefulTakeoverProcesses`: executed on planned, graceful master takeover, after the old master is positioned under the newly promoted master.
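For orientation, each hook is configured as an array of shell commands, with placeholders interpolated by orchestrator at runtime. A minimal sketch (the log-file destination is arbitrary):

```json
{
  "PreFailoverProcesses": [
    "echo 'Will recover from {failureType} on {failureCluster}' >> /tmp/recovery.log"
  ],
  "PostFailoverProcesses": [
    "echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Promoted: {successorHost}:{successorPort}' >> /tmp/recovery.log"
  ]
}
```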