doc: Add standby cluster related documentation
sgotti committed May 30, 2017
1 parent 062aa2e commit acffd0c
Showing 4 changed files with 103 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -17,6 +17,7 @@ For an introduction to stolon you can also take a look at [this post](https://sg
* Full cluster setup in minutes.
* Easy [cluster administration](doc/stolonctl.md)
* Can do point in time recovery integrating with your preferred backup/restore tool.
* [Standby cluster](doc/standbycluster.md) (for multi site replication and near zero downtime migration).
* Automatic service discovery and dynamic reconfiguration (handles postgres and stolon processes changing their addresses).
* Can use [pg_rewind](doc/pg_rewind.md) for fast instance resynchronization with the current master.

1 change: 1 addition & 0 deletions doc/README.md
@@ -7,6 +7,7 @@
* Backup/Restore
* [Point In Time Recovery](pitr.md)
* [Point In Time Recovery with wal-e](pitr_wal-e.md)
* [Standby Cluster](standbycluster.md)
* Examples
* [Simple test cluster](simplecluster.md)
* [Kubernetes](../examples/kubernetes/README.md)
10 changes: 10 additions & 0 deletions doc/cluster_spec.md
@@ -28,7 +28,9 @@ Some options in a running cluster specification can be changed to update the des
| initMode | The cluster initialization mode. Can be *new* or *existing*. *new* means that a new db cluster will be created on a random keeper and the other keepers will sync with it. *existing* means that a keeper (that needs to have an already created db cluster) will be chosen as the initial master and the other keepers will sync with it. In this case the `existingConfig` object needs to be populated. | yes | string | |
| existingConfig | configuration for initMode of type "existing" | if initMode is "existing" | ExistingConfig | |
| mergePgParameters | merge pgParameters of the initialized db cluster: useful to retain initdb-generated parameters when initMode is new, or to retain the current parameters when initMode is existing or pitr. | no | bool | true |
| role | cluster role (*master* or *standby*) | no | string | master |
| pitrConfig | configuration for initMode of type "pitr" | if initMode is "pitr" | PITRConfig | |
| standbySettings | standby settings when the cluster is a standby cluster | if role is "standby" | StandbySettings | |
| pgParameters | a map containing the postgres server parameters and their values. The parameter values don't have to be quoted, and single quotes don't have to be doubled, since this is already done by the keeper when writing the postgresql.conf file | no | map[string]string | |
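For example, `pgParameters` can be changed on a running cluster with `stolonctl update --patch` (the cluster name and parameter values below are chosen for illustration; note the values are plain strings, without the quoting needed in `postgresql.conf`):

```
stolonctl --cluster-name stolon-cluster --store-backend=etcd update --patch '
{
    "pgParameters": {
        "max_connections": "300",
        "log_min_duration_statement": "1s"
    }
}'
```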

#### ExistingConfig
@@ -51,6 +53,14 @@ Some options in a running cluster specification can be changed to update the des
|-------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------------------------|---------|
| RestoreCommand | defines the command to execute for restoring the archives. See the related [postgresql doc](https://www.postgresql.org/docs/current/static/archive-recovery-settings.html) | yes | string | |

#### StandbySettings

| Name | Description | Required | Type | Default |
|-------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------------------------|---------|
| primaryConnInfo | connection string to connect to the primary server (its value will be placed in the `primary_conninfo` parameter of the instance's `recovery.conf` file). See the related [postgresql doc](https://www.postgresql.org/docs/current/static/standby-settings.html) | yes | string | |
| primarySlotName | optional replication slot to use (its value will be placed in the `primary_slot_name` parameter of the instance's `recovery.conf` file). See the related [postgresql doc](https://www.postgresql.org/docs/current/static/standby-settings.html) | no | string | |
| recoveryMinApplyDelay | delay recovery for a fixed period of time (its value will be placed in the `recovery_min_apply_delay` parameter of the instance's `recovery.conf` file). See the related [postgresql doc](https://www.postgresql.org/docs/current/static/standby-settings.html) | no | string | |
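Putting the three settings together, a `standbySettings` fragment of a cluster spec could look like this (host, slot name and delay are illustrative values, not defaults):

```
"standbySettings": {
    "primaryConnInfo": "host=remoteinstancehost port=5432 user=repluser password=replpassword",
    "primarySlotName": "stolonstandby",
    "recoveryMinApplyDelay": "5min"
}
```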

#### Special Types
Duration types (as described in https://golang.org/pkg/time/#ParseDuration) are a signed sequence of decimal numbers, each with an optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
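For example, assuming your stolon version exposes duration-typed options such as `sleepInterval` and `failInterval` (check the full options table for your version), a patch could set them as:

```
stolonctl --cluster-name stolon-cluster --store-backend=etcd update --patch '
{
    "sleepInterval": "5s",
    "failInterval": "20s"
}'
```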

91 changes: 91 additions & 0 deletions doc/standbycluster.md
@@ -0,0 +1,91 @@
## Stolon standby cluster

A stolon cluster can be initialized as a standby of another remote postgresql instance (whether that's another stolon cluster, a standalone instance, or any other kind of architecture).

This is useful for a lot of different use cases:

* Disaster recovery
* (near) Zero downtime migration to stolon

In a stolon standby cluster the master keeper is the one that syncs with the remote instance, while the other keepers replicate from the master keeper (creating a cascading replication topology).
Everything else works like a normal cluster: if a keeper dies, another one will be elected as the cluster master.

### Initializing a stolon standby cluster

#### Prerequisites

* The remote postgresql primary should have a superuser and a user with replication privileges defined (they can be the same user) and must accept remote logins from the replication user (be sure `pg_hba.conf` contains the required lines).
* You should provide the above users' credentials to the stolon keepers (`--pg-su-username`, `--pg-su-passwordfile`/`--pg-su-password`, `--pg-repl-username`, `--pg-repl-passwordfile`/`--pg-repl-password`).

**NOTE:** In the future this example could be improved using other authentication methods, like client TLS certificates.

In this example we'll use the below information:

* remote instance host: `remoteinstancehost`
* remote instance port: `5432`
* remote instance replication user name: `repluser`
* remote instance replication user password: `replpassword`
* stolon cluster name: `stolon-cluster`
* stolon store type: `etcd` (listening on localhost with default port to make it simple)

We can leverage the stolon [Point in Time Recovery](pitr.md) feature to clone from a remote postgres db. For example, we can use `pg_basebackup` to initialize the cluster. We have to call `pg_basebackup` providing the remote instance credentials for a replication user. To provide the password to `pg_basebackup` we have to create a password file like this:

```
remoteinstancehost:5432:*:repluser:replpassword
```

Make sure to set the [right permissions on the password file](https://www.postgresql.org/docs/current/static/libpq-pgpass.html).
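For example, the file can be created and secured like this (a sketch; the name `passfile` in the current directory matches the `PGPASSFILE=passfile` used in the cluster spec below):

```shell
# Create the password file read by pg_basebackup via PGPASSFILE.
cat > passfile <<'EOF'
remoteinstancehost:5432:*:repluser:replpassword
EOF

# libpq ignores password files that are readable by group or others,
# so restrict access to the owner only.
chmod 0600 passfile
```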


* Start one or more stolon sentinels and one or more stolon keepers, passing the right values for `--pg-su-username`, `--pg-su-passwordfile`/`--pg-su-password`, `--pg-repl-username` and `--pg-repl-passwordfile`/`--pg-repl-password`.
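A minimal sketch of those commands (the `--uid`, `--data-dir` and password-file paths are illustrative and additional flags may be required depending on your stolon version and setup):

```
stolon-sentinel --cluster-name stolon-cluster --store-backend=etcd

stolon-keeper --cluster-name stolon-cluster --store-backend=etcd \
    --uid keeper1 --data-dir data/keeper1 \
    --pg-su-username postgres --pg-su-passwordfile /path/to/supasswordfile \
    --pg-repl-username repluser --pg-repl-passwordfile /path/to/replpasswordfile
```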

* Initialize the cluster with the following cluster spec:

```
stolonctl --cluster-name stolon-cluster --store-backend=etcd init '
{
"role": "standby",
"initMode": "pitr",
"pitrConfig": {
"dataRestoreCommand": "PGPASSFILE=passfile pg_basebackup -D \"%d\" -h remoteinstancehost -p 5432 -U repluser"
},
"standbySettings": {
"primaryConnInfo": "host=remoteinstancehost port=5432 user=repluser password=replpassword sslmode=disable"
}
}'
```
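Since the spec is passed to `stolonctl init` as a JSON string, a quick sanity check of the JSON can save a round trip. For example, using `python3 -m json.tool` (any JSON validator works):

```shell
# Validate the cluster spec JSON before handing it to stolonctl.
spec='
{
    "role": "standby",
    "initMode": "pitr",
    "pitrConfig": {
        "dataRestoreCommand": "PGPASSFILE=passfile pg_basebackup -D \"%d\" -h remoteinstancehost -p 5432 -U repluser"
    },
    "standbySettings": {
        "primaryConnInfo": "host=remoteinstancehost port=5432 user=repluser password=replpassword sslmode=disable"
    }
}'
printf '%s' "$spec" | python3 -m json.tool > /dev/null && echo "valid cluster spec JSON"
```

If the spec is well formed this prints `valid cluster spec JSON`; otherwise `json.tool` reports the parse error.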

If everything is correct, the sentinel will choose a keeper as the cluster "master" and this keeper will start the db point in time recovery using the `pg_basebackup` command provided in the cluster specification.

If the command completes successfully, the master keeper will start as a standby of the remote instance using the recovery options provided in the cluster spec `standbySettings` (these will be used to generate the `recovery.conf` file).

The other keepers will become standby keepers of the cluster master keeper.
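To check that the topology converged, you can query the standard `pg_stat_replication` view (run as a user allowed to see the full view, typically a superuser; user names here are illustrative): the remote primary should show the stolon master keeper as a client, and the master keeper should show the other keepers.

```
# On the remote primary: the stolon master keeper should appear here.
psql -h remoteinstancehost -p 5432 -U postgres \
    -c 'SELECT client_addr, state, sync_state FROM pg_stat_replication;'
```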

### Additional options

You can specify additional options in the `standbySettings` (for all the options see the [cluster spec doc](https://github.com/sorintlab/stolon/blob/master/doc/cluster_spec.md#standbysettings)).

For example, you can specify a primary slot name to use for syncing with the master, and a WAL apply delay.

Example with a primary slot name:
```
"standbySettings": {
"primaryConnInfo": "host=remoteinstancehost port=5432 user=repluser password=replpassword sslmode=disable",
"primarySlotName": "standbycluster"
}
```
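And with both a primary slot name and a `recoveryMinApplyDelay` (the delay value is chosen for illustration):

```
"standbySettings": {
    "primaryConnInfo": "host=remoteinstancehost port=5432 user=repluser password=replpassword sslmode=disable",
    "primarySlotName": "standbycluster",
    "recoveryMinApplyDelay": "5min"
}
```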

### Promoting a standby cluster

When you want to promote your standby cluster to a primary one (for example in a disaster recovery scenario to switch to a dr site, or during a migration to switch to the new stolon cluster) you can do this using `stolonctl`:

```
stolonctl --cluster-name stolon-cluster --store-backend=etcd promote
```

This is the same as doing:

```
stolonctl --cluster-name stolon-cluster --store-backend=etcd update --patch '{ "role": "master" }'
```
