diff --git a/docs/administrator.md b/docs/administrator.md
index eacb9f280..9408541d0 100644
--- a/docs/administrator.md
+++ b/docs/administrator.md
@@ -3,6 +3,21 @@
 Learn how to configure and manage the Postgres Operator in your Kubernetes
 (K8s) environment.
 
+## Upgrading the operator
+
+The Postgres Operator is upgraded by changing the Docker image within the
+deployment. Before doing so, it is recommended to check the release notes
+for new configuration options or changed behavior you might want to reflect
+in the ConfigMap or config CRD. For example, a new feature might be enabled
+or disabled by default, and you may want to change this with the
+corresponding option.
+
+When using Helm, be aware that installing the new chart will not update the
+`Postgresql` and `OperatorConfiguration` CRDs. Make sure to update them
+beforehand with the manifests provided in the `crds` folder. Otherwise, you
+might face errors about new Postgres manifest or configuration options being
+unknown to the CRD schema validation.
+
 ## Minor and major version upgrade
 
 Minor version upgrades for PostgreSQL are handled via updating the Spilo Docker
@@ -157,20 +172,26 @@ from numerous escape characters in the latter log entry, view it in CLI with
 `PodTemplate` used by the operator is yet to be updated with the default values
 used internally in K8s.
 
-The operator also support lazy updates of the Spilo image. That means the pod
-template of a PG cluster's stateful set is updated immediately with the new
-image, but no rolling update follows. This feature saves you a switchover - and
-hence downtime - when you know pods are re-started later anyway, for instance
-due to the node rotation. To force a rolling update, disable this mode by
-setting the `enable_lazy_spilo_upgrade` to `false` in the operator configuration
-and restart the operator pod. With the standard eager rolling updates the
-operator checks during Sync all pods run images specified in their respective
-statefulsets. The operator triggers a rolling upgrade for PG clusters that
-violate this condition.
-
-Changes in $SPILO\_CONFIGURATION under path bootstrap.dcs are ignored when
-StatefulSets are being compared, if there are changes under this path, they are
-applied through rest api interface and following restart of patroni instance
+The StatefulSet is replaced if the following properties change:
+- annotations
+- volumeClaimTemplates
+- template volumes
+
+The StatefulSet is replaced and a rolling update is triggered if the following
+properties differ between the old and new state:
+- container name, ports, image, resources, env, envFrom, securityContext and volumeMounts
+- template labels, annotations, service account, securityContext, affinity, priority class and termination grace period
+
+Note that changes in the `SPILO_CONFIGURATION` env variable under the
+`bootstrap.dcs` path are ignored for the diff. They will be applied through
+Patroni's REST API interface, following a restart of all instances.
+
+The operator also supports lazy updates of the Spilo image. In this case the
+StatefulSet is only updated, but no rolling update follows. This feature saves
+you a switchover - and hence downtime - when you know pods are restarted later
+anyway, for instance due to node rotation. To force a rolling update, disable
+this mode by setting `enable_lazy_spilo_upgrade` to `false` in the operator
+configuration and restart the operator pod.
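+
+As a sketch, disabling lazy upgrades could look like this in the operator's
+ConfigMap (assuming the ConfigMap-based configuration; the ConfigMap name may
+differ in your setup):
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: postgres-operator
+data:
+  # force eager rolling updates when the Spilo image changes
+  enable_lazy_spilo_upgrade: "false"
+```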
 
 ## Delete protection via annotations
 
@@ -667,6 +688,12 @@
 if it ends up in your specified WAL backup path:
 
 ```bash
 envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
 ```
 
+You can also check if Spilo is able to find any backups:
+
+```bash
+envdir "/run/etc/wal-e.d/env" wal-g backup-list
+```
+
 Depending on the cloud storage provider different [environment variables](https://github.com/zalando/spilo/blob/master/ENVIRONMENT.rst)
 have to be set for Spilo. Not all of them are generated automatically by the
 operator by changing its configuration. In this case you have to use an
@@ -734,8 +761,15 @@
 WALE_S3_ENDPOINT='https+path://s3.eu-central-1.amazonaws.com:443'
 WALE_S3_PREFIX=$WAL_S3_BUCKET/spilo/{WAL_BUCKET_SCOPE_PREFIX}{SCOPE}{WAL_BUCKET_SCOPE_SUFFIX}/wal/{PGVERSION}
 ```
 
-If the prefix is not specified Spilo will generate it from `WAL_S3_BUCKET`.
-When the `AWS_REGION` is set `AWS_ENDPOINT` and `WALE_S3_ENDPOINT` are
+The operator sets the prefix to an empty string so that Spilo will generate it
+from the configured `WAL_S3_BUCKET`.
+
+:warning: When you overwrite the configuration by defining `WAL_S3_BUCKET` in
+the [pod_environment_configmap](#custom-pod-environment-variables) you also
+have to set `WAL_BUCKET_SCOPE_PREFIX = ""`. Otherwise, Spilo will not find
+the physical backups on restore (see next chapter).
+
+When `AWS_REGION` is set, `AWS_ENDPOINT` and `WALE_S3_ENDPOINT` are
 generated automatically. `WALG_S3_PREFIX` is identical to `WALE_S3_PREFIX`.
 `SCOPE` is the Postgres cluster name.
 
@@ -874,6 +908,36 @@
 on one of the other running instances (preferably replicas if they do not lag
 behind). You can test restoring backups by [cloning](user.md#how-to-clone-an-existing-postgresql-cluster)
 clusters.
 
+If you need to provide a [custom clone environment](#custom-pod-environment-variables)
+copy the existing variables describing your setup (backup location, prefix,
+access keys, etc.) and prepend the `CLONE_` prefix so that they are copied to
+the correct directory within Spilo.
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: postgres-pod-config
+data:
+  AWS_REGION: "eu-west-1"
+  AWS_ACCESS_KEY_ID: "****"
+  AWS_SECRET_ACCESS_KEY: "****"
+  ...
+  CLONE_AWS_REGION: "eu-west-1"
+  CLONE_AWS_ACCESS_KEY_ID: "****"
+  CLONE_AWS_SECRET_ACCESS_KEY: "****"
+  ...
+```
+
+### Standby clusters
+
+The setup for [standby clusters](user.md#setting-up-a-standby-cluster) is very
+similar to cloning. At the moment, the operator only allows streaming from the
+S3 WAL archive of the master specified in the manifest. As with cloning, if you
+are using [additional environment variables](#custom-pod-environment-variables)
+to access your backup location, you have to copy those variables and prepend
+the `STANDBY_` prefix for Spilo to find the backups and WAL files to stream,
+as sketched below.
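+
+As a sketch, a pod environment ConfigMap for a standby cluster could mirror
+the clone example above (all values here are placeholders):
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: postgres-pod-config
+data:
+  # credentials for the standby cluster's own backup location
+  AWS_REGION: "eu-west-1"
+  AWS_ACCESS_KEY_ID: "****"
+  AWS_SECRET_ACCESS_KEY: "****"
+  # same variables with the STANDBY_ prefix so Spilo can reach the source's WAL archive
+  STANDBY_AWS_REGION: "eu-west-1"
+  STANDBY_AWS_ACCESS_KEY_ID: "****"
+  STANDBY_AWS_SECRET_ACCESS_KEY: "****"
+```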
+
 ## Logical backups
 
 The operator can manage K8s cron jobs to run logical backups (SQL dumps) of
diff --git a/docs/user.md b/docs/user.md
index be7b41cfe..ef3277436 100644
--- a/docs/user.md
+++ b/docs/user.md
@@ -733,20 +733,21 @@ spec:
   uid: "efd12e58-5786-11e8-b5a7-06148230260c"
   cluster: "acid-batman"
   timestamp: "2017-12-19T12:40:33+01:00"
+  s3_wal_path: "s3://<bucketname>/spilo/<cluster-name>/<cluster-uid>/wal/"
 ```
 
 Here `cluster` is a name of a source cluster that is going to be cloned. A new
 cluster will be cloned from S3, using the latest backup before the `timestamp`.
 Note, that a time zone is required for `timestamp` in the format of +00:00 which
-is UTC. The `uid` field is also mandatory. The operator will use it to find a
-correct key inside an S3 bucket. You can find this field in the metadata of the
-source cluster:
+is UTC. You can specify the `s3_wal_path` of the source cluster or let the
+operator try to find it based on the configured `wal_[s3|gs]_bucket` and the
+specified `uid`. You can find the UID of the source cluster in its metadata:
 
 ```yaml
 apiVersion: acid.zalan.do/v1
 kind: postgresql
 metadata:
-  name: acid-test-cluster
+  name: acid-batman
   uid: efd12e58-5786-11e8-b5a7-06148230260c
 ```
 
@@ -799,7 +800,7 @@ no statefulset will be created.
 
 ```yaml
 spec:
   standby:
-    s3_wal_path: "s3 bucket path to the master"
+    s3_wal_path: "s3://<bucketname>/spilo/<cluster-name>/<cluster-uid>/wal/"
 ```
 
 At the moment, the operator only allows to stream from the WAL archive of the
 master.
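
Putting the pieces together, a minimal standby manifest might look like the
sketch below; the cluster name, team ID and volume size are placeholders, and
the `s3_wal_path` has to point to the WAL archive of the source cluster:

```yaml
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: acid-standby-cluster
spec:
  teamId: "acid"
  numberOfInstances: 1
  volume:
    size: 1Gi
  standby:
    # WAL archive of the source (master) cluster to stream from
    s3_wal_path: "s3://<bucketname>/spilo/<cluster-name>/<cluster-uid>/wal/"
```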