From 48bcd13010a0fffd32a853b7aa61407cefc340e3 Mon Sep 17 00:00:00 2001 From: Alexander Kukushkin Date: Sat, 12 Oct 2019 12:34:38 +0200 Subject: [PATCH 1/7] Improve documentation * document tags * move dynamic configuration out of `bootstrap.dcs` * document REST API endpoints --- docs/ENVIRONMENT.rst | 15 +++- docs/SETTINGS.rst | 78 +++++++++++------- docs/dynamic_configuration.rst | 146 +-------------------------------- docs/index.rst | 1 + 4 files changed, 61 insertions(+), 179 deletions(-) diff --git a/docs/ENVIRONMENT.rst b/docs/ENVIRONMENT.rst index 7b3517daac..40e122062f 100644 --- a/docs/ENVIRONMENT.rst +++ b/docs/ENVIRONMENT.rst @@ -11,6 +11,9 @@ Global/Universal - **PATRONI\_NAME**: name of the node where the current instance of Patroni is running. Must be unique for the cluster. - **PATRONI\_NAMESPACE**: path within the configuration store where Patroni will keep information about the cluster. Default value: "/service" - **PATRONI\_SCOPE**: cluster name + +Log +--- - **PATRONI\_LOG\_LEVEL**: sets the general logging level. Default value is **INFO** (see `the docs for Python logging `_) - **PATRONI\_LOG\_FORMAT**: sets the log formatting string. Default value is **%(asctime)s %(levelname)s: %(message)s** (see `the LogRecord attributes `_) - **PATRONI\_LOG\_DATEFORMAT**: sets the datetime formatting string. (see the `formatTime() documentation `_) @@ -62,6 +65,10 @@ Etcd - **PATRONI\_ETCD\_CERT**: File with the client certificate. - **PATRONI\_ETCD\_KEY**: File with the client key. Can be empty if the key is part of certificate. +ZooKeeper +--------- +- **PATRONI\_ZOOKEEPER\_HOSTS**: comma separated list of ZooKeeper cluster members: "'host1:port1','host2:port2','etc...'". It is important to quote every single entity! + Exhibitor --------- - **PATRONI\_EXHIBITOR\_HOSTS**: initial list of Exhibitor (ZooKeeper) nodes in format: 'host1,host2,etc...'. This list updates automatically whenever the Exhibitor (ZooKeeper) cluster topology changes. 
@@ -127,6 +134,8 @@ CTL - **PATRONI\_CTL\_CERTFILE**: Specifies the file with the client certificate in the PEM format. If not provided patronictl will use the value provided for REST API "certfile" parameter. - **PATRONI\_CTL\_KEYFILE**: Specifies the file with the client secret key in the PEM format. If not provided patronictl will use the value provided for REST API "keyfile" parameter. -ZooKeeper ---------- -- **PATRONI\_ZOOKEEPER\_HOSTS**: comma separated list of ZooKeeper cluster members: "'host1:port1','host2:port2','etc...'". It is important to quote every single entity! +CTL +--- +- **PATRONI\_CTL\_INSECURE**: Allow connections to REST API without verifying SSL certs. +- **PATRONI\_CTL\_CACERT**: Specifies the file with the CA_BUNDLE file or directory with certificates of trusted CAs to use while verifying REST API SSL certs. If not provided patronictl will use the value provided for REST API "cacert" parameter. +- **PATRONI\_CTL\_CERTFILE**: Specifies the file with the certificate in the PEM format to use while verifying REST API SSL certs. If not provided patronictl will use the value provided for REST API "certfile" parameter. diff --git a/docs/SETTINGS.rst b/docs/SETTINGS.rst index 139696c3af..d7bd5b411f 100644 --- a/docs/SETTINGS.rst +++ b/docs/SETTINGS.rst @@ -4,6 +4,39 @@ YAML Configuration Settings =========================== +.. _dynamic_configuration_settings: + +Dynamic configuration settings +------------------------------ + +Dynamic configuration is stored in the DCS (Distributed Configuration Store) and applied on all cluster nodes. Some parameters, like **loop_wait**, **ttl**, **postgresql.parameters.max_connections**, **postgresql.parameters.max_worker_processes** and so on could be set only in the dynamic configuration. Some other parameters like **postgresql.listen**, **postgresql.data_dir** could be set only locally, i.e. in the Patroni config file or via :ref:`configuration ` variable. 
In most cases the local configuration will override the dynamic configuration. In order to change the dynamic configuration you can use either ``patronictl edit-config`` tool or Patroni :ref:`REST API `. + +- **loop\_wait**: the number of seconds the loop will sleep. Default value: 10 +- **ttl**: the TTL to acquire the leader lock. Think of it as the length of time before initiation of the automatic failover process. Default value: 30 +- **retry\_timeout**: timeout for DCS and PostgreSQL operation retries. DCS or network issues shorter than this will not cause Patroni to demote the leader. Default value: 10 +- **maximum\_lag\_on\_failover**: the maximum bytes a follower may lag to be able to participate in leader election. +- **master\_start\_timeout**: the amount of time a master is allowed to recover from failures before failover is triggered. Default is 300 seconds. When set to 0 failover is done immediately after a crash is detected if possible. When using asynchronous replication a failover can cause lost transactions. Best worst case failover time for master failure is: loop\_wait + master\_start\_timeout + loop\_wait, unless master\_start\_timeout is zero, in which case it's just loop\_wait. Set the value according to your durability/availability tradeoff. +- **synchronous\_mode**: turns on synchronous replication mode. In this mode a replica will be chosen as synchronous and only the latest leader and synchronous replica are able to participate in leader election. Synchronous mode makes sure that successfully committed transactions will not be lost at failover, at the cost of losing availability for writes when Patroni cannot ensure transaction durability. See :ref:`replication modes documentation ` for details. +- **synchronous\_mode\_strict**: prevents disabling synchronous replication if no synchronous replicas are available, blocking all client writes to the master. See :ref:`replication modes documentation ` for details. 
+- **postgresql**: + - **use\_pg\_rewind**: whether or not to use pg_rewind + - **use\_slots**: whether or not to use replication_slots. + - **recovery\_conf**: additional configuration settings written to recovery.conf when configuring follower. + - **parameters**: list of configuration settings for Postgres. +- **standby\_cluster**: if this section is defined, we want to bootstrap a standby cluster. + - **host**: an address of remote master + - **port**: a port of remote master + - **primary\_slot\_name**: which slot on the remote master to use for replication. This parameter is optional, the default value is derived from the instance name (see function `slot_name_from_member_name`). + - **create\_replica\_methods**: an ordered list of methods that can be used to bootstrap standby leader from the remote master, can be different from the list defined in :ref:`postgresql_settings` + - **restore\_command**: command to restore WAL records from the remote master to standby leader, can be different from the list defined in :ref:`postgresql_settings` + - **archive\_cleanup\_command**: cleanup command for standby leader + - **recovery\_min\_apply\_delay**: how long to wait before actually apply WAL records on a standby leader +- **slots**: define permanent replication slots. These slots will be preserved during switchover/failover. Patroni will try to create slots before opening connections to the cluster. + - **my_slot_name**: the name of replication slot. It is the responsibility of the operator to make sure that there are no clashes in names between replication slots automatically created by Patroni for members and permanent replication slots. + - **type**: slot type. Could be ``physical`` or ``logical``. If the slot is logical, you have to additionally define ``database`` and ``plugin``. + - **database**: the database name where logical slots should be created. + - **plugin**: the plugin name for the logical slot. 
+ Global/Universal ---------------- - **name**: the name of the host. Must be unique for the cluster. @@ -27,32 +60,7 @@ Log Bootstrap configuration ----------------------- -- **dcs**: This section will be written into `///config` of a given configuration store after initializing of new cluster. This is the global configuration for the cluster. If you want to change some parameters for all cluster nodes - just do it in DCS (or via Patroni API) and all nodes will apply this configuration. - - **loop\_wait**: the number of seconds the loop will sleep. Default value: 10 - - **ttl**: the TTL to acquire the leader lock. Think of it as the length of time before initiation of the automatic failover process. Default value: 30 - - **retry\_timeout**: timeout for DCS and PostgreSQL operation retries. DCS or network issues shorter than this will not cause Patroni to demote the leader. Default value: 10 - - **maximum\_lag\_on\_failover**: the maximum bytes a follower may lag to be able to participate in leader election. - - **master\_start\_timeout**: the amount of time a master is allowed to recover from failures before failover is triggered. Default is 300 seconds. When set to 0 failover is done immediately after a crash is detected if possible. When using asynchronous replication a failover can cause lost transactions. Best worst case failover time for master failure is: loop\_wait + master\_start\_timeout + loop\_wait, unless master\_start\_timeout is zero, in which case it's just loop\_wait. Set the value according to your durability/availability tradeoff. - - **synchronous\_mode**: turns on synchronous replication mode. In this mode a replica will be chosen as synchronous and only the latest leader and synchronous replica are able to participate in leader election. Synchronous mode makes sure that successfully committed transactions will not be lost at failover, at the cost of losing availability for writes when Patroni cannot ensure transaction durability. 
See :ref:`replication modes documentation ` for details. - - **synchronous\_mode\_strict**: prevents disabling synchronous replication if no synchronous replicas are available, blocking all client writes to the master. See :ref:`replication modes documentation ` for details. - - **postgresql**: - - **use\_pg\_rewind**: whether or not to use pg_rewind - - **use\_slots**: whether or not to use replication_slots. Must be False for PostgreSQL 9.3. You should comment out max_replication_slots before it becomes ineligible for leader status. - - **recovery\_conf**: additional configuration settings written to recovery.conf when configuring follower. - - **parameters**: list of configuration settings for Postgres. Many of these are required for replication to work. - - **standby\_cluster**: if this section is defined, we want to bootstrap a standby cluster. - - **host**: an address of remote master - - **port**: a port of remote master - - **primary\_slot\_name**: which slot on the remote master to use for replication. This parameter is optional, the default value is derived from the instance name (see function `slot_name_from_member_name`). - - **create\_replica\_methods**: an ordered list of methods that can be used to bootstrap standby leader from the remote master, can be different from the list defined in :ref:`postgresql_settings` - - **restore\_command**: command to restore WAL records from the remote master to standby leader, can be different from the list defined in :ref:`postgresql_settings` - - **archive\_cleanup\_command**: cleanup command for standby leader - - **recovery\_min\_apply\_delay**: how long to wait before actually apply WAL records on a standby leader - - **slots**: define permanent replication slots. These slots will be preserved during switchover/failover. Patroni will try to create slots before opening connections to the cluster. - - **my_slot_name**: the name of replication slot. 
It is the responsibility of the operator to make sure that there are no clashes in names between replication slots automatically created by Patroni for members and permanent replication slots. - - **type**: slot type. Could be ``physical`` or ``logical``. If the slot is logical, you have to additionally define ``database`` and ``plugin``. - - **database**: the database name where logical slots should be created. - - **plugin**: the plugin name for the logical slot. +- **dcs**: This section will be written into `///config` of a given configuration store after initializing of new cluster. The global dynamic configuration for the cluster. Under the ``bootstrap.dcs`` you can put any of the parameters described in the :ref:`Dynamic Configuration settings ` and after Patroni initialized (bootstrapped) the new cluster it will write this section into `///config` of a given configuration store. All later changes of ``bootstrap.dcs`` will not take any effect! If you want to change them please use either ``patronictl edit-config`` or Patroni :ref:`REST API `. - **method**: custom script to use for bootstrapping this cluster. See :ref:`custom bootstrap methods documentation ` for details. When ``initdb`` is specified revert to the default ``initdb`` command. ``initdb`` is also triggered when no ``method`` @@ -110,6 +118,10 @@ Most of the parameters are optional, but you have to specify one of the **host** - **cert**: (optional) file with the client certificate. - **key**: (optional) file with the client key. Can be empty if the key is part of **cert**. +ZooKeeper +---------- +- **hosts**: list of ZooKeeper cluster members in format: ['host1:port1', 'host2:port2', 'etc...']. + Exhibitor --------- - **hosts**: initial list of Exhibitor (ZooKeeper) nodes in format: 'host1,host2,etc...'. This list updates automatically whenever the Exhibitor (ZooKeeper) cluster topology changes. 
@@ -190,7 +202,7 @@ PostgreSQL REST API -------- -- **connect\_address**: IP address (or hostname) and port, to access the Patroni's REST API. All the members of the cluster must be able to connect to this address, so unless the Patroni setup is intended for a demo inside the localhost, this address must be a non "localhost" or loopback addres (ie: "localhost" or "127.0.0.1"). It can serve as a endpoint for HTTP health checks (read below about the "listen" REST API parameter), and also for user queries (either directly or via the REST API), as well as for the health checks done by the cluster members during leader elections (for example, to determine whether the master is still running, or if there is a node which has a WAL position that is ahead of the one doing the query; etc.) The connect_address is put in the member key in DCS, making it possible to translate the member name into the address to connect to its REST API. +- **connect\_address**: IP address (or hostname) and port used to access Patroni's :ref:`REST API `. All the members of the cluster must be able to connect to this address, so unless the Patroni setup is intended for a demo inside the localhost, this address must not be a "localhost" or loopback address (i.e. not "localhost" or "127.0.0.1"). It can serve as an endpoint for HTTP health checks (read below about the "listen" REST API parameter), and also for user queries (either directly or via the REST API), as well as for the health checks done by the cluster members during leader elections (for example, to determine whether the master is still running, or if there is a node which has a WAL position that is ahead of the one doing the query; etc.). The connect_address is put in the member key in DCS, making it possible to translate the member name into the address to connect to its REST API. 
- **listen**: IP address (or hostname) and port that Patroni will listen to for the REST API - to provide also the same health checks and cluster messaging between the participating nodes, as described above. to provide health-check information for HAProxy (or any other load balancer capable of doing a HTTP "OPTION" or "GET" checks). @@ -214,12 +226,16 @@ CTL - **certfile**: Specifies the file with the client certificate in the PEM format. If not provided patronictl will use the value provided for REST API "certfile" parameter. - **keyfile**: Specifies the file with the client secret key in the PEM format. If not provided patronictl will use the value provided for REST API "keyfile" parameter. -ZooKeeper ----------- -- **hosts**: list of ZooKeeper cluster members in format: ['host1:port1', 'host2:port2', 'etc...']. - Watchdog -------- - **mode**: ``off``, ``automatic`` or ``required``. When ``off`` watchdog is disabled. When ``automatic`` watchdog will be used if available, but ignored if it is not. When ``required`` the node will not become a leader unless watchdog can be successfully enabled. - **device**: Path to watchdog device. Defaults to ``/dev/watchdog``. - **safety_margin**: Number of seconds of safety margin between watchdog triggering and leader key expiration. + +Tags +---- +- **nofailover**: ``true`` or ``false``, controls whether this node is allowed to participate in the leader race and become a leader. Defaults to ``false`` +- **clonefrom**: ``true`` or ``false``. If set to ``true`` other nodes might prefer to use this node for bootstrap (take ``pg_basebackup`` from). If there are several nodes with ``clonefrom`` tag set to ``true`` the node to bootstrap from will be chosen randomly. The default value is ``false``. +- **noloadbalance**: ``true`` or ``false``. If set to ``true`` the node will return HTTP Status Code 503 for the ``GET /replica`` REST API health-check and therefore will be excluded from the load-balancing. Defaults to ``false``. 
+- **replicatefrom**: The IP address/hostname of another replica. Used to support cascading replication. +- **nosync**: ``true`` or ``false``. If set to ``true`` the node will never be selected as a synchronous replica. diff --git a/docs/dynamic_configuration.rst b/docs/dynamic_configuration.rst index c3c94a88c6..453ab13c4e 100644 --- a/docs/dynamic_configuration.rst +++ b/docs/dynamic_configuration.rst @@ -14,7 +14,7 @@ Patroni configuration is stored in the DCS (Distributed Configuration Store). Th - Local :ref:`configuration ` (patroni.yml). These options are defined in the configuration file and take precedence over dynamic configuration. - patroni.yml could be changed and reload in runtime (without restart of Patroni) by sending SIGHUP to the Patroni process or by performing ``POST /reload`` REST-API request. + patroni.yml can be changed and reloaded at runtime (without a restart of Patroni) by sending SIGHUP to the Patroni process, performing a ``POST /reload`` REST-API request or executing ``patronictl reload``. - Environment :ref:`configuration `. It is possible to set/override some of the "Local" configuration parameters with environment variables. @@ -83,147 +83,3 @@ Upon changing these options, Patroni will read the relevant section of the confi run-time values. Patroni nodes are dumping the state of the DCS options to disk upon for every change of the configuration into the file ``patroni.dynamic.json`` located in the Postgres data directory. Only the master is allowed to restore these options from the on-disk dump if these are completely absent from the DCS or if they are invalid. - -REST API -======== - -We provide a REST API endpoint for working with dynamic configuration. - -GET /config ------------ -Get current version of dynamic configuration. - -.. code-block:: bash - - $ curl -s localhost:8008/config | jq . 
- { - "ttl": 30, - "loop_wait": 10, - "retry_timeout": 10, - "maximum_lag_on_failover": 1048576, - "postgresql": { - "use_slots": true, - "use_pg_rewind": true, - "parameters": { - "hot_standby": "on", - "wal_log_hints": "on", - "wal_keep_segments": 8, - "wal_level": "hot_standby", - "max_wal_senders": 5, - "max_replication_slots": 5, - "max_connections": "100" - } - } - } - -PATCH /config -------------- -Change existing configuration. - -.. code-block:: bash - - $ curl -s -XPATCH -d \ - '{"loop_wait":5,"ttl":20,"postgresql":{"parameters":{"max_connections":"101"}}}' \ - http://localhost:8008/config | jq . - { - "ttl": 20, - "loop_wait": 5, - "maximum_lag_on_failover": 1048576, - "retry_timeout": 10, - "postgresql": { - "use_slots": true, - "use_pg_rewind": true, - "parameters": { - "hot_standby": "on", - "wal_log_hints": "on", - "wal_keep_segments": 8, - "wal_level": "hot_standby", - "max_wal_senders": 5, - "max_replication_slots": 5, - "max_connections": "101" - } - } - } - -The above REST API call patches the existing configuration and returns the new configuration. - -Let's check that the node processed this configuration. First of all it should start printing log lines every 5 seconds (loop_wait=5). The change of "max_connections" requires a restart, so the "restart_pending" flag should be exposed: - -.. code-block:: bash - - $ curl -s http://localhost:8008/patroni | jq . - { - "pending_restart": true, - "database_system_identifier": "6287881213849985952", - "postmaster_start_time": "2016-06-13 13:13:05.211 CEST", - "xlog": { - "location": 2197818976 - }, - "patroni": { - "scope": "batman", - "version": "1.0" - }, - "state": "running", - "role": "master", - "server_version": 90503 - } - -Removing parameters: - -If you want to remove (reset) some setting just patch it with ``null``: - -.. code-block:: bash - - $ curl -s -XPATCH -d \ - '{"postgresql":{"parameters":{"max_connections":null}}}' \ - http://localhost:8008/config | jq . 
- { - "ttl": 20, - "loop_wait": 5, - "retry_timeout": 10, - "maximum_lag_on_failover": 1048576, - "postgresql": { - "use_slots": true, - "use_pg_rewind": true, - "parameters": { - "hot_standby": "on", - "unix_socket_directories": ".", - "wal_keep_segments": 8, - "wal_level": "hot_standby", - "wal_log_hints": "on", - "max_wal_senders": 5, - "max_replication_slots": 5 - } - } - } - -Above call removes ``postgresql.parameters.max_connections`` from the dynamic configuration. - -PUT /config ------------ - -It's also possible to perform the full rewrite of an existing dynamic configuration unconditionally: - -.. code-block:: bash - - $ curl -s -XPUT -d \ - '{"maximum_lag_on_failover":1048576,"retry_timeout":10,"postgresql":{"use_slots":true,"use_pg_rewind":true,"parameters":{"hot_standby":"on","wal_log_hints":"on","wal_keep_segments":8,"wal_level":"hot_standby","unix_socket_directories":".","max_wal_senders":5}},"loop_wait":3,"ttl":20}' \ - http://localhost:8008/config | jq . - { - "ttl": 20, - "maximum_lag_on_failover": 1048576, - "retry_timeout": 10, - "postgresql": { - "use_slots": true, - "parameters": { - "hot_standby": "on", - "unix_socket_directories": ".", - "wal_keep_segments": 8, - "wal_level": "hot_standby", - "wal_log_hints": "on", - "max_wal_senders": 5 - }, - "use_pg_rewind": true - }, - "loop_wait": 3 - } diff --git a/docs/index.rst b/docs/index.rst index 3e91ef62e4..8b77cfe1e3 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -19,6 +19,7 @@ We call Patroni a "template" because it is far from being a one-size-fits-all or README dynamic_configuration + rest_api ENVIRONMENT SETTINGS replica_bootstrap From 2c61d44d4c33b0be7310713d6852fac8b8dcbcab Mon Sep 17 00:00:00 2001 From: Alexander Kukushkin Date: Tue, 29 Oct 2019 11:57:33 +0100 Subject: [PATCH 2/7] Remove duplicate section --- docs/ENVIRONMENT.rst | 6 ------ 1 file changed, 6 deletions(-) diff --git a/docs/ENVIRONMENT.rst b/docs/ENVIRONMENT.rst index 40e122062f..6ff1c24f1e 100644 --- 
a/docs/ENVIRONMENT.rst +++ b/docs/ENVIRONMENT.rst @@ -133,9 +133,3 @@ CTL - **PATRONI\_CTL\_CACERT**: Specifies the file with the CA_BUNDLE file or directory with certificates of trusted CAs to use while verifying REST API SSL certs. If not provided patronictl will use the value provided for REST API "cafile" parameter. - **PATRONI\_CTL\_CERTFILE**: Specifies the file with the client certificate in the PEM format. If not provided patronictl will use the value provided for REST API "certfile" parameter. - **PATRONI\_CTL\_KEYFILE**: Specifies the file with the client secret key in the PEM format. If not provided patronictl will use the value provided for REST API "keyfile" parameter. - -CTL ---- -- **PATRONI\_CTL\_INSECURE**: Allow connections to REST API without verifying SSL certs. -- **PATRONI\_CTL\_CACERT**: Specifies the file with the CA_BUNDLE file or directory with certificates of trusted CAs to use while verifying REST API SSL certs. If not provided patronictl will use the value provided for REST API "cacert" parameter. -- **PATRONI\_CTL\_CERTFILE**: Specifies the file with the certificate in the PEM format to use while verifying REST API SSL certs. If not provided patronictl will use the value provided for REST API "certfile" parameter. From 33028b096ac6b65380eddc940df54499b683ccc9 Mon Sep 17 00:00:00 2001 From: Alexander Kukushkin Date: Tue, 29 Oct 2019 12:22:18 +0100 Subject: [PATCH 3/7] Add forgotten rest_api.rst --- docs/rest_api.rst | 335 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 335 insertions(+) create mode 100644 docs/rest_api.rst diff --git a/docs/rest_api.rst b/docs/rest_api.rst new file mode 100644 index 0000000000..ecd773d8aa --- /dev/null +++ b/docs/rest_api.rst @@ -0,0 +1,335 @@ +.. 
_rest_api: + +Patroni REST API +================ + +Patroni has a rich REST API, which is used by Patroni itself during the leader race, by the ``patronictl`` tool in order to perform failovers/switchovers/reinitialize/restarts/reloads, by HAProxy or any other kind of load-balancer to perform HTTP health-checks, and of course it can also be used for monitoring. Below you will find the list of Patroni REST API endpoints. + +Health-check endpoints +---------------------- +For all health-check ``GET`` requests Patroni returns, along with the HTTP status code, a JSON document with the status of the node. If you don't want or don't need the JSON document you might consider using the ``OPTIONS`` method instead of ``GET``. + +- The following requests to the Patroni REST API return HTTP status code **200** only when the Patroni node is running as the leader: + + - ``GET /`` + - ``GET /master`` + - ``GET /leader`` + - ``GET /primary`` + - ``GET /read-write`` + +- ``GET /replica``: replica health-check endpoint. It returns HTTP status code **200** only when the Patroni node is in the state ``running``, the role is ``replica`` and the ``noloadbalance`` tag is not set. + +- ``GET /read-only``: like the above endpoint, but also includes the primary. + +- ``GET /standby-leader``: returns HTTP status code **200** only when the Patroni node is running as the leader in the :ref:`standby cluster `. + +- ``GET /synchronous`` or ``GET /sync``: returns HTTP status code **200** only when the Patroni node is running as a synchronous standby. + +- ``GET /asynchronous`` or ``GET /async``: returns HTTP status code **200** only when the Patroni node is running as an asynchronous standby. + +- ``GET /health``: returns HTTP status code **200** only when PostgreSQL is up and running. + +Monitoring endpoint +------------------- + +The ``GET /patroni`` endpoint is used by Patroni during the leader race. It can also be used by your monitoring system. 
The JSON document produced by this endpoint has the same structure as the JSON produced by the health-check endpoints. + +.. code-block:: bash + + $ curl -s http://localhost:8008/patroni | jq . + { + "state": "running", + "postmaster_start_time": "2019-09-24 09:22:32.555 CEST", + "role": "master", + "server_version": 110005, + "cluster_unlocked": false, + "xlog": { + "location": 25624640 + }, + "timeline": 3, + "database_system_identifier": "6739877027151648096", + "patroni": { + "version": "1.6.0", + "scope": "batman" + } + } + +Cluster status endpoints +------------------------ + +- The ``GET /cluster`` endpoint generates a JSON document describing the current cluster topology and state. + +.. code-block:: bash + + $ curl -s http://localhost:8008/cluster | jq . + { + "members": [ + { + "name": "postgresql0", + "host": "127.0.0.1", + "port": 5432, + "role": "leader", + "state": "running", + "api_url": "http://127.0.0.1:8008/patroni", + "timeline": 5, + "tags": { + "clonefrom": true + } + }, + { + "name": "postgresql1", + "host": "127.0.0.1", + "port": 5433, + "role": "replica", + "state": "running", + "api_url": "http://127.0.0.1:8009/patroni", + "timeline": 5, + "tags": { + "clonefrom": true + }, + "lag": 0 + } + ], + "scheduled_switchover": { + "at": "2019-09-24T10:36:00+02:00", + "from": "postgresql0" + } + } + + +- The ``GET /history`` endpoint provides a view on the history of cluster switchovers/failovers. The format is very similar to the content of history files in the ``pg_wal`` directory. The only difference is the timestamp field showing when the new timeline was created. + +.. code-block:: bash + + $ curl -s http://localhost:8008/history | jq . 
+ [ + [ + 1, + 25623960, + "no recovery target specified", + "2019-09-23T16:57:57+02:00" + ], + [ + 2, + 25624344, + "no recovery target specified", + "2019-09-24T09:22:33+02:00" + ], + [ + 3, + 25624752, + "no recovery target specified", + "2019-09-24T09:26:15+02:00" + ], + [ + 4, + 50331856, + "no recovery target specified", + "2019-09-24T09:35:52+02:00" + ] + ] + + +Config endpoint +--------------- + +``GET /config``: Get the current version of the dynamic configuration. + +.. code-block:: bash + + $ curl -s localhost:8008/config | jq . + { + "ttl": 30, + "loop_wait": 10, + "retry_timeout": 10, + "maximum_lag_on_failover": 1048576, + "postgresql": { + "use_slots": true, + "use_pg_rewind": true, + "parameters": { + "hot_standby": "on", + "wal_log_hints": "on", + "wal_keep_segments": 8, + "wal_level": "hot_standby", + "max_wal_senders": 5, + "max_replication_slots": 5, + "max_connections": "100" + } + } + } + + +``PATCH /config``: Change the existing configuration. + +.. code-block:: bash + + $ curl -s -XPATCH -d \ + '{"loop_wait":5,"ttl":20,"postgresql":{"parameters":{"max_connections":"101"}}}' \ + http://localhost:8008/config | jq . + { + "ttl": 20, + "loop_wait": 5, + "maximum_lag_on_failover": 1048576, + "retry_timeout": 10, + "postgresql": { + "use_slots": true, + "use_pg_rewind": true, + "parameters": { + "hot_standby": "on", + "wal_log_hints": "on", + "wal_keep_segments": 8, + "wal_level": "hot_standby", + "max_wal_senders": 5, + "max_replication_slots": 5, + "max_connections": "101" + } + } + } + +The above REST API call patches the existing configuration and returns the new configuration. + +Let's check that the node processed this configuration. First of all it should start printing log lines every 5 seconds (loop_wait=5). The change of "max_connections" requires a restart, so the "pending_restart" flag should be exposed: + +.. code-block:: bash + + $ curl -s http://localhost:8008/patroni | jq . 
+ { + "pending_restart": true, + "database_system_identifier": "6287881213849985952", + "postmaster_start_time": "2016-06-13 13:13:05.211 CEST", + "xlog": { + "location": 2197818976 + }, + "patroni": { + "scope": "batman", + "version": "1.0" + }, + "state": "running", + "role": "master", + "server_version": 90503 + } + +Removing parameters: + +If you want to remove (reset) some setting just patch it with ``null``: + +.. code-block:: bash + + $ curl -s -XPATCH -d \ + '{"postgresql":{"parameters":{"max_connections":null}}}' \ + http://localhost:8008/config | jq . + { + "ttl": 20, + "loop_wait": 5, + "retry_timeout": 10, + "maximum_lag_on_failover": 1048576, + "postgresql": { + "use_slots": true, + "use_pg_rewind": true, + "parameters": { + "hot_standby": "on", + "unix_socket_directories": ".", + "wal_keep_segments": 8, + "wal_level": "hot_standby", + "wal_log_hints": "on", + "max_wal_senders": 5, + "max_replication_slots": 5 + } + } + } + +The above call removes ``postgresql.parameters.max_connections`` from the dynamic configuration. + +``PUT /config``: It's also possible to perform a full rewrite of an existing dynamic configuration unconditionally: + +.. code-block:: bash + + $ curl -s -XPUT -d \ + '{"maximum_lag_on_failover":1048576,"retry_timeout":10,"postgresql":{"use_slots":true,"use_pg_rewind":true,"parameters":{"hot_standby":"on","wal_log_hints":"on","wal_keep_segments":8,"wal_level":"hot_standby","unix_socket_directories":".","max_wal_senders":5}},"loop_wait":3,"ttl":20}' \ + http://localhost:8008/config | jq . 
+ { + "ttl": 20, + "maximum_lag_on_failover": 1048576, + "retry_timeout": 10, + "postgresql": { + "use_slots": true, + "parameters": { + "hot_standby": "on", + "unix_socket_directories": ".", + "wal_keep_segments": 8, + "wal_level": "hot_standby", + "wal_log_hints": "on", + "max_wal_senders": 5 + }, + "use_pg_rewind": true + }, + "loop_wait": 3 + } + + +Switchover and Failover endpoints +--------------------------------- + +``POST /switchover`` or ``POST /failover``. These endpoints are very similar to each other. There are a couple of minor differences though: + +1. The failover endpoint allows to perform a manual failover when there are no healthy nodes, but at the same time it will not allow you to ``schedule`` a switchover. + +2. The switchover endpoint is the opposite. It works only when the cluster is healthy (there is the leader) and allows to ``schedule`` a switchover at a given time. + + +In the JSON body of ``POST`` request you must specify at least the ``leader`` or ``candidate`` fields and optionally the ``scheduled_at`` field if you want to schedule a switchover at a specific time. + + +Example: perform a failover to the specific node + +.. code-block:: bash + + $ curl -s http://localhost:8009/failover -XPOST -d '{"candidate":"postgresql1"}' + Successfully failed over to "postgresql1" + + +Example: schedule a switchover from leader to any other healthy replica in the cluster at a specific time. + +.. code-block:: bash + + $ curl -s http://localhost:8008/switchover -XPOST -d \ + '{"leader":"postgresql0","scheduled_at":"2019-09-24T12:00+00"}' + Switchover scheduled + + +Depending on the situation the request might finish with a different HTTP status code and body. The status code **200** is returned when the switchover or failover successfully completed. If the switchover was successfully scheduled, Patroni will return HTTP status code **202**. 
In case if something went wrong the error status code (one of **400**, **412** or **503**) will be returned with some details in the response body. For more information please check the source code of ``patroni/api.py:do_POST_failover()`` method. + +The switchover and failover endpoints are used by ``patronictl switchover`` and ``patronictl failover`` respectively. + + +Restart endpoint +---------------- + +- ``POST /restart``: You can restart the postgres on the specific node by performing the ``POST /restart`` call. In the JSON body of ``POST`` request it is possible to optionally specify some restart conditions: + + - **restart_pending**: boolean, if set to ``true`` Patroni will restart PostgreSQL only when restart is pending in order to apply some changes in the PostgreSQL config. + - **role**: perform restart only if the current role of the node matches with the role from the POST request. + - **postgres_version**: perform restart only if the current version of postgres is smaller than specified in the POST request. + - **timeout**: how long should we wait before PostgreSQL starts accepting connections. Overrides ``master_start_timeout``. + - **schedule**: timestamp with time zone, schedule the restart somewhere in the future. + +- ``DELETE /restart``: delete the scheduled restart + +The restart endpoint is used by ``patronictl restart``. + + +Reload endpoint +--------------- + +The ``POST /reload`` call will order Patroni to re-read and apply the configuration file. The equivalent of sending the ``SIGHUP`` signal to the Patroni process. In case if you changed some of postgres parameters which require a restart (like for example **shared_buffers**), you still have to explicitly do the restart of postgres by either calling ``POST /restart`` endpoint of with the help of ``patronictl restart`` + +The reload endpoint is used by ``patronictl reload``. 
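The ``PATCH /config`` behaviour described above (the request body is merged into the stored configuration, and a ``null`` value removes a key) can be illustrated with a small, self-contained sketch. The helper name ``apply_config_patch`` is hypothetical and is not taken from Patroni's source. It only mimics the merge semantics visible in the example responses and does not contact a running node.

```python
import copy


def apply_config_patch(config, patch):
    """Return a new dict with ``patch`` merged into ``config``.

    Hypothetical helper for illustration only: nested dicts are merged
    recursively and a value of None (JSON null) removes the key.
    """
    result = copy.deepcopy(config)
    for key, value in patch.items():
        if value is None:
            # JSON null: drop the setting if it exists
            result.pop(key, None)
        elif isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are dicts: merge one level deeper
            result[key] = apply_config_patch(result[key], value)
        else:
            # Scalar, or the types differ: overwrite
            result[key] = copy.deepcopy(value)
    return result


# A trimmed-down version of the configuration shown in the GET /config example
current = {
    "ttl": 30,
    "loop_wait": 10,
    "postgresql": {"parameters": {"max_connections": "100", "max_wal_senders": 5}},
}

# Same body as the PATCH example
patched = apply_config_patch(
    current,
    {"loop_wait": 5, "ttl": 20, "postgresql": {"parameters": {"max_connections": "101"}}},
)

# Same body as the "removing parameters" example
reset = apply_config_patch(
    patched, {"postgresql": {"parameters": {"max_connections": None}}}
)

print(patched["postgresql"]["parameters"])
print(reset["postgresql"]["parameters"])
```

Only the keys named in the patch change; everything else in the configuration is carried over untouched, which is what distinguishes ``PATCH`` from the full rewrite performed by ``PUT /config``.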
+ + +Reinitialize endpoint +--------------------- + +``POST /reinitialize`` - reinitialize the PostgreSQL data directory on specified node. Is allowed to be executed only on the replica. Once called it will remove the data directory and start ``pg_basebackup`` or some alternative :ref:`replica creation method `. +The call might fail if Patroni is in a loop trying to recover(restart) failed postgres. In order to overcome this problem one can specify ``{"force":true}`` in the request body. + +The reinitialize endpoint is used by ``patronictl reinitialize``. From 9f257d2ee6ed8866ed21532d5ff0dc41fb5257f4 Mon Sep 17 00:00:00 2001 From: Alexander Kukushkin Date: Tue, 29 Oct 2019 12:22:39 +0100 Subject: [PATCH 4/7] Correct grammar --- docs/SETTINGS.rst | 12 ++++++------ docs/dynamic_configuration.rst | 2 +- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/SETTINGS.rst b/docs/SETTINGS.rst index d7bd5b411f..f72ec5131c 100644 --- a/docs/SETTINGS.rst +++ b/docs/SETTINGS.rst @@ -12,16 +12,16 @@ Dynamic configuration settings Dynamic configuration is stored in the DCS (Distributed Configuration Store) and applied on all cluster nodes. Some parameters, like **loop_wait**, **ttl**, **postgresql.parameters.max_connections**, **postgresql.parameters.max_worker_processes** and so on could be set only in the dynamic configuration. Some other parameters like **postgresql.listen**, **postgresql.data_dir** could be set only locally, i.e. in the Patroni config file or via :ref:`configuration ` variable. In most cases the local configuration will override the dynamic configuration. In order to change the dynamic configuration you can use either ``patronictl edit-config`` tool or Patroni :ref:`REST API `. - **loop\_wait**: the number of seconds the loop will sleep. Default value: 10 -- **ttl**: the TTL to acquire the leader lock. Think of it as the length of time before initiation of the automatic failover process. 
Default value: 30 -- **retry\_timeout**: timeout for DCS and PostgreSQL operation retries. DCS or network issues shorter than this will not cause Patroni to demote the leader. Default value: 10 +- **ttl**: the TTL to acquire the leader lock (in seconds). Think of it as the length of time before initiation of the automatic failover process. Default value: 30 +- **retry\_timeout**: timeout for DCS and PostgreSQL operation retries (in seconds). DCS or network issues shorter than this will not cause Patroni to demote the leader. Default value: 10 - **maximum\_lag\_on\_failover**: the maximum bytes a follower may lag to be able to participate in leader election. -- **master\_start\_timeout**: the amount of time a master is allowed to recover from failures before failover is triggered. Default is 300 seconds. When set to 0 failover is done immediately after a crash is detected if possible. When using asynchronous replication a failover can cause lost transactions. Best worst case failover time for master failure is: loop\_wait + master\_start\_timeout + loop\_wait, unless master\_start\_timeout is zero, in which case it's just loop\_wait. Set the value according to your durability/availability tradeoff. +- **master\_start\_timeout**: the amount of time a master is allowed to recover from failures before failover is triggered (in seconds). Default is 300 seconds. When set to 0 failover is done immediately after a crash is detected if possible. When using asynchronous replication a failover can cause lost transactions. Worst case failover time for master failure is: loop\_wait + master\_start\_timeout + loop\_wait, unless master\_start\_timeout is zero, in which case it's just loop\_wait. Set the value according to your durability/availability tradeoff. - **synchronous\_mode**: turns on synchronous replication mode. In this mode a replica will be chosen as synchronous and only the latest leader and synchronous replica are able to participate in leader election. 
Synchronous mode makes sure that successfully committed transactions will not be lost at failover, at the cost of losing availability for writes when Patroni cannot ensure transaction durability. See :ref:`replication modes documentation ` for details. - **synchronous\_mode\_strict**: prevents disabling synchronous replication if no synchronous replicas are available, blocking all client writes to the master. See :ref:`replication modes documentation ` for details. - **postgresql**: - **use\_pg\_rewind**: whether or not to use pg_rewind - **use\_slots**: whether or not to use replication_slots. - - **recovery\_conf**: additional configuration settings written to recovery.conf when configuring follower. + - **recovery\_conf**: additional configuration settings written to recovery.conf when configuring follower. There is no recovery.conf anymore in PostgreSQL 12, but you may continue using this section, because Patroni handles it transparently. - **parameters**: list of configuration settings for Postgres. - **standby\_cluster**: if this section is defined, we want to bootstrap a standby cluster. - **host**: an address of remote master @@ -60,7 +60,7 @@ Log Bootstrap configuration ----------------------- -- **dcs**: This section will be written into `///config` of a given configuration store after initializing of new cluster. The global dynamic configuration for the cluster. Under the ``bootstrap.dcs`` you can put any of the parameters described in the :ref:`Dynamic Configuration settings ` and after Patroni initialized (bootstrapped) the new cluster it will write this section into `///config` of a given configuration store. All later changes of ``bootstrap.dcs`` will not take any effect! If you want to change them please use either ``patronictl edit-config`` or Patroni :ref:`REST API `. +- **dcs**: This section will be written into `///config` of the given configuration store after initializing of new cluster. The global dynamic configuration for the cluster. 
Under the ``bootstrap.dcs`` you can put any of the parameters described in the :ref:`Dynamic Configuration settings ` and after Patroni initialized (bootstrapped) the new cluster, it will write this section into `///config` of the configuration store. All later changes of ``bootstrap.dcs`` will not take any effect! If you want to change them please use either ``patronictl edit-config`` or Patroni :ref:`REST API `. - **method**: custom script to use for bootstrapping this cluster. See :ref:`custom bootstrap methods documentation ` for details. When ``initdb`` is specified revert to the default ``initdb`` command. ``initdb`` is also triggered when no ``method`` @@ -202,7 +202,7 @@ PostgreSQL REST API -------- -- **connect\_address**: IP address (or hostname) and port, to access the Patroni's :ref:`REST API `. All the members of the cluster must be able to connect to this address, so unless the Patroni setup is intended for a demo inside the localhost, this address must be a non "localhost" or loopback addres (ie: "localhost" or "127.0.0.1"). It can serve as a endpoint for HTTP health checks (read below about the "listen" REST API parameter), and also for user queries (either directly or via the REST API), as well as for the health checks done by the cluster members during leader elections (for example, to determine whether the master is still running, or if there is a node which has a WAL position that is ahead of the one doing the query; etc.) The connect_address is put in the member key in DCS, making it possible to translate the member name into the address to connect to its REST API. +- **connect\_address**: IP address (or hostname) and port, to access the Patroni's :ref:`REST API `. All the members of the cluster must be able to connect to this address, so unless the Patroni setup is intended for a demo inside the localhost, this address must be a non "localhost" or loopback address (ie: "localhost" or "127.0.0.1"). 
It can serve as an endpoint for HTTP health checks (read below about the "listen" REST API parameter), and also for user queries (either directly or via the REST API), as well as for the health checks done by the cluster members during leader elections (for example, to determine whether the master is still running, or if there is a node which has a WAL position that is ahead of the one doing the query; etc.) The connect_address is put in the member key in DCS, making it possible to translate the member name into the address to connect to its REST API. - **listen**: IP address (or hostname) and port that Patroni will listen to for the REST API - to provide also the same health checks and cluster messaging between the participating nodes, as described above. to provide health-check information for HAProxy (or any other load balancer capable of doing a HTTP "OPTION" or "GET" checks). diff --git a/docs/dynamic_configuration.rst b/docs/dynamic_configuration.rst index 453ab13c4e..85c7cb1de5 100644 --- a/docs/dynamic_configuration.rst +++ b/docs/dynamic_configuration.rst @@ -14,7 +14,7 @@ Patroni configuration is stored in the DCS (Distributed Configuration Store). Th - Local :ref:`configuration ` (patroni.yml). These options are defined in the configuration file and take precedence over dynamic configuration. - patroni.yml could be changed and reload in runtime (without restart of Patroni) by sending SIGHUP to the Patroni process, performing ``POST /reload`` REST-API request or executing ``patronictl reload``. + patroni.yml could be changed and reloaded in runtime (without restart of Patroni) by sending SIGHUP to the Patroni process, performing ``POST /reload`` REST-API request or executing ``patronictl reload``. - Environment :ref:`configuration `. It is possible to set/override some of the "Local" configuration parameters with environment variables. 
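The worst-case failover time stated for ``master_start_timeout`` in this patch (``loop_wait + master_start_timeout + loop_wait``, or just ``loop_wait`` when ``master_start_timeout`` is zero) can be checked with a few lines of arithmetic. The function name below is hypothetical. The default arguments are the documented defaults of 10 and 300 seconds.

```python
def worst_case_failover_seconds(loop_wait=10, master_start_timeout=300):
    """Worst-case time, in seconds, before failover after a master failure.

    Illustrative helper only; it restates the formula from the
    master_start_timeout description.
    """
    if master_start_timeout == 0:
        # Failover is attempted immediately after the crash is detected
        return loop_wait
    return loop_wait + master_start_timeout + loop_wait


print(worst_case_failover_seconds())        # defaults: 10 + 300 + 10 = 320
print(worst_case_failover_seconds(10, 0))   # master_start_timeout disabled: 10
```

With the defaults, a failed master may therefore be given more than five minutes to recover before a replica is promoted, which is the durability/availability tradeoff the setting controls.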
From cd785007aca80c517db23c52f7ef5c8d8a6bcc5b Mon Sep 17 00:00:00 2001 From: Alexander Kukushkin Date: Thu, 31 Oct 2019 11:28:07 +0100 Subject: [PATCH 5/7] Describe defaults for us_pg_rewind and use_slots --- docs/SETTINGS.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/SETTINGS.rst b/docs/SETTINGS.rst index f72ec5131c..3d0e26c432 100644 --- a/docs/SETTINGS.rst +++ b/docs/SETTINGS.rst @@ -19,8 +19,8 @@ Dynamic configuration is stored in the DCS (Distributed Configuration Store) and - **synchronous\_mode**: turns on synchronous replication mode. In this mode a replica will be chosen as synchronous and only the latest leader and synchronous replica are able to participate in leader election. Synchronous mode makes sure that successfully committed transactions will not be lost at failover, at the cost of losing availability for writes when Patroni cannot ensure transaction durability. See :ref:`replication modes documentation ` for details. - **synchronous\_mode\_strict**: prevents disabling synchronous replication if no synchronous replicas are available, blocking all client writes to the master. See :ref:`replication modes documentation ` for details. - **postgresql**: - - **use\_pg\_rewind**: whether or not to use pg_rewind - - **use\_slots**: whether or not to use replication_slots. + - **use\_pg\_rewind**: whether or not to use pg_rewind. Defaults to `false`. + - **use\_slots**: whether or not to use replication_slots. Defaults to `true` on PostgreSQL 9.4+. - **recovery\_conf**: additional configuration settings written to recovery.conf when configuring follower. There is no recovery.conf anymore in PostgreSQL 12, but you may continue using this section, because Patroni handles it transparently. - **parameters**: list of configuration settings for Postgres. - **standby\_cluster**: if this section is defined, we want to bootstrap a standby cluster. 
From 57b144cb27d38cf683d3d5b3cd4b147310e6f161 Mon Sep 17 00:00:00 2001 From: Alexander Kukushkin Date: Tue, 5 Nov 2019 17:29:13 +0100 Subject: [PATCH 6/7] Fix K8s.ports example --- docs/SETTINGS.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/SETTINGS.rst b/docs/SETTINGS.rst index f72ec5131c..8be1428488 100644 --- a/docs/SETTINGS.rst +++ b/docs/SETTINGS.rst @@ -138,7 +138,7 @@ Kubernetes - **role\_label**: (optional) name of the label containing role (master or replica). Patroni will set this label on the pod it runs in. Default value is ``role``. - **use\_endpoints**: (optional) if set to true, Patroni will use Endpoints instead of ConfigMaps to run leader elections and keep cluster state. - **pod\_ip**: (optional) IP address of the pod Patroni is running in. This value is required when `use_endpoints` is enabled and is used to populate the leader endpoint subsets when the pod's PostgreSQL is promoted. -- **ports**: (optional) if the Service object has the name for the port, the same name must appear in the Endpoint object, otherwise service won't work. For example, if your service is defined as ``{Kind: Service, spec: {ports: [{name: postgresql, port: 5432, targetPort: 5432}]}}``, then you have to set ``kubernetes.ports: {[{"name": "postgresql", "port": 5432}]}`` and Patroni will use it for updating subsets of the leader Endpoint. This parameter is used only if `kubernetes.use_endpoints` is set. +- **ports**: (optional) if the Service object has the name for the port, the same name must appear in the Endpoint object, otherwise service won't work. For example, if your service is defined as ``{Kind: Service, spec: {ports: [{name: postgresql, port: 5432, targetPort: 5432}]}}``, then you have to set ``kubernetes.ports: [{"name": "postgresql", "port": 5432}]`` and Patroni will use it for updating subsets of the leader Endpoint. This parameter is used only if `kubernetes.use_endpoints` is set. .. 
_postgresql_settings: From 1fae02984ef93e4d3fe72098a0b67c4e7ac75cbe Mon Sep 17 00:00:00 2001 From: Andras Vaczi Date: Tue, 5 Nov 2019 18:15:43 +0100 Subject: [PATCH 7/7] Fix some grammar --- docs/rest_api.rst | 51 ++++++++++++++++++++++++----------------------- 1 file changed, 26 insertions(+), 25 deletions(-) diff --git a/docs/rest_api.rst b/docs/rest_api.rst index ecd773d8aa..cb3d37f2ff 100644 --- a/docs/rest_api.rst +++ b/docs/rest_api.rst @@ -3,13 +3,13 @@ Patroni REST API ================ -Patroni has a reach REST API, which is used by Patroni itself during the leader race, by ``patronictl`` tool in order to perform failovers/switchovers/reinitialize/restarts/reloads, by HAProxy or any other kind of load-balancer to perform HTTP health-checks and of course of could also be used for monitoring. Below you will find the list of Patroni REST API endpoints. +Patroni has a rich REST API, which is used by Patroni itself during the leader race, by the ``patronictl`` tool in order to perform failovers/switchovers/reinitialize/restarts/reloads, by HAProxy or any other kind of load balancer to perform HTTP health checks, and of course could also be used for monitoring. Below you will find the list of Patroni REST API endpoints. -Health-check endpoints +Health check endpoints ---------------------- -For all health-check ``GET`` requests along with HTTP status code Patroni returns a JSON document with the status of the node. If you don't want or don't need the JSON document you might consider using the ``OPTIONS`` method instead of ``GET``. +For all health check ``GET`` requests Patroni returns a JSON document with the status of the node, along with the HTTP status code. If you don't want or don't need the JSON document, you might consider using the ``OPTIONS`` method instead of ``GET``. 
-- Following requests to Patroni REST API will return HTTP status code **200** only when the Patroni node is running as the leader +- The following requests to Patroni REST API will return HTTP status code **200** only when the Patroni node is running as the leader: - ``GET /`` - ``GET /master`` @@ -17,11 +17,11 @@ For all health-check ``GET`` requests along with HTTP status code Patroni return - ``GET /primary`` - ``GET /read-write`` -- ``GET /replica``: replica health-check endpoint. will return HTTP status code **200** only when the Patroni node is in the state ``running``, the role is ``replica`` and ``noloadbalance`` tag is not set. +- ``GET /replica``: replica health check endpoint. It returns HTTP status code **200** only when the Patroni node is in the state ``running``, the role is ``replica`` and ``noloadbalance`` tag is not set. - ``GET /read-only``: like the above endpoint, but also includes the primary. -- ``GET /standby-leader``: returns HTTP status code **200** only when the Patroni node is running as the leader in the :ref:`standby cluster `. +- ``GET /standby-leader``: returns HTTP status code **200** only when the Patroni node is running as the leader in a :ref:`standby cluster `. - ``GET /synchronous`` or ``GET /sync``: returns HTTP status code **200** only when the Patroni node is running as a synchronous standby. @@ -32,7 +32,7 @@ For all health-check ``GET`` requests along with HTTP status code Patroni return Monitoring endpoint ------------------- -The ``GET /patroni`` is used by Patroni during the leader race. It also could be used by your monitoring system. The JSON document produced by this endpoint has the same structure and the JSON produced by health-check endpoints. +The ``GET /patroni`` is used by Patroni during the leader race. It also could be used by your monitoring system. The JSON document produced by this endpoint has the same structure as the JSON produced by the health check endpoints. .. 
code-block:: bash @@ -57,7 +57,7 @@ The ``GET /patroni`` is used by Patroni during the leader race. It also could be Cluster status endpoints ------------------------ -- The ``GET /cluster`` endpoint generates JSON document describing the current cluster topology and state. +- The ``GET /cluster`` endpoint generates a JSON document describing the current cluster topology and state: .. code-block:: bash @@ -97,7 +97,7 @@ Cluster status endpoints } -- The ``GET /history`` endpoint provides a view on the history of cluster switchovers/failovers. The format is very similar to the content of history files in the ``pg_wal`` directory. The only difference in the timestamp field showing when the new timeline was created. +- The ``GET /history`` endpoint provides a view on the history of cluster switchovers/failovers. The format is very similar to the content of history files in the ``pg_wal`` directory. The only difference is the timestamp field showing when the new timeline was created. .. code-block:: bash @@ -133,7 +133,7 @@ Cluster status endpoints Config endpoint --------------- -``GET /config``: Get the current version of the dynamic configuration. +``GET /config``: Get the current version of the dynamic configuration: .. code-block:: bash @@ -188,7 +188,7 @@ Config endpoint The above REST API call patches the existing configuration and returns the new configuration. -Let's check that the node processed this configuration. First of all it should start printing log lines every 5 seconds (loop_wait=5). The change of "max_connections" requires a restart, so the "restart_pending" flag should be exposed: +Let's check that the node processed this configuration. First of all it should start printing log lines every 5 seconds (loop_wait=5). The change of "max_connections" requires a restart, so the "pending_restart" flag should be exposed: .. 
code-block:: bash @@ -238,7 +238,7 @@ If you want to remove (reset) some setting just patch it with ``null``: } } -Above call removes ``postgresql.parameters.max_connections`` from the dynamic configuration. +The above call removes ``postgresql.parameters.max_connections`` from the dynamic configuration. ``PUT /config``: It's also possible to perform the full rewrite of an existing dynamic configuration unconditionally: @@ -267,20 +267,20 @@ Above call removes ``postgresql.parameters.max_connections`` from the dynamic co } -Switchover and Failover endpoints +Switchover and failover endpoints --------------------------------- ``POST /switchover`` or ``POST /failover``. These endpoints are very similar to each other. There are a couple of minor differences though: -1. The failover endpoint allows to perform a manual failover when there are no healthy nodes, but at the same time it will not allow you to ``schedule`` a switchover. +1. The failover endpoint allows to perform a manual failover when there are no healthy nodes, but at the same time it will not allow you to schedule a switchover. -2. The switchover endpoint is the opposite. It works only when the cluster is healthy (there is the leader) and allows to ``schedule`` a switchover at a given time. +2. The switchover endpoint is the opposite. It works only when the cluster is healthy (there is a leader) and allows to schedule a switchover at a given time. -In the JSON body of ``POST`` request you must specify at least the ``leader`` or ``candidate`` fields and optionally the ``scheduled_at`` field if you want to schedule a switchover at a specific time. +In the JSON body of the ``POST`` request you must specify at least the ``leader`` or ``candidate`` fields and optionally the ``scheduled_at`` field if you want to schedule a switchover at a specific time. -Example: perform a failover to the specific node +Example: perform a failover to the specific node: .. 
code-block:: bash @@ -288,7 +288,7 @@ Example: perform a failover to the specific node Successfully failed over to "postgresql1" -Example: schedule a switchover from leader to any other healthy replica in the cluster at a specific time. +Example: schedule a switchover from the leader to any other healthy replica in the cluster at a specific time: .. code-block:: bash @@ -297,20 +297,20 @@ Example: schedule a switchover from leader to any other healthy replica in the c Switchover scheduled -Depending on the situation the request might finish with a different HTTP status code and body. The status code **200** is returned when the switchover or failover successfully completed. If the switchover was successfully scheduled, Patroni will return HTTP status code **202**. In case if something went wrong the error status code (one of **400**, **412** or **503**) will be returned with some details in the response body. For more information please check the source code of ``patroni/api.py:do_POST_failover()`` method. +Depending on the situation the request might finish with a different HTTP status code and body. The status code **200** is returned when the switchover or failover successfully completed. If the switchover was successfully scheduled, Patroni will return HTTP status code **202**. In case something went wrong, the error status code (one of **400**, **412** or **503**) will be returned with some details in the response body. For more information please check the source code of ``patroni/api.py:do_POST_failover()`` method. -The switchover and failover endpoints are used by ``patronictl switchover`` and ``patronictl failover`` respectively. +The switchover and failover endpoints are used by ``patronictl switchover`` and ``patronictl failover``, respectively. Restart endpoint ---------------- -- ``POST /restart``: You can restart the postgres on the specific node by performing the ``POST /restart`` call. 
In the JSON body of ``POST`` request it is possible to optionally specify some restart conditions: +- ``POST /restart``: You can restart Postgres on the specific node by performing the ``POST /restart`` call. In the JSON body of ``POST`` request it is possible to optionally specify some restart conditions: - **restart_pending**: boolean, if set to ``true`` Patroni will restart PostgreSQL only when restart is pending in order to apply some changes in the PostgreSQL config. - **role**: perform restart only if the current role of the node matches with the role from the POST request. - **postgres_version**: perform restart only if the current version of postgres is smaller than specified in the POST request. - - **timeout**: how long should we wait before PostgreSQL starts accepting connections. Overrides ``master_start_timeout``. + - **timeout**: how long we should wait before PostgreSQL starts accepting connections. Overrides ``master_start_timeout``. - **schedule**: timestamp with time zone, schedule the restart somewhere in the future. - ``DELETE /restart``: delete the scheduled restart @@ -321,7 +321,7 @@ The restart endpoint is used by ``patronictl restart``. Reload endpoint --------------- -The ``POST /reload`` call will order Patroni to re-read and apply the configuration file. The equivalent of sending the ``SIGHUP`` signal to the Patroni process. In case if you changed some of postgres parameters which require a restart (like for example **shared_buffers**), you still have to explicitly do the restart of postgres by either calling ``POST /restart`` endpoint of with the help of ``patronictl restart`` +The ``POST /reload`` call will order Patroni to re-read and apply the configuration file. This is the equivalent of sending the ``SIGHUP`` signal to the Patroni process. 
In case you changed some of the Postgres parameters which require a restart (like **shared_buffers**), you still have to explicitly do the restart of Postgres by either calling the ``POST /restart`` endpoint or with the help of ``patronictl restart``. The reload endpoint is used by ``patronictl reload``. @@ -329,7 +329,8 @@ The reload endpoint is used by ``patronictl reload``. Reinitialize endpoint --------------------- -``POST /reinitialize`` - reinitialize the PostgreSQL data directory on specified node. Is allowed to be executed only on the replica. Once called it will remove the data directory and start ``pg_basebackup`` or some alternative :ref:`replica creation method `. -The call might fail if Patroni is in a loop trying to recover(restart) failed postgres. In order to overcome this problem one can specify ``{"force":true}`` in the request body. +``POST /reinitialize``: reinitialize the PostgreSQL data directory on the specified node. It is allowed to be executed only on replicas. Once called, it will remove the data directory and start ``pg_basebackup`` or some alternative :ref:`replica creation method `. + +The call might fail if Patroni is in a loop trying to recover (restart) a failed Postgres. In order to overcome this problem one can specify ``{"force":true}`` in the request body. The reinitialize endpoint is used by ``patronictl reinitialize``.
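As a rough sketch of how a client might use the switchover endpoint documented above, the following builds the ``POST /switchover`` request and maps the documented status codes to a short description. The function names ``build_switchover_request`` and ``describe_status`` are hypothetical, and the base URL is an assumption taken from the ``curl`` examples. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_switchover_request(base_url, leader=None, candidate=None, scheduled_at=None):
    """Build (but do not send) a POST /switchover request.

    Hypothetical helper: the body needs at least ``leader`` or
    ``candidate``; ``scheduled_at`` is optional.
    """
    if leader is None and candidate is None:
        raise ValueError("specify at least one of 'leader' or 'candidate'")
    body = {}
    if leader is not None:
        body["leader"] = leader
    if candidate is not None:
        body["candidate"] = candidate
    if scheduled_at is not None:
        body["scheduled_at"] = scheduled_at
    return urllib.request.Request(
        base_url.rstrip("/") + "/switchover",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
    )


def describe_status(code):
    """Translate the status codes listed in the documentation."""
    if code == 200:
        return "completed"
    if code == 202:
        return "scheduled"
    if code in (400, 412, 503):
        return "error, see response body"
    return "unexpected status"


# Mirrors the scheduled switchover example from the documentation
request = build_switchover_request(
    "http://localhost:8008",
    leader="postgresql0",
    scheduled_at="2019-09-24T12:00+00",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the resulting object to ``urllib.request.urlopen`` would send it to a running node. Note that ``urlopen`` raises ``urllib.error.HTTPError`` for 4xx and 5xx responses, so the error codes would have to be read from the exception's ``code`` attribute.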