Originally from: [tweet](https://twitter.com/samokhvalov/status/1727344499943493900), [LinkedIn post]().

---

# How to convert a physical replica to logical

> I post a new PostgreSQL "howto" article every day. Join me in this
> journey – [subscribe](https://twitter.com/samokhvalov/), provide feedback, share!

In some cases, it might be beneficial to convert an existing regular asynchronous physical replica to logical, or to
create a new physical replica first, and then convert it to logical.

This approach:

- on the one hand, eliminates the need to execute the initial data load step, which can be fragile and quite stressful
  in the case of a large, heavily loaded DB, but
- on the other hand, the logical replica created this way has everything that the source Postgres instance has.

So, this method suits cases when you need all the data from the source to be present in the logical replica you're
creating, and it is extremely useful if you work with very large, heavily loaded clusters.

The steps below are quite straightforward. In this case, we use a physical replica that replicates data directly from
the primary via streaming replication (`primary_conninfo` and a replication slot, e.g. under Patroni's control), not
involving cascaded replication (although it is possible to implement that too).

## Step 1: have a physical replica for conversion

Choose a physical replica to convert, or create a new one using `pg_basebackup`, recovering from backups, or creating
it from a cloud snapshot.

Make sure this replica is not used by regular users while we're converting it.

## Step 2: ensure the requirements are met

First, ensure that the settings are prepared for logical replication, as described in the
[logical replication config](https://postgresql.org/docs/current/logical-replication-config.html) docs; a quick way to
verify the values is sketched after the two lists below.

Primary settings:

- `wal_level = 'logical'`
- `max_replication_slots > 0`
- `max_wal_senders > max_replication_slots`

On the physical replica we are going to convert:

- `max_replication_slots > 0`
- `max_logical_replication_workers > 0`
- `max_worker_processes >= max_logical_replication_workers + 1`
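
To verify the current values, something like this can be run on each node (a quick sanity-check sketch, not part of
the recipe itself):

```sql
-- on the primary:
select name, setting
from pg_settings
where name in ('wal_level', 'max_replication_slots', 'max_wal_senders');

-- on the physical replica we are going to convert:
select name, setting
from pg_settings
where name in (
  'max_replication_slots',
  'max_logical_replication_workers',
  'max_worker_processes'
);
```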

Additionally:

- the replication lag is low;
- every table has a PK or has
  [REPLICA IDENTITY FULL](https://postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY)
  (a query to find offending tables is sketched below);
- `restore_command` is not set on the replica we'll use (if it is, temporarily set its value to an empty string);
- temporarily, increase `wal_keep_size` (PG13+; in PG12 or older, `wal_keep_segments`) on the primary to a value
  corresponding to a few hours of WAL generation.
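
The query below sketches how to find tables that have neither a PK nor `REPLICA IDENTITY FULL`, and how the temporary
`wal_keep_size` bump might look; the `100GB` value is an assumption – pick one matching a few hours of WAL generation
in your cluster:

```sql
-- on the primary: tables that have neither a PK nor REPLICA IDENTITY FULL
-- (UPDATEs and DELETEs on such tables cannot be replicated logically)
select c.oid::regclass as table_name
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
where c.relkind = 'r'
  and n.nspname not in ('pg_catalog', 'information_schema')
  and c.relreplident <> 'f'  -- not REPLICA IDENTITY FULL
  and not exists (
    select 1
    from pg_constraint
    where conrelid = c.oid and contype = 'p'
  );

-- on the primary: temporarily keep more WAL (PG13+; the value is an assumption)
alter system set wal_keep_size = '100GB';
select pg_reload_conf();
```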

## Step 3: stop the physical replica

Shut down the physical replica and keep it down during the next step. This is needed so that its position is
guaranteed to be in the past compared to the logical slot we're going to create on the primary.

## Step 4: create publication, logical slot, and remember its LSN

On the primary:

- issue a manual `CHECKPOINT`;
- create a publication;
- create a logical slot and *remember its LSN position*.

Example:

```sql
checkpoint;

create publication my_pub for all tables;

select lsn
from pg_create_logical_replication_slot(
  'my_slot',
  'pgoutput'
);
```

It is important to remember the `lsn` value returned by the last command – we'll be using it later.
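
If needed, this position can also be looked up again in `pg_replication_slots` (a quick sanity check, not a separate
step of the recipe):

```sql
select slot_name, plugin, slot_type, confirmed_flush_lsn
from pg_replication_slots
where slot_name = 'my_slot';
```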

## Step 5: let the physical replica catch up

Reconfigure the physical replica (see the sketch after this list):

- `recovery_target_lsn` – set it to the LSN value we've got from the previous step
- `recovery_target_action = 'promote'`
- `restore_command`, `recovery_target_timeline`, `recovery_target_xid`, `recovery_target_time`, `recovery_target_name`
  are not set or empty
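
Since the replica is shut down at this point, these parameters go into its `postgresql.conf` (or the Patroni /
automation config) rather than through `ALTER SYSTEM`. A minimal sketch, with a placeholder LSN value – use the one
remembered in step 4:

```sql
-- the settings are added to postgresql.conf while the replica is down, e.g.:
--   recovery_target_lsn    = '2/5A1B2C30'   -- placeholder; use the LSN from step 4
--   recovery_target_action = 'promote'
--
-- once the replica is started (next paragraph), verify they are in effect:
show recovery_target_lsn;
show recovery_target_action;

-- and watch how much WAL is left to replay before reaching the target
-- (replace the placeholder LSN with the real one):
select
  pg_last_wal_replay_lsn() as replayed_lsn,
  pg_wal_lsn_diff('2/5A1B2C30', pg_last_wal_replay_lsn()) as bytes_left;
```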

Now, start the physical replica. Monitor its lag and how the replica catches up, reaching the LSN we need, and then
auto-promotes. This can take some time. Once it's done, check it:

```sql
select pg_is_in_recovery();
```

- it must return `f`, meaning that this node is now a primary itself (a clone), with its position corresponding to the
  position of the replication slot on the source node.

## Step 6: create subscription and start logical replication

Now, on the freshly created "clone", create a logical subscription with `copy_data = false` and `create_slot = false`:

```sql
create subscription my_sub
connection 'host=.. port=.. user=.. dbname=..'
publication my_pub
with (
  copy_data = false,
  create_slot = false,
  slot_name = 'my_slot'
);
```

Ensure that replication is now active – check it on the source primary:

```sql
select * from pg_replication_slots;
```

– the field `active` must be `t` for our slot.
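
Optionally, the state of the subscription workers can also be checked on the new logical replica itself (a quick
sanity check):

```sql
-- on the new logical replica (subscriber side)
select subname, received_lsn, latest_end_lsn, last_msg_receipt_time
from pg_stat_subscription;
```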

## Finalize

- Wait until the logical replication lag fully catches up (occasional acute spikes are OK) – see the query below.
- Return `wal_keep_size` (`wal_keep_segments`) to its original value on the primary.
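
A simple way to watch the logical replication lag is to compare, on the source primary, the current WAL position with
the slot's confirmed position (a sketch, assuming the slot name used above):

```sql
-- on the source primary
select
  slot_name,
  pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) as lag_bytes
from pg_replication_slots
where slot_name = 'my_slot';
```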

## Additional notes

In this recipe, we used a single publication and logical slot. It is possible to use multiple slots, slightly
adjusting the procedure. But if you choose to do so, keep in mind the potential complexities of using multiple
slots/publications, primarily these:

- referential integrity is not guaranteed on the logical replica (occasional temporary FK violations),
- a more fragile publication creation procedure (creating a publication `FOR ALL TABLES` doesn't require table-level
  locks; but when we use multiple publications and create a publication for certain tables, table-level locks are
  required – however, this is just `ShareUpdateExclusiveLock`,
  per [this comment in the PostgreSQL source code](https://github.com/postgres/postgres/blob/1b6da28e0668eb977dcab6987d192ddedf32b752/src/backend/commands/publicationcmds.c#L1550)).

And in any case:

- make sure you are prepared to deal with the restrictions of logical replication for your version (e.g.,
  [for PG16](https://postgresql.org/docs/16/logical-replication-restrictions.html));
- if you consider using this approach to perform a major upgrade, avoid running `pg_upgrade` on the already-converted
  node – it may not be safe
  (see: [pg_upgrade and logical replication](https://postgresql.org/message-id/flat/20230217075433.u5mjly4d5cr4hcfe%40jrouhaud)).