Originally from: [tweet](https://twitter.com/samokhvalov/status/1727344499943493900), [LinkedIn post]().

---

# How to convert a physical replica to logical

> I post a new PostgreSQL "howto" article every day. Join me in this
> journey – [subscribe](https://twitter.com/samokhvalov/), provide feedback, share!
In some cases, it might be beneficial to convert an existing regular asynchronous physical replica to logical, or to
create a new physical replica first and then convert it to logical.

This approach:

- on the one hand, eliminates the need to execute the initial data load step, which can be fragile and quite stressful
  for a large, heavily loaded DB, but
- on the other hand, the logical replica created this way has everything that the source Postgres instance has.

So, this method suits cases when you need all the data from the source to be present in the logical replica you're
creating, and it is extremely useful if you work with very large, heavily loaded clusters.

The steps below are quite straightforward. In this case, we use a physical replica that replicates data directly from
the primary via streaming replication (`primary_conninfo`) and a replication slot (e.g., under Patroni's control), not
involving cascaded replication (although that is possible to implement too).
## Step 1: have a physical replica for conversion

Choose a physical replica to convert, or create a new one using `pg_basebackup`, by recovering from backups, or from a
cloud snapshot.

Make sure this replica is not used by regular users while we're converting it.
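For example, a new physical replica could be created with `pg_basebackup` – a minimal sketch, where the host, user,
and data directory are placeholders, not values prescribed by this recipe:

```bash
pg_basebackup \
  --host=source-primary \
  --username=replicator \
  --pgdata=/var/lib/postgresql/16/main \
  --wal-method=stream \
  --checkpoint=fast \
  --write-recovery-conf \
  --progress
```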
## Step 2: ensure the requirements are met

First, ensure that the settings are prepared for logical replication, as described
in the [logical replication config](https://postgresql.org/docs/current/logical-replication-config.html); a quick
check is sketched after the lists below.

Primary settings:

- `wal_level = 'logical'`
- `max_replication_slots > 0`
- `max_wal_senders > max_replication_slots`

On the physical replica we are going to convert:

- `max_replication_slots > 0`
- `max_logical_replication_workers > 0`
- `max_worker_processes >= max_logical_replication_workers + 1`

Additionally:

- the replication lag is low;
- every table has a PK or has
  [REPLICA IDENTITY FULL](https://postgresql.org/docs/current/sql-altertable.html#SQL-ALTERTABLE-REPLICA-IDENTITY);
- `restore_command` is not set on the replica we'll use (if it is, temporarily set its value to an empty string);
- temporarily, increase `wal_keep_size` (PG13+; in PG12 or older, `wal_keep_segments`) on the primary to a value
  corresponding to a few hours of WAL generation.
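A quick way to verify the settings and to find tables lacking both a PK and `REPLICA IDENTITY FULL` – a sketch; run
the second query in every database you are going to replicate:

```sql
-- relevant settings (check on the primary and on the replica to convert)
select name, setting
from pg_settings
where name in (
  'wal_level',
  'max_replication_slots',
  'max_wal_senders',
  'max_logical_replication_workers',
  'max_worker_processes'
);

-- tables that have neither a PK nor REPLICA IDENTITY FULL
select c.oid::regclass as tbl
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
where c.relkind = 'r'
  and n.nspname not in ('pg_catalog', 'information_schema')
  and c.relreplident <> 'f'
  and not exists (
    select 1
    from pg_constraint
    where conrelid = c.oid and contype = 'p'
  );
```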
## Step 3: stop the physical replica

Shut down the physical replica and keep it down during the next step. This is needed so its position is guaranteed to
be in the past compared to the logical slot we're going to create on the primary.
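How exactly to stop it depends on how the replica is managed. A sketch for a self-managed instance (the data directory
path is an assumption; if the replica runs under Patroni, stop it via Patroni instead):

```bash
pg_ctl stop -D /var/lib/postgresql/16/main -m fast
```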
## Step 4: create publication, logical slot, and remember its LSN

On the primary:

- issue a manual `CHECKPOINT`;
- create a publication;
- create a logical slot and *remember its LSN position*.

Example:

```sql
checkpoint;

create publication my_pub for all tables;

select lsn
from pg_create_logical_replication_slot(
  'my_slot',
  'pgoutput'
);
```

It is important to remember the `lsn` value returned by the last command – we'll be using it later.
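If you run that last command in `psql`, the returned LSN can be captured into a variable instead of copying it by
hand – a small sketch using the standard `\gset` command (it replaces the trailing semicolon of the same query, so the
slot is still created only once):

```sql
select lsn
from pg_create_logical_replication_slot(
  'my_slot',
  'pgoutput'
) \gset

-- the captured value is now available as :'lsn'
\echo :'lsn'
```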
## Step 5: let the physical replica catch up

Reconfigure the physical replica (a config sketch follows the list):

- `recovery_target_lsn` – set it to the LSN value we got in the previous step;
- `recovery_target_action = 'promote'`;
- `restore_command`, `recovery_target_timeline`, `recovery_target_xid`, `recovery_target_time`, `recovery_target_name`
  are not set or empty.
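For a self-managed replica configured via `postgresql.conf`, this could look as follows – a sketch; the path and the
LSN value are placeholders (use the LSN from Step 4), and with Patroni these settings belong in its configuration
instead:

```bash
cat >> /var/lib/postgresql/16/main/postgresql.conf <<'EOF'
recovery_target_lsn = 'X/XXXXXXXX'
recovery_target_action = 'promote'
restore_command = ''
EOF
```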
Now, start the physical replica. Monitor its lag and observe how the replica catches up, reaches the LSN we need, and
then auto-promotes. This can take some time. Once it's done, check it:

```sql
select pg_is_in_recovery();
```

- it must return `f`, meaning that this node is now a primary itself (a clone) whose position corresponds to the
  position of the replication slot on the source node.
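While the replica is still catching up, the remaining distance can be watched on it – a sketch; replace the
placeholder with the LSN from Step 4:

```sql
select pg_wal_lsn_diff('X/XXXXXXXX', pg_last_wal_replay_lsn()) as bytes_left;
```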
## Step 6: create subscription and start logical replication

Now, on the freshly created "clone", create a logical subscription with `copy_data = false` and `create_slot = false`:

```sql
create subscription my_sub
connection 'host=.. port=.. user=.. dbname=..'
publication my_pub
with (
  copy_data = false,
  create_slot = false,
  slot_name = 'my_slot'
);
```

Ensure that replication is now active – check it on the source primary:

```sql
select * from pg_replication_slots;
```

– the field `active` must be `t` for our slot.
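On the subscriber side (the converted node), the state of the subscription workers can be checked as well:

```sql
select * from pg_stat_subscription;
```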
## Finalize

- Wait until logical replication fully catches up (occasional short spikes of lag are OK); one way to check is
  sketched below.
- Return `wal_keep_size` (`wal_keep_segments`) to its original value on the primary.
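One way to watch the remaining lag is to check the slot position on the source primary – a sketch:

```sql
select
  slot_name,
  pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) as lag_bytes
from pg_replication_slots
where slot_name = 'my_slot';
```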
## Additional notes

We used a single publication and a single logical slot in this recipe. It is possible to use multiple slots, slightly
adjusting the procedure. But if you choose to do so, keep in mind the potential complexities of using multiple
slots/publications, most notably these:

- referential integrity is not guaranteed on the logical replica (occasional temporary FK violations),
- a more fragile procedure of publication creation (creating a publication `FOR ALL TABLES` doesn't require table-level
  locks; but when we use multiple publications and create a publication for certain tables, table-level locks are
  required – however, this is just `ShareUpdateExclusiveLock`,
  per [this comment in the PostgreSQL source code](https://github.com/postgres/postgres/blob/1b6da28e0668eb977dcab6987d192ddedf32b752/src/backend/commands/publicationcmds.c#L1550)).

And in any case:

- make sure you are prepared to deal with the restrictions of logical replication for your version (e.g.,
  [for PG16](https://postgresql.org/docs/16/logical-replication-restrictions.html));
- if you consider using this approach to perform a major upgrade, avoid running `pg_upgrade` on the already-converted
  node – it may not be safe
  (see: [pg_upgrade and logical replication](https://postgresql.org/message-id/flat/20230217075433.u5mjly4d5cr4hcfe%40jrouhaud)).
Originally from: [tweet](https://twitter.com/samokhvalov/status/1727705412072554585), [LinkedIn post]().

---

# How to use Docker to run Postgres

> I post a new PostgreSQL "howto" article every day. Join me in this
> journey – [subscribe](https://twitter.com/samokhvalov/), provide feedback, share!
This howto is for users who use or need to use Postgres, but are not experienced in using Docker.

Running Postgres in a Docker container for development and testing can help you align the sets of libraries,
extensions, and software versions between multiple environments.
## Docker installation – macOS

Installation using [Homebrew](https://brew.sh):

```bash
brew install docker docker-compose
```
## Docker installation – Ubuntu

```bash
sudo apt-get update
sudo apt-get install -y \
  apt-transport-https \
  ca-certificates \
  curl \
  gnupg-agent \
  software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo add-apt-repository -y \
  "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

sudo apt-get update && sudo apt-get install -y \
  docker-ce \
  docker-ce-cli \
  containerd.io \
  docker-compose-plugin
```

To avoid the need to use `sudo` to run `docker` commands:

```bash
sudo groupadd docker
sudo usermod -aG docker $USER
newgrp docker
```
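To verify that Docker now works without `sudo`:

```bash
docker run --rm hello-world
```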
## Run Postgres in a container with persistent PGDATA

Assuming we want the data directory (`PGDATA`) to be in `~/pgdata` and the container to be named `pg16`:

```bash
sudo docker run \
  --detach \
  --name pg16 \
  -e POSTGRES_PASSWORD=secret \
  -v ~/pgdata:/var/lib/postgresql/data \
  --shm-size=128m \
  postgres:16
```
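To check that the container is up and that Postgres accepts connections:

```bash
docker ps --filter name=pg16

docker exec -it pg16 pg_isready -U postgres
```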
## Check logs

Last 5 minutes of logs, with timestamps, following new log entries as they arrive:

```bash
docker logs --since 5m -tf pg16
```
## Connect using psql

```bash
❯ docker exec -it pg16 psql -U postgres -c 'create table t()'
CREATE TABLE

❯ docker exec -it pg16 psql -U postgres -c '\d t'
                Table "public.t"
 Column | Type | Collation | Nullable | Default
--------+------+-----------+----------+---------
```

For interactive psql, use:

```bash
docker exec -it pg16 psql -U postgres
```
## Connect any application from outside

To connect an application from the host machine, we need to map ports. For this, we'll destroy this container and
create a new one with proper port mapping – noting that `PGDATA` persists (the table we created is still there):

```bash
❯ docker stop pg16
pg16

❯ docker rm pg16
pg16

❯ docker run \
  --detach \
  --name pg16 \
  -e POSTGRES_PASSWORD=secret \
  -v ~/pgdata:/var/lib/postgresql/data \
  --shm-size=128m \
  -p 127.0.0.1:15432:5432 \
  postgres:16
8b5370107e1be7d3fd01a3180999a253c53610ca9ab764125b1512f65e83b927

❯ PGPASSWORD=secret psql -hlocalhost -p15432 -U postgres -c '\d t'
Timing is on.
                Table "public.t"
 Column | Type | Collation | Nullable | Default
--------+------+-----------+----------+---------
```
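Any client on the host can now connect through the mapped port – for example, using a standard connection URI
(credentials as in the examples above):

```bash
psql "postgresql://postgres:secret@127.0.0.1:15432/postgres" -c 'select version()'
```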
## Custom image with additional extensions

For example, here is how we can create our own image, based on the original one, to include `plpython3u` (continuing to
work with the same `PGDATA`):

```bash
docker stop pg16

docker rm pg16

echo "FROM postgres:16
RUN apt update
RUN apt install -y postgresql-plpython3-16" \
  > postgres_plpython3u.Dockerfile

sudo docker build \
  -t postgres-plpython3u:16 \
  -f postgres_plpython3u.Dockerfile \
  .

docker run \
  --detach \
  --name pg16 \
  -e POSTGRES_PASSWORD=secret \
  -v ~/pgdata:/var/lib/postgresql/data \
  --shm-size=128m \
  postgres-plpython3u:16

docker exec -it pg16 \
  psql -U postgres -c 'create extension plpython3u'
```
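To confirm that the extension is now installed:

```bash
docker exec -it pg16 psql -U postgres -c '\dx plpython3u'
```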
## Shared memory

If you see an error like this one:

```
FATAL: could not resize shared memory segment "/PostgreSQL.12345" to 1048576 bytes: No space left on device
```

then increase the `--shm-size` value in the `docker run` command.
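For example, the container can be recreated with a larger value – the same `PGDATA` is reused, so no data is lost:

```bash
docker rm -f pg16

docker run \
  --detach \
  --name pg16 \
  -e POSTGRES_PASSWORD=secret \
  -v ~/pgdata:/var/lib/postgresql/data \
  --shm-size=1g \
  postgres:16
```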
## How to upgrade Postgres preserving data

1) In-place upgrades:

- Traditional Docker images for Postgres include binaries only for one major version, so running `pg_upgrade` is not
  possible unless you extend those images.
- Alternatively, you can use images that include multiple binaries – e.g.,
  [Spilo by Zalando](https://github.com/zalando/spilo).

2) Simple dump/restore (here I show how to downgrade, assuming there are no incompatibilities; an upgrade can be done
   in the same way):

```bash
docker exec -it pg16 pg_dumpall -U postgres \
  | bzip2 > dumpall.bz2

docker rm -f pg16

rm -rf ~/pgdata
mkdir ~/pgdata

docker run \
  --detach \
  --name pg15 \
  -e POSTGRES_PASSWORD=secret \
  -v ~/pgdata:/var/lib/postgresql/data \
  --shm-size=128m \
  postgres:15

bzcat dumpall.bz2 \
  | docker exec -i pg15 psql -U postgres \
  >>dump_load.log \
  2> >(tee -a dump_load.err >&2)
```
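Once the load finishes, it's worth verifying that the data made it – e.g., checking the table created earlier:

```bash
docker exec -it pg15 psql -U postgres -c '\d t'
```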
