layout | title | date | categories | tags | image | ||||
---|---|---|---|---|---|---|---|---|---|
post |
PostgresSQL Clustering the hard way... |
2024-12-07 08:00:00 -0500 |
self-hosted |
postgresql homelab |
|
Today is the day that you start running PostgresSQL in a cluster! In this tutorial we'll be setting up a production ready Postgres cluster that's highly available and fault tolerant using PostgreSQL, etcd, Patroni, HA Proxy, and keepalived. This resilient combination will ensure that you can always reach your database even when a node in the cluster goes down!
{% include embed/youtube.html id='RHwglGf_z40' %} 📺 Watch Video
# ha proxy
192.168.60.100 # haproxy-01
192.168.60.101 # haproxy-02
192.168.60.102 # haproxy-03
# postgres
192.168.60.103 # postgres-01
192.168.60.104 # postgres-02
192.168.60.105 # postgres-03
On postgres nodes install latest postgres.
Install updated postgres repositories.
sudo apt update
sudo apt install -y postgresql-common
sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
Install postgres and additional modules.
sudo apt update
sudo apt install -y postgresql postgresql-contrib
We'll configure postgres later.
Stop and disable the service because Patroni will handle the lifecycle of postgres
sudo systemctl stop postgresql
sudo systemctl disable postgresql
This should be installed on the postgres servers.
Make sure you have wget
and curl
sudo apt update
sudo apt-get install -y wget curl
Find latest release.
https://github.com/etcd-io/etcd/releases
wget https://github.com/etcd-io/etcd/releases/download/v3.5.17/etcd-v3.5.17-linux-amd64.tar.gz
Uncompress and rename.
tar xvf etcd-v3.5.17-linux-amd64.tar.gz
mv etcd-v3.5.17-linux-amd64 etcd
Move all binaries into /usr/local/bin/
for later use.
sudo mv etcd/etcd* /usr/local/bin/
Check etcd
version
etcd --version
Should see something like:
etcd Version: 3.5.17
Git SHA: 507c0deß
Go Version: go1.22.9obs
Go OS/Arch: linux/amd64
etcdctl version
Should see something like:
etcdctl version: 3.5.17
API version: 3.5
Let's create a user for etcd
service.
sudo useradd --system --home /var/lib/etcd --shell /bin/false etcd
Before configuring etcd
, we need to repeat all of these steps for the other 2 nodes.
Let's configure etcd
.
Make dir and edit file.
sudo mkdir -p /etc/etcd
sudo mkdir -p /etc/etcd/ssl
We'll be securing the communication between etcd
nodes and postgres.
On our own machine (not servers!)
Make sure openssl
is installed
Windows
winget install -e --id FireDaemon.OpenSSL
macOS
brew install openssl
Linux
sudo apt install openssl
Verify it's installed and working
openssl version
Should see something like:
OpenSSL 3.4.0 22 Oct 2024 (Library: OpenSSL 3.4.0 22 Oct 2024)
Now let's generate and configure certs
mkdir certs
cd certs
Generate cert authority
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=etcd-ca" -days 7300 -out ca.crt
Generate certificate each node. Note, pay attention to SANS, I am using IP, update with your IP and oh DNS/hostname.
node1
# Generate a private key
openssl genrsa -out etcd-node1.key 2048
# Create temp file for config
cat > temp.cnf <<EOF
[ req ]
distinguished_name = req_distinguished_name
req_extensions = v3_req
[ req_distinguished_name ]
[ v3_req ]
subjectAltName = @alt_names
[ alt_names ]
IP.1 = 192.168.60.103
IP.2 = 127.0.0.1
EOF
# Create a csr
openssl req -new -key etcd-node1.key -out etcd-node1.csr \
-subj "/C=US/ST=YourState/L=YourCity/O=YourOrganization/OU=YourUnit/CN=etcd-node1" \
-config temp.cnf
# Sign the cert
openssl x509 -req -in etcd-node1.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
-out etcd-node1.crt -days 7300 -sha256 -extensions v3_req -extfile temp.cnf
# Verify the cert and be sure you see Subject Name Alternative
openssl x509 -in etcd-node1.crt -text -noout | grep -A1 "Subject Alternative Name"
# Remove temp file
rm temp.cnf
node 2
# Generate a private key
openssl genrsa -out etcd-node2.key 2048
# Create temp file for config
cat > temp.cnf <<EOF
[ req ]
distinguished_name = req_distinguished_name
req_extensions = v3_req
[ req_distinguished_name ]
[ v3_req ]
subjectAltName = @alt_names
[ alt_names ]
IP.1 = 192.168.60.104
IP.2 = 127.0.0.1
EOF
# Create a csr
openssl req -new -key etcd-node2.key -out etcd-node2.csr \
-subj "/C=US/ST=YourState/L=YourCity/O=YourOrganization/OU=YourUnit/CN=etcd-node2" \
-config temp.cnf
# Sign the cert
openssl x509 -req -in etcd-node2.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
-out etcd-node2.crt -days 7300 -sha256 -extensions v3_req -extfile temp.cnf
# Verify the cert and be sure you see Subject Name Alternative
openssl x509 -in etcd-node2.crt -text -noout | grep -A1 "Subject Alternative Name"
# Remove temp file
rm temp.cnf
node3
# Generate a private key
openssl genrsa -out etcd-node3.key 2048
# Create temp file for config
cat > temp.cnf <<EOF
[ req ]
distinguished_name = req_distinguished_name
req_extensions = v3_req
[ req_distinguished_name ]
[ v3_req ]
subjectAltName = @alt_names
[ alt_names ]
IP.1 = 192.168.60.105
IP.2 = 127.0.0.1
EOF
# Create a csr
openssl req -new -key etcd-node3.key -out etcd-node3.csr \
-subj "/C=US/ST=YourState/L=YourCity/O=YourOrganization/OU=YourUnit/CN=etcd-node3" \
-config temp.cnf
#Sign the cert
openssl x509 -req -in etcd-node3.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
-out etcd-node3.crt -days 7300 -sha256 -extensions v3_req -extfile temp.cnf
# Verify the cert and be sure you see Subject Name Alternative
openssl x509 -in etcd-node3.crt -text -noout | grep -A1 "Subject Alternative Name"
# Remove temp file
rm temp.cnf
List all files
ls
Should see:
ca.crt
ca.key
ca.srl
etcd-node1.crt
etcd-node1.csr
etcd-node2.crt
etcd-node2.csr
etcd-node2.key
etcd-node3.crt
etcd-node3.csr
etcd-node3.key
Secure copy (scp
) the certs to each node:
scp ca.crt etcd-node1.crt etcd-node1.key serveradmin@192.168.60.103:/tmp/
scp ca.crt etcd-node2.crt etcd-node2.key serveradmin@192.168.60.104:/tmp/
scp ca.crt etcd-node3.crt etcd-node3.key serveradmin@192.168.60.105:/tmp/
Should see:
ssh
into each node
We need to move certs from /tmp
to ssl location (/etc/etcd/ssl
)
sudo mkdir -p /etc/etcd/ssl
sudo mv /tmp/etcd-node*.crt /etc/etcd/ssl/
sudo mv /tmp/etcd-node*.key /etc/etcd/ssl/
sudo mv /tmp/ca.crt /etc/etcd/ssl/
sudo chown -R etcd:etcd /etc/etcd/
sudo chmod 600 /etc/etcd/ssl/etcd-node*.key
sudo chmod 644 /etc/etcd/ssl/etcd-node*.crt /etc/etcd/ssl/ca.crt
Create our config
sudo nano /etc/etcd/etcd.env
node1
ETCD_NAME="postgresql-01"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_INITIAL_CLUSTER="postgresql-01=https://192.168.60.103:2380,postgresql-02=https://192.168.60.104:2380,postgresql-03=https://192.168.60.105:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.60.103:2380"
ETCD_LISTEN_PEER_URLS="https://0.0.0.0:2380"
ETCD_LISTEN_CLIENT_URLS="https://0.0.0.0:2379"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.60.103:2379"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.crt"
ETCD_CERT_FILE="/etc/etcd/ssl/etcd-node1.crt"
ETCD_KEY_FILE="/etc/etcd/ssl/etcd-node1.key"
ETCD_PEER_CLIENT_CERT_AUTH="true"
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.crt"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/etcd-node1.crt"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/etcd-node1.key"
node2
ETCD_NAME="postgresql-02"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_INITIAL_CLUSTER="postgresql-01=https://192.168.60.103:2380,postgresql-02=https://192.168.60.104:2380,postgresql-03=https://192.168.60.105:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.60.104:2380"
ETCD_LISTEN_PEER_URLS="https://0.0.0.0:2380"
ETCD_LISTEN_CLIENT_URLS="https://0.0.0.0:2379"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.60.104:2379"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.crt"
ETCD_CERT_FILE="/etc/etcd/ssl/etcd-node2.crt"
ETCD_KEY_FILE="/etc/etcd/ssl/etcd-node2.key"
ETCD_PEER_CLIENT_CERT_AUTH="true"
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.crt"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/etcd-node2.crt"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/etcd-node2.key"
node 3
ETCD_NAME="postgresql-03"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_INITIAL_CLUSTER="postgresql-01=https://192.168.60.103:2380,postgresql-02=https://192.168.60.104:2380,postgresql-03=https://192.168.60.105:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.60.105:2380"
ETCD_LISTEN_PEER_URLS="https://0.0.0.0:2380"
ETCD_LISTEN_CLIENT_URLS="https://0.0.0.0:2379"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.60.105:2379"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.crt"
ETCD_CERT_FILE="/etc/etcd/ssl/etcd-node3.crt"
ETCD_KEY_FILE="/etc/etcd/ssl/etcd-node3.key"
ETCD_PEER_CLIENT_CERT_AUTH="true"
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.crt"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/etcd-node3.crt"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/etcd-node3.key"
Now let's create a service for etcd on all 3 nodes
sudo nano /etc/systemd/system/etcd.service
Contents of service file, same for all 3 nodes
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd
EnvironmentFile=/etc/etcd/etcd.env
ExecStart=/usr/local/bin/etcd
Restart=always
RestartSec=10s
LimitNOFILE=40000
User=etcd
Group=etcd
[Install]
WantedBy=multi-user.target
We need to create a directory for etcd
ETCD_DATA_DIR
defined in service file.
sudo mkdir -p /var/lib/etcd
sudo chown -R etcd:etcd /var/lib/etcd
Reload daemon and enable the service
sudo systemctl daemon-reload
sudo systemctl enable etcd
Start etcd and check status
this looks like it's hanging on the first node but once you add another node it will complete
sudo systemctl start etcd
sudo systemctl status etcd
Should see something like:
● etcd.service - etcd key-value store
Loaded: loaded (/etc/systemd/system/etcd.service; enabled; preset: enabled)
Active: active (running) since Mon 2024-12-02 14:09:30 CST; 2s ago
Docs: https://github.com/etcd-io/etcd
Main PID: 7266 (etcd)
Tasks: 9 (limit: 4612)
Memory: 29.3M (peak: 30.0M)
CPU: 246ms
CGroup: /system.slice/etcd.service
└─7266 /usr/local/bin/etcd
You can also check the logs for etcd
by running:
journalctl -xeu etcd.service
Should see something like:
░░ Subject: A start job for unit etcd.service has finished successfully
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ A start job for unit etcd.service has finished successfully.
░░
░░ The job identifier is 5757.
Dec 02 14:09:30 postgres-01 etcd[7266]: {"level":"info","ts":"2024-12-02T14:09:30.862122-0600","caller":"v3rpc/health.go:61","msg":"grpc service status changed","service":"","status":"SERVING"}
Dec 02 14:09:30 postgres-01 etcd[7266]: {"level":"info","ts":"2024-12-02T14:09:30.862347-0600","caller":"embed/serve.go:187","msg":"serving client traffic insecurely; this is strongly discouraged!","traffic":"grpc+http","address":"[::]:2379"}
Once cluster is running, we should verify it's working on each by running
etcdctl endpoint health
etcdctl member list
Should see something like on node1:
127.0.0.1:2379 is healthy: successfully committed proposal: took = 1.786976ms
eb8ee14ab5150b4, started, postgresql-01, http://192.168.60.103:2380, http://192.168.60.103:2379, false
34e89b244664f02d, started, postgresql-02, http://192.168.60.104:2380, http://192.168.60.104:2379, false
8ee2a9473a41c400, started, postgresql-03, http://192.168.60.105:2380, http://192.168.60.105:2379, false
Do this for all 3 nodes
Now we should now edit the variables since the cluster is bootstrapped
Then restart etcd and verify the cluster is still working
sudo systemctl restart etcd
If you want to run the following commands without sudo, you can run:
sudo usermod -aG etcd $USER
sudo etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/etcd/ssl/ca.crt \
--cert=/etc/etcd/ssl/etcd-node1.crt \
--key=/etc/etcd/ssl/etcd-node1.key \
endpoint health
sudo etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/etcd/ssl/ca.crt \
--cert=/etc/etcd/ssl/etcd-node1.crt \
--key=/etc/etcd/ssl/etcd-node1.key \
member list
Should see something like
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 4.611121ms
59afb19d7cb2565d, started, postgresql-01, https://192.168.60.103:2380, https://192.168.60.103:2379, false
6338565ebcb76aa2, started, postgresql-02, https://192.168.60.104:2380, https://192.168.60.104:2379, false
9d74b3125c745c74, started, postgresql-03, https://192.168.60.105:2380, https://192.168.60.105:2379, false
You can check to see who is leader by running
sudo etcdctl \
--endpoints=https://192.168.60.103:2379,https://192.168.60.104:2379,https://192.168.60.105:2379 \
--cacert=/etc/etcd/ssl/ca.crt \
--cert=/etc/etcd/ssl/etcd-node1.crt \
--key=/etc/etcd/ssl/etcd-node1.key \
endpoint status --write-out=table
You should see something like this, specifically one of the nodes IS LEADER = true
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://192.168.60.103:2379 | 59afb19d7cb2565d | 3.5.17 | 20 kB | true | false | 2 | 12 | 12 | |
| https://192.168.60.104:2379 | 6338565ebcb76aa2 | 3.5.17 | 20 kB | false | false | 2 | 12 | 12 | |
| https://192.168.60.105:2379 | 9d74b3125c745c74 | 3.5.17 | 20 kB | false | false | 2 | 12 | 12 | |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Once this is all set up and working, we can now configure postgres
and patroni
.
We need to create some dirs for postgres
sudo mkdir -p /var/lib/postgresql/data
sudo mkdir -p /var/lib/postgresql/ssl
Notice we are using ssl, we need to generate a certificate
Generate a self-signed certificate
(this will last 20 years)
openssl genrsa -out server.key 2048 # private key
openssl req -new -key server.key -out server.req # csr
openssl req -x509 -key server.key -in server.req -out server.crt -days 7300 # generate cert, valid for 20 years
Update permissions
chmod 600 server.key
Move files to cert location
sudo mv server.crt server.key server.req /var/lib/postgresql/ssl
After doing this, we need to copy this to our other servers
Copy files locally from (your local machine!) to node1, node2, node3 using scp
scp
them to node1, node2, and node3
scp server.crt server.key server.req serveradmin@192.168.60.103:/tmp
scp server.crt server.key server.req serveradmin@192.168.60.104:/tmp
scp server.crt server.key server.req serveradmin@192.168.60.105:/tmp
On the servers (not your local machine) move files
cd /tmp
sudo mv server.crt server.key server.req /var/lib/postgresql/ssl
Update permissions on certificate
sudo chmod 600 /var/lib/postgresql/ssl/server.key
sudo chmod 644 /var/lib/postgresql/ssl/server.crt
sudo chmod 600 /var/lib/postgresql/ssl/server.req
sudo chown postgres:postgres /var/lib/postgresql/data
sudo chown postgres:postgres /var/lib/postgresql/ssl/server.*
We will need to give the postgres user read access to the etcd certificates using acls
sudo apt update
sudo apt install -y acl
node1
sudo setfacl -m u:postgres:r /etc/etcd/ssl/ca.crt
sudo setfacl -m u:postgres:r /etc/etcd/ssl/etcd-node1.crt
sudo setfacl -m u:postgres:r /etc/etcd/ssl/etcd-node1.key
node2
sudo setfacl -m u:postgres:r /etc/etcd/ssl/ca.crt
sudo setfacl -m u:postgres:r /etc/etcd/ssl/etcd-node2.crt
sudo setfacl -m u:postgres:r /etc/etcd/ssl/etcd-node2.key
node3
sudo setfacl -m u:postgres:r /etc/etcd/ssl/ca.crt
sudo setfacl -m u:postgres:r /etc/etcd/ssl/etcd-node3.crt
sudo setfacl -m u:postgres:r /etc/etcd/ssl/etcd-node3.key
Now that we have postgres certs configured, it's now time to configure patroni to operate or drive postgres
sudo apt install -y patroni
ssh
into node1
Make a dir for patroni
sudo mkdir -p /etc/patroni/
Create a config file and edit
sudo nano /etc/patroni/config.yml
node1
scope: postgresql-cluster
namespace: /service/
name: postgresql-01 # node1
etcd3:
hosts: 192.168.60.103:2379,192.168.60.104:2379,192.168.60.105:2379 # etcd cluster nodes
protocol: https
cacert: /etc/etcd/ssl/ca.crt
cert: /etc/etcd/ssl/etcd-node1.crt # node1's etcd certificate
key: /etc/etcd/ssl/etcd-node1.key # node1's etcd key
restapi:
listen: 0.0.0.0:8008
connect_address: 192.168.60.103:8008 # IP for node1's REST API
certfile: /var/lib/postgresql/ssl/server.pem
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576 # Failover parameters
postgresql:
parameters:
ssl: 'on' # Enable SSL
ssl_cert_file: /var/lib/postgresql/ssl/server.crt # PostgreSQL server certificate
ssl_key_file: /var/lib/postgresql/ssl/server.key # PostgreSQL server key
pg_hba: # Access rules
- hostssl replication replicator 127.0.0.1/32 md5
- hostssl replication replicator 192.168.60.103/32 md5
- hostssl replication replicator 192.168.60.104/32 md5
- hostssl replication replicator 192.168.60.105/32 md5
- hostssl all all 127.0.0.1/32 md5
- hostssl all all 0.0.0.0/0 md5
initdb:
- encoding: UTF8
- data-checksums
postgresql:
listen: 0.0.0.0:5432
connect_address: 192.168.60.103:5432 # IP for node1's PostgreSQL
data_dir: /var/lib/postgresql/data
bin_dir: /usr/lib/postgresql/17/bin # Binary directory for PostgreSQL 17
authentication:
superuser:
username: postgres
password: cnV2abjbDpbh64e12987wR4mj5kQ3456Y0Qf # Superuser password - be sure to change
replication:
username: replicator
password: sad9a23jga8jsuedrwtsskj74567suiuwe23 # Replication password - be sure to change
parameters:
max_connections: 100
shared_buffers: 256MB
tags:
nofailover: false
noloadbalance: false
clonefrom: false
node2
scope: postgresql-cluster
namespace: /service/
name: postgresql-02 # Unique name for Node 2
etcd3:
hosts: 192.168.60.103:2379,192.168.60.104:2379,192.168.60.105:2379 # etcd cluster nodes
protocol: https
cacert: /etc/etcd/ssl/ca.crt
cert: /etc/etcd/ssl/etcd-node2.crt # Node 2's etcd certificate
key: /etc/etcd/ssl/etcd-node2.key # Node 2's etcd key
restapi:
listen: 0.0.0.0:8008
connect_address: 192.168.60.104:8008 # IP for Node 2's REST API
certfile: /var/lib/postgresql/ssl/server.pem
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
postgresql:
parameters:
ssl: 'on'
ssl_cert_file: /var/lib/postgresql/ssl/server.crt
ssl_key_file: /var/lib/postgresql/ssl/server.key
pg_hba:
- hostssl replication replicator 127.0.0.1/32 md5
- hostssl replication replicator 192.168.60.103/32 md5
- hostssl replication replicator 192.168.60.104/32 md5
- hostssl replication replicator 192.168.60.105/32 md5
- hostssl all all 127.0.0.1/32 md5
- hostssl all all 0.0.0.0/0 md5
initdb:
- encoding: UTF8
- data-checksums
postgresql:
listen: 0.0.0.0:5432
connect_address: 192.168.60.104:5432 # IP for Node 2's PostgreSQL
data_dir: /var/lib/postgresql/data
bin_dir: /usr/lib/postgresql/17/bin
authentication:
superuser:
username: postgres
password: cnV2abjbDpbh64e12987wR4mj5kQ3456Y0Qf # Superuser password (provided)
replication:
username: replicator
password: sad9a23jga8jsuedrwtsskj74567suiuwe23 # Replication password (provided)
parameters:
max_connections: 100
shared_buffers: 256MB
tags:
nofailover: false
noloadbalance: false
clonefrom: false
node3
scope: postgresql-cluster
namespace: /service/
name: postgresql-03 # Unique name for Node 3
etcd3:
hosts: 192.168.60.103:2379,192.168.60.104:2379,192.168.60.105:2379 #etcd cluster nodes
protocol: https
cacert: /etc/etcd/ssl/ca.crt
cert: /etc/etcd/ssl/etcd-node3.crt # Node 3's etcd certificate
key: /etc/etcd/ssl/etcd-node3.key # Node 3's etcd key
restapi:
listen: 0.0.0.0:8008
connect_address: 192.168.60.105:8008 # IP for Node 3's REST API
certfile: /var/lib/postgresql/ssl/server.pem
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
postgresql:
parameters:
ssl: 'on'
ssl_cert_file: /var/lib/postgresql/ssl/server.crt
ssl_key_file: /var/lib/postgresql/ssl/server.key
pg_hba:
- hostssl replication replicator 127.0.0.1/32 md5
- hostssl replication replicator 192.168.60.103/32 md5
- hostssl replication replicator 192.168.60.104/32 md5
- hostssl replication replicator 192.168.60.105/32 md5
- hostssl all all 127.0.0.1/32 md5
- hostssl all all 0.0.0.0/0 md5
initdb:
- encoding: UTF8
- data-checksums
postgresql:
listen: 0.0.0.0:5432
connect_address: 192.168.60.105:5432 # IP for Node 3's PostgreSQL
data_dir: /var/lib/postgresql/data
bin_dir: /usr/lib/postgresql/17/bin
authentication:
superuser:
username: postgres
password: "cnV2abjbDpbh64e12987wR4mj5kQ3456Y0Qf" # Superuser password (provided)
replication:
username: replicator
password: sad9a23jga8jsuedrwtsskj74567suiuwe23 # Replication password (provided)
parameters:
max_connections: 100
shared_buffers: 256MB
tags:
nofailover: false
noloadbalance: false
clonefrom: false
Let's also use a certificate for this, requires a pem
sudo sh -c 'cat /var/lib/postgresql/ssl/server.crt /var/lib/postgresql/ssl/server.key > /var/lib/postgresql/ssl/server.pem'
sudo chown postgres:postgres /var/lib/postgresql/ssl/server.pem
sudo chmod 600 /var/lib/postgresql/ssl/server.pem
We can verify with:
sudo openssl x509 -in /var/lib/postgresql/ssl/server.pem -text -noout
Restart the service
sudo systemctl restart patroni
Check logs
journalctl -u patroni -f
Should see something like:
Dec 03 22:16:05 postgres-01 patroni[770]: 2024-12-03 22:16:05,399 INFO: no action. I am (postgresql-01), the leader with the lock
Dec 03 22:16:15 postgres-01 patroni[770]: 2024-12-03 22:16:15,399 INFO: no action. I am (postgresql-01), the leader with the lock
and
Dec 03 22:16:21 postgres-02 patroni[768]: 2024-12-03 22:16:21,780 INFO: Lock owner: postgresql-01; I am postgresql-02
Dec 03 22:16:21 postgres-02 patroni[768]: 2024-12-03 22:16:21,823 INFO: bootstrap from leader 'postgresql-01' in progress
sudo nano /etc/etcd/etcd.env
Change
ETCD_INITIAL_CLUSTER_STATE="new"
to
ETCD_INITIAL_CLUSTER_STATE="existing"
Do this on all 3 nodes!
We now have a ha postgres cluster!
However we don't always know who the leader is so we can't use an IP
We can test the patroni endpoint to see who is leader
curl -k https://192.168.60.103:8008/primary
curl -k https://192.168.60.104:8008/primary
curl -k https://192.168.60.105:8008/primary
If you ever want to see your global config you can with
sudo patronictl -c /etc/patroni/config.yml show-config
If you ever want to edit it, you can with:
sudo patronictl -c /etc/patroni/config.yml edit-config
After saving these will be replicated to all nodes, note you might want to update your boostrap config at some pouint
this is where ha proxy comes in
switch nodes
install ha proxy
sudo apt -y install haproxy
Once installed we need to add some config
sudo nano /etc/haproxy/haproxy.cfg
frontend postgres_frontend
bind *:5432
mode tcp
default_backend postgres_backend
backend postgres_backend
mode tcp
option tcp-check
option httpchk GET /primary # patroni provides an endpoint to check node roles
http-check expect status 200 # expect 200 for the primary node
timeout connect 5s
timeout client 30s
timeout server 30s
server postgresql-01 192.168.60.103:5432 port 8008 check check-ssl verify none
server postgresql-02 192.168.60.104:5432 port 8008 check check-ssl verify none
server postgresql-03 192.168.60.105:5432 port 8008 check check-ssl verify none
Do this for all 3 nodes
restart
sudo systemctl reload haproxy
check logs
sudo tail -f /var/log/syslog | grep haproxy
Now we need to install keepalived to create a VIP
sudo apt update
sudo apt install keepalived -y
Now we need to apply a configuration file
sudo nano /etc/keepalived/keepalived.conf
haproxy1
global_defs {
enable_script_security
script_user keepalived_script
}
vrrp_script check_haproxy {
script "/etc/keepalived/check_haproxy.sh"
interval 2
fall 3
rise 2
}
vrrp_instance VI_1 {
state MASTER
interface eth0 # update with your nic
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass tDHjh7by # change
}
virtual_ipaddress {
192.168.60.110
}
track_script {
check_haproxy
}
}
haproxy2
global_defs {
enable_script_security
script_user keepalived_script
}
vrrp_script check_haproxy {
script "/etc/keepalived/check_haproxy.sh"
interval 2
fall 3
rise 2
}
vrrp_instance VI_1 {
state BACKUP
interface eth0 # update with your nic
virtual_router_id 51
priority 90
advert_int 1
authentication {
auth_type PASS # change
auth_pass tDHjh7by # change
}
virtual_ipaddress {
192.168.60.110
}
track_script {
check_haproxy
}
}
haproxy3
global_defs {
enable_script_security
script_user keepalived_script
}
vrrp_script check_haproxy {
script "/etc/keepalived/check_haproxy.sh"
interval 2
fall 3
rise 2
}
vrrp_instance VI_1 {
state BACKUP
interface eth0 # update with your nic
virtual_router_id 51
priority 80
advert_int 1
authentication {
auth_type PASS # change
auth_pass tDHjh7by # change
}
virtual_ipaddress {
192.168.60.110
}
track_script {
check_haproxy
}
}
Create a check script on each
sudo nano /etc/keepalived/check_haproxy.sh
#!/bin/bash
# Define the port to check (e.g., HAProxy frontend port)
PORT=5432
# Check if HAProxy is running
if ! pidof haproxy > /dev/null; then
echo "HAProxy is not running"
exit 1
fi
# Check if HAProxy is listening on the expected port
if ! ss -ltn | grep -q ":${PORT}"; then
echo "HAProxy is not listening on port ${PORT}"
exit 2
fi
# All checks passed
exit 0
we need to add a user to execute these scripts
sudo useradd -r -s /bin/false keepalived_script
sudo chmod +x /etc/keepalived/check_haproxy.sh
sudo chown keepalived_script:keepalived_script /etc/keepalived/check_haproxy.sh
sudo chmod 700 /etc/keepalived/check_haproxy.sh
sudo systemctl restart keepalived
Check logs
sudo journalctl -u keepalived -f
we should not be able to ping the VIP
ping 192.168.60.110
Connected with a client
Connect to your VIP 192.168.60.110
and use the postgres
user and password.
Create a table with data
-- Create a table for Nintendo characters
CREATE TABLE nintendo_characters (
character_id SERIAL PRIMARY KEY, -- Unique identifier for each character
name VARCHAR(50) NOT NULL, -- Name of the character
game_series VARCHAR(50), -- Game series the character belongs to
debut_year INT, -- Year the character debuted
description TEXT, -- Brief description of the character
is_playable BOOLEAN DEFAULT TRUE -- Whether the character is playable
);
-- Insert some example characters
INSERT INTO nintendo_characters (name, game_series, debut_year, description, is_playable)
VALUES
('Mario', 'Super Mario', 1981, 'The iconic plumber and hero of the Mushroom Kingdom.', TRUE),
('Link', 'The Legend of Zelda', 1986, 'A courageous hero tasked with saving Hyrule.', TRUE),
('Samus Aran', 'Metroid', 1986, 'A bounty hunter equipped with a powerful Power Suit.', TRUE),
('Donkey Kong', 'Donkey Kong', 1981, 'A powerful gorilla and protector of the jungle.', TRUE),
('Princess Zelda', 'The Legend of Zelda', 1986, 'The princess of Hyrule and possessor of the Triforce of Wisdom.', FALSE),
('Bowser', 'Super Mario', 1985, 'The King of the Koopas and Marios arch-nemesis.', FALSE),
('Kirby', 'Kirby', 1992, 'A pink puffball with the ability to inhale enemies and copy their powers.', TRUE),
('Pikachu', 'Pokémon', 1996, 'An Electric-type Pokémon and mascot of the Pokémon series.', TRUE),
('Fox McCloud', 'Star Fox', 1993, 'A skilled pilot and leader of the Star Fox team.', TRUE),
('Captain Falcon', 'F-Zero', 1990, 'A bounty hunter and expert racer known for his Falcon Punch.', TRUE);
-- Select all rows to verify the table creation and data insertion
SELECT * FROM nintendo_characters;
SELECT * FROM nintendo_characters;
insert more characters
-- Insert additional Nintendo characters into the table
INSERT INTO nintendo_characters (name, game_series, debut_year, description, is_playable)
VALUES
('Yoshi', 'Super Mario', 1990, 'A friendly green dinosaur and Marios trusted companion.', TRUE),
('Luigi', 'Super Mario', 1983, 'Marios younger brother and a skilled ghost hunter.', TRUE),
('King Dedede', 'Kirby', 1992, 'The self-proclaimed king of Dream Land and occasional ally of Kirby.', TRUE),
('Meta Knight', 'Kirby', 1993, 'A mysterious swordsman who often challenges Kirby.', TRUE),
('Marth', 'Fire Emblem', 1990, 'A legendary hero and prince from the Fire Emblem series.', TRUE),
('Ness', 'EarthBound', 1994, 'A young boy with psychic powers and a bat-wielding hero.', TRUE),
('Jigglypuff', 'Pokémon', 1996, 'A Balloon Pokémon known for its singing abilities.', TRUE),
('Villager', 'Animal Crossing', 2001, 'A customizable character from the Animal Crossing series.', TRUE),
('Isabelle', 'Animal Crossing', 2012, 'A cheerful assistant who helps manage your town.', TRUE),
('Ganondorf', 'The Legend of Zelda', 1998, 'The King of Evil and nemesis of Link.', TRUE);
SELECT * FROM nintendo_characters;
If you want to clean this data up, you can DROP
the table
-- Drop the nintendo_characters table if it exists
DROP TABLE IF EXISTS nintendo_characters;
Reset all data but certs and start over
sudo systemctl stop patroni
sudo systemctl stop etcd
sudo rm -rf /var/lib/etcd/
sudo rm -rf /var/lib/postgresql/data/
sudo mkdir -p /var/lib/postgresql/data
sudo mkdir -p /var/lib/etcd/
sudo chown etcd:etcd /var/lib/etcd/
sudo setfacl -m u:postgres:r /etc/etcd/ssl/ca.crt
sudo setfacl -m u:postgres:r /etc/etcd/ssl/etcd-node*
sudo systemctl start etcd
sudo systemctl start patroni
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>Today, I teach you how you can self-host a production-ready PostgreSQL cluster. I've even included all of the copy pasta commands!https://t.co/OGHUGfDEs2 pic.twitter.com/9NWtYcWcO6
— Techno Tim (@TechnoTimLive) December 7, 2024
🛍️ Check out the new Merch Shop at https://l.technotim.live/shop
⚙️ See all the hardware I recommend at https://l.technotim.live/gear
🚀 Don't forget to check out the 🚀Launchpad repo with all of the quick start source files
🤝 Support me and help keep this site ad-free!