This document explains how we migrated from PostgreSQL 9.6 to PostgreSQL 17 with minimal downtime using pglogical (logical replication). It covers:
- Overview
- Prerequisites & Setup
- Migration Strategy
- Step-by-Step Process
- Sequences After Cutover
- Common Pitfalls
- Conclusion
- References
- Old Server: PostgreSQL 9.6 (EOL), installed around 2017, IP 10.10.10.50, database acme_db.
- New Server: Ubuntu 24.04 running PostgreSQL 17, IP 10.10.10.60, database acme_db.
- Goal: Near-zero-downtime upgrade to a fully supported PostgreSQL version on a brand-new OS.
Why pglogical?
- Logical replication works across major versions (9.6 → 17).
- Allows continuous data synchronization, drastically reducing downtime.
- We needed a new machine, not an in-place upgrade, so streaming replication or pg_upgrade wasn’t ideal.
On the old server:
- Install pglogical (compile from source if not in the repos).
- Edit postgresql.conf:
  wal_level = logical
  max_wal_senders = 10
  max_replication_slots = 10
- Update pg_hba.conf to allow connections from the new server's IP (10.10.10.60):
  host    replication    repuser    10.10.10.60/32    md5
  host    all            repuser    10.10.10.60/32    md5
- Create a replication user with superuser:
  CREATE ROLE repuser WITH LOGIN SUPERUSER PASSWORD 'secret';
- Restart PostgreSQL 9.6 to apply the changes.
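After the restart, it's worth confirming the new settings actually took effect; a quick check from psql on the old server:
-- On the old server (9.6)
SHOW wal_level;               -- expect 'logical'
SHOW max_wal_senders;         -- expect 10
SHOW max_replication_slots;   -- expect 10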
On the new server (Ubuntu 24.04):
- Install PostgreSQL 17 + postgresql-17-pglogical.
- Create database acme_db.
- Confirm you can connect via psql.
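For reference, a minimal install sketch; it assumes the PGDG apt repository is already configured on Ubuntu 24.04 and that you have sudo access, so adapt it to your setup:
# On the new server (Ubuntu 24.04), with the PGDG apt repo configured
sudo apt install postgresql-17 postgresql-17-pglogical
sudo -u postgres createdb acme_db
sudo -u postgres psql -d acme_db -c 'SELECT version();'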
- Dump & restore the schema (no data) so both databases share the same structure.
- (Optional) Pre-copy large tables manually to avoid re-copying if an error occurs during initial sync.
- Use pglogical to replicate the remaining tables, so changes stream continuously to the new server.
- Cut over after verifying the data is in sync, keeping downtime to a brief maintenance window.
First, take a schema-only dump on the old server, then restore it on the new server:
# On old server (9.6)
pg_dump -h 10.10.10.50 -U repuser -s -d acme_db > acme_schema.sql
# On new server (17)
psql -h 10.10.10.60 -U repuser -d acme_db -f acme_schema.sql
This ensures all tables, indexes, and constraints match in both environments.
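As a quick sanity check (not strictly required), you can compare table counts on the two servers before starting replication:
-- Run on both servers; the numbers should match
SELECT count(*) AS table_count
FROM information_schema.tables
WHERE table_schema = 'public' AND table_type = 'BASE TABLE';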
If you have very large tables (tens of GB or more), consider dumping & restoring them manually:
# On old server
pg_dump -h 10.10.10.50 -U repuser -d acme_db --data-only -t public.large_table > large_table_data.sql
# Copy to new server, then restore:
psql -h 10.10.10.60 -U repuser -d acme_db -f large_table_data.sql
Later, you’ll add these tables to the replication set with synchronize_data = false, so pglogical will not do a big initial copy.
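If plain SQL dumps are unwieldy at this size, a compressed custom-format dump is an alternative worth considering; a minimal sketch using standard pg_dump/pg_restore options (the file name is just an example):
# On old server: custom-format (compressed) dump of a single table's data
pg_dump -h 10.10.10.50 -U repuser -d acme_db -Fc --data-only -t public.large_table -f large_table.dump
# Copy the file to the new server, then restore it there:
pg_restore -h 10.10.10.60 -U repuser -d acme_db large_table.dump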
On the old server (9.6), in acme_db:
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
node_name := 'acme_provider',
dsn := 'host=10.10.10.50 dbname=acme_db user=repuser password=secret'
);
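If you want to confirm the node was registered, pglogical keeps its configuration in tables inside the pglogical schema; for example:
-- On the old server, inside acme_db:
SELECT * FROM pglogical.node;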
Add tables to your replication set:
-- Add all tables:
SELECT pglogical.replication_set_add_all_tables(
set_name := 'default',
schema_names := ARRAY['public'],
synchronize_data := true
);
-- If you've pre-copied "large_table", remove it from auto-sync:
SELECT pglogical.replication_set_remove_table('default', 'public.large_table');
-- Re-add it with no initial copy:
SELECT pglogical.replication_set_add_table(
'default',
'public.large_table',
synchronize_data := false
);
Repeat for any other large tables you’ve manually restored.
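To double-check which tables ended up in the set before creating the subscription, you can inspect pglogical's catalog tables (names as shipped with the extension):
-- On the old server, inside acme_db:
SELECT rs.set_name, rst.set_reloid::regclass AS table_name
FROM pglogical.replication_set_table rst
JOIN pglogical.replication_set rs ON rs.set_id = rst.set_id;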
On the new server (17), in acme_db:
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
node_name := 'acme_subscriber',
dsn := 'host=10.10.10.60 dbname=acme_db user=repuser password=secret'
);
Create the subscription on the new server, telling it to pull from the old server:
SELECT pglogical.create_subscription(
subscription_name := 'acme_subscription',
provider_dsn := 'host=10.10.10.50 dbname=acme_db user=repuser password=secret',
replication_sets := ARRAY['default'],
synchronize_data := true
);
Now, smaller tables do a full initial copy, while any table added with synchronize_data = false only gets incremental changes.
Monitor:
SELECT * FROM pglogical.show_subscription_status();
Look for status = 'replicating'. If it’s down, check logs on both servers for foreign key issues, missing tables, or WAL retention problems.
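If you also want per-table visibility into the initial copy, pglogical records sync state locally on the subscriber; a query along these lines (table name as shipped with the extension) can help:
-- On the new server (subscriber):
SELECT sync_nspname, sync_relname, sync_status
FROM pglogical.local_sync_status;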
Once fully synchronized:
- Stop writes on the old server (e.g., put the app in maintenance mode).
- Wait a few seconds for the last WAL to replicate.
- Point your application to the new server (10.10.10.60).
You’re now running on PostgreSQL 17 with minimal downtime!
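Before flipping the application over, you can sanity-check that the subscriber has caught up by looking at pg_stat_replication on the old server (logical walsenders show up there too; the column names below are the 9.6 ones, newer versions use *_lsn):
-- On the old server (9.6):
SELECT application_name, state, sent_location, replay_location
FROM pg_stat_replication;
-- state should be 'streaming', and sent_location/replay_location should match
-- once writes have stopped.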
A common issue is that sequence values might be out of sync with the table’s current MAX(id). If your newly loaded table’s highest id is 100,000 but the sequence’s internal counter is only 5,000, you’ll get duplicate key errors.
Solution: You can find all tables and columns that depend on sequences with the following query:
SELECT
t.relname AS table_name,
a.attname AS column_name,
s.relname AS sequence_name
FROM pg_class s
JOIN pg_depend d ON d.objid = s.oid
JOIN pg_class t ON d.refobjid = t.oid
JOIN pg_attribute a
ON a.attrelid = t.oid
AND d.refobjsubid = a.attnum
WHERE s.relkind = 'S'
AND t.relkind IN ('r','p')
AND d.deptype = 'a'
ORDER BY t.relname;
You can then reset each sequence with a query like this:
SELECT setval('mytable_id_seq', (SELECT COALESCE(MAX(id), 0) FROM mytable) + 1, false);
Run a query like this for each table that uses a serial/bigserial column, or use a PL/pgSQL block to loop through all sequences automatically.
If there are too many tables to handle one by one, you can use the following script:
DO $$
DECLARE
rec record;
current_max bigint;
BEGIN
FOR rec IN
-- This query finds every (table_name, column_name, sequence_name) triple
-- where the sequence is "owned" by that column (typical of SERIAL/BIGSERIAL).
SELECT t.relname AS table_name,
a.attname AS column_name,
s.relname AS sequence_name
FROM pg_class s
JOIN pg_depend d ON d.objid = s.oid
JOIN pg_class t ON d.refobjid = t.oid
JOIN pg_attribute a ON a.attrelid = t.oid AND d.refobjsubid = a.attnum
WHERE s.relkind = 'S' -- S = sequence
AND t.relkind IN ('r', 'p') -- r = ordinary table, p = partitioned table
AND a.attname = 'id' -- only fix sequences for "id" columns (adjust if needed)
AND d.deptype = 'a' -- 'a' = automatic dependency (SERIAL/IDENTITY)
ORDER BY t.relname
LOOP
-- Get the current max(id) from the table
EXECUTE format('SELECT max(%I)::bigint FROM %I', rec.column_name, rec.table_name)
INTO current_max;
IF current_max IS NULL THEN
current_max := 0;
END IF;
-- Bump the sequence to max(id)+1 so the next insert doesn't conflict
EXECUTE format('SELECT setval(%L, %s, false)', rec.sequence_name, current_max + 1);
RAISE NOTICE 'Fixed sequence for table: %, column: %, sequence: %, set to %',
rec.table_name, rec.column_name, rec.sequence_name, current_max + 1;
END LOOP;
END $$;
- Foreign Key Failures: Load parent tables first, or disable constraints if you see FK errors during the initial copy (see the sketch after this list).
- WAL Retention: Make sure the old server retains enough WAL until the subscriber finishes its initial sync. On 9.6 the relevant setting is wal_keep_segments (wal_keep_size only exists on newer versions); the replication slot also holds WAL for a lagging subscriber, so watch disk space with large data sets.
- “Relation Already Exists”: If the table is already defined, use --data-only to avoid DDL conflicts.
- Subscription Down: Usually indicates mismatched schemas, missing indexes, or insufficient privileges. Check server logs carefully.
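For the foreign-key case above, one option when loading data manually is to run the restore in a session that skips trigger-based constraint enforcement. This uses the standard session_replication_role setting (superuser only, and use with care, since FK validation is skipped for anything loaded this way):
-- In the session doing the manual data load:
SET session_replication_role = replica;   -- ordinary triggers, including FK checks, won't fire
-- ... run the data-only restore here ...
SET session_replication_role = DEFAULT;   -- back to normal enforcement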
By following these steps, we achieved a near-zero-downtime migration from a PostgreSQL 9.6 server (installed around 2017) to a PostgreSQL 17 instance on a modern Ubuntu 24.04 environment. We pre-copied large tables, set up logical replication for the rest, handled sequences post-cutover, and avoided extended service disruption.
- Carefully plan the order of data loading if foreign keys are involved.
- Pre-copy large tables to reduce restart overhead.
- Remember to bump sequence values to avoid insert conflicts.
- Monitor logs on both servers to troubleshoot quickly.
This ensures you can retire an old EOL database version, reduce technical debt, and move to a fully supported, more performant environment.