Clone a PostgreSQL data directory locally using reflink copies.

    python3 pg_refclone.py ORIGINAL_DIR CLONE_DIR

Example:

    python3 pg_refclone.py /var/lib/pgsql/18/data /var/lib/pgsql/18/clone

Then start the clone as a standby:

    pg_ctl -D /var/lib/pgsql/18/clone -o "-p 5433" start

- PostgreSQL 15+ (uses the `pg_backup_start`/`pg_backup_stop` API)
- Python 3.14+
- psycopg 3
- reflink-capable filesystem (e.g., btrfs, XFS)
- The `pg_receivewal` binary must be in the same directory as the `postgres` binary

- Connect to the source PostgreSQL instance using the socket from `postmaster.pid`
- Create a physical replication slot with a unique random name (`pg_refclone_xxxxxx`)
- Start a backup with `pg_backup_start()`
- Copy the data directory (using reflinks)
- Stop the backup with `pg_backup_stop()`
- Advance the replication slot to the backup start LSN
- Stream the required WAL files using `pg_receivewal`
- Drop the replication slot

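
The sequence above can be sketched roughly as follows. This is an illustration, not the code of `pg_refclone.py`: the names `read_socket_info`, `backup_steps`, `copy_data_dir`, and `stream_wal` are hypothetical, and the sketch assumes `cur` is a psycopg 3 cursor on a connection opened with `autocommit=True`. The line positions read from `postmaster.pid` (port on line 4, socket directory on line 5) follow the PostgreSQL lock file layout.

```python
from pathlib import Path


def read_socket_info(data_dir):
    """Return (socket_dir, port) from postmaster.pid in data_dir."""
    lines = (Path(data_dir) / "postmaster.pid").read_text().splitlines()
    # Line 4 holds the port, line 5 the Unix socket directory.
    return lines[4].strip(), int(lines[3])


def backup_steps(cur, slot, copy_data_dir, stream_wal):
    """Run the SQL side of a clone in the order listed above.

    cur           -- cursor-like object with execute() and fetchone()
    slot          -- replication slot name
    copy_data_dir -- hypothetical callback that copies the data directory
    stream_wal    -- hypothetical callback(start_lsn, stop_lsn) that runs
                     pg_receivewal and returns once it has exited
    """
    # Second argument (immediately_reserve) makes the slot retain WAL at once.
    cur.execute("SELECT pg_create_physical_replication_slot(%s, true)", (slot,))
    try:
        # Second argument requests a fast checkpoint.
        cur.execute("SELECT pg_backup_start(%s, true)", ("pg_refclone",))
        start_lsn = cur.fetchone()[0]

        copy_data_dir()

        # pg_backup_stop() must run in the same session as pg_backup_start().
        cur.execute("SELECT lsn, labelfile FROM pg_backup_stop(true)")
        stop_lsn, label = cur.fetchone()

        cur.execute(
            "SELECT pg_replication_slot_advance(%s, %s::pg_lsn)",
            (slot, start_lsn),
        )
        stream_wal(start_lsn, stop_lsn)
    finally:
        # The slot is not temporary, so it has to be dropped explicitly.
        cur.execute("SELECT pg_drop_replication_slot(%s)", (slot,))
    return start_lsn, stop_lsn, label
```

According to the PostgreSQL documentation, the `labelfile` value returned by `pg_backup_stop()` has to be written to a file named `backup_label` in the copy, which is why the sketch returns it to the caller. With psycopg 3, the cursor would come from something like `psycopg.connect(host=socket_dir, port=port, autocommit=True)`.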
We attempted to use a temporary replication slot (created with temporary = TRUE) as an optimization to avoid manual cleanup. The idea was that the slot would be automatically dropped when the session closes.
However, this approach has a fundamental limitation: PostgreSQL's streaming replication protocol is separate from the SQL protocol. To stream WAL, we need to either:
- Use `pg_receivewal` (or a similar tool), which requires a persistent slot
- Implement the replication protocol in Python
Currently, psycopg 3 does not support the physical replication protocol. See: psycopg/psycopg#71
Therefore, we must use a non-temporary slot with a unique name to avoid conflicts when running multiple clones concurrently.
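
As a rough illustration of the persistent-slot approach, the sketch below generates a slot name and builds a `pg_receivewal` command line for it. The helper names `make_slot_name` and `receivewal_command` are hypothetical, and the six-hex-digit suffix is only an assumption about what `pg_refclone_xxxxxx` looks like; the actual script may pick its random part differently. The command-line options used are standard `pg_receivewal` options.

```python
import secrets
from pathlib import Path


def make_slot_name(prefix="pg_refclone_"):
    """Return a slot name with a random suffix, e.g. pg_refclone_3fa91c.

    Slot names may contain only lowercase letters, digits, and
    underscores, so a hex suffix is always valid.
    """
    return prefix + secrets.token_hex(3)  # 3 random bytes -> 6 hex digits


def receivewal_command(postgres_bin, wal_dir, slot, socket_dir, port, stop_lsn):
    """Build an argument list for pg_receivewal.

    pg_receivewal is looked up next to the postgres binary, matching the
    requirement that both live in the same directory.
    """
    pg_receivewal = Path(postgres_bin).parent / "pg_receivewal"
    return [
        str(pg_receivewal),
        "--directory", str(wal_dir),
        "--slot", slot,
        "--host", socket_dir,   # a directory path selects the Unix socket
        "--port", str(port),
        "--endpos", stop_lsn,   # stop streaming once this LSN is reached
        "--no-loop",            # do not retry if the connection fails
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Because each run draws a fresh suffix, two clones started at the same time are very unlikely to collide on a slot name.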
- The clone directory must not exist or must be empty
- The original and clone directories must be on the same filesystem (for reflink support)
- WAL is not copied from the source; it is streamed separately after the backup
- The clone is started in recovery mode via `standby.signal`
- Symlinks are not preserved; files are copied directly to avoid data corruption
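
A minimal sketch of a copy step that respects these notes is shown below. It is not the script's actual implementation: `reflink_file` and `clone_tree` are hypothetical names, the `FICLONE` ioctl is Linux-specific, and the fallback to a regular copy exists only so the sketch runs on filesystems without reflink support (a real tool might prefer to fail instead). Skipping `postmaster.pid` is likewise an assumption.

```python
import fcntl
import os
import shutil
from pathlib import Path

FICLONE = 0x40049409  # Linux ioctl request number from <linux/fs.h>


def reflink_file(src, dst):
    """Clone src to dst by sharing extents; fall back to a byte copy."""
    with open(src, "rb") as s, open(dst, "wb") as d:
        try:
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
        except OSError:
            # Assumption for illustration only: copy when reflinks are
            # unsupported (different filesystem, tmpfs, ext4, ...).
            shutil.copyfileobj(s, d)
    shutil.copystat(src, dst)


def clone_tree(src_root, dst_root):
    """Copy a data directory without WAL and mark it as a standby."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    if dst_root.exists() and any(dst_root.iterdir()):
        raise FileExistsError(f"{dst_root} exists and is not empty")

    # followlinks=True descends into symlinked directories, and open()
    # follows symlinked files, so the clone contains real files only.
    for dirpath, dirnames, filenames in os.walk(src_root, followlinks=True):
        rel = Path(dirpath).relative_to(src_root)
        dst_dir = dst_root / rel
        dst_dir.mkdir(parents=True, exist_ok=True)
        shutil.copystat(dirpath, dst_dir)

        if rel.parts[:1] == ("pg_wal",):
            # Keep an empty pg_wal; WAL is streamed separately.
            dirnames.clear()
            continue

        for name in filenames:
            if not rel.parts and name == "postmaster.pid":
                continue  # belongs to the running source instance
            reflink_file(Path(dirpath) / name, dst_dir / name)

    # An empty standby.signal makes PostgreSQL start in recovery mode.
    (dst_root / "standby.signal").touch()
```

Calling `clone_tree("/var/lib/pgsql/18/data", "/var/lib/pgsql/18/clone")` between `pg_backup_start()` and `pg_backup_stop()` would produce a directory into which the streamed WAL can then be placed.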