# Chapter 34: Logical Replication

Logical replication provides fine-grained, selective data replication at the row level, enabling near-zero-downtime migrations, data warehousing pipelines, and cross-database data distribution. Unlike physical replication, which reproduces the cluster byte for byte, logical replication decodes WAL into row-level changes (inserts, updates, deletes). This allows replication between different PostgreSQL major versions and architectures, and replication of selectively filtered datasets.

---

## 34.1 Logical vs. Physical Replication

### 34.1.1 Architectural Differences

| Aspect | Physical Streaming | Logical Replication |
|--------|-------------------|---------------------|
| **Granularity** | Block-level (entire cluster) | Row-level (selected tables) |
| **DDL Replication** | Yes (all schema changes) | No (schema must match manually) |
| **Cross-Version** | No (same major version) | Yes (across major versions, PG10 and later) |
| **Cross-Architecture** | No (same endian/OS) | Yes (x86 to ARM, Linux to Windows) |
| **Write Target** | Standby is read-only | Subscriber can be read-write |
| **Conflict Handling** | N/A (single primary) | Apply stops on error; resolved manually |
| **Lag Measurement** | WAL bytes behind | WAL bytes behind (per slot or subscription) |

**Use Case Selection**:
- **Physical**: High availability, disaster recovery, read replicas (identical copies)
- **Logical**: Data warehousing, migrations, selective replication, multi-master edge cases

### 34.1.2 How Logical Replication Works

```text
Publisher (Primary)
  ↓ (WAL decoding)
Logical Decoding Plugin (pgoutput)
  ↓ (Row changes)
WAL Sender (logical slot)
  ↓ (Protocol)
Network
  ↓
Subscription Worker (Apply)
  ↓ (INSERT/UPDATE/DELETE)
Subscriber (Target DB)
```

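**Key Insight**: Logical replication uses a **logical replication slot** on the publisher to track how far each subscriber has consumed the WAL. The output plugin decodes binary WAL into logical row changes, which the WAL sender streams to the subscriber's apply worker; the worker applies them as row-level inserts, updates, and deletes.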
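
To see what decoding produces, the following sketch uses the `test_decoding` contrib plugin, which prints changes as readable text (subscriptions use `pgoutput` instead). It assumes `wal_level = logical` and that the contrib modules are installed; `demo_slot` and `demo_events` are hypothetical names chosen for the illustration.

```sql
-- Hypothetical scratch table, used only for this demonstration
CREATE TABLE demo_events (id int PRIMARY KEY, payload text);

-- Create a logical slot that decodes WAL with the test_decoding plugin
SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'test_decoding');

INSERT INTO demo_events VALUES (1, 'hello');
UPDATE demo_events SET payload = 'world' WHERE id = 1;

-- Peek at the decoded changes without consuming them
SELECT lsn, xid, data
FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL);

-- Clean up: an unused slot retains WAL until it is dropped
SELECT pg_drop_replication_slot('demo_slot');
DROP TABLE demo_events;
```

Each transaction appears as a BEGIN row, one row per changed tuple, and a COMMIT row. Dropping the slot at the end matters because an abandoned slot holds back WAL removal.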

---

## 34.2 Configuration Prerequisites

### 34.2.1 Publisher Configuration

```ini
# postgresql.conf on PUBLISHER

wal_level = logical              # Must be 'logical' (not just 'replica'); requires restart
                                 # Increases WAL volume compared to replica level

max_replication_slots = 10       # One slot per subscription, plus temporary slots
                                 # used during table synchronization, plus physical standbys

max_wal_senders = 10             # One WAL sender per active replication connection
                                 # (subscriptions, table syncs, physical standbys)
```

**WAL Volume Impact**:
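- `logical` level adds the information needed for row decoding (for example, old key values for updates and deletes, as determined by each table's `REPLICA IDENTITY`)
- The increase over `replica` level depends on the workload; 10-30% more WAL is a rough planning figure, so measure on your own system
- `wal_compression = on` compresses full-page images and can offset part of the growth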
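
After editing `postgresql.conf`, it helps to confirm what the server is actually running with. The queries below are a minimal check, assuming they are run on the publisher by a role that can read `pg_stat_replication`.

```sql
-- Effective values; pending_restart = true means the file was changed
-- but the server has not been restarted yet
SELECT name, setting, pending_restart
FROM pg_settings
WHERE name IN ('wal_level', 'max_replication_slots', 'max_wal_senders');

-- Current usage, to compare against the configured limits
SELECT (SELECT count(*) FROM pg_replication_slots) AS slots_in_use,
       (SELECT count(*) FROM pg_stat_replication)  AS walsenders_in_use;
```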

### 34.2.2 Subscriber Configuration

```ini
# postgresql.conf on SUBSCRIBER

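max_logical_replication_workers = 8      # Apply, table sync, and parallel apply workers; default 4
max_sync_workers_per_subscription = 2    # Concurrent table copies during initial sync
max_parallel_apply_workers_per_subscription = 2  # PG16+ only, used with streaming = parallel
max_worker_processes = 16                # Must accommodate the logical workers; default 8

max_replication_slots = 4                # Through PG17 this also bounds replication origins:
                                         # at least one per subscription, plus table syncs
wal_level = logical                      # Only if this node also publishes (bidirectional)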
```

### 34.2.3 Access Control

```sql
-- On PUBLISHER: Create replication user for logical replication
CREATE USER logical_repl WITH REPLICATION LOGIN ENCRYPTED PASSWORD 'secure_pass';

-- Logical replication requires SELECT on published tables
GRANT SELECT ON TABLE users, orders, products TO logical_repl;

-- Or grant all tables in schema (maintenance burden)
GRANT SELECT ON ALL TABLES IN SCHEMA public TO logical_repl;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO logical_repl;
```

**pg_hba.conf**:
```conf
# Logical replication connections match on the database name;
# the "replication" keyword matches physical replication connections only
hostssl production logical_repl 10.0.2.0/24 scram-sha-256
# Or specific subscriber IPs
hostssl production logical_repl 10.0.2.20/32 scram-sha-256
```

---

## 34.3 Publications (The Source)

Publications define what data is available for replication. They act as the "broadcast channel" from the publisher.

### 34.3.1 Creating Publications

```sql
-- Publication for specific tables
CREATE PUBLICATION user_data FOR TABLE users, user_profiles, user_preferences;

-- Publication for all tables in database (including future tables)
CREATE PUBLICATION all_tables FOR ALL TABLES;

-- Publication with schema-qualified names
CREATE PUBLICATION sales_data FOR TABLE public.orders, archive.legacy_orders;

-- View existing publications
SELECT pubname, puballtables, pubinsert, pubupdate, pubdelete, pubtruncate 
FROM pg_publication;

-- View tables in publication
SELECT schemaname, tablename 
FROM pg_publication_tables 
WHERE pubname = 'user_data';
```

### 34.3.2 Operation Filtering

Control which operations are replicated (insert, update, delete, truncate):

```sql
-- Replicate only inserts and updates (ignore deletes for audit trail)
CREATE PUBLICATION audit_trail FOR TABLE user_activity 
WITH (publish = 'insert, update');

-- Replicate only inserts (append-only data warehouse)
CREATE PUBLICATION events_stream FOR TABLE events 
WITH (publish = 'insert');

-- Options: insert, update, delete, truncate
-- Default: all operations
```

### 34.3.3 Row Filtering (PostgreSQL 15+)

Replicate only rows matching a WHERE clause:

```sql
-- Replicate only active users to subscriber
CREATE PUBLICATION active_users FOR TABLE users 
WHERE (status = 'active' AND deleted_at IS NULL);

-- Replicate only recent orders to analytics warehouse
CREATE PUBLICATION recent_orders FOR TABLE orders 
WHERE (created_at > '2024-01-01'::timestamptz);

-- Complex conditions supported
CREATE PUBLICATION high_value_customers FOR TABLE customers 
WHERE (tier = 'enterprise' OR lifetime_value > 10000);
```

**Restrictions**:
- System columns cannot be used in WHERE (no `ctid`, `xmin`, etc.)
- Row-level security policies do not affect replication (publication sees all rows)
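- The expression may use only immutable built-in functions and operators: `now()` and `current_timestamp` are both stable and are rejected, as are user-defined functions, operators, and types
- If the publication publishes `UPDATE` or `DELETE`, the filter may reference only columns covered by the table's replica identity; otherwise those statements fail on the publisher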
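
The filters above reference non-key columns such as `status` and `deleted_at`, so the table's replica identity matters as soon as updates or deletes are published. The sketch below shows two ways to satisfy that requirement, assuming `users` has a primary key on `user_id`; `active_users_ins` is a hypothetical publication name.

```sql
-- Option A: log the full old row so any column can appear in a row filter
-- (simplest, at the cost of more WAL per UPDATE/DELETE)
ALTER TABLE users REPLICA IDENTITY FULL;

-- Option B: publish inserts only, so the default (primary key) identity is enough
CREATE PUBLICATION active_users_ins FOR TABLE users
    WHERE (status = 'active' AND deleted_at IS NULL)
    WITH (publish = 'insert');

-- Inspect the current setting: d = default (primary key), f = full,
-- i = index, n = nothing
SELECT relname, relreplident
FROM pg_class
WHERE relname = 'users';
```

Option A is usually the simpler choice for small or rarely updated tables; for write-heavy tables the extra WAL may be significant.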

### 34.3.4 Column Lists (PostgreSQL 15+)

Replicate only specific columns (exclude PII or large blobs):

```sql
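-- Replicate only non-sensitive columns to analytics
-- The column list follows the table name directly (there is no COLUMNS keyword)
-- and must include the replica identity columns if UPDATE/DELETE are published
CREATE PUBLICATION safe_user_data FOR TABLE users 
    (user_id, username, created_at, last_login, status) 
WHERE (deleted_at IS NULL);

-- Excluded columns: email, password_hash, ssn, phone

-- Verify column selection and row filter
SELECT tablename, attnames, rowfilter 
FROM pg_publication_tables 
WHERE pubname = 'safe_user_data';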
```

### 34.3.5 Partition Handling

```sql
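-- Publishing a partitioned table implicitly publishes all of its partitions
CREATE PUBLICATION partitioned_data FOR TABLE events;

-- A partition added later is covered by the publication automatically
CREATE TABLE events_2024_10 PARTITION OF events 
FOR VALUES FROM ('2024-10-01') TO ('2024-11-01');
-- By default changes are published under the partition's own name, so the subscriber
-- needs a matching table and ALTER SUBSCRIPTION ... REFRESH PUBLICATION to pick it up

-- Alternative (PG13+): publish changes under the root table's name
-- ALTER PUBLICATION partitioned_data SET (publish_via_partition_root = true);

-- Excluding specific partitions (if needed)
-- Must use row filter on parent or create separate publications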
```

### 34.3.6 Publication Maintenance

```sql
-- Add tables to existing publication
-- (subscribers must run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to start copying them)
ALTER PUBLICATION user_data ADD TABLE user_sessions, user_logs;

-- Remove tables
ALTER PUBLICATION user_data DROP TABLE user_logs;

-- Alter publication options
ALTER PUBLICATION user_data SET (publish = 'insert, update, delete');

-- Drop publication (does not affect data, just stops replication)
DROP PUBLICATION IF EXISTS user_data;
```

---

## 34.4 Subscriptions (The Target)

Subscriptions connect to publications and receive changes. A subscription creates a replication slot on the publisher to track WAL position.

### 34.4.1 Creating Subscriptions

```sql
-- On SUBSCRIBER database (target tables must already exist with compatible definitions)
-- The password is omitted here and read from .pgpass; see the notes below
CREATE SUBSCRIPTION user_data_sub
CONNECTION 'host=publisher.internal port=5432 dbname=production user=logical_repl sslmode=require'
PUBLICATION user_data
WITH (
    copy_data = true,           -- Copy existing data initially
    create_slot = true,         -- Create replication slot on publisher
    slot_name = 'user_data_sub_slot',  -- Named slot (easier to monitor)
    enabled = true,             -- Start replication immediately
    streaming = true,           -- PG14+: Stream large in-progress transactions
    binary = false              -- PG14+: binary transfer is faster but less compatible
);
```

**Connection String Security**:
- Store passwords in `.pgpass` file on subscriber server (chmod 600)
- Or use certificate authentication: `sslcert=/path/client.crt sslkey=/path/client.key`
- Never hardcode passwords in CREATE SUBSCRIPTION

### 34.4.2 Subscription States

```sql
-- Check subscription status
SELECT 
    subname,
    subenabled,
    subslotname,
    subpublications,
    subconninfo  -- Warning: shows password if in connection string
FROM pg_subscription;

-- Detailed worker status
SELECT 
    subid,
    subname,
    pid,
    relid::regclass as table_name,
    received_lsn,
    last_msg_send_time,
    last_msg_receipt_time
FROM pg_stat_subscription;
```

### 34.4.3 Controlling Replication

```sql
-- Pause replication (maintenance window)
ALTER SUBSCRIPTION user_data_sub DISABLE;

-- Resume
ALTER SUBSCRIPTION user_data_sub ENABLE;

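-- Skip a failing transaction (dangerous - use only for conflict resolution)
-- PG15+: ALTER SUBSCRIPTION user_data_sub SKIP (lsn = '...');  -- LSN taken from the error log
-- Earlier versions: pg_replication_origin_advance() on the subscriber, or drop/recreate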

-- Change publication list
ALTER SUBSCRIPTION user_data_sub 
SET PUBLICATION user_data, sales_data;  -- Add sales_data
```

---

## 34.5 Initial Synchronization

When `copy_data = true`, the subscription first copies all existing table data, then begins streaming changes.

### 34.5.1 The Synchronization Process

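1. **Slot Creation**: Creates the subscription's main logical slot at the current WAL position
2. **Table Copy**: For each table, a table synchronization worker takes a snapshot (using its own temporary slot) and runs `COPY` to load the data into the subscriber
3. **Existing Schema**: Tables, primary keys, and indexes are not created by replication; they must already exist on the subscriber
4. **Catch-up**: Each table applies the changes that occurred during its copy until it reaches the main apply position
5. **Streaming**: The main apply worker takes over real-time change streaming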

**Monitoring Initial Sync**:
```sql
-- On SUBSCRIBER: Check sync state per table
SELECT 
    srrelid::regclass AS table_name,
    srsubstate AS state  -- 'i' = initialize, 'd' = copying data, 'f' = copy finished,
                         -- 's' = synchronized, 'r' = ready (normal replication)
FROM pg_subscription_rel;

-- On PUBLISHER: WAL sent but not yet confirmed as flushed by the subscriber (approximate)
SELECT 
    application_name,
    pg_size_pretty(pg_wal_lsn_diff(sent_lsn, flush_lsn)) as lag
FROM pg_stat_replication 
WHERE application_name = 'user_data_sub';
```

### 34.5.2 Large Initial Datasets

For tables > 100GB, initial copy may take hours:

```sql
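-- Option 1: Parallel copy using multiple subscriptions (partition by ID ranges)
-- Manually copy ranges, then create subscription with copy_data = false

-- Option 2: Seed the subscriber out of band, then stream changes only
-- 1. Create the logical slot on the publisher BEFORE taking the copy;
--    a slot created afterwards misses every change made in between
-- 2. Load a copy that is consistent with the slot's starting point
-- 3. Create the subscription with copy_data = false, reusing that slot
-- (PG17+ ships pg_createsubscriber, which converts a physical standby into a subscriber)

CREATE SUBSCRIPTION user_data_sub
CONNECTION '...'
PUBLICATION user_data
WITH (copy_data = false, create_slot = false, slot_name = 'user_data_sub_slot');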
```

---

## 34.6 Conflict Resolution

When the subscriber is writable (not read-only), conflicts can occur if the same row is modified on both publisher and subscriber.

### 34.6.1 Types of Conflicts

1. **INSERT-INSERT**: Same primary key inserted on both nodes
2. **UPDATE-UPDATE**: Same row updated differently on both nodes
3. **UPDATE-DELETE**: Updated on publisher, deleted on subscriber
4. **DELETE-DELETE**: Deleted on both (harmless)

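### 34.6.2 Default Behavior

Core PostgreSQL does not resolve conflicts automatically. A change that violates a constraint on the subscriber (most commonly a duplicate key) halts apply with an error, and the worker keeps retrying until the conflict is removed. An incoming update or delete whose target row is missing on the subscriber is skipped rather than raising an error. A typical failure looks like this: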

```text
ERROR: duplicate key value violates unique constraint "users_pkey"
DETAIL: Key (user_id)=(123) already exists.
CONTEXT: processing remote data for replication origin "pg_16389" during message type "INSERT" for replication target relation "public.users" in transaction 1234, finished at 0/1234567
```

**Manual Resolution**:
```sql
-- On subscriber: Delete conflicting local row
DELETE FROM users WHERE user_id = 123;

-- Or update to match expected state
UPDATE users SET ... WHERE user_id = 123;

-- The apply worker retries automatically once the conflict is gone
-- (pg_reload_conf() does not restart it). If the subscription was disabled:
ALTER SUBSCRIPTION user_data_sub ENABLE;
```

### 34.6.3 Error Handling Options (PostgreSQL 15+)

PG 15 does not add automatic conflict resolution, but it adds two tools for dealing with a failing transaction:

```sql
-- Disable the subscription on the first apply error instead of retrying in a loop
ALTER SUBSCRIPTION user_data_sub 
SET (disable_on_error = true);

-- Skip the failing remote transaction, identified by the "finished at" LSN
-- reported in the error CONTEXT line
ALTER SUBSCRIPTION user_data_sub SKIP (lsn = '0/1234567');

-- Re-enable after fixing the data or skipping the transaction
ALTER SUBSCRIPTION user_data_sub ENABLE;
```

**Important**: Skipping a transaction discards all of its changes, so the subscriber can silently diverge from the publisher. Multi-master scenarios require application-level conflict resolution (last-write-wins timestamps, CRDTs, etc.).

### 34.6.4 Avoiding Conflicts

**Pattern 1**: Directional replication (publisher → subscriber only)
- Make subscriber read-only for replicated tables
- Use different tables for local writes

**Pattern 2**: Partition by origin
- Publisher writes to `orders_east`, subscriber writes to `orders_west`
- Replicate both directions but to different tables

**Pattern 3**: Application-level sharding
- User IDs 1-1000000 on Node A, 1000001-2000000 on Node B
- Replication for backup only, never concurrent writes to same ID
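
Pattern 1 can be enforced with ordinary privileges on the subscriber. The sketch below assumes a hypothetical application role `app_rw` and a hypothetical local table `local_user_notes`; the apply worker does not run as `app_rw`, so replication is unaffected by the revoke.

```sql
-- Replicated tables: the application may read but not modify them
REVOKE INSERT, UPDATE, DELETE, TRUNCATE ON users, orders, products FROM app_rw;
GRANT SELECT ON users, orders, products TO app_rw;

-- Local writes go to a separate table that is not part of any subscription
CREATE TABLE IF NOT EXISTS local_user_notes (
    user_id    bigint      NOT NULL,
    note       text        NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);
GRANT SELECT, INSERT, UPDATE, DELETE ON local_user_notes TO app_rw;
```

This does not stop superusers or table owners from writing to the replicated tables, so it reduces accidental conflicts rather than making them impossible.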

---

## 34.7 Limitations and Restrictions

### 34.7.1 Schema Changes (DDL)

**Critical Limitation**: Logical replication does not replicate schema changes.

**Handling Schema Changes**:
```sql
-- Step 1: Add column on the SUBSCRIBER first (extra subscriber columns are allowed)
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Step 2: Add column on the PUBLISHER (columns are matched by name, not position)
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Step 3: No refresh is needed for new columns. REFRESH PUBLICATION is only
-- required when tables are added to or removed from the publication
-- ALTER SUBSCRIPTION user_data_sub REFRESH PUBLICATION;

-- If the publication uses a column list, add the new column to it explicitly
```

**Failure Mode**: If schemas don't match, replication stops with:
```text
ERROR: logical replication target relation "public.users" is missing some replicated columns
```

### 34.7.2 Sequences

Sequences are **not** replicated. The subscriber's sequences remain at their initial values.

**Workarounds**:
```sql
-- Option 1: Manually sync sequence after initial copy
SELECT setval('users_user_id_seq', (SELECT MAX(user_id) FROM users));

-- Option 2: Use UUIDs instead of sequences (no sync needed)
-- Option 3: Application sets sequence ranges per node
-- Node A: 1-1000000, Node B: 1000001-2000000
```
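
For more than a handful of sequences, Option 1 can be scripted. The query below is one possible approach, not a built-in feature: it reads the `pg_sequences` view on the publisher and generates one `setval` statement per sequence, to be executed on the subscriber. It assumes sequence names are identical on both sides and that writes on the publisher have stopped.

```sql
-- Run on the PUBLISHER; execute the generated statements on the SUBSCRIBER.
-- Sequences that were never used (last_value IS NULL) are skipped.
SELECT format(
           'SELECT setval(%L, %s, true);',
           quote_ident(schemaname) || '.' || quote_ident(sequencename),
           last_value
       ) AS sync_statement
FROM pg_sequences
WHERE last_value IS NOT NULL
ORDER BY schemaname, sequencename;
```

Each output row is a statement of the form `SELECT setval('public.users_user_id_seq', N, true);`. In psql, appending `\gexec` only helps when the session is connected to the subscriber, so the usual workflow is to save the output to a file and run it there.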

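### 34.7.3 Large Objects, TOAST, and TRUNCATE

Large objects (stored in `pg_largeobject` through the `lo_*` functions) are not replicated; keep such data in `bytea` columns or copy it separately. TOASTed column values (large `text`, `jsonb`) replicate normally, although an unchanged TOASTed value is not re-sent with an update unless it is part of the replica identity. `TRUNCATE` is replicated (PG11+), but it fails on the subscriber when the truncated tables are referenced by foreign keys from tables outside the same subscription.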

### 34.7.4 Triggers and Defaults

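**Trigger Execution**:
- The apply worker runs with `session_replication_role = replica`, so ordinary triggers on the subscriber do not fire for replicated changes
- Only triggers marked `ENABLE REPLICA` or `ENABLE ALWAYS` fire during apply; an `ENABLE ALWAYS` trigger can cause double-execution (it fires on publisher and subscriber)

**Solution**:
```sql
-- For an AFTER trigger enabled with ENABLE ALWAYS: skip the work when the
-- change is being applied by a replication worker
CREATE OR REPLACE FUNCTION audit_trigger() RETURNS trigger AS $$
BEGIN
    IF current_setting('session_replication_role') = 'replica' THEN
        -- We're in logical replication apply, skip to avoid duplication
        RETURN NULL;
    END IF;
    -- Normal audit logic...
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
```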

**Defaults**:
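- An explicit NULL sent by the publisher is applied as NULL; the subscriber's DEFAULT is not substituted
- Subscriber columns that are absent on the publisher (or excluded by a column list) are filled with their DEFAULT on insert
- Can cause data divergence if not intentional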

---

## 34.8 Use Cases and Patterns

### 34.8.1 Zero-Downtime Major Version Upgrades

**Architecture**:
- PG15 Publisher → Logical Replication → PG16 Subscriber
- Minimal downtime (seconds) for cutover

```bash
# 1. Restore the schema on PG16 (pg_dump --schema-only), then subscribe to all tables
# 2. Wait for lag = 0
# 3. Stop application writes to PG15
# 4. Wait for final lag catch-up (seconds)
# 5. Sync sequence values to PG16 (sequences are not replicated)
# 6. Drop subscription on PG16 (now independent)
# 7. Point application to PG16
# 8. Drop publication on PG15
```
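
The SQL side of the cutover might look like the following sketch. The names `upgrade_pub` and `upgrade_sub` and the host `pg15.internal` are hypothetical, and the connection string assumes the password is supplied through `.pgpass`.

```sql
-- On PG15 (publisher). FOR ALL TABLES requires superuser.
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On PG16 (subscriber), after the schema has been restored
CREATE SUBSCRIPTION upgrade_sub
CONNECTION 'host=pg15.internal dbname=production user=logical_repl sslmode=require'
PUBLICATION upgrade_pub;

-- On PG16: tables still synchronizing; expect zero rows before cutover
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel
WHERE srsubstate <> 'r';

-- On PG15, after application writes have stopped: wait until this settles near 0
-- (the slot is named after the subscription by default)
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS bytes_behind
FROM pg_replication_slots
WHERE slot_name = 'upgrade_sub';

-- On PG16: detach from the old cluster (this also drops the slot on PG15)
DROP SUBSCRIPTION upgrade_sub;
```

Background activity such as checkpoints can keep `bytes_behind` from reading exactly zero, so treat a small, stable value with no application writes as caught up.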

### 34.8.2 Data Warehousing / ETL

**Pattern**: Selective replication of "safe" data

```sql
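-- Publisher: Production OLTP
-- Column lists and row filters are attached to each table individually
CREATE PUBLICATION warehouse_export FOR TABLE
    orders (order_id, customer_id, total, created_at)   -- No PII
        WHERE (created_at > '2023-01-01'),              -- Example fixed cutoff: row filters cannot call CURRENT_DATE
    order_items,
    customers;                                          -- Add a column list here to exclude PII
-- If UPDATE/DELETE are published, created_at must be covered by the replica identity of orders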

-- Subscriber: Analytics warehouse (separate hardware, optimized for reads)
-- Can have different indexes (GIN, BRIN) than OLTP
```

### 34.8.3 Microservices Data Distribution

**Pattern**: Service A owns user data, Service B needs read-only copy

```sql
-- Service A database
CREATE PUBLICATION user_profile_public FOR TABLE users 
    (user_id, display_name, avatar_url, bio);  -- Public fields only
-- Excludes: email, password_hash, internal_notes

-- Service B database subscribes, has local extensions for caching
```

---

## 34.9 Monitoring and Maintenance

### 34.9.1 Lag Monitoring

```sql
-- Logical replication lag (bytes)
SELECT 
    client_addr,
    state,
    sent_lsn,
    write_lsn,
    flush_lsn,
    replay_lsn,
    pg_wal_lsn_diff(sent_lsn, replay_lsn) as lag_bytes
FROM pg_stat_replication 
WHERE application_name = 'user_data_sub';  -- Defaults to the subscription name

-- Slot-based lag in bytes (works even while the subscriber is disconnected)
SELECT 
    slot_name,
    confirmed_flush_lsn,
    pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) as lag_bytes,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) as lag_size
FROM pg_replication_slots 
WHERE slot_type = 'logical';
```

### 34.9.2 Worker Process Monitoring

```sql
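-- Check if logical workers are running (exact backend_type names vary by version)
SELECT pid, backend_type, state FROM pg_stat_activity 
WHERE backend_type LIKE 'logical replication%worker';

-- Check for apply errors (PG15+)
SELECT subname, apply_error_count, sync_error_count 
FROM pg_stat_subscription_stats;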

-- If subscription is disabled due to error, check logs:
-- grep "logical replication" /var/log/postgresql/postgresql-*.log
```

### 34.9.3 Slot Retention (Critical)

Logical slots prevent WAL removal, same as physical slots.

```sql
-- Monitor slot lag (if subscriber down, WAL accumulates)
SELECT 
    slot_name,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as retained_wal
FROM pg_replication_slots
WHERE slot_type = 'logical';

-- Alert if > 50GB retained
```
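
Alerting can be paired with a hard cap on retained WAL. The sketch below assumes PG13 or later, where `max_slot_wal_keep_size` and the `wal_status` column are available; the 50GB value simply mirrors the alert threshold above and should be tuned to the available disk.

```sql
-- On the publisher: cap the WAL that replication slots may retain
ALTER SYSTEM SET max_slot_wal_keep_size = '50GB';
SELECT pg_reload_conf();

-- wal_status: reserved / extended = within limits, unreserved = at risk,
-- lost = required WAL was removed and the slot is no longer usable
SELECT slot_name, active, wal_status,
       pg_size_pretty(safe_wal_size) AS headroom
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

The trade-off is deliberate: once a slot exceeds the cap it is invalidated and the subscriber must be resynchronized, which is usually preferable to the publisher running out of disk.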

---

## Chapter Summary

In this chapter, you learned:

1. **Architecture**: Logical replication decodes WAL into row changes using the `pgoutput` plugin, enabling cross-version and selective replication. It requires `wal_level = logical` and uses publications (what to send) and subscriptions (what to receive).

2. **Publications**: Define replicated data with `CREATE PUBLICATION`. Use row filters (`WHERE` clauses, PG15+) for partial replication and column lists to exclude sensitive data. Publications do not include schema changes (DDL)—schema must be managed manually on subscribers.

3. **Subscriptions**: Create with `CREATE SUBSCRIPTION`, specifying connection strings and `copy_data = true` for initial sync. Monitor `pg_subscription_rel` for table-level synchronization state. Use `streaming = true` (PG14+) for in-progress transaction streaming to reduce lag.

4. **Conflict Handling**: Replication stops on constraint conflicts such as duplicate keys. PostgreSQL 15+ adds `disable_on_error` and `ALTER SUBSCRIPTION ... SKIP` to manage failing transactions, but core PostgreSQL does not resolve conflicts automatically. Best practice is to avoid conflicts through directional replication (write to publisher only) or application-level partitioning.

5. **Limitations**: Sequences are not replicated (use UUIDs or manually sync). TRUNCATE replicates but may fail on the subscriber when foreign keys reference tables outside the subscription. Incompatible DDL changes halt replication until schemas are manually aligned. Large objects are not replicated, and ordinary triggers do not fire on the subscriber during apply.

6. **Operational Patterns**: Use logical replication for major version upgrades (PG15→PG16), data warehousing (selective column/row filtering), and microservices data distribution. Always monitor replication slot lag to prevent disk space exhaustion on the publisher, and set `max_slot_wal_keep_size` to limit retention risk.

**Next**: In Chapter 35, we will explore Connection Management and Pooling—covering why connection count matters, PgBouncer configuration and modes (session/transaction/statement), prepared statement handling, and integration patterns with modern application architectures.