P.O.D.S.

Postgres Orchestrated Docker Stack

High-availability PostgreSQL cluster with 1 primary (R/W) and 2 read replicas,
using native streaming replication and PgBouncer connection pooling, all in Docker.

PostgreSQL 18 · io_uring · PgBouncer · Docker Compose · Streaming Replication · Prometheus · Grafana


Quick Start

# Launch the entire stack
docker compose up -d

# Check all services are running
docker compose ps

# Verify replication is streaming
docker exec pg_master psql -U postgres -d appdb \
  -c "SELECT client_addr, state, sent_lsn, replay_lsn FROM pg_stat_replication;"

# Tear down (data is preserved in ./data/)
docker compose down
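
The `sent_lsn` and `replay_lsn` columns in the verification query are easier to read as a byte count. The sketch below converts two LSN strings into a lag figure; the sample values are hypothetical, and on the server side `pg_wal_lsn_diff(sent_lsn, replay_lsn)` gives the same number.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '0/3000148' to an absolute byte position."""
    high, low = lsn.split("/")
    # An LSN is a 64-bit position printed as two hex halves: high 32 bits / low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def replay_lag_bytes(sent_lsn: str, replay_lsn: str) -> int:
    """Bytes of WAL sent to a standby that it has not replayed yet."""
    return lsn_to_bytes(sent_lsn) - lsn_to_bytes(replay_lsn)


# Hypothetical values, as they might appear in two columns of pg_stat_replication.
print(replay_lag_bytes("0/3000148", "0/3000060"))  # prints 232
```

A lag of 0 means the replica has replayed everything the primary has sent.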

Architecture

| Component | Role | Host Port | Details |
| --- | --- | --- | --- |
| Master | Primary (R/W) | 5440 | Accepts all writes, streams WAL to replicas |
| Replica 1 | Standby (R/O) | 5433 | Hot standby, receives WAL via replica1_slot |
| Replica 2 | Standby (R/O) | 5434 | Hot standby, receives WAL via replica2_slot |
| Replica N | Standby (R/O) | 5435+ | Dynamically added via ./pods.sh add |
| PgBouncer | Connection Pooler | 6432 | Transaction-mode pooling with named read/write pools |
| Health API | Monitoring | 8080 | REST API exposing cluster health, replication lag, node status |
| Prometheus | Metrics | 9090 | Scrapes all exporters, stores time-series data |
| Grafana | Dashboards | 3000 | Pre-built dashboards for cluster, replication, resources |
| Backup | Scheduled Backups | | pg_basebackup + WAL archiving, S3/local, cron-scheduled |
| cAdvisor | Resources | 8081 | Container CPU, RAM, and network metrics |

Data Flow

Client App
    │
    ▼
PgBouncer (:6432)
    │
    ├── appdb_write ──→ Master (:5440)  ── R/W
    ├── appdb_read1 ──→ Replica1 (:5433) ── R/O
    ├── appdb_read2 ──→ Replica2 (:5434) ── R/O
    └── appdb ────────→ Master (:5440)  ── default

Master ──── WAL Stream ───→ Replica1
       ├─── WAL Stream ───→ Replica2
       └─── WAL Stream ───→ ReplicaN...  (dynamically scaled)

Health API (:8080)
    │
    ├── /health            ──→ Full cluster overview (auto-discovers all replicas)
    ├── /health/master     ──→ Master node status
    ├── /health/replicaN   ──→ Specific replica status
    └── /health/replication──→ Replication lag & slot details

Connection Details

# Direct connections
psql -h localhost -p 5440 -U postgres -d appdb        # Master (R/W)
psql -h localhost -p 5433 -U postgres -d appdb        # Replica 1 (R/O)
psql -h localhost -p 5434 -U postgres -d appdb        # Replica 2 (R/O)

# Via PgBouncer (pooled)
psql -h localhost -p 6432 -U postgres -d appdb_write  # → Master
psql -h localhost -p 6432 -U postgres -d appdb_read1  # → Replica 1
psql -h localhost -p 6432 -U postgres -d appdb_read2  # → Replica 2
psql -h localhost -p 6432 -U postgres -d appdb        # → Master (default)
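
An application can use the named pools to split reads from writes. The following is a minimal sketch of that routing, not code from this repository: the pool names, host, port, and user come from the commands above, while the round-robin policy and the function names are assumptions made for illustration.

```python
from itertools import cycle

# Pool names as listed in the PgBouncer connection examples.
WRITE_POOL = "appdb_write"
READ_POOLS = ["appdb_read1", "appdb_read2"]

_next_read_pool = cycle(READ_POOLS)


def pool_for(readonly: bool) -> str:
    """Pick a PgBouncer database name: round-robin for reads, the write pool otherwise."""
    return next(_next_read_pool) if readonly else WRITE_POOL


def dsn(readonly: bool, host: str = "localhost", port: int = 6432, user: str = "postgres") -> str:
    """Build a libpq-style connection string for the chosen pool."""
    return f"host={host} port={port} user={user} dbname={pool_for(readonly)}"


print(dsn(readonly=False))
print(dsn(readonly=True))
print(dsn(readonly=True))
```

If replication is asynchronous (the PostgreSQL default), a replica can briefly trail the master, so reads that must see a write made moments earlier are safer on `appdb_write`.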

Replica Scaling

Dynamically add or remove read replicas without editing any config files.

./pods.sh add              # Auto-adds replica3 (next available)
./pods.sh add replica5     # Adds a specific replica
./pods.sh remove replica3  # Removes a dynamic replica
./pods.sh list             # Lists all replicas with status
./pods.sh status           # Full cluster health overview

Each add automatically: creates a replication slot, generates a compose override, adds a PgBouncer read pool, spins up a postgres_exporter, updates Prometheus targets, starts the container, and waits for streaming confirmation. See Scaling Documentation for details.
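
The naming patterns behind those steps can be sketched as follows. This is an illustration rather than the logic inside `pods.sh`: the slot and pool names follow the patterns shown in this README, while the port formula, the container hostname in the PgBouncer line, and all function names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaPlan:
    name: str
    slot: str
    host_port: int
    pool: str


def plan_replica(index: int, base_port: int = 5432) -> ReplicaPlan:
    """Derive the names used for replica <index>, following the patterns in this README."""
    name = f"replica{index}"
    return ReplicaPlan(
        name=name,
        slot=f"{name}_slot",          # matches replica1_slot / replica2_slot
        host_port=base_port + index,  # assumption: 5433, 5434, 5435, ...
        pool=f"appdb_read{index}",    # matches appdb_read1 / appdb_read2
    )


def create_slot_sql(plan: ReplicaPlan) -> str:
    """SQL to run on the master before the standby starts streaming."""
    return f"SELECT pg_create_physical_replication_slot('{plan.slot}');"


def pgbouncer_entry(plan: ReplicaPlan, dbname: str = "appdb") -> str:
    """One line for the [databases] section; the container hostname is hypothetical."""
    return f"{plan.pool} = host={plan.name} port=5432 dbname={dbname}"


plan = plan_replica(3)
print(create_slot_sql(plan))
print(pgbouncer_entry(plan))
```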

Health API

REST API on port 8080 for monitoring cluster health. Auto-discovers all replicas — no config changes needed when scaling.

curl http://localhost:8080/health          # Full cluster status (200 = healthy, 503 = degraded)
curl http://localhost:8080/health/master   # Master node only
curl http://localhost:8080/health/replica1 # Specific replica
curl http://localhost:8080/health/replication # Replication lag & slots

See Health API Documentation for response schemas and integration guide.
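
For scripts or load-balancer checks, the documented status codes are enough to decide whether the cluster is usable. Here is a small standard-library sketch; the function names and the "unknown" and "unreachable" labels are illustrative, and the response body is ignored because its schema is not shown here.

```python
import urllib.error
import urllib.request


def interpret(status_code: int) -> str:
    """Map the documented /health status codes to a label."""
    if status_code == 200:
        return "healthy"
    if status_code == 503:
        return "degraded"
    return "unknown"


def check_cluster(url: str = "http://localhost:8080/health", timeout: float = 3.0) -> str:
    """Return 'healthy', 'degraded', 'unknown', or 'unreachable' for the cluster."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return interpret(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses, so a 503 arrives here.
        return interpret(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"


if __name__ == "__main__":
    print(check_cluster())
```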

Monitoring (Prometheus + Grafana)

Full observability with a pre-built dashboard. Open http://localhost:3000 (default: admin / pods_admin).

Dashboard panels:

  • Cluster Overview — UP/DOWN status per node, active connections, database size
  • Replication — Lag in bytes over time, WAL retained per slot
  • Queries & Throughput — TPS per node, active/idle queries, row operations (insert/update/delete rates)
  • Connections per Node — Connection state breakdown for master and each replica
  • PgBouncer — Client/server connections, waiting clients, queries per second per pool
  • Container CPU — CPU usage over time + current snapshot per container
  • Container Memory — RAM usage over time + current snapshot per container
  • Resource Summary — Total CPU/RAM for PG nodes and full stack

Exporters auto-scale with replicas — ./pods.sh add creates a postgres_exporter and updates Prometheus targets automatically.
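
Prometheus file-based service discovery reads a JSON list of target groups, which is presumably what `pg_targets.json` holds. The sketch below shows how a new exporter could be appended idempotently. The exporter hostname and the `node` label are hypothetical, and 9187 is the postgres_exporter default port rather than a value confirmed by this README.

```python
import json


def add_target(targets: list, replica: str, exporter_port: int = 9187) -> list:
    """Return a file_sd target list that includes an exporter for <replica>."""
    address = f"{replica}_exporter:{exporter_port}"  # hypothetical container hostname
    if any(address in group["targets"] for group in targets):
        return targets  # already registered, nothing to do
    return targets + [{"targets": [address], "labels": {"node": replica}}]


existing = [{"targets": ["replica1_exporter:9187"], "labels": {"node": "replica1"}}]
updated = add_target(existing, "replica3")
print(json.dumps(updated, indent=2))
```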

See Monitoring Documentation for metrics reference, PromQL queries, and configuration.
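
The same data behind the dashboards can be pulled from the Prometheus HTTP API. As a rough example, the code below builds an instant query for `pg_up` (a postgres_exporter metric) and lists the instances reporting 0. The sample payload is hand-written in the shape Prometheus returns and is not real output from this stack; the instance names in it are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def query_url(expr: str, base: str = "http://localhost:9090") -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base}/api/v1/query?{urllib.parse.urlencode({'query': expr})}"


def down_instances(payload: dict) -> list:
    """List the `instance` labels whose sample value is 0 in an instant-vector result."""
    return [
        sample["metric"].get("instance", "?")
        for sample in payload["data"]["result"]
        if float(sample["value"][1]) == 0.0
    ]


def fetch(expr: str) -> dict:
    """Run the query against a live Prometheus (needs the stack to be up)."""
    with urllib.request.urlopen(query_url(expr), timeout=5) as resp:
        return json.load(resp)


# Hand-written sample payload; instance names are hypothetical.
sample = {"status": "success", "data": {"resultType": "vector", "result": [
    {"metric": {"instance": "master_exporter:9187"}, "value": [1700000000, "1"]},
    {"metric": {"instance": "replica2_exporter:9187"}, "value": [1700000000, "0"]},
]}}
print(down_instances(sample))
```

With the stack running, `down_instances(fetch("pg_up"))` would apply the same check to live data.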

Backup & Restore

Automated backups with scheduled pg_basebackup snapshots and continuous WAL archiving. Supports local storage and S3.

./pods.sh backup             # Immediate full backup
./pods.sh backup-list        # List all backups with sizes
./pods.sh restore full_25-03-2026_14-30-00.tar.gz  # Restore a backup

  • Schedule: Configurable via BACKUP_SCHEDULE (default: daily at 2 AM)
  • Retention: Auto-cleanup after BACKUP_RETENTION_DAYS (default: 7 days)
  • Storage: Auto-detects S3 when AWS_ACCESS_KEY_ID + AWS_S3_BUCKET are set; otherwise stores locally
  • WAL archiving: Continuous via archive_command for point-in-time recovery capability
  • Naming: full_DD-MM-YYYY_HH-MM-SS.tar.gz
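
The naming, retention, and storage rules above can be expressed in a few lines. This is a sketch of the documented behaviour, not the contents of `backup.sh`; the function names are made up for illustration.

```python
import os
from datetime import datetime, timedelta

NAME_FORMAT = "full_%d-%m-%Y_%H-%M-%S.tar.gz"  # full_DD-MM-YYYY_HH-MM-SS.tar.gz


def backup_time(filename: str) -> datetime:
    """Parse the timestamp embedded in a backup file name."""
    return datetime.strptime(filename, NAME_FORMAT)


def is_expired(filename: str, now: datetime, retention_days: int = 7) -> bool:
    """True when the backup is older than the retention window."""
    return now - backup_time(filename) > timedelta(days=retention_days)


def storage_backend(env=os.environ) -> str:
    """Mirror the documented rule: S3 only when both variables are set."""
    return "s3" if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_S3_BUCKET") else "local"


name = "full_25-03-2026_14-30-00.tar.gz"
print(backup_time(name))
print(is_expired(name, now=datetime(2026, 4, 2)))
print(storage_backend({}))
```

Because the date is written day-first, sorting these file names alphabetically does not sort them chronologically; parsing the timestamp as above avoids that trap.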

See Backup Documentation for S3 setup, restore procedures, and architecture details.

Project Structure

pods/
├── docker-compose.yml              # Service orchestration (base stack)
├── docker-compose.override.yml     # Dynamic replicas (auto-generated by pods.sh)
├── .env                            # Credentials & config (gitignored)
├── .gitignore
├── pods.sh                         # Cluster management script (scaling, status, backups)
├── config/
│   ├── pgbouncer/
│   │   ├── pgbouncer.ini           # Pool definitions & settings
│   │   └── userlist.txt            # PgBouncer auth credentials
│   ├── prometheus/
│   │   ├── prometheus.yml          # Scrape targets & jobs
│   │   └── pg_targets.json         # Dynamic replica targets (auto-managed)
│   └── grafana/
│       ├── provisioning/           # Auto-provisioned datasource & dashboard config
│       └── dashboards/
│           └── pods-cluster.json   # Pre-built cluster overview dashboard
├── scripts/
│   ├── master-init.sh              # Creates replication user & slots
│   └── replica-init.sh             # Runs pg_basebackup & configures standby
├── health-api/
│   ├── Dockerfile                  # Python 3.13 slim image
│   ├── requirements.txt            # FastAPI, uvicorn, psycopg2
│   └── main.py                     # Health API with auto-discovery
├── backup/
│   ├── Dockerfile                  # Backup container (postgres:18 + cron + awscli)
│   ├── backup.sh                   # Full backup, WAL archive, cleanup, list
│   ├── restore.sh                  # Restore a backup to target directory
│   └── entrypoint.sh               # Cron scheduler entrypoint
├── data/                           # Persistent PostgreSQL data (gitignored)
│   ├── master/
│   ├── replica1/
│   ├── replica2/
│   └── replicaN/                   # Created dynamically by pods.sh
└── documentation/
    ├── pods.png                    # Architecture diagram
    ├── master.md                   # Master node deep-dive
    ├── replication.md              # Streaming replication explained
    ├── pgbouncer.md                # Connection pooling config
    ├── environment.md              # Environment variables reference
    ├── operations.md               # Ops, monitoring & troubleshooting
    ├── health-api.md               # Health API endpoints & responses
    ├── scaling.md                  # Dynamic replica scaling guide
    ├── monitoring.md               # Prometheus, Grafana, exporters, metrics
    └── backup.md                   # Backup & restore guide

Documentation

| Module | What's Inside |
| --- | --- |
| Master Node | WAL parameters, pg_hba.conf rules, replication user & slot creation, health checks |
| Replication | How streaming replication works, pg_basebackup flags, standby.signal, slot mechanics, why not Patroni |
| PgBouncer | Pool modes (session/transaction/statement), every pgbouncer.ini parameter, monitoring commands |
| Environment | All .env variables, defaults, how they flow through the stack, production recommendations |
| Operations | Start/stop, log viewing, replication monitoring queries, data reset, troubleshooting guide |
| Health API | REST endpoints, response schemas, auto-discovery, integration with load balancers |
| Scaling | pods.sh commands, how dynamic replicas work, override file, PgBouncer auto-config |
| Monitoring | Prometheus config, Grafana dashboards, all exporters, metrics reference, PromQL queries |
| Backup & Restore | Scheduled backups, WAL archiving, S3/local storage, restore procedures |

Key Design Decisions

| Decision | Reasoning |
| --- | --- |
| Native streaming replication over Patroni | Fixed topology (1 primary + 2 replicas), so there is no need for automatic leader election. Simpler, with fewer dependencies |
| Physical replication slots | The master retains WAL for each replica until that replica has received it, so a replica that goes offline temporarily can catch up without a full re-sync |
| PgBouncer in transaction mode | Best connection reuse for web workloads without sacrificing transaction safety |
| Bind-mount volumes over Docker volumes | Data lives in ./data/, where it is visible, portable, and easy to back up and inspect |
| PostgreSQL 18 + io_uring | Native async I/O reduces syscall overhead, which can improve throughput on read-heavy replicas |
| Auto-discovery in Health API | Queries pg_replication_slots to find replicas, so no config changes are needed when scaling |
| docker-compose.override.yml for scaling | Dynamic replicas live in an auto-generated override, so the base compose file stays clean and version-controlled |
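
Slot-based auto-discovery can be illustrated with a short sketch. It is not the code in `health-api/main.py`: the query text, the function name, and the sample rows are assumptions, and only the `pg_replication_slots` view and the `replicaN_slot` naming pattern come from this README.

```python
# SQL the sketch assumes; pg_replication_slots is a standard PostgreSQL view.
DISCOVERY_SQL = """
SELECT slot_name, active
FROM pg_replication_slots
WHERE slot_type = 'physical'
ORDER BY slot_name;
"""


def replicas_from_slots(rows):
    """Turn (slot_name, active) rows into {replica_name: is_streaming}."""
    suffix = "_slot"
    return {
        slot[: -len(suffix)]: bool(active)
        for slot, active in rows
        if slot.endswith(suffix)
    }


# Hypothetical rows, as a driver such as psycopg2 would return from cursor.fetchall().
rows = [("replica1_slot", True), ("replica2_slot", True), ("replica3_slot", False)]
print(replicas_from_slots(rows))
```

An inactive slot still makes the master retain WAL, so a replica that is removed for good should have its slot dropped as well.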

Built with ❤️ PostgreSQL, PgBouncer, and Docker Compose
