ToroNZ/postgresql-backup-b2

postgres-backup-b2

Backup and restore PostgreSQL to/from Backblaze B2 (supports periodic backups and encryption)

Forked from the original project: https://github.com/itbm/postgresql-backup-s3

Directory Structure

postgresql-backup-b2/
├── scripts/                    # Shell scripts for backup and restore operations
│   ├── backup.sh               # Core backup script
│   ├── restore.sh              # Core restore script
│   ├── run.sh                  # Entry point script
│   ├── emergency-restore.sh    # Emergency restore script
│   ├── restore-commands.sh     # Restore command utilities
│   ├── restore-now.sh          # Immediate restore script
│   ├── restore-one-liner.sh    # One-liner restore commands
│   └── restore-trigger.sh      # Restore trigger script
├── jinja-templates/            # Jinja2 templates for Ansible/automation
│   ├── postgres-backup-with-restore.yml.j2  # Complete template with backup + restore
│   └── postgres-restore-auto.yml.j2         # Standalone restore template
├── kubernetes-manifests/       # Kubernetes YAML manifests
│   ├── kubernetes-restore-auto.yaml         # Scaled-to-zero restore deployment
│   ├── kubernetes-restore-cronjob.yaml      # CronJob for scheduled restores
│   ├── kubernetes-restore-deployment.yaml   # Deployment configuration
│   ├── kubernetes-restore-job.yaml          # One-time restore job
│   └── kubernetes-restore-ready.yaml        # Readiness check configuration
├── main.go                     # Go cron scheduler
├── Dockerfile                  # Container image definition
├── LICENSE                     # License file
├── GO_CRON_SCHEDULER.md        # Documentation for Go cron scheduler
└── README.md                   # This file

Program Flow Diagram

flowchart TD

    A[Container Start] --> B{Check BACKUP_FILE}
    B -->|BACKUP_FILE set| C[Restore Mode]
    B -->|BACKUP_FILE not set| D{Check SCHEDULE}
    
    D -->|SCHEDULE set| E[Scheduled Backup Mode]
    D -->|SCHEDULE not set| F[One-time Backup Mode]
    
    E --> G[Go-Cron Scheduler]
    G --> H[Execute backup.sh on schedule]
    H --> I[Backup Process]
    
    F --> I
    C --> J[Restore Process]
    
    subgraph "Backup Process"
        I --> K[Validate Environment Variables]
        K --> L{Check USE_CUSTOM_FORMAT}
        L -->|yes| M[pg_dump with -Fc flag]
        L -->|no| N[pg_dump with compression]
        M --> O[Create .dump file]
        N --> P[Create .sql.gz file]
        O --> Q{Check ENCRYPTION_PASSWORD}
        P --> Q
        Q -->|set| R[Encrypt with OpenSSL AES-256-CBC]
        Q -->|not set| S[Upload to B2]
        R --> S
        S --> T{Check B2_LIFECYCLE_DAYS}
        T -->|set| U[Configure B2 Lifecycle Rules]
        T -->|not set| V[Backup Complete]
        U --> V
    end
    
    subgraph "Restore Process"
        J --> W[Validate Environment Variables]
        W --> X[Download backup from B2]
        X --> Y{Check if file is encrypted}
        Y -->|yes| Z[Decrypt with OpenSSL]
        Y -->|no| AA{Check DROP_DATABASE}
        Z --> AA
        AA -->|yes| BB[Drop existing database]
        AA -->|no| CC{Check CREATE_DATABASE}
        BB --> CC
        CC -->|yes| DD[Create new database]
        CC -->|no| EE{Check backup format}
        DD --> EE
        EE -->|.sql.gz| FF[Restore with psql]
        EE -->|.dump| GG{Check PARALLEL_JOBS}
        GG -->|>1| HH[pg_restore with parallel jobs]
        GG -->|=1| II[pg_restore single job]
        FF --> JJ[Restore Complete]
        HH --> JJ
        II --> JJ
    end
    
    subgraph "Kubernetes Restore Options"
        KK["Option 1: Scaled-to-Zero Deployment<br/>- Deploy with jinja-templates/<br/>- Auto-find latest backup<br/>- Just scale up to restore"]
        LL["Benefits:<br/>- Zero variable hunting<br/>- Auto-find latest backup<br/>- Inherits config<br/>- Scaled-to-zero when idle"]
    end
    
    subgraph "Environment Variables"
        NN["Required for Backup:<br/>- B2_ACCESS_KEY_ID<br/>- B2_SECRET_ACCESS_KEY<br/>- B2_BUCKET<br/>- POSTGRES_DATABASE<br/>- POSTGRES_HOST<br/>- POSTGRES_USER<br/>- POSTGRES_PASSWORD"]
        OO["Required for Restore:<br/>- All backup variables<br/>- BACKUP_FILE"]
        PP["Optional:<br/>- ENCRYPTION_PASSWORD<br/>- USE_CUSTOM_FORMAT<br/>- COMPRESSION_CMD<br/>- PARALLEL_JOBS<br/>- DROP_DATABASE<br/>- CREATE_DATABASE<br/>- B2_LIFECYCLE_DAYS"]
    end
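
The first two decisions in the flowchart can be sketched in a few lines of shell. This is a minimal illustration that assumes a variable counts as "set" when it is non-empty; the function name select_mode is hypothetical and is not taken from run.sh.

```shell
#!/bin/sh
# Hypothetical helper: prints the mode the container would enter.
# Assumption: a variable counts as "set" when it is non-empty.
select_mode() {
  backup_file=$1
  schedule=$2
  if [ -n "$backup_file" ]; then
    echo "restore"            # BACKUP_FILE set -> Restore Mode
  elif [ -n "$schedule" ]; then
    echo "scheduled-backup"   # SCHEDULE set -> Go-Cron Scheduler
  else
    echo "one-time-backup"    # neither set -> run the backup once
  fi
}

# Report the mode implied by the current environment.
select_mode "${BACKUP_FILE:-}" "${SCHEDULE:-}"
```

Passing the two values as arguments keeps the check easy to try with different combinations without touching the real environment.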

Basic Usage

Backup

$ docker run \
  -e B2_ACCESS_KEY_ID=your-key \
  -e B2_SECRET_ACCESS_KEY=your-secret \
  -e B2_BUCKET=my-bucket \
  -e B2_PREFIX=backup \
  -e POSTGRES_DATABASE=dbname \
  -e POSTGRES_HOST=localhost \
  -e POSTGRES_USER=user \
  -e POSTGRES_PASSWORD=password \
  -e SCHEDULE="@daily" \
  -e USE_CUSTOM_FORMAT=yes \
  -e ENCRYPTION_PASSWORD="superstrongpassword" \
  -e B2_LIFECYCLE_DAYS=30 \
  toronz/postgres-backup-b2

Restore

Docker Example

# Restore specific backup file
$ docker run \
  -e B2_ACCESS_KEY_ID=your-key \
  -e B2_SECRET_ACCESS_KEY=your-secret \
  -e B2_BUCKET=my-bucket \
  -e B2_PREFIX=backup \
  -e BACKUP_FILE=backup/dbname_2024-01-15T02:00:00Z.dump \
  -e POSTGRES_DATABASE=dbname \
  -e POSTGRES_HOST=localhost \
  -e POSTGRES_USER=user \
  -e POSTGRES_PASSWORD=password \
  -e USE_CUSTOM_FORMAT=yes \
  -e PARALLEL_JOBS=4 \
  -e DROP_DATABASE=yes \
  -e CREATE_DATABASE=yes \
  -e ENCRYPTION_PASSWORD="superstrongpassword" \
  toronz/postgres-backup-b2

# OR restore latest backup automatically
$ docker run \
  -e B2_ACCESS_KEY_ID=your-key \
  -e B2_SECRET_ACCESS_KEY=your-secret \
  -e B2_BUCKET=my-bucket \
  -e B2_PREFIX=backup \
  -e AUTO_FIND_LATEST=yes \
  -e POSTGRES_DATABASE=dbname \
  -e POSTGRES_HOST=localhost \
  -e POSTGRES_USER=user \
  -e POSTGRES_PASSWORD=password \
  -e USE_CUSTOM_FORMAT=yes \
  -e PARALLEL_JOBS=4 \
  -e DROP_DATABASE=yes \
  -e CREATE_DATABASE=yes \
  -e ENCRYPTION_PASSWORD="superstrongpassword" \
  toronz/postgres-backup-b2
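
Backup names in the examples above embed a UTC timestamp in ISO 8601 form (dbname_2024-01-15T02:00:00Z.dump), so the newest file for a database is also the last one in plain lexicographic order. The sketch below illustrates that idea on a list of object names read from standard input. It is an assumption about how a "latest backup" lookup could work, not a copy of the restore script, and latest_backup is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: reads object names on stdin and prints the newest
# one for the given prefix and database.
# Assumption: names look like <prefix>/<database>_<ISO 8601 UTC timestamp>.<ext>
latest_backup() {
  prefix=$1
  database=$2
  # Keep only names that start with "<prefix>/<database>_", then take the
  # lexicographically greatest one. Fixed-width UTC timestamps sort in
  # chronological order, so that is the most recent backup.
  awk -v p="${prefix}/${database}_" 'index($0, p) == 1' \
    | LC_ALL=C sort \
    | tail -n 1
}
```

Any listing of the bucket, one object name per line, can be piped into latest_backup backup dbname; the result has the B2_PREFIX/filename shape that BACKUP_FILE expects.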

Kubernetes Example

apiVersion: v1
kind: Namespace
metadata:
  name: backup

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgresql-restore
  namespace: backup
spec:
  replicas: 0  # Scaled to zero by default
  selector:
    matchLabels:
      app: postgresql-restore
  template:
    metadata:
      labels:
        app: postgresql-restore
    spec:
      containers:
      - name: postgresql-restore
        image: toronz/postgres-backup-b2
        env:
        # Required for restore
        - name: B2_ACCESS_KEY_ID
          value: "your-key"
        - name: B2_SECRET_ACCESS_KEY
          value: "your-secret"
        - name: B2_BUCKET
          value: "my-bucket"
        - name: B2_PREFIX
          value: "backup"
        - name: POSTGRES_DATABASE
          value: "dbname"
        - name: POSTGRES_HOST
          value: "localhost"
        - name: POSTGRES_USER
          value: "user"
        - name: POSTGRES_PASSWORD
          value: "password"
        # Restore options
        - name: AUTO_FIND_LATEST
          value: "yes"  # Set to "yes" to auto-find latest backup, or set BACKUP_FILE for specific backup
        # - name: BACKUP_FILE
        #   value: "backup/dbname_2024-01-15T02:00:00Z.dump"  # Use this instead of AUTO_FIND_LATEST for specific backup
        - name: USE_CUSTOM_FORMAT
          value: "yes"
        - name: PARALLEL_JOBS
          value: "4"
        - name: DROP_DATABASE
          value: "yes"
        - name: CREATE_DATABASE
          value: "yes"
        - name: ENCRYPTION_PASSWORD
          value: "superstrongpassword"

To restore:

# Scale up to restore (uses AUTO_FIND_LATEST or specific BACKUP_FILE)
kubectl scale deployment postgresql-restore --replicas=1 -n backup

# Monitor progress
kubectl logs deployment/postgresql-restore -n backup -f

# Scale back down when done
kubectl scale deployment postgresql-restore --replicas=0 -n backup

Key Points:

  • When BACKUP_FILE is provided, the container runs the restore process instead of a backup
  • Use AUTO_FIND_LATEST=yes to restore the latest backup automatically (do not set BACKUP_FILE in this case)
  • Use BACKUP_FILE to restore a specific backup (do not set AUTO_FIND_LATEST in this case)
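
Because the two options are mutually exclusive, a wrapper script can check them before starting a restore. The following is a minimal sketch under that reading of the key points; restore_target is a hypothetical helper, and the project's own scripts may validate these variables differently.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a restore run.
# Prints the chosen strategy, or prints an error on stderr and returns 1.
restore_target() {
  backup_file=$1
  auto_find_latest=$2
  if [ -n "$backup_file" ] && [ "$auto_find_latest" = "yes" ]; then
    echo "error: set either BACKUP_FILE or AUTO_FIND_LATEST=yes, not both" >&2
    return 1
  elif [ -n "$backup_file" ]; then
    echo "specific:$backup_file"
  elif [ "$auto_find_latest" = "yes" ]; then
    echo "latest"
  else
    echo "error: no restore target; set BACKUP_FILE or AUTO_FIND_LATEST=yes" >&2
    return 1
  fi
}
```

For example, restore_target "$BACKUP_FILE" "$AUTO_FIND_LATEST" || exit 1 stops a wrapper early when the combination is ambiguous.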

Environment variables

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| POSTGRES_DATABASE | | Y | Database to back up or restore, or 'all' to back up or restore everything |
| POSTGRES_HOST | | Y | The PostgreSQL host |
| POSTGRES_PORT | 5432 | | The PostgreSQL port |
| POSTGRES_USER | | Y | The PostgreSQL user |
| POSTGRES_PASSWORD | | Y | The PostgreSQL password |
| POSTGRES_EXTRA_OPTS | | | Extra PostgreSQL options |
| B2_ACCESS_KEY_ID | | Y | Your Backblaze B2 application key ID |
| B2_SECRET_ACCESS_KEY | | Y | Your Backblaze B2 application key |
| B2_BUCKET | | Y | Your Backblaze B2 bucket name |
| B2_PREFIX | backup | | Path prefix in your bucket |
| SCHEDULE | | | Backup schedule, see the explanation below |
| MEMORY_MONITOR_INTERVAL | 60 | | Memory monitoring interval in minutes for the cron scheduler |
| ENCRYPTION_PASSWORD | | | Password to encrypt/decrypt the backup |
| DELETE_OLDER_THAN | | | Deprecated: use B2_LIFECYCLE_DAYS instead |
| B2_LIFECYCLE_DAYS | | | Number of days after which B2 automatically deletes old backups (uses B2 Lifecycle Rules) |
| USE_CUSTOM_FORMAT | no | | Use PostgreSQL's custom format (-Fc) instead of plain text with compression |
| COMPRESSION_CMD | gzip | | Command used to compress the backup (e.g. pigz for parallel compression); ignored when USE_CUSTOM_FORMAT=yes |
| DECOMPRESSION_CMD | gunzip -c | | Command used to decompress the backup (e.g. pigz -dc for parallel decompression); ignored when USE_CUSTOM_FORMAT=yes |
| PARALLEL_JOBS | 1 | | Number of parallel jobs for pg_restore when using custom format backups |
| BACKUP_FILE | | Y* | Required for restore unless AUTO_FIND_LATEST=yes. The path to the backup file in B2, in the form B2_PREFIX/filename |
| AUTO_FIND_LATEST | no | | For restore: set to yes to find and restore the latest backup automatically (cannot be combined with BACKUP_FILE) |
| CREATE_DATABASE | no | | For restore: set to yes to create the database if it doesn't exist |
| DROP_DATABASE | no | | For restore: set to yes to drop the database before restoring (caution: destroys existing data). Combine with CREATE_DATABASE=yes to recreate it |
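
The variables marked Y can be checked before a container is started. The sketch below lists any that are unset or empty; missing_required is a hypothetical helper written for illustration and is not part of the image.

```shell
#!/bin/sh
# Hypothetical helper: prints the names of required variables that are
# unset or empty, one per line, and returns non-zero if any are missing.
missing_required() {
  status=0
  for name in B2_ACCESS_KEY_ID B2_SECRET_ACCESS_KEY B2_BUCKET \
              POSTGRES_DATABASE POSTGRES_HOST POSTGRES_USER POSTGRES_PASSWORD; do
    # Indirect lookup of the variable called $name (works in POSIX sh).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      status=1
    fi
  done
  return $status
}
```

A wrapper can call missing_required || exit 1 so that a typo in a variable name is reported before any backup or restore work begins.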

Automatic Periodic Backups

You can additionally set the SCHEDULE environment variable like -e SCHEDULE="@daily" to run the backup automatically.

More information about the scheduling can be found here.

Automatic Backup Cleanup

Recommended approach: Use B2 Lifecycle Rules for reliable, automatic cleanup:

docker run ... -e B2_LIFECYCLE_DAYS=15 ... toronz/postgres-backup-b2

This automatically configures B2 Lifecycle Rules on your bucket to delete backup files older than the specified number of days. This is handled natively by Backblaze B2 and is much more reliable than client-side deletion.

Legacy approach (deprecated): The DELETE_OLDER_THAN environment variable is still supported but deprecated. It's recommended to migrate to B2_LIFECYCLE_DAYS for better reliability.
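
For orientation, a B2 lifecycle rule is a small JSON object with the fields fileNamePrefix, daysFromUploadingToHiding and daysFromHidingToDelete. The sketch below prints one such rule for a given prefix and number of days. How this project maps B2_LIFECYCLE_DAYS onto those fields is an assumption here (hide after N days, delete one day later), and lifecycle_rule_json is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: prints a B2 lifecycle rule as JSON.
# Assumption: files are hidden N days after upload and deleted 1 day after
# being hidden; the project's scripts may choose different values.
lifecycle_rule_json() {
  prefix=$1
  days=$2
  # Accept only positive integers without leading zeros, so the value is
  # always a valid JSON number.
  case $days in
    ''|0*|*[!0-9]*)
      echo "error: days must be a positive integer" >&2
      return 1
      ;;
  esac
  printf '{"fileNamePrefix": "%s/", "daysFromUploadingToHiding": %s, "daysFromHidingToDelete": 1}\n' \
    "$prefix" "$days"
}
```

Calling lifecycle_rule_json backup 15 shows the rule that would correspond to B2_PREFIX=backup and B2_LIFECYCLE_DAYS=15 under the stated assumption.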

Encryption

Set the ENCRYPTION_PASSWORD environment variable, for example -e ENCRYPTION_PASSWORD="superstrongpassword", to encrypt the backup. During a restore, encrypted backups are detected automatically and decrypted, provided ENCRYPTION_PASSWORD is set to the same password. A backup can also be decrypted manually with openssl aes-256-cbc -d -in backup.sql.gz.enc -out backup.sql.gz, which prompts for the password.
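
A quick way to build confidence in an encrypted backup is to run an encrypt/decrypt round trip locally. The sketch below uses openssl with AES-256-CBC and reads the password from the ENCRYPTION_PASSWORD environment variable. The exact openssl options used by the backup script are not shown in this README, so the flags here are assumptions, and the three function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers illustrating password-based AES-256-CBC with openssl.
# ENCRYPTION_PASSWORD must be exported before calling them.
# Assumption: no extra key-derivation flags (such as -pbkdf2) are used, to
# mirror the manual command above; newer openssl versions print a warning
# about this on stderr. The options used to encrypt a file must also be
# used to decrypt it.
encrypt_file() {
  openssl enc -aes-256-cbc -salt -pass env:ENCRYPTION_PASSWORD -in "$1" -out "$2"
}

decrypt_file() {
  openssl enc -aes-256-cbc -d -pass env:ENCRYPTION_PASSWORD -in "$1" -out "$2"
}

# openssl writes the 8-byte marker "Salted__" at the start of salted,
# password-encrypted files, which allows a cheap "is this encrypted?" test.
is_encrypted() {
  [ "$(head -c 8 "$1" 2>/dev/null)" = "Salted__" ]
}
```

Encrypting a small file, decrypting it again and comparing the result with cmp confirms that the password and options match before they are relied on for a real restore.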

Backup Format and Compression Options

There are two options for backup format:

  1. Plain text format with compression (default):

    • Uses plain SQL text output compressed with gzip/pigz
    • Standard and widely compatible
  2. PostgreSQL custom format:

    • Enable with -e USE_CUSTOM_FORMAT=yes
    • Significantly faster than plain text format
    • Produces smaller backup files (built-in compression)
    • Supports parallel restoration for faster restores
    • Allows selective table/schema restoration
    • Recommended for larger databases

For plain text format, backups are compressed with gzip by default. For improved performance on multi-core systems, you can use pigz (parallel gzip) instead:

$ docker run ... -e COMPRESSION_CMD=pigz ... toronz/postgres-backup-b2

$ docker run ... -e DECOMPRESSION_CMD="pigz -dc" ... toronz/postgres-backup-b2

When using custom format with parallel restore:

$ docker run ... -e USE_CUSTOM_FORMAT=yes ... toronz/postgres-backup-b2

$ docker run ... -e PARALLEL_JOBS=4 -e BACKUP_FILE=backup/dbname_0000-00-00T00:00:00Z.dump ... toronz/postgres-backup-b2

Note: Custom format is not available when using POSTGRES_DATABASE=all as pg_dumpall does not support this format.
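
The interaction between POSTGRES_DATABASE and USE_CUSTOM_FORMAT can be summarised as a small decision function. This sketch only prints the strategy and the resulting file extension. The fallback to plain SQL for POSTGRES_DATABASE=all is an assumption (the script might reject that combination instead), and backup_plan is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: prints "<strategy> <file extension>" for a backup.
# Assumption: POSTGRES_DATABASE=all always falls back to plain SQL, because
# pg_dumpall cannot write the custom format.
backup_plan() {
  database=$1
  use_custom_format=$2
  if [ "$database" = "all" ]; then
    echo "pg_dumpall sql.gz"
  elif [ "$use_custom_format" = "yes" ]; then
    echo "pg_dump-custom dump"
  else
    echo "pg_dump-plain sql.gz"
  fi
}
```

The extension matters at restore time: the flowchart routes .sql.gz files to psql and .dump files to pg_restore.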

Optimizing Restore Performance

For the fastest possible restore with minimal time investment, follow these recommendations:

1. Use PostgreSQL Custom Format (Recommended)

# For backup
docker run ... -e USE_CUSTOM_FORMAT=yes ... toronz/postgres-backup-b2

# For restore with parallel jobs
docker run ... -e USE_CUSTOM_FORMAT=yes -e PARALLEL_JOBS=4 -e BACKUP_FILE=backup/dbname_0000-00-00T00:00:00Z.dump ... toronz/postgres-backup-b2

Benefits:

  • Significantly faster than plain text format
  • Smaller backup files (built-in compression)
  • Supports parallel restoration
  • Allows selective restoration of tables/schemas

2. Optimize Parallel Restoration

Set PARALLEL_JOBS to match your system's capabilities:

  • For 4-core systems: PARALLEL_JOBS=4
  • For 8-core systems: PARALLEL_JOBS=8
  • For high-memory systems: PARALLEL_JOBS=16 (or higher)
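
A starting value can be derived from the number of CPU cores on the machine that runs the restore. The sketch below is one way to do that, with an upper limit passed as an optional argument; suggest_parallel_jobs is a hypothetical helper, and the default cap of 16 is simply the highest value mentioned in the list above.

```shell
#!/bin/sh
# Hypothetical helper: suggests a PARALLEL_JOBS value from the number of
# online CPU cores, capped at an optional maximum (default 16).
suggest_parallel_jobs() {
  max=${1:-16}
  cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
  # Fall back to a single job if the core count could not be determined.
  case $cores in
    ''|*[!0-9]*) cores=1 ;;
  esac
  if [ "$cores" -lt 1 ]; then
    cores=1
  fi
  if [ "$cores" -gt "$max" ]; then
    cores=$max
  fi
  echo "$cores"
}
```

The result can be passed straight to the container, for example -e PARALLEL_JOBS="$(suggest_parallel_jobs 8)". Each pg_restore job opens its own connection to the server, so the database host's capacity and connection limit are worth considering alongside the local core count.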

3. Pre-configure Database

Use DROP_DATABASE=yes and CREATE_DATABASE=yes for clean restoration:

docker run ... -e DROP_DATABASE=yes -e CREATE_DATABASE=yes ... toronz/postgres-backup-b2

4. Use High-Performance Compression (for plain text format)

If you must use plain text format, use pigz for parallel compression:

# For backup
docker run ... -e COMPRESSION_CMD=pigz ... toronz/postgres-backup-b2

# For restore
docker run ... -e DECOMPRESSION_CMD="pigz -dc" ... toronz/postgres-backup-b2

5. One-Time Restore Setup

For the fastest one-time restore, create a simple script:

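#!/bin/bash
# fast-restore.sh
# Expects B2_ACCESS_KEY_ID, B2_SECRET_ACCESS_KEY, B2_BUCKET and the
# POSTGRES_* variables to be set in the calling environment already.
set -euo pipefail  # stop on errors and on any unset variable

export BACKUP_FILE="backup/your-database_2024-01-01T00:00:00Z.dump"
export USE_CUSTOM_FORMAT="yes"
export PARALLEL_JOBS="8"
export DROP_DATABASE="yes"
export CREATE_DATABASE="yes"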

docker run --rm \
  -e B2_ACCESS_KEY_ID="$B2_ACCESS_KEY_ID" \
  -e B2_SECRET_ACCESS_KEY="$B2_SECRET_ACCESS_KEY" \
  -e B2_BUCKET="$B2_BUCKET" \
  -e BACKUP_FILE="$BACKUP_FILE" \
  -e POSTGRES_DATABASE="$POSTGRES_DATABASE" \
  -e POSTGRES_HOST="$POSTGRES_HOST" \
  -e POSTGRES_USER="$POSTGRES_USER" \
  -e POSTGRES_PASSWORD="$POSTGRES_PASSWORD" \
  -e USE_CUSTOM_FORMAT="$USE_CUSTOM_FORMAT" \
  -e PARALLEL_JOBS="$PARALLEL_JOBS" \
  -e DROP_DATABASE="$DROP_DATABASE" \
  -e CREATE_DATABASE="$CREATE_DATABASE" \
  toronz/postgres-backup-b2

Performance Comparison

| Format | Compression | Parallel Jobs | Relative Speed | File Size |
| --- | --- | --- | --- | --- |
| Plain SQL + gzip | Yes | No | 1x | Large |
| Plain SQL + pigz | Yes | No | 1.5x | Large |
| Custom Format | Built-in | No | 3x | Medium |
| Custom Format | Built-in | Yes (4 jobs) | 6x | Medium |
| Custom Format | Built-in | Yes (8 jobs) | 10x+ | Medium |

Recommendation: Use custom format with parallel jobs for the fastest restore experience.
