From 77ec394c84cbc037125bcc95dfa40136b0948c8c Mon Sep 17 00:00:00 2001
From: Riccardo Busetti
Date: Tue, 18 Nov 2025 17:06:13 +0100
Subject: [PATCH 01/12] feat(docs): Improve docs and README

---
 README.md                                | 16 +++++++++---
 docs/explanation/architecture.md         |  2 +-
 docs/how-to/index.md                     |  6 +++++
 docs/how-to/postgres-state-store.md      | 32 ++++++++++++++++++++++++
 docs/index.md                            | 15 +++++++----
 docs/tutorials/custom-implementations.md |  2 +-
 docs/tutorials/first-pipeline.md         |  2 +-
 mkdocs.yaml                              |  1 +
 8 files changed, 64 insertions(+), 12 deletions(-)
 create mode 100644 docs/how-to/postgres-state-store.md

diff --git a/README.md b/README.md
index e598b2e3..ef8a63bc 100644
--- a/README.md
+++ b/README.md
@@ -114,18 +114,26 @@ For tutorials and deeper guidance, see the [Documentation](https://supabase.gith
 
 ## Destinations
 
-ETL is designed to be extensible. You can implement your own destinations to send data to any destination you like, however it comes with a few built in destinations:
+ETL is designed to be extensible. You can implement your own destinations, and the project currently ships with the following maintained options:
 
-- BigQuery
+- **BigQuery** – full CRUD-capable replication for analytics workloads.
+- **Apache Iceberg** – append-only log of operations today (no in-place updates yet).
 
-Out-of-the-box destinations are available in the `etl-destinations` crate:
+Enable the destinations you need through the `etl-destinations` crate:
 
 ```toml
 [dependencies]
 etl = { git = "https://github.com/supabase/etl" }
-etl-destinations = { git = "https://github.com/supabase/etl", features = ["bigquery"] }
+etl-destinations = { git = "https://github.com/supabase/etl", features = ["bigquery", "iceberg"] }
 ```
 
+## Contributing
+
+We welcome pull requests and GitHub issues. That said, we currently cannot accept new custom destinations unless there
+is significant community demand. Each destination carries a high long-term maintenance cost, and we are prioritizing core stability,
+observability, and ergonomics. If you need a destination that is not yet supported, please start a discussion or issue so we can gauge demand
+before proposing an implementation.
+
 ## License
 
 Apache‑2.0. See `LICENSE` for details.
diff --git a/docs/explanation/architecture.md b/docs/explanation/architecture.md
index 22d092cd..dce61e04 100644
--- a/docs/explanation/architecture.md
+++ b/docs/explanation/architecture.md
@@ -25,7 +25,7 @@ flowchart LR
     end
 
     subgraph Destination[Destination]
-        Dest["BigQuery<br/>Custom API<br/>Memory"]
+        Dest["BigQuery<br/>Apache Iceberg<br/>Custom API"]
     end
 
     subgraph Store[Store]
diff --git a/docs/how-to/index.md b/docs/how-to/index.md
index 67eac066..6d1bf049 100644
--- a/docs/how-to/index.md
+++ b/docs/how-to/index.md
@@ -12,6 +12,12 @@ Set up Postgres with the correct settings, and publications for ETL pipelines.
 
 **When to use:** Setting up a new Postgres source for replication.
 
+### [Apply Postgres State Store Migrations](postgres-state-store.md)
+
+Create the `etl` schema, replication state tables, and related objects required by `PostgresStore`.
+
+**When to use:** Before running a pipeline that uses the Postgres-backed state or schema stores.
+
 ## Next Steps
 
 After solving your immediate problem:
diff --git a/docs/how-to/postgres-state-store.md b/docs/how-to/postgres-state-store.md
new file mode 100644
index 00000000..aafd76eb
--- /dev/null
+++ b/docs/how-to/postgres-state-store.md
@@ -0,0 +1,32 @@
+# Apply Postgres State Store Migrations
+
+**Prepare the Postgres-backed state store before running pipelines**
+
+`PostgresStore` (and the matching schema store) keep replication metadata inside your own Postgres database. The tables live in the `etl` schema and must be created before a pipeline starts, otherwise you will see errors such as `relation "etl.table_mappings" does not exist`.
+
+Follow these steps whenever you configure a Postgres-backed store.
+
+## 1. Pick the database and user
+
+- Choose the Postgres database that should store ETL metadata (often separate from the source database).
+- Ensure the user credentials configured in `PgConnectionConfig` have privileges to create schemas, tables, and indexes in that database.
+
+## 2. Apply the migrations
+
+All SQL migrations for the Postgres store reside in `etl-replicator/migrations/`. Apply them in order (they are timestamp-prefixed) using your preferred tooling. With `psql`:
+
+```bash
+cd /path/to/etl
+psql "postgres://user:password@host:port/database" -f etl-replicator/migrations/20250827000000_base.sql
+```
+
+If additional migration files appear in that directory, run them sequentially (for example with `ls etl-replicator/migrations/*.sql | sort | xargs -I{} psql -f {}`) before restarting your pipeline.
+
+## 3. Verify the schema
+
+After applying the migrations:
+
+- Confirm the `etl` schema exists.
+- Check that tables like `replication_state`, `table_mappings`, and `schema_definitions` are present.
+
+You can now safely configure `PostgresStore`/`PostgresSchemaStore` in your pipeline. Future migrations can be applied on top.
diff --git a/docs/index.md b/docs/index.md
index b7fbee02..9f6e483c 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -33,11 +33,11 @@
 
 **Postgres Logical Replication** streams data changes from Postgres databases in real-time using the Write-Ahead Log (WAL). ETL builds on this foundation to provide:
 
-- 🚀 **Real-time replication** - Stream changes as they happen
-- 🔄 **Multiple destinations** - BigQuery and more coming soon
-- 🛡️ **Fault tolerance** - Built-in error handling and recovery
-- ⚡ **High performance** - Efficient batching and parallel processing
-- 🔧 **Extensible** - Plugin architecture for custom destinations
+- **Real-time replication** - Stream changes as they happen
+- **Multiple destinations** - BigQuery and Apache Iceberg officially supported
+- **Fault tolerance** - Built-in error handling and recovery
+- **High performance** - Efficient batching and parallel processing
+- **Extensible** - Plugin architecture for custom destinations
 
 ## Quick Example
@@ -91,5 +91,10 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
 
 - **First time using ETL?** → Start with [Build your first pipeline](tutorials/first-pipeline.md)
 - **Need Postgres setup help?** → Check [Configure Postgres for Replication](how-to/configure-postgres.md)
+- **Using Postgres for state storage?** → Follow [Apply Postgres state store migrations](how-to/postgres-state-store.md)
 - **Need technical details?** → Check the [Reference](reference/index.md)
 - **Want to understand the architecture?** → Read [ETL Architecture](explanation/architecture.md)
+
+## Contributing
+
+Contributions and bug reports are welcome in the GitHub repository. At the moment we cannot accept new custom destination implementations unless a large portion of the community requests them, because every destination adds a long-lived maintenance burden and we are focusing engineering time on stability, observability, and ergonomics. Please open an issue or discussion first if you believe a new destination should be prioritized.
diff --git a/docs/tutorials/custom-implementations.md b/docs/tutorials/custom-implementations.md
index 25de68d2..1e7dfd9b 100644
--- a/docs/tutorials/custom-implementations.md
+++ b/docs/tutorials/custom-implementations.md
@@ -704,7 +704,7 @@ You now have working custom ETL components:
 
 - **Connect to real Postgres** → [Configure Postgres for Replication](../how-to/configure-postgres.md)
 - **Understand the architecture** → [ETL Architecture](../explanation/architecture.md)
-- **Contribute to ETL** → [Open an issue](https://github.com/supabase/etl/issues) with your custom implementations
+- **Contribute thoughtfully** → [Open an issue](https://github.com/supabase/etl/issues) before proposing a new destination; we currently accept new destinations only when there is clear, broad demand due to the maintenance cost.
 
 ## See Also
diff --git a/docs/tutorials/first-pipeline.md b/docs/tutorials/first-pipeline.md
index 20ee676b..9ad543fb 100644
--- a/docs/tutorials/first-pipeline.md
+++ b/docs/tutorials/first-pipeline.md
@@ -213,7 +213,7 @@ DELETE FROM users WHERE email = 'bob@example.com';
 
 ## Step 6: Verify Data Replication
 
-The data is now replicated in your memory destination. While this tutorial uses memory (perfect for testing), the same pattern works with BigQuery, DuckDB, or custom destinations.
+The data is now replicated in your memory destination. While this tutorial uses memory (perfect for testing), the same pattern works with any destination.
 
 **Checkpoint:** You've successfully built and tested a complete ETL pipeline!
diff --git a/mkdocs.yaml b/mkdocs.yaml
index 047c82e4..2966dade 100644
--- a/mkdocs.yaml
+++ b/mkdocs.yaml
@@ -15,6 +15,7 @@ nav:
   - How-to Guides:
       - Overview: how-to/index.md
       - Configure Postgres: how-to/configure-postgres.md
+      - Apply Postgres State Store Migrations: how-to/postgres-state-store.md
   - Reference:
       - Overview: reference/index.md
   - Explanation:
From fb3968266398fc4f460b2603f979c5fecda27559 Mon Sep 17 00:00:00 2001
From: Riccardo Busetti
Date: Wed, 19 Nov 2025 11:06:46 +0100
Subject: [PATCH 02/12] feat(docs): Improve docs

---
 AGENTS.md                                |   2 +-
 DEVELOPMENT.md                           | 357 +++++++++++++++++++++++
 README.md                                |   4 +
 etl-replicator/scripts/run_migrations.sh |  48 +++
 4 files changed, 410 insertions(+), 1 deletion(-)
 create mode 100644 DEVELOPMENT.md
 create mode 100755 etl-replicator/scripts/run_migrations.sh

diff --git a/AGENTS.md b/AGENTS.md
index 5ff8ff8d..4a8e9e59 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -2,7 +2,7 @@
 
 ## Project Structure & Modules
 
 - Rust workspace (`Cargo.toml`) with crates: `etl/` (core), `etl-api/` (HTTP API), `etl-postgres/`, `etl-destinations/`, `etl-replicator/`, `etl-config/`, `etl-telemetry/`, `etl-examples/`, `etl-benchmarks/`.
-- Docs in `docs/`; ops tooling in `scripts/` (Docker Compose, DB init, migrations).
+- Docs in `docs/`; development setup in `DEVELOPMENT.md`; ops tooling in `scripts/` (Docker Compose, DB init, migrations).
 - Tests live per crate (`src` unit tests, `tests` integration); benches in `etl-benchmarks/benches/`.
 
 ## Build and Test
diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
new file mode 100644
index 00000000..243f85f5
--- /dev/null
+++ b/DEVELOPMENT.md
@@ -0,0 +1,357 @@
+# Development Guide
+
+This guide covers setting up your development environment, running migrations, and common development workflows for the ETL project.
+
+## Table of Contents
+
+- [Prerequisites](#prerequisites)
+- [Quick Start](#quick-start)
+- [Database Setup](#database-setup)
+  - [Using the Setup Script](#using-the-setup-script)
+  - [Manual Setup](#manual-setup)
+- [Database Migrations](#database-migrations)
+  - [ETL API Migrations](#etl-api-migrations)
+  - [ETL Replicator Migrations](#etl-replicator-migrations)
+- [Running the Services](#running-the-services)
+- [Kubernetes Setup](#kubernetes-setup)
+- [Common Development Tasks](#common-development-tasks)
+
+## Prerequisites
+
+Before starting, ensure you have the following installed:
+
+### Required Tools
+
+- **Rust** (latest stable): [Install Rust](https://rustup.rs/)
+- **PostgreSQL client** (`psql`): Required for database operations
+- **Docker Compose**: For running PostgreSQL and other services
+- **kubectl**: For Kubernetes operations
+- **SQLx CLI**: For database migrations
+
+Install SQLx CLI:
+
+```bash
+cargo install --version='~0.7' sqlx-cli --no-default-features --features rustls,postgres
+```
+
+### Optional Tools
+
+- **OrbStack**: Recommended for local Kubernetes development (alternative to Docker Desktop)
+  - [Install OrbStack](https://orbstack.dev)
+  - Enable Kubernetes in OrbStack settings
+
+## Quick Start
+
+The fastest way to get started is using the setup script:
+
+```bash
+# From the project root
+./scripts/init.sh
+```
+
+This script will:
+
+1. Start PostgreSQL via Docker Compose
+2. Run etl-api migrations
+3. Seed the default replicator image
+4. Configure the Kubernetes environment (OrbStack)
+
+## Database Setup
+
+### Using the Setup Script
+
+The `scripts/init.sh` script provides a complete development environment setup:
+
+```bash
+# Use default settings (Postgres on port 5430)
+./scripts/init.sh
+
+# Customize database settings
+POSTGRES_PORT=5432 POSTGRES_DB=mydb ./scripts/init.sh
+
+# Skip Docker if you already have Postgres running
+SKIP_DOCKER=1 ./scripts/init.sh
+
+# Use persistent storage
+POSTGRES_DATA_VOLUME=/path/to/data ./scripts/init.sh
+```
+
+**Environment Variables:**
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `POSTGRES_USER` | `postgres` | Database user |
+| `POSTGRES_PASSWORD` | `postgres` | Database password |
+| `POSTGRES_DB` | `postgres` | Database name |
+| `POSTGRES_PORT` | `5430` | Database port |
+| `POSTGRES_HOST` | `localhost` | Database host |
+| `SKIP_DOCKER` | (empty) | Skip Docker Compose if set |
+| `POSTGRES_DATA_VOLUME` | (empty) | Path for persistent storage |
+| `REPLICATOR_IMAGE` | `ramsup/etl-replicator:latest` | Default replicator image |
+
+### Manual Setup
+
+If you prefer manual setup or have an existing PostgreSQL instance:
+
+1. **Set the database URL:**
+
+```bash
+export DATABASE_URL=postgres://USER:PASSWORD@HOST:PORT/DB
+```
+
+2. **Run etl-api migrations:**
+
+```bash
+./etl-api/scripts/run_migrations.sh
+```
+
+3. **Run etl-replicator migrations:**
+
+```bash
+./etl-replicator/scripts/run_migrations.sh
+```
+
+## Database Migrations
+
+The project uses SQLx for database migrations. There are two sets of migrations:
+
+### ETL API Migrations
+
+Located in `etl-api/migrations/`, these create the control plane schema (`app` schema) for managing tenants, sources, destinations, and pipelines.
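+The migration workflows in this guide all read the same `POSTGRES_*` variables described in the table above. As a rough sketch (mirroring what the repository's migration scripts do; the specific values are the documented defaults, not requirements), the `DATABASE_URL` they operate on can be composed like this:

```shell
# Compose DATABASE_URL from the POSTGRES_* variables, falling back to the
# documented defaults when a variable is unset.
DB_USER="${POSTGRES_USER:-postgres}"
DB_PASSWORD="${POSTGRES_PASSWORD:-postgres}"
DB_NAME="${POSTGRES_DB:-postgres}"
DB_PORT="${POSTGRES_PORT:-5430}"
DB_HOST="${POSTGRES_HOST:-localhost}"
export DATABASE_URL="postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:${DB_PORT}/${DB_NAME}"
echo "${DATABASE_URL}"
```

With none of the variables set, this prints `postgres://postgres:postgres@localhost:5430/postgres`.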
+
+**Running API migrations:**
+
+```bash
+# From project root
+./etl-api/scripts/run_migrations.sh
+
+# Or manually with SQLx CLI
+sqlx migrate run --source etl-api/migrations
+```
+
+**Creating a new API migration:**
+
+```bash
+cd etl-api
+sqlx migrate add <migration_name>
+```
+
+**Resetting the API database:**
+
+```bash
+cd etl-api
+sqlx migrate revert
+```
+
+**Updating SQLx metadata after schema changes:**
+
+```bash
+cd etl-api
+cargo sqlx prepare
+```
+
+### ETL Replicator Migrations
+
+Located in `etl-replicator/migrations/`, these create the replicator's state store schema (`etl` schema) for tracking replication state, table schemas, and mappings.
+
+**Running replicator migrations:**
+
+```bash
+# From project root
+./etl-replicator/scripts/run_migrations.sh
+
+# Or manually with SQLx CLI (requires setting search_path)
+psql $DATABASE_URL -c "create schema if not exists etl;"
+sqlx migrate run --source etl-replicator/migrations --database-url "${DATABASE_URL}?options=-csearch_path%3Detl"
+```
+
+**Note:** The replicator migrations are also run automatically when the replicator starts (see `etl-replicator/src/migrations.rs:16`). The script is useful for:
+
+- Pre-creating the state store schema
+- Testing migrations independently
+- CI/CD pipelines
+- Setting up replicator state on new databases
+
+**Creating a new replicator migration:**
+
+```bash
+cd etl-replicator
+sqlx migrate add <migration_name>
+```
+
+## Running the Services
+
+### ETL API
+
+```bash
+cd etl-api
+cargo run
+```
+
+The API requires the `DATABASE_URL` environment variable and a valid configuration file. See `etl-api/README.md` for configuration details.
+
+### ETL Replicator
+
+The replicator is typically deployed as a Kubernetes pod, but can be run locally for testing:
+
+```bash
+cd etl-replicator
+cargo run
+```
+
+## Kubernetes Setup
+
+The project uses Kubernetes for deploying replicators. The setup script configures the necessary resources.
+
+**Prerequisites:**
+
+- OrbStack with Kubernetes enabled (or another local Kubernetes cluster)
+- `kubectl` configured with the `orbstack` context
+
+**Manual Kubernetes setup:**
+
+```bash
+kubectl --context orbstack apply -f scripts/etl-data-plane.yaml
+kubectl --context orbstack apply -f scripts/trusted-root-certs-config.yaml
+```
+
+**Checking deployed resources:**
+
+```bash
+# List replicator pods
+kubectl get pods -n etl-control-plane -l app=etl-api
+
+# View logs
+kubectl logs -n etl-control-plane -l app=etl-api --tail=100
+
+# Describe a specific pod
+kubectl describe pod <pod-name> -n etl-control-plane
+```
+
+## Common Development Tasks
+
+### Running Tests
+
+```bash
+# Run all tests
+cargo test --workspace
+
+# Run tests for a specific crate
+cargo test -p etl-api
+cargo test -p etl-replicator
+
+# Run with all features enabled
+cargo test --workspace --all-features
+```
+
+### Building the Project
+
+```bash
+# Build all crates
+cargo build --workspace
+
+# Build in release mode
+cargo build --workspace --release
+
+# Build a specific crate
+cargo build -p etl-api
+```
+
+### Checking Code
+
+```bash
+# Run clippy for linting
+cargo clippy --workspace --all-targets
+
+# Format code
+cargo fmt --all
+
+# Check without building
+cargo check --workspace
+```
+
+### Docker Images
+
+```bash
+# Build the API image
+docker build -f etl-api/Dockerfile -t etl-api:dev .
+
+# Build the replicator image
+docker build -f etl-replicator/Dockerfile -t etl-replicator:dev .
+```
+
+### Viewing Logs
+
+```bash
+# Docker Compose logs
+docker-compose -f scripts/docker-compose.yaml logs -f
+
+# PostgreSQL logs specifically
+docker-compose -f scripts/docker-compose.yaml logs -f postgres
+```
+
+### Cleaning Up
+
+```bash
+# Stop Docker Compose services
+docker-compose -f scripts/docker-compose.yaml down
+
+# Remove volumes (WARNING: deletes data)
+docker-compose -f scripts/docker-compose.yaml down -v
+
+# Clean Rust build artifacts
+cargo clean
+```
+
+## Troubleshooting
+
+### Database Connection Issues
+
+If you encounter connection issues:
+
+1. Verify PostgreSQL is running:
+   ```bash
+   docker-compose -f scripts/docker-compose.yaml ps
+   ```
+
+2. Check the connection:
+   ```bash
+   psql $DATABASE_URL -c "SELECT 1;"
+   ```
+
+3. Ensure the correct port is used (default: 5430)
+
+### Migration Issues
+
+If migrations fail:
+
+1. Check if the database exists:
+   ```bash
+   psql $DATABASE_URL -c "\l"
+   ```
+
+2. Verify SQLx CLI is installed:
+   ```bash
+   sqlx --version
+   ```
+
+3. Check migration history:
+   ```bash
+   psql $DATABASE_URL -c "SELECT * FROM _sqlx_migrations;"
+   ```
+
+### Kubernetes Issues
+
+If Kubernetes resources aren't deploying:
+
+1. Verify context:
+   ```bash
+   kubectl config current-context
+   ```
+
+2. Check cluster status:
+   ```bash
+   kubectl cluster-info
+   ```
+
+3. View events:
+   ```bash
+   kubectl get events -n etl-control-plane --sort-by='.lastTimestamp'
+   ```
diff --git a/README.md b/README.md
index ef8a63bc..9acf7bfe 100644
--- a/README.md
+++ b/README.md
@@ -112,6 +112,10 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
 
 For tutorials and deeper guidance, see the [Documentation](https://supabase.github.io/etl) or jump into the [examples](etl-examples/README.md).
 
+## Development
+
+See [DEVELOPMENT.md](DEVELOPMENT.md) for setup instructions, migration workflows, and development guidelines.
+
 ## Destinations
 
 ETL is designed to be extensible. You can implement your own destinations, and the project currently ships with the following maintained options:
diff --git a/etl-replicator/scripts/run_migrations.sh b/etl-replicator/scripts/run_migrations.sh
new file mode 100755
index 00000000..6786f6b5
--- /dev/null
+++ b/etl-replicator/scripts/run_migrations.sh
@@ -0,0 +1,48 @@
+#!/usr/bin/env bash
+set -eo pipefail
+
+if [ ! -d "etl-replicator/migrations" ]; then
+  echo >&2 "❌ Error: 'etl-replicator/migrations' folder not found."
+  echo >&2 "Please run this script from the 'etl' directory."
+  exit 1
+fi
+
+if ! [ -x "$(command -v sqlx)" ]; then
+  echo >&2 "❌ Error: SQLx CLI is not installed."
+  echo >&2 "To install it, run:"
+  echo >&2 "  cargo install --version='~0.7' sqlx-cli --no-default-features --features rustls,postgres"
+  exit 1
+fi
+
+if ! [ -x "$(command -v psql)" ]; then
+  echo >&2 "❌ Error: Postgres client (psql) is not installed."
+  echo >&2 "Please install it using your system's package manager."
+  exit 1
+fi
+
+# Database configuration
+DB_USER="${POSTGRES_USER:=postgres}"
+DB_PASSWORD="${POSTGRES_PASSWORD:=postgres}"
+DB_NAME="${POSTGRES_DB:=postgres}"
+DB_PORT="${POSTGRES_PORT:=5430}"
+DB_HOST="${POSTGRES_HOST:=localhost}"
+
+# Set up the database URL
+export DATABASE_URL=postgres://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:${DB_PORT}/${DB_NAME}
+
+echo "🔄 Running replicator state store migrations..."
+
+# Create the etl schema if it doesn't exist
+# This matches the behavior in etl-replicator/src/migrations.rs
+psql "${DATABASE_URL}" -v ON_ERROR_STOP=1 -c "create schema if not exists etl;" > /dev/null
+
+# Create a temporary sqlx-cli compatible database URL that sets the search_path
+# This ensures the _sqlx_migrations table is created in the etl schema
+SQLX_MIGRATIONS_OPTS="options=-csearch_path%3Detl"
+MIGRATION_URL="${DATABASE_URL}?${SQLX_MIGRATIONS_OPTS}"
+
+# Run migrations with the modified URL
+sqlx database create --database-url "${DATABASE_URL}"
+sqlx migrate run --source etl-replicator/migrations --database-url "${MIGRATION_URL}"
+
+echo "✨ Replicator state store migrations complete! Ready to go!"
From f6e0fd18e2c18062176f89945012b71c718ebe6a Mon Sep 17 00:00:00 2001
From: Riccardo Busetti
Date: Wed, 19 Nov 2025 11:13:13 +0100
Subject: [PATCH 03/12] Improve

---
 DEVELOPMENT.md                      | 75 -----------------------------
 docs/how-to/index.md                |  6 ---
 docs/how-to/postgres-state-store.md | 32 ------------
 docs/index.md                       |  1 -
 4 files changed, 114 deletions(-)
 delete mode 100644 docs/how-to/postgres-state-store.md

diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
index 243f85f5..7cc4349c 100644
--- a/DEVELOPMENT.md
+++ b/DEVELOPMENT.md
@@ -225,81 +225,6 @@ kubectl logs -n etl-control-plane -l app=etl-api --tail=100
 kubectl describe pod <pod-name> -n etl-control-plane
 ```
 
-## Common Development Tasks
-
-### Running Tests
-
-```bash
-# Run all tests
-cargo test --workspace
-
-# Run tests for a specific crate
-cargo test -p etl-api
-cargo test -p etl-replicator
-
-# Run with all features enabled
-cargo test --workspace --all-features
-```
-
-### Building the Project
-
-```bash
-# Build all crates
-cargo build --workspace
-
-# Build in release mode
-cargo build --workspace --release
-
-# Build a specific crate
-cargo build -p etl-api
-```
-
-### Checking Code
-
-```bash
-# Run clippy for linting
-cargo clippy --workspace --all-targets
-
-# Format code
-cargo fmt --all
-
-# Check without building
-cargo check --workspace
-```
-
-### Docker Images
-
-```bash
-# Build the API image
-docker build -f etl-api/Dockerfile -t etl-api:dev .
-
-# Build the replicator image
-docker build -f etl-replicator/Dockerfile -t etl-replicator:dev .
-```
-
-### Viewing Logs
-
-```bash
-# Docker Compose logs
-docker-compose -f scripts/docker-compose.yaml logs -f
-
-# PostgreSQL logs specifically
-docker-compose -f scripts/docker-compose.yaml logs -f postgres
-```
-
-### Cleaning Up
-
-```bash
-# Stop Docker Compose services
-docker-compose -f scripts/docker-compose.yaml down
-
-# Remove volumes (WARNING: deletes data)
-docker-compose -f scripts/docker-compose.yaml down -v
-
-# Clean Rust build artifacts
-cargo clean
-```
-
 ## Troubleshooting
 
 ### Database Connection Issues
diff --git a/docs/how-to/index.md b/docs/how-to/index.md
index 6d1bf049..67eac066 100644
--- a/docs/how-to/index.md
+++ b/docs/how-to/index.md
@@ -12,12 +12,6 @@ Set up Postgres with the correct settings, and publications for ETL pipelines.
 
 **When to use:** Setting up a new Postgres source for replication.
 
-### [Apply Postgres State Store Migrations](postgres-state-store.md)
-
-Create the `etl` schema, replication state tables, and related objects required by `PostgresStore`.
-
-**When to use:** Before running a pipeline that uses the Postgres-backed state or schema stores.
-
 ## Next Steps
 
 After solving your immediate problem:
diff --git a/docs/how-to/postgres-state-store.md b/docs/how-to/postgres-state-store.md
deleted file mode 100644
index aafd76eb..00000000
--- a/docs/how-to/postgres-state-store.md
+++ /dev/null
@@ -1,32 +0,0 @@
-# Apply Postgres State Store Migrations
-
-**Prepare the Postgres-backed state store before running pipelines**
-
-`PostgresStore` (and the matching schema store) keep replication metadata inside your own Postgres database. The tables live in the `etl` schema and must be created before a pipeline starts, otherwise you will see errors such as `relation "etl.table_mappings" does not exist`.
-
-Follow these steps whenever you configure a Postgres-backed store.
-
-## 1. Pick the database and user
-
-- Choose the Postgres database that should store ETL metadata (often separate from the source database).
-- Ensure the user credentials configured in `PgConnectionConfig` have privileges to create schemas, tables, and indexes in that database.
-
-## 2. Apply the migrations
-
-All SQL migrations for the Postgres store reside in `etl-replicator/migrations/`. Apply them in order (they are timestamp-prefixed) using your preferred tooling. With `psql`:
-
-```bash
-cd /path/to/etl
-psql "postgres://user:password@host:port/database" -f etl-replicator/migrations/20250827000000_base.sql
-```
-
-If additional migration files appear in that directory, run them sequentially (for example with `ls etl-replicator/migrations/*.sql | sort | xargs -I{} psql -f {}`) before restarting your pipeline.
-
-## 3. Verify the schema
-
-After applying the migrations:
-
-- Confirm the `etl` schema exists.
-- Check that tables like `replication_state`, `table_mappings`, and `schema_definitions` are present.
-
-You can now safely configure `PostgresStore`/`PostgresSchemaStore` in your pipeline. Future migrations can be applied on top.
diff --git a/docs/index.md b/docs/index.md
index 9f6e483c..c33df729 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -91,7 +91,6 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
 
 - **First time using ETL?** → Start with [Build your first pipeline](tutorials/first-pipeline.md)
 - **Need Postgres setup help?** → Check [Configure Postgres for Replication](how-to/configure-postgres.md)
-- **Using Postgres for state storage?** → Follow [Apply Postgres state store migrations](how-to/postgres-state-store.md)
 - **Need technical details?** → Check the [Reference](reference/index.md)
 - **Want to understand the architecture?** → Read [ETL Architecture](explanation/architecture.md)
From 415f143d91ae528de20db8d6776dfb25796eee24 Mon Sep 17 00:00:00 2001
From: Riccardo Busetti
Date: Wed, 19 Nov 2025 11:18:19 +0100
Subject: [PATCH 04/12] Improve

---
 README.md                        | 37 ++++++++++++++++----------------
 docs/explanation/architecture.md |  2 +-
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/README.md b/README.md
index 9acf7bfe..703ef7ae 100644
--- a/README.md
+++ b/README.md
@@ -40,13 +40,14 @@
 
 ETL is a Rust framework by [Supabase](https://supabase.com) for building high‑performance, real‑time data replication apps on Postgres. It sits on top of Postgres [logical replication](https://www.postgresql.org/docs/current/protocol-logical-replication.html) and gives you a clean, Rust‑native API for streaming changes to your own destinations.
 
-## Highlights
+## Features
 
-- **Real‑time replication**: stream changes in real time to your own destinations.
-- **High performance**: configurable batching and parallelism to maximize throughput.
-- **Fault-tolerant**: robust error handling and retry logic built-in.
-- **Extensible**: implement your own custom destinations and state/schema stores.
-- **Rust native**: typed and ergonomic Rust API.
+- **Real‑time replication**: stream changes in real time to your own destinations
+- **High performance**: configurable batching and parallelism to maximize throughput
+- **Fault-tolerant**: robust error handling and retry logic built-in
+- **Extensible**: implement your own custom destinations and state/schema stores
+- **Production destinations**: BigQuery and Apache Iceberg officially supported
+- **Type-safe**: fully typed Rust API with compile-time guarantees
 
 ## Requirements
@@ -102,9 +103,12 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
         max_table_sync_workers: 4,
     };
 
+    // Start the pipeline.
     let mut pipeline = Pipeline::new(config, store, destination);
     pipeline.start().await?;
-    // pipeline.wait().await?; // Optional: block until completion
+
+    // Wait for the pipeline indefinitely.
+    pipeline.wait().await?;
 
     Ok(())
 }
@@ -112,31 +116,28 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
 
 For tutorials and deeper guidance, see the [Documentation](https://supabase.github.io/etl) or jump into the [examples](etl-examples/README.md).
 
-## Development
-
-See [DEVELOPMENT.md](DEVELOPMENT.md) for setup instructions, migration workflows, and development guidelines.
-
 ## Destinations
 
 ETL is designed to be extensible. You can implement your own destinations, and the project currently ships with the following maintained options:
 
-- **BigQuery** – full CRUD-capable replication for analytics workloads.
-- **Apache Iceberg** – append-only log of operations today (no in-place updates yet).
+- **BigQuery** – full CRUD-capable replication for analytics workloads +- **Apache Iceberg** – append-only log of operations (updates coming soon) Enable the destinations you need through the `etl-destinations` crate: ```toml [dependencies] etl = { git = "https://github.com/supabase/etl" } -etl-destinations = { git = "https://github.com/supabase/etl", features = ["bigquery", "iceberg"] } +etl-destinations = { git = "https://github.com/supabase/etl", features = ["bigquery"] } ``` +## Development + +See [DEVELOPMENT.md](DEVELOPMENT.md) for setup instructions, migration workflows, and development guidelines. + ## Contributing -We welcome pull requests and GitHub issues. That said, we currently cannot accept new custom destinations unless there -is significant community demand. Each destination carries a high long-term maintenance cost, and we are prioritizing core stability, -observability, and ergonomics. If you need a destination that is not yet supported, please start a discussion or issue so we can gauge demand -before proposing an implementation. +We welcome pull requests and GitHub issues. We currently cannot accept new custom destinations unless there is significant community demand, as each destination carries a long-term maintenance cost. We are prioritizing core stability, observability, and ergonomics. If you need a destination that is not yet supported, please start a discussion or issue so we can gauge demand before proposing an implementation. ## License diff --git a/docs/explanation/architecture.md b/docs/explanation/architecture.md index dce61e04..6d090960 100644 --- a/docs/explanation/architecture.md +++ b/docs/explanation/architecture.md @@ -25,7 +25,7 @@ flowchart LR end subgraph Destination[Destination] - Dest["BigQuery
Apache Iceberg
Custom API"] + Dest["BigQuery
Apache Iceberg
Custom"] end subgraph Store[Store] From 8d43c1df09b02991ff5e67025506f64f8377fd49 Mon Sep 17 00:00:00 2001 From: Riccardo Busetti Date: Wed, 19 Nov 2025 11:20:09 +0100 Subject: [PATCH 05/12] Improve --- docs/index.md | 42 +++++++++++++++++++----------------------- 1 file changed, 19 insertions(+), 23 deletions(-) diff --git a/docs/index.md b/docs/index.md index c33df729..54fd6304 100644 --- a/docs/index.md +++ b/docs/index.md @@ -2,7 +2,7 @@ **Build real-time Postgres replication applications in Rust** -ETL is a Rust framework by [Supabase](https://supabase.com) that enables you to build high-performance, real-time data replication applications for Postgres. Whether you're creating ETL pipelines, implementing CDC (Change Data Capture), or building custom data synchronization solutions, ETL provides the building blocks you need. +ETL is a Rust framework by [Supabase](https://supabase.com) for building high‑performance, real‑time data replication apps on Postgres. It sits on top of Postgres logical replication and gives you a clean, Rust‑native API for streaming changes to your own destinations. ## Getting Started @@ -29,15 +29,14 @@ Read our **[Explanations](explanation/index.md)** for deeper insights: - [ETL architecture overview](explanation/architecture.md) - More explanations coming soon -## Core Concepts +## Features -**Postgres Logical Replication** streams data changes from Postgres databases in real-time using the Write-Ahead Log (WAL). 
ETL builds on this foundation to provide:
-
-- **Real-time replication** - Stream changes as they happen
-- **Multiple destinations** - BigQuery and Apache Iceberg officially supported
-- **Fault tolerance** - Built-in error handling and recovery
-- **High performance** - Efficient batching and parallel processing
-- **Extensible** - Plugin architecture for custom destinations
+- **Real‑time replication**: stream changes in real time to your own destinations
+- **High performance**: configurable batching and parallelism to maximize throughput
+- **Fault-tolerant**: robust error handling and retry logic built-in
+- **Extensible**: implement your own custom destinations and state/schema stores
+- **Production destinations**: BigQuery and Apache Iceberg officially supported
+- **Type-safe**: fully typed Rust API with compile-time guarantees
 
 ## Quick Example
 
@@ -51,36 +50,33 @@ use etl::{
 
 #[tokio::main]
 async fn main() -> Result<(), Box<dyn std::error::Error>> {
-    // Configure Postgres connection
-    let pg_config = PgConnectionConfig {
-        host: "localhost".to_string(),
+    let pg = PgConnectionConfig {
+        host: "localhost".into(),
         port: 5432,
-        name: "mydb".to_string(),
-        username: "postgres".to_string(),
-        password: Some("password".to_string().into()),
+        name: "mydb".into(),
+        username: "postgres".into(),
+        password: Some("password".into()),
         tls: TlsConfig { enabled: false, trusted_root_certs: String::new() },
     };
 
-    // Create memory-based store and destination for testing
     let store = MemoryStore::new();
     let destination = MemoryDestination::new();
 
-    // Configure the pipeline
     let config = PipelineConfig {
         id: 1,
-        publication_name: "my_publication".to_string(),
-        pg_connection: pg_config,
+        publication_name: "my_publication".into(),
+        pg_connection: pg,
         batch: BatchConfig { max_size: 1000, max_fill_ms: 5000 },
-        table_error_retry_delay_ms: 10000,
+        table_error_retry_delay_ms: 10_000,
         table_error_retry_max_attempts: 5,
         max_table_sync_workers: 4,
     };
 
-    // Create and start the pipeline
+    // Start the 
pipeline.
     let mut pipeline = Pipeline::new(config, store, destination);
     pipeline.start().await?;
 
-    // Pipeline will run until stopped
+    // Wait for the pipeline indefinitely.
     pipeline.wait().await?;
 
     Ok(())
@@ -96,4 +92,4 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
 
 ## Contributing
 
-Contributions and bug reports are welcome in the GitHub repository. At the moment we cannot accept new custom destination implementations unless a large portion of the community requests them, because every destination adds a long-lived maintenance burden and we are focusing engineering time on stability, observability, and ergonomics. Please open an issue or discussion first if you believe a new destination should be prioritized.
+We welcome pull requests and GitHub issues. We currently cannot accept new custom destinations unless there is significant community demand, as each destination carries a long-term maintenance cost. We are prioritizing core stability, observability, and ergonomics. If you need a destination that is not yet supported, please start a discussion or issue so we can gauge demand before proposing an implementation.

From 5c4669b1ba378e728abfdb212d33b2f49c989d74 Mon Sep 17 00:00:00 2001
From: Riccardo Busetti
Date: Wed, 19 Nov 2025 11:49:29 +0100
Subject: [PATCH 06/12] Improve

---
 docs/explanation/index.md | 2 +-
 docs/reference/index.md | 38 ++++++++++++++++++++++++++++++------
 docs/tutorials/index.md | 4 ++--
 3 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/docs/explanation/index.md b/docs/explanation/index.md
index bec4c325..3ed9aa95 100644
--- a/docs/explanation/index.md
+++ b/docs/explanation/index.md
@@ -33,4 +33,4 @@ After building a conceptual understanding:
 ## Contributing to Explanations
 
 Found gaps in these explanations? See something that could be clearer?
-[Open an issue](https://github.com/supabase/etl/issues) or contribute improvements to help other users build better mental models of ETL.
+[Open an issue](https://github.com/supabase/etl/issues/new) or contribute improvements to help other users build better mental models of ETL. diff --git a/docs/reference/index.md b/docs/reference/index.md index df2c3c1b..7683b89c 100644 --- a/docs/reference/index.md +++ b/docs/reference/index.md @@ -1,14 +1,40 @@ - # Reference -Complete API documentation is available through Rust's built-in documentation system. We will publish comprehensive rustdoc documentation that covers all public APIs, traits, and configuration structures. -Right now the docs are accessible via the code or by running: -```shell +**Complete API documentation for ETL** + +API documentation is available through Rust's built-in documentation system. Generate and browse the complete API reference locally: + +```bash cargo doc --workspace --all-features --no-deps --open ``` +This opens comprehensive rustdoc documentation covering: +- All public APIs, traits, and structs +- Configuration types and options +- Store and destination trait definitions +- Code examples and method signatures + +## Key Traits + +The core extension points in ETL: + +- **`Destination`** - Implement to send data to custom destinations +- **`StateStore`** - Manage replication state and table mappings +- **`SchemaStore`** - Handle table schema information +- **`CleanupStore`** - Atomic cleanup operations for removed tables + +## Configuration Types + +Main configuration structures: + +- **`PipelineConfig`** - Complete pipeline configuration +- **`PgConnectionConfig`** - Postgres connection settings +- **`BatchConfig`** - Batching and performance tuning +- **`TlsConfig`** - TLS/SSL configuration + ## See Also - [How-to guides](../how-to/index.md) - Task-oriented instructions -- [Tutorials](../tutorials/index.md) - Learning-oriented lessons -- [Explanations](../explanation/index.md) - Understanding-oriented discussions \ No newline at end of file +- [Tutorials](../tutorials/index.md) - Learning-oriented lessons +- 
[Explanations](../explanation/index.md) - Understanding-oriented discussions +- [GitHub Repository](https://github.com/supabase/etl) - Source code and issues \ No newline at end of file diff --git a/docs/tutorials/index.md b/docs/tutorials/index.md index 8790cbb5..5b0e5aef 100644 --- a/docs/tutorials/index.md +++ b/docs/tutorials/index.md @@ -18,7 +18,7 @@ _What you'll build:_ A working pipeline that streams changes from a sample Postg ### [Build Custom Stores and Destinations](custom-implementations.md) -**45 minutes** • **Advanced** +**30 minutes** • **Advanced** Implement production-ready custom stores and destinations. Learn ETL's design patterns, build persistent storage, implement cleanup primitives for safe table removal, and create HTTP-based destinations with retry logic. @@ -52,4 +52,4 @@ If you get stuck: 1. Double-check the prerequisites 2. Ensure your Postgres setup matches the requirements 3. Check the [Postgres configuration guide](../how-to/configure-postgres.md) -4. [Open an issue](https://github.com/supabase/etl/issues) with your specific problem +4. [Open an issue](https://github.com/supabase/etl/issues/new) with your specific problem From 8ea8af29a21d58417157a41dd7f98198d26701ec Mon Sep 17 00:00:00 2001 From: Riccardo Busetti Date: Wed, 19 Nov 2025 11:59:31 +0100 Subject: [PATCH 07/12] Improve --- DEVELOPMENT.md | 13 +++++++++---- mkdocs.yaml | 1 - 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md index 7cc4349c..346c1639 100644 --- a/DEVELOPMENT.md +++ b/DEVELOPMENT.md @@ -164,11 +164,16 @@ psql $DATABASE_URL -c "create schema if not exists etl;" sqlx migrate run --source etl-replicator/migrations --database-url "${DATABASE_URL}?options=-csearch_path%3Detl" ``` -**Note:** The replicator migrations are also run automatically when the replicator starts (see `etl-replicator/src/migrations.rs:16`). 
The script is useful for: -- Pre-creating the state store schema +**Important:** Migrations are run automatically when using the `etl-replicator` binary (see `etl-replicator/src/migrations.rs:16`). However, if you integrate the `etl` crate directly into your own application as a library, you should run these migrations manually before starting your pipeline. This design decision ensures: +- The standalone replicator binary works out-of-the-box +- Library users have explicit control over when migrations run +- CI/CD pipelines can pre-apply migrations independently + +**When to run migrations manually:** +- Integrating `etl` as a library in your own application +- Pre-creating the state store schema before deployment - Testing migrations independently -- CI/CD pipelines -- Setting up replicator state on new databases +- CI/CD pipelines that separate migration and deployment steps **Creating a new replicator migration:** diff --git a/mkdocs.yaml b/mkdocs.yaml index 2966dade..047c82e4 100644 --- a/mkdocs.yaml +++ b/mkdocs.yaml @@ -15,7 +15,6 @@ nav: - How-to Guides: - Overview: how-to/index.md - Configure Postgres: how-to/configure-postgres.md - - Apply Postgres State Store Migrations: how-to/postgres-state-store.md - Reference: - Overview: reference/index.md - Explanation: From e83785e851e88faef4b4cb278ef7646cdcd8bba7 Mon Sep 17 00:00:00 2001 From: Riccardo Busetti Date: Wed, 19 Nov 2025 12:01:16 +0100 Subject: [PATCH 08/12] Improve --- DEVELOPMENT.md | 31 ++++++++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md index 346c1639..c3185d1f 100644 --- a/DEVELOPMENT.md +++ b/DEVELOPMENT.md @@ -184,14 +184,37 @@ sqlx migrate add ## Running the Services +Both `etl-api` and `etl-replicator` binaries use hierarchical configuration loading from the `configuration/` directory within each crate. Configuration is loaded in this order: + +1. **Base configuration**: `configuration/base.yaml` (always loaded) +2. 
**Environment-specific**: `configuration/{environment}.yaml` (e.g., `dev.yaml`, `prod.yaml`) +3. **Environment variable overrides**: Prefixed with `APP_` (e.g., `APP_DATABASE__URL`) + +**Environment Selection:** + +The environment is determined by the `APP_ENVIRONMENT` variable: +- **Default**: `prod` (if `APP_ENVIRONMENT` is not set) +- **Available**: `dev`, `staging`, `prod` + +```bash +# Run with dev environment +APP_ENVIRONMENT=dev cargo run + +# Run with production environment (default) +cargo run + +# Override specific config values +APP_ENVIRONMENT=dev APP_DATABASE__URL=postgres://localhost/mydb cargo run +``` + ### ETL API ```bash cd etl-api -cargo run +APP_ENVIRONMENT=dev cargo run ``` -The API requires the `DATABASE_URL` environment variable and a valid configuration file. See `etl-api/README.md` for configuration details. +The API loads configuration from `etl-api/configuration/{environment}.yaml`. See `etl-api/README.md` for available configuration options. ### ETL Replicator @@ -199,9 +222,11 @@ The replicator is typically deployed as a Kubernetes pod, but can be run locally ```bash cd etl-replicator -cargo run +APP_ENVIRONMENT=dev cargo run ``` +The replicator loads configuration from `etl-replicator/configuration/{environment}.yaml`. + ## Kubernetes Setup The project uses Kubernetes for deploying replicators. The setup script configures the necessary resources. 
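The `APP_` override convention documented in the patch above maps a double underscore to one level of configuration nesting (e.g., `APP_DATABASE__URL` overrides `database.url`). A std-only Rust sketch of that naming convention; the real mapping is performed by the `config` crate inside `etl-config`, so this function is illustrative only:

```rust
/// Illustrative, std-only sketch of the `APP_` override convention:
/// strip the `APP_` prefix, split on `__` to get nesting levels, and
/// lowercase each segment to form a dotted configuration key.
fn env_var_to_config_key(var: &str) -> Option<String> {
    // Only variables with the APP_ prefix participate in overrides.
    let rest = var.strip_prefix("APP_")?;
    // A double underscore separates nesting levels; single underscores
    // stay part of the key segment.
    let key = rest
        .split("__")
        .map(str::to_lowercase)
        .collect::<Vec<_>>()
        .join(".");
    Some(key)
}

fn main() {
    // APP_DATABASE__URL overrides the nested `database.url` setting.
    assert_eq!(
        env_var_to_config_key("APP_DATABASE__URL").as_deref(),
        Some("database.url")
    );
    // Variables without the prefix are ignored entirely.
    assert_eq!(env_var_to_config_key("PATH"), None);
    println!("ok");
}
```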
From 547519fe55e86d297e03a8337a38b894c5284156 Mon Sep 17 00:00:00 2001 From: Riccardo Busetti Date: Wed, 19 Nov 2025 12:34:22 +0100 Subject: [PATCH 09/12] Improve --- DEVELOPMENT.md | 75 +++++++++++++++++++++++++++++++------------------- 1 file changed, 46 insertions(+), 29 deletions(-) diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md index c3185d1f..6358b3cf 100644 --- a/DEVELOPMENT.md +++ b/DEVELOPMENT.md @@ -31,7 +31,7 @@ Before starting, ensure you have the following installed: Install SQLx CLI: ```bash -cargo install --version='~0.7' sqlx-cli --no-default-features --features rustls,postgres +cargo install --version='~0.8.6' sqlx-cli --no-default-features --features rustls,postgres ``` ### Optional Tools @@ -92,24 +92,42 @@ POSTGRES_DATA_VOLUME=/path/to/data ./scripts/init.sh If you prefer manual setup or have an existing PostgreSQL instance: -1. **Set the database URL:** +**Important:** The etl-api and etl-replicator migrations can run on **separate databases**. You might have: +- The etl-api using its own dedicated Postgres instance for the control plane +- The etl-replicator state store on the same database you're replicating from (source database) +- Or both on the same database (for simpler local development setups) -```bash -export DATABASE_URL=postgres://USER:PASSWORD@HOST:PORT/DB -``` +#### Single Database Setup -2. **Run etl-api migrations:** +If using one database for both the API and replicator state: ```bash +export DATABASE_URL=postgres://USER:PASSWORD@HOST:PORT/DB + +# Run both migrations on the same database ./etl-api/scripts/run_migrations.sh +./etl-replicator/scripts/run_migrations.sh ``` -3. 
**Run etl-replicator migrations:** +#### Separate Database Setup + +If using separate databases (recommended for production): ```bash +# API migrations on the control plane database +export DATABASE_URL=postgres://USER:PASSWORD@API_HOST:PORT/API_DB +./etl-api/scripts/run_migrations.sh + +# Replicator migrations on the source database +export DATABASE_URL=postgres://USER:PASSWORD@SOURCE_HOST:PORT/SOURCE_DB ./etl-replicator/scripts/run_migrations.sh ``` +This separation allows you to: +- Scale the control plane independently from replication workloads +- Keep the replicator state close to the source data +- Isolate concerns between infrastructure management and data replication + ## Database Migrations The project uses SQLx for database migrations. There are two sets of migrations: @@ -216,45 +234,44 @@ APP_ENVIRONMENT=dev cargo run The API loads configuration from `etl-api/configuration/{environment}.yaml`. See `etl-api/README.md` for available configuration options. -### ETL Replicator - -The replicator is typically deployed as a Kubernetes pod, but can be run locally for testing: +#### Kubernetes Setup (ETL API Only) -```bash -cd etl-replicator -APP_ENVIRONMENT=dev cargo run -``` - -The replicator loads configuration from `etl-replicator/configuration/{environment}.yaml`. - -## Kubernetes Setup - -The project uses Kubernetes for deploying replicators. The setup script configures the necessary resources. +The etl-api manages replicator deployments on Kubernetes by dynamically creating StatefulSets, Secrets, and ConfigMaps. The etl-api requires Kubernetes, but the **etl-replicator binary can run independently without any Kubernetes setup**. 
**Prerequisites:** - OrbStack with Kubernetes enabled (or another local Kubernetes cluster) - `kubectl` configured with the `orbstack` context +- Pre-defined Kubernetes resources (see below) -**Manual Kubernetes setup:** +**Required Pre-Defined Resources:** + +The etl-api expects these resources to exist before it can deploy replicators: + +1. **Namespace**: `etl-data-plane` - Where all replicator pods and related resources are created +2. **ConfigMap**: `trusted-root-certs-config` - Provides trusted root certificates for TLS connections + +These are defined in `scripts/` and should be applied before running the API: ```bash kubectl --context orbstack apply -f scripts/etl-data-plane.yaml kubectl --context orbstack apply -f scripts/trusted-root-certs-config.yaml ``` -**Checking deployed resources:** +**Note:** For the complete list of expected Kubernetes resources and their specifications, refer to the constants and resource creation logic in `etl-api/src/k8s/http.rs`. -```bash -# List replicator pods -kubectl get pods -n etl-control-plane -l app=etl-api +### ETL Replicator -# View logs -kubectl logs -n etl-control-plane -l app=etl-api --tail=100 +The replicator can run as a standalone binary without Kubernetes: -# Describe a specific pod -kubectl describe pod -n etl-control-plane +```bash +cd etl-replicator +APP_ENVIRONMENT=dev cargo run ``` +The replicator loads configuration from `etl-replicator/configuration/{environment}.yaml`. + +**Note:** While the replicator is typically deployed as a Kubernetes pod managed by the etl-api, it does not require Kubernetes to function. You can run it as a standalone process on any machine with the appropriate configuration. 
+ ## Troubleshooting ### Database Connection Issues From e105441be1bbc98467a2dbc3b17ef61199993d49 Mon Sep 17 00:00:00 2001 From: Riccardo Busetti Date: Wed, 19 Nov 2025 12:57:10 +0100 Subject: [PATCH 10/12] Improve --- DEVELOPMENT.md | 44 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 43 insertions(+), 1 deletion(-) diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md index 6358b3cf..e3f8d88f 100644 --- a/DEVELOPMENT.md +++ b/DEVELOPMENT.md @@ -227,6 +227,8 @@ APP_ENVIRONMENT=dev APP_DATABASE__URL=postgres://localhost/mydb cargo run ### ETL API +#### Running from Source + ```bash cd etl-api APP_ENVIRONMENT=dev cargo run @@ -234,6 +236,25 @@ APP_ENVIRONMENT=dev cargo run The API loads configuration from `etl-api/configuration/{environment}.yaml`. See `etl-api/README.md` for available configuration options. +#### Running with Docker + +Docker images are available for the etl-api. You must mount the configuration files and can override settings via environment variables: + +```bash +docker run \ + -v $(pwd)/etl-api/configuration/base.yaml:/app/configuration/base.yaml \ + -v $(pwd)/etl-api/configuration/dev.yaml:/app/configuration/dev.yaml \ + -e APP_ENVIRONMENT=dev \ + -e APP_DATABASE__URL=postgres://host.docker.internal:5432/mydb \ + -p 8080:8080 \ + ramsup/etl-api:latest +``` + +**Configuration requirements:** +- Mount both `base.yaml` and your environment-specific config file (e.g., `dev.yaml`) +- Set `APP_ENVIRONMENT` to match your mounted environment file +- Override specific values using `APP_` prefixed environment variables + #### Kubernetes Setup (ETL API Only) The etl-api manages replicator deployments on Kubernetes by dynamically creating StatefulSets, Secrets, and ConfigMaps. The etl-api requires Kubernetes, but the **etl-replicator binary can run independently without any Kubernetes setup**. 
@@ -261,7 +282,9 @@ kubectl --context orbstack apply -f scripts/trusted-root-certs-config.yaml ### ETL Replicator -The replicator can run as a standalone binary without Kubernetes: +The replicator can run as a standalone binary without Kubernetes. + +#### Running from Source ```bash cd etl-replicator @@ -270,6 +293,25 @@ APP_ENVIRONMENT=dev cargo run The replicator loads configuration from `etl-replicator/configuration/{environment}.yaml`. +#### Running with Docker + +Docker images are available for the etl-replicator. You must mount the configuration files and can override settings via environment variables: + +```bash +docker run \ + -v $(pwd)/etl-replicator/configuration/base.yaml:/app/configuration/base.yaml \ + -v $(pwd)/etl-replicator/configuration/dev.yaml:/app/configuration/dev.yaml \ + -e APP_ENVIRONMENT=dev \ + -e APP_SOURCE__HOST=host.docker.internal \ + -e APP_SOURCE__PASSWORD=mysecret \ + etl-replicator:latest +``` + +**Configuration requirements:** +- Mount both `base.yaml` and your environment-specific config file (e.g., `dev.yaml`) +- Set `APP_ENVIRONMENT` to match your mounted environment file +- Override specific values using `APP_` prefixed environment variables + **Note:** While the replicator is typically deployed as a Kubernetes pod managed by the etl-api, it does not require Kubernetes to function. You can run it as a standalone process on any machine with the appropriate configuration. 
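The `APP_ENVIRONMENT` selection rule used by both binaries (and referenced in the Docker instructions above) can be sketched in std-only Rust. The `prod` default and the `{environment}.yaml` naming follow the docs in this patch series; the function name itself is illustrative, not part of the crate API:

```rust
use std::env;

/// Std-only sketch of environment selection: `APP_ENVIRONMENT` chooses
/// the `{environment}.yaml` file to load, defaulting to `prod` when the
/// variable is unset. The real logic lives in `etl-config`.
fn environment_config_file() -> String {
    let environment = env::var("APP_ENVIRONMENT").unwrap_or_else(|_| "prod".to_string());
    format!("{environment}.yaml")
}

fn main() {
    // With APP_ENVIRONMENT unset, the production default applies.
    env::remove_var("APP_ENVIRONMENT");
    assert_eq!(environment_config_file(), "prod.yaml");

    // APP_ENVIRONMENT=dev selects the dev configuration file.
    env::set_var("APP_ENVIRONMENT", "dev");
    assert_eq!(environment_config_file(), "dev.yaml");
    println!("ok");
}
```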
## Troubleshooting From 80f9caca96a4045a6eb673cf50b39e41ec92f746 Mon Sep 17 00:00:00 2001 From: Riccardo Busetti Date: Wed, 19 Nov 2025 13:05:00 +0100 Subject: [PATCH 11/12] Improve --- DEVELOPMENT.md | 3 --- etl-config/src/load.rs | 2 +- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md index e3f8d88f..53818284 100644 --- a/DEVELOPMENT.md +++ b/DEVELOPMENT.md @@ -245,7 +245,6 @@ docker run \ -v $(pwd)/etl-api/configuration/base.yaml:/app/configuration/base.yaml \ -v $(pwd)/etl-api/configuration/dev.yaml:/app/configuration/dev.yaml \ -e APP_ENVIRONMENT=dev \ - -e APP_DATABASE__URL=postgres://host.docker.internal:5432/mydb \ -p 8080:8080 \ ramsup/etl-api:latest ``` @@ -302,8 +301,6 @@ docker run \ -v $(pwd)/etl-replicator/configuration/base.yaml:/app/configuration/base.yaml \ -v $(pwd)/etl-replicator/configuration/dev.yaml:/app/configuration/dev.yaml \ -e APP_ENVIRONMENT=dev \ - -e APP_SOURCE__HOST=host.docker.internal \ - -e APP_SOURCE__PASSWORD=mysecret \ etl-replicator:latest ``` diff --git a/etl-config/src/load.rs b/etl-config/src/load.rs index bf358a2c..44f48471 100644 --- a/etl-config/src/load.rs +++ b/etl-config/src/load.rs @@ -78,7 +78,7 @@ where // Add in settings from the base configuration file. .add_source(config::File::from( configuration_directory.join(BASE_CONFIG_FILE), - )) + ).format(config::FileFormat::Yaml)) // Add in settings from the environment-specific file. 
.add_source(config::File::from( configuration_directory.join(environment_filename), From 8fed2562d83d8b62933c7930cb8d641be8e2624c Mon Sep 17 00:00:00 2001 From: Riccardo Busetti Date: Wed, 19 Nov 2025 13:07:36 +0100 Subject: [PATCH 12/12] Improve --- etl-config/src/load.rs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/etl-config/src/load.rs b/etl-config/src/load.rs index 44f48471..bf358a2c 100644 --- a/etl-config/src/load.rs +++ b/etl-config/src/load.rs @@ -78,7 +78,7 @@ where // Add in settings from the base configuration file. .add_source(config::File::from( configuration_directory.join(BASE_CONFIG_FILE), - ).format(config::FileFormat::Yaml)) + )) // Add in settings from the environment-specific file. .add_source(config::File::from( configuration_directory.join(environment_filename),
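The source ordering in the `load.rs` hunks above implies a precedence: the base file is added first, then the environment-specific file, then `APP_`-prefixed environment variables, and later sources override earlier ones. A std-only sketch of that merge order, using flat string maps with hypothetical keys (the actual layering is handled by the `config` crate's builder):

```rust
use std::collections::HashMap;

/// Std-only sketch of layered precedence:
/// base.yaml < {environment}.yaml < APP_* variables.
/// Later layers overwrite earlier ones, mirroring the order in which
/// sources are added to the configuration builder.
fn merge_layers(layers: &[HashMap<&str, &str>]) -> HashMap<String, String> {
    let mut merged = HashMap::new();
    for layer in layers {
        // Insertion order follows precedence: the last layer wins.
        for (key, value) in layer {
            merged.insert(key.to_string(), value.to_string());
        }
    }
    merged
}

fn main() {
    // Hypothetical keys, chosen only to demonstrate the override order.
    let base = HashMap::from([("batch.max_size", "1000"), ("batch.max_fill_ms", "5000")]);
    let environment = HashMap::from([("batch.max_size", "100")]);
    let env_vars = HashMap::from([("batch.max_fill_ms", "500")]);

    let merged = merge_layers(&[base, environment, env_vars]);
    // The environment file overrode the base value...
    assert_eq!(merged["batch.max_size"], "100");
    // ...and the environment variable overrode both.
    assert_eq!(merged["batch.max_fill_ms"], "500");
    println!("ok");
}
```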