<br />
<p align="center">
<a href="https://supabase.io">
<picture>
<img alt="Supabase Logo" width="100%" src="docs/assets/etl-logo-extended.png">
<a href="https://supabase.com">
<picture>
<img alt="ETL by Supabase" width="100%" src="docs/assets/etl-logo-extended.png">
</picture>
</a>

</p>
</p>

**ETL** is a Rust framework by [Supabase](https://supabase.com) for building high-performance, real-time data replication applications on PostgreSQL. It sits on top of Postgres [logical replication](https://www.postgresql.org/docs/current/protocol-logical-replication.html) and gives you a clean, Rust-native API for streaming changes to your own destinations.

## Highlights

- 🚀 Real‑time replication: stream changes as they happen
- ⚡ High performance: batching and parallel workers
- 🛡️ Fault tolerant: retries and recovery built in
- 🔧 Extensible: implement custom stores and destinations
- 🧭 Typed, ergonomic Rust API

## Get Started

Install via Git while we prepare for a crates.io release:

```toml
[dependencies]
etl = { git = "https://github.com/supabase/etl" }
```

Quick example using the in‑memory destination:

```rust
use etl::{
    config::{BatchConfig, PgConnectionConfig, PipelineConfig, TlsConfig},
    destination::memory::MemoryDestination,
    pipeline::Pipeline,
    store::both::memory::MemoryStore,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure the PostgreSQL connection.
    let pg = PgConnectionConfig {
        host: "localhost".into(),
        port: 5432,
        name: "mydb".into(),
        username: "postgres".into(),
        password: Some("password".into()),
        tls: TlsConfig { enabled: false, trusted_root_certs: String::new() },
    };

    // In-memory store and destination, handy for local testing.
    let store = MemoryStore::new();
    let destination = MemoryDestination::new();

    // Configure the pipeline.
    let config = PipelineConfig {
        id: 1,
        publication_name: "my_publication".into(),
        pg_connection: pg,
        batch: BatchConfig { max_size: 1000, max_fill_ms: 5000 },
        table_error_retry_delay_ms: 10_000,
        max_table_sync_workers: 4,
    };

    // Create and start the pipeline.
    let mut pipeline = Pipeline::new(config, store, destination);
    pipeline.start().await?;
    // pipeline.wait().await?; // Optional: block until the pipeline finishes.

    Ok(())
}
```
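
The example assumes that logical replication is enabled on the source database and that a publication named `my_publication` already exists. A minimal setup might look like this (the `orders` table is illustrative; replace it with the tables you want to replicate):

```bash
# Logical replication requires wal_level = logical (changing it needs a Postgres restart).
psql -d mydb -c "ALTER SYSTEM SET wal_level = 'logical';"

# Create a publication covering the tables you want to replicate.
psql -d mydb -c "CREATE PUBLICATION my_publication FOR TABLE orders;"
```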

For tutorials and deeper guidance, see the [Documentation](https://supabase.github.io/etl) or jump into the [examples](etl-examples/README.md).

## Destinations

ETL is designed to be extensible: you can implement your own destinations to send data anywhere you like (a rough sketch follows below). It also ships with a few built-in destinations:

- BigQuery

Support for Apache Iceberg and DuckDB is planned.

Out-of-the-box destinations are available in the `etl-destinations` crate. Enable the features you need:

```toml
[dependencies]
etl = { git = "https://github.com/supabase/etl" }
etl-destinations = { git = "https://github.com/supabase/etl", features = ["bigquery"] }
```
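
If none of the built-in destinations fit, you can write your own. The sketch below is only illustrative: the trait name, method signature, and types are assumptions made for the sake of a self-contained example, not the actual API of the `etl` crate, which may look quite different. It shows the general shape of a destination as a batch-write sink.

```rust
use std::collections::VecDeque;

// Hypothetical stand-in for a replicated change; real events carry typed column data.
#[derive(Debug, Clone)]
struct RowChange {
    table: String,
    payload: String,
}

// Hypothetical stand-in for the destination abstraction in the `etl` crate.
trait EventDestination {
    fn write_batch(&mut self, batch: Vec<RowChange>) -> Result<(), String>;
}

// Toy destination that logs changes and buffers them in memory, similar in
// spirit to `MemoryDestination`, but purely illustrative.
struct LoggingDestination {
    received: VecDeque<RowChange>,
}

impl EventDestination for LoggingDestination {
    fn write_batch(&mut self, batch: Vec<RowChange>) -> Result<(), String> {
        for change in batch {
            println!("replicated {}: {}", change.table, change.payload);
            self.received.push_back(change);
        }
        Ok(())
    }
}

fn main() {
    let mut destination = LoggingDestination { received: VecDeque::new() };
    destination
        .write_batch(vec![RowChange {
            table: "orders".into(),
            payload: "INSERT id=1".into(),
        }])
        .expect("write_batch failed");
}
```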

## Database Setup

Before running the examples, tests, or the API and replicator components, you'll need to set up a PostgreSQL database.
We provide a convenient script to help with this setup; for detailed instructions, see the [Database Setup Guide](docs/guides/database-setup.md).

## Running Tests

To run the test suite:

```bash
cargo test --all-features
```

## Docker

The repository includes Docker support for both the `replicator` and `api` components:

```bash
# Build replicator image
docker build -f ./etl-replicator/Dockerfile .

# Build api image
docker build -f ./etl-api/Dockerfile .
```

## Architecture

For a detailed explanation of the ETL architecture and design decisions, please refer to our [Design Document](docs/design/etl-crate-design.md).

## Troubleshooting

### Too Many Open Files Error

If you see the following error when running tests on macOS:

```
called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }
```

Raise the limit of open files per process with:

```bash
ulimit -n 10000
```

### Performance Considerations

Currently, the system parallelizes the copying of different tables, but each individual table is still copied in sequential batches.
This limits performance for large tables. We plan to address this once the ETL system reaches greater stability.

## License

Distributed under the Apache-2.0 License. See `LICENSE` for more information.

---
