Everything* is a SQL table. Each source — REST API, log file, CSV, JSONL, syslog — gets its own typed table in your schema. Cross-source JOINs work natively.
*If yours isn't yet, file a bug — that's the contract.
-- Three sources, three tables, one query:
SELECT s.zone, count(*) AS hot_acked
FROM dbfy_rest('https://api.example.com/readings', config := '...') r
JOIN dbfy_rest('https://api.example.com/sensors', config := '...') s USING (sensor_id)
JOIN dbfy_rows_file('/var/log/fleet.jsonl', config := '...') e
ON e.sensor_id = r.sensor_id AND e.kind = 'alarm_acked'
WHERE r.temperature > 22
GROUP BY s.zone;→ Quickstart — five minutes, zero to your first SQL query. → Showcases — four end-to-end demos that double as integration tests: Service health · Auth audit · Finance recon · SaaS metrics. → Hello-world per language — same canonical query in 7 idiomatic shapes (Rust · Python · C# · Java · Kotlin · Node · Swift).
Pick your language:
# Python
pip install dbfy
# Node.js / TypeScript
npm install @typeeffect/dbfy
# .NET
dotnet add package Dbfy// Maven / Gradle (Kotlin DSL)
dependencies {
implementation("io.github.typeeffect:dbfy-kotlin:0.4.1") // Kotlin coroutine API
// -- or, for Java consumers --
implementation("io.github.typeeffect:dbfy-jvm:0.4.1")
runtimeOnly ("io.github.typeeffect:dbfy-jvm:0.4.1:natives-linux-x86_64")
}// Swift Package Manager
dependencies: [
.package(url: "https://github.com/typeeffect/dbfy", from: "0.4.1"),
],CLI binary (Linux x86_64, macOS arm64) — download from Releases and add to PATH.
DuckDB extension — same Releases page (dbfy-<target>.duckdb_extension):
LOAD 'path/to/dbfy.duckdb_extension';From source (Rust 1.85+):
git clone https://github.com/typeeffect/dbfy && cd dbfy
cargo build --release -p dbfy-cli| Mode | Install | Use |
|---|---|---|
| CLI binary | Releases (Linux x86_64, macOS arm64) | dbfy query --config x.yaml "SELECT …" |
| DuckDB extension | Releases → LOAD 'dbfy.duckdb_extension' |
SELECT * FROM dbfy_rest('…') / dbfy_rows_file('…') |
| Python | pip install dbfy |
import dbfy; engine = dbfy.Engine.from_yaml(…) (sync + asyncio, PyArrow zero-copy) |
| Rust library | path / git dep on dbfy-frontend-datafusion |
use dbfy_frontend_datafusion::Engine; |
| C | libdbfy.{a,so} + dbfy.h (cbindgen-generated) |
engine lifecycle + Arrow C Data Interface |
| Java | Maven io.github.typeeffect:dbfy-jvm:0.4.1 + classifier natives-<rid> |
sync byte[] queryArrowIpc(sql) + async CompletableFuture<byte[]> queryAsyncArrowIpc(sql) (FFI callback → complete/completeExceptionally, JNI AttachCurrentThread from tokio worker); Arrow IPC bytes → ArrowStreamReader |
| Kotlin | Maven io.github.typeeffect:dbfy-kotlin:0.4.1 (transitively pulls dbfy-jvm) |
idiomatic suspend fun query(sql) + Flow<ByteArray> streaming, exceptions thrown directly (no ExecutionException unwrap) |
| C# / .NET | dotnet add package Dbfy |
using Dbfy; var engine = Engine.FromYaml(…) — sync Query() + async Task<Result> QueryAsync(sql, ct) (FFI callback → TaskCompletionSource, no thread blocking, continuations off the tokio worker); Apache.Arrow RecordBatch zero-copy via the C Data Interface; net8.0+ |
| Node.js | npm @typeeffect/dbfy@0.4.1 (prebuilt binaries for linux/macOS/windows × x64/arm64) |
import { Engine } from '@typeeffect/dbfy'; async Promise<Buffer> from engine.query(sql) (napi-rs ThreadsafeFunction → V8 main thread) + sync querySync() |
| Swift / iOS / macOS | SwiftPM https://github.com/typeeffect/dbfy from 0.4.1 (binaryTarget xcframework) |
import Dbfy; let engine = try Engine.fromYaml(yaml) — async/await try await engine.query(sql) via withCheckedThrowingContinuation; macOS 12+, iOS 15+, Mac Catalyst 15+ |
| Kind | Status | Coverage |
|---|---|---|
| REST / HTTP | ✅ stable | page / offset / cursor / link-header pagination · 4 auth modes · retry + Retry-After · in-memory TTL + singleflight HTTP cache · filter pushdown into URL params |
| Line-delimited files | ✅ stable | JSONL · CSV · logfmt · syslog (RFC 5424) · regex (CLF / ELF) — with L3 indexing (zone maps + bloom + incremental EXTEND on append) and glob multi-file expansion |
| Parquet | ✅ stable | local files + glob, schema auto-discovery, predicate + projection + row-group pushdown native (DataFusion ListingTable) |
Excel (.xlsx / .xls) |
✅ stable | sheet selection, predicate + projection + limit pushdown applied at row-iteration time (skip-on-read, no full materialisation) |
| GraphQL | ✅ stable | POST + bearer auth, JSONPath root extraction, variables pushdown via pushdown.variables mapping (WHERE col = lit → GraphQL variable) |
| PostgreSQL | ✅ stable | read-only SELECT over the wire — filter + projection + limit pushdown translated into native WHERE / SELECT / LIMIT so the Postgres planner can use indexes |
| LDAP / LDAPS | ✅ stable | anonymous + simple bind, base / one / sub scope, filter pushdown that AND-merges SQL WHERE predicates into native LDAP filter syntax ((&(uid=mario)(objectClass=person))) with RFC 4515 value escaping; multi-valued attributes joined; synthetic __dn__ column exposes entry DN |
| In-memory programmatic | ✅ stable | static Arrow RecordBatch provider · Python-defined custom providers via Arrow C Data Interface |
| Coming next | 🟡 wishlist — vote with an issue | MySQL · gRPC · Parquet remote (S3/GCS) · Kafka · MongoDB · Loki / Prometheus · HTML / DOM scraping |
Three-layer stack. Engine logic lives in Rust only — every binding delegates to it; nothing is reimplemented per language.
┌─────────────────────────────────────────────────────────────────────┐
│ HOST LANGUAGE BINDINGS — glue + marshalling, no engine logic │
│ C# · Java · Kotlin · Swift ~1 100 LoC total │
└─────────────────────────────┬───────────────────────────────────────┘
│ FFI · JNI · napi-rs · PyO3
┌─────────────────────────────▼───────────────────────────────────────┐
│ FFI WRAPPER CRATES — expose `Engine` to each host language │
│ dbfy-c · dbfy-jni · dbfy-node · dbfy-py ~1 100 LoC total │
└─────────────────────────────┬───────────────────────────────────────┘
│ use dbfy_frontend_datafusion::Engine
┌─────────────────────────────▼───────────────────────────────────────┐
│ CORE ENGINE — single source of truth │
│ dbfy-frontend-datafusion + config + provider-* ~8 700 LoC │
│ │
│ Sources, pushdown, streaming, cancellation, schema discovery, │
│ filter translation, async — all here. │
└─────────────────────────────────────────────────────────────────────┘
Why it matters. Every engine improvement (new source, new pushdown operator, optimisation, bug fix) lands once in the Rust core and propagates automatically to all 7 languages. The only per-binding cost is when a new public API surface is added (e.g. async query in v0.3 → v0.4) — one Rust-side design plus N small marshalling implementations.
What's intentionally repeated across bindings:
- The
Engine.fromYaml / .query / .explainshape — each language has different idioms (Java getters, C# properties, Swift actors, Kotlin suspend functions). This is API surface, not logic. - The async FFI-callback bridge — each runtime has its own
primitive for completion (
TaskCompletionSource/CompletableFuture/withCheckedThrowingContinuation/ napi-rsThreadsafeFunction). The pattern is the same; the marshalling has to be per-language. - The Arrow exchange path — two lanes by intent: zero-copy via Arrow C Data Interface (C, C#, Swift, Python) or IPC bytes (Java, Kotlin, Node), depending on what each ecosystem speaks best.
Engine::query()
alone has more code than all the host bindings combined.
crates/
dbfy-config/ # YAML schema + validation
dbfy-provider/ # ProgrammaticTableProvider trait
dbfy-provider-rest/ # HTTP/JSON source + cache
dbfy-provider-rows-file/ # indexed file source
dbfy-provider-static/ # in-memory RecordBatch provider
dbfy-frontend-datafusion/ # standalone SQL engine
dbfy-frontend-duckdb/ # `.duckdb_extension`
dbfy-cli/ dbfy-py/ dbfy-c/ dbfy-jni/ dbfy-node/ # language native crates
bindings/
csharp/ # .NET (NuGet `Dbfy`)
jvm/dbfy-jvm/ # Java (Maven `io.github.typeeffect:dbfy-jvm`)
jvm/dbfy-kotlin/ # Kotlin (Maven `io.github.typeeffect:dbfy-kotlin`)
swift/ # SwiftPM `https://github.com/typeeffect/dbfy`
docs/
quickstart.md
pushdown-matrix.md # which operator pushes down per source
implementation-plan.md
examples/
showcase/ # service-health demo (REST × JSONL × syslog)
auth-audit/ # LDAP × Postgres × syslog
finance-recon/ # Excel × Parquet × REST
saas-metrics/ # GraphQL × Postgres × Excel
lang/ # one canonical hello-world per binding
- Quickstart — install, build, first query (CLI + DuckDB).
- Pushdown matrix — what each source pushes down (filter / projection / limit / aggregates) and how it appears on the wire.
- CHANGELOG — release notes 0.1 → today.
- Showcases — four runnable demos that double as integration tests:
- Service health — GitHub API × 50k JSONL × 5×syslog
- Auth audit — LDAP × Postgres × syslog (login storm + orphan account detection)
- Finance recon — Excel × Parquet × REST (deals reconciliation)
- SaaS metrics — GraphQL × Postgres × Excel (ARPU vs usage cohort)
- Hello-world per language — same canonical query in 7 idiomatic shapes.
- Implementation plan — milestone-by-milestone history.
- DuckDB extension reference — table-function syntax, demo, capabilities matrix.
- Python bindings —
maturin developwheel + API. - Maven publishing setup — Sonatype Central + GPG signing for the JVM artifacts.
- restsql_spec.md — original product spec.
Apache-2.0.