Firn is a high-performance, multi-tenant vector and full-text search engine backed by object storage (S3 / MinIO / R2 / GCS). It is designed as a credible open-source alternative to turbopuffer, proving that a professional-grade tiered storage architecture (RAM → NVMe → S3) is achievable entirely from open-source components.
The cost efficiency of S3 with the speed of local RAM: built with LanceDB and Foyer for microsecond-scale search latency on top of object storage.
On real-world cloud infrastructure (AWS S3), a raw linear scan of 100,000 vectors can take 25 seconds per query. By pairing LanceDB with a tiered foyer (RAM + NVMe) cache, Firn collapses that bottleneck:
- Cold Query (S3 Linear Scan): ~25.1s
- Cold Query (ANN Indexed): ~979ms (25x faster)
- Warm Query (Internal Engine): ~72µs (350,000x faster)
- End-to-End HTTP (Warm): < 5ms (including network RTT and JSON overhead)
Every cache hit results in zero S3 requests, directly reducing your cloud bill while providing "instant" search response times.
Cold query, warm query, full-text search, and cache proof — all in 60 seconds. This demo runs against local MinIO; on real AWS S3 the cold query takes 25 seconds instead of 109ms, making the cache speedup even more dramatic.
Firn is built on a "Tiered Storage" philosophy:
- L1: RAM Cache (via foyer) — Sub-microsecond access for the most frequent queries.
- L2: NVMe Cache (via foyer) — Fast, durable cache for high-volume search results.
- L3: Object Storage (via LanceDB + S3/MinIO) — The "Source of Truth" where every namespace is isolated under its own S3 prefix.
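The read path through these tiers can be sketched as a read-through lookup that falls from L1 to L2 to L3 and promotes values back up on the way out. This is an illustrative stand-in only: the types below are hypothetical and use plain `HashMap`s in place of foyer and S3.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for the three tiers (not Firn's actual types).
struct TieredStore {
    ram: HashMap<String, String>,  // L1: fastest, smallest
    nvme: HashMap<String, String>, // L2: larger, still local
    s3: HashMap<String, String>,   // L3: source of truth
}

impl TieredStore {
    fn new() -> Self {
        Self { ram: HashMap::new(), nvme: HashMap::new(), s3: HashMap::new() }
    }

    /// Read-through lookup: check L1, then L2, then fall back to L3,
    /// promoting the value into the faster tiers on the way back.
    fn get(&mut self, key: &str) -> Option<String> {
        if let Some(v) = self.ram.get(key) {
            return Some(v.clone()); // L1 hit: zero S3 requests
        }
        if let Some(v) = self.nvme.get(key).cloned() {
            self.ram.insert(key.to_string(), v.clone()); // promote to L1
            return Some(v);
        }
        // L3 lookup; a miss here means the key does not exist anywhere.
        let v = self.s3.get(key).cloned()?;
        self.nvme.insert(key.to_string(), v.clone()); // populate L2
        self.ram.insert(key.to_string(), v.clone());  // populate L1
        Some(v)
    }
}
```

Only the first read for a key pays the L3 (S3) cost; every subsequent read is served from RAM, which is where the warm-query numbers above come from.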
- axum: High-performance async REST API.
- LanceDB: Vector and BM25 search engine that runs natively on object storage.
- foyer: Advanced hybrid cache (RAM + NVMe) with LFU/LRU eviction.
- Prometheus: Full operational visibility into cache hits, misses, and S3 request savings.
- Multi-tenant by Design: Each namespace maps to an isolated S3 prefix (s3://bucket/namespace/) with near-zero idle cost.
- Instant Invalidation: A "Generation Counter" strategy ensures that after a write, all stale search results for that namespace are invalidated in $O(1)$ time.
- CAS Consistency: Verified concurrency safety using S3's If-None-Match: * to prevent data loss when multiple writers fight for the same bucket.
- Zero-Copy Ready: Optimized serialization via bincode (with architectural triggers to move to rkyv if needed).
- Operational Excellence: Native Prometheus metrics tracking cache hit rates and S3 request count (the primary signal for cost savings).
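The generation-counter idea can be sketched in a few lines. The types below are hypothetical, not Firn's implementation: cache keys embed the namespace's current generation, so a single counter increment after a write makes every older entry for that namespace unreachable, with no cache scan.

```rust
use std::collections::HashMap;

/// Illustrative sketch of O(1) invalidation via a per-namespace generation.
struct GenCache {
    generations: HashMap<String, u64>, // namespace -> current generation
    entries: HashMap<String, String>,  // "ns:gen:query" -> cached result
}

impl GenCache {
    fn new() -> Self {
        Self { generations: HashMap::new(), entries: HashMap::new() }
    }

    /// Cache keys embed the namespace's current generation.
    fn key(&self, ns: &str, query: &str) -> String {
        let g = self.generations.get(ns).copied().unwrap_or(0);
        format!("{ns}:{g}:{query}")
    }

    fn put(&mut self, ns: &str, query: &str, result: String) {
        let k = self.key(ns, query);
        self.entries.insert(k, result);
    }

    fn get(&self, ns: &str, query: &str) -> Option<&String> {
        self.entries.get(&self.key(ns, query))
    }

    /// Called after every write: one counter bump invalidates all cached
    /// results for the namespace; stale entries are evicted lazily later.
    fn invalidate(&mut self, ns: &str) {
        *self.generations.entry(ns.to_string()).or_insert(0) += 1;
    }
}
```

Invalidation never touches the cached entries themselves, which is what keeps it constant-time regardless of how many results a namespace has cached.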
Everything you need (MinIO storage + Firn API) is orchestrated via Docker Compose:
```shell
git clone https://github.com/gordonmurray/firnflow
cd firnflow
docker compose up --build
```

The API is live at http://localhost:3000. Save a vector to the demo namespace:
```shell
curl -X POST http://localhost:3000/ns/demo/upsert \
  -H 'Content-Type: application/json' \
  -d '{
    "rows": [
      {"id": 1, "vector": [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}
    ]
  }'
```

Query the same namespace for the nearest neighbor:
```shell
curl -X POST http://localhost:3000/ns/demo/query \
  -H 'Content-Type: application/json' \
  -d '{"vector": [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], "k": 1}'
```

See how much S3 traffic you've avoided:
```shell
curl http://localhost:3000/metrics | grep s3_requests
```

| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Liveness check |
| /metrics | GET | Prometheus exposition format |
| /ns/{ns} | DELETE | Removes all data (S3 + cache) for a namespace |
| /ns/{ns}/upsert | POST | Insert/update vectors and data |
| /ns/{ns}/query | POST | Vector, FTS, or hybrid search |
| /ns/{ns}/warmup | POST | Non-blocking cache pre-warm hint |
| /ns/{ns}/index | POST | Build IVF_PQ vector index (async, returns 202) |
| /ns/{ns}/fts-index | POST | Build BM25 full-text search index (async, returns 202) |
| /ns/{ns}/compact | POST | Compact and prune data files (async, returns 202) |
Firn uses a containerized toolchain. No local Rust installation is required.
```shell
# Run the full test suite (requires MinIO)
./scripts/cargo test --workspace -- --ignored

# Run the cold-vs-warm latency benchmark
./scripts/cargo run --release -p firnflow-bench
```

Benchmark results are committed at bench/results/cold_vs_warm.md.
