SQLite alternatives for multi-agent helix at scale: drop-in survey #33

mbachaud · 2026-05-06T21:25:21Z

mbachaud
May 6, 2026
Maintainer

Context: PR #32 fixes the immediate WAL-bloat issue (reader connection was pinning a snapshot, blocking checkpoints). This Discussion is the longer-horizon question: as agent concurrency grows past what a single embedded SQLite reader/writer pair handles cleanly, what's the right next step?
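For anyone who hasn't hit this failure mode, here is a minimal sketch of how a long-lived reader holds back WAL checkpoints. It uses a throwaway database and a made-up table `t`; it is not the code from PR #32, just an illustration of the mechanism using the standard `sqlite3` module.

```python
import os
import sqlite3
import tempfile

# Throwaway database; the path and table name are illustrative only.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(db_path, isolation_level=None)  # autocommit mode
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE t (x INTEGER)")
writer.execute("INSERT INTO t VALUES (1)")

# A reader that opens a transaction and never ends it pins its snapshot.
reader = sqlite3.connect(db_path, isolation_level=None)
reader.execute("BEGIN")
reader.execute("SELECT count(*) FROM t").fetchall()

# New commits land in the WAL after the reader's snapshot.
writer.execute("INSERT INTO t VALUES (2)")

# A passive checkpoint can only copy frames up to the oldest reader's snapshot.
_, wal_frames, checkpointed = writer.execute(
    "PRAGMA wal_checkpoint(PASSIVE)"
).fetchone()
pinned = checkpointed < wal_frames

# Once the reader ends its transaction, the WAL can be fully
# checkpointed and truncated back to zero bytes.
reader.execute("COMMIT")
busy, _, _ = writer.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
wal_size_after = os.path.getsize(db_path + "-wal")

print(pinned, busy, wal_size_after)
```

While the reader's transaction stays open, the WAL file can only grow, which is the bloat pattern described above.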

What this Discussion is for: soliciting engineer input on drop-in (or near-drop-in) replacements for the storage layer. Not a migration commitment. The bar to actually change is high.

Workload profile

Helix's storage layer has an unusual mix of access patterns that any replacement needs to support:

- Mixed: relational + FTS + vector + graph-ish
  - Relational: 17+ tables, foreign keys, joins
  - FTS: SQLite FTS5 (BM25, MATCH queries)
  - Vector: 20-D ΣĒMA cosine, materialized into a numpy matrix at query time
  - Graph-ish: harmonic_links, parent-child gene attribution, co-activation
- Read-heavy with bursty writes
  - Most agent calls are reads (retrieve context, project sessions)
  - Writes are clustered: ingest, replication step, session heartbeats
- Embedded, single-process today with replication to multi-drive replicas
- Sub-second latency target on the read path, multi-second tolerable on writes
- Database size: low millions of rows for genomes, growing
- Concurrency: 4-12 simultaneous agents typical, want headroom to 50+
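To make the FTS part of this profile concrete, here is a small sketch of the FTS5 `MATCH` + `bm25()` query shape that any replacement would need an equivalent for. The table name `gene_fts` and its single `content` column are hypothetical, not Helix's real schema, and the sketch assumes the local SQLite build includes FTS5.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Hypothetical table and column names; Helix's real FTS schema is not shown here.
con.execute("CREATE VIRTUAL TABLE gene_fts USING fts5(content)")
con.executemany(
    "INSERT INTO gene_fts (content) VALUES (?)",
    [
        ("wal checkpoint blocked by a long-lived reader",),
        ("vector search over session embeddings",),
        ("checkpoint tuning notes for bursty writers",),
    ],
)

# bm25() returns lower (more negative) values for better matches,
# so an ascending sort puts the best match first.
rows = con.execute(
    "SELECT rowid, bm25(gene_fts) AS score FROM gene_fts "
    "WHERE gene_fts MATCH ? ORDER BY score",
    ("checkpoint",),
).fetchall()
matched_ids = [rowid for rowid, _ in rows]

print(rows)
```

A candidate's text search would need to reproduce both halves of this: the boolean match and a relevance score that the scoring path can consume.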

What "drop-in" means for this exercise

Hard requirements (drop-in candidates must support):

- SQL or SQL-shaped query language
- Embedded mode (no separate server process required), or a deployment story so painless it doesn't matter
- FTS or text-search equivalent we can migrate to from FTS5
- Either native vector ops or a shape that lets us keep the numpy matrix path
- Concurrent reader/writer model better than SQLite's "one writer at a time"
- Python integration that's not painful
- Survives `pip install`-style developer onboarding (Helix's positioning is "embedded, runs on your laptop")
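The "one writer at a time" limit is easy to see in isolation. A minimal sketch, assuming two agents share one WAL-mode database file; the `heartbeat` table and the agent names are made up for illustration:

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# timeout=0 makes lock conflicts fail immediately instead of waiting.
agent_a = sqlite3.connect(db_path, timeout=0, isolation_level=None)
agent_b = sqlite3.connect(db_path, timeout=0, isolation_level=None)
agent_a.execute("PRAGMA journal_mode=WAL")
agent_a.execute("CREATE TABLE heartbeat (agent TEXT, ts REAL)")

# Agent A takes the write lock and holds it.
agent_a.execute("BEGIN IMMEDIATE")
agent_a.execute("INSERT INTO heartbeat VALUES ('a', 1.0)")

# Agent B can still read: WAL readers do not block on the writer,
# they just see the last committed state.
rows_seen_by_b = agent_b.execute("SELECT count(*) FROM heartbeat").fetchall()[0][0]

# But agent B cannot start a second write transaction.
try:
    agent_b.execute("BEGIN IMMEDIATE")
    second_writer_blocked = False
except sqlite3.OperationalError:  # "database is locked"
    second_writer_blocked = True

# Once A commits, B's write goes through.
agent_a.execute("COMMIT")
agent_b.execute("INSERT INTO heartbeat VALUES ('b', 2.0)")
total = agent_a.execute("SELECT count(*) FROM heartbeat").fetchall()[0][0]

print(rows_seen_by_b, second_writer_blocked, total)
```

With a nonzero busy timeout the second writer waits instead of failing, so under bursty writes the cost shows up as queueing latency rather than errors.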

Candidates

Strong contenders

| Engine | Drop-in score | Concurrency | FTS | Vector | Verdict |
| --- | --- | --- | --- | --- | --- |
| DuckDB | 4/5 | OLAP-style, MVCC, multi-reader good, single-writer | FTS extension (BM25, basic) | `vss` extension (HNSW, cosine, L2) | Most interesting incremental path. Can `ATTACH 'genome.db' (TYPE SQLITE)` and run analytical queries against the existing file without migrating. Migrate the 12-signal scoring path first, validate, expand. |
| libSQL / Turso | 5/5 | SQLite-compatible w/ better concurrency story; embedded replicas; HTTP API | Same as SQLite (FTS5) | Native (`libsql_vector`) | Lowest-friction migration. File format is SQLite-compatible. Trades: tied to Turso's ecosystem, async story is stronger than sync. |
| Postgres + pgvector + tsvector | 1/5 (not drop-in) | Real concurrency, real auth, MVCC | tsvector + GIN (industrial-grade) | pgvector (best-in-class for FAISS-class workloads) | The boring/correct answer if multi-tenant becomes real. Server process, ops cost, deployment story changes. Worth it iff agent concurrency goes past ~50. |
| LanceDB | 2/5 | Embedded, columnar (Lance format), MVCC-ish | Tantivy-backed (Rust) | Native, designed for it | Best fit for vector-dominant workloads. Schema is different (Arrow-based, not relational). Strong "agent memory" positioning in 2026. Migration is real work. |

Worth knowing about, probably not the right fit

| Engine | Why not |
| --- | --- |
| rqlite | Distributed SQLite via Raft. Fixes multi-machine, not multi-agent on one machine. Wrong axis. |
| Tantivy alone | Excellent FTS but you'd still need a relational store. Useful as a complement to one of the above, not a primary. |
| DuckLake | Newer DuckDB extension for ACID over object storage. Interesting for distributed Helix but ahead of where we are. |
| PowerSync / ElectricSQL | Sync-focused; solves a multi-device problem we don't have yet. |
| Redis + RediSearch + RedisVL | Volatile by default; making it durable changes the ops story enough that you might as well use Postgres. Good cache layer though. |

Where I'd start (low-risk path)

1. Don't replace SQLite. Keep it as the source of truth.
2. Add DuckDB as a query-side companion. DuckDB can ATTACH SQLite files directly. The 12-signal scoring path (query_genes and friends) is exactly the columnar/aggregate workload DuckDB beats SQLite at. Migrate those queries to run via DuckDB without moving any data.
3. Measure. If DuckDB-backed scoring is materially faster under multi-agent load, expand its surface. If not, the embedded model itself isn't the bottleneck, so look elsewhere (model swap, embedding gen, network).
4. Only commit to a full backend swap if the diagnostic data (per-stage timing from the PR #32 follow-ups) shows storage as a meaningful contributor to the 20-40s delays. Right now I don't believe it is.
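For the "measure" step, a harness along these lines would be enough to compare two backends under simulated multi-agent load. This is a sketch only: `measure` and `fake_query` are hypothetical names, and the stand-in workload would be replaced by the SQLite-backed and DuckDB-backed versions of the same scoring query.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def measure(query_fn, agents=8, calls_per_agent=25):
    """Call query_fn from `agents` threads; return latency stats in seconds."""

    def one_agent(_agent_id):
        samples = []
        for _ in range(calls_per_agent):
            start = time.perf_counter()
            query_fn()
            samples.append(time.perf_counter() - start)
        return samples

    with ThreadPoolExecutor(max_workers=agents) as pool:
        batches = list(pool.map(one_agent, range(agents)))

    latencies = sorted(s for batch in batches for s in batch)
    # Nearest-rank percentile: smallest sample with at least 95% at or below it.
    p95_index = math.ceil(0.95 * len(latencies)) - 1
    return {
        "calls": len(latencies),
        "p50": statistics.median(latencies),
        "p95": latencies[p95_index],
    }


# Stand-in workload (an assumption for illustration, not a real Helix query).
def fake_query():
    time.sleep(0.001)


stats = measure(fake_query, agents=4, calls_per_agent=10)
print(stats)
```

One caveat if this is pointed at real queries: a `sqlite3` connection can't be shared across threads by default, so `query_fn` should open or look up a per-thread connection rather than closing over a single one.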

If a swap becomes necessary, libSQL is the lowest-friction destination because the file format and SQL are SQLite-compatible. Postgres is the ceiling option if Helix grows into a multi-tenant deployment context.

Specific things I want input on

- Has anyone here run DuckDB's vss extension at scale? The 20-D cosine path is small enough that the numpy matrix wins on throughput; would a DuckDB-resident HNSW index actually be better, or just complexity for complexity's sake?
- For folks who've gone SQLite → libSQL: what surprised you? Especially around the WAL-equivalent behavior, write amplification, and Python sync vs. async ergonomics.
- Does anyone have data on FTS5 vs Tantivy at the corpus sizes Helix runs? The 12-signal scoring leans heavily on FTS5; if FTS is a meaningful slice of query time, Tantivy might be a more impactful swap than the relational store.
- Has anyone tried running Postgres + pgvector embedded in the same process (e.g., via a colocated Postgres-in-process binding) and lived to tell?
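For reference on the first question, this is roughly the brute-force baseline an HNSW index would have to beat. It is a sketch with synthetic data standing in for the real 20-D vectors; `top_k_cosine` is a hypothetical helper, not Helix's actual function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the 20-D vectors; real data would come from storage.
vectors = rng.normal(size=(10_000, 20)).astype(np.float32)

# Normalize rows once so cosine similarity reduces to a dot product.
unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)


def top_k_cosine(query, k=10):
    """Return (indices, scores) of the k rows most similar to `query`."""
    q = query / np.linalg.norm(query)
    scores = unit @ q                       # one exact matrix-vector product
    idx = np.argpartition(-scores, k)[:k]   # unordered top-k, linear time
    order = np.argsort(-scores[idx])        # sort only the k winners
    return idx[order], scores[idx][order]


indices, scores = top_k_cosine(vectors[42], k=5)
print(indices, scores)
```

This path is exact and costs one pass over an n × 20 matrix per query, so an approximate index only pays off if that pass, or rebuilding the matrix at query time, shows up in the per-stage timings.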

Constraints / non-goals

- We're not trying to scale to 1000s of concurrent agents on one machine. The target is "many laptops × 4-12 agents each" with a clean replication/sync story between them.
- We're not introducing a server-process dependency lightly. Helix's value prop includes "runs locally, owns its data."
- We're not abandoning SQL. Engineers know SQL; the replacement query language matters.

Related

- PR #32 (fix(genome): release reader WAL snapshot + cap journal_size_limit): immediate WAL bloat fix (lands first, independent of this Discussion)
- Issues #28 (Test suite: dependency gaps cause 13+ failures on minimal install) and #29 (Test suite: 13-15 failures from test/implementation drift + float precision): test-suite cleanup (recently closed)
- BENCHMARKS.md notes "context p95 11s" with a flagged-for-investigation comment. That slowness is separate, predates multi-agent use, and is likely the dominant contributor to the operator's observed 20-40s.

🤖 Generated with Claude Code
