Benchmarks

Two benchmark suites. One measures pure algorithm performance; the other compares SpacetimeDB against Neo4j and Apache AGE on real-world graph datasets at scale.

Cross-system benchmarks

A dockerized comparison of SpacetimeDB (Wasm module, zero engine changes), Neo4j Community Edition, and Apache AGE (PostgreSQL extension) across three SNAP datasets. All systems run as Docker containers on the same host.

Hardware: AMD Ryzen 7 3800X 8-Core @ 3.9 GHz, 32 GB RAM, Windows 10.

Systems under test

System	Interface	Notes
SpacetimeDB (baseline)	Reducers over HTTP	Bulk-insert + traversal result tables; 3 HTTP round-trips per query
SpacetimeDB (optimized)	Procedures over HTTP	Single round-trip; short tx for edge snapshot, traversal outside lock
Neo4j (baseline)	Bolt + Cypher	5K batch sizes, index after insert, no warmup
Neo4j (optimized)	Bolt + APOC	10K batch sizes, index before insert, warmup, APOC subgraphNodes
Apache AGE	SQL + recursive CTE	Direct INSERT into internal tables, recursive CTE over edge/vertex tables

ego-Facebook (~4K vertices, ~88K edges)

Metric	STDB	STDB (opt)	Neo4j	Neo4j (opt)	AGE
Load	376 ms	376 ms	4.78 s	1.74 s	1.95 s
Neighbor count	25 ms	1.2 ms	229 ms	8 ms	12 ms
BFS full	31 ms	20 ms	228 ms	54 ms	77 ms
Shortest path	21 ms	16 ms	178 ms	15 ms	127 ms

Neighbors sampled: 347 · BFS visited: 3,829 · SP hops: 5

com-DBLP (~317K vertices, ~1.05M edges)

Metric	STDB	STDB (opt)	Neo4j	Neo4j (opt)	AGE
Load	4.6 s	4.8 s	34.0 s	23.0 s	30.1 s
Neighbor count	274 ms	1.9 ms	201 ms	112 ms	147 ms
BFS full	1.12 s	339 ms	571 ms	453 ms	4.42 s
Shortest path	307 ms	219 ms	145 ms	7.4 ms	2.20 s

Neighbors sampled: 16 · BFS visited: 234,991 · SP hops: 7

com-LiveJournal (~4M vertices, ~34.7M edges)

Metric	STDB	STDB (opt)	Neo4j	Neo4j (opt)	AGE
Load	177 s	179 s	1,069 s	786 s	972 s
Neighbor count	8.16 s	1.5 ms	271 ms	110 ms	1.26 s
BFS full	29.0 s	14.5 s	34.7 s	39.7 s	603 s
Shortest path	9.10 s	7.69 s	184 ms	125 ms	4.47 s

Neighbors sampled: 23 · BFS visited: 3,995,057 · SP hops: 2

Neo4j (optimized) required 8 GB heap for LiveJournal (OOM'd at 4 GB).

Transaction hold times (STDB optimized, procedures)

Dataset	Edges	Neighbor tx	BFS tx	BFS total	SP tx	SP total
ego-Facebook	88K	~40 µs	15 ms	20 ms	13 ms	16 ms
com-DBLP	1.05M	~50 µs	211 ms	339 ms	177 ms	219 ms
com-LiveJournal	34.7M	~60 µs	7.2 s	14.5 s	6.6 s	7.7 s

Neighbor count tx is O(degree), constant across datasets. BFS/SP tx time is dominated by the full edge-table scan to build the adjacency HashMap — the bottleneck Stage 2 eliminates.

Analysis

SpacetimeDB dominates loading at every scale. 4-6x faster than all competitors. On LiveJournal: 179 s vs. 786-1,069 s. Bulk reducer calls on in-memory B-tree tables process the entire insertion in a single transaction without per-row commit overhead.

The optimized neighbor count is the standout result. LiveJournal: 8.16 s → 1.5 ms (5,440x improvement). The baseline rebuilds the 34.7M-edge adjacency map just to count 23 neighbors. The optimized version uses a direct B-tree index scan inside a procedure tx held for ~60 µs. Total ~1.5 ms is constant regardless of graph size.

Neo4j's shortest path is unmatched. LiveJournal: 125 ms vs. 7.7 s (62x gap). DBLP: 7.4 ms vs. 219 ms (30x). Root cause is algorithmic: STDB runs unidirectional BFS after scanning all edges into a HashMap; Neo4j uses a dedicated bidirectional shortestPath() that terminates when frontiers meet. On LiveJournal the path is 2 hops — Neo4j finds it in ~50 node expansions, STDB scans all 4M vertices. Stage 2's bidirectional operator targets 80-150 ms.

AGE degrades sharply on larger graphs. BFS: 77 ms (ego-Facebook) → 4.42 s (DBLP) → 603 s (LiveJournal). Recursive CTEs over PostgreSQL tables lack graph-specific query planning; each iteration performs a full join against the edge table with no pruning strategies.

Scaling behavior summary

Metric	ego-Facebook winner	DBLP winner	LiveJournal winner	Why
Load	STDB	STDB	STDB	Consistent 5-6x advantage at every scale
Neighbors	STDB (opt)	STDB (opt)	STDB (opt)	Index scan is O(degree), constant at ~1.5 ms
BFS	STDB (opt)	STDB (opt)	STDB (opt)	Adjacency map viable, but slowing at scale
Shortest path	Neo4j (opt)	Neo4j (opt)	Neo4j (opt)	Bidirectional BFS wins at every scale

Reproducing

1. Build the SpacetimeDB standalone Docker image:

docker build -f crates/standalone/Dockerfile -t bench-stdb-local .

2. Build the graph module WASM binary:

cargo build -p graph-module --release --target wasm32-unknown-unknown

3. Start all three databases in Docker:

# SpacetimeDB (in-memory, no durability)
docker run -d --name bench-stdb -p 3000:3000 bench-stdb-local start --data-dir=/stdb/data --in-memory --jwt-priv-key-path=/tmp/id_ecdsa --jwt-pub-key-path=/tmp/id_ecdsa.pub

# Neo4j Community (APOC plugin, 8 GB heap, durability off)
docker run -d --name bench-neo4j -p 7474:7474 -p 7687:7687 -e NEO4J_AUTH=none -e NEO4J_PLUGINS='["apoc"]' -e NEO4J_server_memory_heap_initial__size=2g -e NEO4J_server_memory_heap_max__size=8g neo4j:community

# Apache AGE (PostgreSQL 18, fsync off for fair comparison)
docker run -d --name bench-age -p 5455:5432 --shm-size=1g -e POSTGRES_DB=bench_age -e POSTGRES_HOST_AUTH_METHOD=trust apache/age postgres -c fsync=off -c synchronous_commit=off -c full_page_writes=off

Wait for containers to become healthy (~20 seconds), then:

4. Run the benchmark:

# Single dataset
cargo run -p graph-bench-xsys --release -- --systems spacetimedb,stdb-opt,neo4j,neo4j-opt,age --dataset small

# All three datasets
cargo run -p graph-bench-xsys --release -- --systems spacetimedb,stdb-opt,neo4j,neo4j-opt,age

5. Clean up:

docker rm -f bench-stdb bench-neo4j bench-age

Results written to artifacts/benchmarks/graph-cross-system-benchmarks.md.

Algorithm-level benchmarks (`graph-algo` crate)

Pure in-memory BFS, DFS, and shortest-path measurements — the same algorithms invoked by the graph module, measured without table-scan overhead to isolate algorithm scaling.

Hardware: AMD Ryzen 7 3800X 8-Core @ 3.9 GHz, 32 GB RAM, Windows 10. Rust 1.93.1.

Topology: Ring-of-successors with fixed out-degree 3. Vertex counts: 100, 1,000, 10,000. Setup: 2 warmup iterations (discarded), 5 measured iterations, median/min/max reported.

Full-graph traversals

Algorithm	Vertices	Median	Min	Max
BFS	100	0.010 ms	0.010 ms	0.013 ms
DFS	100	0.011 ms	0.011 ms	0.011 ms
BFS	1,000	0.128 ms	0.117 ms	0.147 ms
DFS	1,000	0.123 ms	0.121 ms	0.128 ms
BFS	10,000	1.729 ms	1.668 ms	2.462 ms
DFS	10,000	2.028 ms	1.694 ms	2.698 ms

Linear scaling — ~10x time for 10x vertices. At 10,000 vertices (30,000 edges), full traversal in under 2 ms. BFS and DFS perform comparably; DFS shows slightly higher variance at large sizes.

Depth-bounded traversals

Algorithm	Vertices	max_depth	Visited	Median
BFS	100	3	10	0.002 ms
DFS	100	3	10	0.002 ms
BFS	100	5	16	0.003 ms
DFS	100	5	16	0.003 ms
BFS	100	10	31	0.005 ms
DFS	100	10	31	0.006 ms
BFS	10,000	3	10	0.001 ms
DFS	10,000	3	10	0.001 ms
BFS	10,000	10	31	0.004 ms
DFS	10,000	10	31	0.004 ms

Timing depends on visited count, not total graph size — as expected for O(V_visited + E_visited) algorithms with fixed fan-out.

Shortest path

Vertices	Path	Hops	Median
100	1→4 (near)	1	0.001 ms
100	1→51	17	0.010 ms
1,000	1→4 (near)	1	0.001 ms
1,000	1→501	167	0.113 ms
10,000	1→4 (near)	1	0.001 ms
10,000	1→5,001	1,667	0.979 ms

Near-neighbor lookups: sub-microsecond regardless of graph size (early termination). Cross-ring paths scale with hop count — 1,667 hop path completes in ~1 ms at 10K vertices.

Reproducing

# From the repository root:
cargo bench -p graph-algo --bench traversal

Source: modules/graph-algo/benches/traversal.rs

What these numbers mean for SpacetimeDB

The algorithm benchmarks show the lower bound — how fast the traversal logic itself runs in memory. Inside a live SpacetimeDB instance, the graph module adds B-tree index lookups on the Edge table, transaction overhead, and (for client-invoked queries) HTTP round-trip latency.

The cross-system benchmarks show the real picture: SpacetimeDB's userland module wins loading, neighbors, and BFS at every scale — but shortest path trails Neo4j because the current architecture must scan all edges before it can traverse. Stage 2's bidirectional operator and index-hop BFS close both gaps without sacrificing the loading and neighbor-count advantages.

The architecture that makes Stage 1 fast at data loading and BFS (in-memory bulk operations, single-transaction inserts) is the same architecture that makes it slow at shortest path (must scan all edges before traversing). Stage 2 keeps the strengths and addresses the weaknesses.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Benchmarks

Benchmarks

Cross-system benchmarks

Systems under test

ego-Facebook (~4K vertices, ~88K edges)

com-DBLP (~317K vertices, ~1.05M edges)

com-LiveJournal (~4M vertices, ~34.7M edges)

Transaction hold times (STDB optimized, procedures)

Analysis

Scaling behavior summary

Reproducing

Algorithm-level benchmarks (`graph-algo` crate)

Full-graph traversals

Depth-bounded traversals

Shortest path

Reproducing

What these numbers mean for SpacetimeDB

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

Benchmarks

Benchmarks

Cross-system benchmarks

Systems under test

ego-Facebook (~4K vertices, ~88K edges)

com-DBLP (~317K vertices, ~1.05M edges)

com-LiveJournal (~4M vertices, ~34.7M edges)

Transaction hold times (STDB optimized, procedures)

Analysis

Scaling behavior summary

Reproducing

Algorithm-level benchmarks (graph-algo crate)

Full-graph traversals

Depth-bounded traversals

Shortest path

Reproducing

What these numbers mean for SpacetimeDB

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

Algorithm-level benchmarks (`graph-algo` crate)