-
Notifications
You must be signed in to change notification settings - Fork 0
Benchmarks
Two benchmark suites. One measures pure algorithm performance; the other compares SpacetimeDB against Neo4j and Apache AGE on real-world graph datasets at scale.
A dockerized comparison of SpacetimeDB (Wasm module, zero engine changes), Neo4j Community Edition, and Apache AGE (PostgreSQL extension) across three SNAP datasets. All systems run as Docker containers on the same host.
Hardware: AMD Ryzen 7 3800X 8-Core @ 3.9 GHz, 32 GB RAM, Windows 10.
| System | Interface | Notes |
|---|---|---|
| SpacetimeDB (baseline) | Reducers over HTTP | Bulk-insert + traversal result tables; 3 HTTP round-trips per query |
| SpacetimeDB (optimized) | Procedures over HTTP | Single round-trip; short tx for edge snapshot, traversal outside lock |
| Neo4j (baseline) | Bolt + Cypher | 5K batch sizes, index after insert, no warmup |
| Neo4j (optimized) | Bolt + APOC | 10K batch sizes, index before insert, warmup, APOC subgraphNodes |
| Apache AGE | SQL + recursive CTE | Direct INSERT into internal tables, recursive CTE over edge/vertex tables |
| Metric | STDB | STDB (opt) | Neo4j | Neo4j (opt) | AGE |
|---|---|---|---|---|---|
| Load | 376 ms | 376 ms | 4.78 s | 1.74 s | 1.95 s |
| Neighbor count | 25 ms | 1.2 ms | 229 ms | 8 ms | 12 ms |
| BFS full | 31 ms | 20 ms | 228 ms | 54 ms | 77 ms |
| Shortest path | 21 ms | 16 ms | 178 ms | 15 ms | 127 ms |
Neighbors sampled: 347 · BFS visited: 3,829 · SP hops: 5
| Metric | STDB | STDB (opt) | Neo4j | Neo4j (opt) | AGE |
|---|---|---|---|---|---|
| Load | 4.6 s | 4.8 s | 34.0 s | 23.0 s | 30.1 s |
| Neighbor count | 274 ms | 1.9 ms | 201 ms | 112 ms | 147 ms |
| BFS full | 1.12 s | 339 ms | 571 ms | 453 ms | 4.42 s |
| Shortest path | 307 ms | 219 ms | 145 ms | 7.4 ms | 2.20 s |
Neighbors sampled: 16 · BFS visited: 234,991 · SP hops: 7
| Metric | STDB | STDB (opt) | Neo4j | Neo4j (opt) | AGE |
|---|---|---|---|---|---|
| Load | 177 s | 179 s | 1,069 s | 786 s | 972 s |
| Neighbor count | 8.16 s | 1.5 ms | 271 ms | 110 ms | 1.26 s |
| BFS full | 29.0 s | 14.5 s | 34.7 s | 39.7 s | 603 s |
| Shortest path | 9.10 s | 7.69 s | 184 ms | 125 ms | 4.47 s |
Neighbors sampled: 23 · BFS visited: 3,995,057 · SP hops: 2
Neo4j (optimized) required 8 GB heap for LiveJournal (OOM'd at 4 GB).
| Dataset | Edges | Neighbor tx | BFS tx | BFS total | SP tx | SP total |
|---|---|---|---|---|---|---|
| ego-Facebook | 88K | ~40 µs | 15 ms | 20 ms | 13 ms | 16 ms |
| com-DBLP | 1.05M | ~50 µs | 211 ms | 339 ms | 177 ms | 219 ms |
| com-LiveJournal | 34.7M | ~60 µs | 7.2 s | 14.5 s | 6.6 s | 7.7 s |
Neighbor count tx is O(degree), constant across datasets. BFS/SP tx time is dominated by the full edge-table scan to build the adjacency HashMap — the bottleneck Stage 2 eliminates.
SpacetimeDB dominates loading at every scale. 4-6x faster than all competitors. On LiveJournal: 179 s vs. 786-1,069 s. Bulk reducer calls on in-memory B-tree tables process the entire insertion in a single transaction without per-row commit overhead.
The optimized neighbor count is the standout result. LiveJournal: 8.16 s → 1.5 ms (5,440x improvement). The baseline rebuilds the 34.7M-edge adjacency map just to count 23 neighbors. The optimized version uses a direct B-tree index scan inside a procedure tx held for ~60 µs. Total ~1.5 ms is constant regardless of graph size.
Neo4j's shortest path is unmatched. LiveJournal: 125 ms vs. 7.7 s (62x gap). DBLP: 7.4 ms vs. 219 ms (30x). Root cause is algorithmic: STDB runs unidirectional BFS after scanning all edges into a HashMap; Neo4j uses a dedicated bidirectional shortestPath() that terminates when frontiers meet. On LiveJournal the path is 2 hops — Neo4j finds it in ~50 node expansions, STDB scans all 4M vertices. Stage 2's bidirectional operator targets 80-150 ms.
AGE degrades sharply on larger graphs. BFS: 77 ms (ego-Facebook) → 4.42 s (DBLP) → 603 s (LiveJournal). Recursive CTEs over PostgreSQL tables lack graph-specific query planning; each iteration performs a full join against the edge table with no pruning strategies.
| Metric | ego-Facebook winner | DBLP winner | LiveJournal winner | Why |
|---|---|---|---|---|
| Load | STDB | STDB | STDB | Consistent 5-6x advantage at every scale |
| Neighbors | STDB (opt) | STDB (opt) | STDB (opt) | Index scan is O(degree), constant at ~1.5 ms |
| BFS | STDB (opt) | STDB (opt) | STDB (opt) | Adjacency map viable, but slowing at scale |
| Shortest path | Neo4j (opt) | Neo4j (opt) | Neo4j (opt) | Bidirectional BFS wins at every scale |
1. Build the SpacetimeDB standalone Docker image:
docker build -f crates/standalone/Dockerfile -t bench-stdb-local .2. Build the graph module WASM binary:
cargo build -p graph-module --release --target wasm32-unknown-unknown3. Start all three databases in Docker:
# SpacetimeDB (in-memory, no durability)
docker run -d --name bench-stdb -p 3000:3000 bench-stdb-local start --data-dir=/stdb/data --in-memory --jwt-priv-key-path=/tmp/id_ecdsa --jwt-pub-key-path=/tmp/id_ecdsa.pub
# Neo4j Community (APOC plugin, 8 GB heap, durability off)
docker run -d --name bench-neo4j -p 7474:7474 -p 7687:7687 -e NEO4J_AUTH=none -e NEO4J_PLUGINS='["apoc"]' -e NEO4J_server_memory_heap_initial__size=2g -e NEO4J_server_memory_heap_max__size=8g neo4j:community
# Apache AGE (PostgreSQL 18, fsync off for fair comparison)
docker run -d --name bench-age -p 5455:5432 --shm-size=1g -e POSTGRES_DB=bench_age -e POSTGRES_HOST_AUTH_METHOD=trust apache/age postgres -c fsync=off -c synchronous_commit=off -c full_page_writes=offWait for containers to become healthy (~20 seconds), then:
4. Run the benchmark:
# Single dataset
cargo run -p graph-bench-xsys --release -- --systems spacetimedb,stdb-opt,neo4j,neo4j-opt,age --dataset small
# All three datasets
cargo run -p graph-bench-xsys --release -- --systems spacetimedb,stdb-opt,neo4j,neo4j-opt,age5. Clean up:
docker rm -f bench-stdb bench-neo4j bench-ageResults written to artifacts/benchmarks/graph-cross-system-benchmarks.md.
Pure in-memory BFS, DFS, and shortest-path measurements — the same algorithms invoked by the graph module, measured without table-scan overhead to isolate algorithm scaling.
Hardware: AMD Ryzen 7 3800X 8-Core @ 3.9 GHz, 32 GB RAM, Windows 10. Rust 1.93.1.
Topology: Ring-of-successors with fixed out-degree 3. Vertex counts: 100, 1,000, 10,000. Setup: 2 warmup iterations (discarded), 5 measured iterations, median/min/max reported.
| Algorithm | Vertices | Median | Min | Max |
|---|---|---|---|---|
| BFS | 100 | 0.010 ms | 0.010 ms | 0.013 ms |
| DFS | 100 | 0.011 ms | 0.011 ms | 0.011 ms |
| BFS | 1,000 | 0.128 ms | 0.117 ms | 0.147 ms |
| DFS | 1,000 | 0.123 ms | 0.121 ms | 0.128 ms |
| BFS | 10,000 | 1.729 ms | 1.668 ms | 2.462 ms |
| DFS | 10,000 | 2.028 ms | 1.694 ms | 2.698 ms |
Linear scaling — ~10x time for 10x vertices. At 10,000 vertices (30,000 edges), full traversal in under 2 ms. BFS and DFS perform comparably; DFS shows slightly higher variance at large sizes.
| Algorithm | Vertices | max_depth | Visited | Median |
|---|---|---|---|---|
| BFS | 100 | 3 | 10 | 0.002 ms |
| DFS | 100 | 3 | 10 | 0.002 ms |
| BFS | 100 | 5 | 16 | 0.003 ms |
| DFS | 100 | 5 | 16 | 0.003 ms |
| BFS | 100 | 10 | 31 | 0.005 ms |
| DFS | 100 | 10 | 31 | 0.006 ms |
| BFS | 10,000 | 3 | 10 | 0.001 ms |
| DFS | 10,000 | 3 | 10 | 0.001 ms |
| BFS | 10,000 | 10 | 31 | 0.004 ms |
| DFS | 10,000 | 10 | 31 | 0.004 ms |
Timing depends on visited count, not total graph size — as expected for O(V_visited + E_visited) algorithms with fixed fan-out.
| Vertices | Path | Hops | Median |
|---|---|---|---|
| 100 | 1→4 (near) | 1 | 0.001 ms |
| 100 | 1→51 | 17 | 0.010 ms |
| 1,000 | 1→4 (near) | 1 | 0.001 ms |
| 1,000 | 1→501 | 167 | 0.113 ms |
| 10,000 | 1→4 (near) | 1 | 0.001 ms |
| 10,000 | 1→5,001 | 1,667 | 0.979 ms |
Near-neighbor lookups: sub-microsecond regardless of graph size (early termination). Cross-ring paths scale with hop count — 1,667 hop path completes in ~1 ms at 10K vertices.
# From the repository root:
cargo bench -p graph-algo --bench traversalSource: modules/graph-algo/benches/traversal.rs
The algorithm benchmarks show the lower bound — how fast the traversal logic itself runs in memory. Inside a live SpacetimeDB instance, the graph module adds B-tree index lookups on the Edge table, transaction overhead, and (for client-invoked queries) HTTP round-trip latency.
The cross-system benchmarks show the real picture: SpacetimeDB's userland module wins loading, neighbors, and BFS at every scale — but shortest path trails Neo4j because the current architecture must scan all edges before it can traverse. Stage 2's bidirectional operator and index-hop BFS close both gaps without sacrificing the loading and neighbor-count advantages.
The architecture that makes Stage 1 fast at data loading and BFS (in-memory bulk operations, single-transaction inserts) is the same architecture that makes it slow at shortest path (must scan all edges before traversing). Stage 2 keeps the strengths and addresses the weaknesses.