Commit
Merge pull request #37 from prrao87/update-kuzu
Bump kuzu and neo4j and update benchmarks
prrao87 committed Mar 14, 2024
2 parents cedc932 + e54ced4 commit af4f2e5
Showing 12 changed files with 111 additions and 110 deletions.
60 changes: 31 additions & 29 deletions README.md
@@ -2,6 +2,11 @@

Code for the benchmark study described in this [blog post](https://thedataquarry.com/posts/embedded-db-2/).

> [!NOTE]
> Neo4j version: `5.18.0`
> KùzuDB version: `0.3.2`

[Kùzu](https://kuzudb.com/) is an in-process (embedded) graph database management system (GDBMS) written in C++. It is blazing fast 🔥, and is optimized for handling complex join-heavy analytical workloads on very large graphs. Kùzu is being actively developed, and its [goal](https://kuzudb.com/docusaurus/blog/what-every-gdbms-should-do-and-vision) is to do in the graph data science space what DuckDB did in the world of tabular data science -- that is, to provide a fast, lightweight, embeddable graph database for analytics (OLAP) use cases, with minimal infrastructure setup.
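
As a rough illustration of what "in-process" means here, Kùzu is opened directly from Python against a local directory, with no server to run. A minimal sketch (the database path and the `Person` label are illustrative, not necessarily the study's exact schema):

```python
import kuzu

# Open (or create) an embedded Kùzu database in a local directory -- no server process involved
db = kuzu.Database("./social_network_db")
conn = kuzu.Connection(db)

# Run a Cypher query in-process and iterate over the result
result = conn.execute("MATCH (p:Person) RETURN count(*) AS num_persons;")
while result.has_next():
    print(result.get_next())
```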

This study has the following goals:
@@ -92,23 +97,20 @@ The run times for both ingestion and queries are compared.
* For ingestion, KùzuDB is consistently faster than Neo4j by a factor of **~18x** for a graph size of 100K nodes and ~2.4M edges.
* For OLAP queries, KùzuDB is **significantly faster** than Neo4j, especially for ones that involve multi-hop queries via nodes with many-to-many relationships.

### Testing conditions
### Benchmark conditions

* M3 Macbook Pro, 32 GB RAM
* Neo4j version: `5.16.0`
* KùzuDB version: `0.2.0`
The benchmark is run on an M3 MacBook Pro with 36 GB RAM.

### Ingestion performance

In total, ~100K nodes and ~2.5 million edges are ingested **~18x** faster in KùzuDB than in Neo4j.

Case | Neo4j (sec) | Kùzu (sec) | Speedup factor
--- | ---: | ---: | ---:
Nodes | 2.3 | 0.1 | 23x
Edges | 30.6 | 2.2 | 14x
Total | 32.9 | 2.3 | 14x
Nodes | 2.4 | 0.2 | 12x
Edges | 30.9 | 0.4 | 77x
Total | 33.3 | 0.6 | 55x

Nodes are ingested significantly faster in Kùzu in this case, and Neo4j's node ingestion remains of the order of seconds despite setting constraints on the ID fields as per their best practices. The speedup factors shown are expected to be even higher as the dataset gets larger and larger, with Kùzu being around two orders of magnitude faster for inserting nodes.
Nodes are ingested significantly faster in Kùzu, and Neo4j's node ingestion remains on the order of seconds despite constraints being set on the ID fields as per its best practices. The speedup factors shown are expected to grow even larger as the dataset size increases with this approach; the only way to substantially speed up Neo4j's data ingestion is to avoid Python altogether and use `admin-import` instead.
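
For reference, bulk ingestion in Kùzu amounts to declaring the schema and issuing `COPY ... FROM` statements over the Parquet files. A minimal Python sketch (table, column, and edge-file names are illustrative and may not match the study's exact schema):

```python
import kuzu

db = kuzu.Database("./social_network_db")
conn = kuzu.Connection(db)

# Declare node and relationship tables, then bulk-load them straight from Parquet
conn.execute("CREATE NODE TABLE Person(id INT64, name STRING, PRIMARY KEY (id));")
conn.execute("CREATE REL TABLE Follows(FROM Person TO Person);")
conn.execute("COPY Person FROM 'data/output/nodes/persons.parquet';")
# The edge file path below is illustrative
conn.execute("COPY Follows FROM 'data/output/edges/follows.parquet';")
```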

### Query performance benchmark

@@ -132,39 +134,39 @@ The following table shows the run times for each query (averaged over the number

Query | Neo4j (sec) | Kùzu (sec) | Speedup factor
--- | ---: | ---: | ---:
1 | 1.5396 | 0.283 | 5.4
2 | 0.5680 | 0.378 | 1.5
3 | 0.0338 | 0.011 | 3.1
4 | 0.0391 | 0.009 | 4.3
5 | 0.0069 | 0.003 | 2.3
6 | 0.0159 | 0.034 | 0.5
7 | 0.1433 | 0.007 | 20.5
8 | 2.9034 | 0.092 | 31.6
9 | 3.6319 | 0.103 | 35.2
1 | 1.7614 | 0.2722 | 6.5x
2 | 0.6149 | 0.3340 | 1.8x
3 | 0.0388 | 0.0112 | 3.5x
4 | 0.0426 | 0.0094 | 4.5x
5 | 0.0080 | 0.0037 | 2.2x
6 | 0.0212 | 0.0335 | 0.6x
7 | 0.1592 | 0.0070 | 22.7x
8 | 3.2919 | 0.0901 | 36.5x
9 | 4.0125 | 0.1016 | 39.5x

#### Neo4j vs. Kùzu multi-threaded

KùzuDB supports multi-threaded query execution by default. The following results are for the same queries as above, but allowing Kùzu to choose the optimal number of threads for each query (a short sketch of how the thread count can be controlled from Python follows the table). Again, the run times for each query (averaged over the number of rounds run, guaranteed to be a minimum of 5 runs) are shown.

Query | Neo4j (sec) | Kùzu (sec) | Speedup factor
--- | ---: | ---: | ---:
1 | 1.5396 | 0.171 | 9.0
2 | 0.5680 | 0.203 | 2.8
3 | 0.0338 | 0.013 | 2.6
4 | 0.0391 | 0.012 | 3.3
5 | 0.0069 | 0.004 | 1.7
6 | 0.0159 | 0.033 | 0.5
7 | 0.1433 | 0.008 | 17.9
8 | 2.9034 | 0.074 | 39.3
9 | 3.6319 | 0.087 | 41.8
1 | 1.7614 | 0.1678 | 10.5x
2 | 0.6149 | 0.2025 | 3.0x
3 | 0.0388 | 0.0145 | 2.7x
4 | 0.0426 | 0.0136 | 3.1x
5 | 0.0080 | 0.0046 | 1.7x
6 | 0.0212 | 0.0346 | 0.6x
7 | 0.1592 | 0.0079 | 20.1x
8 | 3.2919 | 0.0777 | 42.4x
9 | 4.0125 | 0.0664 | 60.4x
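
For reference, Kùzu's thread count can be controlled from the Python client, which is how a single-threaded run is obtained for comparison. A minimal sketch, assuming the client's `set_max_threads_for_exec` method; the two-hop query is illustrative, not one of the benchmark queries verbatim:

```python
import kuzu

db = kuzu.Database("./social_network_db")
conn = kuzu.Connection(db)

# Pin query execution to a single thread for the single-threaded comparison;
# omit this call to let Kùzu pick the number of threads itself (the default).
conn.set_max_threads_for_exec(1)

# An illustrative two-hop query of the kind that benefits most from multi-threading
query = """
MATCH (a:Person)-[:Follows]->(b:Person)-[:Follows]->(c:Person)
RETURN count(*) AS num_two_hop_paths;
"""
result = conn.execute(query)
while result.has_next():
    print(result.get_next())
```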

> 🔥 The second-degree path-finding queries (8 and 9) show the biggest speedup over Neo4j, due to innovations in KùzuDB's query planner and execution engine.

### Ideas for future work

#### Scale up the dataset

It's possible to regenerate a fake dataset of ~100M nodes and ~2.5B edges, and see how the performance of KùzuDB and Neo4j compare -- it's likely that Neo4j cannot handle 2-hop path-finding queries at that scale on a single node, so queries 8 and 9 can be disabled for that larger dataset.
It's possible to regenerate an artificial dataset of ~100M nodes and ~2.5B edges, and see how the performance of KùzuDB and Neo4j compares -- it's likely that Neo4j cannot handle 2-hop path-finding queries at that scale on a single node, so queries 8 and 9 can be disabled for that larger dataset.

```sh
# Generate data with 100M persons and ~2.5B edges (Might take a while in Python!)
```
Binary file modified data/output/nodes/cities.parquet
Binary file not shown.
Binary file modified data/output/nodes/persons.parquet
Binary file not shown.
Binary file modified data/output/nodes/states.parquet
Binary file not shown.
74 changes: 37 additions & 37 deletions kuzudb/README.md
@@ -25,8 +25,8 @@ As expected, the nodes load much faster than the edges, since there are many mor

```bash
$ python build_graph.py
Nodes loaded in 0.1509s
Edges loaded in 2.2402s
Nodes loaded in 0.1542s
Edges loaded in 0.3803s
Successfully loaded nodes and edges into KùzuDB!
```

@@ -420,67 +420,67 @@ Queries completed in 0.7561s
The benchmark is run using the `pytest-benchmark` package as follows; a short sketch of what an individual benchmark test looks like is shown after the output.

```sh
$ pytest benchmark_query.py --benchmark-min-rounds=5 --benchmark-warmup-iterations=5 --benchmark-disable-gc --benchmark-sort=fullname
========================================================================================= test session starts ==========================================================================================
platform darwin -- Python 3.11.7, pytest-8.0.0, pluggy-1.4.0
$ pytest benchmark_query.py --benchmark-min-rounds=5 --benchmark-warmup-iterations=5 --benchmark-disable-gc --benchmark-sort=fullname
====================================================================================================== test session starts =======================================================================================================
platform darwin -- Python 3.11.7, pytest-8.1.1, pluggy-1.4.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=True min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=5)
rootdir: /Users/prrao/code/kuzudb-study/kuzudb
plugins: Faker-23.1.0, benchmark-4.0.0
collected 9 items

benchmark_query.py ......... [100%]


--------------------------------------------------------------------------------------- benchmark: 9 tests --------------------------------------------------------------------------------------
Name (time in ms) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_query1 252.0117 (86.58) 347.7262 (58.99) 283.1725 (84.65) 40.3382 (189.01) 263.0762 (79.45) 55.3614 (233.22) 1;0 3.5314 (0.01) 5 1
test_benchmark_query2 295.4568 (101.50) 490.8745 (83.28) 378.4995 (113.15) 82.3863 (386.04) 403.3876 (121.83) 127.7752 (538.28) 2;0 2.6420 (0.01) 5 1
test_benchmark_query3 10.3258 (3.55) 12.6975 (2.15) 10.8966 (3.26) 0.4811 (2.25) 10.7724 (3.25) 0.3823 (1.61) 11;4 91.7722 (0.31) 66 1
test_benchmark_query4 8.0921 (2.78) 9.1896 (1.56) 8.7203 (2.61) 0.2134 (1.0) 8.7555 (2.64) 0.2837 (1.20) 25;1 114.6747 (0.38) 78 1
test_benchmark_query5 2.9108 (1.0) 5.8945 (1.0) 3.3450 (1.0) 0.3156 (1.48) 3.3112 (1.0) 0.2374 (1.0) 13;3 298.9503 (1.0) 114 1
test_benchmark_query6 32.9890 (11.33) 36.0460 (6.12) 34.3424 (10.27) 0.7993 (3.75) 34.3895 (10.39) 1.1800 (4.97) 10;0 29.1185 (0.10) 27 1
test_benchmark_query7 6.1617 (2.12) 7.5800 (1.29) 6.7920 (2.03) 0.3007 (1.41) 6.7980 (2.05) 0.4178 (1.76) 34;0 147.2325 (0.49) 93 1
test_benchmark_query8 87.3487 (30.01) 94.6871 (16.06) 92.0254 (27.51) 2.4032 (11.26) 92.1223 (27.82) 3.5121 (14.80) 3;0 10.8666 (0.04) 9 1
test_benchmark_query9 99.9393 (34.33) 105.5227 (17.90) 103.5184 (30.95) 1.7100 (8.01) 104.0372 (31.42) 1.4556 (6.13) 2;1 9.6601 (0.03) 8 1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------- benchmark: 9 tests --------------------------------------------------------------------------------------
Name (time in ms) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_query1 254.0487 (77.28) 331.9569 (65.09) 272.2583 (72.23) 33.5090 (118.10) 258.9573 (70.13) 24.4201 (82.49) 1;1 3.6730 (0.01) 5 1
test_benchmark_query2 293.4545 (89.27) 388.5836 (76.19) 334.0680 (88.63) 48.9350 (172.46) 301.8217 (81.74) 88.7258 (299.73) 2;0 2.9934 (0.01) 5 1
test_benchmark_query3 10.4950 (3.19) 12.3280 (2.42) 11.2442 (2.98) 0.3642 (1.28) 11.2188 (3.04) 0.4407 (1.49) 19;2 88.9345 (0.34) 62 1
test_benchmark_query4 8.6238 (2.62) 11.0205 (2.16) 9.3746 (2.49) 0.4232 (1.49) 9.2816 (2.51) 0.4236 (1.43) 15;6 106.6709 (0.40) 76 1
test_benchmark_query5 3.2872 (1.0) 5.1003 (1.0) 3.7691 (1.0) 0.3535 (1.25) 3.6925 (1.0) 0.2960 (1.0) 23;9 265.3119 (1.0) 104 1
test_benchmark_query6 32.8883 (10.00) 35.4205 (6.94) 33.5387 (8.90) 0.5317 (1.87) 33.3696 (9.04) 0.6214 (2.10) 6;1 29.8163 (0.11) 28 1
test_benchmark_query7 6.2537 (1.90) 7.7147 (1.51) 7.0166 (1.86) 0.2837 (1.0) 7.0423 (1.91) 0.3966 (1.34) 34;0 142.5183 (0.54) 91 1
test_benchmark_query8 86.9893 (26.46) 91.6528 (17.97) 90.1817 (23.93) 1.5253 (5.38) 90.8585 (24.61) 2.1688 (7.33) 1;0 11.0887 (0.04) 9 1
test_benchmark_query9 98.5566 (29.98) 105.5151 (20.69) 101.6341 (26.96) 2.2933 (8.08) 101.5073 (27.49) 2.8376 (9.59) 2;0 9.8392 (0.04) 7 1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
OPS: Operations Per Second, computed as 1 / Mean
========================================================================================== 9 passed in 11.55s ==========================================================================================
======================================================================================================= 9 passed in 11.30s =======================================================================================================
```
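
For reference, each of the nine queries is exercised through `pytest-benchmark`'s `benchmark` fixture, which repeatedly calls the query function and records the statistics shown above. A minimal sketch (the fixture, query function, and query shown are illustrative, not the exact contents of `benchmark_query.py`):

```python
import kuzu
import pytest


@pytest.fixture(scope="module")
def connection():
    # Open the embedded database once and reuse the connection across benchmarks
    db = kuzu.Database("./social_network_db")
    return kuzu.Connection(db)


def run_query1(conn):
    # Illustrative stand-in for the study's query 1
    result = conn.execute("MATCH (p:Person) RETURN count(*) AS num_persons;")
    return result.get_next()


def test_benchmark_query1(benchmark, connection):
    # pytest-benchmark times repeated calls to run_query1 and reports min/max/mean/stddev
    rows = benchmark(run_query1, connection)
    assert rows is not None
```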

#### Query performance (Kùzu multi-threaded)

```sh
$ pytest benchmark_query.py --benchmark-min-rounds=5 --benchmark-warmup-iterations=5 --benchmark-disable-gc --benchmark-sort=fullname
========================================================================================= test session starts ==========================================================================================
platform darwin -- Python 3.11.7, pytest-8.0.0, pluggy-1.4.0
$ pytest benchmark_query.py --benchmark-min-rounds=5 --benchmark-warmup-iterations=5 --benchmark-disable-gc --benchmark-sort=fullname
====================================================================================================== test session starts =======================================================================================================
platform darwin -- Python 3.11.7, pytest-8.1.1, pluggy-1.4.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=True min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=5)
rootdir: /Users/prrao/code/kuzudb-study/kuzudb
plugins: Faker-23.1.0, benchmark-4.0.0
collected 9 items

benchmark_query.py ......... [100%]


-------------------------------------------------------------------------------------- benchmark: 9 tests --------------------------------------------------------------------------------------
Name (time in ms) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_query1 144.0958 (38.16) 268.6453 (45.91) 171.1153 (37.70) 54.5636 (238.27) 147.4435 (32.35) 34.3697 (105.56) 1;1 5.8440 (0.03) 5 1
test_benchmark_query2 201.5580 (53.38) 206.7633 (35.33) 203.4183 (44.81) 2.1432 (9.36) 203.2485 (44.60) 3.0854 (9.48) 1;0 4.9160 (0.02) 5 1
test_benchmark_query3 12.5962 (3.34) 13.7345 (2.35) 13.0013 (2.86) 0.2290 (1.0) 12.9419 (2.84) 0.3256 (1.0) 16;1 76.9153 (0.35) 57 1
test_benchmark_query4 11.6342 (3.08) 13.6797 (2.34) 12.4356 (2.74) 0.5213 (2.28) 12.2805 (2.69) 0.7021 (2.16) 21;0 80.4144 (0.37) 59 1
test_benchmark_query5 3.7759 (1.0) 5.8518 (1.0) 4.5393 (1.0) 0.3693 (1.61) 4.5571 (1.0) 0.4914 (1.51) 31;1 220.2987 (1.0) 102 1
test_benchmark_query6 31.2499 (8.28) 49.5773 (8.47) 33.4805 (7.38) 3.3679 (14.71) 32.5581 (7.14) 1.9644 (6.03) 1;1 29.8682 (0.14) 29 1
test_benchmark_query7 7.1773 (1.90) 10.1346 (1.73) 8.2812 (1.82) 0.6263 (2.74) 8.1582 (1.79) 0.6606 (2.03) 19;6 120.7552 (0.55) 80 1
test_benchmark_query8 64.1151 (16.98) 83.9229 (14.34) 73.9044 (16.28) 5.2735 (23.03) 73.8122 (16.20) 4.6661 (14.33) 4;2 13.5310 (0.06) 12 1
test_benchmark_query9 55.9865 (14.83) 147.0649 (25.13) 84.8630 (18.70) 27.7042 (120.98) 74.6289 (16.38) 3.9108 (12.01) 3;4 11.7837 (0.05) 14 1
test_benchmark_query1 143.7831 (36.84) 252.6395 (38.09) 167.8478 (36.78) 47.4400 (111.50) 147.9915 (33.16) 29.3312 (60.86) 1;1 5.9578 (0.03) 5 1
test_benchmark_query2 198.2216 (50.79) 205.8762 (31.04) 202.4746 (44.37) 3.0336 (7.13) 203.0530 (45.50) 4.6756 (9.70) 2;0 4.9389 (0.02) 5 1
test_benchmark_query3 13.5389 (3.47) 15.5465 (2.34) 14.4884 (3.17) 0.4255 (1.0) 14.4661 (3.24) 0.4819 (1.0) 15;1 69.0209 (0.31) 52 1
test_benchmark_query4 12.5585 (3.22) 14.5405 (2.19) 13.6137 (2.98) 0.4390 (1.03) 13.5607 (3.04) 0.5406 (1.12) 20;1 73.4555 (0.34) 55 1
test_benchmark_query5 3.9030 (1.0) 6.6330 (1.0) 4.5634 (1.0) 0.4712 (1.11) 4.4623 (1.0) 0.4962 (1.03) 16;5 219.1327 (1.0) 101 1
test_benchmark_query6 32.6305 (8.36) 42.6955 (6.44) 34.6170 (7.59) 2.0708 (4.87) 34.1572 (7.65) 0.7366 (1.53) 3;3 28.8876 (0.13) 27 1
test_benchmark_query7 6.9358 (1.78) 9.6718 (1.46) 7.8832 (1.73) 0.4438 (1.04) 7.8641 (1.76) 0.4891 (1.02) 22;2 126.8526 (0.58) 91 1
test_benchmark_query8 65.6220 (16.81) 125.4942 (18.92) 77.7316 (17.03) 21.5360 (50.61) 66.9292 (15.00) 3.3628 (6.98) 3;3 12.8648 (0.06) 14 1
test_benchmark_query9 64.6778 (16.57) 68.5543 (10.34) 66.3754 (14.55) 1.0579 (2.49) 66.3023 (14.86) 1.0378 (2.15) 4;1 15.0658 (0.07) 14 1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
OPS: Operations Per Second, computed as 1 / Mean
========================================================================================== 9 passed in 10.20s ==========================================================================================
```
======================================================================================================= 9 passed in 10.13s =======================================================================================================
```
