- extended to include datahike
- munged the core bench stuff to work on local windows setup
- benchmark code should be equivalent (just adapted for API nuances)
Designed to be run from clojure cli. Assuming you have that installed and the `clojure` or `clj` command available:
clj -m bench
Invokes a somewhat lengthy processing of all the libs and query benchmarks. There should be support similar to the original for passing in specific libs to test, right now all versions are “latest” and really only resolve to what’s in the deps.edn; so there’s a difference in that respect.
Median run times in ms (per extant benchmarking infrastructure):
- openjdk version “1.8.0_222”
- OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_222-b10)
- OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.222-b10, mixed mode)
- 4gb Xmx
- Dell XPS 15 9550
If we run the bench for the first time, we get good performance. On windows the tmp path will just populate a database at c:\tmp\blah\ . The database file is about 100mb.
version | q1 | q2 | q3 | q4 | qpred1 | qpred2 |
---|---|---|---|---|---|---|
latest-datascript | 2.7 | 6.4 | 9.3 | 14.8 | 8.4 | 31.4 |
latest-datalevin | 0.79 | 3.5 | 4.6 | 6.7 | 8.1 | 9.6 |
latest-datahike-mem | 10.6 | 22.7 | 35.4 | 48.9 | 24.4 | 46.4 |
latest-datahike-file | 11.3 | 23.2 | 35.3 | 49.4 | 24.0 | 46.0 |
In the original version of the benchmark, we don’t clean up this resource, so datalevin will re-use the existing database every time, which can lead to growing the database (lots of new random entries) from a 20K person db to far more, leading to a deceptive benchmark. By the time this was pointed out, I had run the benchmark many times, which ultimately led to a 1.3gb database file. That explains the performance differential!
version | q1 | q2 | q3 | q4 | qpred1 | qpred2 |
---|---|---|---|---|---|---|
latest-datascript | 2.3 | 6.2 | 11.3 | 16.8 | 10.3 | 38.0 |
latest-datalevin | 5.4 | 23.6 | 26.8 | 41.6 | 53.3 | 60.4 |
latest-datahike-mem | 12.2 | 25.0 | 35.1 | 47.9 | 23.7 | 46.2 |
latest-datahike-file | 10.6 | 22.1 | 34.6 | 47.8 | 23.8 | 46.4 |
- 2.3 GHz Intel Xeon® E5-2686 v4 (Broadwell) processors or 2.4 GHz Intel Xeon® E5-2676 v3 (Haswell) processors
Model | vCPU* | Mem (GiB) | Storage | Dedicated EBS Bandwidth (Mbps) | Network Performance |
---|---|---|---|---|---|
m4.large | 2 | 8 | EBS-only | 450 | Moderate |
- openjdk version “1.8.0_265”
- OpenJDK Runtime Environment (build 1.8.0_265-8u265-b01-0ubuntu2~16.04-b01)
- OpenJDK 64-Bit Server VM (build 25.265-b01, mixed mode)
- 4gb Xmx
version | q1 | q2 | q3 | q4 | qpred1 | qpred2 |
---|---|---|---|---|---|---|
latest-datascript | 2.9 | 7.9 | 11.6 | 18.4 | 10.5 | 35.5 |
latest-datalevin | 0.99 | 4.5 | 5.7 | 8.7 | 9.9 | 12.0 |
latest-datahike-mem | 16.6 | 34.5 | 53.5 | 74.9 | 33.2 | 58.1 |
latest-datahike-file | 16.1 | 34.1 | 53.8 | 72.4 | 33.0 | 58.0 |
Ensure resource cleanup on windows for datalevin, since it will reuse the database (of course). Datahike and datascript didn’t experience this since they cleaned up / deleted their databases as part of the original benchmark.