persistent-collections-benchmark

A set of benchmarks comparing the performance of various operations from four different JVM collections frameworks.

Jump to the results

Goals

The goal of these benchmarks is to compare the computational performance of addition, removal and lookup on different types of persistent collections from three different JVM libraries/languages:

  • PCollections
  • Kotlin (kotlinx.collections.immutable)
  • Clojure

Another goal is to see how persistent collections compare to the mutable java.util Collections framework.
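
To make the difference being measured concrete, here is a minimal sketch (not taken from the benchmark sources) contrasting addition on a mutable java.util list with addition on a persistent PCollections vector:

```java
import java.util.ArrayList;
import java.util.List;

import org.pcollections.TreePVector;

public class PersistentVsMutable {
    public static void main(String[] args) {
        // Mutable java.util list: add() changes the list in place.
        List<String> mutable = new ArrayList<>();
        mutable.add("a");

        // Persistent PCollections vector: plus() returns a new version
        // and leaves the original untouched.
        TreePVector<String> v0 = TreePVector.empty();
        TreePVector<String> v1 = v0.plus("a");
        System.out.println(v0.size() + " " + v1.size()); // prints "0 1"
    }
}
```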

Non-Goals

Claiming Authority

This benchmark does not aim to determine which collections are ultimately better than others in general.
It does not emulate a real-world scenario, because each operation is benchmarked individually. In reality, most uses of collections mix addition, removal and lookup, and in a much less concentrated fashion. Collection size is also a significant factor for speed, so results may differ notably for collections of other sizes.

Benchmarking Undesirable Operations

This benchmark only considers the ways these collections are supposed to be used, i.e., it examines the performance of the operations the respective collections were built and optimised for (e.g. value-/key-based lookup on sets/maps, random access on vectors/arraylists, head/tail lookup on LinkedLists/queues/stacks). Thus, there are no benchmarks measuring the performance of random access on non-random-access collections, for example.
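
As a rough illustration of what "intended use" means here, this sketch (using the java.util types; the persistent counterparts offer analogous operations) shows the kind of operation each collection category is benchmarked on:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IntendedOperations {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>(Map.of("key", 1));
        map.get("key");               // key-based lookup on maps

        Set<String> set = new HashSet<>(Set.of("value"));
        set.contains("value");        // value-based lookup on sets

        List<String> list = new ArrayList<>(List.of("a", "b", "c"));
        list.get(1);                  // random access on random-access lists

        Deque<String> stack = new ArrayDeque<>(List.of("head", "tail"));
        stack.peekFirst();            // head lookup on stacks/queues
    }
}
```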

Benchmarking Memory Efficiency

Memory use and efficiency are not measured.

Examining Concurrent Contexts

The benchmarks all run on a single thread and make no statement about performance in multi-threaded contexts. The reasons for this decision are that

  1. Writing correct concurrent code is hard; writing meaningful concurrent benchmarks is harder.
  2. It is unclear how to compare Java's mutable collections to the persistent ones. Do we use locks for Java and atomic references for the others (see the sketch below)? If so, does that accurately represent the way we write concurrent code, and to what extent is that still comparable?
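
Here is a minimal sketch of the two styles that would have to be compared; it is an assumption for illustration, not code from this repository:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

import org.pcollections.PVector;
import org.pcollections.TreePVector;

class ConcurrentAdds {
    // Mutable java.util collection: every access is guarded by a lock.
    private final List<String> mutable = new ArrayList<>();

    synchronized void addMutable(String s) {
        mutable.add(s);
    }

    // Persistent collection: a new version is swapped in atomically
    // via compare-and-set, with no lock at all.
    private final AtomicReference<PVector<String>> persistent =
            new AtomicReference<>(TreePVector.empty());

    void addPersistent(String s) {
        persistent.updateAndGet(v -> v.plus(s));
    }
}
```

Both snippets are thread-safe, but they pay for safety in entirely different ways, which is exactly what makes a fair comparison hard to design.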

Comparing Collections of Different Types

The benchmark results are generally not suitable for comparing different types of collections (e.g. lists and sets), because the benchmarks may work differently. For example, the set lookup benchmarks carry the additional overhead of creating random strings to measure lookup of nonexistent elements; this is not done for lists and may therefore cause a significant difference in the results.
The random string creation method is benchmarked independently if you want to get a vague impression nonetheless.
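
For reference, such a random string generator might look like the following sketch; the length and alphabet here are assumptions, not the parameters actually used by the benchmarks:

```java
import java.util.concurrent.ThreadLocalRandom;

public class RandomStrings {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

    // Builds a random string by picking characters from a fixed alphabet.
    static String randomString(int length) {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(rnd.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(randomString(10)); // e.g. "k3x0q9mzb1"
    }
}
```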

Benchmarking all Available Operations

Lastly, the benchmarks do not cover every operation. For instance, Clojure's vectors and sorted maps support a constant-time reversing operation that is not included.

Structure

Below you can see how many collections are benchmarked in each category for each library. The last column gives information about how long each benchmark takes.

| Benchmark | Java | PCollections | Kotlin | Clojure | Warmup / Measurement |
|-----------|------|--------------|--------|---------|----------------------|
| Addition | 8 | 5 | 5 | 7 | 5 × 1M ops warmup, 10 × 1M ops measurement |
| Removal | 8 | 5 | 5 | 6 | 5 × 1M ops warmup, 10 × 1M ops measurement |
| Lookup | 8 | 5 | 5 | 7 | 5 × 10 s warmup, 5 × 10 s measurement |

Here are all the collections that are benchmarked. Clojure is missing the removal benchmark for its PersistentVector, because it doesn't define an index-based removal operation (see the sketch after the table).

| Data Structure | Java | PCollections | Kotlin | Clojure |
|----------------|------|--------------|--------|---------|
| Random Access List | ArrayList | TreePVector | PersistentVector | PersistentVector |
| Queue (FIFO) | LinkedList | - | - | PersistentQueue |
| Stack (LIFO) | LinkedList | ConsPStack | - | PersistentList |
| Unordered Set | HashSet | HashTreePSet | PersistentHashSet | PersistentHashSet |
| Unordered Map | HashMap | HashTreePMap | PersistentHashMap | PersistentHashMap |
| Linked Set (entry order) | LinkedHashSet | - | PersistentOrderedSet | - |
| Linked Map (entry order) | LinkedHashMap | - | PersistentOrderedMap | - |
| Sorted Set | TreeSet | - | - | PersistentTreeSet |
| Sorted Map | TreeMap | - | - | PersistentTreeMap |
| Bag (unordered, duplicate elements) | - | HashTreePBag | - | - |
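
For context on the missing Clojure removal benchmark, here is a small sketch (not part of the benchmark code) of what Clojure's vector offers when used from Java: a pop() that drops the last element, but no index-based removal.

```java
import clojure.lang.PersistentVector;

public class VectorRemoval {
    public static void main(String[] args) {
        PersistentVector v = PersistentVector.create("a", "b", "c");

        // pop() removes the last element and returns a new vector...
        PersistentVector popped = v.pop(); // ["a" "b"]
        System.out.println(popped);

        // ...but there is no operation to remove the element at an
        // arbitrary index, which is why that benchmark is missing.
    }
}
```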
  • The addition benchmarks fill an empty collection with random strings each iteration

  • The removal benchmarks remove elements from a fresh collection each iteration

  • The lookup benchmarks look up random elements/indices in a collection

Every single benchmark is forked 3 times to account for differences in VM configurations, randomness and other environmental factors.
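
Combined with the iteration counts from the table above, the benchmark configuration could look roughly like the following JMH skeleton. This is an illustrative sketch, not the repository's actual benchmark code (which, for instance, inserts random strings rather than a constant):

```java
import org.openjdk.jmh.annotations.*;

import org.pcollections.PVector;
import org.pcollections.TreePVector;

@Fork(3)                                             // 3 forks per benchmark
@Warmup(iterations = 5, batchSize = 1_000_000)       // 5 × 1M ops warmup
@Measurement(iterations = 10, batchSize = 1_000_000) // 10 × 1M ops measurement
@BenchmarkMode(Mode.SingleShotTime)                  // batch size requires single-shot mode
@State(Scope.Thread)
public class AdditionBenchmark {
    private PVector<String> vector;

    @Setup(Level.Iteration)
    public void setUp() {
        vector = TreePVector.empty(); // fresh, empty collection per iteration
    }

    @Benchmark
    public void addToVector() {
        vector = vector.plus("element"); // runs 1M times per iteration
    }
}
```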

How to run

If you want to run all the benchmarks yourself, do the following:

```sh
$ git clone https://github.com/johnnyjayjay/persistent-data-structures-benchmark
$ cd persistent-data-structures-benchmark
$ ./gradlew jmh
```

Results

Many thanks to those who provided their computing power to run these benchmarks for my initial evaluation.

The above results mostly match up and show similar patterns. Here is a simplified view of PiggyPiglet's results:

Results as a graph

This graph only shows the 4 most common data types in all 4 libraries:

  • List (random access list - included because it is the de facto default data structure for most applications)
  • Stack (aka cons list - included because it is the simplest persistent data structure)
  • Set (unordered set)
  • Map (unordered map)

Lower scores are better, as the score measures execution time.

There are a couple of additional things to note about this data:

  • Kotlin does not provide a persistent stack implementation or anything similar, so there is no data in those areas.
  • For the "List removal" benchmark, only PCollections results are available, because:
    • Clojure does not have index-based removal on its vectors
    • Java is too big of an outlier, taking around 20 seconds to remove 1M elements
    • Kotlin is an even bigger outlier, taking over 200 seconds in this case; in the other runs, it did not even terminate before timing out
  • The black lines on top of the bars show the error of the measurement result at a 99.9% confidence interval
  • While it is hard to make out, there are indeed removal and lookup results for stacks. They just happen to be very efficient in those cases, to the point where the differences between the tested libraries become almost insignificant.

Contributions

Whether you find an issue in the benchmark code, want to improve the default settings or simply want to submit your own benchmark results: Feel free to open an issue or a pull request! The more people work on benchmarks like this, the more useful their results become.
