The Capsule Hash Trie Collections Library
Capsule aims to become a full-fledged (immutable) collections library for Java 8+ that is built solely around persistent tries. The library is designed both for standalone use and for embedding in domain-specific languages. Capsule still needs some incubation before it can ship as a well-rounded collections library. Nevertheless, the code is stable and performance is solid. Feel free to use it and let us know about your experiences!
More extensive tests and performance benchmarks will be added soon. The preliminary API for the immutable interfaces will be reworked as soon as possible as well.
Binary builds of capsule are deployed in the usethesource repository. If you use Maven for dependency management, you have to add another repository location to your pom.xml file:
<repositories>
  <repository>
    <id>usethesource</id>
    <url>http://nexus.usethesource.io/content/repositories/public/</url>
  </repository>
</repositories>
Furthermore, you have to declare capsule as a dependency. To obtain the latest stable version for Java 8+, insert the following snippet in your pom.xml file:
<dependency>
  <groupId>io.usethesource</groupId>
  <artifactId>capsule</artifactId>
  <version>0.6.1</version>
</dependency>
To obtain the latest stable backport for Java 7 (which lags behind the Java 8+ release), insert the following snippet in your pom.xml file:
<dependency>
  <groupId>io.usethesource</groupId>
  <artifactId>capsule-jdk7</artifactId>
  <version>0.2.1</version>
</dependency>
Snippets for other build tools and dependency management systems may vary slightly.
Background: Efficient Immutable Data Structures on the JVM
The standard libraries of recent Java Virtual Machine languages, such as Clojure or Scala, contain scalable and well-performing immutable collection data structures that are implemented as Hash-Array Mapped Tries (HAMTs). HAMTs already feature efficient lookup, insert, and delete operations. However, due to their tree-based nature, their memory footprint and the runtime performance of iteration and equality checking lag behind those of array-based counterparts.
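To make the HAMT idea more concrete, the following is a minimal sketch of how a trie node can locate a branch in a compact array using a 32-bit bitmap and a population count. It assumes 32-way branching with 5 hash bits per level; the class and method names (`HamtIndexSketch`, `fragment`, `bitpos`, `index`) are hypothetical and chosen for illustration only. This is not Capsule's actual code.

```java
// Illustrative sketch only; names are hypothetical and not part of Capsule's API.
final class HamtIndexSketch {

    static final int BITS_PER_LEVEL = 5;                     // assumed 32-way branching
    static final int LEVEL_MASK = (1 << BITS_PER_LEVEL) - 1; // 0b11111

    /** Extracts the 5-bit hash fragment that selects a branch at the given depth (0..6). */
    static int fragment(int hash, int depth) {
        return (hash >>> (depth * BITS_PER_LEVEL)) & LEVEL_MASK;
    }

    /** Single-bit mask that marks the branch for a fragment inside the node's bitmap. */
    static int bitpos(int fragment) {
        return 1 << fragment;
    }

    /** True if the branch marked by bitpos is occupied in this node. */
    static boolean isPresent(int bitmap, int bitpos) {
        return (bitmap & bitpos) != 0;
    }

    /** Position of the branch in the compact array: the number of occupied branches below it. */
    static int index(int bitmap, int bitpos) {
        return Integer.bitCount(bitmap & (bitpos - 1));
    }

    public static void main(String[] args) {
        // A node with branches 1, 4 and 9 occupied stores only three array slots.
        int bitmap = bitpos(1) | bitpos(4) | bitpos(9);

        System.out.println(isPresent(bitmap, bitpos(4))); // true
        System.out.println(index(bitmap, bitpos(4)));     // 1
        System.out.println(index(bitmap, bitpos(9)));     // 2
        System.out.println(isPresent(bitmap, bitpos(5))); // false
    }
}
```

Because only occupied branches consume array slots, sparse nodes stay small, but every lookup step still follows a pointer into another node, which is where the cost relative to flat arrays comes from.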
We introduce CHAMP (Compressed Hash-Array Mapped Prefix-tree), an evolutionary improvement over HAMTs. The new design increases the overall performance of immutable sets and maps. Furthermore, the resulting general-purpose design increases cache locality and features a canonical representation.
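One way to picture the cache-locality improvement is the node layout described in the OOPSLA 2015 paper listed under the references: each node keeps two bitmaps, one for inline data and one for sub-nodes, so that data entries sit contiguously at the front of the node's array and sub-nodes at the back. The following lookup-only sketch of a set node assumes that layout. The class name `ChampNodeSketch` and its fields are hypothetical, hash-collision handling and updates are omitted, and hashes are passed in explicitly to keep the example deterministic. It is not Capsule's actual implementation.

```java
// Illustrative, lookup-only sketch of a CHAMP-style set node; not Capsule's code.
final class ChampNodeSketch {

    final int dataMap;      // bit set => a key is stored inline for that hash fragment
    final int nodeMap;      // bit set => a sub-node exists for that hash fragment
    final Object[] content; // keys at the front, sub-nodes at the back in reverse order

    ChampNodeSketch(int dataMap, int nodeMap, Object... content) {
        this.dataMap = dataMap;
        this.nodeMap = nodeMap;
        this.content = content;
    }

    /** Single-bit mask for the 5-bit hash fragment at the given shift (0, 5, 10, ...). */
    static int bitpos(int hash, int shift) {
        return 1 << ((hash >>> shift) & 0x1F);
    }

    /** Looks up a key; the caller supplies the key's hash and starts with shift 0. */
    boolean contains(Object key, int hash, int shift) {
        int bit = bitpos(hash, shift);
        if ((dataMap & bit) != 0) {
            // Inline keys are indexed from the front of the array.
            int i = Integer.bitCount(dataMap & (bit - 1));
            return content[i].equals(key);
        }
        if ((nodeMap & bit) != 0) {
            // Sub-nodes are indexed from the back of the array.
            int i = content.length - 1 - Integer.bitCount(nodeMap & (bit - 1));
            return ((ChampNodeSketch) content[i]).contains(key, hash, shift + 5);
        }
        return false; // neither bitmap has the bit: the key is absent
    }

    public static void main(String[] args) {
        // Sub-node holding "c" (hash 71) and "d" (hash 103); both share fragment 7 at depth 0.
        ChampNodeSketch child = new ChampNodeSketch((1 << 2) | (1 << 3), 0, "c", "d");
        // Root holding "a" (hash 1) and "b" (hash 9) inline, plus the sub-node for fragment 7.
        ChampNodeSketch root = new ChampNodeSketch((1 << 1) | (1 << 9), 1 << 7, "a", "b", child);

        System.out.println(root.contains("a", 1, 0));  // true
        System.out.println(root.contains("c", 71, 0)); // true
        System.out.println(root.contains("x", 5, 0));  // false
    }
}
```

Keeping inline data together at the front means iteration can scan a node's keys without skipping over sub-node references, which is one plausible reading of the cache-locality claim above.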
References and Further Readings
- JVM Language Summit 2016 - Efficient and Expressive Immutable Collections (Speaker: Michael Steindorfer)
- JVM Language Summit 2017 - Lightweight Relations (Speaker: Michael Steindorfer)
- Clojure/west 2016 - Hash Maps: More Room on the Bottom (Speaker: Peter Schuck)
- PhD Thesis: Efficient Immutable Collections (2017)
- Paper: Optimizing Hash-Array Mapped Tries for Fast and Lean Immutable JVM Collections (OOPSLA 2015)
- Paper: Fast and Lean Immutable Multi-Maps on the JVM based on Heterogeneous Hash-Array Mapped Tries (Draft, 2016)
- Paper: Towards a Software Product Line of Trie-Based Collections (Short Paper, GPCE 2016)
- Paper: Code Specialization for Memory Efficient Hash Tries (Short Paper, GPCE 2014)