Tr13 is a library for constructing and using read-only compact (memory-efficient) in-memory trie data structures, using raw (byte-sequences,
byte) keys and values. Raw values can be automatically
converted to/from basic JDK types such as
tries are "raw", in that they are stored as raw byte sequences (
ByteBuffer, off-heap or in-heap, or
byte), meaning that garbage-collection overhead should be minimal as the main data area is either off-heap (for native or mmap'ed
ByteBuffers), or a single
Main benefits for using such tries is both relatively low memory usage (typically 20 - 35% less than raw input size) and low GC impact.
Mailing list: http://groups.google.com/group/ning-tr13-users
Tries need to be built in lexicographic order, so pre-sorting may be needed. For that, you may want to check out Java Merge-sort project. Key and value types
Building is done in two steps:
- Construct raw trie structure in streaming fashion, either into a memory buffer (like
ByteArrayOutputStream), or a File.
- Construct actual
triefrom serialized raw sequence (
Two-phase processing is needed since the actual result size is not known in advance.
This tr13 from LevelDB gist shows simple usage.
Check out [https://github.com/ning/tr13/wiki]