Skip to content
master
Go to file
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

README.md

Succinct

Succinct, compact, and compressed data structures for data-intensive applications.

Notable features

  • State of the art broadword select implementation based on Sebastiano Vigna's fastutil library.

  • "Space-Efficient, High-Performance Rank and Select Structures on Uncompressed Bit Sequences" that supports bit arrays with up to 2^64 bits. This is a data structure that endows Python's bitarray data structure with the following operations:

    • rank(i: int) -> int: The number of one bits to the left of, and including, the ith position.

    • rank_zero(i: int) -> int: The number of zero bits to the left of, and including, the ith position.

    • select(bit_rank: int) -> int: The index of the left-most bit in the bitarray whose rank is bit_rank.

    • select_zero(bit_rank_zero: int) -> int: The index of the left-most bit in the bitarray whose rank_zero is bit_rank. The current implementation of select_zero is not as efficient as the implementation of select, as it uses a binary search over rank_zero rather than a sophisticated data structure. This may change in the future.

  • Elias-Fano representation of monotone sequences of natural numbers. Using this encoding, "an element occupies a number of bits bounded by two plus the logarithm of the average gap" (source). This can be an excellent data structure for representing lists of monotonically-increasing natural numbers. Applications include inverted indexes, pointers into massive arrays, etc. See this blog post for more information.

  • "On Compressing Permutations and Adaptive Sorting" by Barbay and Navarro, which can be very useful for maintaining massive permutations in memory while allowing efficient inverse lookups. This data structure is most useful when the permutation consists of many runs of increasing values.

  • LOUDS representation of binary tree topology, using (in theory) slightly more than two bits per tree node while providing efficient tree navigation.

Installation

At the command line:

$ pip install succinct

Alternatively, you can install succinct from source by cloning this repo and running the provided setup.sh script.

Statement of public good

This project is made possible by The Mathematics and Informatics Institute of Ohio. The author gratefully acknowledges Root Insurance Company for providing 12 "hack days" per year for engineers to work on projects such as this one.

About

Succinct, compact, and compressed data structures for data-intensive applications

Resources

License

Releases

No releases published
You can’t perform that action at this time.