Skip to content

brettwooldridge/SparseBitSet

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Files

Permalink
Failed to load latest commit information.

SparseBitSet

TL;DR

Basically, if you need to set a large number of bits, or bits at extremely high offsets, you probably want to use this Sparse BitSet. All other alternatives are essentialy off the table; the Java BitSet class is a non-starter. Performance is superior in almost all cases to the standard Java BitSet.

Preface

You know how the internets are; a link that exists today is gone tomorrow. A while ago I had a need for an efficient Sparse BitSet in Java, and found a presentation and code by Dr. Bruce K. Haddon. Going back later, I found the links I had used to find it were dead. Some internet sleuthing later, I found and contacted Dr. Haddon and he was kind enough to send me the presentation again. I have created this project to capture the code for others, as well as the presentation. I can take credit for neither.

Maven

<dependency>
   <groupId>com.zaxxer</groupId>
   <artifactId>SparseBitSet</artifactId>
   <version>1.2</version>
   <scope>compile</scope>
</dependency>

The Problem and Alternatives

The standard Java BitSet is terribly memory inefficient. To store a single bit using BitSet at bit 232-1 takes 227 32-bit words (226 64bit “words”), not counting any Java object overhead.

Using a HashSet of Integers results in (for each bit), 7 32-bit words overhead, or for 64 bits ~448 32-bit words overhead.

Using a HashMap, where the key = bitvalue / 64, and the value is a Long of packed bits, results in (for 64 bits) ~8 32-bit words overhead.

Using a custom hash table, where the key is an int = bitvalue / 64, and the value is a packed long, results in (for 64 bits) ~4 32-bit words overhead.

The Solution: SparseBitSet

Using a virtual-memory like structure, the SparseBitSet overhead is ~0.03 32-bit words overhead per 64 bits.

For a full analysis, read Dr. Haddon's slide stack.

About

An efficient sparse bit set implementation for Java

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages