Skip to content
master
Go to file
Code

Latest commit

Currently it sets the length to the position of the needle in the first
word it looks up and only searches indices below that. This is incorrect
after the initial word where we need to search from the top.
cc7b16f

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
src
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

README.md

SparseBitSet

TL;DR

Basically, if you need to set a large number of bits, or bits at extremely high offsets, you probably want to use this Sparse BitSet. All other alternatives are essentialy off the table; the Java BitSet class is a non-starter. Performance is superior in almost all cases to the standard Java BitSet.

Preface

You know how the internets are; a link that exists today is gone tomorrow. A while ago I had a need for an efficient Sparse BitSet in Java, and found a presentation and code by Dr. Bruce K. Haddon. Going back later, I found the links I had used to find it were dead. Some internet sleuthing later, I found and contacted Dr. Haddon and he was kind enough to send me the presentation again. I have created this project to capture the code for others, as well as the presentation. I can take credit for neither.

Maven

<dependency>
   <groupId>com.zaxxer</groupId>
   <artifactId>SparseBitSet</artifactId>
   <version>1.1</version>
   <scope>compile</scope>
</dependency>

The Problem and Alternatives

The standard Java BitSet is terribly memory inefficient. To store a single bit using BitSet at bit 232-1 takes 227 32-bit words (226 64bit “words”), not counting any Java object overhead.

Using a HashSet of Integers results in (for each bit), 7 32-bit words overhead, or for 64 bits ~448 32-bit words overhead.

Using a HashMap, where the key = bitvalue / 64, and the value is a Long of packed bits, results in (for 64 bits) ~8 32-bit words overhead.

Using a custom hash table, where the key is an int = bitvalue / 64, and the value is a packed long, results in (for 64 bits) ~4 32-bit words overhead.

The Solution: SparseBitSet

Using a virtual-memory like structure, the SparseBitSet overhead is ~0.03 32-bit words overhead per 64 bits.

For a full analysis, read Dr. Haddon's slide stack.

About

An efficient sparse bit set implementation for Java

Resources

License

Packages

No packages published

Languages

You can’t perform that action at this time.