Skip to content
Go library implementing xor filters
Go
Branch: master
Clone or download
lemire Merge pull request #17 from el10savio/xorfilter-codebase
Refactored xorfilter.go codebase
Latest commit fef157c Jan 25, 2020
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
figures
.drone.yml
.gitignore Initial commit Dec 16, 2019
LICENSE
README.md
fusefilter.go Made more moular Jan 19, 2020
fusefilter_test.go Made more moular Jan 19, 2020
go.mod
go.sum
xorfilter.go
xorfilter_definitions.go
xorfilter_test.go Simplified make() for tests Jan 25, 2020

README.md

xorfilter: Go library implementing xor filters

GoDoc Build Status

Bloom filters are used to quickly check whether an element is part of a set. Xor filters are a faster and more concise alternative to Bloom filters. They are also smaller than cuckoo filters.

Reference: Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters, Journal of Experimental Algorithmics (to appear).

We are assuming that your set is made of 64-bit integers. If you have strings or other data structures, you need to hash them first to a 64-bit integer. It is not important to have a good hash function, but collision should be unlikely (~1/2^64).

The current implementation has a false positive rate of about 0.3% and a memory usage of less than 9 bits per entry for sizeable sets.

You construct the filter as follows starting from a slice of 64-bit integers:

filter,_ := xorfilter.Populate(keys) // keys is of type []uint64

It returns an object of type Xor8. The 64-bit integers would typically be hash values of your objects.

You can then query it as follows:

filter.Contains(v) // v is of type uint64

It will always return true if v was part of the initial construction (Populate) and almost always return false otherwise.

An xor filter is immutable, it is concurrent. The expectation is that you build it once and use it many times.

Though the filter itself does not use much memory, the construction of the filter needs many bytes of memory per set entry.

For persistence, you only need to serialize the following data structure:

type Xor8 struct {
	Seed         uint64
	BlockLength  uint32
	Fingerprints []uint8
}

Duplicate keys

When constructing the filter, you should ensure that there is no duplicate keys. If you think that this might happen, then you should check the error condition.

filter,err := xorfilter.Populate(keys) // keys is of type []uint64
if err != nil {
	// you have duplicate keys, de-duplicate them?
}

Effectively, an error is returned when the filter could not be build after MaxIterations iterations (default to 100).

Implementations of xor filters in other programming languages

You can’t perform that action at this time.