Skip to content
a sha1-based bloom filter for go
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
.gitignore
.travis.yml
LICENSE
README.md
bloomfilter.go
bloomfilter_test.go
words.txt

README.md

License BSD Go Report Card GoDoc Build Status

bloomfilter

This package implements a bloom filter in Go.

http://en.wikipedia.org/wiki/Bloom_filter

provides an explanation of what a bloom filter is. Essentially it is a probabilistic memebership function with good size characteristics. For example, we may wish to read in the words from the dictionary file into the filter. Over 99% of the words can be entered into the filter before a collision occurs.

The approach in this package for hashing items into the filter is to obtain the 160 bit SHA1 hash of the original input item, which should give a good distribution. Then, this 160 bit value is decomposed into five 32-bit integers which are then used as modulo (wrapping) offsets into a BitSet (the storage mechanism used by this package). The bits at those offsets are set to true (1).

Should an input token ever hash to five locations that are already set in the BitSet, it will be considered a collision (false positive). Experiment with size settings that minimize collisions.

The size argument provided to the constructor is a desired size in bits for the BitSet used by the bloom filter as a storage mechanism. This value is rounded up to a byte boundary.

You can’t perform that action at this time.