Skip to content

pkriete/Ploomy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Ploomy

Ploomy implements a very simple python bloom filter.

What is a bloom filter?

A bloom filter is a fast and memory efficient way to check if an element has not been added to it (the reverse is not true). This makes it a good choice as a front to data store or when you need unique random data.

It works by hashing your input multiple times using a hashing algorithm that returns a number between 0 and n. These numbers represent indexes in a bit array. When an element is added to the filter, its hash indexes are set to 1. When you query for an element it is once again hashed and the bloom filter will return false iff any of the buckets are not 1.

More info ...

Usage

Any hash function has collisions, so correctly sizing your bloom filter is important to get the best possible performance from it. You must specify both the number of buckets and the amount of hashing that should be done.

buckets = 500
hashes = 5
bfilter = Bloom(buckets, hashes)

With that in place, you can add data and check if it exists:

bfilter.add('pony')
'pony' in bfilter # true
'frog' in bfilter # false

About

Python Bloom Filter

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages