bloom filter for each 2k of disk seems inefficient #19

matthewvon opened this Issue Jul 13, 2012 · 1 comment

The Google-provided bloom filter code divides a file into 2K chunks and creates a bloom filter for each chunk. The 2K size is hard-coded and completely ignores block_size. The code also creates 2K placeholder objects whenever a single key/value record spans more than one 2K region of disk. Again, this seems inefficient.
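
To make the overhead concrete, here is a minimal sketch of how filter slots get allocated per 2K of file offset. It is a simplified model based on a reading of the filter block builder logic, not the actual C++ code; the names `FILTER_BASE` and `simulate_filter_slots` are hypothetical and exist only for this illustration.

```python
# Simplified model (an assumption, not the real implementation):
# one filter slot per 2K of file offset, regardless of block_size.
FILTER_BASE_LG = 11
FILTER_BASE = 1 << FILTER_BASE_LG  # 2048 bytes, hard-coded


def simulate_filter_slots(block_offsets):
    """Given the starting file offset of each data block (ascending),
    return one bool per filter slot: True if the slot holds keys,
    False if it is an empty placeholder."""
    slots = []
    pending = False  # True when keys are waiting to be put in a filter
    for offset in block_offsets:
        index = offset // FILTER_BASE
        # Emit a slot for every 2K region before this block's region.
        while index > len(slots):
            slots.append(pending)
            pending = False
        pending = True  # keys of the block starting at `offset`
    if pending:
        slots.append(True)  # final flush for the last block's keys
    return slots


if __name__ == "__main__":
    # Three 4096-byte blocks: every other slot is a placeholder.
    print(simulate_filter_slots([0, 4096, 8192]))
    # One 10000-byte record followed by another block.
    print(simulate_filter_slots([0, 10000]))
```

In this model, a 4096-byte block_size yields one placeholder slot for every real filter, and a single large record produces a run of placeholders, which is the inefficiency described above.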


Addressed in the bloom2 code. The behavior is left as-is in Google's original bloom code.

@matthewvon matthewvon closed this Aug 14, 2013