bloom filter for each 2k of disk seems inefficient #19

Closed
matthewvon opened this Issue Jul 13, 2012 · 1 comment

matthewvon (Contributor) commented Jul 13, 2012

The Google-provided bloom filter code segregates a file into 2K chunks and creates a bloom filter for each chunk. This 2K size is hard coded and completely ignores block_size. The code also emits placeholder objects for every 2K region a given key/value record spans when the record covers more than one 2K region of disk. Again, this seems inefficient.
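For context, here is a minimal, self-contained sketch of the pattern being described, assuming it mirrors Google's filter_block logic. The class and member names (`SketchFilterBuilder`, `StartBlock`, `GenerateFilter`, `filter_offsets_`) are illustrative, not the actual source; the hard-coded 2K granularity (`1 << 11` bytes, independent of block_size) is the behavior at issue:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Sketch only (not the LevelDB source): a filter slot exists for every
// 2KB of file offset, regardless of the configured block_size, and empty
// "placeholder" filters are emitted whenever a single record pushes the
// offset across several 2KB boundaries.
static const size_t kFilterBaseLg = 11;
static const size_t kFilterBase = 1 << kFilterBaseLg;  // 2048 bytes, hard coded

class SketchFilterBuilder {
 public:
  // Called as the table builder starts each data block at block_offset.
  void StartBlock(uint64_t block_offset) {
    uint64_t filter_index = block_offset / kFilterBase;
    assert(filter_index >= filter_offsets_.size());
    // One GenerateFilter() call per 2KB boundary crossed; if no keys were
    // added since the last call, the result is an empty placeholder.
    while (filter_index > filter_offsets_.size()) {
      GenerateFilter();
    }
  }

  void AddKey() { ++pending_keys_; }

  size_t NumFilters() const { return filter_offsets_.size(); }

 private:
  void GenerateFilter() {
    // Real code would build a bloom filter over the pending keys; here we
    // just record how many keys the slot covered (0 == placeholder).
    filter_offsets_.push_back(pending_keys_);
    pending_keys_ = 0;
  }

  std::vector<size_t> filter_offsets_;
  size_t pending_keys_ = 0;
};

int main() {
  SketchFilterBuilder b;
  b.StartBlock(0);
  b.AddKey();
  // A single ~100KB value jumps the offset past ~50 of the 2KB boundaries,
  // producing ~50 filter slots, all but one of them empty placeholders.
  b.StartBlock(100 * 1024);
  std::printf("filters emitted: %zu\n", b.NumFilters());
  return 0;
}
```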


matthewvon (Contributor) commented Aug 14, 2013

Addressed in the bloom2 code. The original behavior is left unchanged in Google's bloom code.

matthewvon closed this Aug 14, 2013
