Now using a LCG defined internally to ensure the different C implementations
on different platforms can refer to the same redis item and agree on what's
in the filter
1 parent b6204af commit d88f6e91428549952e7beca7a52f2015236d8ee3 Dan Lecocq committed Nov 1, 2012
Showing with 29 additions and 4 deletions.
  1. +10 −0 README.md
  2. +18 −3 pyreBloom/bloom.c
  3. +1 −1 setup.py
README.md
@@ -5,6 +5,16 @@ One of Salvatore's suggestions for Redis' GETBIT and SETBIT commands is to
implement bloom filters. There was an existing python project that we used
for inspiration.
+Notice
+======
+__Important__ -- The most recent version uses different seed values from all
+previous releases. Previous releases were using `srand` and `rand`, though they
+are not guaranteed to yield the same values on different systems. For example,
+two clients compiled on different platforms with different C implementations
+may not necessarily agree on what's in the filters. This latest version fixes
+this, but will also be incompatible with filters constructed with any previous
+versions.
+
Installation
============
pyreBloom/bloom.c
@@ -36,10 +36,25 @@ int init_pyrebloom(pyrebloomctxt * ctxt, unsigned char * key, uint32_t capacity,
ctxt->key = key;//(unsigned char *)(malloc(strlen(key)));
//strcpy(ctxt->key, key);
ctxt->seeds = (uint32_t *)(malloc(ctxt->hashes * sizeof(uint32_t)));
- // Generate all the seeds
- srand(1);
+
+ /* The implementation here used to rely on srand(1) and then repeated
+ * calls to rand(), but I no longer trust that to provide correct behavior
+ * when working between different platforms. As such, We'll be using a LCG
+ *
+ * http://en.wikipedia.org/wiki/Linear_congruential_generator
+ *
+ * Hopefully this will be the last of interoperability issues. Note that
+ * updating to this version will unfortunately require rebuilding old
+ * bloom filters.
+ *
+ * Our m is implicitly going to be 2^32 by storing the result into a
+ * uint32_t */
+ uint32_t a = 1664525;
+ uint32_t c = 1013904223;
+ uint32_t x = 314159265;
for (i = 0; i < ctxt->hashes; ++i) {
- ctxt->seeds[i] = (uint32_t)(rand());
+ ctxt->seeds[i] = x;
+ x = a * x + c;
}
// Now for the redis context
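
The seed loop in the hunk above can be tried in isolation. The following is a minimal sketch, not code from pyreBloom: the helper name `lcg_fill_seeds` and the `LCG_*` macros are hypothetical, while the three constants are copied from the diff. It assumes a platform where `uint32_t` arithmetic is carried out as unsigned 32-bit arithmetic, so the result wraps modulo 2^32 and no explicit modulus is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants copied from the diff: multiplier a, increment c, first seed x. */
#define LCG_A  1664525u
#define LCG_C  1013904223u
#define LCG_X0 314159265u

/* Hypothetical helper (not part of pyreBloom): fill `seeds` with `count`
 * successive LCG states. Unsigned 32-bit arithmetic wraps modulo 2^32,
 * so the modulus m = 2^32 is implicit. */
void lcg_fill_seeds(uint32_t *seeds, size_t count)
{
    assert(seeds != NULL || count == 0);

    uint32_t x = LCG_X0;
    for (size_t i = 0; i < count; ++i) {
        seeds[i] = x;                       /* store the current state */
        x = (uint32_t)(LCG_A * x + LCG_C);  /* advance: x = a*x + c mod 2^32 */
    }
}
```

Because the sequence depends only on the three constants and fixed-width unsigned arithmetic, every call produces the same seeds, which is the property that `srand(1)` followed by `rand()` does not guarantee across C libraries.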
setup.py
@@ -19,7 +19,7 @@
setup(
name = 'pyreBloom',
- version = "0.1.2",
+ version = "1.0.0",
author = "Dan Lecocq",
author_email = "dan@seomoz.org",
license = "MIT License",
