Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bloom_get_indexes function #27

Open
zane-a-karl opened this issue Sep 22, 2023 · 0 comments
Open

Add bloom_get_indexes function #27

zane-a-karl opened this issue Sep 22, 2023 · 0 comments

Comments

@zane-a-karl
Copy link

zane-a-karl commented Sep 22, 2023

This simple yet wonderful library is great! :) Thanks for all the hard work. I've been using it lately to implement some cryptographic protocols that use bloom filters as building blocks and having a function which returns the indexes output by each hash function given an entry would be enormously convenient.

This function would check whether a given element exists (probabilistically) in the bloom filter, and then hash it with each of the "bloom->hashes"-many hash functions associated with the bloom filter. Finally, it would store the output index (digest) in an output argument array.

Here's what I've been using to accomplish this task so far:

void bloom_get_indexes(unsigned long * indexes,
		       void * element, unsigned long len, bloom_filter_t * bloom)
{
    if (1 == bloom_check(bloom, element, len)) {
	unsigned int a = murmurhash2(element, len, 0x9747b28c); // Magic number from bloom_check_add()
	unsigned int b = murmurhash2(element, len, a);

	for (unsigned long i = 0; i < bloom->hashes; i++) {
	    indexes[i] = ((a + b*i) % bloom->bits) >> 3;
	}
    }
}

Here's what I would expect the function prototype to look like matching your header file's style.

/** ***************************************************************************
 * Retrieve and store the indexes of a given bloom filter element.
 *
 * Upon return, the indexes holds the output of the hash function
 * applied to the given element.
 *
 * Parameters:
 * -----------
 *     indexes - Pointer to an allocated array of length bloom->hashes.
 *     buffer  - Pointer to buffer containing element.
 *     len     - Size of 'buffer'.
 *     bloom   - Pointer to an allocated struct bloom (see above).
 *
 * Return: none
 *
 */
void bloom_get_indexes(unsigned long * indexes,
		       void * element, unsigned long len, bloom_filter_t * bloom);

If this isn't something you feel fits with the minimalism of the library I totally understand. Also, if I made some kind of logic error in the implementation I would love to know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant