@ayazhafiz
Contributor

This commit adds a `Fuse8` filter, a xor filter with 8-bit fingerprints
that uses the [fuse graph](https://arxiv.org/abs/1907.04749) data structure
to achieve better space ratios than the `Xor8` filter (and is probably
faster to populate). The implementation is very similar to the C one,
which was added in FastFilter/xor_singleheader@4392b5e.
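
For readers unfamiliar with xor filters, here is a minimal sketch of the membership check they share: a key is reported as present when its fingerprint equals the XOR of three table slots. This is not the code from this PR. `ToyXor8`, `mix`, `slots`, and `store_single` are hypothetical names, the hash is a stand-in, and the layout is a plain three-block table rather than the fuse graph layout. A real filter also needs a construction step that solves for all keys at once, which this toy skips by storing a single key.

```rust
/// Toy xor-filter table with 8-bit fingerprints, split into three equal blocks.
/// Illustrative only; not the layout or API added by this PR.
struct ToyXor8 {
    block_len: usize,
    fingerprints: Vec<u8>,
}

/// Stand-in 64-bit mixing function (the MurmurHash3 finalizer).
fn mix(mut h: u64) -> u64 {
    h ^= h >> 33;
    h = h.wrapping_mul(0xff51_afd7_ed55_8ccd);
    h ^= h >> 33;
    h = h.wrapping_mul(0xc4ce_b9fe_1a85_ec53);
    h ^= h >> 33;
    h
}

impl ToyXor8 {
    /// Creates an all-zero table; `block_len` must be non-zero.
    fn new(block_len: usize) -> Self {
        ToyXor8 { block_len, fingerprints: vec![0; 3 * block_len] }
    }

    /// Picks one slot in each block, so the three slots are always distinct.
    fn slots(&self, hash: u64) -> [usize; 3] {
        let len = self.block_len as u64;
        [
            (hash % len) as usize,
            self.block_len + (hash.rotate_left(21) % len) as usize,
            2 * self.block_len + (hash.rotate_left(42) % len) as usize,
        ]
    }

    /// Folds the 64-bit hash down to an 8-bit fingerprint.
    fn fingerprint(hash: u64) -> u8 {
        (hash ^ (hash >> 32)) as u8
    }

    /// Stores one key by fixing up its first slot so the three slots XOR to
    /// the key's fingerprint. Only valid for a single key in an empty table.
    fn store_single(&mut self, key: u64) {
        let hash = mix(key);
        let [s0, s1, s2] = self.slots(hash);
        self.fingerprints[s0] =
            Self::fingerprint(hash) ^ self.fingerprints[s1] ^ self.fingerprints[s2];
    }

    /// Membership check: fingerprint must equal the XOR of the three slots.
    fn contains(&self, key: u64) -> bool {
        let hash = mix(key);
        let [s0, s1, s2] = self.slots(hash);
        Self::fingerprint(hash)
            == (self.fingerprints[s0] ^ self.fingerprints[s1] ^ self.fingerprints[s2])
    }
}

fn main() {
    let mut table = ToyXor8::new(16);
    table.store_single(42);
    // A stored key is always found; other keys may still match by chance.
    println!("contains(42) = {}", table.contains(42));
}
```

The check reads exactly three slots regardless of how many keys the filter holds. Keys that were never stored can still match by chance, which is where the false positives come from.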

`Fuse8` filters require a large number of keys (> 100_000), but have the
same false-positive rate as `Xor8` filters (< 0.4%) while using about 8%
less space (~9.1 bits/entry as compared to `Xor8`'s ~9.9 bits/entry).
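
To put the bits/entry figures in concrete terms, here is a rough back-of-the-envelope sketch. It only multiplies out the approximate numbers quoted above (~9.1 and ~9.9 bits/entry); it does not measure either filter. `estimated_bytes` is a hypothetical helper, and bits/entry is passed in tenths to keep the arithmetic in integers.

```rust
/// Estimated filter size in bytes for `num_keys` entries, rounding up.
/// `tenths_of_bits_per_entry` is the bits/entry figure times ten (9.1 -> 91).
fn estimated_bytes(num_keys: u64, tenths_of_bits_per_entry: u64) -> u64 {
    let tenths_of_bits = num_keys * tenths_of_bits_per_entry;
    // 80 tenths of a bit make one byte; add 79 to round up.
    (tenths_of_bits + 79) / 80
}

fn main() {
    let num_keys: u64 = 1_000_000;
    let fuse8 = estimated_bytes(num_keys, 91); // ~9.1 bits/entry
    let xor8 = estimated_bytes(num_keys, 99); // ~9.9 bits/entry
    println!(
        "Fuse8: ~{} bytes, Xor8: ~{} bytes, saved: ~{} bytes",
        fuse8,
        xor8,
        xor8 - fuse8
    );
}
```

For one million keys this works out to roughly 1.14 MB versus 1.24 MB, a saving of about 100 KB.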

16-bit fingerprint versions of the xor filter using fuse graphs should
be trivial to implement.
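
As a rough guide to what a wider fingerprint buys, the sketch below computes the expected false-positive rate from the fingerprint width. It assumes the standard model in which a key outside the set matches with probability 2^-bits, i.e. fingerprints behave like uniformly random values. `false_positive_rate` is a hypothetical helper, not part of this PR.

```rust
/// Expected false-positive rate for a filter with `fingerprint_bits`-bit
/// fingerprints, assuming fingerprints are uniformly random: 2^-bits.
/// `fingerprint_bits` must be less than 64.
fn false_positive_rate(fingerprint_bits: u32) -> f64 {
    1.0 / (1u64 << fingerprint_bits) as f64
}

fn main() {
    for bits in [8u32, 16] {
        println!(
            "{}-bit fingerprints: ~{:.4}% false positives",
            bits,
            false_positive_rate(bits) * 100.0
        );
    }
}
```

With 8 bits the rate is 1/256, about 0.39%, which matches the "< 0.4%" figure above. With 16 bits it drops to 1/65536.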

Closes FastFilter#8
ayazhafiz added a commit to ayazhafiz/xorf that referenced this pull request Dec 27, 2019
This commit adds `Fuse8` and `Fuse16` filters, xor filters with 8-bit
and 16-bit fingerprints that use the fuse graph data structure
to achieve better space ratios than the `Xor8`/`Xor16` filters (and are
faster to populate). The implementation is very similar to the Go one,
located at FastFilter/xorfilter#13.

Fuse filters require a large number of keys (> 100_000), but have the
same false positive percentage as Xor filters while using much
less space (`Fuse8` uses ~9.1 bits/entry as compared to `Xor8`'s ~9.9
bits/entry).

This commit also refactors the BTS/macro implementations of the filters
to be contained in a new `prelude` module with submodules rather than
all in a single file. This allows pseudo-namespacing of filter
implementation macros, like `xor_from_impl`.
@lemire
Member

lemire commented Jan 4, 2020

I am not forgetting about this. On my todo. It looks exciting.

@lemire
Member

lemire commented Jan 6, 2020

Merging.

@lemire lemire merged commit 1b7f8be into FastFilter:master Jan 6, 2020
@lemire
Member

lemire commented Jan 6, 2020

Looks great!

@ayazhafiz ayazhafiz deleted the e/fuse-filter branch January 7, 2020 01:41
Development

Successfully merging this pull request may close these issues.

Implement fuse approach for better compression
