[POC] Rust based Cuckoo Filter for m_addr_known #21837

Closed
fanquake wants to merge 2 commits


fanquake
Member

@fanquake fanquake commented May 3, 2021

This proof of concept PR (leveraging work done by Cory Fields & Jeremy Rubin, see #16834) replaces the rolling bloom filter used for m_addr_known with a Cuckoo Filter written in Rust. I have made some minor build-related adjustments to the Rust code, which can be seen here: https://github.com/fanquake/rust-cuckoofilter/tree/cabi_build_adjustments.

See "Cuckoo Filter: Practically Better Than Bloom":

In many networking systems, Bloom filters are used for high-speed set membership tests. They permit a small fraction of false positive answers with very good space efficiency. However, they do not permit deletion of items from the set, and previous attempts to extend “standard” Bloom filters to support deletion all degrade either space or performance.

We propose a new data structure called the cuckoo filter that can replace Bloom filters for approximate set membership tests. Cuckoo filters support adding and removing items dynamically while achieving even higher performance than Bloom filters.

For applications that store many items and target moderately low false positive rates, cuckoo filters have lower space overhead than space-optimized Bloom filters. Our experimental results also show that cuckoo filters out-perform previous data structures that extend Bloom filters to support deletions substantially in both time and space.
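To make the quoted description concrete, here is a minimal sketch of a cuckoo filter with insert, lookup and delete. It is an illustration only: `TinyCuckoo`, the 8-bit fingerprint, the bucket size and the kick limit are assumptions chosen for brevity, and this is not the API or the implementation of rust-cuckoofilter.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const BUCKET_SIZE: usize = 4; // fingerprints per bucket (assumed)
const MAX_KICKS: usize = 500; // relocation attempts before giving up (assumed)

/// Hypothetical minimal cuckoo filter, for illustration only.
pub struct TinyCuckoo {
    buckets: Vec<[u8; BUCKET_SIZE]>, // a fingerprint of 0 marks an empty slot
    kick_state: u64,                 // xorshift state used to pick eviction victims
}

fn hash64<T: Hash + ?Sized>(item: &T) -> u64 {
    let mut h = DefaultHasher::new();
    item.hash(&mut h);
    h.finish()
}

impl TinyCuckoo {
    /// `num_buckets` must be a power of two so that the XOR in `alt_index` is its own inverse.
    pub fn new(num_buckets: usize) -> Self {
        assert!(num_buckets.is_power_of_two());
        TinyCuckoo { buckets: vec![[0u8; BUCKET_SIZE]; num_buckets], kick_state: 0x9E37_79B9_7F4A_7C15 }
    }

    /// Returns the item's fingerprint (never 0) and its first candidate bucket.
    fn fingerprint_and_index<T: Hash + ?Sized>(&self, item: &T) -> (u8, usize) {
        let h = hash64(item);
        let fp = ((h >> 56) as u8).max(1); // 0 is reserved for "empty", so remap it to 1
        (fp, (h as usize) & (self.buckets.len() - 1))
    }

    /// Partial-key cuckoo hashing: the other bucket depends only on (bucket, fingerprint).
    fn alt_index(&self, i: usize, fp: u8) -> usize {
        (i ^ (hash64(&fp) as usize)) & (self.buckets.len() - 1)
    }

    fn try_place(&mut self, i: usize, fp: u8) -> bool {
        for slot in self.buckets[i].iter_mut() {
            if *slot == 0 {
                *slot = fp;
                return true;
            }
        }
        false
    }

    fn next_random(&mut self) -> usize {
        self.kick_state ^= self.kick_state << 13;
        self.kick_state ^= self.kick_state >> 7;
        self.kick_state ^= self.kick_state << 17;
        self.kick_state as usize
    }

    /// Returns false if no free slot was found within MAX_KICKS relocations.
    pub fn insert<T: Hash + ?Sized>(&mut self, item: &T) -> bool {
        let (mut fp, i1) = self.fingerprint_and_index(item);
        let i2 = self.alt_index(i1, fp);
        if self.try_place(i1, fp) || self.try_place(i2, fp) {
            return true;
        }
        let mut i = i2;
        for _ in 0..MAX_KICKS {
            // Evict a pseudo-random resident fingerprint and move it to its other bucket.
            let slot = self.next_random() % BUCKET_SIZE;
            std::mem::swap(&mut fp, &mut self.buckets[i][slot]);
            i = self.alt_index(i, fp);
            if self.try_place(i, fp) {
                return true;
            }
        }
        false // in this sketch, the fingerprint still "in hand" is silently lost
    }

    pub fn contains<T: Hash + ?Sized>(&self, item: &T) -> bool {
        let (fp, i1) = self.fingerprint_and_index(item);
        let i2 = self.alt_index(i1, fp);
        self.buckets[i1].contains(&fp) || self.buckets[i2].contains(&fp)
    }

    /// Deletion clears one matching fingerprint; only call it for items that were inserted.
    pub fn remove<T: Hash + ?Sized>(&mut self, item: &T) -> bool {
        let (fp, i1) = self.fingerprint_and_index(item);
        let i2 = self.alt_index(i1, fp);
        for &i in [i1, i2].iter() {
            if let Some(slot) = self.buckets[i].iter_mut().find(|s| **s == fp) {
                *slot = 0;
                return true;
            }
        }
        false
    }
}
```

Because only a short fingerprint is stored, `contains` can return false positives, and deleting an item that was never inserted may remove another item's fingerprint.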

Using this is a matter of:

./autogen.sh
./configure --enable-experimental-rust
make
make -C src rusty-check
src/bitcoind 

Note that sometimes compilation will finish (e.g. due to ccache) before cargo has finished generating the header and Rust lib, which will result in a compile error:

./net.h:43:10: fatal error: rusty/out/rcf_cuckoofilter.h: No such file or directory
   43 | #include <rusty/out/rcf_cuckoofilter.h>

In this case you can just re-run make. This has been tested on macOS and Linux. Sometimes p2p_getaddr_caching.py fails, because the number of records returned falls a few short of MAX_ADDR_TO_SEND. This needs investigating.

I'm not suggesting that this be merged as-is, or that this is the ideal way of integrating Rust code (i.e. copying sources in tree, using cbindgen) into Bitcoin Core. What I am suggesting is that the Rust discussion should continue, particularly in regards to integrations that can be done in a very non-invasive / modular fashion.
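For readers unfamiliar with this style of integration, the following is a rough sketch of the kind of C ABI surface that a tool such as cbindgen can turn into a header for C++ callers. Every name here (`DemoFilter`, `demo_filter_new`, and so on) is hypothetical and is not taken from this PR or from rust-cuckoofilter, and a plain `HashSet` stands in for the real filter. Note that Rust edition 2024 spells the attribute `#[unsafe(no_mangle)]`.

```rust
use std::collections::HashSet;
use std::slice;

/// Opaque to C callers: they only ever hold a pointer to it.
/// (Hypothetical type; a HashSet stands in for a real probabilistic filter.)
pub struct DemoFilter {
    items: HashSet<Vec<u8>>,
}

/// Allocates a filter on the Rust heap and hands ownership to the caller.
#[no_mangle]
pub extern "C" fn demo_filter_new() -> *mut DemoFilter {
    Box::into_raw(Box::new(DemoFilter { items: HashSet::new() }))
}

/// Safety: `filter` must come from `demo_filter_new` and not have been freed;
/// `data` must point to `len` readable bytes.
/// Returns true if the element was not present before.
#[no_mangle]
pub unsafe extern "C" fn demo_filter_add(filter: *mut DemoFilter, data: *const u8, len: usize) -> bool {
    if filter.is_null() || data.is_null() {
        return false;
    }
    let f = unsafe { &mut *filter };
    let bytes = unsafe { slice::from_raw_parts(data, len) };
    f.items.insert(bytes.to_vec())
}

/// Safety: same requirements as `demo_filter_add`.
#[no_mangle]
pub unsafe extern "C" fn demo_filter_contains(filter: *const DemoFilter, data: *const u8, len: usize) -> bool {
    if filter.is_null() || data.is_null() {
        return false;
    }
    let f = unsafe { &*filter };
    let bytes = unsafe { slice::from_raw_parts(data, len) };
    f.items.contains(bytes)
}

/// Safety: `filter` must come from `demo_filter_new` and must not be used afterwards.
#[no_mangle]
pub unsafe extern "C" fn demo_filter_free(filter: *mut DemoFilter) {
    if !filter.is_null() {
        // Rebuild the Box so Rust's allocator releases the memory.
        drop(unsafe { Box::from_raw(filter) });
    }
}
```

Keeping the boundary to an opaque pointer plus a handful of functions is one way to keep such an integration modular: the C++ side never needs to know the layout of the Rust type.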

Doesn't crash, but may set your machine on fire 🔥, use with caution. More Rust related discussion available in #17090.

fanquake and others added 2 commits May 3, 2021 12:09
Integrates a slightly modified version of https://github.com/axiomhq/rust-cuckoofilter,
so that headers are generated for use from our C++ code. You can see the
modifications in my fork: https://github.com/fanquake/rust-cuckoofilter.

This leverages existing work from Cory & Jeremy.

Co-Authored-By: Jeremy Rubin <j@rubin.io>
Co-Authored-By: Cory Fields <cory-nospam-@coryfields.com>
@maflcko
Member

maflcko commented May 3, 2021

What I am suggesting is that the Rust discussion should continue, particularly in regards to integrations that can be done in a very non-invasive / modular fashion.

I am not sure if net processing is the right place for a rust playground. It might be better to allow devs to write some non-production code in rust. For example, let them write the unit tests in rust if they wish to do so. This wouldn't be much different than the option to write tests today in either C++ or Python.

I understand that your goal is to make the newly added code optional by effectively duplicating it, but the more-than-doubling of the review and maintenance burden is reason enough to Approach NACK this pull. Wasn't one of the suggestions last time this came up to have a separate net-processing module written in rust that speaks with Bitcoin Core to give it blocks (completely separate from the existing C++ net-processing)?

@sipa
Member

sipa commented May 3, 2021

I don't have too much of an opinion on the Rust integration side of things. If that's a direction people want to go in, integration of an existing, functioning library with a limited interface is probably one of the better places to start. As @MarcoFalke says, something test-only at first might be even better.

Regarding cuckoo filters... they're great, and there are really nice ways of using them for rolling probabilistic data structures.

I don't think this library is really what we want though. As I understand it, it deletes random elements when it overflows. That implies that there is basically no guaranteed minimum retention time for elements in the filter. For m_addr_known I don't think that matters very much, but we have half a dozen other RollingBloomFilters, and for some of them the guaranteed retention time matters.
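As an illustration of what a guaranteed retention window means, here is a small sketch of a generation-based rolling set. It uses exact `HashSet`s rather than a probabilistic filter, `RollingSet` is a hypothetical name, and it is not the implementation of Bitcoin Core's rolling bloom filter or of the rolling cuckoo filter mentioned in this thread. The point is only the eviction policy: whole generations are dropped in insertion order, so the most recent `generation_size` insertions are always still present, which random eviction cannot promise.

```rust
use std::collections::HashSet;
use std::hash::Hash;
use std::mem;

/// Hypothetical two-generation rolling set, for illustration only.
pub struct RollingSet<T> {
    generation_size: usize,
    current: HashSet<T>,  // newest generation, still being filled
    previous: HashSet<T>, // older generation, dropped wholesale on rotation
}

impl<T: Hash + Eq> RollingSet<T> {
    pub fn new(generation_size: usize) -> Self {
        assert!(generation_size > 0);
        RollingSet {
            generation_size,
            current: HashSet::new(),
            previous: HashSet::new(),
        }
    }

    pub fn insert(&mut self, item: T) {
        if self.current.len() == self.generation_size {
            // Rotate: the full generation becomes `previous`, and the old
            // `previous` (the oldest entries) is forgotten in one step.
            self.previous = mem::take(&mut self.current);
        }
        self.current.insert(item);
    }

    pub fn contains(&self, item: &T) -> bool {
        self.current.contains(item) || self.previous.contains(item)
    }
}
```

With `generation_size = 3`, inserting 1 through 7 leaves 4 through 7 present and 1 through 3 forgotten. The cost of the guarantee is that up to twice the window may be held at any time.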

I'm a bit biased here of course; @gmaxwell and I have done some research on constructing a rolling cuckoo filter, and found a way to build one with pretty good properties. I have an old branch, but it may be a while before I get it picked up again.

@DrahtBot
Contributor

DrahtBot commented May 3, 2021

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@kiminuo
Contributor

kiminuo commented Jun 9, 2021

Too bad. I hoped it would get more traction.

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Aug 18, 2022
@fanquake fanquake deleted the rust_cuckoo_filters branch November 9, 2022 16:32