-
Notifications
You must be signed in to change notification settings - Fork 36.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[POC] Rust based Cuckoo Filter for m_addr_known #21837
Conversation
Integrates a slightly modified version of https://github.com/axiomhq/rust-cuckoofilter, so that headers are generated for use from our C++ code. You can see the modifications in my fork: https://github.com/fanquake/rust-cuckoofilter. This leverages existing work from Cory & Jeremy. Co-Authored-By: Jeremy Rubin <j@rubin.io> Co-Authored-By: Cory Fields <cory-nospam-@coryfields.com>
I am not sure if net processing is the right place for a rust playground. It might be better to allow devs to write some non-production code in rust. For example, let them write the unit tests in rust if they wish to do so. This wouldn't be much different than the option to write tests today in either C++ or Python. I understand that your goal is to make the newly added code optional by effectively duplicating it, but the more-than-doubling of the review and maintenance burden is reason enough to Approach NACK this pull. Wasn't one of the suggestions last time this came up to have a separate net-processing module written in rust that speaks with Bitcoin Core to give it blocks (completely separate from the existing C++ net-processing)? |
I don't have too much opinion on the Rust integration side of things. If that's a direction people want to go in, integration of a existing, functioning, library with a limited interface is probably one of the better places to start. As @MarcoFalke says, something test-only at first might be even better. Regarding cuckoo filters... they're great, and there are really nice ways of using them for rolling probabilistic data structures. I don't think this library is really what we want though. As I understand it, it deletes random elements when they overflow. That implies that there is basically 0 minimal retaining time for elements in the filter. For m_addr_known I don't think that matters very much, but we have half a dozen other RollingBloomFilters, and for some of them the guaranteed time of retaining matters. I'm a bit biased here of course; @gmaxwell and I have done some research on constructing a rolling cuckoo filter, and found a way to build one with pretty good properties. I have a old branch, but it may be a while before I get it picked up again. |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Too bad. I hoped it would get more traction. |
This proof of concept PR (leveraging work done by Cory Fields & Jeremy Rubin, see #16834) replaces the rolling bloom filter used for
m_addr_known
with a Cuckoo Filter written in Rust. I've have made some minor build related adjustments to the Rust code, which can be seen here: https://github.com/fanquake/rust-cuckoofilter/tree/cabi_build_adjustments.See "Cuckoo Filter: Practically Better Than Bloom":
Using this is a matter of:
Note that sometimes compilation will finish (i.e due to
ccache
) beforecargo
has finished generating the header and Rust lib, which will result in a compile error:In this case you can just re-run
make
. Has been tested on macOS and Linux. Sometimesp2p_getaddr_caching.py
fails, because the number of records returned falls a few short ofMAX_ADDR_TO_SEND
. Need investigating.I'm not suggesting that this be merged as-is, or that this is the ideal way of integrating Rust code (i.e copying sources in tree, using
cbindgen
), into Bitcoin Core. What I am suggesting is that the Rust discussion should continue, particularly in regards to integrations that can be done in a very non-invasive / modular fashion.Doesn't crash, but may catch your machine on fire 🔥, use with caution. More Rust related discussion available in #17090.