Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Parallel P2P Client in Rust #17376

Closed
wants to merge 22 commits into from

Conversation

TheBlueMatt
Copy link
Contributor

@TheBlueMatt TheBlueMatt commented Nov 5, 2019

This was actually the original goal of the Rust work, but we had to start somewhere a bit simpler. This adds a full parallel P2P client, written in Rust, which can download and serve both headers and blocks, listening on a separate port. It is designed to be simple and robust, letting the C++ net_processing handle being optimized. It avoids wasting bandwidth needlessly by not (like all Rust modules so far) starting up until either IBD completes or it seems stuck and by waiting 30 seconds any time it hears about a new block that it decides to fetch before actually fetching it.

Based on (but could probably go in independently of) #16834, #16762, and #16974.

Some tasks that need to happen to clean this up a bit:

  • Build out the Rust addrman a bit - it currently always queries DNS seeds until its built out a database of 500 addresses and then never changes it. I don't think this is a particularly terrible design at a high level for something with a goal of simply having a different failure model from the C++ addrman, but the over-reliance on the DNS seeds needs to go, at least (plus we dont want to ever query them after we've built up an addrman that works, eg upon upgrade).
  • Build out connection logic a bit - we currently just open connections willy nilly very aggressively if our best header falls behind our best tip. This is great, but we should probably, like, disconnect peers that don't manage to help us along our journey (especially since the Rust code is hard-limited to 128MB of memory allocations, though this may want to change to 256MB, we have very little room for peers given our socket buffers are like 8-16MB just to receive a block and send one).
  • Support Windows. Currently uses poll() for socket handling and a pipe() to wake up the poll() thread. Needs a Windows version of that module, though that shouldn't be too hard...right?
  • A XXX or two, at least one of them probably requires @elichai explain Rust to me.
  • Subtree the libsecp bump, and the addition of rust-bitcoin crates (all managed in the rust-bitcoin org on GH by the Usual Folks) and libc crates (which just exposes libc functions and little more, though in theory we could probably drop this if we wanted to expose them ourselves, but it should be pretty simple). Subtrees are a PITA to handle, so this will just happen right before merge.

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 5, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #17581 (refactor: Remove settings merge reverse precedence code by ryanofsky)
  • #17580 (refactor: Add ALLOW_LIST flags and enforce usage in CheckArgFlags by ryanofsky)
  • #17493 (util: Forbid ambiguous multiple assignments in config file by ryanofsky)
  • #17452 (lib: update fuzz directory in .gitignore by jonatack)
  • #17398 (build: Update leveldb to 1.22+ by laanwj)
  • #17383 (Refactor: Move consts to their correct translation units by jnewbery)
  • #17227 (Qt: Add Android packaging support by icota)
  • #16722 (build: Disable warnings for leveldb subtree by default by hebasto)
  • #16688 (log: Add validation interface logging by jkczyz)
  • #16365 (Log RPC parameters (arguments) if -debug=rpcparams by LarryRuane)
  • #15606 ([experimental] UTXO snapshots by jamesob)
  • #15367 (feature: Added ability for users to add a startup command by benthecarman)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@naumenkogs
Copy link
Member

naumenkogs commented Nov 5, 2019

Concept ACK.

My personal interest here is to easier deploy alternative protocols. For example, new block relay strategy or new address relay.

Replacing those existing protocols in Bitcoin core doesn't seem feasible (unless we come up with something really brilliant), because those are critical for security and have been proving robustness for years.
At the same time, building something in an additive manner seems possible.

These protocols also might be more flexible. I don't want to have overwhelming configurations in Bitcoin Core, but if someone wants to enable, let's say, a more anonymous but less fast block propagation locally — it's good to enable that with a module in rust.

@TheBlueMatt
Copy link
Contributor Author

My personal interest here is to easier deploy alternative protocols. For example, new block relay strategy or new address relay.

While new address relay in-scope for "reliability" reasons, I'm not sure that new block relay protocols would be - I don't know that this code wants to grow much beyond what it is already...less code, less bugs :) (nor do I see any coming down the pipe on the horizon?).

@naumenkogs
Copy link
Member

I'm not sure that new block relay protocols would be

We're doing great right now, but I think there's a non-zero probability of us wanting something different in future :)
Anyway, don't want to flood this PR with this particular discussion.

@jgarzik
Copy link
Contributor

jgarzik commented Nov 5, 2019

In the "nice to have" list, I would add "fork(2)" to the OP checklist.

forking does a far better job creating a security barrier between network and the rest of the code.

It was always a long term goal to create a process boundary separation between wallet and network, for example. One way to accomplish that is removing the wallet from the core validation engine. Another way to accomplish that is forking. in re Windows, the cygwin method should work here.

A kernel guarantee is stronger than a programming language guarantee, by far.

(to be crystal clear, this is not objecting to this PR, nor proposing an alternative, just recalling relevant historical material)

@practicalswift
Copy link
Contributor

Very interesting work @TheBlueMatt!

I really appreciate that proper fuzz testing is integrated from the very start. Very nice!

Is the inclusion of the fuzzing corpus intentional (src/rusty/rust-bitcoin/fuzz/hfuzz_input/deserialize_udecimal/input/id:000121,src:000007,op:havoc,rep:2, etc.)?

I suggest removing the corpus files in order to not clutter the PR "Files changed" view (currently listing 1233 files as changed! :)).

@TheBlueMatt
Copy link
Contributor Author

@practicalswift the parts you're referring to here are just dependencies from the rust-bitcoin github org pulled in here (which are included as-is so that we can subtree them in the future, though they may need to go in via depends/ I'm just not sure how that process works), not fresh code. Indeed, most of the rust-bitcoin libraries are well-tested and have a bunch of fuzz testing built in, the inclusion of the corpus is largely because we run a few rounds of fuzz on travis and we want a good starting point for it.

src/Makefile.am Outdated
# panics in the Rust threads and simply close the thread, instead of bringing
# down the rest of Bitcoin Core.
$(LIBBITCOIN_RUSTY): $(LIBBITCOIN_RUSTY_SRCS) rusty/libbitcoin.rlib rusty/liblibc.rlib
$(RUSTC) --crate-name rusty $< --target=$(RUST_TARGET) --crate-type=staticlib --emit=link -g -O -C lto=on -C overflow-checks=on -C panic=unwind --out-dir $(@D) -L dependency=$(@D) --extern libc=$(@D)/liblibc.rlib --extern bitcoin_hashes=$(@D)/libbitcoin_hashes.rlib --extern bitcoin=$(@D)/libbitcoin.rlib
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-L rusty pointing to directory should resolve rlibs automatically instead of using all these explicit --externs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't seem to for me, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested and it seems like --extern is only necessary for libc (rustc hits E0658 otherwise). This works for me:
/home/igor/.cargo/bin/rustc --crate-name rusty rusty/src/lib.rs --target=x86_64-unknown-linux-gnu --crate-type=staticlib --emit=link -g -C lto=on -C overflow-checks=on -C panic=unwind --out-dir rusty -L rusty --extern libc=rusty/liblibc.rlib

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ehh, might as well just leave it, then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not a big deal

@bitcoin bitcoin deleted a comment from Khailand69 Nov 18, 2019
JeremyRubin and others added 13 commits November 21, 2019 17:24
…st code.

The demonstration library compiles a rust hello world example and auto-generates
a header which can be included in C++ code.

Co-Authored-By: Jeremy Rubin <j@rubin.io>
Co-Authored-By: Cory Fields <cory-nospam-@coryfields.com>
Various changes by: Matt Corallo <git@bluematt.me>
Also, break circular dependency with a new helper lib courtesy of
Cory Fields <cory-nospam-@coryfields.com>.
This starts the process of moving the last bits of
validation-critical logic out of net_processing - the fallback
logic of which blocks to download is incredibly critical to
validation and would likely also be used for any parallel block
sync systems. Further, assumeutxo will require additional ability
to select blocks for download, which is a validation-specific
concept and shouldn't really live in net_processing.

This moves a few responsibilities to the FindNextBlocksToDownload
callsite instead of passing in a CNodeId, but otherwise has no
functional changes.
This makes it more general than just net_processing
Instead of keeping pindexBestHeader set to the best header we've
ever seen, reset it back to our validated tip if we find an ancestor
of it turns out to be invalid. While the name is now a bit confusing,
this matches much better with how it is used in practice, see below.
Further, this opens up more use-cases for it in the future, namely
aggressively searching for new peers in case we have discovered
(possibly via some covert channel) headers which we do not know to be
invalid, but which we cannot find block data for.

Places pindexBestHeader is used:

 * Various GUI displays of the best header and getblockchaininfo["headers"],
   I don't think changing this is bad, and if anything this is less confusing
   in the presence of an invalid block.
 * IsCurrentForFeeEstimation(): If anything I think ensuring pindexBestHeader
   isn't some crazy invalid chain is better than the alternative, even in the
   case where you are rejecting the current chain due to hardware error (since
   hopefully in that case you won't get any new blocks anyway).
 * ConnectBlock assumevalid checks: We use pindexBestHeader to check that the
   block we're connecting leads to something with nMinimumChainWork (preventing
   a user-set assumevalid from having bogus work) and that the block we're
   connecting leads to pindexBestHeader (I'm not too worried about this one -
   it's nice to "disable" assumevalid if we have a long invalid headers chain,
   but I don't see it as a critical protection).
 * BlockRequestAllowed() uses pindexBestHeader as its target to ensure the
   requested block is within a month of the "current chain". I don't think this
   is a meaningful difference, if we're rejecting the current tip we're
   trivially fingerprintable anyway, and if the chain really does have a bunch
   of invalid crap near the tip, using the best not-invalid header is likely a
   better criteria.
 * ProcessGetBlockData uses pindexBestHeader as the "current chain" definition
   of whether a block request is "historical" for the purpose of bandwidth
   limiting. Similarly, I don't see why this is a meaningful change.
 * We use pindexBestHeader for requesting missing headers on receipt of a
   headers/compact block message or block inv as well as for initial getheaders.
   I think this is definitely wrong, using the best not-invalid header for such
   requests is much better.
 * We use pindexBestHeader to define the "current chain" for deciding when
   we're close to done with initial headers sync. I don't think this is a
   meaningful change.
 * We use pindexBestHeader to decide if initial headers sync has timed out. If
   we're rejecting the chain due to hardware error this may result in
   additional cases where we ban a peer, but this is already true, so I think
   its fine.
Note that we have to use LTO to work around the missing der_lax
symbol in rust-secp256k1.
See rust-bitcoin/rust-secp256k1#168
@TheBlueMatt
Copy link
Contributor Author

Closing due to lack of interest.

@TheBlueMatt TheBlueMatt closed this Mar 5, 2020
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants