-
Notifications
You must be signed in to change notification settings - Fork 35.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stratum v2 Template Provider (take 3) #29432
base: master
Are you sure you want to change the base?
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
87f0527
to
684890f
Compare
{ | ||
bool started = m_tp->Start(Sv2TemplateProviderOptions { .port = 18447 }); | ||
if (! started) return false; | ||
// Avoid "Connection refused" on CI: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The template provider tests are quite brittle because they use a real socket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the time being I just added handling for MSG_MORE
(on e.g. macOS sequential messages are sent separately while on Linux they're combined). I also made the timeouts a bit longer.
Hopefully that does the trick. This can be revisited closer to the time when the Template Provider is ready for its own PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except the macOS native CI... https://github.com/Sjors/bitcoin/actions/runs/7905485495/job/21578223892?pr=34#step:7:6060
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working on a fix in Sjors#34
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably look into using StaticContentsSock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vasild any thoughts on how to make mock Sock
s that can be used to play messages in two directions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! See the first two commits in #26812:
bee6bdf test: put the generic parts from StaticContentsSock into a separate class
f42e4f3 test: add a mocked Sock that allows inspecting what has been Send() to it
and then how to use that in the last commit of the same PR:
8b10990 test: add unit tests exercising full call chain of CConnman and PeerManager
With those it is possible to send/receive raw bytes to/from the (mocked) socket, or NetMsgs, e.g.:
pipes->recv.PushNetMsg(NetMsgType::GETBLOCKS, block_locator, hash_stop);
ss << TX_WITH_WITNESS(tx); | ||
tx_size = ss.size(); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TSAN is tripping up somewhere around here. The last thing it logs is - Connect 2 transactions:
. It doesn't get to - Verify ... txins:
. I wonder if this is related to mock time, which I'm testing in Sjors#34
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maflcko shouldn't TSan on CI output something useful about why it crashed? I currently only says "error 2": https://cirrus-ci.com/task/5124733717446656?logs=ci#L3531
When running this locally on Ubuntu with clang 16.0.6 I get a WARNING: ThreadSanitizer: data race
and significantly more details (still a bit cryptic, but hopefully enough to figure out what's happening).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I guess the unit tests don't capture the tsan output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But they should. At least back when I tested #27667
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe good to re-check this when/after the cmake migration is done?
90735e6
to
1e06c04
Compare
Added |
Bumping macOS to 14 on the CI does not help (tried in Sjors#35). I also can't reproduce this failure on my own Intel macOS machines, not on 13.6.4 and not on 14.2.1. A This extra delay seems to do the trick for now: Sjors@c8d10af Another option to consider is using the functional test framework instead, since these are not really unit tests. However that involves implementing the sv2 noise protocol in Python and a bunch of other work to export transport functions to the functional test framework. If anyone feels up to that challenge, let me know... |
Co-authored-by: Christopher Coverdale <chris.coverdale24@gmail.com>
Co-Authored-By: Christopher Coverdale <chris.coverdale24@gmail.com>
The Template Provider does stuff like this: DataStream ss (sv2_net_msg.m_msg); try { ss >> setup_conn; } catch (const std::exception& e) { ... Such error checking should be moved inside Sv2Message.
The template provider will listen for a Job Declarator client. It can establish a connection and detect various protocol errors. Co-Authored-By: Christopher Coverdale <chris.coverdale24@gmail.com> Co-Authored-By: Fi3
Stratum v2 requires pools to communicate the maximum bytes they intend to add to the coinbase outputs, as well as the maximum number of sigops. This commit adds them to BlockAssembler::Options, so they can be set later.
These defaults are needed by the Mining interface in the next commit.
Co-Authored-By: Christopher Coverdale <chris.coverdale24@gmail.com>
Similar to the previous commit, however the behavior isn't opt-in with NO_BRANCH, because Github CI lacks custom variables.
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
What about a combination of extending the zmq and rest interface and/or using
I'm not sure I understand this. This PR is not an sv2 discussion, so if my understanding is wrong we don't need to talk about it here, but are you saying the pool will approve a proposed block template without having/keeping all the transactions? That seems undesired? Regardless, I'm not sure the TP necessarily needs to hold on to block templates when
Yes, a separate project and a separate process would be my strong preference. Again, I do think it's important Bitcoin Core supports SV2 well. But if the
Wouldn't it be more straightforward to implement this as a separate |
FWIW, my view is that indeed the current Bitcoin Core interfaces are inadequate for implementing template providers, and while it's probably possible to make a number of improvements to existing ones to improve the situation, the result will either be far from ideal still (RPC is not binary, ZMQ is only one-directional, ...), or quite different from anything we have now. And if we're going to make a substantially different interface for the purpose of template providing, it probably makes sense to just pick Sv2 to be that interface as it's designed exactly for that purpose. The encryption aspect of Sv2 isn't necessary for that, but if all So this doesn't exclude the possibility that future block template providing protocols be implemented in external proxy servers, which then talk still talk Sv2 to Bitcoin Core, if that is still appropriate. |
The Job Declaration Protocol has a param Approval takes time, because we have to send the pool any transactions it's missing, and it needs to check those (SRI currently only checks that all transactions are present, no additional checks - I suspect that's a can of worms on its own, since there may be conflicts with the pool node mempool). During this time you would either have to mine on a pool template, or solo mine. Regardless of the merits of async mining, it's more robust for the miner to broadcast a found block themselves (this is done via the Example of this actually going wrong (in testing): #912
Maybe. But we also need to hold on to everything that was in the mempool at the time (an aspect I haven't thoroughly tested yet). From what I've seen in SRI, the Job Declarator Server (pool side) implements some sort of pseudo-mempool for that purpose.
I'm not sure if it is. The two most practical approaches are improving this PR vs. expanding SRI. Building something from scratch is probably more work. Using a language other than Rust of c++ means re-writing the Noise implementation and other reusable code. The Bitcoin Core codebase has some very useful building blocks, e.g. the Transport class, BIP324 implementation (which we use parts of), well tested TCP socket handling code (unfortunately very tightly coupled with p2p, but refactoring that can be orthogonally useful), process separation in the pipeline (so it can crash without taking the node down - though so can a separate tool), good CI, and deterministic builds (so I can easily and relatively safely ship this PR as its own binary). |
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
🐙 This pull request conflicts with the target branch and needs rebase. |
a9716c5 rpc: call IsInitialBlockDownload via miner interface (Sjors Provoost) dda0b08 rpc: minize getTipHash() calls in gbt (Sjors Provoost) 7b4d324 rpc: call processNewBlock via miner interface (Sjors Provoost) 9e22835 rpc: getTransactionsUpdated via miner interface (Sjors Provoost) 64ebb0f Always pass options to BlockAssembler constructor (Sjors Provoost) 4bf2e36 rpc: call CreateNewBlock via miner interface (Sjors Provoost) 404b01c rpc: getblocktemplate getTipHash() via Miner interface (Sjors Provoost) d8a3496 rpc: call TestBlockValidity via miner interface (Sjors Provoost) 8ecb681 Introduce Mining interface (Sjors Provoost) Pull request description: Introduce a `Mining` interface for the `getblocktemplate`, `generateblock` and other mining RPCs to use now, and for Stratum v2 to use later. Suggested here: #29346 (comment) The selection of methods added to the interface is mostly based on what the Template Provider in #29432 uses. It could be expanded further so that `rpc/mining.cpp` no longer needs `EnsureMemPool` and `EnsureChainman`. This PR should be a pure refactor. ACKs for top commit: tdb3: re ACK a9716c5 itornaza: Code review and std-tests ACK a9716c5 ryanofsky: Code review ACK a9716c5 with one minor suggestion in case you update. Only changes since last review were other small changes to the interface. Tree-SHA512: cf97f87d6e9ed89da3835a0730da3b24a7b14c8605ea221149103a5915e79598cf082a95f2bc88e33f1c450e3d4aad88aed1163a29195acca88bcace055af724
Based on on @Fi3's master...Fi3:bitcoin:PatchTemplates which is based on @ccdle12's #27854. I rebased it and re-wrote the commit history. Compared to #28983 it introduces EllSwift in the handshake and fixes various bugs. I used that opportunity to change the branch name, which makes testing against SRI slightly easier. There's no conceptual discussion on #28983 so it can be ignored by reviewers.
See docs/stratum-v2.md for a brief description of Stratum v2 and the role of Bitcoin Core in that system..
What to test and review?
I'll make separate pull requests for parts that are ready for detailed review.
See the testing guide for various ways to test this PR. This branch is actively used by (testnet) pools, so it should be ready for high level review.
Related useful PRs
Related useful issues
Implementation notes
There's roughly three layers:
ci:
commits (Support self-hosted Cirrus workers on forks #29274) are there to facilitate PR's against this branch, but they are not blocking for Stratum v2transport.h
and some other stuff fromnode
tocommon
are not blocking. But in the longer run I'd like to see process separation between the node and the template provider.main
branch. Those will get dropped over time and can be ignored.Contributing
If you want to help out with any of the issues below, please open a PR to my fork. I will then squash your commits into my own where needed.
Things left todo
Spec
ProvideMissingTransactions
? (followup?)default_coinbase_tx_additional_output_size
(seegetblocktemplate
RPC)Networking
-sv2bind
and-sv2allowip
-sv2cert
Sv2TemplateProvider::SendBuf
, reuse p2p socket handling if possiblecoinbase_output_max_additional_size
sv2_messages.h
Testing
Template generation and updating
coinbase_tx_additional_output_size
Misc
GetMerklePathForCoinbase
helper (see Future Work in original PR, and Fix NewTemplate.merkle_path in bitcoind stratum-mining/stratum#567)Potential followups
test/sv2_template_provider_tests.cpp
)bitcoin-tp
binary, see multiprocess.mdSubmitSolution
message)