-
Notifications
You must be signed in to change notification settings - Fork 35.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The libbitcoinkernel
Project
#24303
Comments
One thing that I'd like to see better / more clearly documented is which components have which requirements and what changes as a result of libbitcoinkernel. E.g., I recently learned that script/interpreter.* is not allowed to link against threading primitives so the code can be used in ??? environments. Does libbitcoinkernel have similar restrictions on what sorts of things are includable / not includable as a result (other than our own code, like platform stuff)? |
Hmm, I'm not sure what you mean...
Was this a public conversation? Could you link me? |
Yeah this came up when I was implementing lazy caching for CTV using std::call_once that linking pthread from interpreter was a nono https://gnusha.org/bitcoin-core-dev/2022-01-20.log. I guess what I mean is it would be good to have a more clear cut boundary for libbitcoinkernel both up the stack on API boundaries but also "down the stack" in terms of what libbitcoinkernel can provide v.s. what the host environment should. Maybe this is a malformed question, feel free to chat me and we can flesh out if it's meaningful. |
(ot: I couldn't find "nono" in the chat. Mind quoting the exact phrase? I think for your specific implementation Concept ACK on libbitcoinkernel. |
Strongest possible Concept ACK. IMO we should've done this years ago. Kudos to @dongcarl for taking it on. I'll help in every way I can! |
Nice! That might be handy for iOs too; so far I've no luck with this, see #12557. It seems much easier to include a C project than C++, maybe because Objective C inherits from the former. Bonus points for Swift bindings. An iOs node would need p2p stuff too. |
Concept ACK As others have said even if there are occasional consensus leaks from a libbitcoinkernel keeping consensus code siloed for the most part would be great for risk mitigation and review rigor. Some initial discussions in today's IRC meeting. |
@MarcoFalke the approach I ended up taking is substantially more complicated, perhaps you haven't fully reviewed it (callers must pass in a lambda that contains an interior synchonicity primitive or otherwise guarnatee single threaded executiuon), whereas the call_once approach was correct for all users). The convo spanned a few days, that was just the start, my bad (I searched my local IRC for a date ref). On +1 day https://gnusha.org/bitcoin-core-dev/2022-01-21.log
But this seems largely off topic for here other than having a clearer deliniation of what can and cannot be linked within kernel consensus code after this change, because I am unhappy with libbitcoin_consensus preventing the use of std::call_once in a straightforward manner, so thinking about similar issues or expectations would be good! |
Right, I think we leave the API boundary up the stack until stage 2, so that we "avoid having to prematurely optimizing for a "perfect" boundary or API (tends to be highly subjective, non-obvious, may lead to unproductive bike-shedding before we've even done anything meaningful)"
Ah I see, yeah I think the libbitcoinkernel case is very different from the libbitcoinconsensus case, since "it is a stateful library that can spawn threads, do caching, do I/O, and many other things which one may not normally expect from a library". Right now if you build #24304 with
Now, a good decision to ponder upon is whether or not we should be embedding libraries such as |
Yup, most languages have FFI support of some kind to C, so once we have a C interface everything else will be easy! |
I don't understand why there isn't a clearer answer about what's in the kernel and what's not in the kernel. If there isn't a clear line that can be drawn, I think it would be best to just treat libbitcoin_kernel as an evolving whitelist of API's that can be called externally, and not move a lot of code around or significantly change source code organization. Otherwise, I would like to know more know more about the statement 'Most of the changes to be made are "move only"' because it is not clear to me what code would be moving or where it would be moving to. |
What's included in As of #24322, we're only capturing the dependencies as they are right now, which unfortunately includes things like Throughout stage 1, we will decouple (or in some cases, abstract away) these unnecessary modules from the consensus engine, which will allow us to remove them from Right now in my exploration branch, here is the list of files that are still necessary to be linked in:
A few modules that we decouple/abstract from:
Of course this is still not minimal, but we can make progress on the decoupling a step at a time, and each step is worthwhile because it prevents future re-coupling.
Do you mean that instead of (say) splitting off the consensus-used parts of I can totally see how that would work, however, I think there's a lot to be gained from a clean split: the functions not existing at all in In any case, I'd love to avoid significant source code organization changes if I can achieve the same goals, and I know you've thought a lot about it, so let's talk more!
Oh of course! Here's an example: dongcarl@d7cf3f0 In the first commit there, we split the parts of Let me know if you have any other questions or if there's a strictly better/easier to do this that I'm missing! |
Minor edit to main description: replace "style or style-adjacent comments/reviews" with "comments/reviews not pertinent to the main thrust of PRs" |
Thanks for posting #24332. Is #24332 your "exploration branch" or is that available somewhere else? A lot of things just seem vague to me and I think it would be useful to be concrete. In general, the approach here seems a little different than I what would expect. I would expect most of the work of making the "consensus engine" usable as the library to involve getting rid of globals more than moving code around. Like I'd expect step one to be introducing some kind of context for the consensus code to be able to function. Something like a namespace kernel {
struct Context {
std::function<void(const std::string&)> log_fn;
std::function<std::string(const char*)> translate_fn;
std::optional<int64_t> adjusted_time;
std::unique_ptr<RNGState> rng_state;
Consensus::Params params;
std::unique_ptr<ChainstateManager> chainman;
std::unique_ptr<CTxMemPool> mempool;
...
};
} // namespace kernel meant to replace globals. Getting rid of globals would let you use the kernel library to write a application simulating multiple nodes. Or let you use the library to write an application that uses consensus functionality for something else, and still has unit tests that run in parallel (not forced to run serially because the kernel library has shared global state.) Moving code around and changing source code organization could be related to this, but doesn't have to be. I do think if you do want to change source code organization, you should have a clear idea of how the source code should be organized. Right now we already have:
And you are proposing to add:
The current split between node and consensus makes sense to me. Node is for code we consider internal and unstable and don't want to be reused externally. Consensus is for code that is stable and we think is useful to expose. But maybe we should rename node to kernel or rename consensus to kernel? Or maybe we should actually go ahead and keep node and consensus and add kernel as a third middle layer. This could make sense if there is a clear idea of what belongs in this middle layer. Right now the kernel library sounds like a "consensus engine" or "consensus support" library for things that somehow have to do with consensus, but aren't in consensus library. The idea of excluding anything in the consensus library that uses threads does not seem very tenable to me, when a lot of performant code people will want to write is likely to involve parallelization. It might make more sense if the dividing line had something to do with persistent storage. Like maybe re: #24303 (comment)
This doesn't really answer the question, just turns it into "what things are in the consensus engine?" or "why isn't
This all seems good, but probably the list of things to decouple should be almost as long as the original list, and there should be not very much code to move to
Right, everything's already public.
Can you help me understand why this is an improvement? It seems like there could be benefits for understanding and testing and code reuse getting rid of the
eludes me and I think I don't know what the end state is. |
I figured this out a little better now and made some concrete suggestions in #24332 (comment) |
re: #24332 (comment)
To follow up on this, I talked to Carl and he convinced me with a single word ("utreexo") that deciding what goes into re: #24303 (comment)
On the globals vs context struct thing there's a concrete suggestion and discussion in #24332 (comment). I think it is important to create a context struct, because without one new code is forced to use global variables, and it is harder to get rid of globals gradualy without big blunt instrument changes. The libbitcoinkernel branch https://github.com/dongcarl/bitcoin/commits/2022-01-v8-on-new-kirby makes reasonable choices about what to deglobalize and what not to (ArgsManager yes, LogInstance no), but in places where it's easy to add channels to avoid referencing globals more places I do think we should do that. |
After talking to Ryan, I'm am convinced that adding a Logistically, we don't have to de-globalize everything at once (:fearful: logger), but just having a place is better than not. |
Concept/approach ACK
Have you considered naming it something like "libbitcoin-chainstate" (matching the new executable name)? "kernel", "util", "common" and "node" are all very generic names; would be nice to be more specific if it's at all possible. Having the library focus on managing chain state seems plausible to me: it's distinct from "node" which also encompasses mining and p2p and rpc access, and distinct from the current libconsensus which doesn't keep state. To me it would make sense for "chain state" to include raw block storage, utxo info for the current tip, the mempool (transactions validated against the current chain state), and perhaps indexes (coinstats, txindex) since they also track the chain... |
Ooooof it might be a little late for a name change at this stage since I've been talking about it with so many people under the |
2c03cec ci: Build bitcoin-chainstate (Carl Dong) 095aa6c build: Add example bitcoin-chainstate executable (Carl Dong) Pull request description: Part of: bitcoin#24303 This PR introduces an example/demo `bitcoin-chainstate` executable using said library which can print out information about a datadir and take in new blocks on stdin. Please read the commit messages for more details. ----- #### You may ask: WTF?! Why is `index/*.cpp`, etc. being linked in? This PR is meant only to capture the state of dependencies in our consensus engine as of right now. There are many things to decouple from consensus, which will be done in subsequent PRs. Listing the files out right now in `bitcoin_chainstate_SOURCES` is purely to give us a clear picture of the task at hand, it is **not** to say that these dependencies _belongs_ there in any way. ### TODO 1. Clean up `bitcoin-chainstate.cpp` It is quite ugly, with a lot of comments I've left for myself, I should clean it up to the best of my abilities (the ugliness of our init/shutdown might be the upper bound on cleanliness here...) ACKs for top commit: ajtowns: ACK 2c03cec ryanofsky: Code review ACK 2c03cec. Just rebase, comments, formatting change since last review MarcoFalke: re-ACK 2c03cec 🏔 Tree-SHA512: 86e7fb5718caa577df8abc8288c754f4a590650d974df9d2f6476c87ed25c70f923c4db651c6963f33498fc7a3a31f6692b9a75cbc996bf4888c5dac2f34a13b
Concept/approach ACK
Once we have a mature If there is interest to develop those applications, they might be sources of feedback to a
I think indexes are more server oriented features than validation. A full-node implementation embedding [0] https://gitlab.com/lightning-signer/validating-lightning-signer |
…dex`s 6c23c41 refactor: Rewrite AddToBlockIndex with try_emplace (Carl Dong) c05cf7a style: Modernize range-based loops over m_block_index (Carl Dong) c2a1655 style-only: Use using instead of typedef for BlockMap (Carl Dong) dd79dad refactor: Rewrite InsertBlockIndex with try_emplace (Carl Dong) 531dce0 tests: Remove now-unnecessary manual Unload's (Carl Dong) bec86ae blockstorage: Make m_block_index own CBlockIndex's (Carl Dong) Pull request description: Part of: #24303 Split off from: #22564 ``` Instead of having CBlockIndex's live on the heap, which requires manual memory management, have them be owned by m_block_index. This means that they will live and die with BlockManager. ``` The second commit demonstrates how this makes calls to `Unload()` to satisfy the address sanitizer unnecessary. ACKs for top commit: ajtowns: ACK 6c23c41 MarcoFalke: re-ACK 6c23c41 🎨 Tree-SHA512: 81b2b5119be27cc0f8a9457b11da60cc60930315d2a5be36be89fe253d32073ffe622348ff153114b9b3212197bddbc791810913a43811b33cc58e7162bd105b
…tructing temporary empty mempools 0f1a259 miner: Make mempool optional for BlockAssembler (Carl Dong) cc5739b miner: Make UpdatePackagesForAdded static (Carl Dong) f024578 miner: Absorb SkipMapTxEntry into addPackageTxs (Carl Dong) Pull request description: This is part of the libbitcoinkernel project: #24303, https://github.com/bitcoin/bitcoin/projects/18 This is **_NOT_** dependent on, but is a "companion-PR" to #25215. ### Abstract This PR removes the need to construct `BlockAssembler` with temporary, empty mempools in cases where we don't want to source transactions from the mempool (e.g. in `TestChain100Setup::CreateBlock` and `generateblock`). After this PR, `BlockAssembler` will accept a `CTxMemPool` pointer and handle the `nullptr` case instead of requiring a `CTxMemPool` reference. An overview of the changes is best seen in the changes in the header file: ```diff diff --git a/src/node/miner.h b/src/node/miner.h index 7cf8e3fb9e..7e9f503602 100644 --- a/src/node/miner.h +++ b/src/node/miner.h @@ -147,7 +147,7 @@ private: int64_t m_lock_time_cutoff; const CChainParams& chainparams; - const CTxMemPool& m_mempool; + const CTxMemPool* m_mempool; CChainState& m_chainstate; public: @@ -157,8 +157,8 @@ public: CFeeRate blockMinFeeRate; }; - explicit BlockAssembler(CChainState& chainstate, const CTxMemPool& mempool); - explicit BlockAssembler(CChainState& chainstate, const CTxMemPool& mempool, const Options& options); + explicit BlockAssembler(CChainState& chainstate, const CTxMemPool* mempool); + explicit BlockAssembler(CChainState& chainstate, const CTxMemPool* mempool, const Options& options); /** Construct a new block template with coinbase to scriptPubKeyIn */ std::unique_ptr<CBlockTemplate> CreateNewBlock(const CScript& scriptPubKeyIn); @@ -177,7 +177,7 @@ private: /** Add transactions based on feerate including unconfirmed ancestors * Increments nPackagesSelected / nDescendantsUpdated with corresponding * statistics from the package selection (for logging statistics). */ - void addPackageTxs(int& nPackagesSelected, int& nDescendantsUpdated) EXCLUSIVE_LOCKS_REQUIRED(m_mempool.cs); + void addPackageTxs(const CTxMemPool& mempool, int& nPackagesSelected, int& nDescendantsUpdated) EXCLUSIVE_LOCKS_REQUIRED(mempool.cs); // helper functions for addPackageTxs() /** Remove confirmed (inBlock) entries from given set */ @@ -189,15 +189,8 @@ private: * These checks should always succeed, and they're here * only as an extra check in case of suboptimal node configuration */ bool TestPackageTransactions(const CTxMemPool::setEntries& package) const; - /** Return true if given transaction from mapTx has already been evaluated, - * or if the transaction's cached data in mapTx is incorrect. */ - bool SkipMapTxEntry(CTxMemPool::txiter it, indexed_modified_transaction_set& mapModifiedTx, CTxMemPool::setEntries& failedTx) EXCLUSIVE_LOCKS_REQUIRED(m_mempool.cs); /** Sort the package in an order that is valid to appear in a block */ void SortForBlock(const CTxMemPool::setEntries& package, std::vector<CTxMemPool::txiter>& sortedEntries); - /** Add descendants of given transactions to mapModifiedTx with ancestor - * state updated assuming given transactions are inBlock. Returns number - * of updated descendants. */ - int UpdatePackagesForAdded(const CTxMemPool::setEntries& alreadyAdded, indexed_modified_transaction_set& mapModifiedTx) EXCLUSIVE_LOCKS_REQUIRED(m_mempool.cs); }; int64_t UpdateTime(CBlockHeader* pblock, const Consensus::Params& consensusParams, const CBlockIndex* pindexPrev); ``` ### Alternatives Aside from approach in this current PR, we can also take the approach of moving the `CTxMemPool*` argument from the `BlockAssembler` constructor to `BlockAssembler::CreateNewBlock`, since that's where it's needed anyway. I did not push this approach because it requires quite a lot of call sites to be changed. However, I do have it coded up and can do that if people express a strong preference. This would look something like: ``` BlockAssembler::BlockAssembler(CChainState& chainstate, const Options& options); BlockAssembler::CreateNewBlock(const CScript& scriptPubKeyIn, const CTxMemPool* maybe_mempool); ``` ### Future work Although wholly out of scope for this PR, we could potentially refine the `BlockAssembler` interface further, so that we have: ``` BlockAssembler::BlockAssembler(CChainState& chainstate, const Options& options); BlockAssembler::CreateNewBlock(const CScript& scriptPubKeyIn, std::vector<CTransaction>& txs); BlockAssembler::CreateNewBlock(const CScript& scriptPubKeyIn, const CTxMemPool& mempool); ``` Whereby `TestChain100Setup::CreateBlock` and `generateblock` would call the `BlockAssembler::CreateNewBlock` that takes in `CTransaction`s and we can potentially remove `RegenerateCommitments` altogether. All other callers can use the `CTxMemPool` version. ACKs for top commit: glozow: ACK 0f1a259 laanwj: Code review ACK 0f1a259 MarcoFalke: ACK 0f1a259 🐊 Tree-SHA512: 2b4b1dbb43d85719f241ad1f19ceb7fc50cf764721da425a3d1ff71bd16328c4f86acff22e565bc9abee770d3ac8827a6676b66daa93dbf42dd817ad929e9448
d273e53 bench/rpc_mempool: Create ChainTestingSetup, use its CTxMemPool (Carl Dong) 020caba bench: Use existing CTxMemPool in TestingSetup (Carl Dong) 86e732d scripted-diff: test: Use CTxMemPool in TestingSetup (Carl Dong) 213457e test/policyestimator: Use ChainTestingSetup's CTxMemPool (Carl Dong) 319f0ce rest/getutxos: Don't construct empty mempool (Carl Dong) 03574b9 tree-wide: clang-format CTxMemPool references (Carl Dong) Pull request description: This is part of the `libbitcoinkernel` project: #24303, https://github.com/bitcoin/bitcoin/projects/18 This PR reduces the number of call sites where we explicitly construct CTxMemPool. This is done in preparation for later PRs which decouple the mempool module from `ArgsManager`, eventually all of libbitcoinkernel will be decoupled from `ArgsManager`. The changes in this PR: - Allows us to have less code churn as we modify `CTxMemPool`'s constructor in later PRs - In many cases, we can make use of existing `CTxMemPool` instances, getting rid of extraneous constructions - In other cases, we construct a `ChainTestingSetup` and use the `CTxMemPool` there, so that we can rely on the logic in `setup_common` to set things up correctly ## Notes for Reviewers ### A note on using existing mempools When evaluating whether or not it's appropriate to use an existing mempool in a `*TestingSetup` struct, the key is to make sure that the mempool has the same lifetime as the `*TestingSetup` struct. Example 1: In [`src/fuzz/tx_pool.cpp`](https://github.com/bitcoin/bitcoin/blob/b4f686952a60bbadc7ed2250651d0d6af0959f4d/src/test/fuzz/tx_pool.cpp), the `TestingSetup` is initialized in `initialize_tx_pool` and lives as a static global, while the `CTxMemPool` is in the `tx_pool_standard` fuzz target, meaning that each time the `tx_pool_standard` fuzz target gets run, a new `CTxMemPool` is created. If we were to use the static global `TestingSetup`'s CTxMemPool we might run into problems since its `CTxMemPool` will carry state between subsequent runs. This is why we don't modify `src/fuzz/tx_pool.cpp` in this PR. Example 2: In [`src/bench/mempool_eviction.cpp`](https://github.com/bitcoin/bitcoin/blob/b4f686952a60bbadc7ed2250651d0d6af0959f4d/src/bench/mempool_eviction.cpp), we see that the `TestingSetup` is in the same scope as the constructed `CTxMemPool`, so it is safe to use its `CTxMemPool`. ### A note on checking `CTxMemPool` ctor call sites After the "tree-wide: clang-format CTxMemPool references" commit, you can find all `CTxMemPool` ctor call sites with the following command: ```sh git grep -E -e 'make_unique<CTxMemPool>' \ -e '\bCTxMemPool\s+[^({;]+[({]' \ -e '\bCTxMemPool\s+[^;]+;' \ -e '\bnew\s+CTxMemPool\b' ``` At the end of the PR, you will find that there are still quite a few call sites that we can seemingly get rid of: ```sh $ git grep -E -e 'make_unique<CTxMemPool>' -e '\bCTxMemPool\s+[^({;]+[({]' -e '\bCTxMemPool\s+[^;]+;' -e '\bnew\s+CTxMemPool\b' # rearranged for easier explication src/init.cpp: node.mempool = std::make_unique<CTxMemPool>(node.fee_estimator.get(), mempool_check_ratio); src/test/util/setup_common.cpp: m_node.mempool = std::make_unique<CTxMemPool>(m_node.fee_estimator.get(), 1); src/rpc/mining.cpp: CTxMemPool empty_mempool; src/test/util/setup_common.cpp: CTxMemPool empty_pool; src/bench/mempool_stress.cpp: CTxMemPool pool; src/bench/mempool_stress.cpp: CTxMemPool pool; src/test/fuzz/rbf.cpp: CTxMemPool pool; src/test/fuzz/tx_pool.cpp: CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1}; src/test/fuzz/tx_pool.cpp: CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1}; src/test/fuzz/validation_load_mempool.cpp: CTxMemPool pool{}; src/txmempool.h: /** Create a new CTxMemPool. ``` Let's break them down one by one: ``` src/init.cpp: node.mempool = std::make_unique<CTxMemPool>(node.fee_estimator.get(), mempool_check_ratio); src/test/util/setup_common.cpp: m_node.mempool = std::make_unique<CTxMemPool>(m_node.fee_estimator.get(), 1); ``` Necessary ----- ``` src/rpc/mining.cpp: CTxMemPool empty_mempool; src/test/util/setup_common.cpp: CTxMemPool empty_pool; ``` These are fixed in #25223 where we stop requiring the `BlockAssembler` to have a `CTxMemPool` if it's not going to consult it anyway (as is the case in these two call sites) ----- ``` src/bench/mempool_stress.cpp: CTxMemPool pool; src/bench/mempool_stress.cpp: CTxMemPool pool; ``` Fixed in #24927. ----- ``` src/test/fuzz/rbf.cpp: CTxMemPool pool; src/test/fuzz/tx_pool.cpp: CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1}; src/test/fuzz/tx_pool.cpp: CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1}; src/test/fuzz/validation_load_mempool.cpp: CTxMemPool pool{}; ``` These are all cases where we don't want the `CTxMemPool` state to persist between runs, see the previous section "A note on using existing mempools" ----- ``` src/txmempool.h: /** Create a new CTxMemPool. ``` It's a comment (someone link me to a grep that understands syntax plz thx) ACKs for top commit: laanwj: Code review ACK d273e53 Tree-SHA512: c4ff3d23217a7cc4a7145defc7b901725073ef73bcac3a252ed75f672c87e98ca0368d1d8c3f606b5b49f641e7d8387d26ef802141b650b215876f191fb6d5f9
dc1e7ad Add doc/design/libraries.md (Ryan Ofsky) Pull request description: Prompted by the [libbitcoinkernel issue #24303](bitcoin/bitcoin#24303) and PRs, I started looking at existing libraries and what their dependencies are and wrote this document to describe them and where `libbitcoinkernel` fits in. Readable link is: https://github.com/ryanofsky/bitcoin/blob/pr/libs/doc/design/libraries.md Feedback is welcome ACKs for top commit: laanwj: ACK dc1e7ad hebasto: Approach ACK dc1e7ad, using this doc as a guide in hebasto/bitcoin#3 :) Tree-SHA512: 7687b1847797c50de1f5ea721bd201cc8304690064743fbe6d69e2198cc239084e9da7d158be65bea948a6ec3d71d74c84122c0e523c390b389b49ea8d2cddc9
d1684be fees: Pass in a filepath instead of referencing gArgs (Carl Dong) 9a3d825 init: Remove redundant -*mempool*, -limit* queries (Carl Dong) 6c5c60c mempool: Use m_limit for UpdateTransactionsFromBlock (Carl Dong) 9e93b10 node/ifaces: Use existing MemPoolLimits (Carl Dong) 38af2bc mempoolaccept: Use limits from mempool in constructor (Carl Dong) 9333427 mempool: Introduce (still-unused) MemPoolLimits (Carl Dong) 716bb5f scripted-diff: Rename anc/desc size limit vars to indicate SI unit (Carl Dong) 1ecc773 scripted-diff: Rename DEFAULT_MEMPOOL_EXPIRY to indicate time unit (Carl Dong) aa9141c mempool: Pass in -mempoolexpiry instead of referencing gArgs (Carl Dong) 51c7a41 init: Only determine maxmempool once (Carl Dong) 386c947 mempool: Make GetMinFee() with custom size protected (Carl Dong) 82f00de mempool: Pass in -maxmempool instead of referencing gArgs (Carl Dong) f1941e8 pool: Add and use MemPoolOptions, ApplyArgsManOptions (Carl Dong) 0199bd3 fuzz/rbf: Add missing TestingSetup (Carl Dong) ccbaf54 scripted-diff: Rename DEFAULT_MAX_MEMPOOL_SIZE to indicate SI unit (Carl Dong) fc02f77 ArgsMan: Add Get*Arg functions returning optional (Carl Dong) Pull request description: This is part of the `libbitcoinkernel` project: #24303, https://github.com/bitcoin/bitcoin/projects/18 ----- As mentioned in the Stage 1 Step 2 description of [the `libbitcoinkernel` project](#24303), `ArgsManager` will not be part of `libbitcoinkernel`. Therefore, it is important that we remove any dependence on `ArgsManager` by code that will be part of `libbitcoinkernel`. This is the first in a series of PRs aiming to achieve this. This PR removes `CTxMemPool+MempoolAccept`'s dependency on `ArgsManager` by introducing a `CTxMemPool::Options` struct, which is used to specify `CTxMemPool`'s various options at construction time. These options are: - `-maxmempool` -> `CTxMemPool::Options::max_size` - `-mempoolexpiry` -> `CTxMemPool::Options::expiry` - `-limitancestorcount` -> `CTxMemPool::Options::limits::ancestor_count` - `-limitancestorsize` -> `CTxMemPool::Options::limits::ancestor_size` - `-limitdescendantcount` -> `CTxMemPool::Options::limits::descendant_count` - `-limitdescendantsize` -> `CTxMemPool::Options::limits::descendant_size` More context can be gleaned from the commit messages. The important commits are: - 56eb479 "pool: Add and use MemPoolOptions, ApplyArgsManOptions" - a1e08b7 "mempool: Pass in -maxmempool instead of referencing gArgs" - 6f4bf3e "mempool: Pass in -mempoolexpiry instead of referencing gArgs" - 5958a7f "mempool: Introduce (still-unused) MemPoolLimits" Reviewers: Help needed in the following commits (see commit messages): - a1e08b7 "mempool: Pass in -maxmempool instead of referencing gArgs" - 0695081 "node/ifaces: Use existing MemPoolLimits" Note to Reviewers: There are perhaps an infinite number of ways to architect `CTxMemPool::Options`, the current one tries to keep it simple, usable, and flexible. I hope we don't spend too much time arguing over the design here since that's not the point. In the case that you're 100% certain that a different design is strictly better than this one in every regard, please show us a fully-implemented branch. ----- TODO: - [x] Use the more ergonomic `CTxMemPool::Options` where appropriate - [x] Doxygen comments for `ApplyArgsManOptions`, `MemPoolOptions` ----- Questions for Reviewers: 1. Should we use `std::chrono::seconds` for `CTxMemPool::Options::expiry` and `CTxMemPool::m_expiry` instead of an `int64_t`? Something else? (`std::chrono::hours`?) 2. Should I merge `CTxMemPool::Limits` inside `CTxMemPool::Options`? ACKs for top commit: MarcoFalke: ACK d1684be 🍜 ryanofsky: Code review ACK d1684be. Just minor cleanups since last review, mostly switching to brace initialization Tree-SHA512: 2c138e52d69f61c263f1c3648f01c801338a8f576762c815f478ef5148b8b2f51e91ded5c1be915e678c0b14f6cfba894b82afec58d999d39a7bb7c914736e0b
ce8b0f9 Use designated initializers for ChainstateManager::Options (Carl Dong) 3837700 Move ChainstateManagerOpts into kernel:: namespace (Carl Dong) Pull request description: This is part of the `libbitcoinkernel` project: #24303, https://github.com/bitcoin/bitcoin/projects/18 This PR is **_NOT_** dependent on any other PRs. ----- Places `ChainstateManager::Options` into the `kernel::` namespace and use designated initializers for construction. ACKs for top commit: ryanofsky: Code review ACK ce8b0f9 Tree-SHA512: 16a11b5051a2432ca4b6fa7b253376606fef619ace499dfe64d033c8fbe3e1a1875a7c946d7cd54bd908363886244ddf3a192e2f0c801ffbed40d60aad65e442
…anager` cb3e9a1 Move {Load,Dump}Mempool to kernel namespace (Carl Dong) aa30676 Move DEFAULT_PERSIST_MEMPOOL out of libbitcoinkernel (Carl Dong) 06b88ff LoadMempool: Pass in load_path, stop using gArgs (Carl Dong) b857ac6 test/fuzz: Invoke LoadMempool via CChainState (Carl Dong) b326725 Move FopenFn to fsbridge namespace (Carl Dong) ae1e8e3 mempool: Use NodeClock+friends for LoadMempool (Carl Dong) f9e8e57 mempool: Improve comments for [GS]etLoadTried (Carl Dong) 813962d scripted-diff: Rename m_is_loaded -> m_load_tried (Carl Dong) 413f4bb DumpMempool: Pass in dump_path, stop using gArgs (Carl Dong) bd44078 DumpMempool: Use std::chrono instead of weird int64_t arthmetics (Carl Dong) c84390b test/mempool_persist: Test manual savemempool when -persistmempool=0 (Carl Dong) Pull request description: This is part of the `libbitcoinkernel` project: #24303, https://github.com/bitcoin/bitcoin/projects/18 ----- This PR moves `{Dump,Load}Mempool` into its own `kernel/mempool_persist` module and introduces `ArgsManager` `node::` helpers in `node/mempool_persist_args`to remove the scattered calls to `GetBoolArg("-persistmempool", DEFAULT_PERSIST_MEMPOOL)`. More context can be gleaned from the commit messages. ----- One thing I was reflecting on as I wrote this was that in the long run, I think we should probably invert the validation <-> mempool relationship. Instead of mempool not depending on validation, it might make more sense to have validation not depend on mempool. Not super urgent since `libbitcoinkernel` will include both validation and mempool, but perhaps something for the future. ACKs for top commit: glozow: re ACK cb3e9a1 via `git range-diff 7ae032e...cb3e9a1` MarcoFalke: ACK cb3e9a1 🔒 ryanofsky: Code review ACK cb3e9a1 Tree-SHA512: 979d7237c3abb5a1dd9b5ad3dbf3b954f906a6d8320ed7b923557f41a4472deccae3e8a6bca0018c8e7a3c4a93afecc502acd1e26756f2054f157f1c0edd939d
…from `ArgsManager` 0f3a253 validationcaches: Use size_t for sizes (Carl Dong) 41c5201 validationcaches: Add and use ValidationCacheSizes (Carl Dong) 82d3058 cuckoocache: Check for uint32 overflow in setup_bytes (Carl Dong) b370164 validationcaches: Abolish arbitrary limit (Carl Dong) 08dbc6e cuckoocache: Return approximate memory size (Carl Dong) 0dbce4b tests: Reduce calls to InitS*Cache() (Carl Dong) Pull request description: This is part of the `libbitcoinkernel` project: #24303, https://github.com/bitcoin/bitcoin/projects/18 This PR is **_NOT_** dependent on any other PRs. ----- a.k.a. "Stop calling `gArgs.GetIntArg("-maxsigcachesize")` from validation code" This PR introduces the `ValidationCacheSizes` struct and its corresponding `ApplyArgsManOptions` function, removing the need to call `gArgs` from `Init{Signature,ScriptExecution}Cache()`. This serves to further decouple `ArgsManager` from `libbitcoinkernel` code. More context can be gleaned from the commit messages. ACKs for top commit: glozow: re ACK 0f3a253 theStack: Code-review ACK 0f3a253 ryanofsky: Code review ACK 0f3a253. Rebase and comment tweak since last Tree-SHA512: a492ca608466979807cac25ae3d8ef75d2f1345de52a156aa0d222c5a940f79f1b65db40090de69183cccdb12297ec060f6c64e57a26a155a94fec80e07ea0f7
…ystem.h aaced56 refactor: Move error() from util/system.h to logging.h (Ben Woosley) e7333b4 refactor: Extract util/exception from util/system (Ben Woosley) Pull request description: This pull request is part of the `libbitcoinkernel` project bitcoin/bitcoin#24303 https://github.com/bitcoin/bitcoin/projects/18 and more specifically its "Step 2: Decouple most non-consensus code from libbitcoinkernel". These commits were originally authored by empact and are taken from their parent PR #25152. #### Context There is an ongoing effort to decouple the `ArgsManager` used for command line parsing user-provided arguments from the libbitcoinkernel library (bitcoin/bitcoin#25290, bitcoin/bitcoin#25487, bitcoin/bitcoin#25527, bitcoin/bitcoin#25862, bitcoin/bitcoin#26177, and bitcoin/bitcoin#27125). The `ArgsManager` is defined in `system.h`. #### Changes Next to providing better code organization, this PR removes some reliance of the tree of libbitcoinkernel header includes on `system.h` (and thus the `ArgsManager` definition) by moving some logging functions out of the `system.*` files. Further commits splitting more functionality out of `system.h` are still in #25152 and will be submitted in separate PRs once this PR has been processed. ACKs for top commit: MarcoFalke: re-ACK aaced56 🐍 Tree-SHA512: cb39f4cb7a77e7dc1887b1cbf340d53decab8880fc00878a2f12dc627fe67245b4aafd4cc31a9eab0fad1e5bb5d0eb4cdb8d501323ca200fa6ab7b201ae34aea
…arams functionality to kernel b3e78dc refactor: Don't use global chainparams in chainstatemanager method (TheCharlatan) 382b692 Split non/kernel chainparams (Carl Dong) edabbc7 Add factory functions for Main/Test/Sig/Reg chainparams (Carl Dong) d938098 Remove UpdateVersionBitsParameters (Carl Dong) 84b8578 Decouple RegTestChainParams from ArgsManager (Carl Dong) 76cd4e7 Decouple SigNetChainParams from ArgsManager (Carl Dong) Pull request description: This pull request is part of the `libbitcoinkernel` project bitcoin/bitcoin#24303 https://github.com/bitcoin/bitcoin/projects/18 and more specifically its "Step 2: Decouple most non-consensus code from libbitcoinkernel". dongcarl is the original author of this patchset, these commits were taken from https://github.com/dongcarl/bitcoin/tree/2022-03-libbitcoinkernel-chainparams-args-only. #### Context The bitcoin kernel library currently relies on code containing user configurations through the `ArgsManager`. This is not optimal, since as a stand-alone library it should not rely on bitcoind's argument parsing logic. Instead, its interfaces should accept control and options structs that control the kernel library's desired configuration. Similar work towards decoupling the `ArgsManager` from the kernel has been done in bitcoin/bitcoin#25290, bitcoin/bitcoin#25487, bitcoin/bitcoin#25527 and bitcoin/bitcoin#25862. #### Changes By moving the `CChainParams` class definition into the kernel and giving it new factory functions `CChainParams::{RegTest,SigNet,Main,TestNet}`it can be constructed without an `ArgsManager` reference, unlike the current factory function `CreateChainParams`. The first few commits remove uses of `ArgsManager` within `CChainParams`. Then the `CChainParams` definition is moved to a new file in the `kernel/` subdirectory. ACKs for top commit: MarcoFalke: re-ACK b3e78dc 🛁 ryanofsky: Code review ACK b3e78dc. Only changes since last review were recent review suggestions. ajtowns: ACK b3e78dc Tree-SHA512: 3835aca1d3e3c75cc3303dd584bab3a77e58f6c678724a5e359fe4b0e17e0763a00931ee6191f516b9fde50496f59cc691f0709c0254206db3863bbf7ab2cacd
00e9b97 refactor: Move fs.* to util/fs.* (TheCharlatan) 106b46d Add missing fs.h includes (TheCharlatan) b202b3d Add missing cstddef include in assumptions.h (TheCharlatan) 18fb363 refactor: Extract util/fs_helpers from util/system (Ben Woosley) Pull request description: This pull request is part of the `libbitcoinkernel` project bitcoin/bitcoin#24303 https://github.com/bitcoin/bitcoin/projects/18 and more specifically its "Step 2: Decouple most non-consensus code from libbitcoinkernel". This commit was originally authored by empact and is taken from its parent PR #25152. #### Context There is an ongoing effort to decouple the `ArgsManager` used for command line parsing user-provided arguments from the libbitcoinkernel library (bitcoin/bitcoin#25290, bitcoin/bitcoin#25487, bitcoin/bitcoin#25527, bitcoin/bitcoin#25862, bitcoin/bitcoin#26177, and bitcoin/bitcoin#27125). The `ArgsManager` is defined in `system.h`. A similar pull request extracting functionality from `system.h` has been merged in bitcoin/bitcoin#27238. #### Changes Next to providing better code organization, this PR removes some reliance of the tree of libbitcoinkernel header includes on `system.h` (and thus the `ArgsManager` definition) by moving filesystem related functions out of the `system.*` files. There is already a pair of `fs.h` / `fs.cpp` in the top-level `src/` directory. They were not combined with the files introduced here, to keep the patch cleaner and more importantly because they are often included without the utility functions. The new files are therefore named `fs_helpers` and the existing `fs` files are moved into the util directory. Further commits splitting more functionality out of `system.h` are still in #25152 and will be submitted in separate PRs once this PR has been processed. ACKs for top commit: hebasto: ACK 00e9b97 Tree-SHA512: 31422f148d14ba3c843b99b1550a6fd77c77f350905ca324f93d4f97b652246bc58fa9696c64d1201979cf88733e40be02d262739bb7d417cf22bf506fdb7666
…/system be55f54 move-only: Extract common/args and common/config.cpp from util/system (TheCharlatan) Pull request description: This pull request is part of the `libbitcoinkernel` project bitcoin/bitcoin#24303 https://github.com/bitcoin/bitcoin/projects/18 and more specifically its "Step 2: Decouple most non-consensus code from libbitcoinkernel". It is part of a series of patches splitting up the `util/system` files. Its preceding pull request is bitcoin/bitcoin#27254. The pull request contains an extraction of ArgsManager related functions from util/system into their own common/ file. The background of this commit is an ongoing effort to decouple the libbitcoinkernel library from the ArgsManager. The ArgsManager belongs into the common library, since the kernel library should not depend on it. See [doc/design/libraries.md](https://github.com/bitcoin/bitcoin/blob/master/doc/design/libraries.md) for more information on this rationale. ACKs for top commit: MarcoFalke: re-ACK be55f54 🚲 ryanofsky: Code review ACK be55f54. Just small cleanups since the last review. hebasto: ACK be55f54, I have reviewed the code and it looks OK, I agree it can be merged. Tree-SHA512: 90eb03334af0155b823030b4f2ecf286d35058d700ee2ddbbaa445be19e31eb0fe982656f35bd14ecee3ad2c3d0db3746855cb8f3777eff7253713e42873e111
…il library d168458 scripted-diff: Remove unused chainparamsbase includes (TheCharlatan) e9ee8aa Add missing definitions in prep for scripted diff (TheCharlatan) ba8fc7d refactor: Replace string chain name constants with ChainTypes (TheCharlatan) 401453d refactor: Introduce ChainType getters for ArgsManager (TheCharlatan) bfc21c3 refactor: Create chaintype files (TheCharlatan) Pull request description: This pull request is part of the `libbitcoinkernel` project bitcoin/bitcoin#24303 https://github.com/bitcoin/bitcoin/projects/18 and more specifically its "Step 2: Decouple most non-consensus code from libbitcoinkernel". It is also a follow up to #26177. It replaces pull request bitcoin/bitcoin#27294, which just moved the constants to a new file, but did not re-declare them as enums. The code move of the chain name constants out of the `chainparamsbase` to their own separate header allows the kernel `chainparams` to no longer include `chainparamsbase`. The `chainparamsbase` contain references to the `ArgsManager` and networking related options that should not belong to the kernel library. Besides this move, the constants are re-declared as enums with helper functions facilitating string conversions. ACKs for top commit: ryanofsky: Code review ACK d168458. Just suggested changes since last review. Tree-SHA512: ac2fbe5cbbab4f52eae1e30af1f16700b6589eb4764c328a151a712adfc37f326cc94a65c385534c57d4bc92cc1a13bf1777d92bc924a20dbb30440e7380b316
Let's continue discussion in #27587? |
Project Board: https://github.com/bitcoin/bitcoin/projects/18
This is the main tracking issue for the
libbitcoinkernel
project.The
libbitcoinkernel
project is a new attempt at extracting out our consensus engine. Thekernel
part of the name highlights one of the key functional differences fromlibbitcoinconsensus
and in fact, most libraries: it is a stateful library that can spawn threads, do caching, do I/O, and many other things which one may not normally expect from a library.This statefulness is necessary for
libbitcoinkernel
's decidedly incremental approach to extracting our consensus engine. This approach favors:...which allows us to be continually integrated with Bitcoin Core and benefit from our extensive test suite
...which allows us to avoid having to prematurely optimizing for a "perfect" boundary or API (tends to be highly subjective, non-obvious, may lead to unproductive bike-shedding before we've even done anything meaningful)
I believe that the work of extracting out our consensus engine into a library and making the API ergonomic is likely to be a multi-release project involving multiple contributors. The incremental approach takes this into account, and respects the sheer size of work (both in writing code and getting it through review) that needs to be undertaken.
PRs
Please see the Project Board: https://github.com/bitcoin/bitcoin/projects/18
Project-wide TODOs
blockfilter.cpp
andindex/blockfilterindex.cpp
from{bitcoin_chainstate,libbitcoinkernel_la}_SOURCES
after merge of Improve Indices on pruned nodes via prune blockers #21726kernel::
namespace: [kernel 2c/n] Introducekernel::Context
, encapsulate global init/teardown #25065 (review)kernel::Context
, encapsulate global init/teardown #25065kernel::Context
, encapsulate global init/teardown #25065 (review)kernel::Context
cleanup suggestions: [kernel 2c/n] Introducekernel::Context
, encapsulate global init/teardown #25065 (comment)mempool
optional, stop constructing temporary empty mempools #25223 (comment)CTxMemPool
fromArgsManager
#25290ApplyArgsManOptions
: [kernel 3a/n] DecoupleCTxMemPool
fromArgsManager
#25290 (comment)CTxMemPool::Options
checking inCTxMemPool
constructor: [kernel 3a/n] DecoupleCTxMemPool
fromArgsManager
#25290 (comment)CalculateMemPoolAncestors
to take inCTxMemPool::Limits
instead: [kernel 3a/n] DecoupleCTxMemPool
fromArgsManager
#25290 (comment){Dump,Load}Mempool
fromArgsManager
#25487DumpMempool
's mutex: [kernel 3b/n] Decouple{Dump,Load}Mempool
fromArgsManager
#25487 (comment)LoadMempool
's semantics: [kernel 3b/n] Decouple{Dump,Load}Mempool
fromArgsManager
#25487 (comment){Dump,Load}Mempool
fromArgsManager
#25487 (comment), c4249c1The Game Plan
Stage 1: Extracting out a usable
libbitcoinkernel.{so,dylib,dll}
Step 1: Introduce internal
bitcoin-chainstate
andlibbitcoinkernel
a.k.a. What
.cpp
files do we need?This
bitcoin-chainstate
executable uses our consensus engine and its build system code will reveal the minimal set of files we need to link in to use our consensus engine as-is. It is important to note that this list of files will serve as a guiding "North Star" for this first stage of the plan: as we decouple more and more modules of the codebase from our consensus engine, this list will grow shorter and shorter.This list of files (the
_SOURCES
inAutomake
speak) then serves as the basis for alibbitcoinkernel
, whichbitcoin-chainstate
will be linked against.Key Result: Any further coupling of our consensus engine with non-consensus modules will result in linker errors, preventing this project from becoming a sisyphean task of battling coupling regressions.
Step 2: Decouple most non-consensus code from
libbitcoinkernel
a.k.a. Prune the unnecessary
.cpp
files!There are many modules which do not logically belong in
libbitcoinkernel
(e.g.index/*.cpp
,netaddress.cpp
), but are nevertheless necessary to be included in its_SOURCES
forbitcoin-chainstate
to link correctly. This is because Bitcoin Core's existing codebase is full of unnecessary dependencies/couplings that need to be untangled/decoupled/broken up.This step is where we do the decoupling for:
netaddress.cpp
timedata.cpp
init/common.cpp
ArgsManager
(this one's a doozy)index/*.cpp
shutdown.cpp
logging.cpp
Developer Note: We do not decouple the mempool yet because most users of
libbitcoinkernel
may want to have an embedded mempool with Bitcoin Core's policies and we can decouple it later.Step 3: Introduce an external
bitcoin-chainstate
a.k.a. What
.h
files do we need?Before this step,
bitcoin-chainstate
has been an internal executable managed by our build system with access to all files and headers. In this step, we add an externalbitcoin-chainstate
with a separate build system to reveal the minimal set of headers we need to ship in order to make thelibbitcoinkernel
library usable.Step 4: Decouple most non-consensus headers from
libbitcoinkernel
a.k.a. Prune the unnecessary
.h
files!Similar to Step 2, there are lots of small decoupling of the header dependency tree here. A notable piece of this step is to remove
leveldb
includes from our headers to avoid needing to re-shipleveldb
headers.Stage 2: Polishing the API / Continual De-coupling
At this point, we have a usable
libbitcoinkernel
that is somewhat minimally linked. However, it has a very idiosyncratic, Bitcoin Core-specific C++ interface. The goal of this stage is to incrementally make thelibbitcoinkernel
API more ergonomic for users outside of Bitcoin Core. Bindings to other languages (first C, then others) should be introduced.Personally, I have near-zero experience with library API design and langauge bindings, so I think it would be wise for other contributors to lead this stage. Ideally, they would be able to work with users looking to integrate with
libbitcoinkernel
who can give an accurate account of the API ergonomics from the library user's point of view.Getting
libbitcoinkernel
Through ReviewMost of the changes to be made are "move only", but there are a lot of these "move only" changes to be made. Of course comments/reviews regarding correctness are always more than welcome, but I want very much to avoid losing momentum on this project because of style or style-adjacent comments/reviews.
I propose the following ground rules to make this process more streamlined for all parties involved and a few things that I can do to help:
Action Items
libbitcoinkernel
or are a maintainer:libbitcoinkernel
Through Review" section above.The text was updated successfully, but these errors were encountered: