New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kernel: Streamline util library #29015
base: master
Are you sure you want to change the base?
Conversation
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
Is the idea here that |
Updated 0e64745 -> 778c021 ( re: #29015 (comment)
I do think that's generally a good thing to do, but it's not really the goal of this PR. The main thing this PR is trying to do is provide a better kernel API and reduce the size of the kernel library. Because libbitcoin_kernel depends on libbitcoin_util, any external application using the kernel will also depend on util. So we should try to avoid including functions and types in util that are not likely to be useful to external applications, or are unstable and meant for internal use. The common library is usually a better place for these things. |
Does that mean that things should only go in util if they're going to be used by the kernel library then? That would presumably mean util/bitdeque.h and util/sock.cpp should also eventually go elsewhere, for example? That seems like a bad approach to me (in that it means code has to move into and out of util based on whether it's relevant to the kernel, rather than due to any changes to the code itself, and users then potentially have to deal with it moving into/outof the util:: namespace). |
No, I don't think so, and I don't think much else is going to move after this PR. I think only code that shouldn't be used by the kernel or kernel applications should be moved out of util, not just code that isn't used by the kernel. I agree just moving any code that isn't used by the kernel would be a bad approach, and wrote basically the same comment as you recently in #27989 (comment), so I think we are in agreement. |
Updated 778c021 -> 8b21f41 ( |
8748d63
to
b234094
Compare
Does it make sense to improve https://github.com/bitcoin/bitcoin/blob/master/doc/design/libraries.md by expressing the idea from the quotes above more clearly? As for now, it reads:
|
This seems like the right place to talk about the relationship between the util and kernel library. My work on the kernel library has been focusing on reducing its essence to validation code, pruning external dependencies, and stripping out as much system related functionality as possible. I think if the goal is to have the kernel be capable of running on as many platforms as possible, potentially beyond the number of platforms that Bitcoin Core is capable of today and extending into embedded environments, this last point becomes crucial. Looking at the content of In my opinion, making the The asterisk here is that some of the utilities that are not currently used by the kernel do not introduce new dependencies, since they entirely rely on code that is already used by the kernel. I also generally think that some utilities could remain in the util library even though they are not currently used by validation code. The criteria I propose for evaluating if modules should remain in util after this PR are:
Looking at @hebasto's comment here, Between this PR and #28690 I think it would be nice if common criteria for |
Would it make sense to just merge common into util; but have a "stripped" version of util available for the kernel, that excludes stuff that doesn't match the 5 points above? That way the difference is just at the build system level, it doesn't involve moving files around or moving code between namespaces... Then the advice would be: (a) just put things in util, (b) but only add things to the "util-core" section of the build file when they're needed by the kernel; (c) keep things in the util:: namespace; (d) split stuff into different .h/.cpp files pretty aggressively? That would imply a different approach for the util: move fees.h and error.h to common/messages.h commit here. |
re: #29015 (comment)
Idea to build util and util-core (or util-lite?) libraries is interesting, because then it would be easy to add/remove things from util library depending on whether the kernel needs them, and it would never require renaming anything. I could see this being useful in the future if util library started growing a lot and accumulating more features of over time. But for now, IMO, just sticking with the current util/common separation and using the cleanup in this PR would organize things better and be less confusing than having two util versions, and it would require moving fewer things. Even though the libraries.md document is currently ambiguous, I think it should be pretty easy to remove the ambiguity here. re: #29015 (comment)
I think these are good rules if you replace "basic data structure utilities" with "basic utilities that to do not require outside dependencies", and also drop the 3rd rule. It seems like with the 3rd rule you are trying to make a distinction between "data structure utilities" and "system-related utilities", and I don't think it helps to draw a dividing line there. |
Added 1 commits b234094 -> 6d8cb4d ( re: #29015 (comment)
Just to address these points directly, I think the goals of making the kernel portable and working on any platform, and making the kernel minimal working and in constrained environments are great goals. I just don't you need to |
One of the guiding principles for the kernel project so far has been to prohibit future re-couplings of decoupled modules. The tracking issue says "Any further coupling of our consensus engine with non-consensus modules will result in linker errors, preventing this project from becoming a Sisyphean task of battling coupling regressions.". How could this fit in with leaving unused content in the library? |
I reread the issue you linked to, but I don't understand what it implies here. I do think you can rely on linker errors to ensure kernel code isn't calling wallet code or GUI code or P2P code or storing state in global variables. But what else do you want beyond that? And how does removing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I think I'm on roughly the same page:
- keeping Split in util make sense
- moving our string stuff into the util namespace seems fine * moving the spanparsing stuff to script mostly makes sense -- though I feel a bit like mixing consensus critical script manipulation and miniscript/descriptor parsing in the same namespace is a bit clunky?
- moving fees.h/error.h to "common" makes sense; though if we can make "common" just be a broader part of "util" that seems better long term.
- moving TransactionError into node rather than common doesn't make sense to me though
doc/design/libraries.md
Outdated
@@ -87,14 +88,15 @@ class bitcoin-qt,bitcoind,bitcoin-cli,bitcoin-wallet bold | |||
|
|||
- *libbitcoin_consensus* should be a standalone dependency that any library can depend on, and it should not depend on any other libraries itself. | |||
|
|||
- *libbitcoin_util* should also be a standalone dependency that any library can depend on, and it should not depend on other internal libraries. | |||
- *libbitcoin_crypto* should be a standalone dependency that any library can depend on, and it should not depend on any other libraries itself. (It is not shown in diagram above to save space, but has the exact same connections as *libbitcoin_consensus*, which is shown.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't libconsensus necessarily depend on libcrypto (for sha256 at least)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #29015 (comment)
Doesn't libconsensus necessarily depend on libcrypto (for sha256 at least)?
Yes that should be right. I think I just deluded myself because I didn't want to update the mermaid diagram. Updated now though.
Maybe removing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-ACK b581567
Good improvements since my last review and having some form of external verification for the work being done is very nice. If other reviewers or maintainers are not happy with introducing a bash script, I could volunteer some time to translate it.
The includes could be cleaned up a bit, especially util/string.h
and util/strencodings.h
are missing in a bunch of places. It would also be nice if the newly introduced files had correct includes. Maybe this PR could also be used as a opportunity to more aggressively clean up includes in general.
test/util/check-deps.sh
Outdated
@@ -0,0 +1,203 @@ | |||
#!/usr/bin/env bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think rather than test/
this should go into contrib/devtools
, which contains functionally similar scripts like symbol-check
, or security-check
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #29015 (comment)
I think rather than
test/
this should go intocontrib/devtools
, which contains functionally similar scripts likesymbol-check
, orsecurity-check
.
Good suggestion, it does fit in more with those tools, moved over there.
Move chainparamsbase from util to common, because util library should not depend on the common library and chainparamsbase uses the ArgsManager class in common.
Move memory_cleanse from util to crypto because the crypto library should not depend on other libraries, and it calls memory_cleanse.
Move util/message to common/signmessage so it is named more clearly, and because the util library is not supposed to depend on other libraries besides the crypto library. The signmessage functions use CKey, CPubKey, PKHash, and DecodeDestination functions in the consensus and common libraries.
Move HexStr and HexDigit functions from util to crypto. The crypto library does not actually use these functions, but the consensus library does. The consensus and util libraries not allowed to depend on each other, but are allowed to depend on the cryto library, so the crypto library is a reasonable put these. The consensus library uses HexStr and HexDigit in script.cpp, transaction.cpp, and uint256.cpp. The util library does not use HexStr but does use HexDigit in strencodings.cpp to parse integers.
This will help move the miniscript / descriptor parsing functions out of the util library in an upcoming commit, so they are not exposed to libbitcoinkernel applications. Moving the Split functions should also make them more discoverable since they now close to related functions like Join. The functions are moved verbatim without any changes.
Move miniscript / descriptor script parsing functions out of util library so they are not a dependency of the kernel. There are no changes to code or behavior.
New node/types.h file is analagous to existing wallet/types.h and is a better place to define simple node types that are shared externally with wallet and gui code than the util library. Motivation for this change is to completely remove util/error.h file currently holding TransactionError in a followup commit.
Add separate PSBTError enum instead of reusing TransactionError enum for PSBT operations, and drop unused error codes. The error codes returned by PSBT operations and transaction broadcast functions mostly do not overlap, so using an unified enum makes it harder to call any of these functions and know which errors actually need to be handled. Define PSBTError in the common library because PSBT functionality is implemented in the common library and used by both the node (for rawtransaction RPCs) and the wallet.
Move enum and message formatting functions to a common/messages header where they should be more discoverable, and also out of the util library, so they will not be a dependency of the kernel The are no changes in behavior and no changes to the moved code.
Add TransactionError to node namespace and include it directly instead of relying on indirect include through common/messages.h This is a followup to a previous commit which moved the TransactionError enum. These changes were done in a separate followup just to keep the previous commit more minimal and easy to review.
There are no changes to behavior. Changes in this commit are all additions, and are easiest to review using "git diff -U0 --word-diff-regex=." options. Motivation for this change is to keep util functions with really generic names like "Split" and "Join" out of the global namespace so it is easier to see where these functions are defined, and so they don't interfere with function overloading, especially since the util library is a dependency of the kernel library and intended to be used with external code.
…es.md Also describe crypto library which was previously unmentioned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated b581567 -> 7689dfd (pr/rmutil.17
-> pr/rmutil.18
, compare) moving check-deps script.
re: #29015 (review)
The includes could be cleaned up a bit, especially
util/string.h
andutil/strencodings.h
are missing in a bunch of places. It would also be nice if the newly introduced files had correct includes. Maybe this PR could also be used as a opportunity to more aggressively clean up includes in general.
Agree this would be nice. The only issue is I never figured out a good way to run the IWYU tool that didn't seem to add lot of unrelated changes to PRs, given that existing includes are so messy, and there are so many that could be cleaned up. If you have any advice on how to run IWYU with this PR, I'd be happy to follow. Or you already have a clear idea of how to clean up includes and just want implement changes on a branch, I'd be happy to use that.
test/util/check-deps.sh
Outdated
@@ -0,0 +1,203 @@ | |||
#!/usr/bin/env bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #29015 (comment)
I think rather than
test/
this should go intocontrib/devtools
, which contains functionally similar scripts likesymbol-check
, orsecurity-check
.
Good suggestion, it does fit in more with those tools, moved over there.
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the Possibly this is due to a silent merge conflict (the changes in this pull request being Leave a comment here, if you need help tracking down a confusing failure. |
There seems to be a conflict with #30098. Maybe rebase? |
Thanks! Rebased now.
Yes this could be a good idea. The script started out just being a few lines of bash listing symbols exported by one library and imported by another one. But then it grew as it was extended to support multiple libraries and suppress known errors. So now it's no longer a small script, though it is pretty clean and well commented. Other possible followups besides translating the script could be:
Rebased 7689dfd -> 1c26d10 ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-ACK 1c26d10
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK 1c26d10.
My Guix build:
|
Remove
fees.h
,errors.h
, andspanparsing.h
from the util library. Specifically:Split
functions fromutil/spanparsing.h
toutil/string.h
, usingutil
namespace for clarity.script/parsing.h
since they are used for descriptor and miniscript parsing.util/fees.h
andutil/errors.h
intocommon/messages.h
so there is a place for simple functions that generate user messages to live, and these functions are not part of the util library.Motivation for this change is that the util library is a dependency of the kernel, and we should remove functionality from util that shouldn't be called by kernel code or kernel applications. These changes should also improve code organization and make functions easier to discover. Some of these same moves are (or were) part of #28690, but did not help with code organization, or made it worse, so it is better to move them and clean them up in the same PR so code only has to change one time.