Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[kernel 3e/n] Decouple CDBWrapper and its kernel users from ArgsManager #25623

Closed

Conversation

dongcarl
Copy link
Contributor

@dongcarl dongcarl commented Jul 15, 2022

This is part of the libbitcoinkernel project: #24303, https://github.com/bitcoin/bitcoin/projects/18

This PR is NOT dependent on any other PRs.


This PR introduces ::Options structs and ApplyArgsManOptions functions for CDBWrapper and its wrappers and descendants in libbitcoinkernel. Namely CBlockTreeDB and CCoinsViewDB.

libbitcoinkernel code can now instantiate these classes by filling in an ::Options struct without referencing gArgs, and non-libbitcoinkernel code can use ApplyArgsManOptions with a local ArgsManager.

The ::Options struct and ApplyArgsManOptions duo has been used in many previous libbitcoinkernel ArgsManager decoupling PRs. See: https://github.com/bitcoin/bitcoin/projects/18

@dongcarl dongcarl added this to Ready for Review PRs in The libbitcoinkernel Project Jul 15, 2022
@dongcarl dongcarl force-pushed the 2022-07-kernel-argsman-cblocktreedb branch from 26c3ad9 to 1ed5723 Compare July 16, 2022 00:33
@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 16, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25862 (refactor, kernel: Remove gArgs accesses from dbwrapper and txdb by ryanofsky)
  • #25786 (refactor: Make adjusted time type safe by MarcoFalke)
  • #25781 (Remove almost all blockstorage globals by MarcoFalke)
  • #25704 (refactor: Remove almost all validation option globals by MarcoFalke)
  • #25665 (refactor: Add util::Result failure values, multiple error and warning messages by ryanofsky)
  • #24230 (indexes: Stop using node internal types and locking cs_main, improve sync logic by ryanofsky)
  • #23443 (p2p: Erlay support signaling by naumenkogs)
  • #22663 (dbwrapper: properly destroy objects in case CDBWrapper ctor throws by Crypt-iQ)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1ed5723 🔶

Show signature

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

1ed5723bc9c69bf2d7c2a902e44f3fc46b1165a7 🔶
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUhBHwv/bXhMtBcq5mbrytbt6HR1RqodiVJY8Lco4+XaZJ9SvabncUowi665bc9Q
pTlEAdBYb/RXxJhbRzZe9PTmTpqqZVbOtHaTM/TFfrJMhCSvpuCTKd28Fxav5MMV
En+4Na6IjeZ/NdRT6u2QIM4XgC3AUv3o1rsU8Uw+MHbrAcY+1sn8HuELNxcSg1dL
nUrACZIFIzsiOmNctDP4q+C9bJtnYywdV2DgT7hOED42c5A0cWxanVMvMNMVWAy+
X4m8fnMYjXL4+XmNiMGfrswHscNtWI6jdZd6uIL+kjunQFuPA08S3LVEl2UQKYav
a7G8TuH0OUXIJK6j9VFb1nPotMIbj+bgmvWObBS2R2pRpFglR45IgPmSYq+EnKyk
9pJUT6sdhrGj5RLEHF7tdF4scgqAqpilAmn95TfynEx20KAIhYse8PyRILzdTp2X
M/stGVjV9UyggbepiVfQ0TJC//LlwrGRNFlK8XLf7BHUDikjT8Hc9S3q/HeHtzDZ
zDqoqjm9
=XJoH
-----END PGP SIGNATURE-----

src/dbwrapper.cpp Show resolved Hide resolved
src/txdb.cpp Outdated
.in_memory = m_is_memory,
.wipe_existing = false,
.obfuscate_data = true,
.do_compact = false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2708fcb: Why is this false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a behavior change that I should probably document in the commit message. But I'd also like to hear if others think this is reasonable:

In my mind, options like wipe_existing and do_compact are "oneshot" parameters that should be latched to false after they've been applied for the first time. So in places like ResizeCache I think it makes more sense to supply false for these.

src/bench/checkblock.cpp Outdated Show resolved Hide resolved
src/init.cpp Outdated Show resolved Hide resolved
@dongcarl dongcarl force-pushed the 2022-07-kernel-argsman-cblocktreedb branch from 1ed5723 to ed7ae6a Compare July 18, 2022 16:22
@dongcarl dongcarl force-pushed the 2022-07-kernel-argsman-cblocktreedb branch 2 times, most recently from dd421fd to 961c86a Compare July 20, 2022 01:10
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 20, 2022
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 961c86a, but I think this would be better if the PR stopped adding ArgsManager::GetArg calls where they don't belong. I don't think there's a good reason to add new calls that will need to be deleted later from validation code when we can just delete them now. I know you aren't really concerned with indexing code, but I don't think it is great to add gArgs usages to indexing code either.

The main culprit here is gArgs.GetBoolArg("-forcecompactdb", false) calls and these can be eliminated by adding a node::ReadDatabaseArgs function analogous to the wallet::ReadDatabaseArgs function. I implemented this in f103f4d (tag) and also removed gArgs accesses from src/index/ code in the process

src/dbwrapper.h Outdated
@@ -236,6 +237,7 @@ class CDBWrapper
.in_memory = fMemory,
.wipe_existing = fWipe,
.obfuscate_data = obfuscate,
.do_compact = gArgs.GetBoolArg("-forcecompactdb", false),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "CDBWrapper: Pass in -forcecompactdb via ::Options" (d29f09a)

gArgs access doesn't really belong in dbwrapper.h header either

.in_memory = f_memory,
.wipe_existing = f_wipe,
.obfuscate_data = f_obfuscate,
.do_compact = gArgs.GetBoolArg("-forcecompactdb", false),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "Use CDBWrapper::Options ctor for non-test callers" (c5566d4)

Would be good not to add news gArgs calls here and below

@@ -80,6 +80,10 @@ int main(int argc, char* argv[])
const ChainstateManager::Options chainman_opts{
.chainparams = chainparams,
.adjusted_time_callback = static_cast<int64_t (*)()>(GetTime),
.block_tree_db_opts = {
.db_path = gArgs.GetDataDirNet() / "blocks" / "index",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "CBlockTreeDB: Add ::Options, init with BlockManager" (961c86a)

Should probably use abs_datadir variable, not gArgs

src/txdb.cpp Outdated
.in_memory = fMemory,
.wipe_existing = fWipe,
.obfuscate_data = true,
.do_compact = gArgs.GetBoolArg("-forcecompactdb", false),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "Use CDBWrapper::Options ctor for non-test callers" (c5566d4)

Moving gArgs.GetBoolArg("-forcecompactdb", false) call out of cdbwrapper code into txdb code seems like a step backwards or maybe sideways, since txdb code shouldn't be using ArgsManager either

ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 20, 2022
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Jul 20, 2022
@dongcarl
Copy link
Contributor Author

dongcarl commented Jul 21, 2022

@ryanofsky So I think we only add (net) 2 calls to gArgs, one in src/index/ and one in src/txdb. I plan to remove the one in src/txdb in the next PR, and add an ApplyArgsManOptions function to this PR so that it's easier to remove the reference in src/index/ in the future.

It is very difficult to maintain behavior and build-ability between commits without having to add oddly placed references to gArgs here and there (e.g. the one you pointed out in dbwrapper.h), since in order to correctly avoid the oddly placed gArgs references we'd have to bubble up the CDBWrapper::Options struct all the way to the top in a single commit. This would entail bubbling up both:

  1. CDBWrapper::Options through CBlockTreeDB, through BlockManager, through ChainstateManager, to init (currently done in this PR)
  2. CDBWrapper::Options through CCoinsViewDB, through CoinsViews, through InitCoinsDB (or, as I have it in my branch, through CChainState's constructor), through LoadChainstate, to init (to be done in the next PR)

in a single commit, which would likely be very hard to review. So we have to have some (admittedly awkward) resting points in these bubbling up chains just so that it's not a huge commit.

I do admit that I can do a better job of adding code comments so that it's clear these gArgs references are not meant to stay. I'd also welcome discussing alternative strategies that I might not be thinking of right now.

@dongcarl dongcarl force-pushed the 2022-07-kernel-argsman-cblocktreedb branch from 961c86a to ccdfad0 Compare July 27, 2022 21:07
@dongcarl
Copy link
Contributor Author

dongcarl commented Jul 27, 2022

Push 961c86a -> ccdfad0

  • Add commits that also decouples CCoinsViewDB from ArgsManager
  • Reorganize commits

I've thought about it and perhaps it's best to do both CBlockTreeDB and CCoinsViewDB at once. This reduces the net addition of gArgs references to just the one instance in BaseIndex::DB::DB(...) and is probably less confusing to reviewers, see Ryan's review above.

This is not merge-able yet since ActivateSnapshot's call to std::unique_ptr<CChainState> does not take into account ArgsManager overrides. I will address this in a future push by adding a CCoinsViewDB::Options m_coinsdb_opts member to ChainstateManager.

@dongcarl
Copy link
Contributor Author

dongcarl commented Aug 4, 2022

When you pushed a big change and are waiting to see if the CI passes
image

maflcko pushed a commit to bitcoin-core/gui that referenced this pull request Aug 11, 2022
2e79fb6 validation tests: Use existing {Chainstate,Block}Man (Carl Dong)

Pull request description:

  This is split up because it is needed for two changes:
  * bitcoin/bitcoin#25781
  * bitcoin/bitcoin#25623

ACKs for top commit:
  adam2k:
    ACK tested 2e79fb6
  aureleoules:
    ACK 2e79fb6.

Tree-SHA512: 2cd6a2fec19545f8ffc77e37ccb793aa6cb5815bb1b5e560c0345af6e0f890fd500ae3297b044d3f6f613b8dd7fd4553f5fc2824013342b9e25af1fe2b624967
@maflcko
Copy link
Member

maflcko commented Aug 11, 2022

Could rebase after ##25815 ?

@dongcarl dongcarl force-pushed the 2022-07-kernel-argsman-cblocktreedb branch from 60d3ffa to 06bd11d Compare August 11, 2022 22:45
@dongcarl
Copy link
Contributor Author

Push 60d3ffa -> 06bd11d

The non-Options constructor is kept (it now calls the Options version of
the constructor) so that we can fix callsites one-by-one without
breaking builds, it will be removed later in this patchset.
Add an ApplyArgsManOptions function that currently just calls the
ApplyArgsManOptions for CBlockTreeDB::Options.

In a future commit in this patchset, this function will also call the
ApplyArgsManOptions for CCoinsViewDB::Options.
Add a CBlockTreeDB::Options member to ChainstateManager::Options, and
pass it down from ChainstateManager's ctor to BlockManager's ctor to
CBlockTreeDB's new ctor that takes only a CBlockTreeDB::Options.

This removes the need to reach into ChainstateManager -> BlockManager ->
CBlockTreeDB right after you initialize the ChainstateManager, which was
ugly. See node/chainstate.cpp and test/util/setup_common.cpp diffs.

Notes:

1. In init.cpp, we needed to move the ChainstateManager initialization
   into the catch_exceptions block, otherwise exceptions thrown by
   the CBlockTreeDB ctor previously caught because they were in
   LoadChainstate would no longer be caught.
2. We also needed to add a data dir member to ChainstateManager::Options
   so that we can determine the default db_path for CBlockTreeDB if none
   is specified (See validation.h hunk). It does seem to make sense that
   ChainstateManager would know about what its data dir is anyway.
Since each BlockManager's m_block_tree_db will now be initialized with
the BlockManager ctor, we don't need it to be a unique_ptr any more.
Add a CCoinsViewDB::Options member to ChainstateManager::Options, store
it as a member in ChainstateManager's ctor, and pass it down to
CChainState's ctor when CSM::{InitializeChainstate,ActivateSnapshot} is
called. CChainState's ctor will then pass it down to CCoinsViews' new
ctor that only takes a CBlockTreeDB::Options.

This removes the need to call InitCoinsDB right after you construct your
CChainState to have a working CChainState at all. In fact,
CChainState::InitCoinsDB is entirely removed in this commit.
Since each CChainState's CoinsViewDB will now be initialized with the
CChainState ctor, we don't need it to be a unique_ptr any more.

Question for reviewers: Is deferring its destruction in Shutdown
codepaths safe?
Also use scopes for freeing temporary CDBWrappers within tests.
Now that the non-Options constructor is no longer used anywhere, remove
it.
Construct ChainstateManager with total_coinstip_cache so that we don't
have to reach into it later in node::LoadChainstate to set it.

Allows us to remove the CacheSizes parameter from node::LoadChainstate
and the member from BasicTestingSetup. Done so in this commit.
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tend to NACK. I think PR is doing some good things, but it is also doing some bad or questionable things. I think the main goal of the PR (decoupling txdb and dbwrapper from ArgsManager) can be achieved more simply. Would suggest starting from #25862 and building from there. Other changes in this PR if they are worth pursuing do not have to be done at the same time as removing ArgsManager calls.

The parts of the PR I think are good are:

  • Removing gArgs usages from txdb and dbwrapper.
  • Introducing a struct to group dbwrapper parameters.

The parts of this PR I think are bad are:

  • Making kernel API more complicated. The change makes bitcoin-chainstate.cpp example program harder to understand, and inappropriately exposes low level database settings (paths, obfuscation, cache size settings) to kernel applications that will mostly likely break if applications try to set them. I'm in favor of exposing low-level options, but not at cost of making simple applications harder to write. I also think this type of change should be done in a targeted PR, not as an accidental byproduct of internal refactoring.

  • Replacing a single GetBoolArg("-forcecompactdb") call in dbwrapper code, with 3 separate calls outside of it. This is layer violation exposing low-level database performance options to higher level application code. It also makes it unnecessarily difficult to add new debug/performance options because it now requires adding multiple struct fields and changing multiple layers of code.

  • Defining the overlapping fields in 3 different structs (DBWrapperOpts, BlockTreeDBOpts, CoinsViewDBOpts) and requiring boilerplate code to convert between struct types. There isn't a real justification for this complexity right now.

I think #25862 is a clean minimal change that does the work needed to make kernel code not depend on ArgsManager. There may be other changes in this PR that are worth keeping, but I don't think there's a reason to bundle them all together.

},
.data_dir = gArgs.GetDataDirNet(),
.coins_view_db_opts = {
.cache_size = static_cast<size_t>(cache_sizes.coins_db), // FIXME
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FIXME?

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@dongcarl
Copy link
Contributor Author

Closing in favor of #25862, lets get this done!

@dongcarl dongcarl closed this Sep 26, 2022
achow101 added a commit that referenced this pull request Feb 17, 2023
…and txdb

aadd7c5 refactor, validation: Add ChainstateManagerOpts db options (Ryan Ofsky)
0352258 refactor, txdb: Use DBParams struct in CBlockTreeDB (Ryan Ofsky)
c00fa1a refactor, txdb: Add CoinsViewOptions struct (Ryan Ofsky)
2eaeded refactor, dbwrapper: Add DBParams and DBOptions structs (Ryan Ofsky)

Pull request description:

  Code in the libbitcoin_kernel library should not be calling `ArgsManager` methods or trying to read options from the command line. Instead it should just get options values from simple structs and function arguments that are passed in externally. This PR removes `gArgs` accesses from `dbwrapper` and `txdb` modules by defining appropriate options structs, and is a followup to PR's #25290 #25487 #25527 which remove other `ArgsManager` calls from kernel modules.

  This PR does not change behavior in any way. It is a simpler alternative to #25623 because the only thing it does is remove `gArgs` references from kernel code. It avoids other unnecessary changes like adding options to the kernel API (they can be added separately later).

ACKs for top commit:
  TheCharlatan:
    Code review ACK aadd7c5
  achow101:
    ACK aadd7c5
  furszy:
    diff ACK aadd7c5

Tree-SHA512: 46dfd5d99ab3110492e7bba97a87122c831b8344caaf7dd2ebdb6e0ad6aa9174d4d1832d6f3a7465eda9294fe50defaa3c000afbbddc4e72838687df09a63ffd
@bitcoin bitcoin locked and limited conversation to collaborators Sep 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
Status: Done or Closed or Rethinking
The libbitcoinkernel Project
Ready for Review PRs
Development

Successfully merging this pull request may close these issues.

None yet

4 participants