More control of maintenance processes at startup/restart #29662

Open
domZippilli opened this issue Mar 15, 2024 · 4 comments

domZippilli commented Mar 15, 2024

Please describe the feature you'd like to see added.

I'd like some way to control the LevelDB compaction processes more directly. Specifically, if I could control when they are scheduled (hourly in the background, for example), and possibly limit the resources they consume (I/O in particular), I think it would help.

Is your feature related to a problem, if so please describe it.

The problem I'm having is that when I reboot my bitcoin-core node (which runs with txindex=1), the RPC listener comes up pretty promptly, so my healthchecks (currently just TCP) pass. However, the service is not in its usual baseline state; it is doing a lot of read I/O and logging about LevelDB compaction. This condition lasts for about an hour.

Here is an illustration of the I/O level relative to baseline. The left side is the restart, and the lower right side is after this abates. Dark blue is read:

[image: I/O level over time, restart on the left, return to baseline on the lower right]

Here's a summary of the logs at this time:

[image: log summary]

The "Compacting" log line is extremely elevated during this period (though it does occur at a much lower level after):

[image: "Compacting" log line frequency]

Here's a graph of RPC trace P50 duration before, during, and after this phase:

[image: RPC trace P50 duration before, during, and after this phase]

The real problem is the last graph, the elevated RPC latency. The tail and head latencies are also much worse than normal, so it's not just a tail latency issue I could solve with timeouts / hedging. The server goes from microsecond/millisecond latency to 10s of seconds, especially for sendrawtransaction (10-30s max), with listunspent a distant second (3-4s max).

This latency abates as soon as the high I/O and compaction logging stops. I am therefore making the intuitive leap (so experts, please consider this critically and I welcome other explanations) that resource utilization during compaction is causing some RPCs to be very slow. I also considered lock contention (perhaps cs_main) but I couldn't see it in the code.

Describe the solution you'd like

I'd like the experts to recommend a solution. Intuitively, it seems like LevelDB could amortize this compaction work during normal operation as a background task (the README seems to imply it already should?). Or maybe there is some way to limit the resources used for compaction?

Describe any alternatives you've considered

Currently, I'm looking at alternative ways to do the RPC I enabled txindex for, which is getrawtransaction without the blockhash. But that would only sidestep this issue with compaction and RPC latency.

Please leave any additional context

I am using a slower filesystem than most. It is a regionally-replicated NFS store, which we chose for resiliency reasons. Intuitively, I'd expect this problem to be less severe (or shorter duration) with lower-latency storage, but still present.

Command line args:

txindex="1", rpcworkqueue="1024", rpc_*="redacted", debug="coindb", debug="estimatefee", debug="reindex", debug="leveldb", debug="walletdb", debug="lock", debug="rpc", dbcache="5734", datadir="/home/bitcoin/data", chain="main"
@willcl-ark
Member

Hi @domZippilli.

Whilst I think having more control over leveldb compaction would be nice, it seems to me that the changes would be pretty invasive to our leveldb subtree, which we don't particularly want to touch.

I tried some minimal changes, such as skipping the scheduled compaction on startup, but testing made it clear that these compaction jobs are scheduled very regularly by the DB implementation (see below); the change made almost no difference to when I observed compaction running. It would also be possible to increase the leveldb cache size (in combination with the previous change) so that compactions are triggered less frequently.

Having read the leveldb documentation, compaction is run:

  1. when the DB is opened (which I tested changes to),
  2. when the log gets full (unavoidable), or
  3. when manual compaction is called (a minimal sketch of that call follows below).
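
For reference, the manual compaction in item 3 is a single call on the raw LevelDB database handle; Bitcoin Core does not currently expose such a hook, so the snippet below is just a sketch of the stock leveldb API:

// Sketch of LevelDB's manual compaction API (item 3 above); not a hook
// Bitcoin Core currently exposes. Passing nullptr for both endpoints
// compacts the entire key range.
#include <leveldb/db.h>

void CompactWholeDb(leveldb::DB& db)
{
    db.CompactRange(/*begin=*/nullptr, /*end=*/nullptr);
}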

You might be able to get somewhat different performance characteristics by manually changing the database cache size in the source code, but I think it's unlikely that these changes/options will land in the main branch here.
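
To illustrate what "changing the database cache size" touches, these are the stock leveldb::Options fields that most directly affect compaction frequency; the values below are examples for experimentation, not what Bitcoin Core actually sets:

// Illustrative only: stock leveldb::Options knobs that influence how often
// compaction triggers. Values are examples, not Bitcoin Core's defaults.
#include <leveldb/cache.h>
#include <leveldb/options.h>

leveldb::Options MakeTunedOptions(size_t cache_bytes)
{
    leveldb::Options options;
    // A larger memtable means fewer flushes to level-0 files and therefore
    // fewer compactions, at the cost of memory and longer crash recovery.
    options.write_buffer_size = cache_bytes / 4;
    // Larger on-disk table files mean fewer files per level, which also
    // reduces how often compaction is scheduled.
    options.max_file_size = 32 << 20; // 32 MiB
    // A bigger read cache doesn't change compaction scheduling, but it
    // offsets the read amplification compaction causes on slow storage.
    options.block_cache = leveldb::NewLRUCache(cache_bytes / 2);
    return options;
}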

If you are in IBD, compactions will be unavoidable as you sync and the cache fills, but it sounds like you are either restarting or recovering from a failure?

The startup compactions I saw mostly come from initial chainstate verification -- we verify the last 6 blocks at startup, which triggers a few of these. Without disabling this (strongly not recommended) I don't see any way to avoid them either; disabling these checks did result in fewer startup compactions. You could therefore probably see improved performance by setting -checkblocks=1, which would be preferable to the crude, untested changes below (don't run these!), which skip block verification altogether.

diff --git a/src/init.cpp b/src/init.cpp
index 1a4fce4678..f55aadc9de 100644
--- a/src/init.cpp
+++ b/src/init.cpp
@@ -614,6 +614,7 @@ void SetupServerArgs(ArgsManager& argsman)
     argsman.AddArg("-checkblocks=<n>", strprintf("How many blocks to check at startup (default: %u, 0 = all)", DEFAULT_CHECKBLOCKS), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-checklevel=<n>", strprintf("How thorough the block verification of -checkblocks is: %s (0-4, default: %u)", Join(CHECKLEVEL_DOC, ", "), DEFAULT_CHECKLEVEL), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-checkblockindex", strprintf("Do a consistency check for the block tree, chainstate, and other validation data structures occasionally. (default: %u, regtest: %u)", defaultChainParams->DefaultConsistencyChecks(), regtestChainParams->DefaultConsistencyChecks()), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
+    argsman.AddArg("-skipverify", strprintf("Skip chain verification at startup (default %b)", DEFAULT_SKIP_VERIFY), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-checkaddrman=<n>", strprintf("Run addrman consistency checks every <n> operations. Use 0 to disable. (default: %u)", DEFAULT_ADDRMAN_CONSISTENCY_CHECKS), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-checkmempool=<n>", strprintf("Run mempool consistency checks every <n> transactions. Use 0 to disable. (default: %u, regtest: %u)", defaultChainParams->DefaultConsistencyChecks(), regtestChainParams->DefaultConsistencyChecks()), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     argsman.AddArg("-checkpoints", strprintf("Enable rejection of any forks from the known historical chain until block %s (default: %u)", defaultChainParams->Checkpoints().GetHeight(), DEFAULT_CHECKPOINTS_ENABLED), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
@@ -1573,6 +1574,7 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
         options.mempool = Assert(node.mempool.get());
         options.reindex = node::fReindex;
         options.reindex_chainstate = fReindexChainState;
+        options.skip_verify = args.GetIntArg("-skipverify", DEFAULT_SKIP_VERIFY);
         options.prune = chainman.m_blockman.IsPruneMode();
         options.check_blocks = args.GetIntArg("-checkblocks", DEFAULT_CHECKBLOCKS);
         options.check_level = args.GetIntArg("-checklevel", DEFAULT_CHECKLEVEL);
diff --git a/src/node/chainstate.cpp b/src/node/chainstate.cpp
index bf1fc06b0b..d241b580e6 100644
--- a/src/node/chainstate.cpp
+++ b/src/node/chainstate.cpp
@@ -249,6 +249,10 @@ ChainstateLoadResult VerifyLoadedChainstate(ChainstateManager& chainman, const C
     auto is_coinsview_empty = [&](Chainstate* chainstate) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
         return options.reindex || options.reindex_chainstate || chainstate->CoinsTip().GetBestBlock().IsNull();
     };
+    if (options.skip_verify) {
+        LogPrintf("[chainstate] WARNING: skipping chain verification due to --skipverify flag\n");
+        return {ChainstateLoadStatus::SUCCESS, {}};
+    }
 
     LOCK(cs_main);
 
diff --git a/src/node/chainstate.h b/src/node/chainstate.h
index a6e9a0331b..36321dbebd 100644
--- a/src/node/chainstate.h
+++ b/src/node/chainstate.h
@@ -24,6 +24,7 @@ struct ChainstateLoadOptions {
     bool coins_db_in_memory{false};
     bool reindex{false};
     bool reindex_chainstate{false};
+    bool skip_verify{false};
     bool prune{false};
     //! Setting require_full_verification to true will require all checks at
     //! check_level (below) to succeed for loading to succeed. Setting it to
diff --git a/src/validation.h b/src/validation.h
index b64ba4dcbc..147fd13ed9 100644
--- a/src/validation.h
+++ b/src/validation.h
@@ -65,6 +65,7 @@ class SignalInterrupt;
 /** Block files containing a block-height within MIN_BLOCKS_TO_KEEP of ActiveChain().Tip() will not be pruned. */
 static const unsigned int MIN_BLOCKS_TO_KEEP = 288;
 static const signed int DEFAULT_CHECKBLOCKS = 6;
+static const bool DEFAULT_SKIP_VERIFY{false};
 static constexpr int DEFAULT_CHECKLEVEL{3};
 // Require that user allocate at least 550 MiB for block & undo files (blk???.dat and rev???.dat)
 // At 1MB per block, 288 blocks = 288MB.

Apart from that startup option, your only other solution as I see it is to speed up your storage backend/replication somehow. These compactions are barely noticeable for me on a local SSD...

I hope this helps in some way.

@TheBlueMatt
Contributor

Can you describe in more detail why skipping the on-open compaction is a bad idea in general (and, e.g., scheduling it to run in five minutes instead)? This behavior is very annoying even outside of high-latency storage - right when you want your machine to be a bit more free so you can start other things, LevelDB decides to spam I/O as fast as it can.
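
A rough sketch of what "schedule it to run in five minutes" could look like, assuming access to Bitcoin Core's CScheduler and to the raw leveldb::DB handle (neither hook exists in the tree today, so this is illustrative only):

// Rough sketch: skip the on-open compaction and instead defer a full manual
// compaction by five minutes. Assumes a CScheduler reference and a raw
// leveldb::DB pointer are available; that wiring is hypothetical.
#include <chrono>
#include <leveldb/db.h>
#include <scheduler.h>

void ScheduleDeferredCompaction(CScheduler& scheduler, leveldb::DB* db)
{
    using namespace std::chrono_literals;
    scheduler.scheduleFromNow([db] {
        // nullptr/nullptr means "compact the whole key range".
        db->CompactRange(/*begin=*/nullptr, /*end=*/nullptr);
    }, 5min);
}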

@willcl-ark
Member

> Can you describe in more detail why skipping the on-open compaction is a bad idea in general (and, e.g., scheduling it to run in five minutes instead)?

Sorry, I haven't made myself clear; what I think is a bad idea is disabling the startup checkblocks by default, not changing DB compaction scheduling.

I don't see any great requirement for DB compaction on open (but I am no LevelDB wizard); it's just that, as far as I can tell, disabling it for Bitcoin Core would be quite invasive to the subtree.

> This behavior is very annoying even outside of high-latency storage - right when you want your machine to be a bit more free so you can start other things, LevelDB decides to spam I/O as fast as it can.

If this problem is affecting more than just 1 user with high-latency storage then perhaps it warrants more investigation than I've given it so far, but it wasn't clear to me from this report that this might be the case.

When I profile from cold start to "Loading Wallet..." using an SSD, the compaction stuff doesn't seem problematic: https://tmp.256k1.dev/check_block_6_startup.svg?s=DoCompaction

@TheBlueMatt
Contributor

> If this problem is affecting more than just 1 user with high-latency storage then perhaps it warrants more investigation than I've given it so far, but it wasn't clear to me from this report that this might be the case.

This is definitely not just a one-person issue. I would guess that most Bitcoin Core nodes are on network-attached storage (though mostly a VM's backing storage in the same datacenter, not storage replicated across two datacenters). This can also be nasty if you crashed and have to replay on startup: in that case you already have a lot of I/O, and now compaction is not only much larger but also competing with block validation.
