nuke validation#27
Conversation
2635144 to
140e894
Compare
|
Concept ACK I like the cleaner separation. I started to do a light review, but noticed theres a merge conflict. While I finish reading through, can you resolve the conflict? |
ImportBlocks, which loads block files from disk during startup (-reindex
and -loadblock), does not belong in blockstorage.cpp alongside the
BlockManager class it merely uses. Move it to node/blockimport.{cpp,h}
so that each file has a single clear responsibility.
This is a pure move: no logic is changed. blockimport.cpp is given only
the includes it directly requires; util/result.h is removed from
blockstorage.cpp where it became unused once ImportBlocks left.
CheckFinalTxAtTip is a pure transaction finality check with no
dependency on chainstate. Move it to consensus/tx_verify.{cpp,h}
where it sits alongside the other transaction verification helpers.
DisconnectResult, ApplyTxInUndo, and UpdateCoins are UTXO mutation
primitives called during block (dis)connection. They depend only on
CCoinsViewCache and CTransaction, with no chainstate coupling. Move
them to coins.{cpp,h} alongside the existing AddCoins/SpendCoin
helpers they naturally extend.
These moves eliminate the need for validation.cpp callers to reach
into validation.h solely for these low-level helpers.
Moving ApplyTxInUndo and UpdateCoins into coins.h introduces a new
expected circular dependency: coins -> undo -> coins. undo.h includes
coins.h for Coin, and coins.h now includes undo.h for BlockUndo.
Extract mempool-related validation functions from validation.cpp into
new files mempool_validation.{h,cpp}:
- MempoolAcceptResult and PackageMempoolAcceptResult result types
- ProcessTransaction, AcceptToMemoryPool, ProcessNewPackage
- LimitMempoolSize, CalculateLockPointsAtTip, CheckSequenceLocksAtTip
- MaybeUpdateMempoolForReorg
Behavioral change: MaybeUpdateMempoolForReorg now takes an explicit
CTxMemPool& mempool parameter instead of obtaining it via
chainstate.GetMempool(). This is required because the Clang Thread
Safety Analyzer cannot alias *chainstate.MempoolMutex() to mempool.cs
when they are obtained via different expressions. The four call sites in
validation.cpp are guarded with `if (m_mempool)` null checks.
Introduces circular dependency:
mempool_validation -> validation -> mempool_validation
(validation.cpp still includes mempool_validation.h for the transition
period; to be resolved in a follow-up)
Extract block-related validation functions from validation.cpp into
new files block_validation.{h,cpp}:
- CheckBlockHeader, CheckBlock, BlockManuallyRequested
- ConnectBlock, DisconnectBlock, UpdateTip
- ActivateBestChainStep, ActivateBestChain
- InvalidateBlock, ReconsiderBlock
- GetSpendHeight, GetUTXOStats, VerifyScript helpers
- TestBlockValidity, BuildChainState
Also extract script checking infrastructure into script/script_check.{h,cpp}.
Introduces circular dependencies:
block_validation -> validation -> block_validation
block_validation -> validation -> mempool_validation -> block_validation
(validation.cpp still includes block_validation.h for the transition
period; to be resolved in a follow-up)
Rename node/chainstate.{cpp,h} to node/chainstate_load.{cpp,h} to avoid
a naming conflict with the upcoming rename of validation.{cpp,h} to
chainstate.{cpp,h}. The node/chainstate module hosts LoadChainstate()
and VerifyLoadedChainstate(), so chainstate_load better describes its
role.
After extracting mempool validation and block validation into their own
files, the code remaining in validation.{cpp,h} is exclusively
chainstate management: Chainstate, ChainstateManager, and their
supporting infrastructure. Rename the files to match what they
actually contain and update all include directives accordingly.
The rename also updates the names of existing circular dependencies in
the lint allowlist: node/blockstorage -> validation -> node/blockstorage
becomes chainstate -> node/blockstorage -> chainstate, and likewise for
kernel/coinstats. The previously introduced
block_validation -> validation -> block_validation and
mempool_validation -> validation -> mempool_validation are updated to
reflect the new filename.
140e894 to
c8d5aae
Compare
|
Rebased thanks. |
| return true; | ||
| } | ||
|
|
||
| bool CheckFinalTxAtTip(const CBlockIndex& active_chain_tip, const CTransaction& tx) |
There was a problem hiding this comment.
We lost the AssertLockHeld(cs_main) here. Why is that?
There was a problem hiding this comment.
Yes, this function does not need cs_main locked, I think, and does not have any side effects.
So it's fine, also thats one of the goals of this PR to decrease the tight coupling.
You pass the chain tip index, and then tx, and it returns whether the tx is final at that tip.
Also, active_chain_tip is a reference, so we are sure that it exists and won't get invalidated.
fwiw, I was able to retain the lock annotations in other functions because they still take chainman, mempool e.t.c. But in subsequent PR's, those annotations should also be gone in pure validation functions.
| --j; | ||
| const COutPoint& out = tx.vin[j].prevout; | ||
| int res = ApplyTxInUndo(std::move(txundo.vprevout[j]), view, out); | ||
| DisconnectResult res = ApplyTxInUndo(std::move(txundo.vprevout[j]), view, out); |
There was a problem hiding this comment.
Pointing out that we are no longer casting to int here. Perhaps this is something to upstream.
There was a problem hiding this comment.
Yeah, it is the right thing to do.
I agree that it should be ported upstream.
I will make a list of subtle things that are less intrusive that can easily be ported upstream as we move further.
|
Didn't you guys plan on keeping this repo up to date with bitcoin/bitcoin changes? If so, while these changes may seem appealing in a vacuum, this will just end up creating a ton of work for yourselves down the road to keep up with upstream? |
|
Agreed this will create many conflicts with upstream, however I anticipate this file will be subject to a great amount of change regardless. I think merge conflicts will have to be resolved manually, whether we split this file up or not.
That's the plan |
|
LGTM. @rustaceanrob since you had some comments, I'll wait for a thumbs up from you before merging |
validation.cppisn't a misnomer; it's a validation file that includes block validation, mempool validation, chainstate management, consensus helpers, and friends well over 5,000 lines across multiple concerns.This makes it hard to navigate, and hard to reason about.
This PR breaks it apart by moving code to where it actually belongs.
mempool_validation.{h,cpp}block_validation.{h,cpp}CScriptChecktoscript/script_check.{h,cpp}consensus/tx_verify.{h, cpp}coins.{h, cpp}.node/blockimport.{h, cpp}Once the extraction is complete, what remains in
validation.{cpp,h}is exclusivelyChainstateandChainstateManager, so the files are renamed tochainstate.{h,cpp}to reflect what they actually contain. node/chainstate.{cpp,h} is renamed to node/chainstate_load.{cpp,h} beforehand, also for clarity.The circular dependencies here actually makes the underlying tight coupling more obvious.
chainstate.h, while chainstate.h needs types declared in block_validation.h.
The circular dependencies are added to the lint allowlist as temporary known exceptions.
The path to resolving these is to reduce what the extracted functions need to take as parameters. Rather than passing a full ChainstateManager&, functions in block_validation and mempool_validation can be refactored to take narrower dependencies, for example, passing a BlockManager& directly, or introducing a small context struct which will be defined in a separate file, so they no longer need to include chainstate.h at all.
That work is left as a follow-up to keep this PR a pure move.
Pros of the end goal of this are:
On this branch, there are cons because this is an intermediate state.
separation is done but the logical separation is not the full benefit requires significant follow-up work