assumeutxo #15606

Closed
wants to merge 26 commits into from

jamesob
Member

@jamesob jamesob commented Mar 15, 2019

See the proposal for assumeutxo here.

Testing instructions can be found below the "Progress" section.


Progress

All items here have corresponding commits here, but are unchecked if they haven't been merged yet.


Testing

For fun (~5min)

If you want to do a quick test, you can run ./contrib/devtools/test_utxo_snapshots.sh and follow the instructions. This is mostly obviated by the functional tests, though.

For real (longer)

If you'd like to experience a real usage of assumeutxo, you can do that too.
I've cut a new snapshot at height 788'000 (http://img.jameso.be/utxo-788000.dat - but you can do it yourself with ./contrib/devtools/utxo_snapshot.sh if you want). Download that, and then create a datadir for testing:

$ cd ~/src/bitcoin  # or whatever

# get the snapshot
$ curl http://img.jameso.be/utxo-788000.dat > utxo-788000.dat

# you'll want to do this if you like copy/pasting 
$ export AU_DATADIR=/home/${USER}/au-test # or wherever

$ mkdir ${AU_DATADIR}
$ vim ${AU_DATADIR}/bitcoin.conf

dbcache=8000  # or, you know, something high
blockfilterindex=1
coinstatsindex=1
prune=3000
logthreadnames=1

Obtain this branch, build it, and then start bitcoind:

$ git remote add jamesob https://github.com/jamesob/bitcoin
$ git fetch jamesob utxo-dumpload-compressed
$ git checkout jamesob/utxo-dumpload-compressed

$ ./configure $conf_args && make  # (whatever you like to do here)

# start 'er up and watch the logs
$ ./src/bitcoind -datadir=${AU_DATADIR}

Then, in some other window, load the snapshot

$ ./src/bitcoin-cli -datadir=${AU_DATADIR} loadtxoutset $(pwd)/utxo-788000.dat

You'll see some log messages about headers retrieval and waiting to see the snapshot in the headers chain. Once you get the full headers chain, you'll spend a decent amount of time (~10min) loading the snapshot, checking it, and flushing it to disk. After all that happens, you should be syncing to tip in pretty short order, and you'll see the occasional [background validation] log message go by.

In yet another window, you can check out chainstate status with

$ ./src/bitcoin-cli -datadir=${AU_DATADIR} getchainstates

as well as usual favorites like getblockchaininfo.


Original change description

For those unfamiliar with assumeutxo, here's a brief summary from the issue (where any conceptual discussion not specific to this implementation should happen):

assumeutxo would be a way to initialize a node using a headers chain and a serialized version of the UTXO state which was generated from another node at some block height. A client making use of this UTXO "snapshot" would specify a hash and expect the content of the resulting UTXO set to yield this hash after deserialization.

This would allow users to bootstrap a usable pruned node & wallet far more quickly (and with less disk usage) than waiting for a full initial block download to complete, since we only have to sync blocks between the base of the snapshot and the current network tip. Needless to say, this comes at the expense of accepting a different trust model, though how different this really ends up being from assumevalid in effect is worth debate.

In short, this is an interesting change because it would allow nodes to get up and running within minutes given a ~3GB file (at time of writing) under an almost identical trust model to assumevalid.

In this implementation, I add a few RPC commands: dumptxoutset creates a UTXO snapshot and writes it to disk, and loadtxoutset intakes a snapshot from disk, constructs and activates chainstate based on it, and continues a from-scratch initial block download in the background for the sole purpose of validating the snapshot. Once the snapshot is validated, we throw away the chainstate used for background validation.

The assumeutxo procedure as implemented is as follows:

  1. A UTXO snapshot is loaded with the loadtxoutset <path> RPC command.
  2. A new chainstate (CChainState) is initialized using ChainstateManager::ActivateSnapshot():
    1. The serialized UTXO data is read in and various sanity checks are performed, e.g. compare expected coin count, recompute the hash and compare it with assumeutxo hash in source code.
    2. We "fast forward" new_chainstate->m_chain to have a tip at the base of the snapshot (with or without block data). Lacking block data, we fake the nTx counts of the constituent CBlockIndex entries.
    3. LoadChainTip() is called on the new snapshot and it is installed as our active chainstate.
  3. The new assumed-valid chainstate is now our active chainstate, and so it enters IBD until it is synced to the network's tip. Presumably the snapshot would be taken relatively close to the current tip but far enough away to avoid meaningful reorgs, say 10,000 blocks deep.
  4. Once the active chainstate is out of IBD, our old validation chain continues IBD "in the background" while the active chainstate services requests from most of the system.
  5. Once the background validation chainstate reaches a height equal to the base of the snapshot, we take the hash of its UTXO set and ensure it equals the expected hash based on the snapshot. If the hashes are equal, we delete the validation chainstate and move on without event; if they aren't, we log loudly and fall back to the validation chainstate (we should probably just shut down).

The implicit assumption is that the background validation chain will always be a subset of the assumed-valid snapshot chain while the latter is active. We don't properly handle reorgs that go deeper than the base of the snapshot.
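
The procedure above can be sketched as a toy model. This is a minimal illustration in Python, not the C++ implementation in this PR: the `Chainstate` class, the `utxo_hash`, `activate_snapshot`, and `check_background` helpers, and the `EXPECTED` table are all hypothetical names, and a plain SHA-256 over sorted entries stands in for the real UTXO set hash.

```python
import hashlib
from dataclasses import dataclass, field


def utxo_hash(utxos: dict) -> str:
    """Stand-in for the real UTXO set hash: SHA-256 over sorted entries."""
    h = hashlib.sha256()
    for outpoint, value in sorted(utxos.items()):
        h.update(f"{outpoint}:{value};".encode())
    return h.hexdigest()


@dataclass
class Chainstate:
    height: int = 0
    utxos: dict = field(default_factory=dict)


def activate_snapshot(snapshot_utxos: dict, base_height: int, expected_hashes: dict) -> Chainstate:
    """Steps 1-2: check the snapshot against a hardcoded hash, then build a chainstate from it."""
    expected = expected_hashes.get(base_height)
    if expected is None or utxo_hash(snapshot_utxos) != expected:
        raise ValueError("snapshot does not match a hardcoded assumeutxo hash")
    return Chainstate(height=base_height, utxos=dict(snapshot_utxos))


def check_background(background: Chainstate, base_height: int, expected_hashes: dict):
    """Step 5: returns None while still syncing, otherwise whether the hashes match."""
    if background.height < base_height:
        return None
    return utxo_hash(background.utxos) == expected_hashes[base_height]


# Demo with made-up data: a two-entry UTXO set and a snapshot base at height 100.
utxos = {"aa:0": 50, "bb:1": 25}
EXPECTED = {100: utxo_hash(utxos)}  # plays the role of the hash hardcoded in source
snapshot_cs = activate_snapshot(utxos, 100, EXPECTED)
still_syncing = check_background(Chainstate(height=40), 100, EXPECTED)
validated = check_background(Chainstate(height=100, utxos=dict(utxos)), 100, EXPECTED)
```

The point of the sketch is that the expected hash comes from a table the user cannot supply at load time; the snapshot file itself is never trusted to vouch for its own contents.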

Changes (already merged/outdated)

[Image: chainstate-beforeafter (1)]

The crux of this change is in removing any assumptions in the codebase that there is a single chainstate, i.e. any references to global variables chainActive, pcoinsTip, et al. need to be replaced with functions that return the relevant chainstate data at that moment in time. This change also takes CChainState to its logical conclusion by making it more self-contained: any references to globals like chainActive are replaced with class-local references (m_chain).

A few minor notes on the implementation:

  • When we attempt to load a wallet with a BestBlock locator lower than the base of a snapshot and the snapshot has not yet been validated, we refuse to load the wallet.

  • For additional notes, see the new assumeutxo docs.

@DrahtBot
Contributor

DrahtBot commented Mar 15, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

| Type | Reviewers |
| --- | --- |
| Concept ACK | MarcoFalke, Sjors |
| Approach ACK | ryanofsky |

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #27491 (refactor: Move chain constants to the util library by TheCharlatan)
  • #27357 (validation: Move warningcache to ChainstateManager and rename to m_warningcache by dimitaracev)
  • #27277 (Move log messages: tx enqueue to mempool, allocation to blockstorage by Sjors)
  • #27125 (refactor, kernel: Decouple ArgsManager from blockstorage by TheCharlatan)
  • #27039 (blockstorage: do not flush block to disk if it is already there by pinheadmz)
  • #26966 (index: blockfilter initial sync speedup, parallelize process by furszy)
  • #26762 (refactor: Make CCheckQueue RAII-styled by hebasto)
  • #25977 (refactor: Replace std::optional<bilingual_str> with util::Result by ryanofsky)
  • #25970 (Add headerssync tuning parameters optimization script to repo by sipa)
  • #25722 (refactor: Use util::Result class for wallet loading by ryanofsky)
  • #25665 (refactor: Add util::Result failure values, multiple error and warning messages by ryanofsky)
  • #25193 (indexes: Read the locator's top block during init, allow interaction with reindex-chainstate by mzumsande)
  • #24230 (indexes: Stop using node internal types and locking cs_main, improve sync logic by ryanofsky)
  • #24008 (assumeutxo: net_processing changes by jamesob)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@gmaxwell
Contributor

> Presumably the snapshot would be taken relatively close to the current tip but far enough away to avoid meaningful reorgs, say 10,000 blocks deep.

To be clear, the snapshot can't be terribly recent for the user of it or it breaks the security model. Assume valid only has any security at all if there is time for the review and communication about review to happen, which means a human timescale of at least weeks.

Probably the assumption should be that snapshots are created pretty close to tip, just far enough to avoid wasting time in reorgs (even 6 blocks would be fine) but the users aren't using them until they are quite a bit older, likely using the second to most recent available one.

> We don't serve blocks or transactions to the network while we're operating with an unvalidated snapshot-based chain.

We shouldn't do that. The worst that getting it wrong does is getting us disconnected from peers, which isn't particularly cosmic (and at least we might notice that something is going wrong). The downside of it is that not forwarding transactions shoots our privacy in the head, since any transaction we're emitting would be one we created.

> When we attempt to load a wallet with a BestBlock locator lower than the base of a snapshot and the snapshot has not yet been validated, we refuse to load the wallet.

Sounds good.

> compare it with the claimed hash in the snapshot metadata.

This is a zero security approach. You can't just ask the user for a value and then accept that. The attack is to reorg the chain and then announce to everyone that there is a node bug and you need to run this command to continue; we specifically engineered out that possibility in assumevalid. This isn't hypothetical, this is exactly what we've seen happen in ethereum w/ fastsync.

This is a fine setup for testing your PR however! Ultimately we should get it to a state where a (FEC-split) copy of the state is loaded from the network and where it can be tested against publicly reviewed constants configured in the software and/or blockchain.

@jamesob
Member Author

jamesob commented Mar 19, 2019

> Probably the assumption should be that snapshots are created pretty close to tip, just far enough to avoid wasting time in reorgs (even 6 blocks would be fine) but the users aren't using them until they are quite a bit older, likely using the second to most recent available one.

Yep - I figured that we'd update the assumeutxo hash in lockstep with assumevalid's.

> We shouldn't do that. The worst that getting it wrong does is getting us disconnected from peers, which isn't particularly cosmic (and at least we might notice that something is going wrong). The downside of it is that not forwarding transactions shoots our privacy in the head [...]

Good points, will fix that.

> This is a zero security approach. You can't just ask the user for a value and then accept that.

Agreed - the check I do there is just for sanity. Deciding on an initial hardcoded assumeutxo hash seems corequisite to allowing snapshots to be loadable through RPC (or any other means).

@jamesob jamesob force-pushed the utxo-dumpload-compressed branch 2 times, most recently from 16fdf93 to d5ffb02 Compare March 21, 2019 16:45
@jamesob jamesob force-pushed the utxo-dumpload-compressed branch 2 times, most recently from 60e287a to 99e5086 Compare March 21, 2019 20:50
jamesob added a commit to jamesob/bitcoin that referenced this pull request Mar 21, 2019
when previously we refused to serve these until we had validated
the snapshot. This was suggested by Greg Maxwell
(bitcoin#15606 (comment)).
@ryanofsky
Contributor

ryanofsky commented Mar 26, 2019

There are so many code changes here, and it seems like 80% of them are just renames. I know you are putting off really breaking this up and restructuring it, but maybe you could start by just splitting b2a735d in two commits: one that adds the new classes and function arguments and brute force renames without changing behavior, and a smaller one with the new functionality. That way reviewers could see the more interesting changes without having to go through all the mind-numbing renames.

@jamesob jamesob force-pushed the utxo-dumpload-compressed branch 3 times, most recently from b9b4b6f to 37fc9f8 Compare March 30, 2019 01:31
@jamesob
Member Author

jamesob commented Mar 30, 2019

Thanks for the suggestion, @ryanofsky. I spent the last few days reconstructing the changeset into sensible commits - hopefully it's easier to understand this way.

Most of the early commits are just shuffling stuff around (though all of it strictly necessary AFAICT). I've phrased the changes as much as possible as scripted-diffs and move-onlys. In replaying the changes, I found a few unnecessary diffs to omit.

If anyone has additional suggestions for how this could be made easier to review, I'm all ears, though I'm not holding my breath waiting until some Concept (N)ACKs roll in on #15605.

@jamesob jamesob force-pushed the utxo-dumpload-compressed branch 3 times, most recently from 4fff23a to df7a684 Compare April 1, 2019 14:45
jamesob and others added 7 commits May 5, 2023 13:47

This is done for use in later commits. Once snapshot validation
completes, we must restart the indexers so that they continue
indexation (sequentially) on the fully validated snapshot chainstate
now that all the requisite block data is there.

When using an assumedvalid chainstate, only process validationinterface
callbacks from the background chainstate within indexers. This ensures
that all indexes are built in-order.

Later, we can possibly designate indexes which can be built out of order
and continue their operation during snapshot use.

Once the background sync has completed, restart the indexers so that
they continue to index the now-validated snapshot chainstate.

Introduces ChainstateManager::GetPruneRange().

When using an assumedvalid (snapshot) chainstate along with a background
chainstate, we are syncing two very different regions of the chain
simultaneously. If we use the same blockfile space for both of these
syncs, wildly different height blocks will be stored alongside one
another, making pruning ineffective.

This change implements a separate blockfile cursor for the assumedvalid
chainstate when one is in use.
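
As a rough illustration of why a separate cursor helps, here is a toy Python model of the blockfile commit message above. Everything in it is an assumption made for the example: the `BlockFiles` class, the `simulate` helper, and a capacity of four blocks per file (real blockfiles are limited by size in bytes, and the actual change lives in the C++ blockstorage code).

```python
from collections import defaultdict

BLOCKS_PER_FILE = 4  # made-up capacity for the example


class BlockFiles:
    """Toy blockfile store with either one shared write cursor or one per chainstate role."""

    def __init__(self, separate_cursors: bool):
        self.separate_cursors = separate_cursors
        self.files = defaultdict(list)  # file number -> block heights stored in it
        self.cursor = {}                # cursor key -> file number currently written to
        self.next_file = 0

    def store(self, role: str, height: int) -> int:
        key = role if self.separate_cursors else "shared"
        fileno = self.cursor.get(key)
        if fileno is None or len(self.files[fileno]) >= BLOCKS_PER_FILE:
            # Start a new file for this cursor.
            fileno = self.next_file
            self.next_file += 1
            self.cursor[key] = fileno
        self.files[fileno].append(height)
        return fileno

    def max_height_spread(self) -> int:
        """Largest (max - min) height range found in any single file."""
        return max(max(h) - min(h) for h in self.files.values())


def simulate(separate_cursors: bool) -> int:
    store = BlockFiles(separate_cursors)
    for i in range(8):
        store.store("background", 1 + i)      # background sync starting near genesis
        store.store("snapshot", 788_001 + i)  # snapshot chainstate syncing near the tip
    return store.max_height_spread()


shared_spread = simulate(separate_cursors=False)   # low and high blocks share files
separate_spread = simulate(separate_cursors=True)  # each file covers a narrow range
```

With a shared cursor every file in this model holds both very old and very recent blocks, so no file can be pruned without discarding recent data; with per-role cursors each file spans only a few consecutive heights.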
@Sjors
Member

Sjors commented May 5, 2023

Here's a torrent for the snapshot: magnet:?xt=urn:btih:a457a54c76dfdbb3f44e3485a84c4772bea647e0&dn=utxo-788000.dat&tr=udp%3A%2F%2Ftracker.bitcoin.sprovoost.nl%3A6969

jamesob and others added 7 commits May 5, 2023 19:13

Add the script to the shellcheck exception list since the
quoted variables rule needs to be violated in order to get
bitcoind to pick up on $CHAIN_HACK_FLAGS.

Use the expected AssumeutxoData in order to bootstrap nChainTx values
for assumedvalid blockindex entries in the snapshot chainstate. This
is necessary because nChainTx is normally built up from nTx values,
which are populated using blockdata which the snapshot chainstate
does not yet have.

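
The nChainTx bootstrap described in the commit message above can be pictured with a small Python sketch. It is only a model under stated assumptions: `build_chain_tx` is a hypothetical helper, `None` marks blocks whose data has not been downloaded (the C++ code represents this differently), and the numbers are made up.

```python
def build_chain_tx(n_tx, base_height=None, base_chain_tx=None):
    """Cumulative transaction count per height, or None where it is unknown.

    n_tx[h] is the transaction count of the block at height h, or None if the
    block data is missing. If base_height is given, the running total is seeded
    there from base_chain_tx (the value expected for the snapshot base).
    """
    chain_tx = []
    running = 0
    for height, tx in enumerate(n_tx):
        if height == base_height:
            running = base_chain_tx      # seed from the expected snapshot data
        elif tx is None or running is None:
            running = None               # no block data, so the total is unknown
        else:
            running += tx                # normal case: add this block's nTx
        chain_tx.append(running)
    return chain_tx


# A fully synced chain accumulates from genesis.
full = build_chain_tx([1, 2, 3, 4, 5])

# A snapshot chainstate has no block data below its base (height 2 here), so the
# total at the base is seeded with the expected value and accumulates from there.
snapshot = build_chain_tx([None, None, None, 4, 5], base_height=2, base_chain_tx=6)
```

From the snapshot base onward both chains report the same totals, which is what lets progress estimates work before the historical blocks arrive.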
Otherwise we will not receive transactions during background sync until
restart.
@fanquake
Member

fanquake commented May 6, 2023

Background sync finished catching up:

2023-05-06T06:42:11Z [msghand] UpdateTip: new best=0000000000000000000104bcc48cdaaf080653751bfe4df10d2097f860abedd0 height=788463 version=0x30296000 log2_work=94.162512 tx=12944905 date='2023-05-06T06:40:07Z' progress=0.015582 cache=127.2MiB(988602txo)
2023-05-06T06:46:56Z [msghand] [background validation] UpdateTip: new best=0000000000000000000454bc0c2b24c93b359d5eba2cf98d0108e5c748771ef7 height=786000 version=0x2eef6000 log2_work=94.128679 tx=825161729 date='2023-04-18T17:19:01Z' progress=0.993293 cache=268.5MiB(2211076txo)
2023-05-06T06:47:16Z [msghand] New outbound peer connected: version: 70015, blocks=788463, peer=958 (outbound-full-relay)
<snip>
2023-05-06T06:52:50Z [msghand] New outbound peer connected: version: 70016, blocks=788463, peer=973 (block-relay-only)
2023-05-06T06:55:24Z [msghand] [background validation] UpdateTip: new best=00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e height=788000 version=0x29e9e000 log2_work=94.156236 tx=831186441 date='2023-05-02T22:16:30Z' progress=0.998719 cache=307.0MiB(2318605txo)
2023-05-06T06:55:26Z [msghand] [snapshot] computing UTXO stats for background chainstate to validate snapshot - this could take a few minutes
2023-05-06T06:56:21Z [msghand] [snapshot] snapshot beginning at 00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e has been fully validated
2023-05-06T06:56:21Z [msghand] [snapshot] allocating all cache to the snapshot chainstate
2023-05-06T06:56:21Z [msghand] Opening LevelDB in /Users/*/Downloads/assume_test/chainstate_snapshot
2023-05-06T06:56:21Z [msghand] Opened LevelDB successfully
2023-05-06T06:56:21Z [msghand] Using obfuscation key for /Users/*/Downloads/assume_test/chainstate_snapshot: ee33826b0d50cd61
2023-05-06T06:56:21Z [msghand] [Chainstate [snapshot] @ height 788463 (0000000000000000000104bcc48cdaaf080653751bfe4df10d2097f860abedd0)] resized coinsdb cache to 8.0 MiB
2023-05-06T06:56:21Z [msghand] [Chainstate [snapshot] @ height 788463 (0000000000000000000104bcc48cdaaf080653751bfe4df10d2097f860abedd0)] resized coinstip cache to 8966.0 MiB
2023-05-06T06:56:22Z [scheduler] ChainStateFlushed: WARNING: Locator contains block (hash=00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e) not on known best chain (tip=000000000000000000025dbf6c4aab8e1dee993455b628eef315696e5fe4ef2f); not writing index locator
2023-05-06T06:56:22Z [scheduler] ChainStateFlushed: WARNING: Locator contains block (hash=00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e) not on known best chain (tip=000000000000000000025dbf6c4aab8e1dee993455b628eef315696e5fe4ef2f); not writing index locator
2023-05-06T06:56:22Z [basic block filter index] basic block filter index thread start
2023-05-06T06:56:22Z [basic block filter index] Syncing basic block filter index with block chain from height 787644
2023-05-06T06:56:22Z [coinstatsindex] coinstatsindex thread start
2023-05-06T06:56:22Z [coinstatsindex] Syncing coinstatsindex with block chain from height 787644
2023-05-06T06:56:32Z [basic block filter index] basic block filter index is enabled at height 788463
2023-05-06T06:56:32Z [basic block filter index] basic block filter index thread exit
2023-05-06T06:56:52Z [coinstatsindex] Syncing coinstatsindex with block chain from height 788246
2023-05-06T06:57:03Z [coinstatsindex] coinstatsindex is enabled at height 788463
2023-05-06T06:57:03Z [coinstatsindex] coinstatsindex thread exit
2023-05-06T06:57:08Z [msghand] New outbound peer connected: version: 70016, blocks=788463, peer=974 (block-relay-only)
2023-05-06T06:59:29Z [msghand] New outbound peer connected: version: 70016, blocks=788463, peer=975 (block-relay-only)
2023-05-06T07:15:50Z [msghand] Saw new header hash=00000000000000000005a1fa7ff860245e4ec12b09a61618e0fbb44f2ebb1e5d height=788464
2023-05-06T07:15:50Z [msghand] [net] Saw new cmpctblock header hash=00000000000000000005a1fa7ff860245e4ec12b09a61618e0fbb44f2ebb1e5d peer=970
2023-05-06T07:15:50Z [msghand] UpdateTip: new best=00000000000000000005a1fa7ff860245e4ec12b09a61618e0fbb44f2ebb1e5d height=788464 version=0x2817c000 log2_work=94.162526 tx=12948693 date='2023-05-06T07:15:36Z' progress=0.015587 cache=128.6MiB(1000903txo)
<snip>
2023-05-06T10:40:52Z [msghand] New outbound peer connected: version: 70016, blocks=788495, peer=1011 (block-relay-only)
2023-05-06T10:46:45Z [msghand] Saw new header hash=0000000000000000000091d6b178462227c1334c10f14c39255160c0fc82875b height=788496
2023-05-06T10:46:45Z [msghand] [net] Saw new cmpctblock header hash=0000000000000000000091d6b178462227c1334c10f14c39255160c0fc82875b peer=970
2023-05-06T10:46:46Z [msghand] UpdateTip: new best=0000000000000000000091d6b178462227c1334c10f14c39255160c0fc82875b height=788496 version=0x20000000 log2_work=94.162955 tx=13090510 date='2023-05-06T10:46:26Z' progress=0.015756 cache=155.3MiB(1211908txo)
# ./src/bitcoin-cli -datadir=${AU_DATADIR} getchainstates
{
  "active_chain_type": "validated_snapshot",
  "validated_snapshot": {
    "blocks": 788496,
    "bestblockhash": "0000000000000000000091d6b178462227c1334c10f14c39255160c0fc82875b",
    "difficulty": 48005534313578.78,
    "verificationprogress": 0.01575630722365423,
    "snapshot_blockhash": "00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e",
    "initialblockdownload": false,
    "coins_db_cache_bytes": 8388608,
    "coins_tip_cache_bytes": 9401532416
  },
  "headers": 788496
}
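
For scripting, output like the JSON above can be condensed with a few lines of Python. This sketch assumes exactly the output shape shown in this comment (the RPC is hidden and its fields may differ in other builds); `summarize_chainstates` is a hypothetical helper, and the sample is a trimmed copy of the output above.

```python
import json


def summarize_chainstates(raw: str) -> str:
    """Summarize getchainstates output of the shape shown above."""
    info = json.loads(raw)
    active = info["active_chain_type"]   # e.g. "validated_snapshot"
    cs = info[active]                    # details for the active chainstate
    behind = info["headers"] - cs["blocks"]
    cache_mib = cs["coins_tip_cache_bytes"] / (1024 * 1024)
    return (
        f"{active}: height {cs['blocks']} ({behind} behind headers), "
        f"coinstip cache {cache_mib:.1f} MiB"
    )


sample = """
{
  "active_chain_type": "validated_snapshot",
  "validated_snapshot": {
    "blocks": 788496,
    "snapshot_blockhash": "00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e",
    "initialblockdownload": false,
    "coins_db_cache_bytes": 8388608,
    "coins_tip_cache_bytes": 9401532416
  },
  "headers": 788496
}
"""

summary = summarize_chainstates(sample)
```

The cache figure computed this way (8966.0 MiB) agrees with the "resized coinstip cache" line in the log above.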

@jamesob
Member Author

jamesob commented May 6, 2023

Thanks for testing @fanquake and thanks for the torrent @Sjors. I've pushed some fixes, rebased, and CI is green.

> Running through your steps above, everything seems to be working, except that my mempool was empty until I stopped and restarted bitcoind?

Fixed in 7cbbf2a - forgot to swap the m_mempool references on snapshot activation. I've verified that the mempool now starts to populate during background sync without a restart.

> If you restart the node in the middle of the bg sync, nChainTx for the snapshot chain will not properly repopulate

Fixed this in 5526788. Verified that getchainstates reports the snapshot chainstate progress as expected after restart.

@jamesob
Member Author

jamesob commented May 6, 2023

The history of the pull request is getting unwieldy; the 400+ comments are now creating a situation where comments (like the testing instructions) posted a few days ago are buried under a minute of clicking "load more."

I've added the testing instructions to the PR description, but should I consider opening a fresh PR? Do we have any process for dealing with this Github limitation?

@fanquake
Member

fanquake commented May 6, 2023

@jamesob I would be in favour of you opening a new PR (carrying over relevant current context into the description, and pointing back to anything else relevant). I ran into the same annoyance today, when trying to leave my most recent comment. Having to expand 400+ comments to check recent discussion/context is not great.

@Sjors
Member

Sjors commented May 6, 2023

I also ended up with an empty mempool with 2dd8ea0 on Ubuntu 23.04, with pruning enabled. Will try again once you believe that's fixed. When I (cleanly) shut down the node and restarted it, I noticed this log message: [snapshot] computing UTXO stats for background chainstate to validate snapshot - this could take a few minutes, followed by [snapshot] snapshot beginning at 00000000000000000001f3fa1b4c03c877740778f56b0d5456b18dd88f7f695e has been fully validated. After stopping and starting the node again, that message no longer appeared and things seem normal.

I also compared gettxoutsetinfo muhash for a recent and an old height to one of my other nodes, and they returned the same muhash, which suggests that indexes work.

@Sjors
Member

Sjors commented May 6, 2023

It's gotten to the point where even refreshing the page doesn't always get you the latest comments. +1 for opening a new one.

Meanwhile I'll recompile and do an unpruned run with assumevalid=0.

@Sjors
Member

Sjors commented May 6, 2023

On Ubuntu 23.04 (gcc 12.2.0) I get a bunch of these warnings, that I don't get on master:

In file included from ./wallet/wallet.h:10,
                 from wallet/dump.cpp:10:
./interfaces/chain.h:274:22: warning: ‘virtual void interfaces::Chain::Notifications::blockConnected(const interfaces::BlockInfo&)’ was hidden [-Woverloaded-virtual]
  274 |         virtual void blockConnected(const BlockInfo& block) {}

@jamesob jamesob mentioned this pull request May 8, 2023
@jamesob
Member Author

jamesob commented May 8, 2023

Closing this as replaced by #27596.

> I get a bunch of these warnings, that I don't get on master:

Thanks for spotting this @Sjors; one-line removal fixed in the new PR.

@jamesob jamesob closed this May 8, 2023
achow101 added a commit to bitcoin-core/gui that referenced this pull request Oct 2, 2023
edbed31 chainparams: add signet assumeutxo param at height 160_000 (Sjors Provoost)
b8cafe3 chainparams: add testnet assumeutxo param at height 2_500_000 (Sjors Provoost)
99839bb doc: add note about confusing HaveTxsDownloaded name (James O'Beirne)
7ee46a7 contrib: add script to demo/test assumeutxo (James O'Beirne)
42cae39 test: add feature_assumeutxo functional test (James O'Beirne)
0f64bac rpc: add getchainstates (James O'Beirne)
bb05857 refuse to activate a UTXO snapshot if mempool not empty (James O'Beirne)
ce585a9 rpc: add loadtxoutset (James O'Beirne)
62ac519 validation: do not activate snapshot if behind active chain (James O'Beirne)
9511fb3 validation: assumeutxo: swap m_mempool on snapshot activation (James O'Beirne)
7fcd215 blockstorage: segment normal/assumedvalid blockfiles (James O'Beirne)
4c3b8ca validation: populate nChainTx value for assumedvalid chainstates (James O'Beirne)
49ef778 test: adjust chainstate tests to use recognized snapshot base (James O'Beirne)
1019c39 validation: pruning for multiple chainstates (James O'Beirne)
373cf91 validation: indexing changes for assumeutxo (James O'Beirne)
1fffdd7 net_processing: validationinterface: ignore some events for bg chain (James O'Beirne)
fbe0a7d wallet: validationinterface: only handle active chain notifications (James O'Beirne)
f073917 validationinterface: only send zmq notifications for active (James O'Beirne)
4d8f4dc validation: pass ChainstateRole for validationinterface calls (James O'Beirne)
1e59acd validation: only call UpdatedBlockTip for active chainstate (James O'Beirne)
c6af23c validation: add ChainstateRole (James O'Beirne)
9f2318c validation: MaybeRebalanceCaches when chain leaves IBD (James O'Beirne)
434495a chainparams: add blockhash to AssumeutxoData (James O'Beirne)
c711ca1 assumeutxo: remove snapshot during -reindex{-chainstate} (James O'Beirne)
c93ef43 bugfix: correct is_snapshot_cs in VerifyDB (James O'Beirne)
b73d3bb net_processing: Request assumeutxo background chain blocks (Suhas Daftuar)

Pull request description:

  - Background and FAQ: https://github.com/jamesob/assumeutxo-docs/tree/2019-04-proposal/proposal
  - Prior progress/project: https://github.com/bitcoin/bitcoin/projects/11
  - Replaces bitcoin/bitcoin#15606, which was closed due to Github slowness. Original description and commentary can be found there.

  ---

  This changeset finishes the first phase of the assumeutxo project. It makes UTXO snapshots loadable via RPC (`loadtxoutset`) and adds `assumeutxo` parameters to chainparams. It contains all the remaining changes necessary to both use an assumedvalid snapshot chainstate and do a full validation sync in the background.

  This may look like a lot to review, but note that
  - ~200 lines are a (non-essential) demo shell script
  - Many lines are functional tests, documentation, and relatively dilute RPC code.

  So it shouldn't be as burdensome to review as the linecount might suggest.

  - **P2P**: minor changes are made to `init.cpp` and `net_processing.cpp` to make simultaneous IBD across multiple chainstates work.
  - **Pruning**: implement correct pruning behavior when using a background chainstate
  - **Blockfile separation**: to prevent "fragmentation" in blockfile storage, have background chainstates use separate blockfiles from active snapshot chainstates to avoid interleaving heights and impairing pruning.
  - **Indexing**: some `CValidationInterface` events are given with an additional parameter, ChainstateRole, and all indexers ignore events from ChainstateRole::ASSUMEDVALID so that indexation only happens sequentially.
  - Have `-reindex` properly wipe snapshot chainstates.
  - **RPC**: introduce RPC commands `loadtxoutset` and (hidden) `getchainstates`.
  - **Release docs & first assumeutxo commitment**: add notes and a particular assumeutxo hash value for first AU-enabled release.
    - This will complete the project and allow use of UTXO snapshots for faster node bootstrap.

  The next phase, if it were to be pursued, would be coming up with a way to distribute the UTXO snapshots over the P2P network.

  ---

  ### UTXO snapshots

  Create your own with `./contrib/devtools/utxo_snapshot.sh`, e.g.
  ```shell
  ./contrib/devtools/utxo_snapshot.sh 788000 utxo.dat ./src/bitcoin-cli -datadir=$(pwd)/testdata
  ```
  or use the pre-generated ones listed below.

  - Testnet: **2'500'000** (Sjors):
    - torrent: `magnet:?xt=urn:btih:511e09f4bf853aefab00de5c070b1e031f0ecbe9&dn=utxo-testnet-2500000.dat&tr=udp%3A%2F%2Ftracker.bitcoin.sprovoost.nl%3A6969`
    - sha256: `79db4b025448cc0ac388d8589a28eab02de53055d181e34eb47391717aa16388`
  - Signet: **160'000** (Sjors):
    - torrent: `magnet:?xt=urn:btih:9da986cb27b3980ea7fd06b21e199b148d486880&dn=utxo-signet-160000.dat&tr=udp%3A%2F%2Ftracker.bitcoin.sprovoost.nl%3A6969`
    - sha256: `eeeca845385ba91e84ef58c09d38f98f246a24feadaad57fe1e5874f3f92ef8c`
  - Mainnet: **800'000** (Sjors):
    - Note: this needs the following commit cherry-picked in: Sjors/bitcoin@24deb20
    - torrent: `magnet:?xt=urn:btih:50ee955bef37f5ec3e5b0df4cf0288af3d715a2e&dn=utxo-800000.dat&tr=udp%3A%2F%2Ftracker.bitcoin.sprovoost.nl%3A6969`

  ### Testing

  #### For fun (~5min)

  If you want to do a quick test, you can run `./contrib/devtools/test_utxo_snapshots.sh` and follow the instructions. This is mostly obviated by the functional tests, though.

  #### For real (longer)

  If you'd like to experience a real usage of assumeutxo, you can do that too.
  I've cut a new snapshot at height 788'000 (http://img.jameso.be/utxo-788000.dat - but you can do it yourself with `./contrib/devtools/utxo_snapshot.sh` if you want). Download that, and then create a datadir for testing:
  ```sh
  $ cd ~/src/bitcoin  # or whatever

  # get the snapshot
  $ curl http://img.jameso.be/utxo-788000.dat > utxo-788000.dat

  # you'll want to do this if you like copy/pasting
  $ export AU_DATADIR=/home/${USER}/au-test # or wherever

  $ mkdir ${AU_DATADIR}
  $ vim ${AU_DATADIR}/bitcoin.conf

  dbcache=8000  # or, you know, something high
  blockfilterindex=1
  coinstatsindex=1
  prune=3000
  logthreadnames=1
  ```
  Obtain this branch, build it, and then start bitcoind:
  ```sh
  $ git remote add jamesob https://github.com/jamesob/bitcoin
  $ git fetch jamesob assumeutxo
  $ git checkout jamesob/assumeutxo

  $ ./configure $conf_args && make  # (whatever you like to do here)

  # start 'er up and watch the logs
  $ ./src/bitcoind -datadir=${AU_DATADIR}
  ```
  Then, in some other window, load the snapshot
  ```sh
  $ ./src/bitcoin-cli -datadir=${AU_DATADIR} loadtxoutset $(pwd)/utxo-788000.dat
  ```

  You'll see some log messages about headers retrieval and waiting to see the snapshot in the headers chain. Once you get the full headers chain, you'll spend a decent amount of time (~10min) loading the snapshot, checking it, and flushing it to disk. After all that happens, you should be syncing to tip in pretty short order, and you'll see the occasional `[background validation]` log message go by.

  In yet another window, you can check out chainstate status with
  ```sh
  $ ./src/bitcoin-cli -datadir=${AU_DATADIR} getchainstates
  ```
  as well as usual favorites like `getblockchaininfo`.

ACKs for top commit:
  achow101:
    ACK edbed31

Tree-SHA512: 6086fb9a38dc7df85fedc76b30084dd8154617a2a91e89a84fb41326d34ef8e7d7ea593107afba01369093bf8cc91770621d98f0ea42a5b3b99db868d2f14dc2
@bitcoin bitcoin locked and limited conversation to collaborators May 7, 2024