New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

blockchain: Reverse utxo set semantics. #1471

Merged
merged 1 commit into from Nov 12, 2018

Conversation

Projects
None yet
6 participants
@davecgh
Copy link
Member

davecgh commented Sep 27, 2018

NOTE: This PR currently requires a full chain download. If you plan to test it, please make a copy of your existing data directory first.


Testing Notes

This PR does not yet have database migration code and is completely incompatible with the current v4 database format. Thus a full chain download is required. dcrd will exit with error in the case you attempt to run it against an existing v4 database and the new database it creates will be v5, so if you try to run an older software version against the new database, you will get an error message similar to Unable to start server on [:9108]: the current blockchain database is no longer compatible with this version of the software (5 > 4).

The migration code will be added via a separate PR.

This is the biggest change to dcrd to date and, as such, will need a significant amount of testing and careful review.


This modifies the way the unspent transaction output set is handled to reverse its current semantics so that it is optimized for the typical case, provides simpler handling, and resolves various issues with the previous approach. In addition, it updates the transaction, address, and existsaddress indexes to no longer remove entries from blocks that have been disapproved as, in all cases, the data still exists in the blockchain and thus should be queryable via the indexes even though there is special handling applied which treats them as if they did not exist in certain regards.

Prior to this change, transactions in the regular tree were not applied to the utxo set until the next block was processed and did not vote against them. However, that approach has several undesirable consequences such as temporarily "invisible" utxos that are actually spendable, disapproved transactions missing from indexes even though they are still in the blockchain, and poor performance characteristics.

In a certain sense, the previous approach could be viewed as the transactions not being valid until they were approved, however, that is not really true because it was (and still is) perfectly acceptable to spend utxos created by transactions in the regular tree of the same block so long as they come before the transactions that spend them. Further, utxos from a transaction in the regular tree of a block can be spent in the next block so long as that block does not disapprove them, which further illustrates that the utxos are actually valid unless they are disapproved.

Consequently, this modifies that behavior to instead make the utxo set always track the most recent block and remove the regular transactions in the parent when a block votes against them. This approach is significantly more efficient for the normal case where the previous block is not disapproved by its successor.

Also, the terminology is changed in several places to refer to disapproved blocks and transaction trees as opposed to invalid, because invalid implies the tree/block is malformed or does not follow the consensus rules. On the contrary, when a block votes against its parent, it is only voting against regular transaction tree of the parent. Both the block and transaction tree are still valid in that case, only the regular transaction tree is treated as if it never existed in terms of effects on the utxo set and duplicate transaction semantics.

High level overview of changes:

  • Modify the utxo viewpoint to reverse semantics as previously described
    • Remove all code related to stake viewpoints
    • Change all block connection code in the viewpoint to first undo all
      transactions in the regular tree of the parent block if the current
      one disapproves it then connect all of the stake txns followed by
      the regular transactions in the block
      • NOTE: The order here is important since stake transactions are not
        allowed to spend outputs from the regular transactions in the same
        block as the next block might disapprove them
    • Change all block disconnection code in the viewpoint to first undo
      all the transactions in the regular and stake trees of the block
      being disconnected, and then resurrect the regular transactions in
      the parent block if the block being disconnected disapproved of it
    • Introduce a new type named viewFilteredSet for handling sets
      filtered by transactions that already exist in a view
    • Introduce a function on the viewpoint for specifically fetching the
      inputs to the regular transactions
    • Update mempool block connection and disconnection code to match the
      new semantics
    • Update all tests to handle the new semantics
  • Modify the best state number of transactions to include all
    transactions in all blocks regardless of disapproval because they
    still had to be processed and still exist in the blockchain
  • Remove include recent block parameter from mempool.FetchTransaction
    since the utxoset now always includes the latest block
    • This also has the side effect of correcting some unexpected results
      such as coinbases in the most recent block being incorrectly
      reported as having zero confirmations
  • Modify mempool utxo fetch logic to use a cached disapproved view, when
    needed, rather than recreating the view for every new transaction
    added to it
  • Update spend journal to include all transactions in the block instead
    of only stake transactions from the current block and regular
    transactions from the parent block
  • Modify tx and address indexes to store the block index of each tx
    along with its location within the files and update the query
    functions to return the information as well
  • Change the tx, address, and existsaddress indexes to index all
    transactions regardless of their disapproval
    • This also corrects several issues such as the inability to query and
      retrieve transactions that exist in a disapproved block
  • Update all RPC commands that return verbose transaction information
    to set that newly available block index information properly
  • Rename IsRegTxTreeKnownDisapproved in the mining.TxSource interface to
    IsRegTxTreeKnownDisapproved
    • NOTE: This will require a major bump to the mining module before
      the next release
  • Rename several utxoView instances to view for consistency
  • Rename several variables that dealt with disapproved trees from
    names that contained Invalid to ones that contain Disapproved

This is work towards #1145.
Fixes #618.

@davecgh davecgh added this to the 1.4.0 milestone Sep 27, 2018

@davecgh davecgh force-pushed the davecgh:blockchain_reverse_utxo_semantics branch from 9f5c227 to f28960c Sep 27, 2018

@chappjc

This comment has been minimized.

Copy link
Member

chappjc commented Sep 27, 2018

Just completed a full block download successfully, in almost exactly 30 minutes with a single LAN peer. I will be testing a fresh sync of dcrdata with this.

@davecgh

This comment has been minimized.

Copy link
Member Author

davecgh commented Sep 27, 2018

Some additional details on one of the issues this also fixes:

Before:

$ dcrctl getbestblockhash | dcrctl getblock - | jq -r .tx[0] | dcrctl getrawtransaction - 1
{
  "txid": "0a1a046784c0cbe2e639dc8b79debe2def79b99e2f4ac8a6b055271a77843925",
...
 "vin": [
...
 ],
"vout": [
...
 ],
...
 "blockheight": 0               <----- This is a bug fixed by this PR
...
}

After:

$ dcrctl getbestblockhash | dcrctl getblock - | jq -r .tx[0] | dcrctl getrawtransaction - 1
{
  "txid": "0a1a046784c0cbe2e639dc8b79debe2def79b99e2f4ac8a6b055271a77843925",
...
  "vin": [
...
  ],
  "vout": [
...
  ],
...
  "blockheight": 278164,
...
}
@davecgh

This comment has been minimized.

Copy link
Member Author

davecgh commented Sep 27, 2018

Some additional details on another one of the issues this also fixes:

Before:

$ dcrctl getbestblockhash | dcrctl getblock - | jq -r .tx[0] | dcrctl gettxout - 0
{
  "bestblock": "0000000000000000396d8aba7c7c706562c4b962608812d69347af8cd2e05a94",
  "confirmations": 0,                       <----- This is a bug fixed by this PR
...
  "coinbase": true
}

After:

$ dcrctl getbestblockhash | dcrctl getblock - | jq -r .tx[0] | dcrctl gettxout - 0
{
  "bestblock": "0000000000000000396d8aba7c7c706562c4b962608812d69347af8cd2e05a94",
  "confirmations": 1,
...
  "coinbase": true
}
@davecgh

This comment has been minimized.

Copy link
Member Author

davecgh commented Sep 27, 2018

Some additional details on a third issue this also fixes:

Before:

$ dcrctl getbestblockhash | dcrctl getblock - | jq -r .tx[1] | dcrctl getrawtransaction - 1 | jq .vin
[
 {
   "txid": "211eec2881df9364ddbaf9cf0fd73cb4a43718ac46fe8fd20c0b3e4ba5d16856",
   ...
   "amountin": -1e-08,                         <--- Not filled in despite being mined
   "blockheight": 0,                           <--- Not filled in despite being mined
   "blockindex": 4294967295,                   <--- Not filled in despite being mined
   ...
 },
 ...
]

After:

$ dcrctl getbestblockhash | dcrctl getblock - | jq -r .tx[1] | dcrctl getrawtransaction - 1 | jq .vin
[
 {
   "txid": "211eec2881df9364ddbaf9cf0fd73cb4a43718ac46fe8fd20c0b3e4ba5d16856",
   ...
   "amountin": 94.24934123,
   "blockheight": 278198,
   "blockindex": 2,
   ...
 },
 ...
]
@matheusd
Copy link
Member

matheusd left a comment

First pass review, only a few initial comments.

Show resolved Hide resolved blockchain/indexers/txindex.go
Show resolved Hide resolved blockchain/utxoviewpoint.go Outdated
Show resolved Hide resolved blockchain/utxoviewpoint.go Outdated
Show resolved Hide resolved blockchain/utxoviewpoint.go
Show resolved Hide resolved blockchain/indexers/txindex.go Outdated
Show resolved Hide resolved blockchain/utxoviewpoint.go Outdated
Show resolved Hide resolved blockchain/utxoviewpoint.go Outdated
Show resolved Hide resolved blockchain/utxoviewpoint.go Outdated

@davecgh davecgh force-pushed the davecgh:blockchain_reverse_utxo_semantics branch 2 times, most recently from c882fcc to 453f4b4 Sep 28, 2018

Show resolved Hide resolved blockchain/indexers/existsaddrindex.go Outdated
Show resolved Hide resolved blockchain/indexers/manager.go
Show resolved Hide resolved blockchain/utxoviewpoint.go Outdated
Show resolved Hide resolved blockchain/indexers/txindex.go Outdated
Show resolved Hide resolved blockchain/indexers/txindex.go
Show resolved Hide resolved blockchain/utxoviewpoint.go Outdated
Show resolved Hide resolved blockchain/utxoviewpoint.go Outdated
Show resolved Hide resolved blockchain/utxoviewpoint.go
Show resolved Hide resolved blockchain/utxoviewpoint.go Outdated
Show resolved Hide resolved blockchain/utxoviewpoint.go Outdated

@davecgh davecgh force-pushed the davecgh:blockchain_reverse_utxo_semantics branch from 453f4b4 to c9f67d8 Sep 28, 2018

@davecgh

This comment has been minimized.

Copy link
Member Author

davecgh commented Oct 1, 2018

Some additional details on a fourth issue this also fixes:

Before:

$ dcrctl getbestblockhash | dcrctl getblock - | jq -r .tx[1] | dcrctl getrawtransaction - 1 | jq -r '.blockhash'
null    <---- The block hash is not filled in despite being mined

After:

$ dcrctl getbestblockhash | dcrctl getblock - | jq -r .tx[1] | dcrctl getrawtransaction - 1 | jq -r '.blockhash'
0000000000000000b76446ea746e2a48de8f1d43cb4674725e37d044865ff800
@matheusd
Copy link
Member

matheusd left a comment

Performing functional testing in simnet now.

Found 2 issues so far

  • Txs not being re-inserted into mempool after disapproving blocks due to a small bug in blockmanager (noted in the code)
  • I'm triggering an "assertion failed: missing spend journal data" on reorgs (explanation below)

I'm using this script to test reorgs: https://gist.github.com/matheusd/73907f551b21104bc3dea1af7d8bc385

This is the procedure triggering the issue (btw I'm testing with the first bug corrected as noted in the code review):

  • Wallets configured to vote false on blocks with blockheight % 10 >= 5 && <= 7
  • Mine on nodes until after SVH and up to the first block voted false in a streak (eg: 165)
  • Disconnect nodes
  • Mine on block beta causing a fork at block 165. Mine on beta until block 167.
  • Reconnect nodes
  • Node Alpha receives the new blocks with higher pow, causing a reorg and triggering the error

I've recorded a video describing the error, can send it to you if needed.

Show resolved Hide resolved blockmanager.go Outdated
Show resolved Hide resolved blockmanager.go Outdated

@davecgh davecgh force-pushed the davecgh:blockchain_reverse_utxo_semantics branch from 35e02c7 to 70c848b Oct 8, 2018

@davecgh

This comment has been minimized.

Copy link
Member Author

davecgh commented Oct 8, 2018

The issue regarding reorgs of disapproved blocks as reported by @matheusd has been corrected. In addition, tests for the scenario that triggered it have been added via PR #1485 so the scenario is proven to work the same both before and after this PR.

@davecgh

This comment has been minimized.

Copy link
Member Author

davecgh commented Oct 8, 2018

All review items until this point have been addressed.

@davecgh davecgh force-pushed the davecgh:blockchain_reverse_utxo_semantics branch from 70c848b to d7a4829 Oct 8, 2018

@matheusd
Copy link
Member

matheusd left a comment

Re-tested after fixes and didn't hit the error anymore. Tested again the utxo status changes and mempool behavior on disapproved blocks and also didn't see any new problems.

Super nice to have these changes in!

@davecgh davecgh force-pushed the davecgh:blockchain_reverse_utxo_semantics branch from d7a4829 to db686ba Oct 10, 2018

@davecgh

This comment has been minimized.

Copy link
Member Author

davecgh commented Oct 10, 2018

Rebased to deal with conflicts with latest master.

@davecgh

This comment has been minimized.

Copy link
Member Author

davecgh commented Oct 16, 2018

Some timing information for the large full block reorg tests with and without these changes:

without changes: 3m21.801s
with changes:    2m18.629s

@davecgh davecgh force-pushed the davecgh:blockchain_reverse_utxo_semantics branch from db686ba to dbea7b2 Oct 16, 2018

@davecgh

This comment has been minimized.

Copy link
Member Author

davecgh commented Oct 16, 2018

Rebased to handle merge conflicts with latest master and squashed all approved review updates.

blockchain: Reverse utxo set semantics.
This modifies the way the unspent transaction output set is handled to
reverse its current semantics so that it is optimized for the typical
case, provides simpler handling, and resolves various issues with the
previous approach.  In addition, it updates the transaction, address,
and existsaddress indexes to no longer remove entries from blocks that
have been disapproved as, in all cases, the data still exists in the
blockchain and thus should be queryable via the indexes even though
there is special handling applied which treats them as if they did not
exist in certain regards.

Prior to this change, transactions in the regular tree were not applied
to the utxo set until the next block was processed and did not vote
against them.  However, that approach has several undesirable
consequences such as temporarily "invisible" utxos that are actually
spendable, disapproved transactions missing from indexes even though
they are still in the blockchain, and poor performance characteristics.

In a certain sense, the previous approach could be viewed as the
transactions not being valid until they were approved, however, that is
not really true because it was (and still is) perfectly acceptable to
spend utxos created by transactions in the regular tree of the same
block so long as they come before the transactions that spend them.
Further, utxos from a transaction in the regular tree of a block can be
spent in the next block so long as that block does not disapprove them,
which further illustrates that the utxos are actually valid unless they
are disapproved.

Consequently, this modifies that behavior to instead make the utxo set
always track the most recent block and remove the regular transactions
in the parent when a block votes against them.  This approach is
significantly more efficient for the normal case where the previous
block is not disapproved by its successor.

Also, the terminology is changed in several places to refer to
disapproved blocks and transaction trees as opposed to invalid, because
invalid implies the tree/block is malformed or does not follow the
consensus rules.  On the contrary, when a block votes against its
parent, it is only voting against regular transaction tree of the
parent.  Both the block and transaction tree are still valid in that
case, only the regular transaction tree is treated as if it never
existed in terms of effects on the utxo set and duplicate transaction
semantics.

High level overview of changes:
- Modify the utxo viewpoint to reverse semantics as previously described
  - Remove all code related to stake viewpoints
  - Change all block connection code in the viewpoint to first undo all
    transactions in the regular tree of the parent block if the current
    one disapproves it then connect all of the stake txns followed by
    the regular transactions in the block
    - NOTE: The order here is important since stake transactions are not
      allowed to spend outputs from the regular transactions in the same
      block as the next block might disapprove them
  - Change all block disconnection code in the viewpoint to first undo
    all the transactions in the regular and stake trees of the block
    being disconnected, and then resurrect the regular transactions in
    the parent block if the block being disconnected disapproved of it
  - Introduce a new type named viewFilteredSet for handling sets
    filtered by transactions that already exist in a view
  - Introduce a function on the viewpoint for specifically fetching the
    inputs to the regular transactions
  - Update mempool block connection and disconnection code to match the
    new semantics
  - Update all tests to handle the new semantics
- Modify the best state number of transactions to include all
  transactions in all blocks regardless of disapproval because they
  still had to be processed and still exist in the blockchain
- Remove include recent block parameter from mempool.FetchTransaction
  since the utxoset now always includes the latest block
  - This also has the side effect of correcting some unexpected results
    such as coinbases in the most recent block being incorrectly
    reported as having zero confirmations
- Modify mempool utxo fetch logic to use a cached disapproved view, when
  needed, rather than recreating the view for every new transaction
  added to it
- Update spend journal to include all transactions in the block instead
  of only stake transactions from the current block and regular
  transactions from the parent block
- Modify tx and address indexes to store the block index of each tx
  along with its location within the files and update the query
  functions to return the information as well
- Change the tx, address, and existsaddress indexes to index all
  transactions regardless of their disapproval
  - This also corrects several issues such as the inability to query and
    retrieve transactions that exist in a disapproved block
- Update all RPC commands that return verbose transaction information
  to set that newly available block index information properly
- Rename IsRegTxTreeKnownDisapproved in the mining.TxSource interface to
  IsRegTxTreeKnownDisapproved
  - NOTE: This will require a major bump to the mining module before
    the next release
- Rename several utxoView instances to view for consistency
- Rename several variables that dealt with disapproved trees from
  names that contained Invalid to ones that contain Disapproved

NOTE: This does not yet have database migration code and thus will
require a full chain download.  It will exit with error in the case you
attempt to run it against an existing v4 database.  The new database it
creates will be v5, so attempting to run an older version will reject
the new database to prevent corruption.

The database migration will be added in a separate commit.

@davecgh davecgh force-pushed the davecgh:blockchain_reverse_utxo_semantics branch from dbea7b2 to df1898c Nov 9, 2018

@davecgh davecgh changed the title WIP: blockchain: Reverse utxo set semantics. blockchain: Reverse utxo set semantics. Nov 9, 2018

@davecgh

This comment has been minimized.

Copy link
Member Author

davecgh commented Nov 9, 2018

This is ready for final review and approvals.

@dnldd

dnldd approved these changes Nov 10, 2018

@chappjc

This comment has been minimized.

Copy link
Member

chappjc commented Nov 12, 2018

This branch has been in use on my dev machine and on alpha.dcrdata.org for weeks without issue.

@dajohi

dajohi approved these changes Nov 12, 2018

Copy link
Member

dajohi left a comment

ok

@davecgh davecgh merged commit df1898c into decred:master Nov 12, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@davecgh davecgh deleted the davecgh:blockchain_reverse_utxo_semantics branch Nov 12, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment