Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Implement a UTXO cache #1168

Closed
wants to merge 4 commits into from

Conversation

stevenroose
Copy link
Contributor

@stevenroose stevenroose commented Apr 21, 2018

This PR requires #1045. Ignore the commit(s) from #1045 when reviewing this PR.

This PR adds an in-memory UTXO cache. UtxoViewpoint is trimmed of the block (dis)connection methods, as they now can be applied also to the cache (for reverting and forwarding blocks).

The database keeps track of the consistency of the UTXO state with the rest of the chainstate (block index and blocks). When the node starts with the UTXO state lagging behind, it replays blocks to catch up. Since flushing the cache cannot be done in a single LevelDB transaction, it is done in batches. Before a flush starts, the db is updated to indicate a flush is ongoing. When the node shuts down in the middle of a flush and restarts, it does not know where it left of with flushing. Therefore, it first rolls back changes that had already been applied and then replays all blocks to catch up with the chain state.

Yet to be done:

  • Work out a flushing strategy. Flush inside cache when over limit?
  • Consider somehow merging UtxoViewpoint and utxoCache in a cachedUtxoView.

Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevenroose great work on this! I took an initial pass at review.

I tried running locally, and indeed there does seem to be an improvement in initial sync. I synced up testnet to about 700k so w/ no errors.

On mainnet, I found that the current PR reproducibly fails and falls out of consensus around block 93k. It seems one of the entries is not properly held in the cache, here are some info logs https://pastebin.com/4cN8XnZQ
Lmk if you want me to crank up the debug level.

@@ -785,6 +801,12 @@ func (b *BlockChain) disconnectBlock(node *blockNode, block *btcutil.Block, view
b.sendNotification(NTBlockDisconnected, block)
b.chainLock.Lock()

// Since we just changed the UTXO cache, we make sure it didn't excee its
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/excee/exceed/

@@ -673,6 +691,12 @@ func (b *BlockChain) connectBlock(node *blockNode, block *btcutil.Block, view *U
b.sendNotification(NTBlockConnected, block)
b.chainLock.Lock()

// Since we just changed the UTXO cache, we make sure it didn't excee its
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/excee/exceed

// It returns 0 for the nil element.
func (entry *UtxoEntry) memoryUsage() uint64 {
if entry == nil {
return 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should return 8, as a nil ptr isn't free :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize nil's aren't free. At first I thought it wouldn't work easy with adding and removing them, but I think right now it might. Let me make sure. Thanks for starting to look at this, though!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might make sense to instead have totalMemoryUsage add 8 bytes per entry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went for that approach. Let me know if you think it makes more sense to add the 8 here instead of in the other method.

// later lookups for the entry know that the entry does not exist in the
// database.
s.cachedEntries[outpoint] = nil
s.totalEntryMemory -= entry.memoryUsage()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add back 8-bytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment.

func (s *utxoCache) FetchEntryByHash(hash *chainhash.Hash) (*UtxoEntry, error) {
s.mtx.Lock()
entry, err := s.getEntryByHash(hash)
defer s.mtx.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to defer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

// This method should be called with the state lock held.
func (s *utxoCache) addEntry(outpoint wire.OutPoint, entry *UtxoEntry, overwrite bool) error {
// Don't add provably unspendable outputs.
if txscript.IsUnspendable(entry.pkScript) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

func dbPutUtxoView(dbTx database.Tx, view *UtxoViewpoint) error {
// dbPutUtxoEntry uses an existing database transaction to update the utxo entry
// in the database.
func dbPutUtxoEntry(dbTx database.Tx, outpoint wire.OutPoint, entry *UtxoEntry) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider batching this, so that it can write an entire map[wire.OutPoint]*UtxoEntry, similar comment for deletion.

in profiling I found that the call to retrieve the utxoset bucket was taking as much time as writing the elements, so batching should amortize this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so retrieving the bucket is costly? What about just having the bucket in a variable (using sync.Once to make it safe)?

I could batch it, though.

return flushBatch(dbTx)
})
if err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

roll back to latest stable state?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That error means a disk error of some sort. Likely that all further IO operations would also error. Besides, there is logic to recover on startup when shut down in the middle of a flush.

}

// newUtxoState initiates a new utxo state instance.
func newUtxoState(db database.DB, maxTotalMemoryUsage uint64) *utxoCache {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

godoc needs update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True that. Had a totally different implementation at first that was called utxoState :)

func (entry *UtxoEntry) Spend() {
// Nothing to do if the output is already spent.
if entry.IsSpent() {
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it could be dangerous, as we would silently "accept" a double spend. Are there cases where we can safely call Spend() more than once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's only called in the methods that have overwrite=true, which are used to roll forward changes when the state was left in the middle of a flush. They might be on the disk in a spent state, and while "force roll forward" blocks aka applying them with overwrite=true, they could be marked spent again.

I do recognize that the method might be better renamed to MarkSpent of some kind, so that it's more clear it doesn't actually "spend" the entry. That would break the existing public interface, though.

@cfromknecht
Copy link
Contributor

I followed up last night and tried to get debug logs for this event, but it seems the debug logs slowed it down enough to get past the 93k block range. This seems like it would point to some kind of concurrency issue.

I tried the cache with various sizes and found that 100MB worked pretty well OMM. I was getting 100-200 blocks/10s well into 2014.

When I had the cache size cranked up to 8GB, 85% of the runtime was spent garbage collecting. I think this was being caused by the map itself as it continually grows, as well as the serialization of output scripts in BtcDecode.

Here are some profiles with the 100MB cache w/ modest internet connection:

Showing nodes accounting for 316.29MB, 82.24% of 384.58MB total                                                                                                
Dropped 57 nodes (cum <= 1.92MB)                                                                                                                               
Showing top 50 nodes out of 119                                                                                                                                
      flat  flat%   sum%        cum   cum%   
         0     0%     0%   254.76MB 66.24%  github.com/btcsuite/btcd/netsync.(*SyncManager).blockHandler
         0     0%     0%   253.60MB 65.94%  github.com/btcsuite/btcd/blockchain.(*BlockChain).ProcessBlock
         0     0%     0%   253.60MB 65.94%  github.com/btcsuite/btcd/blockchain.(*BlockChain).maybeAcceptBlock
         0     0%     0%   253.60MB 65.94%  github.com/btcsuite/btcd/netsync.(*SyncManager).handleBlockMsg
         0     0%     0%   252.10MB 65.55%  github.com/btcsuite/btcd/blockchain.(*BlockChain).connectBestChain
         0     0%     0%   227.78MB 59.23%  github.com/btcsuite/btcd/blockchain.(*BlockChain).connectBlock
         0     0%     0%   151.27MB 39.33%  github.com/btcsuite/btcd/blockchain.(*utxoCache).Commit
  141.77MB 36.86% 36.86%   141.77MB 36.86%  github.com/btcsuite/btcd/blockchain.(*utxoCache).addEntry
         0     0% 36.86%    97.50MB 25.35%  github.com/btcsuite/btcd/database/ffldb.(*db).View
         0     0% 36.86%    96.52MB 25.10%  runtime.main
         0     0% 36.86%    95.41MB 24.81%  main.btcdMain
         0     0% 36.86%    95.41MB 24.81%  main.main
         0     0% 36.86%    91.41MB 23.77%  main.newServer
         0     0% 36.86%    79.19MB 20.59%  github.com/btcsuite/btcd/blockchain.(*BlockChain).initChainState
   30.88MB  8.03% 44.89%    79.19MB 20.59%  github.com/btcsuite/btcd/blockchain.(*BlockChain).initChainState.func2
         0     0% 44.89%    79.19MB 20.59%  github.com/btcsuite/btcd/blockchain.New
         0     0% 44.89%    74.73MB 19.43%  github.com/btcsuite/btcd/database/ffldb.(*db).Update
         0     0% 44.89%       61MB 15.86%  github.com/btcsuite/btcd/blockchain.(*utxoCache).Flush
         0     0% 44.89%       61MB 15.86%  github.com/btcsuite/btcd/blockchain.(*utxoCache).flush
         0     0% 44.89%       42MB 10.92%  github.com/btcsuite/btcd/database/ffldb.(*dbCache).commitTx
         0     0% 44.89%       42MB 10.92%  github.com/btcsuite/btcd/database/ffldb.(*transaction).Commit
         0     0% 44.89%       42MB 10.92%  github.com/btcsuite/btcd/database/ffldb.(*transaction).writePendingAndCommit
         0     0% 44.89%       42MB 10.92%  github.com/btcsuite/btcd/database/internal/treap.(*Immutable).Put
         0     0% 44.89%       42MB 10.92%  github.com/btcsuite/btcd/database/internal/treap.(*Mutable).ForEach
      37MB  9.62% 54.51%       37MB  9.62%  github.com/btcsuite/btcd/database/internal/treap.cloneTreapNode (inline)
   12.50MB  3.25% 57.76%       26MB  6.76%  github.com/btcsuite/btcd/blockchain.CalcWork
         0     0% 57.76%       26MB  6.76%  github.com/btcsuite/btcd/blockchain.initBlockNode
         0     0% 57.76%    25.41MB  6.61%  github.com/btcsuite/btcd/peer.(*Peer).inHandler
         0     0% 57.76%       25MB  6.50%  github.com/btcsuite/btcd/blockchain.(*utxoCache).flush.func2
         0     0% 57.76%       25MB  6.50%  github.com/btcsuite/btcd/blockchain.(*utxoCache).flush.func3
         0     0% 57.76%    24.79MB  6.45%  github.com/btcsuite/btcd/peer.(*Peer).readMessage
    0.62MB  0.16% 57.92%    24.79MB  6.45%  github.com/btcsuite/btcd/wire.ReadMessageWithEncodingN
         0     0% 57.92%    24.31MB  6.32%  github.com/btcsuite/btcd/blockchain.(*UtxoViewpoint).addInputUtxos
       6MB  1.56% 59.48%    24.31MB  6.32%  github.com/btcsuite/btcd/blockchain.(*utxoCache).fetchAndCacheEntry
         0     0% 59.48%    24.31MB  6.32%  github.com/btcsuite/btcd/blockchain.(*utxoCache).getEntry
         0     0% 59.48%    24.17MB  6.29%  github.com/btcsuite/btcd/wire.(*MsgBlock).BtcDecode
   22.17MB  5.77% 65.25%    24.17MB  6.29%  github.com/btcsuite/btcd/wire.(*MsgTx).BtcDecode
   22.31MB  5.80% 71.05%    22.31MB  5.80%  github.com/btcsuite/btcd/blockchain.(*blockIndex).addNode (inline)
         0     0% 71.05%    22.04MB  5.73%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).readRawBlock
   22.04MB  5.73% 76.78%    22.04MB  5.73%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/util.(*BufferPool).Get
         0     0% 76.78%    21.50MB  5.59%  github.com/btcsuite/btcd/database/ffldb.(*dbCache).commitTx.func1
      21MB  5.46% 82.24%       21MB  5.46%  github.com/btcsuite/btcd/database/ffldb.bucketizedKey (inline)
         0     0% 82.24%    20.50MB  5.33%  github.com/btcsuite/btcd/database/ffldb.(*dbCache).commitTx.func2
         0     0% 82.24%    18.31MB  4.76%  github.com/btcsuite/btcd/blockchain.(*utxoCache).fetchAndCacheEntry.func1
         0     0% 82.24%    18.31MB  4.76%  github.com/btcsuite/btcd/blockchain.dbFetchUtxoEntry
         0     0% 82.24%    17.81MB  4.63%  github.com/btcsuite/btcd/database/ffldb.(*bucket).Get
         0     0% 82.24%    17.81MB  4.63%  github.com/btcsuite/btcd/database/ffldb.(*dbCacheSnapshot).Get
         0     0% 82.24%    17.81MB  4.63%  github.com/btcsuite/btcd/database/ffldb.(*transaction).fetchKey
         0     0% 82.24%    17.81MB  4.63%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*DB).get
         0     0% 82.24%    17.81MB  4.63%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*Snapshot).Get
Showing nodes accounting for 59.33s, 81.27% of 73s total
Dropped 436 nodes (cum <= 0.36s)
Showing top 50 nodes out of 130
      flat  flat%   sum%        cum   cum%
     0.02s 0.027% 0.027%     44.10s 60.41%  runtime.systemstack
         0     0% 0.027%     40.22s 55.10%  runtime.gcBgMarkWorker
     0.40s  0.55%  0.58%     40.06s 54.88%  runtime.gcDrain
         0     0%  0.58%     40.05s 54.86%  runtime.gcBgMarkWorker.func2
    14.57s 19.96% 20.53%     30.55s 41.85%  runtime.scanobject
         0     0% 20.53%     22.12s 30.30%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*DB).tCompaction
         0     0% 20.53%     22.12s 30.30%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*DB).tableAutoCompaction
         0     0% 20.53%     22.12s 30.30%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*DB).tableCompaction
         0     0% 20.53%     21.98s 30.11%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*DB).compactionTransact
     0.03s 0.041% 20.58%     21.84s 29.92%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*tableCompactionBuilder).run
         0     0% 20.58%     13.93s 19.08%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).readRawBlock
         0     0% 20.58%     13.36s 18.30%  internal/poll.(*FD).Pread
         0     0% 20.58%     13.36s 18.30%  os.(*File).ReadAt
         0     0% 20.58%     13.36s 18.30%  os.(*File).pread
    13.29s 18.21% 38.78%     13.29s 18.21%  syscall.Syscall6
         0     0% 38.78%     13.22s 18.11%  syscall.Pread
         0     0% 38.78%     13.11s 17.96%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).readBlockCached
         0     0% 38.78%     13.09s 17.93%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).readBlock
         0     0% 38.78%     12.35s 16.92%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).getDataIter
         0     0% 38.78%     11.42s 15.64%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/iterator.(*mergedIterator).Next
         0     0% 38.78%     11.29s 15.47%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/iterator.(*indexedIterator).Next
         0     0% 38.78%     11.21s 15.36%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/iterator.(*indexedIterator).setData
     0.01s 0.014% 38.79%     11.07s 15.16%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*indexIter).Get
         0     0% 38.79%     11.06s 15.15%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).getDataIterErr
         0     0% 38.79%     10.15s 13.90%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*tableCompactionBuilder).appendKV
     0.01s 0.014% 38.81%     10.11s 13.85%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*tWriter).append
         0     0% 38.81%     10.08s 13.81%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Writer).Append
     8.68s 11.89% 50.70%      8.68s 11.89%  syscall.Syscall
         0     0% 50.70%      8.60s 11.78%  internal/poll.(*FD).Write
         0     0% 50.70%      8.60s 11.78%  os.(*File).Write
         0     0% 50.70%      8.60s 11.78%  os.(*File).write
     0.01s 0.014% 50.71%      8.49s 11.63%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Writer).finishBlock
         0     0% 50.71%      8.48s 11.62%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Writer).writeBlock
     0.01s 0.014% 50.73%      8.39s 11.49%  syscall.Write
         0     0% 50.73%      8.38s 11.48%  syscall.write
     7.89s 10.81% 61.53%      8.08s 11.07%  runtime.heapBitsForObject
     0.52s  0.71% 62.25%      7.50s 10.27%  runtime.netpoll
     7.02s  9.62% 71.86%      7.02s  9.62%  runtime.kevent
     0.01s 0.014% 71.88%      6.50s  8.90%  runtime.pollWork
         0     0% 71.88%      6.10s  8.36%  github.com/btcsuite/btcd/blockchain.(*BlockChain).ProcessBlock
         0     0% 71.88%      6.10s  8.36%  github.com/btcsuite/btcd/netsync.(*SyncManager).blockHandler
         0     0% 71.88%      6.10s  8.36%  github.com/btcsuite/btcd/netsync.(*SyncManager).handleBlockMsg
         0     0% 71.88%      5.25s  7.19%  github.com/btcsuite/btcd/blockchain.(*BlockChain).maybeAcceptBlock
         0     0% 71.88%      5.18s  7.10%  github.com/btcsuite/btcd/blockchain.(*BlockChain).connectBestChain
     4.54s  6.22% 78.10%      4.54s  6.22%  runtime.memmove
         0     0% 78.10%      4.41s  6.04%  runtime.(*gcWork).balance
         0     0% 78.10%      4.41s  6.04%  runtime.handoff
     2.13s  2.92% 81.01%      4.13s  5.66%  runtime.greyobject
     0.19s  0.26% 81.27%      3.41s  4.67%  runtime.mallocgc
         0     0% 81.27%      3.12s  4.27%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*version).get

@cfromknecht
Copy link
Contributor

As a feature request, it would also be great if the log msgs printed out the current size of utxocache :)

@stevenroose
Copy link
Contributor Author

I bumped in the fastAdd set in the side chain case? thing as well. It's been more than a month since I was testing that, though, don't remember very well the current state of my progress :/ (I'm changing jobs and moving to another country, so don't have much time until probably September.)

If you're interested in following up on this, I can give you commit access to my repo so you can push to the branch. I'll push changes regarding your remarks right now though.

@stevenroose
Copy link
Contributor Author

I noticed that I was tracking down a specific bug. You can see that one of the unit tests fails. I was baffled by what was happening there: the result of that test are not deterministic while they definitely should be. It should end with 5 elements in the cache and ends with either 3 or 4.

These are the extra prints that I added to get a better view on what happens:
https://gist.github.com/stevenroose/d8350a13ebdfeb8fd8d2161ebf5011a5

@stevenroose
Copy link
Contributor Author

stevenroose commented Jun 15, 2018

As a feature request, it would also be great if the log msgs printed out the current size of utxocache :)

Done.

@zquestz
Copy link

zquestz commented Nov 2, 2018

@stevenroose are you still working on this? This is a key feature and I would love to assist in getting it completed and tested.

@stevenroose
Copy link
Contributor Author

I kinda quite btcd development for now. So no, not working on it anymore, sadly. It's mostly finished, but there is a bug I didn't find. Somehow I introduced an inconsistency. I think the culprit is the fresh flag assignment. I tried to base this on the Core implementation, but since our structure is a bit different, it didn't really work.

It's been some time, I'd say take a look and compare with Core's coins.cpp.

@zquestz
Copy link

zquestz commented Nov 2, 2018

@stevenroose thank you for responding. We are going to work on this for bchd. Do you mind if I ping you with questions if we have issues?

Once it is complete I will happily post a completed PR for btcd as well. I want everyone to benefit from this. =)

@zquestz
Copy link

zquestz commented Nov 2, 2018

I followed up last night and tried to get debug logs for this event, but it seems the debug logs slowed it down enough to get past the 93k block range. This seems like it would point to some kind of concurrency issue.

I tried the cache with various sizes and found that 100MB worked pretty well OMM. I was getting 100-200 blocks/10s well into 2014.

When I had the cache size cranked up to 8GB, 85% of the runtime was spent garbage collecting. I think this was being caused by the map itself as it continually grows, as well as the serialization of output scripts in BtcDecode.

Here are some profiles with the 100MB cache w/ modest internet connection:

Showing nodes accounting for 316.29MB, 82.24% of 384.58MB total                                                                                                
Dropped 57 nodes (cum <= 1.92MB)                                                                                                                               
Showing top 50 nodes out of 119                                                                                                                                
      flat  flat%   sum%        cum   cum%   
         0     0%     0%   254.76MB 66.24%  github.com/btcsuite/btcd/netsync.(*SyncManager).blockHandler
         0     0%     0%   253.60MB 65.94%  github.com/btcsuite/btcd/blockchain.(*BlockChain).ProcessBlock
         0     0%     0%   253.60MB 65.94%  github.com/btcsuite/btcd/blockchain.(*BlockChain).maybeAcceptBlock
         0     0%     0%   253.60MB 65.94%  github.com/btcsuite/btcd/netsync.(*SyncManager).handleBlockMsg
         0     0%     0%   252.10MB 65.55%  github.com/btcsuite/btcd/blockchain.(*BlockChain).connectBestChain
         0     0%     0%   227.78MB 59.23%  github.com/btcsuite/btcd/blockchain.(*BlockChain).connectBlock
         0     0%     0%   151.27MB 39.33%  github.com/btcsuite/btcd/blockchain.(*utxoCache).Commit
  141.77MB 36.86% 36.86%   141.77MB 36.86%  github.com/btcsuite/btcd/blockchain.(*utxoCache).addEntry
         0     0% 36.86%    97.50MB 25.35%  github.com/btcsuite/btcd/database/ffldb.(*db).View
         0     0% 36.86%    96.52MB 25.10%  runtime.main
         0     0% 36.86%    95.41MB 24.81%  main.btcdMain
         0     0% 36.86%    95.41MB 24.81%  main.main
         0     0% 36.86%    91.41MB 23.77%  main.newServer
         0     0% 36.86%    79.19MB 20.59%  github.com/btcsuite/btcd/blockchain.(*BlockChain).initChainState
   30.88MB  8.03% 44.89%    79.19MB 20.59%  github.com/btcsuite/btcd/blockchain.(*BlockChain).initChainState.func2
         0     0% 44.89%    79.19MB 20.59%  github.com/btcsuite/btcd/blockchain.New
         0     0% 44.89%    74.73MB 19.43%  github.com/btcsuite/btcd/database/ffldb.(*db).Update
         0     0% 44.89%       61MB 15.86%  github.com/btcsuite/btcd/blockchain.(*utxoCache).Flush
         0     0% 44.89%       61MB 15.86%  github.com/btcsuite/btcd/blockchain.(*utxoCache).flush
         0     0% 44.89%       42MB 10.92%  github.com/btcsuite/btcd/database/ffldb.(*dbCache).commitTx
         0     0% 44.89%       42MB 10.92%  github.com/btcsuite/btcd/database/ffldb.(*transaction).Commit
         0     0% 44.89%       42MB 10.92%  github.com/btcsuite/btcd/database/ffldb.(*transaction).writePendingAndCommit
         0     0% 44.89%       42MB 10.92%  github.com/btcsuite/btcd/database/internal/treap.(*Immutable).Put
         0     0% 44.89%       42MB 10.92%  github.com/btcsuite/btcd/database/internal/treap.(*Mutable).ForEach
      37MB  9.62% 54.51%       37MB  9.62%  github.com/btcsuite/btcd/database/internal/treap.cloneTreapNode (inline)
   12.50MB  3.25% 57.76%       26MB  6.76%  github.com/btcsuite/btcd/blockchain.CalcWork
         0     0% 57.76%       26MB  6.76%  github.com/btcsuite/btcd/blockchain.initBlockNode
         0     0% 57.76%    25.41MB  6.61%  github.com/btcsuite/btcd/peer.(*Peer).inHandler
         0     0% 57.76%       25MB  6.50%  github.com/btcsuite/btcd/blockchain.(*utxoCache).flush.func2
         0     0% 57.76%       25MB  6.50%  github.com/btcsuite/btcd/blockchain.(*utxoCache).flush.func3
         0     0% 57.76%    24.79MB  6.45%  github.com/btcsuite/btcd/peer.(*Peer).readMessage
    0.62MB  0.16% 57.92%    24.79MB  6.45%  github.com/btcsuite/btcd/wire.ReadMessageWithEncodingN
         0     0% 57.92%    24.31MB  6.32%  github.com/btcsuite/btcd/blockchain.(*UtxoViewpoint).addInputUtxos
       6MB  1.56% 59.48%    24.31MB  6.32%  github.com/btcsuite/btcd/blockchain.(*utxoCache).fetchAndCacheEntry
         0     0% 59.48%    24.31MB  6.32%  github.com/btcsuite/btcd/blockchain.(*utxoCache).getEntry
         0     0% 59.48%    24.17MB  6.29%  github.com/btcsuite/btcd/wire.(*MsgBlock).BtcDecode
   22.17MB  5.77% 65.25%    24.17MB  6.29%  github.com/btcsuite/btcd/wire.(*MsgTx).BtcDecode
   22.31MB  5.80% 71.05%    22.31MB  5.80%  github.com/btcsuite/btcd/blockchain.(*blockIndex).addNode (inline)
         0     0% 71.05%    22.04MB  5.73%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).readRawBlock
   22.04MB  5.73% 76.78%    22.04MB  5.73%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/util.(*BufferPool).Get
         0     0% 76.78%    21.50MB  5.59%  github.com/btcsuite/btcd/database/ffldb.(*dbCache).commitTx.func1
      21MB  5.46% 82.24%       21MB  5.46%  github.com/btcsuite/btcd/database/ffldb.bucketizedKey (inline)
         0     0% 82.24%    20.50MB  5.33%  github.com/btcsuite/btcd/database/ffldb.(*dbCache).commitTx.func2
         0     0% 82.24%    18.31MB  4.76%  github.com/btcsuite/btcd/blockchain.(*utxoCache).fetchAndCacheEntry.func1
         0     0% 82.24%    18.31MB  4.76%  github.com/btcsuite/btcd/blockchain.dbFetchUtxoEntry
         0     0% 82.24%    17.81MB  4.63%  github.com/btcsuite/btcd/database/ffldb.(*bucket).Get
         0     0% 82.24%    17.81MB  4.63%  github.com/btcsuite/btcd/database/ffldb.(*dbCacheSnapshot).Get
         0     0% 82.24%    17.81MB  4.63%  github.com/btcsuite/btcd/database/ffldb.(*transaction).fetchKey
         0     0% 82.24%    17.81MB  4.63%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*DB).get
         0     0% 82.24%    17.81MB  4.63%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*Snapshot).Get
Showing nodes accounting for 59.33s, 81.27% of 73s total
Dropped 436 nodes (cum <= 0.36s)
Showing top 50 nodes out of 130
      flat  flat%   sum%        cum   cum%
     0.02s 0.027% 0.027%     44.10s 60.41%  runtime.systemstack
         0     0% 0.027%     40.22s 55.10%  runtime.gcBgMarkWorker
     0.40s  0.55%  0.58%     40.06s 54.88%  runtime.gcDrain
         0     0%  0.58%     40.05s 54.86%  runtime.gcBgMarkWorker.func2
    14.57s 19.96% 20.53%     30.55s 41.85%  runtime.scanobject
         0     0% 20.53%     22.12s 30.30%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*DB).tCompaction
         0     0% 20.53%     22.12s 30.30%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*DB).tableAutoCompaction
         0     0% 20.53%     22.12s 30.30%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*DB).tableCompaction
         0     0% 20.53%     21.98s 30.11%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*DB).compactionTransact
     0.03s 0.041% 20.58%     21.84s 29.92%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*tableCompactionBuilder).run
         0     0% 20.58%     13.93s 19.08%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).readRawBlock
         0     0% 20.58%     13.36s 18.30%  internal/poll.(*FD).Pread
         0     0% 20.58%     13.36s 18.30%  os.(*File).ReadAt
         0     0% 20.58%     13.36s 18.30%  os.(*File).pread
    13.29s 18.21% 38.78%     13.29s 18.21%  syscall.Syscall6
         0     0% 38.78%     13.22s 18.11%  syscall.Pread
         0     0% 38.78%     13.11s 17.96%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).readBlockCached
         0     0% 38.78%     13.09s 17.93%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).readBlock
         0     0% 38.78%     12.35s 16.92%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).getDataIter
         0     0% 38.78%     11.42s 15.64%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/iterator.(*mergedIterator).Next
         0     0% 38.78%     11.29s 15.47%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/iterator.(*indexedIterator).Next
         0     0% 38.78%     11.21s 15.36%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/iterator.(*indexedIterator).setData
     0.01s 0.014% 38.79%     11.07s 15.16%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*indexIter).Get
         0     0% 38.79%     11.06s 15.15%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Reader).getDataIterErr
         0     0% 38.79%     10.15s 13.90%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*tableCompactionBuilder).appendKV
     0.01s 0.014% 38.81%     10.11s 13.85%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*tWriter).append
         0     0% 38.81%     10.08s 13.81%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Writer).Append
     8.68s 11.89% 50.70%      8.68s 11.89%  syscall.Syscall
         0     0% 50.70%      8.60s 11.78%  internal/poll.(*FD).Write
         0     0% 50.70%      8.60s 11.78%  os.(*File).Write
         0     0% 50.70%      8.60s 11.78%  os.(*File).write
     0.01s 0.014% 50.71%      8.49s 11.63%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Writer).finishBlock
         0     0% 50.71%      8.48s 11.62%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb/table.(*Writer).writeBlock
     0.01s 0.014% 50.73%      8.39s 11.49%  syscall.Write
         0     0% 50.73%      8.38s 11.48%  syscall.write
     7.89s 10.81% 61.53%      8.08s 11.07%  runtime.heapBitsForObject
     0.52s  0.71% 62.25%      7.50s 10.27%  runtime.netpoll
     7.02s  9.62% 71.86%      7.02s  9.62%  runtime.kevent
     0.01s 0.014% 71.88%      6.50s  8.90%  runtime.pollWork
         0     0% 71.88%      6.10s  8.36%  github.com/btcsuite/btcd/blockchain.(*BlockChain).ProcessBlock
         0     0% 71.88%      6.10s  8.36%  github.com/btcsuite/btcd/netsync.(*SyncManager).blockHandler
         0     0% 71.88%      6.10s  8.36%  github.com/btcsuite/btcd/netsync.(*SyncManager).handleBlockMsg
         0     0% 71.88%      5.25s  7.19%  github.com/btcsuite/btcd/blockchain.(*BlockChain).maybeAcceptBlock
         0     0% 71.88%      5.18s  7.10%  github.com/btcsuite/btcd/blockchain.(*BlockChain).connectBestChain
     4.54s  6.22% 78.10%      4.54s  6.22%  runtime.memmove
         0     0% 78.10%      4.41s  6.04%  runtime.(*gcWork).balance
         0     0% 78.10%      4.41s  6.04%  runtime.handoff
     2.13s  2.92% 81.01%      4.13s  5.66%  runtime.greyobject
     0.19s  0.26% 81.27%      3.41s  4.67%  runtime.mallocgc
         0     0% 81.27%      3.12s  4.27%  github.com/btcsuite/btcd/vendor/github.com/btcsuite/goleveldb/leveldb.(*version).get

Any chance you also tested this with a 2GB cache size? Would like to know where the sweet spot is in terms of performance.

@stevenroose
Copy link
Contributor Author

Yeah since something didn't work with keeping things consistent, I mostly ran with a small cache to trigger the write-through behavior etc.

You can always reach out to me on Jabber: steven@konuro.net. I'm on IRC as well (stevenroose@Freenode (but I don't see private messages well, better ping me in #btcd) or stevenroose@Gitter bridge (same with private messages)).

// avoid redoing the work when interrupted.
rollforwardBatch := func(dbTx database.Tx, node *blockNode) (*blockNode, error) {
nbBatchBlocks := 0
for ; node.height <= tip.height; node = node.parent {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are rolling back nodes here rather than rolling them forward.

Should be something more like:

for e := attachNodes.Front(); e != nil; e = e.Next() {
			node = e.Value.(*blockNode)
			attachNodes.Remove(e)

if !overwrite {
// Prevent overwriting not-fully-spent entries. Note that this is not
// a consensus check.
if cachedEntry != nil && !cachedEntry.IsSpent() {
Copy link
Contributor

@cpacia cpacia Nov 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason this code is failing at block 94804 is because there are two coinbases in the chain which share the same txid with previous coinbases and this line errors when adding the duplicate coinbase. There error bubbles up to the commit() function but you don't handle the error there.

Ultimately the commit terminates and a later transaction fails to validate when the utxos that failed to be committed don't exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh god damnit Bitcoin historical bugs :D Yeah that makes sense. Any idea how Core handles this? Does it just have two if-clauses here? :/

// the best chain.
var statusNode *blockNode
var statusNodeNext *blockNode // the first one higher than the statusNode
for node := tip; node.height > 0; node = node.parent {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

node.height > 0 wont enter this loop if the tip is the genesis block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggesting it should be >= 0? That doesn't make sense, though, will need another check.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been fixed. I was debugging yesterday and you are 100% correct.

@cpacia
Copy link
Contributor

cpacia commented Nov 7, 2018

We've been working on this in the Bitcoin Cash version. I think we've got most of the bugs worked out and it seems to be working good if you guys want to take a look gcash/bchd#21

Only real changes are related to the canonical transaction ordering.

@stevenroose
Copy link
Contributor Author

Do you guys plan to backport the fixes? A PR to my fork would be very much appreciated so I can include your fixes in this PR.

@cpacia
Copy link
Contributor

cpacia commented Nov 9, 2018

We can probably port the fixes here. One other issue of note is we believe because the database has its own cache the flushing flag sometimes does not get written to the db on a hard shutdown during a flush. Then on the next startup it fails to rollback the chain and errors when rolling blocks forward.

One possible fix is to always attempt to rollback the chain when inconsistent regardless of whether the shutdown happened while flushing or not. That seems to be working pretty reliably on our end.

@Roasbeef
Copy link
Member

Picked back up here #1373

@jakesylvestre
Copy link
Collaborator

@jcvernaleo (as per #1530)

  • Low priority
  • Outdated

Can probably close in favor of #1373

@Roasbeef
Copy link
Member

Roasbeef commented Mar 5, 2020

Yeah #1373 is the more updated version, so this one can be closed. There's still a lingering BIP 30 related bug, and a re-org rewind bug. Aside from those, with that PR and some of the other optimization PRs, we've been able to get sync down to < 24 hours which is a massive win.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants