
blockchain, cmd, netsync, main: Add utxocache #1955

Merged
merged 17 commits into btcsuite:master from utxocache-original on Dec 19, 2023

Conversation


@kcalvinalvin kcalvinalvin commented Mar 1, 2023

Updated the PR; it is now very different from #1373.

Main differences are:

No utxoviewpoint middleman during headers-first sync

In #1373, the view is still used as a middleman. The flow for doing block validation would be:

1. Make a new UtxoViewPoint.
2. Load the relevant inputs into the UtxoViewPoint from the cache.
3. Connect transactions in the UtxoViewPoint
4. Apply the connections to the cache from the UtxoViewPoint
5. Flush the cache to the database when needed

The problems with this approach were:
(2) is extremely expensive, as a lot of memory is allocated creating a whole new set of key/value pairs for the existing UTXOs.
(4) introduces a new code path from [view -> cache] and in turn introduces a whole new set of bugs.

The new code is different: during headers-first sync (blocks up until the checkpoint) the flow is:

1. Connect transactions in the cache
2. Flush the cache to the database when needed

This flow is much simpler and faster than the previous code.

For when the node is not in headers-first mode:

1. Make a new UtxoViewPoint
2. Load the relevant inputs into the UtxoViewPoint from the cache.
3. CheckConnectBlock with the UtxoViewPoint
4. Connect transactions in the cache
5. Flush the cache to the database when needed

This flow still has the overhead of (2), but no longer has the new bugs introduced by applying the transaction connections from [view -> cache]. Instead, the view is only used as a temporary mutable map for checkConnectBlock, and the txs are connected in the cache just like in headers-first mode.
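To make the two flows concrete, here is a minimal Go sketch using hypothetical types and method names (an illustration of the flows described above, not the actual btcd code):

package flowsketch

// block stands in for a btcutil.Block; cache and view stand in for the
// utxoCache and UtxoViewpoint discussed in this PR. All names here are
// hypothetical.
type block struct{}

type cache interface {
	connectTransactions(b *block) error
	flushIfNeeded() error
}

type view interface {
	fetchInputs(c cache, b *block) error
	checkConnectBlock(b *block) error
}

// headersFirstConnect: blocks up to the checkpoint are connected directly
// in the cache; no viewpoint middleman.
func headersFirstConnect(c cache, b *block) error {
	if err := c.connectTransactions(b); err != nil {
		return err
	}
	return c.flushIfNeeded()
}

// fullValidationConnect: past the checkpoint, a temporary viewpoint is
// still used for checkConnectBlock, but the state transition itself
// happens in the cache.
func fullValidationConnect(c cache, v view, b *block) error {
	if err := v.fetchInputs(c, b); err != nil {
		return err
	}
	if err := v.checkConnectBlock(b); err != nil {
		return err
	}
	if err := c.connectTransactions(b); err != nil {
		return err
	}
	return c.flushIfNeeded()
}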

No use of the cache during reorganizations

During reorganizations, a whole new class of possible bugs was introduced: if the node is unexpectedly shut down, the disconnected txs may only have been updated in the cache and not in the database. The new code forces a flush before the reorganization happens and keeps the same code as before, eliminating the new class of bugs introduced in #1373.

This allowed the removal of TestUtxoCache_Reorg and all the associated (and duplicated) code.

No more utxoStateConsistency struct

It was enough to just save the block hash, without the single byte indicating whether the cache is consistent or not. All the related code was removed.

Map slice instead of a singular map

A single map doubles in size when it gets full. For bigger maps, this may result in going over the promised memory limit. The map slice lets us allocate multiple maps with differing sizes, allowing finer control over the allocated memory.

Another solution would be to force the user to pick from fixed cache sizes (1GB, 2GB, 4GB, 8GB, etc). This would get rid of the code in sizehelper.go and the mapslice-related code in utxocache.go, but wouldn't give the user flexibility. I'm ok with either route; the fixed-size approach would avoid adding 491 lines of code.
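For illustration, a heavily simplified sketch of the map-slice idea (hypothetical types; the real mapSlice in this PR does more bookkeeping):

package mapslice

import "github.com/btcsuite/btcd/wire"

// entry stands in for *blockchain.UtxoEntry.
type entry struct {
	amount int64
}

// mapSlice spreads entries over several maps, each created with a fixed
// size hint, so no single map ever grows (and therefore never doubles its
// bucket array).
type mapSlice struct {
	maps     []map[wire.OutPoint]*entry
	maxSizes []int // entry count each map was allocated for
}

// put inserts into the first map that still has room.
func (ms *mapSlice) put(op wire.OutPoint, e *entry) bool {
	for i, m := range ms.maps {
		if len(m) < ms.maxSizes[i] {
			m[op] = e
			return true
		}
	}
	// All maps are at capacity; the real code sizes the maps so this
	// cannot happen before a flush.
	return false
}

func (ms *mapSlice) get(op wire.OutPoint) (*entry, bool) {
	for _, m := range ms.maps {
		if e, ok := m[op]; ok {
			return e, true
		}
	}
	return nil, false
}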

Known issue

During a flush, there's a large allocation of memory because database/ffldb holds onto every single bucket write to guarantee atomicity. It usually results in 2x the memory the cache is using: if the cache is using 2GB, ffldb will allocate 2GB as well, resulting in 4GB of memory being taken up by btcd. I don't think there's any easy way around this other than making a different code path for utxo cache flushes.


coveralls commented Mar 1, 2023

Pull Request Test Coverage Report for Build 5152398437

Warning: This coverage report may be inaccurate.

We've detected an issue with your CI configuration that might affect the accuracy of this pull request's coverage report.
To ensure accuracy in future PRs, please see these guidelines.
A quick fix for this PR: rebase it; your next report should be accurate.

  • 490 of 651 (75.27%) changed or added relevant lines in 11 files are covered.
  • 245 unchanged lines in 5 files lost coverage.
  • Overall coverage increased (+0.2%) to 55.522%

Changes missing coverage (file | covered lines | changed/added lines | %):
config.go | 0 | 1 | 0.0%
blockchain/sizehelper.go | 84 | 87 | 96.55%
netsync/blocklogger.go | 0 | 4 | 0.0%
blockchain/utxoviewpoint.go | 31 | 36 | 86.11%
netsync/manager.go | 0 | 8 | 0.0%
server.go | 0 | 10 | 0.0%
blockchain/chainio.go | 27 | 38 | 71.05%
blockchain/chain.go | 43 | 58 | 74.14%
blockchain/utxocache.go | 290 | 394 | 73.6%

Files with coverage reduction (file | new missed lines | %):
peer/peer.go | 7 | 73.2%
database/ffldb/db.go | 12 | 92.64%
netsync/manager.go | 18 | 0.0%
blockchain/chainio.go | 48 | 61.1%
blockchain/chain.go | 160 | 71.37%
Totals Coverage Status
Change from base Build 5075515644: 0.2%
Covered Lines: 27104
Relevant Lines: 48817

💛 - Coveralls


Roasbeef commented Mar 8, 2023

Giving this a shot for a mainnet sync, will report back if it gets to a higher height than referenced above (I think I ran into an issue at that height in the past too).

Running with the following config:

 GOGC=50 btcd --txindex --nopeerbloomfilters --maxpeers=8 --utxocachemaxsize=10000


Roasbeef commented Mar 8, 2023

Had to restart my sync as I had some memory pressure limits incorrectly configured (though I'm surprised it was killed given the box has 32 GB of RAM and was running with a 10 GB cache size).

I also think the current cache size estimate might be significantly undercounting the true overhead. Will do some profiles to dig into things.


Roasbeef commented Mar 8, 2023

Heap profile taken after a restart with 3 GB of cache used (note how newUtxoCache seems to report double that):

[screenshot: heap profile, 2023-03-08 11:37 AM]

Aside from the cache, we can see memory building up from cloning the cache entries and also from decoding transactions themselves. Re tx decoding, we have some PRs up that attempt to reduce garbage during that process, and also to cache the serialization to make hashing faster (but this is still in checkpointed land), but of course one task at a time.

Logs of cache size+speed:

2023-03-08 11:35:47.871 [INF] SYNC: Processed 335 blocks in the last 10.05s (264489 transactions, height 339095, 2015-01-15 11:34:26 -0800 PST, ~2751 MiB cache)
2023-03-08 11:35:57.873 [INF] SYNC: Processed 436 blocks in the last 10s (284726 transactions, height 339531, 2015-01-18 11:31:10 -0800 PST, ~2816 MiB cache)
2023-03-08 11:36:07.877 [INF] SYNC: Processed 428 blocks in the last 10s (304451 transactions, height 339959, 2015-01-21 13:41:44 -0800 PST, ~2868 MiB cache)
2023-03-08 11:36:17.894 [INF] SYNC: Processed 406 blocks in the last 10.01s (279593 transactions, height 340365, 2015-01-24 16:50:34 -0800 PST, ~2912 MiB cache)


Roasbeef commented Mar 8, 2023

Another profile taken at ~7GB cache size:

[screenshot: heap profile, 2023-03-08 11:50 AM]


Roasbeef commented Mar 8, 2023

One other thing I noticed due to the cache design (purge everything once full, instead of evicting on the fly) is that post-purge there's a significant bump in I/O to refill the cache: we guarantee that the very next block triggers thousands of cache misses. The machine I'm running on has a pretty fast SSD (Samsung 980 Pro M.2 Gen 4), but validation still sees a tangible drop-off due to the increased I/O.

An alternative strategy would be something like (see the sketch after this list):

  • Keep the normal height/age based eviction in place
  • Once persistently at capacity, start to evict very old UTXOs in the background to make space for new items
    • This can be done in a batched manner
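For illustration, a rough sketch of what such batched background eviction could look like (hypothetical types and helper; not part of this PR):

package evict

import "github.com/btcsuite/btcd/wire"

// cachedEntry stands in for the cache's *UtxoEntry plus the metadata
// needed to judge its age.
type cachedEntry struct {
	amount int64
	height int32 // block height the output was created at
}

// evictOldest writes out and drops up to batchSize entries created below
// the cutoff height, freeing room without purging the whole cache.
func evictOldest(cache map[wire.OutPoint]*cachedEntry, cutoff int32,
	batchSize int, writeToDB func(wire.OutPoint, *cachedEntry) error) error {

	evicted := 0
	for op, ce := range cache {
		if ce.height >= cutoff {
			continue
		}
		if err := writeToDB(op, ce); err != nil {
			return err
		}
		delete(cache, op) // deleting during range is safe in Go
		evicted++
		if evicted >= batchSize {
			break
		}
	}
	return nil
}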


Roasbeef commented Mar 8, 2023

CPU profile shows most of the time is spent on GC (this is with the default GOGC setting):

[screenshot: CPU profile, 2023-03-08 12:40 PM]


Roasbeef commented Mar 9, 2023

I think the other issue with this design is the flushing. Rather than incrementally evicting items from the cache to write back to disk, we end up flushing the entire cache back to disk all at once. For very large caches, I've seen this take up to 20 minutes (still flushing as I type this), during which the entire block download is stalled.

Post-flush, it looks like the runtime spends a lot of time trying to free the GBs allocated back to the OS:
[screenshot: profile, 2023-03-08 5:33 PM]

@kcalvinalvin
Collaborator Author

Yeah, I've noticed the slowdowns during flushes as well. Memory allocation during flushes was a big slowdown for me too.

Here's a snippet of my log during ibd (6GB of cache, GOGC=100, memlimit of 30GB):

2023-03-02 12:55:08.409 [INF] SYNC: Processed 46 blocks in the last 10.05s (91607 transactions, height 674986, 2021-03-17 14:55:10 +0900 KST, ~5975 MiB cache)                                                                                                                                               
2023-03-02 12:55:18.452 [INF] SYNC: Processed 59 blocks in the last 10.04s (123815 transactions, height 675045, 2021-03-18 00:35:34 +0900 KST, ~5997 MiB cache)                                                                                                                                              
2023-03-02 12:55:19.168 [INF] CHAN: Flushing UTXO cache of ~6001 MiB to disk. For large sizes, this can take up to several minutes...                                                                                                                                                                        
2023-03-02 13:08:44.185 [INF] SYNC: Processed 6 blocks in the last 13m25.73s (14146 transactions, height 675051, 2021-03-18 02:05:39 +0900 KST, ~0 MiB cache)                                      

Heap profile taken after a restart with 3 GB of cache used (note how newUtxoCache seems to report double that):

Was this taken with the flag --profile? I'll check up on the size calculations with the utxo cache.

Keep the normal height/age based eviction in place
Once persistently at capacity, start to evict very old UTXOs in the background to make space for new items
This can be done in a batched manner

So for the Utreexo project, we did some testing on efficient methods of caching utxo entries, and @adiabat + @cbao1198 had concluded that evicting old entries once the threshold you set is reached works well. There's a relevant github discussion here mit-dci/utreexo#325, and iirc there were graphs generated to visualize this. I'll look into the eviction method first since there's some evidence of it being good.

@abitrolly
Contributor

Can Go release memory immediately after a variable is released?

It looks like there are two caches here: (1) a cache for TX lookups, (2) a buffer for writing to disk.

The lookup cache can be a FIFO queue, where items are moved up if they are being hit. The oldest part of the FIFO is written to disk periodically. There could be a good static memory structure for such data that doesn't require GC. Is that what eviction means?

@kcalvinalvin
Collaborator Author

Verified that the memory calculation is wrong. Working on a fix and will push a fix soon.

@kcalvinalvin
Collaborator Author

Can Go release memory immediately after a variable is released?

With arenas I believe you could, but it wouldn't be for a single variable.

The lookup cache can be a FIFO queue, where items are moved if they are being hit. The oldest part of FIFO is written to the disk periodically. There could be good static memory structure for such data that doesn't require GC. Is that what eviction means?

Yeah basically.


Roasbeef commented Mar 9, 2023

Was this taken with the flag --profile? I'll check up on the size calculations with the utxo cache.

Yeah, then using this command:
(heap)

go tool pprof --http=localhost:8083 http://localhost:5050/debug/pprof/heap

(cpu)

go tool pprof --http=localhost:8083 http://localhost:5050/debug/pprof/profile

So I was able to sync all the way up to current day!

I don't have a good sync time, as I had to restart a few times before settling on a 5 GB cache (w/ the default GOGC setting).

I'll do another run now, with the latest version (after the fix) and this tuned cache size.

@Roasbeef
Member

The lookup cache can be a FIFO queue, where items are moved if they are being hit

A hit is basically a deletion: if you're looking up a UTXO, it's because it's been spent in the first place. The cache as is, lets us avoid unnecessary random writes: UTXOs that die young are never written to disk. However, "young" here is relative to the cache size. Once the cache overflows, we write it all to disk, then start from scratch, which guarantees a cold cache (I/O spikes after the purge, as all UTXOs need to be loaded in).

Re GC optimization:

  • Today we allocate a flat map for each item in the cache:
    cachedEntries map[wire.OutPoint]*UtxoEntry
  • When we go to add an item, we end up allocating the entry on the heap, then adding it to the cache.
  • This causes a lot of active pointers that need to be scanned by the GC, and this increases as a function of the cache size. Eventually this dominates GC time: after a purge, the GC must remove all of these while we're also adding new items.

One approach (basically a middle ground to the arena, but also potentially complementary to it) would be to pre-allocate all the UTXO entries in a single swoop, in a single slice. This would allocate all the entries in contiguous (virtual) memory. We'd keep an index of the last unallocated (not set to anything) entry in this slice. We'd then use this to create new UTXO entries, as adding a new entry would just be adding a pointer to the map, instead of actually allocating. Assuming we keep the current purge behavior, the entire slice (in theory) can be de-allocated all at once, as all the refs in the map have been deleted and the main slice pointer is no longer held onto.
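A minimal sketch of that pre-allocation idea (hypothetical entry type and names; the real UtxoEntry lives in the blockchain package):

package slab

// utxoEntry mirrors the rough shape of a cache entry.
type utxoEntry struct {
	amount      int64
	blockHeight int32
	packedFlags uint8
	pkScript    []byte // still a separate allocation; shown for shape only
}

// entrySlab hands out entries from one contiguous allocation, so the GC
// tracks one large object instead of millions of tiny heap objects.
type entrySlab struct {
	entries []utxoEntry
	next    int // index of the first unused entry
}

func newEntrySlab(capacity int) *entrySlab {
	return &entrySlab{entries: make([]utxoEntry, capacity)}
}

// alloc returns a pointer into the slab, or nil when it is exhausted (at
// which point the cache would flush and the slab can be reset or dropped).
func (s *entrySlab) alloc() *utxoEntry {
	if s.next >= len(s.entries) {
		return nil
	}
	e := &s.entries[s.next]
	s.next++
	return e
}

// reset forgets all handed-out entries in one step, mirroring the
// purge-everything flush behaviour described above.
func (s *entrySlab) reset() { s.next = 0 }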

Going further, we could then also actually never de-allocate the slice. We'd just reset that pointer and know that anything "after" that is actually blank. This is more or less the "memory ballast" approach outlined here: https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i-learnt-to-stop-worrying-and-love-the-heap/.

IIUC, this has basically been subsumed by the arena allocator, as it gives you a nicer API (that uses generics).

Re this:

Once persistently at capacity, start to evict very old UTXOs in the background to make space for new items

I think the reason the cache is designed the way it is rn (copied from bitcoind) was to enable recovery from unclean shutdowns (to avoid a cache flush each block). I need to revisit that design to properly analyze it, but maybe it turns out this isn't so critical after all. One other side effect of this is:

  • If you set a cache size of 10 GB, then do an unclean shutdown, then restart with 5 GB, the cache swells back up to 10 GB on startup in order to recover.

Also this flag (env GODEBUG=gctrace=1) can be very helpful in optimization, as it shows what happens with each GC pause and how long the runtime spends on it.
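For example, reusing the flags from the runs earlier in this thread (the cache size here is just an example value):

env GODEBUG=gctrace=1 btcd --utxocachemaxsize=5000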

@kcalvinalvin kcalvinalvin force-pushed the utxocache-original branch 3 times, most recently from b34a14b to e16b0c1 on March 15, 2023 16:37
// I suspect it's because of alignment and how the compiler handles it but
// when compared with how much the compiler allocates, it's a couple hundred
// bytes off.
func CalculateRoughMapSize(hint int, bucketSize uintptr) int {
Member

I wonder if there are functions in the unsafe or runtime package to make this calculation a bit more future-proof. Still need to review the improvements, but it's possible that this behavior isn't consistent across Go versions, or that a new version overhauls the map internals, which would add estimate drift to our calculations here.

Member

I came upon this: https://stackoverflow.com/a/31855111

But maybe you've seen it already; using unsafe.Sizeof on the map entry itself may also allow us to get a tighter estimate.

In any case, will do a run with this to see how closely the reported size matches up to the visible RSS.

Collaborator Author

I wonder if there are functions in the unsafe or runtime package to make this calculation a bit more future-proof. Still need to review the improvements, but it's possible that this behavior isn't consistent across Go versions, or that a new version overhauls the map internals, which would add estimate drift to our calculations here.

That could definitely happen. You could grab the raw value of B in struct hmap like the dgraph-io people did (https://github.com/dgraph-io/dgraph/blob/main/posting/size.go), but if struct hmap ends up changing how its fields are ordered, it could break that code as well.

I came upon this: https://stackoverflow.com/a/31855111
But maybe you've seen it already, using unsafe.Sizeof on the map entry itself may also allow us to get a tighter estimate.

I didn't see that stackoverflow answer, but the selected answer is wrong. It states that len(theMap) should be used to calculate how much memory the map is taking up, but since the map allocates in powers of 2 (plus a few overflow buckets), that will be a lot different from the actual size of the map.

unsafe.Sizeof isn't actually unsafe, and it only returns a constant value for the fixed part of each data structure, like the pointer and length of a string, but not indirect parts like the contents of the string (pg. 354 of The Go Programming Language by Alan Donovan and Brian Kernighan). So unsafe.Sizeof() will always return 48 bytes for hmap on a 64-bit system, as the entire struct takes up 6 words. It wouldn't help us better calculate how much memory the buckets are taking up.

What I still don't have an exact answer for is the alignment of wire.OutPoint in the map. wire.OutPoint is 36 bytes, but since memory is allocated per word, we'll allocate 40 bytes for a standalone wire.OutPoint. However, when used in the map, I think the compiler does some optimizations and we actually end up taking less than 40 bytes per key when pre-allocating for a lot of key/value pairs. I suspect this is at least one reason why the map size isn't accurate down to the byte.
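For illustration, a rough version of that power-of-two bucket math (a hypothetical helper, not the exact CalculateRoughMapSize code in this PR):

package sizehint

import (
	"unsafe"

	"github.com/btcsuite/btcd/wire"
)

const (
	// Go maps store 8 key/value pairs per bucket and grow once the
	// average load exceeds 6.5 entries per bucket.
	entriesPerBucket = 8
	loadFactor       = 6.5
)

// roughMapSize estimates the bucket memory of a map[wire.OutPoint]*UtxoEntry
// created with the given size hint.  It ignores overflow buckets and the
// small hmap header, which is one reason real numbers drift by a few
// hundred bytes.
func roughMapSize(hint int) uint64 {
	// One bucket = 8 tophash bytes + 8 keys + 8 pointer-sized values +
	// an overflow pointer.
	keySize := uint64(unsafe.Sizeof(wire.OutPoint{})) // 36 bytes
	bucketSize := 8 + entriesPerBucket*(keySize+8) + 8

	// Bucket counts are always powers of two: find the smallest count
	// whose load factor covers the hint.
	numBuckets := uint64(1)
	for float64(hint) > loadFactor*float64(numBuckets) {
		numBuckets *= 2
	}
	return numBuckets * bucketSize
}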

nbEntries := uint64(len(s.cachedEntries))
// Total memory is the map size + the size that the utxo entries are
// taking up.
size := uint64(CalculateRoughMapSize(s.maxElements, bucketSize))
Member

👍

//TODO(stevenroose) check if it's ok to drop the pkScript
// Since we don't need it anymore, drop the pkScript value of the entry.
s.totalEntryMemory -= entry.memoryUsage()
entry.pkScript = nil
Member

Re the commit body, yeah this is something we've seen on the lnd side as well: just holding onto a single pointer into a transaction still pins the entire allocated script buffer for a block.

Isn't this copied over beforehand though? So setting to nil would actually let it be freed up?

There's also another path of using a fixed-size byte array, given that most scripts are under 40 bytes or so, aside from bare multisig early in the lifetime of the chain.

Collaborator Author

I'll save the pprofs and post them to github, but what I noticed was that msgTx.Decode() took up a significant amount of heap when utxocachemaxsize was large enough. Profiling the master branch didn't show msgTx.Decode() taking up a significant amount of heap.

Looking at the source code for msgTx.Decode() we can see that the deserialized pkscript is actually copied over to the large contiguous byte slice and the pkscript that was allocated separately is returned to the scriptPool.

btcd/wire/msgtx.go

Lines 661 to 676 in a18c2cf

	for i := 0; i < len(msg.TxOut); i++ {
		// Copy the public key script into the contiguous buffer at the
		// appropriate offset.
		pkScript := msg.TxOut[i].PkScript
		copy(scripts[offset:], pkScript)

		// Reset the public key script of the transaction output to the
		// slice of the contiguous buffer where the script lives.
		scriptSize := uint64(len(pkScript))
		end := offset + scriptSize
		msg.TxOut[i].PkScript = scripts[offset:end:end]
		offset += scriptSize

		// Return the temporary script buffer to the pool.
		scriptPool.Return(pkScript)
	}

These two bits of information give me high confidence that the script for a whole block is sticking around because we have a single entry that's still pointing into it. I also know for a fact that Go doesn't garbage collect parts of a slice.

So setting to nil would actually let it be freed up?

Yeah, I think that's right in that we should still replace it with nil, since once every last one of the entries stops pointing to it, that whole block's script can be gc-ed. We just shouldn't discount the pkScript from the memory count after we nil it out, since it can still be taking up space.
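A tiny standalone illustration of that retention behaviour (nothing btcd-specific; exact numbers vary by Go version and platform):

package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Pretend this is the contiguous script buffer for a whole block.
	block := make([]byte, 8<<20)

	// An entry that keeps only a 25-byte pkScript slice of it still pins
	// the entire 8 MiB backing array.
	pkScript := block[100:125]

	var m runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&m)
	fmt.Printf("subslice held: heap=%d MiB len=%d\n", m.HeapAlloc>>20, len(pkScript))

	// Deep-copying the script (or nil-ing the field) releases the buffer.
	pkScript = append([]byte(nil), pkScript...)
	runtime.GC()
	runtime.ReadMemStats(&m)
	fmt.Printf("after copy:    heap=%d MiB len=%d\n", m.HeapAlloc>>20, len(pkScript))
}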

@@ -232,7 +232,7 @@ type utxoBatcher interface {
// getEntries attempts to fetch a series of entries from the UTXO view.
// If a single entry is not found within the view, then an error is
// returned.
getEntries(outpoints map[wire.OutPoint]struct{}) (map[wire.OutPoint]*UtxoEntry, error)
Member

Ah I guess "conventional " wisdom is that struct{} or interface{} doesn't add over head, but that doesn't factor in the map bucket overhead 🤔

Collaborator Author

Yeah I've noticed the map bucket overhead messing with the size calculation.

No need for a map here anyway, so I think a slice is a strict improvement.

@@ -640,7 +632,7 @@ func (s *utxoCache) Commit(view *UtxoViewpoint, node *blockNode) error {
if node != nil && isBIP0030Node(node) {
// Use the entry from the view since we want to overwrite the
// cached entry.
cachedEntry = entry.Clone()
Member

The cloning also showed up a lot in my memory profiles; what makes it no longer required? The fact that these never get mutated anyway?

Collaborator Author

Ah yeah, that was my finding. If the entry is used internally within the blockchain package, we can make sure it's not modified, so there's no need to Clone(). Also, I wasn't sure how much the Clone() does to help with immutability, since the pkScript is still mutable as the Clone() here is a shallow clone.

I saw that the mining code does fetch entries from the tx, and one of the tests fails if I don't call Clone() for the entries that are fetched by public functions. But not calling Clone() in the private functions didn't have any effect, and all the code that calls the private fetch functions does not mutate the entries.

func (ms *mapSlice) deleteMaps() {
// The rest of the map is now available to be garbage collected.
m := ms.maps[0]
ms.maps = []map[wire.OutPoint]*UtxoEntry{m}
Member

Shouldn't this once again pre-allocate? Otherwise, it needs to go through the successive doubling as new items are added.

Collaborator Author

The oldest/largest map is kept and the smaller ones are left out so they can be gc-ed.

The current algorithm will take a look at the leftover memory and allocate a map that will fit inside of it. My hunch is that since entry sizes can vary, it's better to let the smaller map be gc-ed so the second map's size can be chosen dynamically based on the needs at that moment.


kcalvinalvin commented Mar 21, 2023

Found a bug in utxoCache.InitConsistentState() where if you signal to quit while the utxo cache is rolling back, it'll reject the next block after the tip and cause all the blocks afterwards to be recognized as orphans.

Working towards a fix.

@Roasbeef Roasbeef requested a review from Crypt-iQ April 5, 2023 19:14

Roasbeef commented Apr 5, 2023

cc @halseth

@kcalvinalvin kcalvinalvin changed the title from "(WIP) blockchain, cmd, netsync, main: Add utxocache" to "blockchain, cmd, netsync, main: Add utxocache" on Apr 7, 2023
@kcalvinalvin
Collaborator Author

The last 3 commits alleviate the InconsistentState bug that I've noticed. The PR is ready for review.

// outpointSize is the size of an outpoint in a 64-bit system.
//
// This value is calculated by running the following on a 64-bit system:
// unsafe.Sizeof(wire.OutPoint{})
Collaborator

nit: Index is explicitly int32, so 64-bit system or not doesn't matter IIUC.

Collaborator Author

Had to look up some docs to make sure.

I think you're right. [32]byte should be 32 bytes on 32-bit systems too, and uint32 should be 4, so the size is the same either way. I'll remove comments like these in other parts of the code as well.

for _, m := range ms.maps {
entry, found = m[op]
if found {
break
Collaborator

style nit: return early with the found element?

Collaborator Author

Like return entry, found instead of break?

Collaborator

yep
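i.e. the loop would look something like this (sketch of the suggested shape, assuming the surrounding function returns the entry and a found flag):

for _, m := range ms.maps {
	if entry, found := m[op]; found {
		return entry, found
	}
}
return nil, false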

@kcalvinalvin
Collaborator Author

Also, one other diff with that PR is that things are cached only after a call to block.Bytes(), while the earlier one caches basically as soon as things are read off the wire.

Out of draft now!

@Crypt-iQ Crypt-iQ (Collaborator) left a comment

Looks good, just going to test this out

@Crypt-iQ Crypt-iQ (Collaborator) left a comment

LGTM 🦖

@Roasbeef Roasbeef (Member) left a comment

LGTM 🐊

I think this is ready to land after a rebase (fold in the fixup commits).

There's a lot of code that would violate the lnd linter/style, but we haven't yet ported over the latest version of our linter to btcd. I'll make a new PR to port that over, and fix the violations once this lands.

// Return early if the view is nil.
if view == nil {
return nil
}
Member

Style nit: missing new line before the closing brace.

Collaborator Author

Addressed

// The returned entries are NOT safe for concurrent access.
func (s *utxoCache) fetchEntries(outpoints []wire.OutPoint) ([]*UtxoEntry, error) {
entries := make([]*UtxoEntry, len(outpoints))
var missingOps []wire.OutPoint
Member

Style nit: can be compressed under a single var statement:

var (
    missingOps []wire.OutPoint
    missingOpsIdx []int
)

Collaborator Author

Addressed

// unspendable. When the cache already has an entry for the output, it will be
// overwritten with the given output. All fields will be updated for existing
// entries since it's possible it has changed during a reorg.
func (s *utxoCache) addTxOut(
Member

Elsewhere we usually just fill out the first level of arguments till it exceeds 80 chars, then continue on the next line, so:

func (s *utxoCache) addTxOut(outpoint wire.OutPoint, txOut *wire.TxOut, isCoinBase bool, 
        blockHeight int32) error {

Collaborator Author

Addressed


entry := new(UtxoEntry)
entry.amount = txOut.Value
// Deep copy the script when the script in the entry differs from the one in
Member

Style nit: missing new line above.

Collaborator Author

Addressed

}

s.cachedEntries.put(outpoint, entry, s.totalEntryMemory)
s.totalEntryMemory += entry.memoryUsage()
Member

Can't cachedEntries maintain totalEntryMemory as an internal attribute? Then it can just update it internally rather than us passing the attribute in to modify right below. You mentioned above it doesn't know the entry size, but IIUC, that's what we have entry.memoryUsage for.

Collaborator Author

I have a patch for this but I'm not confident that it's a small change. Is it ok if I do a follow up PR?

Member

SGTM.
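For reference, the follow-up shape being discussed might look roughly like this (hypothetical sketch; a single underlying map is shown for brevity):

// mapSlice owns the accounting instead of the caller passing
// totalEntryMemory around.
func (ms *mapSlice) put(outpoint wire.OutPoint, entry *UtxoEntry) {
	if old, ok := ms.get(outpoint); ok {
		ms.totalEntryMemory -= old.memoryUsage()
	}
	ms.maps[0][outpoint] = entry
	ms.totalEntryMemory += entry.memoryUsage()
}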

entry := entries[0]
if stxos != nil {
// Populate the stxo details using the utxo entry.
var stxo = SpentTxOut{
Member

Why var vs the normal :=?

Collaborator Author

Addressed

// be updated to append an entry for each spent txout. An error will be returned
// if the cache and the database does not contain the required utxos.
func (s *utxoCache) connectTransaction(
tx *btcutil.Tx, blockHeight int32, stxos *[]SpentTxOut) error {
Member

Why have a pointer to the slice passed in, which is mutated each time, vs just returning the set of UTXOs spent by the transaction? Is the goal to reduce memory churn by pre-allocating stxos ahead of time? Current usage made the control flow a bit harder to follow IMO.

Collaborator Author

That's a good point. There wasn't much thought that went into this besides just re-using the function prototype for connectTransaction that was in the current codebase. I'm assuming that was the intention when previous authors wrote this.
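For what it's worth, the alternative being suggested would be a signature along these lines (sketch only; not what this PR ships):

// Instead of mutating a caller-owned *[]SpentTxOut, the spent outputs
// could simply be returned to the caller:
func (s *utxoCache) connectTransaction(tx *btcutil.Tx,
	blockHeight int32) ([]SpentTxOut, error) {

	stxos := make([]SpentTxOut, 0, len(tx.MsgTx().TxIn))
	// ... spend the inputs, appending to stxos, then add the new outputs ...
	return stxos, nil
}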

utxoBucket := dbTx.Metadata().Bucket(utxoSetBucketName)
for i := range s.cachedEntries.maps {
for outpoint, entry := range s.cachedEntries.maps[i] {
// If the entry is nil or spent, remove the entry from the database
Member

Style nit: delete(s.cachedEntries.maps[i], outpoint) appears 3 times in the loop; we can reduce it to one:

switch {
case entry == nil || entry.IsSpent():
	err := dbDeleteUtxoEntry(utxoBucket, outpoint)
	if err != nil {
		return err
	}

case !entry.isModified():

default:
	// Entry is fresh and needs to be put into the database.
	err := dbPutUtxoEntry(utxoBucket, outpoint, entry)
	if err != nil {
		return err
	}
}

delete(s.cachedEntries.maps[i], outpoint)

Collaborator Author

Addressed

// get changed during the execution of this method.
func (b *BlockChain) InitConsistentState(tip *blockNode, interrupt <-chan struct{}) error {
s := b.utxoCache
// Load the consistency status from the database.
Member

Style nit: missing space above.

Collaborator Author

Addressed

@Roasbeef Roasbeef added this to the v0.24 milestone Dec 9, 2023
This change is part of the effort to add utxocache support to btcd.

sizehelper introduces code for 2 main things:
    1: Calculating how many entries to allocate for a map given a size
       in bytes.
    2: Calculating how much a map takes up in memory given the entries
       were allocated for the map.

This functionality is useful for allocating maps so that they stay
below a certain number of bytes.  Since Go maps always allocate buckets
in powers of 2 (2^B buckets, where B is picked by the runtime for the
given map), a map may allocate too much memory.  For example, for a map
that can store 8GB of entries, the map will grow to 16GB once it is full
and the caller puts one more entry into it.

If we want to give a memory guarantee to the user, we can either:
    1: Limit the cache size to fixed sizes (4GB, 8GB, ...).
    2: Allocate a slice of maps.

The sizehelper code helps with (2).
@kcalvinalvin
Collaborator Author

Rebased and addressed the comments from roasbeef

This change is part of the effort to add utxocache support to btcd.

mapslice allows the caller to allocate a fixed amount of memory for the
utxo cache maps without the mapslice going over that fixed amount of
memory.  This is useful as we can have variable sizes (1GB, 1.1GB, 2.3GB,
etc) while guaranteeing a memory limit.
This change is part of the effort to add utxocache support to btcd.

The fresh flag indicates that the entry is fresh and that the parent
view (the database) hasn't yet seen the entry.  This is very useful as
a performance optimization for the utxo cache: if a fresh entry is
spent, we can simply remove it from the cache and not bother trying to
delete it from the database.
This change is part of the effort to add utxocache support to btcd.

Getting the memory usage of an entry is very useful for the utxo cache
as we need to know how much memory all the cached entries are using to
guarantee a cache usage limit for the end user.
This change is part of the effort to add utxocache support to btcd.

The utxoStateConsistency indicates the last block that the utxo cache
got flushed at.  This is useful for recovery purposes: if the node is
unexpectedly shut down, we know which block to start rebuilding the
utxo state from.
This change is part of the effort to add utxocache support to btcd.

dbPutUtxoView handled putting and deleting new/spent utxos from the
database.  These two functionalities are refactored into their own
functions: dbDeleteUtxoEntry and dbPutUtxoEntry.

Refactoring these out allows the cache to call these two functions
directly instead of having to create a view and saving that view to
disk.
This change is part of the effort to add utxocache support to btcd.

connectBlock may have an empty utxoviewpoint as the block verification
process may be using the utxo cache directly.  In that case, a nil utxo
viewpoint will be passed in.  Just return early on a nil utxoviewpoint.
This change is part of the effort to add utxocache support to btcd.

Require the caller to pass in the utxoBucket as the caller may be
fetching many utxos in one loop.  Having the caller pass it in removes
the need for dbFetchUtxoEntry to grab the bucket on every single fetch.
This change is part of the effort to add utxocache support to btcd.

fetchInputUtxos had mainly 2 functions:
1: Figure out which outpoints to fetch
2: Call fetchUtxosMain to fetch those outpoints

Functionality for (1) is refactored out to fetchInputsToFetch.  This is
done to allow fetchInputUtxos to use the cache to fetch the outpoints
as well in a later commit.
The utxocache implements connectTransactions just like the
utxoviewpoint and can be used as a drop-in replacement for it.

One thing to note is that unlike the utxoViewpoint, the utxocache
immediately deletes the spent entry from the cache.  This means that the
utxocache is unfit for functions like checkConnectBlock where you expect
the entry to still exist but be marked as spent.

disconnectTransactions is purposely not implemented as using the cache
during reorganizations may leave the utxo state inconsistent if there is
an unexpected shutdown.  The utxoViewpoint will still have to be used
for reorganizations.
This change is part of the effort to add utxocache support to btcd.

The utxo cache is now used by the BlockChain struct.  It's used by
default and the minimum cache size is set to 250MiB.  The change helps
speed up block/tx validation as the cache allows for much faster lookup of utxos.
The initial block download in particular is improved as the db i/o
bottleneck is remedied by the cache.
PruneBlocks used to delete files immediately before the database
transaction finished.  By making the prune atomic, we can guarantee that
the database flush will happen before the utxo cache is flushed,
ensuring that the utxo cache is never in an irrecoverable state.
flushNeededAfterPrune returns true if the utxocache needs to be flushed
after the pruning of the given slice of block hashes.  For the utxo
cache to be recoverable while pruning is enabled, we need to make sure
that the blocks since the last utxo cache flush still exist.  If blocks
after the last utxo cache flush are deleted, the utxo set is
irrecoverable.  The added method provides a way to tell if a flush is
needed.
If the prune will delete blocks past the last flush hash of the
utxocache, the cache needs to be flushed first to avoid a case
where the utxocache is irrecoverable.  The newly added code adds this
flush logic to connectBlock.
The testing function in export_test.go is moved to export.go so that
callers outside the ffldb package are able to call the function.

The main use for this is so that the prune code can be triggered from
the blockchain package.  This allows testing code to have less than
1.5GB worth of blocks to trigger the prune.
@kcalvinalvin
Collaborator Author

All fixup commits squashed now in the latest push.

@Roasbeef Roasbeef merged commit 941e42c into btcsuite:master Dec 19, 2023
3 checks passed