Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chainntnfs: introduce persistent height hint layer to ChainNotifier implementations #1278

Merged
merged 9 commits into from
Aug 23, 2018

Conversation

wpaulino
Copy link
Contributor

The role of the heightHint parameter within the ChainNotifier interface is to give them a lower bound on when to start looking for the earliest occurrence of the event. Usually this value isn't used as we remain online for the lifetime of the event. However, we currently have a concept of a "historical dispatch" of an event. This occurs when the stateless chain notifier instance is already live, and we receive a notification for an event with a heightHint value that is in the past. In this case, depending on various factors, we may have to fetch a set of blocks in the worst case to scan forward to determine if we need to dispatch an event. This can lead to some efficiency issues due to re-doing this process on every restart.

In this PR, we address this by introducing a persistent height hint layer to the different ChainNotifier implementations. Now, with this logic, we keep track of each transaction/outpoint we have registered a notification for as the chain progresses and update their height hint. This allows us to determine the exact height at which a transaction has been confirmed or an outpoint has been spent. By doing so, we optimize the ChainNotifier implementations to only have to fetch one block.

Fixes #1168.

A follow-up PR will move the logic of keeping track spend notifications and their height hints into the TxConfNotifier struct in order to further abstract the logic within the different ChainNotifier implementations.

@wpaulino wpaulino added this to the 0.5 milestone May 23, 2018
@@ -71,6 +70,8 @@ type BitcoindNotifier struct {

blockEpochClients map[uint64]*blockEpochRegistration

hintCache chainntnfs.HeightHintCache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

godoc comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

if tx.Block != nil {
spendDetails.SpendingHeight = tx.Block.Height
} else {
spendDetails.SpendingHeight = bestHeight + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we set the spending height to best height+1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's because the best height hasn't been updated to the latest known height yet, so we must account for it.

}

if len(ops) > 0 {
err := b.hintCache.CommitSpendHint(uint32(block.Height), ops...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be safe to do this before notifying the epochs? If so, I think it would produce more consistent behavior since the state of db is finalized before triggering downstream actions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure it would help as there's another database transaction within txConfNotifier.ConnectTip.

// encoded using the serialization format of the database.
func readElement(r io.Reader, element interface{}) error {
func ReadElement(r io.Reader, element interface{}) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we expose the plural versions of ReadElement and WriteElement while we're at it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't such thing 😛

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be here :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confirmedTxs = append(confirmedTxs, tx)
}

err := tcn.hintCache.CommitConfirmHint(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do we rewind the height hints for unconfirmed transactions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did an initial quick review.

return err
}

outpoint.Reset()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would move this to just before writing to it, for readability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

tcn.currentHeight, *ntfn.TxID,
)
if err != nil {
Log.Errorf("Unable to update confirm hint to %d for "+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it'd be ideal to prevent registering a notification for a transaction if we fail to add a height hint for it.

chainregistry.go Outdated
@@ -163,6 +163,8 @@ func newChainControlFromConfig(cfg *config, chanDB *channeldb.DB,
CoinType: activeNetParams.CoinType,
}

hintCache := chainntnfs.NewHeightHintCache(chanDB)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should use a separate db for the height hints? I'm thinking if a user can delete the heightHint.db file, and lnd will continue working, then it's nice to have them separate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. It's now under its own database within the network-specific chain directory.

lnd_test.go Outdated
@@ -1847,7 +1847,7 @@ func testChannelForceClosure(net *lntest.NetworkHarness, t *harnessTest) {
// operation upon restart. This will change the blockheights from what
// is expected by the test.
// TODO(bvu): refactor out this sleep.
duration := time.Millisecond * 300
const duration = 3 * time.Second
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about wrapping net.Alice.PendingChannels in a waitPredicate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@wpaulino wpaulino force-pushed the persistent-chain-notifiers branch 2 times, most recently from 40dc042 to fdfb776 Compare June 6, 2018 03:15
Copy link
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work on this one! Especially the sleep removal in integration tests, makes me very happy to see 😀

lnd_test.go Outdated

// Due to mining a large amount of blocks, we'll sleep to allow the UTXO
// nursery to catch up before restart.
time.Sleep(3 * time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know why the UTXO nursery is unable to handle the restart if this sleep is removed?

Ideally it should be able to. Maybe not something that needs to be fixed in this PR, but we should at least add a TODO to fix it and remove this sleep.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, this took me some time to figure out... Found out it's actually because of #1608. We're expecting things to kick off at height-1, but because we restarted, we now wait at height instead. So we basically time out the WaitPredicate because a block should be mined but isn't. Added a TODO which can be removed in your PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1608 has been merged.

// encoded using the serialization format of the database.
func readElement(r io.Reader, element interface{}) error {
func ReadElement(r io.Reader, element interface{}) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be here :)

// outpoint.
QuerySpendHint(op wire.OutPoint) (uint32, bool)

// PurgeSpendHint removes the height hint for the outpoints from the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: spend hint

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// CommitSpendHint commits a height hint for the outpoints to the cache.
CommitSpendHint(heightHint uint32, ops ...wire.OutPoint) error

// QuerySpendHint returns the latest height hint for an outpoint. The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: spend hint

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// cache.
PurgeSpendHint(ops ...wire.OutPoint) error

// CommitConfirmHint commits a height hint for the transactions to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: commit hint

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

pendingHtlc.BlocksTilMaturity)
}
}

return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big 👍 to all the changes above!

lnd_test.go Outdated
return false
}

// None of our outputs have been swept, so they should all be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "in limbo"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

return true
}, 15*time.Second)
if err != nil {
t.Fatalf(predErr.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent use of waitPredicates here!


// The htlc funds will still be shown as limbo, since they are still in
// their first stage. The commitment funds will have been recovered
// after the commit txn was included in the last block.
forceClose, err := findForceClosedChannel(pendingChanResp, &op)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh? How did this work before? 😛

t.Fatalf("unable to query for pending channels: %v", err)
}
assertNumForceClosedChannels(t, pendingChanResp, 1)
err = lntest.WaitPredicate(func() bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These wait predicates are quite big. I think it would make sense to add a comment above each of them, summarizing what they wait for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chatted w/ @wpaulino, I don't think this should be merged until a sound solution for #1220 is implemented. The current version can tolerate w/o historical dispatch since we always start from the og height hint, and all state is volatile. The persistent version will be less forgiving, and any errors will continue to persist across restarts.

@halseth made a great suggestion of making this a different db, so that we can drop/delete and reindex. This gives us a way to restart from scratch, assuming we are able to detect inconsistencies in the first place.

This change will require some reworking of the interfaces in this package, but I think they'll be worth. Ideally, we can tighten everything up so that we only do one db write per block connected/disconnected, that simulatenously updates all spend/commit hints.

Combined with #1220, this should vastly simplify the required recovery logic that would otherwise need to be added to this PR.

@@ -170,6 +186,14 @@ func (tcn *TxConfNotifier) Register(ntfn *ConfNtfn, txConf *TxConfirmation) erro
tcn.txsByInitialHeight[txConf.BlockHeight] = txSet
}
txSet[*ntfn.TxID] = struct{}{}

err := tcn.hintCache.CommitConfirmHint(
txConf.BlockHeight, *txConf.BlockHash,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why BlockHash? Would have guessed TxID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

return c.db.Batch(func(tx *bolt.Tx) error {
spendHints := tx.Bucket(spendHintBucket)
if spendHints == nil {
return ErrSpendHintNotFound
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this bucket is not found, we should probably return an error, but one indicating db corruption. this can be done on startup similar to the circuit map:
https://github.com/lightningnetwork/lnd/blob/master/htlcswitch/circuit_map.go#L214
if the bucket isn't found, we then report the database being corrupted:
https://github.com/lightningnetwork/lnd/blob/master/htlcswitch/circuit_map.go#L243

@@ -330,50 +333,95 @@ out:

// handleRelevantTx notifies any clients of a relevant transaction.
func (b *BitcoindNotifier) handleRelevantTx(tx chain.RelevantTx, bestHeight int32) {
// Check if this transaction spends any outputs that have existing spend
// notifications for it.
var outpointsSpent []wire.OutPoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: naming suggestion spentOutpoints :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@Roasbeef Roasbeef added P2 should be fixed if one has time first pass review done PR has had first pass of review, needs more tho needs testing PR hasn't yet been actively tested on testnet/mainnet labels Jul 10, 2018
@wpaulino wpaulino force-pushed the persistent-chain-notifiers branch 2 times, most recently from d02796f to 481b039 Compare August 15, 2018 01:53
@wpaulino
Copy link
Contributor Author

Feedback addressed, PTAL. Still need to figure out the best way to determine when we should prune the hints from the cache. Thinking it might be better to just expose a callback to the caller for more granularity. This could then be used, for example, once we detect a channel has been closed, as we'll no longer register for spends on the channel point.

@wpaulino wpaulino removed the blocked label Aug 15, 2018
@wpaulino wpaulino force-pushed the persistent-chain-notifiers branch 8 times, most recently from dcdc9bc to 396e3e0 Compare August 16, 2018 05:26
return err
}

err = spendHints.Put(outpoint.Bytes(), hint.Bytes())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure Put will work as expected when the slice key changes during the transaction. Just make sure we don't run into issues here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah my spidey sense are going off a bit here as well. I recall that in the past I've had issues with properly re-using a buffer like this. Could run a quick playground attempt.

In any case, the buffer is created by default with a 64-byte static buffer, so it won't allocate beyond that first set if it's done correctly. However the Bytes method slices into the underlying buffer. What we really need here is a copy otherwise, we may run into issues within bolt. So it's best just to declare a new buffer (var) at the top of each loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expanded tests to account for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


// QuerySpendHint returns the latest spend hint for an outpoint. The bool
// returned signifies whether we have a hint cached for this outpoint.
func (c *heightHintCache) QuerySpendHint(op wire.OutPoint) (uint32, bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead return error such that we can distinguish not found and actual error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


// QueryConfirmHint returns the latest confirm hint for a transaction. The bool
// returned signifies whether we have a hint cached for this transaction.
func (c *heightHintCache) QueryConfirmHint(txid chainhash.Hash) (uint32, bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, return error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

return err
}

err = confirmHints.Put(txHash.Bytes(), hint.Bytes())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should optimize by keeping the largest hint if it exists from before in the cache.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scratch that, guess that is added in a later commit.

// under one DB transaction, we'll gather the set of unconfirmed
// transactions along with the ones that confirmed at the current
// height.
var txs []chainhash.Hash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

equivalent to txs = tcn.txsByInitialHeight[tcn.currentHeight]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to be a slice so would have to iterate anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming suggestion: txnsToUpdateHints, or something along those lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to txsToUpdateHints.

for height, confirmedTxs := range tcn.txsByInitialHeight {
// Skip the transactions that confirmed at the current
// height as those have already been added.
if height == tcn.currentHeight {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using blockHeight makes it more clear that we are looking at the height of the new connected block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

out:
for maybeUnconfirmedTx := range tcn.confNotifications {
for height, confirmedTxs := range tcn.txsByInitialHeight {
// Skip the transactions that confirmed at the current
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not add them in the same loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a bit easier to parse as is.

txs = append(txs, confirmedTx)
}
out:
for maybeUnconfirmedTx := range tcn.confNotifications {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is a bit hard to parse what is going on here..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified why things are done this way in the comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, would say that it's still a bit hard to parse the first time around.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, it seems like you can just update the entire set of outstanding notifications at this instance. The worst case is that something will have its height hint (already confirmed, but not yet dropped off from the notifier) updated to a later height.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it'd be a good idea to do so. If we update it for something that already confirmed and attempt to query for its height hint later on, then it'll be off by some margin of blocks, causing us to miss the event.

txids := make([]chainhash.Hash, 0, len(matureTxids))
for txid := range matureTxids {
delete(tcn.confNotifications, txid)
txids = append(txids, txid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

txids looks unused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meant to remove that. Fixed.

@wpaulino wpaulino force-pushed the persistent-chain-notifiers branch 2 times, most recently from c0073a9 to e477355 Compare August 17, 2018 21:35
return err
}

err = spendHints.Put(outpoint.Bytes(), hint.Bytes())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah my spidey sense are going off a bit here as well. I recall that in the past I've had issues with properly re-using a buffer like this. Could run a quick playground attempt.

In any case, the buffer is created by default with a 64-byte static buffer, so it won't allocate beyond that first set if it's done correctly. However the Bytes method slices into the underlying buffer. What we really need here is a copy otherwise, we may run into issues within bolt. So it's best just to declare a new buffer (var) at the top of each loop.

// PurgeSpendHint removes the spend hint for the outpoints from the
// cache.
PurgeSpendHint(ops ...wire.OutPoint) error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about splitting this into two interfaces, as the set of methods can cleanly be split into two categories.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!


for _, op := range ops {
outpoint.Reset()
err := channeldb.WriteElement(&outpoint, op)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment here re-buffer.


var txHash bytes.Buffer
for _, txid := range txids {
txHash.Reset()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Examine buffer usage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


var txHash bytes.Buffer
for _, txid := range txids {
txHash.Reset()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Buf.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

}

if len(txs) > 0 {
err := tcn.hintCache.CommitConfirmHint(tcn.currentHeight, txs...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use a Tracef log here and below when updating height hints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided to keep the log statements within the calls themselves to avoid having to do it everywhere.

@@ -682,7 +712,20 @@ func (b *BitcoindNotifier) RegisterSpendNtfn(outpoint *wire.OutPoint,
return nil, err
}

if txOut == nil {
// If the output is unspent, then we'll write it to the cache with the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matters more for neutrino, but if the cached height hint is the same as the current height, then we can skip the GetUTXO query all together. This will save us filter fetches which will (eventually) add to the cache, and also possibly fetch a block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd still need to fetch a block to manually check for the spend, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At tip now, it should be cached so doesn't make a difference.

return nil, err
}

db, err := bolt.Open(filepath.Join(dbDir, dbName), dbFilePermission, nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than create a new database which will add yet another memory map, this should use the existing database. Arguably, the replay log should also be within the same database.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

lnd_test.go Outdated

// Due to mining a large amount of blocks, we'll sleep to allow the UTXO
// nursery to catch up before restart.
time.Sleep(3 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1608 has been merged.


// PurgeSpendHint removes the spend hint for the outpoints from the
// cache.
PurgeSpendHint(ops ...wire.OutPoint) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see hints purged anywhere in the PR as is. It seems safe to just use our regular re-org horizon. Since we now have persistent spend notifications for each channel via the contract court, we no longer run into an edge case of a output being spent, but peer not coming online for hundreds of blocks leading us to remove a stale height hint entry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems safe to just use our regular re-org horizon.

Due to the way the TxConfNotifier works, it's possible that we end up not removing some hints from the cache if we go down before some transactions reach the reorg horizon.

Since we now have persistent spend notifications for each channel via the contract court, we no longer run into an edge case of a output being spent, but peer not coming online for hundreds of blocks leading us to remove a stale height hint entry.

Not sure what you mean by "persistent spend notifications". Isn't this what the cache aims to solve?

@wpaulino wpaulino force-pushed the persistent-chain-notifiers branch 3 times, most recently from d7a2f12 to a7d8e78 Compare August 21, 2018 16:54
…tion

In this commit, we introduce two new interfaces: SpendHintCache and
ConfirmHintCache for a height hint cache. The SpendHintCache is
responsible for maintaining the earliest height at which an outpoint
could have been spent within the chain, while the ConfirmHintCache is
responsible for maintaining the earliest height at which a transaction
confirms within the chain. We also add an implementation of these
interfaces with a single struct HeightHintCache, backed by a channeldb
instance, which will cary the duties of the interfaces above.
In this commit, we extend our TxConfNotifier to cache height hints for
our confirmation events. Each transaction we've requested a confirmation
notification for will have its initial height hint cached. We increment
this height hint at every new block for unconfirmed transactions. This
allows us to retrieve the *exact* height at which the transaction has
been included in a block. By doing this, we optimize the different
ChainNotifier implementations since they will no longer have to scan
forward (and possibly fetch blocks in the neutrino/pruned node case)
from the initial height hint looking for the confirmation.
In this commit, we alter the different chain notifiers to query their
height hint cache before registering a confimation notification. We do
this as it's possible that the cache has a higher height hint, which
can potentially reduce the amount of blocked fetched when attempting
historical dispatches.
In this commit, we extend the different ChainNotifier implementations to
cache height hints for our spend events. Each outpoint we've requested a
spend notification for will have its initial height hint cached. We then
increment this height hint at every new block for unspent outpoints.
This allows us to retrieve the *exact* height at which the outpoint has
been spent. By doing this, we optimize the different ChainNotifier
implementations since they will no longer have to rescan forward (and
possibly fetch blocks in the neutrino/pruned node case) from the initial
height hint.
In this commit, we remove the sleep timeouts used within the channel
force closure integration test. This is needed because the recent changes
within the ChainNotifier require a longer timeout due to making a
database transaction on every new block to update the confirm/spend
hints of transactions. Rather than increasing the timeouts, we simply
remove them to ensure this isn't an issue down the road.
rescanning

In this commit, we modify the rescan options Neutrino uses when
performing a rescan for historical chain events to disable disconnected
block notifications. This is needed as the Neutrino backend will mutate
its internal state while rewinding, which causes disconnected block
notifications to be sent. Since the notifier acts upon these
notifications, they would cause it to also rewind unnecessarily.
// NewHeightHintCache returns a new height hint cache backed by a database.
func NewHeightHintCache(db *channeldb.DB) (*HeightHintCache, error) {
cache := &HeightHintCache{db}
if err := cache.initBuckets(); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good since I peeped last, did a review pass and most things look in place to me. I think the consensus was that we'd get this in w/ a few modifications, then work out another PR for purge spend hints.

txs = append(txs, tx)
}

err := tcn.hintCache.CommitConfirmHint(tcn.currentHeight, txs...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this section be moved before we dispatch the disconnect notifications? The main for loop doesn't seem to modify anything that this depends on

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, should be fine to move it before.

}

if len(txsToUpdateHints) > 0 {
err := tcn.hintCache.CommitConfirmHint(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great ordering here: prep, commit, notify 🔥

@@ -11,6 +12,90 @@ import (

var zeroHash chainhash.Hash

type mockHintCache struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

solid mock

}

if len(ops) > 0 {
err := b.spendHintCache.CommitSpendHint(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anyway we can commit the spend heights before delivering block epochs? In neutrino it also looks like we deliver the spend ntfns before commiting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the ordering here is crucial. The reason I made it to commit the new height hint after notifying is because we don't keep track of things that have already been notified, so we can just simply iterate over the current pending notifications.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this call is actually in the wrong place then, we only dispatch spend ntfns when they come through as relevant transaction. I'm not sure what the ordering guarantees are on those vs delivery of blocks, but it seems like it could be a little racey

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if there's a good way around this though, as we need to update the spend hints at some point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can probably merge it as is and then revisit later if we think of a more concrete solution

@@ -869,6 +869,18 @@ type confirmationNotification struct {
func (b *BitcoindNotifier) RegisterConfirmationsNtfn(txid *chainhash.Hash,
_ []byte, numConfs, heightHint uint32) (*chainntnfs.ConfirmationEvent, error) {

// Before proceeding to register the notification, we'll query our
// height hint cache to determine whether a better one exists.
if hint, err := b.confirmHintCache.QueryConfirmHint(*txid); err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That querying for height hints are done within the notifier, while committing to them is done in the txConfNotifier makes it a bit hard to follow.

Ideally we would just pass the original height hint along to the txConfNotifier, then it would query the cache to check if it had a more up to date hint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for this is because the notifier needs to know of the hint so that it can start the historical scan from there. I guess we can avoid the the hint cache dependency in the notifier by creating some proxy methods on the TxConfNotifier that query the hint cache from there.

@@ -373,6 +373,7 @@ out:
rescanUpdate := []neutrino.UpdateOption{
neutrino.AddAddrs(addrs...),
neutrino.Rewind(currentHeight),
neutrino.DisableDisconnectedNtfns(true),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why were these notifications not a problem before the height hint cache was added?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really a problem, but it causes unnecessary rewinding to happen.

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👗

@Roasbeef Roasbeef merged commit c9bead7 into lightningnetwork:master Aug 23, 2018
@wpaulino wpaulino deleted the persistent-chain-notifiers branch August 24, 2018 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
first pass review done PR has had first pass of review, needs more tho needs testing PR hasn't yet been actively tested on testnet/mainnet notifications P2 should be fixed if one has time
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants