channeldb: Index payments by sequence number #4261

Merged (6 commits) on Jun 10, 2020

Conversation

@carlaKC (Collaborator) commented May 7, 2020

First draft of migration for indexing payments, see #4164 for details.

payments-index-bucket
      <sequence nr>: <payment hash>
      <sequence nr>: <payment hash>

Since we have historic duplicate payments to the same hash, multiple entries in this index may point to the same payment. We take this into account when retrieving payments for a given hash by making sure that our sequence number matches what we're looking for.
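
As a rough illustration (not the exact code in this PR; the helper name and parameters are invented and the kvdb import path is assumed), writing one such entry could look like this, with the sequence number serialized big-endian so that key order matches numeric order:

```go
package channeldb

import (
	"encoding/binary"

	"github.com/lightningnetwork/lnd/channeldb/kvdb"
)

var paymentsIndexBucket = []byte("payments-index-bucket")

// putPaymentIndexEntry writes a single <sequence nr>: <payment hash> entry.
// Serializing the sequence number as big-endian keeps lexicographic key order
// equal to numeric order, which is what makes paginated range scans cheap.
func putPaymentIndexEntry(tx kvdb.RwTx, seqNr uint64, payHash [32]byte) error {
	indexes, err := tx.CreateTopLevelBucket(paymentsIndexBucket)
	if err != nil {
		return err
	}

	var key [8]byte
	binary.BigEndian.PutUint64(key[:], seqNr)

	return indexes.Put(key[:], payHash[:])
}
```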

This PR also includes unification of our pagination logic into a single paginator struct, since we use the same logic for invoices and payments. This could be split out into a separate PR; it was just useful to have it all in one place during development.

@carlaKC requested a review from cfromknecht May 11, 2020 08:10
@carlaKC added the v0.11 label May 12, 2020
@cfromknecht added this to the 0.11.0 milestone May 14, 2020
@carlaKC changed the title from "[WIP]: Index payments by sequence number" to "channeldb: Index payments by sequence number" May 15, 2020
@carlaKC force-pushed the 4164-indexpayments branch 2 times, most recently from d9d94d0 to 5d2b5bf on May 17, 2020 14:55
@carlaKC marked this pull request as ready for review May 18, 2020 08:13
@carlaKC force-pushed the 4164-indexpayments branch 2 times, most recently from 8771811 to f6adab1 on May 18, 2020 12:16
@joostjager (Contributor) left a comment:

A few comments are about commit structure. I think that ideally it should be possible to run every commit on its own without leaving the db in an unusable state for example. Maybe a bit more squashing can accomplish that, or a slightly different split.

channeldb/payment_control.go (resolved, outdated)
channeldb/payment_control_test.go (resolved)
channeldb/payment_control_test.go (resolved)
channeldb/payments_test.go (resolved)
channeldb/migration14/migration_test.go (resolved, outdated)
// payment hash.
// payments-sequence-index-bucket
// |--sequence-number: <payment hash>
// |--sequence-number: <payment hash>
Contributor:

two identical keys?

carlaKC (author):

Ah, wasn't clear on the schema for these comments. Wrapped in <> to indicate that they're keyed by value, not a string.

channeldb/payment_control.go (resolved)
channeldb/payment_control_test.go (resolved)

// appendDuplicatePayment adds a duplicate payment to an existing payment. Note
// that this function requires a unique sequence number.
func appendDuplicatePayment(t *testing.T, db *DB, paymentHash lntypes.Hash,
Contributor:

There is quite a bit of code in this PR to deal with duplicate payments. I am wondering for how long we are going to invest time in those? Maybe there is some kind of middle ground where we don't delete them completely, but only make them retrievable through a special endpoint. They would then also not need the indexing changes and all the associated test code.

carlaKC (author):

I think it's worth dealing with legacy in a way that is as seamless to the user as possible; extra endpoints feel like another thing for people to load up into their mental model of how lnd works. If we want to sunset duplicates, I think we'd at least need to communicate that pretty widely, then perhaps have a release which deletes and csv-dumps those payments for the user. We could go down that road, but I think it's a long term plan, and people are having issues with their ListPayments endpoint now, so I'd argue for tolerating them once more and then making a proper deprecation plan.

Contributor:

Agree on extra endpoints. We'd need to add them in 0.11 and remove the payments from the regular query calls in 0.12.

// index offset we have been provided is beyond our last
// item. Since we are paginating backwards, we simply
// start at the last item.
if indexKey == nil {
Contributor:

Not part of test case?

carlaKC (author):

This behaviour is pre-existing for our invoices pagination, so I wanted to separate it out. The previous commit moves the logic across as is, and this one improves it. But I think a better way would be to add the failing test then fix it, going to switch the order around.

@carlaKC (author) commented May 27, 2020

I think that ideally it should be possible to run every commit on its own without leaving the db in an unusable state for example.

Squashed and re-ordered a bit. The only questionable part is the first two commits (add index / delete index when we delete payments). I can squash them so that everything is done in one, but that makes for a pretty big commit.

I think the likelihood of us splitting these commits is pretty low (we wouldn't cherry-pick halfway through a migration), but it really comes down to reviewer preference, so happy to go either way (could even squash when we're closer to merge?).

@carlaKC requested a review from joostjager May 27, 2020 17:43
@cfromknecht (Contributor) left a comment:

nice changes! no major comments on the indexing itself, just some minor things focused on how we can build on the changes introduced in #4285

func createPaymentIndexEntry(tx kvdb.RwTx, sequenceNumber []byte,
	hash lntypes.Hash) error {

	indexes, err := tx.CreateTopLevelBucket(paymentsIndexBucket)
Contributor:

can we use the CreateTLB migration added in #4285 to avoid dynamic bucket creation?

channeldb/migtest/migtest.go (resolved, outdated)
channeldb/migration14/log.go (resolved, outdated)
channeldb/paginate.go (resolved)
channeldb/invoice_test.go (resolved)
@joostjager (Contributor) left a comment:

Tested migration, list payments still works. Did you test this manually with duplicates? I don't know how to get them really, maybe roll back to an old lnd release.

For commit structure, I'd squash the first three commits into one. That way it is a self-contained change that moves to the new db structure. I don't mind the commit being bigger in this case. But not a blocker.

return lntypes.Hash{}, err
}

hash, err := lntypes.MakeHash(paymentHash[:])
Contributor:

One thing to quickly check with @cfromknecht is how this payment index is going to work together with the upcoming AMP payments that don't have (a single) payment hash.

Contributor:

good point, that is still tbd as to the best way to do that. more than likely we will start to index payments by payment address since that will be a uniform identifier that also works for MPP. when you start getting to recurring payments we would need to index by set_id in order to differentiate logical payments to the same invoice.

this could be implemented by adding a type byte to the index to signify which index the following 32 bytes correspond to, but i don't know how much effort it would be for this PR. perhaps the best approach would be to deal with these details when we get there, and say that len(value) == 32 means payment hash. when we add the newer indexes, they could receive this type byte and be 33 bytes. i wouldn't want to block progress on this for some undecided details on AMP, but all in all i think we should be able to extend it as we need 👍
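
To make that concrete, a rough sketch of such a typed encoding (names and layout here are assumptions for illustration, not what this PR or the eventual AMP work settles on) could be:

```go
package channeldb

import "fmt"

type paymentIndexType uint8

const (
	// indexTypeHash marks a value that holds a 32-byte payment hash.
	indexTypeHash paymentIndexType = 0

	// indexTypePayAddr is a hypothetical future type for payment addresses.
	indexTypePayAddr paymentIndexType = 1
)

// serializeIndexValue prefixes the 32-byte identifier with a type byte so that
// legacy 32-byte values and newer 33-byte values can share the same bucket.
func serializeIndexValue(t paymentIndexType, id [32]byte) []byte {
	return append([]byte{byte(t)}, id[:]...)
}

// deserializeIndexValue accepts both forms: len == 32 is implicitly a payment
// hash, len == 33 carries an explicit type byte.
func deserializeIndexValue(value []byte) (paymentIndexType, [32]byte, error) {
	var id [32]byte
	switch len(value) {
	case 32:
		copy(id[:], value)
		return indexTypeHash, id, nil
	case 33:
		copy(id[:], value[1:])
		return paymentIndexType(value[0]), id, nil
	default:
		return 0, id, fmt.Errorf("unexpected index value length %d", len(value))
	}
}
```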

channeldb/payment_control_test.go (resolved, outdated)
channeldb/payments.go (resolved, outdated)
}

// At this point, we've exhausted the offset, so we'll
// begin collecting invoices found within the range.
resp.Invoices = append(resp.Invoices, invoice)
return 1, nil
Contributor:

I think it has a high risk of yagni

success: true,
failed: false,
success: false,
hasDuplicate: false,
Contributor:

In flight with a duplicate? Probably not going to worry about that

channeldb/paginate.go (resolved)
channeldb/payments_test.go (resolved)


channeldb/db.go (resolved)
for _, index := range indexList {
	// Write indexes for each of our sequence numbers.
	for _, duplicate := range index.sequenceNumbers {
		err := putIndex(bucket, duplicate, index.paymentHash)
Contributor:

Would it be worth keeping an in-memory set here to detect duplicate sequence numbers? As a sanity check

carlaKC (author):

Why would we need this in addition to the check in putIndex?
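
(For reference, the kind of check being pointed to might look roughly like the sketch below; it assumes putIndex refuses to overwrite an existing sequence number, and the exact signature is invented for illustration.)

```go
package channeldb

import (
	"fmt"

	"github.com/lightningnetwork/lnd/channeldb/kvdb"
)

// putIndex writes a sequence-number -> payment-hash entry and fails if the
// sequence number is already present, since each one should map to exactly
// one payment.
func putIndex(bucket kvdb.RwBucket, seqNr, paymentHash []byte) error {
	if existing := bucket.Get(seqNr); existing != nil {
		return fmt.Errorf("sequence number %x already in index", seqNr)
	}

	return bucket.Put(seqNr, paymentHash)
}
```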

@carlaKC (author) commented Jun 2, 2020

Did you test this manually with duplicates?

I haven't yet; I was thinking about manually inserting a duplicate into a local DB and then running the migration, but that probably isn't the same as a real legacy db.

For commit structure, I'd squash the first three commits into one.

Squashed 👌

@carlaKC (author) commented Jun 4, 2020

@guggero has some duplicates on his node and managed a successful dry run:

2020-06-04 12:34:52.208 [WRN] LTND: open /root/.lnd/lnd.conf: no such file or directory
2020-06-04 12:34:52.209 [INF] LTND: Version: 0.10.99-beta commit=v0.5-beta-4510-gb969eb4420354cc13f9d434da93fa093b64529ac, build=production, logging=default
2020-06-04 12:34:52.209 [INF] LTND: Active chain: Bitcoin (network=mainnet)
2020-06-04 12:34:52.209 [INF] LTND: Opening the main database, this might take a few minutes...
2020-06-04 12:34:52.487 [INF] CHDB: Checking for schema update: latest_version=16, db_version=13
2020-06-04 12:34:52.487 [INF] CHDB: Performing database schema migration
2020-06-04 12:34:52.487 [INF] CHDB: Applying migration #14
2020-06-04 12:34:52.487 [INF] CHDB: Creating top-level bucket: "pay-addr-index" ...
2020-06-04 12:34:52.487 [INF] CHDB: Created top-level bucket: "pay-addr-index"
2020-06-04 12:34:52.487 [INF] CHDB: Applying migration #15
2020-06-04 12:34:52.487 [INF] CHDB: Creating top-level bucket: "payments-index-bucket" ...
2020-06-04 12:34:52.487 [INF] CHDB: Created top-level bucket: "payments-index-bucket"
2020-06-04 12:34:52.487 [INF] CHDB: Applying migration #16
2020-06-04 12:34:52.487 [INF] CHDB: Migrating payments to add sequence number index
2020-06-04 12:34:52.494 [INF] LTND: dry run migration successful, exiting
2020-06-04 12:34:52.494 [INF] LTND: Shutdown complete

@cfromknecht (Contributor) left a comment:

mostly gtg here i think, main comment is that we can remove errNoIndexBucket because we can assume it exists due to the migration

channeldb/payment_control.go (resolved)

// fetchPaymentIndexEntry gets the payment hash for the sequence number provided
// from our payment indexes bucket.
func (p *PaymentControl) fetchPaymentIndexEntry(sequenceNumber uint64) (
Contributor:

is this method only used for testing?

channeldb/payment_control.go (resolved, outdated)

indexes := tx.ReadWriteBucket(paymentsIndexBucket)
if indexes == nil {
	return errNoIndexBucket
Contributor:

can't we assume this index always exists bc of the migration?

if err := kvdb.View(p.db, func(tx walletdb.ReadTx) error {
	indexBucket := tx.ReadBucket(paymentsIndexBucket)
	if indexBucket == nil {
		return errNoIndexBucket
Contributor:

same here wrt assuming the index exists. perhaps the error is not needed?

@carlaKC (author) commented Jun 9, 2020

remove errNoIndexBucket because we can assume it exists due to the migration

Indeed, removed this err check, don't need it anymore.

Also added a type enum for payments so that we're future-proofed for AMP. I assume we don't need a length since we'll always know what our own db types are, and I haven't bothered with splitting de/serialization out by type since we only have a single type (just read it out and sanity check for now; future changes can rework it to check the type and then read accordingly).

@carlaKC requested a review from cfromknecht June 9, 2020 17:05
@cfromknecht (Contributor) left a comment:

completed another pass, great that you were able to add the paymentIndexType so that we can extend this as we move to AMP! few last nits

channeldb/payment_control.go (resolved, outdated)

// paymentIndexTypeHash is a payment index type which indicates that we have
// created an index of payment sequence number to payment hash.
const paymentIndexTypeHash paymentIndexType = iota
Contributor:

nice addition 😎

we typically assign persisted enums to constants rather than using iota, otherwise reordering the definitions can break the mapping. not a blocker though, happy to change since i'll be extending this area in the near future
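
For example, the explicit-assignment style being suggested (shown as a sketch, not a diff from this PR) pins each persisted value:

```go
package channeldb

type paymentIndexType uint8

// Assign persisted enum values explicitly rather than relying on iota, so
// that reordering or inserting declarations cannot silently change what gets
// written to disk.
const (
	paymentIndexTypeHash paymentIndexType = 0

	// Future index types receive fixed values as well, for example:
	// paymentIndexTypePayAddr paymentIndexType = 1
)
```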

@@ -104,6 +104,14 @@ var (
	paymentsIndexBucket = []byte("payments-index-bucket")
)

var (
	errNoSequenceNumber = errors.New("sequence number not found")
Contributor:

in the next commit it looks like it's possible for these errors to be returned via the public interface? if so we should consider exposing them as well and adding godocs

Update our current tests to include lookup of duplicate payments. We
do so in preparation for changing our lookup to be based on a new
payments index. We add an append duplicate function which will add a
duplicate payment with the minimum information required to successfully
read it from disk in tests.

Add an entry to a payments index bucket which maps sequence number
to payment hash when we initiate payments. This allows for more
efficient paginated queries. We create the top level bucket in its
own migration so that we do not need to create it on the fly.

When we retry payments and provide them with a new sequence number, we
delete the index for their existing payment so that we do not have an
index that points to a non-existent payment.

If we delete a payment, we also delete its index entry. This prevents
us from looking up entries from indexes to payments that do not exist.

In our current invoice pagination logic, we would not return any
invoices if our offset index was more than 1 off our last index and we
were paginating backwards. This commit adds a test case for this
behaviour before fixing it in the next commit.

We now use the same method of pagination for invoices and payments.
Rather than duplicate logic across calls, we add a paginator struct
which can have query-specific logic plugged into it. This commit also
addresses an existing issue where a reverse query for invoices with an
offset larger than our last offset would not return any invoices. We
update this behaviour to act more like c.Seek and just start from the
last entry. This behaviour change is covered by a unit test that
previously checked for the lack of invoices.
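
A rough sketch of the shape this takes (all names invented; the real paginator in this PR differs in detail) shows both the pluggable per-query callback and the c.Seek-like fallback:

```go
package channeldb

// cursorLike is the small subset of a kvdb cursor that this sketch needs.
type cursorLike interface {
	Seek(key []byte) (k, v []byte)
	Last() (k, v []byte)
}

// paginator walks an index bucket and hands each entry to a query-specific
// callback, so that invoices and payments can share one pagination loop.
type paginator struct {
	cursor   cursorLike
	reversed bool

	// onItem holds the query-specific logic plugged into the shared loop.
	onItem func(k, v []byte) error
}

// start positions the cursor at the given offset. If the offset lies beyond
// the last entry and we are paginating backwards, we act like c.Seek and
// simply start from the last entry instead of returning nothing.
func (p *paginator) start(offset []byte) (k, v []byte) {
	k, v = p.cursor.Seek(offset)
	if k == nil && p.reversed {
		return p.cursor.Last()
	}

	return k, v
}
```
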
With our new index of sequence number to payment hash, it is possible for
more than one sequence number to point to the same hash because legacy
lnd allowed duplicate payments under the same hash. We now store these
payments in a nested bucket within the payments database. To allow
lookup of the correct payment from an index, we require matching of the
payment hash and sequence number.
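
Sketched with invented names (the real code reads payments and their legacy duplicates from the nested bucket on disk), the disambiguation step amounts to:

```go
package channeldb

import "errors"

// paymentEntry stands in for a stored payment or one of its legacy duplicates.
type paymentEntry struct {
	sequenceNr  uint64
	paymentHash [32]byte
}

// matchPayment returns the entry whose hash and sequence number both match
// the index entry, since several sequence numbers may point to the same hash.
func matchPayment(entries []paymentEntry, hash [32]byte,
	seqNr uint64) (*paymentEntry, error) {

	for i := range entries {
		if entries[i].paymentHash == hash && entries[i].sequenceNr == seqNr {
			return &entries[i], nil
		}
	}

	return nil, errors.New("no payment found for index entry")
}
```
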
Use the new paginator struct for payments. Add some tests which
specifically test cases on and around the missing index we force in our
test, to ensure that we properly handle this case. We also add a sanity
check in the test that checks that we can query when we have no
payments.
@cfromknecht (Contributor) left a comment:

LGTM 🤠
