watchtower: reduce AckedUpdate storage footprint #7055
Conversation
Force-pushed e2ca32a to cf1d99c
Force-pushed cf1d99c to 01a530c
Force-pushed 1683be8 to 94005e4
Very cool PR! ✨ Did just a very light pass to understand the intuition behind the changes. Concept ACK! 🌮
Will provide a more detailed review after @Roasbeef has taken another look.
watchtower/wtdb/client_db.go (Outdated)

    // Get the range index for the session-channel pair.
    chanID := byteOrder.Uint64(dbChanID)
    index, err := c.getRangeIndex(
Here we could also use the in-memory cache exclusively if we know that all range indexes for all channels are already cached. Maybe not as critical if NumAckedUpdates isn't called too frequently.
yeah, the reason I didn't use the in-memory one here is that this iterates through all sessions (even ones we will never use again) and would result in them then being stored in the cache even if they are never needed again
Initial pass through the diff; need to revisit some of the existing client code to build a better mental model to review these changes.
At a glance, a few things pop out at me:
- will there ever actually be a gap in the commitment height that a tower has ACK'd?
- the new range index design seems to assume that a tower either deletes some random states, or that a client can somehow skip uploading a set of states (causing a gap in the set of ACK'd heights)
- for the mapping of the 32-byte cid, why can't we just use the existing 8-byte scid instead?
- We'll need to properly test these persistent changes to ensure they don't lead to a performance regression in practice. In particular I'm curious to see how this fares in a remote db setting, given there seems to be a few additional layers of bucket nesting (which can mean additional round trips w/ the current kv mapping)
watchtower/wtdb/range_index.go (Outdated)

    "sync"
    )

    // RangeIndex can be used to keep track of which numbers have been added to a
I think I'm missing the motivation a bit here, in which case would there be a series of disparate gaps in what we've uploaded to the tower?
In other words, afaict, even if we're down for some time, we won't have gaps in prior heights uploaded unless the tower actually deleted information.
In the case where we're switching to a tower and want to know the extent to which we've re-uploaded our prior states, we'd still only need to track a single height (the final height that we've uploaded).
in which case would there be a series of disparate gaps in what we've uploaded to the tower?
If we have more than one tower, we might switch to a session with a different tower for a bit if the first one goes down. Also, on startup we just pick a random session to start sending updates on. In the ideal case, yes, there would not be gaps, but we need to account for there being gaps. In future we may even want a more "round robin" type thing where we cycle through sessions more often & don't just use one till exhaustion before moving on to the next one.
In other words, afaict, even if we're down for some time, we won't have gaps in prior heights uploaded unless the tower actually deleted information.
There is nothing that ensures that we send the commitment update heights in a monotonically increasing way on a single session. On startup we just pick a random non-exhausted session to continue on.
In the case where we switching to a tower and want to know the extent that we've re-uploaded our prior states, we'd still only need to track a single height (the final height that we've uploaded).
Don't think I agree - if we have some of the commitments for a channel backed up to one tower and some to another, & then we want to remove the one tower, we just want to know the heights we uploaded to that session so that we know which ones we need to re-upload to the new tower
There is nothing that ensures that we send the commitment update heights in a monotonically increasing way on a single session.
Can you point to the code that makes this not monotonically increasing? I thought the queues involved in sending to the tower were FIFO?
Don't think I agree - if we have some of the commitments for a channel backed up to one tower and some to another, & then we want to remove the one tower, we just want to know the heights we uploaded to that session so that we know which ones we need to re-upload to the new tower
I couldn't find where this logic is - can you point me to it? Or is this something for the future?
I thought the queues involved in sending to the tower were FIFO?
they are FIFO. But what can happen is that: we send backups for heights 1, 2, 3 on session A. Then we restart but this time choose session B as our active session (cause maybe the tower for session A is temporarily down) so we send heights 4, 5, 6 on session B. Now we restart again and choose session A again etc.
See here where we load all our candidate sessions (map!) that we will iterate through to find one that works
I couldn't find where this logic is - can you point me to it? Or is this something for the future?
I think my comment above answers this. Let me know if it does not
So you're saying that there can be gaps on session A? This makes sense. It's been a while since I took real analysis, but what tripped me up is that you said the heights cannot be monotonically increasing. I thought, but can't find a good internet answer, that a monotonically increasing sequence could have gaps - the only constraint here being that the sequence increases. I am curious to know the answer but it doesn't matter for this patch.
oh, I see what you mean. I was mixing up the definition of "monotonically".
So:
- session seq number will be monotonically increasing without gaps.
- commitment height of a channel within a session is monotonically increasing and could possibly have gaps.
I think it's quite easy to work around those since we can cache a range index, and then there's no need for nesting down anymore after that if done properly, see: #7055 (comment)
I had a look at the migrations and the general approach, and I think this is a great idea. Even though some ranges might end up being tiny (in weird multi-WT scenarios), this shouldn't be an issue when compared to the status quo. Instead of iterating through the ranges a binary search might be better, but I don't think this is relevant for a one-off migration with just a couple of ranges (i.e. not millions).
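The binary-search idea mentioned above could look roughly like this in Go (a sketch with hypothetical types, not the PR's actual code, assuming the ranges are kept as a sorted slice):

```go
package main

import (
	"fmt"
	"sort"
)

// rng is an inclusive [Start, End] range. A range index keeps its ranges
// disjoint and sorted, which is what makes binary search possible.
// (Hypothetical representation, for illustration only.)
type rng struct{ Start, End uint64 }

// contains reports whether height h falls inside one of the sorted,
// disjoint ranges, using binary search instead of a linear scan.
func contains(ranges []rng, h uint64) bool {
	// Find the first range whose end is >= h; if h is covered at all,
	// it can only be by that range.
	i := sort.Search(len(ranges), func(i int) bool {
		return ranges[i].End >= h
	})

	return i < len(ranges) && ranges[i].Start <= h
}

func main() {
	ranges := []rng{{1, 3}, {7, 9}}
	fmt.Println(contains(ranges, 2)) // covered by [1, 3]
	fmt.Println(contains(ranges, 5)) // falls in the gap between ranges
}
```

As noted, for a one-off migration over a handful of ranges the linear scan is perfectly fine; the O(log n) lookup only matters if an index ever holds very many ranges.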
Force-pushed d192c7e to 51ff437
Aside from the RAM spike, the migration worked out just fine and my node did a few forwards. I restarted again to compact the DB:
Before the migration the DB had a size of 4650852352 bytes, so it's more like 4.17x. Nice, thanks a lot! PS: With gzip I was able to bring this down to 65 MByte, so maybe there's more potential.
ok cool - so with gzip, the gain is about 72x? awesome! Thanks for giving this a spin @C-Otto! Should get even more gains once the …

Also - my original rough estimate for the db gains a node would get if they only had 1 channel was 1792x. So am I correct in saying that the node you migrated has, over its lifetime, roughly always had about 24 active channels at any given time (very rough estimate)?
For each session I checked, I see a bunch of channels - more than 30 each. But I'd say 24 active channels per session sounds very realistic to me!
Force-pushed 1faac9f to fd37c72
Report

I wrote a script here that allows you to quickly create a large … Using this script, I created a …

End result:

After migrating this DB and then compacting it, I ended up with a db of size 185MB (gain of 16x).

NOTE: I found that first migrating the DB and then compacting it was significantly faster than first compacting, then migrating and compacting again. The only downside here is that if you migrate first, your db size will grow slightly (in my example, it grew to 3.2GB).

Migrating using multiple Txs vs 1 big Tx

The migration has been split up into multiple transactions so that the migration is less RAM intensive. Note that when @C-Otto tried the (all-in-one tx) migration with his ~5GB db, the average RAM usage was around 3GB, but it is worth noting that the encoding of certain items has changed since then, so it should be less now. @C-Otto, if you still have a copy of your previous DB around, it would be awesome to hear how this new multi-tx approach performs on it if you are willing to try it on a copy of that copy :)
Thanks, @ellemouton! I don't think I have the old DB anymore, but we'll get plenty of reports once this is released :) |
@bhandras: review reminder |
Looking really good! 🥇 Great work on the range index code too 🚀 My remaining question is whether to keep the new DB channel ID vs just using channel IDs?
Force-pushed ce108a6 to 2aefe41
LGTM 🚀 Excellent work @ellemouton! ✨
    }

    // Start the iteration and exit on condition.
    for k := k; k != nil; k, _ = c.Next() {
Just realized the kvdb cursor should have a Current() member so it's easier to write these types of loops.
Force-pushed 1aeeb50 to 7f40398
    // For each of these, create a new sub-bucket with this key in
    // cChanDetailsBkt. In this sub-bucket, add the cChannelSummary key with
    // the encoded ClientChanSummary as the value.
    err = chanSummaryBkt.ForEach(func(chanID, summary []byte) error {
This is not a RAM-intensive migration since the summaries are just small pkscripts, right? If it is intensive, could be worth splitting this ForEach into multiple transactions.
did a test that does this migration with 500 channels & found that the path does not even show up in the heap profile graph. Same for migration 3.
    continue
    }

    err = kvdb.Update(db.bdb(), func(tx kvdb.RwTx) error {
👍 the migrations should've been in separate tx's from the beginning
Small refactor just to make the upcoming commit easier to parse. In this commit, we make better use of the getChanSummary helper function.
In this commit, a migration of the tower client db is done. The migration creates a new top-level cChanDetailsBkt bucket and, for each channel found in the old cChanSummaryBkt bucket, creates a new sub-bucket. In the sub-bucket, the ClientChanSummary is then stored under the cChannelSummary key. The reason for this migration is that it will be useful in future when we want to store more easily accessible data under a specific client ID.
In this commit, a new channel-ID index is added to the tower client db with the help of a migration. This index holds a mapping from a db-assigned-ID (a uint64 encoded using BigSize encoding) to real channel ID (32 bytes). This mapping will help us save space in future when persisting references to channels.
In this commit, a new concept called a RangeIndex is introduced. It provides an efficient way to keep track of numbers added to a set by keeping track of various ranges instead of individual numbers. Notably, it also provides a way to map the contents & diffs applied to the in-memory RangeIndex (which uses a sorted array structure) to a persisted KV structure.
Refactor the putClientSessionBody to take in a session sub-bucket rather than the top-level session bucket. This is mainly to make an upcoming commit diff easier to parse.
In this commit, a RangeIndex set is added to the ClientDB along with getter methods that can be used to populate this in-memory set from the DB.
In preparation for an upcoming commit where some helper functions will need access to the ClientDB's ackedRangeIndex member, this commit converts those helper functions into methods on the ClientDB struct.
Add a shouldFail boolean parameter to the migtest ApplyMigrationWithDB in order to make it easier to test migration failures.
In this commit, we add the ability to add a wtdb version migration that does not get given a transaction but rather a whole db object. This will be useful for migrations that are best done in multiple transactions in order to use less RAM.
In this commit, the code for migration 4 is added. This migration takes all the existing session acked updates and migrates them to be stored in the RangeIndex form instead. Note that this migration is not activated in this commit. This is done in a follow up commit in order to keep this one smaller.
In this commit, a migration is done that takes all the AckedUpdates of all sessions, stores them in the RangeIndex pattern instead, and deletes each session's old AckedUpdates bucket. All the logic in the code is also updated in order to write to and read from this new structure.
Force-pushed 7f40398 to e52461b
LGTM 🐒
This PR contains a few migrations of the watchtower client DB.

Main Achievements:
- Reduce the storage footprint of AckedUpdates through the use of a new RangeIndex model.

Change Log:
1. Migrate ChanDetails Bucket

A ChannelDetails bucket is added, and each registered channel gets its own sub-bucket in this bucket. The existing channel summaries are moved here under the cChannelSummary key, and the old top-level cChanSummaryBkt is deleted. This is done so that we can store more details about a channel without needing to add new info to the encoded ChannelSummary.

2. Add a new Channel-ID index
A new channel-ID index is added. The index holds a mapping from db-assigned-ID (8 bytes) to real channel-ID (32 bytes). This mapping will allow us to preserve disk space in future when persisting references to channels.
3. Migrate AckedUpdates to the RangeIndex model.
This is the main meat of the PR.
Currently, for each session, we store its acked updates as follows:
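A rough sketch of the current per-session layout implied by the sizes quoted below (the bucket and key names here are illustrative, not the exact wtdb constants):

```
sessionAckedUpdatesBkt =>
    seqNum (2 bytes) => channelID (32 bytes) || commitHeight (8 bytes)

# 1024 updates * (2 + 32 + 8) bytes ≈ 43 KB per session;
# 60,000 sessions ≈ 2.5 GB.
```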
The default number of backups per session is 1024 and so this means that regardless of the number of channels the session is actually covering, it takes 1024 * (2 + 32 + 8) bytes per session. Depending on how busy your node is, you could have many thousands of sessions (this user has 60k which would mean 2.5GB!).
The only reason that we want to keep track of AckedUpdates, imo, is if we need to determine which updates of a channel have/have not been backed up, or if we want to replay backups to a different tower. So I argue that knowing the sequence number associated with each backup is not necessary.
Given this assumption, we can store the AckedUpdates using a RangeIndex (per channel). We will also make use of the new channel index described above (in point 2 of the change log) so that we only need 8 bytes per channel.

The RangeIndex is just a map from the start height to the end height of a range. This means that in the best case, where all commit heights are present, we only need to store the start and end height. And if there are holes, we store separate ranges, which is still more efficient since each range only requires 16 bytes (an 8-byte key for the start value and 8 bytes for the end value of the range). Our session can thus store its AckedUpdates as a set of such ranges per channel.

Here is a quick example showing the change of an index that starts empty and then gets the numbers 1, 2, 4, 3: starting from {}, adding 1 gives {1: 1}; adding 2 extends that range to {1: 2}; adding 4 opens a new range, {1: 2, 4: 4}; and adding 3 joins the two ranges into {1: 4}.
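The merge logic for that example can be sketched in Go as follows (a minimal illustration of the idea, not lnd's actual RangeIndex implementation; all names are hypothetical):

```go
package main

import "fmt"

// RangeIndex tracks a set of uint64s as disjoint inclusive ranges,
// stored as a map from a range's start height to its end height.
type RangeIndex struct {
	ranges map[uint64]uint64
}

func NewRangeIndex() *RangeIndex {
	return &RangeIndex{ranges: make(map[uint64]uint64)}
}

// Add inserts n into the set, merging it with adjacent ranges where
// possible so that the ranges stay disjoint.
func (r *RangeIndex) Add(n uint64) {
	// Already covered by an existing range? Then nothing to do.
	for start, end := range r.ranges {
		if n >= start && n <= end {
			return
		}
	}

	// Does n extend a range ending at n-1? If so, also check whether
	// the extended range now touches a range starting at n+1.
	for start, end := range r.ranges {
		if n > 0 && end == n-1 {
			if nextEnd, ok := r.ranges[n+1]; ok {
				// Join the two ranges into one.
				r.ranges[start] = nextEnd
				delete(r.ranges, n+1)
			} else {
				r.ranges[start] = n
			}
			return
		}
	}

	// Does n extend a range starting at n+1 downwards?
	if end, ok := r.ranges[n+1]; ok {
		delete(r.ranges, n+1)
		r.ranges[n] = end
		return
	}

	// Otherwise n starts a new single-element range.
	r.ranges[n] = n
}

func main() {
	idx := NewRangeIndex()
	for _, n := range []uint64{1, 2, 4, 3} {
		idx.Add(n)
	}
	fmt.Println(idx.ranges) // the ranges collapse to a single 1 => 4 entry
}
```

Note how adding 3 triggers the join case: the range ending at 2 is extended, found to touch the range starting at 4, and the two are merged with one value update and one key deletion, matching the "at most one value update & one key deletion" property described below.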
This means that every session only refers to a channel once and then just stores the index of ranges for that channel that is covered by the session.
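Sketched as a KV shape (again with illustrative names), the per-session structure becomes: each channel appears once via its 8-byte db-assigned ID, mapping to its range index:

```
sessionAckedRangeIndexBkt =>
    dbChanID (8 bytes) =>
        rangeStart (8 bytes) => rangeEnd (8 bytes)
```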
Any additions to the index will at most require one value update & one key deletion.
The PR contains the necessary logic to make sure that the calculation of how to update an index given a new commit height is as efficient as possible.
Note:
I originally had the changes in #7059 here too, since the bug in wtclientrpc makes it difficult to use the rpc to check the correctness of this PR. Separated it into its own PR now just to make the diff of this PR a bit smaller. But I suggest we get that PR in first even though the two are not dependent on each other.
Fixes #6886