
feat: Avoid loading all notes in memory #2183

Merged (1 commit, Apr 17, 2023)

Conversation

@douglaz (Contributor) commented Apr 10, 2023

Stream data from the DB in descending denomination order, then use only the required notes. Also calculate and use a summary of the notes when only metrics like the total amount are required.

Although this started as a solution to #82 and may improve the speed of select_notes, the focus is mostly on reducing memory usage.
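For illustration, the summary idea can be sketched like this (stand-in types and a hypothetical helper, not the PR's actual code): count notes per denomination and keep a running total directly from the streamed amounts, without keeping note bodies in memory.

use std::collections::BTreeMap;

// Stand-in for fedimint's Amount type.
type Amount = u64;

/// Accumulate a per-denomination count and a running total without
/// materializing the notes themselves in memory.
fn summarize(denominations: impl Iterator<Item = Amount>) -> (BTreeMap<Amount, usize>, Amount) {
    let mut tiers: BTreeMap<Amount, usize> = BTreeMap::new();
    let mut total: Amount = 0;
    for amount in denominations {
        *tiers.entry(amount).or_insert(0) += 1;
        total += amount;
    }
    (tiers, total)
}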

codecov bot commented Apr 10, 2023

Codecov Report

Patch coverage: 84.26%; project coverage change: +0.52% 🎉

Comparison is base (3c15cba) 59.22% compared to head (8bdf528) 59.75%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2183      +/-   ##
==========================================
+ Coverage   59.22%   59.75%   +0.52%     
==========================================
  Files         152      153       +1     
  Lines       31759    32213     +454     
==========================================
+ Hits        18810    19248     +438     
- Misses      12949    12965      +16     
Impacted Files Coverage Δ
fedimint-cli/src/lib.rs 5.65% <0.00%> (+0.04%) ⬆️
gateway/ln-gateway/src/actor.rs 46.19% <0.00%> (+1.35%) ⬆️
modules/fedimint-mint-client/src/lib.rs 0.51% <0.00%> (-0.02%) ⬇️
fedimint-core/src/db/mod.rs 84.97% <55.20%> (-2.63%) ⬇️
fedimint-rocksdb/src/lib.rs 81.46% <90.96%> (+9.96%) ⬆️
fedimint-client-legacy/src/mint/mod.rs 88.07% <93.13%> (+2.06%) ⬆️
fedimint-client-legacy/src/lib.rs 78.71% <100.00%> (+0.56%) ⬆️
fedimint-core/src/db/mem_impl.rs 90.40% <100.00%> (+1.01%) ⬆️
fedimint-core/src/db/notifications.rs 95.39% <100.00%> (+0.25%) ⬆️
fedimint-core/src/tiered.rs 91.66% <100.00%> (+0.30%) ⬆️
... and 2 more

... and 22 files with indirect coverage changes



let (big_note_amount, big_note, checkpoint) =
last_big_note_checkpoint.expect("to have at least one big note");
// we can't make exact change, remove all small notes after the big note
selected.truncate(checkpoint);
Contributor:

I'm confused about what's going on here. Either there's something wrong, or this function might need a few more comments to explain things to a bit slower (yet still valuable!) developers like myself. :D

Contributor (author):

Take a look at the new comments
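For readers without the diff at hand, here is a hedged sketch of the checkpoint mechanism (stand-in types and illustrative names, not the exact PR code):

use futures::{Stream, StreamExt};

// Stand-in for fedimint's Amount type; the real selector works on
// SpendableNotes rather than a generic N.
type Amount = u64;

async fn select_notes_greedy<N>(
    mut notes_by_descending_amount: impl Stream<Item = (Amount, N)> + Unpin,
    requested_amount: Amount,
) -> Vec<(Amount, N)> {
    let mut selected = Vec::new();
    let mut pending = requested_amount;
    // (amount, note, how many notes were already selected when we saw it)
    let mut last_big_note_checkpoint: Option<(Amount, N, usize)> = None;

    while pending > 0 {
        match notes_by_descending_amount.next().await {
            // The note fits: take it and shrink the pending amount.
            Some((amount, note)) if amount <= pending => {
                pending -= amount;
                selected.push((amount, note));
            }
            // The note is too big for what remains: remember it, together
            // with a checkpoint of the selection length, in case the smaller
            // notes cannot make exact change.
            Some((amount, note)) => {
                last_big_note_checkpoint = Some((amount, note, selected.len()));
            }
            // Stream exhausted without exact change: remove all small notes
            // selected after the big note and overshoot with the big note.
            None => {
                let (amount, note, checkpoint) =
                    last_big_note_checkpoint.expect("to have at least one big note");
                selected.truncate(checkpoint);
                selected.push((amount, note));
                break;
            }
        }
    }
    selected
}

The expect mirrors the snippet quoted above: assuming the total balance was already checked against the requested amount, a too-big note must have been seen whenever the smaller notes cannot cover the remainder.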


#[test_log::test(tokio::test)]
async fn select_notes_returns_exact_amount_with_minimum_notes() {
let f = || {
Contributor:

.clone() a Vec<_> instead of a lambda?

Contributor (author):

It's fine as is

.into_iter()
.flat_map(|(amount, number)| vec![(amount, 0usize); number])
.sorted()
.rev(),
Contributor:

.rev confuses me here. It says it's sorted... I didn't expect it to be in reversed order.

Contributor (author):

Renamed

futures::stream::iter(
notes
.into_iter()
.flat_map(|(amount, number)| vec![(amount, 0usize); number])
Contributor:

How ... what ... that 0usize. I don't get it. Why do we need a constant 0 here? What is that usize in futures::Stream<Item = (Amount, usize)>?

Contributor (author):

You can think of the zero as something like a SpendableNote, just a mock for test purposes. But note this code already exists; I've just changed it to a stream and moved it to perhaps a better place: https://github.com/fedimint/fedimint/blob/master/fedimint-core/src/tiered_multi.rs#L447
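For context, the test-helper pattern being discussed looks roughly like this when pulled together (a sketch with stand-in types; the real helper lives in tiered_multi.rs as linked above):

use futures::stream;
use itertools::Itertools;

// Stand-in for fedimint's Amount type.
type Amount = u64;

/// Expand (denomination, count) pairs into a descending stream of mock
/// notes; the 0usize payload stands in for a SpendableNote, since note
/// selection only looks at the amounts.
fn mock_notes(tiers: Vec<(Amount, usize)>) -> impl futures::Stream<Item = (Amount, usize)> {
    stream::iter(
        tiers
            .into_iter()
            .flat_map(|(amount, count)| vec![(amount, 0usize); count])
            .sorted()
            .rev(), // descending denomination order, as the selector expects
    )
}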

Contributor:

Could it be () if it doesn't mean anything? Or does it have to be 0usize? 🗡️

I still don't get it, but whatever - not very important.

Contributor:

A comment would be helpful; I am confused too.

Contributor (author):

See now

let stream = isolated
.raw_find_by_prefix_sorted_descending(key_prefix)
.await?
.collect::<Vec<_>>()
Contributor:

This seems to collect into a Vec, which seems undesirable. I see that raw_find_by_prefix was doing it already, but isn't this ruining the benefits of streaming? We are still going to load all the matching notes into memory at once, and just pretend later that it was streaming...

Contributor (author):

Nice catch, this was an oversight.

Contributor (author):

This is trickier than it seems. The borrow checker isn't happy if you remove the collect. I know it is feasible because if I inline the code of IsolatedDatabaseTransaction then it works: #2188

@m1sterc001guy any ideas here?

Contributor:

This is where I got stuck too :) Unless @dpc has ideas, I think it's fine to just inline the isolated code here.

Contributor (author):

Will be addressed in #2202

Contributor (author):

In fact it's blocked by that PR because raw_find_by_prefix_sorted_descending will be implemented in the same way

Contributor:

This requires a rebase now I assume?

.expect("Error doing prefix search in database")
.map(move |(key_bytes, value_bytes)| {
let key = KP::Record::from_bytes(&key_bytes, &decoders)
.expect("Unrecoverable error reading DatabaseKey");
Contributor:

Maybe let's print at least the key (in case the value is somehow sensitive) to help debugging if this ever happens? We had AbbreviateHexBytes to help with that.

Contributor (author):

Added context

let mut next_prefix = prefix.to_vec();
let mut increased = false;
for i in (0..next_prefix.len()).rev() {
if next_prefix[i] < 0xff {
Contributor:

if next_prefix[i] == 0xff seems easier to grasp.

Contributor (@dpc, Apr 10, 2023):

for i in (0..next_prefix.len()).rev() {
  let next_byte = next_prefix[i].wrapping_add(1);
  next_prefix[i] = next_byte;
  if next_byte != 0 {
    increased = true;
    break;
  }
}

or something like that?

Contributor:

That increased is kind of a meh name. We're always going to increase that number... kind of. Maybe non_zero_digit? Especially with my version it seems a bit clearer.

Contributor (author):

Take a look now

(Outdated thread on fedimint-rocksdb/src/lib.rs, resolved.)
}
}
if !increased {
bail!("Prefix is already the max one: {prefix:?}")
Contributor:

I don't think we should panic here. Seems like handling 0xff... should be taken care of. Return an Option maybe?

Contributor (author):

It isn't a panic, it's an error. In fact, it could be a panic, because it's a bug (an invalid prefix has been created).

But if we returned an Option, the caller should do the same, IMHO.

Contributor:

Huh, why is 0xff an invalid prefix? We might declare it as such because it's convenient, but it's not obviously true to me.

Correct me if I'm wrong, but in that case we might want to return the empty prefix vec![] so that we start iterating from the absolute last element when iterating in reverse order. Naturally, the 0xff prefix should come first then.
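Putting the two suggestions together, a self-contained sketch of the prefix-successor computation might look like this (an Option-returning variant, as proposed in this thread; an assumption, not the merged code):

/// Smallest byte string strictly greater than every key that starts with
/// `prefix`, i.e. the exclusive upper bound for a reverse prefix scan.
/// Returns None when the prefix is empty or all 0xff, since no such bound
/// exists; the caller decides how to handle that case.
fn next_prefix(prefix: &[u8]) -> Option<Vec<u8>> {
    let mut next = prefix.to_vec();
    for i in (0..next.len()).rev() {
        let (byte, overflowed) = next[i].overflowing_add(1);
        next[i] = byte;
        if !overflowed {
            // This byte absorbed the increment; lower bytes already wrapped
            // to 0x00, which keeps the bound correct.
            return Some(next);
        }
    }
    None
}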

(Outdated thread on fedimint-rocksdb/src/lib.rs, resolved.)
) -> Result<PrefixStream<'_>> {
let mut str_prefix = "".to_string();
for prefix in key_prefix {
str_prefix = format!("{str_prefix}{prefix:02X?}");
Contributor:

This seems like a meh way to format this prefix, allocating a new string for every byte.

format!("{}%", to_hex(key_prefix));

kind of thing should work instead?

Contributor (author, @douglaz, Apr 10, 2023):

Yeah, it seems inefficient

Contributor (author):

Replaced by hex::encode
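The replacement is presumably along these lines (the % LIKE wildcard is taken from the suggestion above; the exact query shape is assumed):

// One allocation for the whole prefix instead of a new String per byte.
// hex::encode produces lowercase; SQLite's LIKE is ASCII case-insensitive
// by default, so the change in case is assumed to be harmless here.
let str_prefix = format!("{}%", hex::encode(key_prefix));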

let results = self.tx.fetch_all(query_prepared).await;

if results.is_err() {
warn!("sqlite find_by_prefix failed to retrieve key range. Returning empty iterator");
Contributor (@dpc, Apr 10, 2023):

Is this really an error? I don't think queries that return nothing should unconditionally lead to warning messages in the logs.

Oh, I see that it was like that in the previous code...

Oh and that formatting as well... :D ...

Contributor:

It's not mandatory, but please consider making it better (here and in the existing version) while at it. :)

Contributor:

As mentioned above, we should probably also make raw_find_by_prefix return a Result so that we can just return the error here.

Contributor (author, @douglaz, Apr 13, 2023):

I've just added a FIXME, this looks suitable to be improved in a future PR

Contributor:

You already made the return type a Result by now, it seems; could we return the error here now?

return Ok(Box::pin(stream::iter(Vec::new())));
}

let rows = results.unwrap().into_iter().map(|row| {
Contributor:

This could all be

results.unwrap_or_else(|_| vec![]).into_iter().map(|row| {

and that if results.is_err() could be avoided altogether...

Is sqlite really returning an error on an empty result? That seems... not right... how is the caller supposed to figure out whether there were no matches or someone unplugged the HDD?

Contributor (author):

Don't ask me, I've just copied the code 😅

Contributor:

If I remember correctly, it was throwing an error :) There might be a better API to use as opposed to fetch_all, but at the time this was the easiest way to get it to work.

Contributor (author):

I'll take a closer look at sqlite, perhaps we can improve that

Contributor (author):

I've just added a FIXME, this looks suitable to be improved in a future PR

@dpc (Contributor) left a comment:

I had plenty of comments (one of them being that the actual streaming of notes doesn't work like streaming and fetches all notes anyway), but putting them aside, on the higher level...

What are we optimizing for exactly here, and should we get some benchmarks?

I assume that if we are going to stream the notes, then there's going to be some extra overhead of "give me more". When fetching all notes at once, I'd assume sqlite does everything once, returns it and that's about it. With an iterator (depending on how it is actually implemented) it could be: "give me one; here it goes; give me another one; here it goes; and another one please; there you go...". Each back and forth by necessity must have some overhead. Maybe negligible, maybe amortized, but still some.

With fetching all, we obviously waste (briefly, but still) some memory (allocation), and the db needs to overfetch some data.

But with an iterator we might actually be doing many more IO operations scattered over a longer time. Total load on the system might be higher, especially if we typically need to fetch... dunno... 75% of all notes anyway (given that big notes are... well... big, so there need to be fewer of them to compose a certain amount, one would expect).

If you have 1, 2, 4, 8 notes, you have 15 msats total, but if you make payments in the 4-7 range you usually need to fetch all but one note anyway. So I expect that, even with a decent balance and smallish payments, a good chunk of notes will still need to be fetched.

So I think it's fair to ask: is it worth it? The number of notes is not going to be sky high. I suspect the 50-150 range, because:

pub const DEFAULT_MAX_NOTES_PER_DENOMINATION: u16 = 3;

and we use power-of-2 tiers by default, so we can't be going all that far w.r.t. the number of tiers.

@dpc (Contributor) commented Apr 10, 2023

Oooohhh... I was confused. So the algorithm is going from largest to smallest...

But doesn't it have the opposite problem? If you need to make an odd-value payment... you're going to need to fetch almost all notes anyway, no? Because you need that 1-value note, which will come last, no matter what you do...

And then 3 out of 4 times you need all but the 2 smallest tiers...

These exponential odds seem to conspire against us!

@dpc (Contributor) commented Apr 10, 2023

It seems to me that instead of streaming notes, if performance is the goal, it would be best to just put an in-memory write-through cache in front of that notes-in-database thing, and always have it in memory after the initial load. 🤷

@douglaz (Contributor, author) commented Apr 10, 2023

> I had plenty of comments (one of them being that the actual streaming of notes doesn't work like streaming and fetches all notes anyway), but putting them aside, on the higher level...
>
> What are we optimizing for exactly here, and should we get some benchmarks?

Average/max memory usage and average I/O operations. Benchmarks would be nice, but I guess tricky to get?

> I assume that if we are going to stream the notes, then there's going to be some extra overhead of "give me more".

This may or may not be true. But even if it's true, note that the new code won't add more overhead for a rocksdb database. It will just read less of the existing iterator.

> When fetching all notes at once, I'd assume sqlite does everything once, returns it and that's about it. With an iterator (depending on how it is actually implemented) it could be: "give me one; here it goes; give me another one; here it goes; and another one please; there you go...". Each back and forth by necessity must have some overhead. Maybe negligible, maybe amortized, but still some.

This is a good point. Some notes:

  1. Both the new code and the old code will sort the data on sqlite (but you could argue there is a possible optimization where no sorting would be needed)
  2. Both the old code and the new code won't really stream on sqlite (I didn't change the fetch_all)
  3. Given the point above, you can see I treated sqlite as a second-class database. The whole idea was focused on rocksdb-like databases (where data is naturally sorted by key).

So this implementation doesn't really make anything better for databases like sqlite, but it also doesn't make it worse performance-wise.

> With fetching all, we obviously waste (briefly, but still) some memory (allocation), and the db needs to overfetch some data.
>
> But with an iterator we might actually be doing many more IO operations scattered over a longer time. Total load on the system might be higher, especially if we typically need to fetch... dunno... 75% of all notes anyway (given that big notes are... well... big, so there need to be fewer of them to compose a certain amount, one would expect).

Look at the implementation of fetch_all: it's iterators/streams all the way down. So it won't require more IO operations.

> If you have 1, 2, 4, 8 notes, you have 15 msats total, but if you make payments in the 4-7 range you usually need to fetch all but one note anyway. So I expect that, even with a decent balance and smallish payments, a good chunk of notes will still need to be fetched.
>
> So I think it's fair to ask: is it worth it? The number of notes is not going to be sky high. I suspect the 50-150 range, because:
>
> pub const DEFAULT_MAX_NOTES_PER_DENOMINATION: u16 = 3;
>
> and we use power-of-2 tiers by default, so we can't be going all that far w.r.t. the number of tiers.

If the number of notes is, and always will be, small, then it's fair to say: don't bother.

But is this really the case? Is there no such thing as a "big wallet"?

@douglaz (Contributor, author) commented Apr 10, 2023

> It seems to me that instead of streaming notes, if performance is the goal, it would be best to just put an in-memory write-through cache in front of that notes-in-database thing, and always have it in memory after the initial load. 🤷

Although the new code has the same or better performance (as far as can be determined by simple inspection), I can't say that's the goal. See the comment above. But I agree: putting everything in memory will probably be faster if you don't care about memory consumption.

@douglaz (Contributor, author) commented Apr 11, 2023

> @douglaz Currently we use a greedy note selection algorithm, but in the future we may want to choose notes that have a large enough anonymity set. How would this approach change?

Just to be clear, this PR opens up possibilities; it doesn't restrict you in any way. More specifically:

  1. It formalizes that notes can be streamed from the DB sorted by denomination amount. This was already the case, but it was actually broken due to small implementation details.
  2. It formalizes a TieredSummary for when you just need to know how many notes of each denomination you have. This structure was mostly there (without this fancy name), but now the code calculates it without loading the full notes into memory. A further optimization would be, as suggested by @elsirion elsewhere, to skip or delay decoding the full SpendableNote in this and other cases.
  3. It uses 1) to implement the greedy selection algorithm by descending denomination amount, reading only a subset of the notes in many cases (reducing IO) and keeping the minimal number of notes in memory (reducing memory usage).

Now if you want to implement an algorithm focused on common notes, you can use the ideas of 1) and 3) but perhaps read in ascending denomination order. Then you have at least two options (see the sketch after this comment):
A) If you have an up-to-date TieredSummary, you iterate over the stream keeping exactly what you need, stopping as soon as you are done
B) Without a TieredSummary, the accumulator keeps all of the lower-denomination notes read so far, stopping as soon as you are done

And of course you can do:
C) Same as B), but always reading until the end, without stopping earlier

Note that the current code on master implements C), focusing on high-denomination notes. This PR keeps that possible but, besides implementing 3), it also makes it easier to implement A) and B).

(Other than all that, if you have a TieredSummary at hand, there are other possibilities involving more complex DB queries, but I won't detail them now.)
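For illustration, variant B) might look roughly like this (stand-in types; a sketch under the assumptions above, not the PR's code). Reading in ascending order naturally favors spending many small, common notes, which is what a larger anonymity set would want:

use futures::{Stream, StreamExt};

// Stand-in for fedimint's Amount type.
type Amount = u64;

/// Variant B): accumulate notes from the smallest denomination up and stop
/// as soon as the running total covers the request. Returns None if the
/// stream ends before the requested amount is reached.
async fn select_notes_ascending<N>(
    mut notes_by_ascending_amount: impl Stream<Item = (Amount, N)> + Unpin,
    requested_amount: Amount,
) -> Option<Vec<(Amount, N)>> {
    let mut selected = Vec::new();
    let mut total: Amount = 0;
    while total < requested_amount {
        let (amount, note) = notes_by_ascending_amount.next().await?;
        total += amount;
        selected.push((amount, note));
    }
    Some(selected)
}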

@jkitman (Contributor) commented Apr 11, 2023

@douglaz Thanks for the detailed explanation, glad we have that case covered.

@douglaz (Contributor, author) commented Apr 13, 2023

Rebased

@douglaz (Contributor, author) commented Apr 13, 2023

Addressed code review

@douglaz force-pushed the limited_memory_usage_ng branch 2 times, most recently from 2cb0a55 to be65b79, on April 13, 2023 04:57
}
None => {
assert!(pending_amount > Amount::ZERO);
if requested_amount > total_amount {
Contributor:

nit: Do we need total_amount here? To me it seems we could just check if last_big_note_checkpoint is Some, which would also avoid the expect further down:

let (big_note_amount, big_note, checkpoint) =
last_big_note_checkpoint.expect("to have at least one big note");

Contributor (author):

I thought it would make things clearer, but it isn't strictly necessary. Changed to not use it.


@@ -15,6 +15,7 @@ sqlx = { version = "0.7.0-alpha.2", default-features = false, features = ["runti
fedimint-core ={ path = "../fedimint-core" }
rand = "0.8"
tracing = "0.1.37"
hex = "0.4.3"
Contributor:

We tried to get rid of the hex dependency some time ago and instead used the impl inside bitcoin_hashes. Not sure if it's worth the hassle though… @mxz42

Contributor:

Not sure either, but at some point we can depend on their hex crate, since we depend on bitcoin_hashes anyway. We also use the format_hex macro in a few places.


@douglaz (Contributor, author) commented Apr 14, 2023

Rebased

@douglaz (Contributor, author) commented Apr 14, 2023

Addressed code review

@elsirion (Contributor) left a comment:

Looks good to me; deferring to @m1sterc001guy for the DB-specific impls.

@@ -351,8 +351,16 @@ pub trait IDatabaseTransaction<'a>: 'a + MaybeSend {

async fn raw_remove_entry(&mut self, key: &[u8]) -> Result<Option<Vec<u8>>>;

/// Returns a stream of key-value pairs with keys that start with
/// `key_prefix`. No particular ordering is guaranteed.
Contributor:

Nit: we might want to make this ascending in the future

@m1sterc001guy merged commit bc488c2 into fedimint:master on Apr 17, 2023 (21 checks passed).
@douglaz deleted the limited_memory_usage_ng branch on April 17, 2023 19:48.
elsirion added commits to elsirion/fedimint referencing this pull request (Apr 18-20, 2023): "Avoid loading all notes in memory on input creation."