Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wallet] Transaction creation overhaul #258

Merged

Conversation

LLFourn
Copy link
Contributor

@LLFourn LLFourn commented Jan 1, 2021

HAPPY NEW YEAR!

Background

While trying to do #114 I was making a mess.
This PR is an attempt to tackle the problems with the internals of transactions creation that made the PR difficult.

In addition it fixes #251.

Major Changes/Improvements

  • TxBuilders are not created directly but through build_tx and build_fee_bump methods on Wallet. To finally create the tx you call .finish() on the builder.
    • The same code is now used to create the fee bumping tx as a normal tx 🎉. A lot of the code from bump_fee is gone!
  • TxBuilder methods take &mut self rather than self -- makes building transactions more ergonomic since you can use chaining or assign the tx builder and just make calls on it.
  • TxBuilder.add_utxo now retrieves UTXO data and stores it. If the UTXO doesn't exist then you get an error early.
    • This fixes TxBuilder should validate add_utxo early #251 and is the main change intended to make Add foreign outputs to TxBuilder #114 easier since it allows normalizing the local and foreign UTXOs into the same list early.
    • TxBuilder.utxos has been removed in favor of repeatedly calling add_utxo (which is now more ergonomic because of the &mut self change). TxBuilder.utxos has been replaced with TxBuilder.add_utxos which doesn't overwrite the existing utxos but allows you to add a list on top the existing list.
    • TxBuilder's internal list of UTXOs is now stored as a BTreeMap to prevent duplicates.
  • wallet.get_utxo and wallet.get_descriptor_for_keychain are now exposed publicly.
    • I could have made them pub(crate) only but I think they are fine to be public.
  • The [cfg(test)] only method received_tx that was used internally for testing is now a macro so it can be used from doctests. It's called populate_test_db. Several no_run tests can now be run. See fb167ad and 5fad5e2 .
  • TxBuilder is now Clone.
  • Wallet is now Debug.

This opens up the door to fixing the following issues:

  • #144 - whether the tx has been confirmed or not can now be checked in add_utxo.
  • #153 - merging an existing transaction into the builder is now easy -- this is how build_fee_bump works internally.
  • #114 - adding a foreign utxo without losing the order is now easier since we normalize and populate UTXO data up front both local and foreign utxos can be placed in the same utxos Vec.

Notes to the reviewers

This is based of #255 so merge after that.
Sorry for the very large PR I wasn't able to nicely separate the external and internal changes easily.

Checklists

All Submissions:

  • I've signed all my commits
  • I followed the contribution guidelines
  • I ran cargo fmt and cargo clippy before committing

New Features:

  • I've added tests for the new feature
  • I've added docs for the new feature
  • I've updated CHANGELOG.md

@LLFourn LLFourn marked this pull request as draft January 1, 2021 04:01
@LLFourn
Copy link
Contributor Author

LLFourn commented Jan 1, 2021

Converted to a draft since I forgot to fix stuff behind feature flags!

@LLFourn LLFourn force-pushed the make_txbuilder_take_ref_to_wallet branch 2 times, most recently from 0ebce97 to 826b996 Compare January 2, 2021 02:24
@LLFourn
Copy link
Contributor Author

LLFourn commented Jan 2, 2021

Because I changed the testutils package I had to make the the dependency path relative (826b996) for it to work. Does this create a problem when publishing it?

This is ready for review now.

@LLFourn LLFourn marked this pull request as ready for review January 2, 2021 02:38
@LLFourn LLFourn force-pushed the make_txbuilder_take_ref_to_wallet branch from e780c5d to 104b802 Compare January 5, 2021 00:47
@afilini
Copy link
Member

afilini commented Jan 5, 2021

I haven't finished reading the code yet, but here are just a few things I've noticed that might need some more attention:

  1. I wonder if it's really worth switching to &mut self in TxBuilder: other than the coin_selection() method, I think finish() would also benefit from capturing the value, because it could consume it and prevent any use of the "empty" builder afterwards.
    • Regarding the coin selection specifically, I like your idea of removing the Database type argument and giving the user the responsibility to keep a reference to it manually. Though considering it would be (at least in my opinion) somewhat common maybe I would try to provide at least an example and possibly even some utilities to make it easier to do, but that can be part of the following PR.
    • I was also thinking that it might make sense to change the return type of every method (regardless of whether it remains a &mut ref or not) to a Result, just to keep it consistent with other methods like add_utxo() that have to return results.
    • And again regarding the methods that can fail: we should really be careful not to make methods that can create an inconsistent state internally when they fail (unless we want to do this by design, in which case again taking ownership of the struct would help because we just don't return it in case of error).
  2. I would still keep the utxos() method on TxBuilder, just because I think again it would be pretty commonly used. As you said the same can be achieved by just calling add_utxo() repeatedly, but in some cases it's nice to have a shortcut available.
  3. I was thinking that maybe renaming build_tx() to build_new_tx() would make it easier to understand what it does compared to build_fee_bump() (since that's also technically "building a tx", but not a "new" one, not doing it from scratch). But then again, I generally like shorter names when possible and the fact that it's "new" is probably pretty easy to understand implicitly. So I don't know about this, I just thought to write it down in case other people like this other name better, but even the current name is totally fine for me.

Then another question, not because it's something that should be fixed but because I wanted to understand this better: is there a specific reason why you chose BTreeMap for the utxos instead of HashMap? I don't think the fact that the items are sorted inside BTreeMap matters in this context, or does it?

@LLFourn
Copy link
Contributor Author

LLFourn commented Jan 6, 2021

@afilini thanks for taking a look.

I wonder if it's really worth switching to &mut self in TxBuilder: other than the coin_selection() method, I think finish() would also benefit from capturing the value, because it could consume it and prevent any use of the "empty" builder afterwards.

Just to be clear, it is implausible to have an api that consumes self but returns a Result but somehow preserves self for the caller. The only way is to make the return value Result<Self, (Self, Error)> which is terrible for ergonomics. So methods that return Result must take &mut self and return Result<&mut Self, Error>.
Now if we want to call finish in an API chaining way we cannot make it consume self since all the other methods return &mut self.

Of course we could just say that you always have to assign the builder and make finish consume:

let mut builder = wallet.build_tx();
builder.add_recipient(addr.script_pubkey(), 50_000)
           .add_utxo(my_utxo)?
           .manually_selected_only();
let (psbt, details) = builder.finish()?;

I am fond of the efficiency of the current API but I do see that the above may be better -- it's not like your codebase is going to be full of build_tx anyway so the one assignment that is saved is perhaps not as important as the compile time error you would get for using a builder after calling finish on it.

And again regarding the methods that can fail: we should really be careful not to make methods that can create an inconsistent state internally when they fail (unless we want to do this by design, in which case again taking ownership of the struct would help because we just don't return it in case of error).

See above. If they return Result then they self should not be consumed. If something fails you don't want to lose the builder. For example, if you fail to add a UTXO you may want to recover in some way by adding a different one.

I was also thinking that it might make sense to change the return type of every method (regardless of whether it remains a &mut ref or not) to a Result, just to keep it consistent with other methods like add_utxo() that have to return results.

IMO as a user having methods that are infallible but return Result is annoying (unless you have some plan to make them fallible in the future which may apply to some of them).

I would still keep the utxos() method on TxBuilder, just because I think again it would be pretty commonly used. As you said the same can be achieved by just calling add_utxo() repeatedly, but in some cases it's nice to have a shortcut available.

As I now recall the problem with utxos is that is will now overwrite the utxos of a previous transaction in a fee bump because fee bump inputs are added to the TxBuilder now in build_fee_bump. This would obviously be wrong and come to think of it it's probably wrong to let you overwrite recipients too (but maybe that one makes more sense)?

Perhaps it could be changed to add_utxos(vec_utxos) which 1. validates all the utxos and 2. adds them all if all of them were valid.

I was thinking that maybe renaming build_tx() to build_new_tx() would make it easier to understand what it does compared to build_fee_bump() (since that's also technically "building a tx", but not a "new" one, not doing it from scratch). But then again, I generally like shorter names when possible and the fact that it's "new" is probably pretty easy to understand implicitly. So I don't know about this, I just thought to write it down in case other people like this other name better, but even the current name is totally fine for me.

I was thinking of actually removing build_fee_bump in favor of something like builder.bump_fee(txid)? which you could call multiple times and it would merge all the fee bumps into one and make sure that the feerate was enough to replace all of them (there is a TODO in the code to do something like this). I decided not to go that far in the PR though but after this PR it is certainly possible. So I would keep build_tx as I hope it will be the only way of creating a new tx.

@afilini
Copy link
Member

afilini commented Jan 6, 2021

Just to be clear, it is implausible to have an api that consumes self but returns a Result but somehow preserves self for the caller. The only way is to make the return value Result<Self, (Self, Error)> which is terrible for ergonomics.

Yeah I totally agree with that, my point was actually slightly different: I was thinking about the fact that fallible methods could (when they do fail) leave the TxBuilder in an inconsistent state, if not implemented properly. This can't really happen right now because there's only one method that can fail (add_utxo()) and that one is "atomic". Implementing add_utxos() (more on that later) in a naive way would already potentially introduce this issue: if we were to just call add_utxo() a bunch of times, if a utxo in the middle fails then you'll have added some of them but not the others.

So basically I was suggesting that switching back to using self instead of &mut self would make it impossible to use the builder afterwards, avoiding this problem altogether. On the other end, like you said, it would force you to start over, which is not great.

Having thought about it a bit more I think I agree with you now on the benefits of the mutable reference: yesterday I was just dumping out everything I had in mind and one thing was "if we capture the value and don't return it, we don't have to worry about inconsistencies", and it felt like a good tradeoff. In fact now I see this more as a downside, because I can definitely see a few use-cases where you'd like to try doing something, but keep going if that fails.

I am fond of the efficiency of the current API but I do see that the above may be better -- it's not like your codebase is going to be full of build_tx anyway so the one assignment that is saved is perhaps not as important as the compile time error you would get for using a builder after calling finish on it.

Yes, this seems like a good compromise. As an alternative I guess we could have two methods that do the same thing, one which is more suitable to be chained (takes a reference) and one to use when you've assigned the variable somewhere, which takes its ownership. But I don't know, maybe it would end up being even more confusing...

IMO as a user having methods that are infallible but return Result is annoying (unless you have some plan to make them fallible in the future which may apply to some of them).

Yeah I think it's likely that over time we'll keep adding more methods that can fail or even update some of the ones we have (version() could be one, also the methods that deal with nSequence and nLockTime) so we would get to a point where like half of the methods can fail and half can't and as a user you'd have to guess which one can fail and add the question mark at the end, which is probably even more frustrating.

Perhaps it could be changed to add_utxos(vec_utxos) which 1. validates all the utxos and 2. adds them all if all of them were valid.

Yes, I also like this design better compared to the old utxos()

I was thinking of actually removing build_fee_bump in favor of something like builder.bump_fee(txid)? which you could call multiple times and it would merge all the fee bumps into one and make sure that the feerate was enough to replace all of them (there is a TODO in the code to do something like this). I decided not to go that far in the PR though but after this PR it is certainly possible. So I would keep build_tx as I hope it will be the only way of creating a new tx.

Makes sense, I like this!

@LLFourn
Copy link
Contributor Author

LLFourn commented Jan 6, 2021

In my rush yesterday I forgot to respond to:

Then another question, not because it's something that should be fixed but because I wanted to understand this better: is there a specific reason why you chose BTreeMap for the utxos instead of HashMap? I don't think the fact that the items are sorted inside BTreeMap matters in this context, or does it?

This is actually a brain error. Thanks for catching it. I was trying to preserve insertion and make it a set. Obviously BTreeMap<OutPoint,_> does not give you the former property. TBH the problem with Vec was that there was a test that caused a test failure here:

https://github.com/bitcoindevkit/bdk/blob/master/src/wallet/mod.rs#L2855

Notice how the outpoint is added again in builder for the fee bump? But the outpoint is already in there because it was in the original transaction! I am not sure how the test actually worked before (I guess by accident). But with my internals restructuring this broke something.

So I suggest:

  1. Change it back to Vec and fix the test (remove re-adding outpoint).
  2. Open an issue about de-duplicating in add_utxo(s) (i.e. use a Vec + HashSet to preserve order and perhaps error if a UTXO already exists in there or ignore it -- I'm not sure).

Yes, this seems like a good compromise. As an alternative I guess we could have two methods that do the same thing, one which is more suitable to be chained (takes a reference) and one to use when you've assigned the variable somewhere, which takes its ownership. But I don't know, maybe it would end up being even more confusing...

There's actually one weird trick that let's you do this but keep the same name for the method: https://github.com/dtolnay/case-studies/tree/master/autoref-specialization#background-method-resolution

But I think keeping things simpler is better. I doubt that assigning the builder will turn out to be a real pain point for anyone.

Copy link
Contributor

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Just make coin_selection an argument to .finish() or build_tx() and build_fee_bump() so there is no ambiguity in the order.

We could also add a second build function, something like

        let (psbt, details) = wallet.build_tx_with_coin_selection(AlwaysSpendEverything)
            .add_recipient(to_address.script_pubkey(), 50_000)
            .finish()?;

src/database/memory.rs Show resolved Hide resolved
src/wallet/mod.rs Outdated Show resolved Hide resolved
@afilini
Copy link
Member

afilini commented Jan 7, 2021

  1. Open an issue about de-duplicating in add_utxo(s) (i.e. use a Vec + HashSet to preserve order and perhaps error if a UTXO already exists in there or ignore it -- I'm not sure).

Actually there is deduplication already, here it is in the current master:

bdk/src/wallet/mod.rs

Lines 703 to 712 in a95a9f7

let builder_extra_utxos = builder
.utxos
.iter()
.filter(|utxo| {
!original_txin
.iter()
.any(|txin| &&txin.previous_output == utxo)
})
.cloned()
.collect::<Vec<_>>();

As always, not sure if that's the best thing to do, but it seemed reasonable when I wrote it. If there aren't really any downsides to this I guess we should stick with this behavior (and maybe document it?).

There's actually one weird trick that let's you do this but keep the same name for the method: https://github.com/dtolnay/case-studies/tree/master/autoref-specialization#background-method-resolution

That's really neat! But yeah, I think I agree with you: let's keep it simple. If we see people opening bug reports or complaining in general because of the panic in finish() we'll think about alternative strategies.

@afilini
Copy link
Member

afilini commented Jan 7, 2021

We could also add a second build function, something like

        let (psbt, details) = wallet.build_tx_with_coin_selection(AlwaysSpendEverything)
            .add_recipient(to_address.script_pubkey(), 50_000)
            .finish()?;

I think I still prefer the coin_selection() method, especially considering that we are planning to make some changes so that it can then be used anywhere in the chain. Mainly because it can also be used on builders assigned to a variable : like, I can imagine building a UI where the user does things and those are reflected internally on the builder. If you force the coin selection to be chosen before calling build_tx(), then you can't change that anymore. Instead if it's also available as a method you can also change it later on without having to start over.

Though, in this example, I can also see that it'd be nice to be able to construct a tx after the builder is updated, to show things likes size, fees, etc. @LLFourn maybe we should make the builder Clone? It might not be very efficient, but at least you can "iterate" and see what kind of transaction you get by playing around with the settings.

LLFourn added a commit to LLFourn/bdk that referenced this pull request Jan 11, 2021
@LLFourn LLFourn force-pushed the make_txbuilder_take_ref_to_wallet branch from c234be1 to d1f2dbe Compare January 11, 2021 03:17
LLFourn added a commit to LLFourn/bdk that referenced this pull request Jan 11, 2021
@LLFourn LLFourn force-pushed the make_txbuilder_take_ref_to_wallet branch from d1f2dbe to c7e7b15 Compare January 11, 2021 04:16
LLFourn added a commit to LLFourn/bdk that referenced this pull request Jan 11, 2021
@LLFourn LLFourn force-pushed the make_txbuilder_take_ref_to_wallet branch from c7e7b15 to 748882a Compare January 11, 2021 04:22
@LLFourn
Copy link
Contributor Author

LLFourn commented Jan 12, 2021

I've pushed changes to require the builder to be assigned and then make finish consume the builder. I think this is a good change :).

  1. Open an issue about de-duplicating in add_utxo(s) (i.e. use a Vec + HashSet to preserve order and perhaps error if a UTXO already exists in there or ignore it -- I'm not sure).

Actually there is deduplication already, here it is in the current master:

bdk/src/wallet/mod.rs

Lines 703 to 712 in a95a9f7

let builder_extra_utxos = builder
.utxos
.iter()
.filter(|utxo| {
!original_txin
.iter()
.any(|txin| &&txin.previous_output == utxo)
})
.cloned()
.collect::<Vec<_>>();

As always, not sure if that's the best thing to do, but it seemed reasonable when I wrote it. If there aren't really any downsides to this I guess we should stick with this behavior (and maybe document it?).

I also pushed changes to preserve this behavior but upon further reflection I don't really like it. I think we should consider changing the behavior. Consider the following rules:

  1. When you add_utxo the UTXO must be in your database otherwise it returns an error (i.e. it must be in list_unspent).
  2. When you see a tx in the mempool/chain, BDK deletes your UTXO from the database but adds the transaction.

The implication of these two rules is that add_utxo should error when the utxo in question does not appear in list_unspent (i.e. not in your database). The UTXO is already spent from add_utxo's point of view even if you are going to be adding it to the tx anyway because of the fee bump. We could make it not error for this specific case but I don't see a strong motivation for doing that and not doing it is internally much less complex.

What do you think?

@LLFourn maybe we should make the builder Clone?

yes.

LLFourn added a commit to LLFourn/bdk that referenced this pull request Jan 12, 2021
@LLFourn LLFourn force-pushed the make_txbuilder_take_ref_to_wallet branch from 748882a to 7a3924e Compare January 12, 2021 03:28
LLFourn added a commit to LLFourn/bdk that referenced this pull request Jan 12, 2021
LLFourn added a commit to LLFourn/bdk that referenced this pull request Jan 22, 2021
LLFourn added a commit to LLFourn/bdk that referenced this pull request Jan 22, 2021
Due to brain malfunction I made utxos into a BTree. This made a test
pass but is incorrect. The test itself was incorrect as per comment in

bitcoindevkit#258 (comment)

So I (1) reverted utxos back to a Vec, (2) fixed the test and expanded
the comment in the test.
@LLFourn LLFourn force-pushed the make_txbuilder_take_ref_to_wallet branch 3 times, most recently from e9a8633 to a1ac347 Compare January 22, 2021 04:04
Due to brain malfunction I made utxos into a BTree. This made a test
pass but is incorrect. The test itself was incorrect as per comment in

bitcoindevkit#258 (comment)

So I (1) reverted utxos back to a Vec, (2) fixed the test and expanded
the comment in the test.
And make Wallet Debug while I'm at it.
To replace the previously existing ".utxos"
@LLFourn LLFourn force-pushed the make_txbuilder_take_ref_to_wallet branch from a1ac347 to ff10aa5 Compare January 22, 2021 04:08
@LLFourn
Copy link
Contributor Author

LLFourn commented Jan 22, 2021

I've pushed changes to:

  1. Stop using Option trickery and just make finish consume self in line with discussion above. 890d619
  2. Change utxos back to a Vec and fix the test inline with what I proposed. 10fcba9
  3. Make TxBuilder Clone and Debug (I had to make Wallet Debug too which involved making the address validator trait Debug too). 6fe3be0
  4. added "add_utxos" method back -- @afilini I feel like just removing add_utxo now (since it just calls add_utxos internally). ff10aa5

@afilini
Copy link
Member

afilini commented Jan 22, 2021

Thanks, that looks really good! Reviewing it now.

added "add_utxos" method back -- @afilini I feel like just removing add_utxo now (since it just calls add_utxos internally).

I'd actually just keep it as a shortcut, I think some people might expect it to exist.

src/wallet/tx_builder.rs Outdated Show resolved Hide resolved
@LLFourn LLFourn force-pushed the make_txbuilder_take_ref_to_wallet branch from 71f930b to dbf8cf7 Compare January 29, 2021 01:33
Copy link
Member

@afilini afilini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks really good, sorry I didn't notice those (minor) issues in the CHANGELOG before :(

CHANGELOG.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@LLFourn
Copy link
Contributor Author

LLFourn commented Jan 30, 2021

@afilini Thanks. Fixed nits and turned on trailing whitespace detection for the future :)

@afilini afilini self-requested a review February 1, 2021 14:47
@afilini afilini merged commit 6689384 into bitcoindevkit:master Feb 1, 2021
@LLFourn LLFourn deleted the make_txbuilder_take_ref_to_wallet branch February 2, 2021 06:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TxBuilder should validate add_utxo early
4 participants