Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wallet: Add CoinGrinder coin selection algorithm #27877

Merged
merged 15 commits into from
Feb 9, 2024

Conversation

murchandamus
Copy link
Contributor

@murchandamus murchandamus commented Jun 13, 2023

Please refer to the topic on Delving Bitcoin discussing Gutter Guard/Coingrinder simulation results.

Adds a coin selection algorithm that minimizes the weight of the input set while creating change.

Motivations

  • At high feerates, using unnecessary inputs can significantly increase the fees
  • Users are upset when fees are relatively large compared to the amount sent
  • Some users struggle to maintain a sufficient count of UTXOs in their wallet

Approach

So far, Bitcoin Core has used a balanced approach to coin selection, where it will generate multiple input set candidates using various coin selection algorithms and pick the least wasteful among their results, but not explicitly minimize the input set weight. Under some circumstances, we do want to minimize the weight of the input set. Sometimes changeless solutions require many or heavy inputs, and there is not always a changeless solution for Branch and Bound to find in the first place. This can cause expensive transactions unnecessarily. Given a wallet with sufficient funds, CoinGrinder will pick the minimal-waste input set for a transaction with a change output. The current implementation only runs CoinGrinder at feerates over 3×long-term-feerate-estimate (by default 30 ṩ/vB), which may be a decent compromise between our goal to reduce costs for the users, but still permit transactions at lower feerates to naturally reduce the wallet’s UTXO pool to curb bloat.

Trade-offs

Simulations for my thesis on coin selection (see Section 6.3.2.1 [PDF]) suggest that minimizing the input set for all transactions tends to grind a wallet’s UTXO pool to dust (pun intended): an approach selecting inputs per coin-age-priority (in effect similar to “largest first selection”) on average produced a UTXO pool with 15× the UTXO count as Bitcoin Core’s Knapsack-based Coin Selection then (in 2016). Therefore, I do not recommend running CoinGrinder under all circumstances, but only at extreme feerates or when we have another good reason to minimize the input set for other reasons. In the long-term, we should introduce additional metrics to score different input set candidates, e.g. on basis of their privacy and wallet health impact, to pick from all our coin selection results, but until then, we may want to limit use of CoinGrinder in other ways.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jun 13, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK achow101, sipa, sr-gi
Concept ACK jonatack, S3RK, kashifs, ishaanam

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28985 (Avoid changeless input sets when SFFO is active by murchandamus)
  • #28977 (Add Gutter Guard Selector by murchandamus)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@DrahtBot DrahtBot changed the title Add CoinGrinder coin selection algorithm wallet: Add CoinGrinder coin selection algorithm Jun 14, 2023
@murchandamus
Copy link
Contributor Author

murchandamus commented Jun 14, 2023

At this point, I’m mostly looking for Concept reviews.

Some ideas floating around in the context that I’d love comments on:

  • Also run CoinGrinder when the fees surpass some relative portion of the recipient output amounts
  • Only run CoinGrinder if no changeless solution was proposed

@t-bast
Copy link
Contributor

t-bast commented Jun 15, 2023

Thanks for exploring this, I think this is an interesting idea! I'll comment based on my specific usage of the wallet for lightning service providers, please bear in mind that this is a very specific use case that is quite different from regular users (see #24795 for early details about this).

My understanding of the current goals of coin selection are that it tries to find a good balance between minimizing fees and consolidating utxos. The latter actually helps the former: consolidating utxos is usually good for performance, and if you do it when fees are low, you will avoid paying more fees in the future when fees are higher because you'll be able to use less inputs in your future transactions.

In the case of a lightning service provider, we'd rather avoid consolidating utxos, and are ready to pay more fees to keep a large enough utxo set to satisfy our users' needs. The main reasons for that are that:

  • we are using 0-conf (in a setup where the user is trusting us to not double-spend)
  • we are using low fees for those transactions (it's ok to keep transactions unconfirmed until the mempool clears, this is how we can keep the cost low for users)
  • the change outputs from most of those transactions are unsafe because they may be double-spent by a commitment transaction (and thus generally unusable until the the transaction confirms, except when used for that same user)

That last point is a consequence of splicing, let me know if you want me to explain that point in more details.

If we run out of safe utxos, we cannot onboard new users, which may require us to transfer funds from a cold wallet while making sure we're creating many utxos, at a time where the on-chain fees may be high. Our ideal wallet would maintain a large pool of utxos of various sizes, to ensure we're always ready to onboard new users or sell liquidity to existing users.

I understand that this is the opposite of what most end users want and can't be achieved by the same algorithm: that's why I would find it very useful to have various coin selection algorithms in bitcoin core, that are tuned to different profiles/scenarios. I like that this algorithm feels like something we could use (at all feerates) to keep a wallet state that makes more sense for us.

@murchandamus
Copy link
Contributor Author

murchandamus commented Jun 15, 2023

Changes in last force-push: https://github.com/bitcoin/bitcoin/compare/10621c6c4499ba32ff3b7f0bd11b0351f36982b4..0446986f8dd87db29153618d380d159ec0016c1a

Thanks for your thoughts, @t-bast. Your comment inspired a small optimization.
CoinGrinder has been simplified by keeping only track of the input weight (instead of weight and waste), and minimizing input weight throughout. This reduced the number of parameters necessary to call CoinGrinder, simplified the input set traversal, and allows CoinGrinder to be used at all feerates instead of just high feerates. It is however still only called for feerates above 100 ṩ/vB from spend.cpp at this time.

@dergoegge
Copy link
Member

Ran the fuzzer for a bit:

fuzz: wallet/test/fuzz/coinselection.cpp:102: void wallet::coin_grinder_fuzz_target(FuzzBufferType): Assertion `result_knapsack->GetWeight() >= result_cg->GetWeight()' failed.

Input to reproduce:

$ echo "d3d3/wF383d3d3d3d3d3d3d3d3d5d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3d3fzd3d3d3d3d7BE
Wnd3d3d3d3V3d3d3d3d3d3l6d3d3d3d3d3f///8Fd3d3d3d3d3d3d3d3d1d3d3d3d3d3d3d3d3d3
d3d3d3d3d3d3d3d3d3d3d3d3Mnd3d3d3d3d5d3d3d3d3YS5hYWFhYWFhYWFhYWFhYWFhYWFhYUBh
YWEuAAAACmFhWWFhYWFhd3d3d3d3d3d3d3d3dzp3d3d3d3d3d3d3d3d3d3d3d/N3d3d3d3d3d3d3
d3d3eXd3d3d3d3d3d3d1d3d3d3d3d3d3dzJ3d3d3d3d3eXd3d3d3d3d3d3d1d3d3d3d3d3d3dzJ3
d3d3d3d3d3d3d3d3d3d3dyh3d3d3d3d3dxcC+xc=" | base64 -d > coingrinder_crash

@dergoegge
Copy link
Member

I would like to see if the CI catches the crash, now that #27919 has been merged. @xekyo would you mind rebasing?

@murchandamus
Copy link
Contributor Author

murchandamus commented Jun 21, 2023

@dergoegge: Thanks, I discovered the issue myself over the weekend and was exploring yesterday whether I could come up with another optimization. In the end, I fixed the fuzz test by only comparing the weight with SRD and Knapsack when CoinGrinder actually concluded searching the combination space exhaustively. Since finding the lowest weight input set is an NP-hard problem, it’s clearly expected that we can generate UTXO pools for which CoinGrinder will not be able to finish the search in 100k attempts, it was naïve to expect that the fuzz test wouldn’t eventually find such a case.

Your crash seed passes now.

I’m sorry I’m only seeing your comment after pushing the fix.

@maflcko
Copy link
Member

maflcko commented Jun 21, 2023

I think you can still rebase to unblock the CI in any case?

@murchandamus
Copy link
Contributor Author

Rebased on Master.

Changes since last rebase: SelectionResult now has a record if BnB or CoinGrinder run out of tries and thus have not found the optimal solution.

@@ -174,6 +174,10 @@ util::Result<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_pool
curr_selection_weight += utxo.m_weight;
}
}
if (curr_try >= TOTAL_TRIES - 1) {
// On last attempt and didn’t break due to full traversal: solution is non-optimal
result.SetAlgoCompleted(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In cf14a72: Here we're setting it for BnB but seems we're not using GetAlgoCompleted for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I’ll have follow up on this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped tracking whether the algorithm was able to exhaustively search the UTXO pool from BnB for now

When two successive UTXOs match in effective value and weight, we can
skip the second if the prior is not selected: adding it would create an
equivalent input set to a previously evaluated.

E.g. if we have three UTXOs with effective values {5, 3, 3} of the same
weight each, we want to evaluate
{5, _, _}, {5, 3, _}, {5, 3, 3}, {_, 3, _}, {_, 3, 3},
but skip {5, _, 3}, and {_, _, 3}, because the first 3 is not selected,
and we therefore do not need to evaluate the second 3 at the same
position in the input set.

If we reach the end of the branch, we must SHIFT the previously selected
UTXO group instead.
When two successive UTXOs differ in weight but match in effective value,
we can skip the second if the first is not selected, because all input
sets we can generate by swapping out a lighter UTXOs with a heavier UTXO
of matching effective value would be strictly worse.
In situations where we have UTXO groups of various weight, we can CUT
rather than SHIFT when we exceeded the max_weight or the best
selection’s weight while the last step was equal to the minimum weight
in the lookahead.
Initialize `best_selection_weight` as `max_weight` allows us to skip the
separate `max_weight` check on every loop.
Given a lot of small amount UTXOs it is possible that the lookahead
indicates sufficient funds, but any combination of them would push us
beyond the current best_weight.
We can estimate a lower bound for the minimal necessary weight to reach
target from the maximal amount and minimal weight in the tail of the
UTXO pool: if adding a number of hypothetical UTXOs of this maximum
amount and minimum weight would not be able to beat `best_weight`, we
can SHIFT to the omission branch, and CUT if the last selected UTXO is
not heavier than the minimum weight of the remainder.
Copy link
Contributor Author

@murchandamus murchandamus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed all comments by @sipa and @achow101. Thanks for the review!

src/wallet/test/coinselector_tests.cpp Show resolved Hide resolved
src/wallet/test/coinselector_tests.cpp Show resolved Hide resolved
src/wallet/coinselection.cpp Outdated Show resolved Hide resolved
src/wallet/test/coinselector_tests.cpp Outdated Show resolved Hide resolved
src/wallet/test/coinselector_tests.cpp Outdated Show resolved Hide resolved
src/wallet/coinselection.cpp Outdated Show resolved Hide resolved
src/wallet/test/fuzz/coinselection.cpp Show resolved Hide resolved
src/wallet/coinselection.cpp Outdated Show resolved Hide resolved
@achow101
Copy link
Member

achow101 commented Feb 9, 2024

ACK 13161ec

@DrahtBot DrahtBot requested review from ishaanam, sipa, kashifs and sr-gi and removed request for kashifs February 9, 2024 18:44
Copy link
Member

@sipa sipa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 13161ec

@DrahtBot DrahtBot requested review from kashifs and removed request for kashifs February 9, 2024 20:20
@sr-gi
Copy link
Member

sr-gi commented Feb 9, 2024

ACK 13161ec

@DrahtBot DrahtBot requested review from kashifs and removed request for sr-gi and kashifs February 9, 2024 21:25
@achow101 achow101 merged commit 1d334d8 into bitcoin:master Feb 9, 2024
16 checks passed
achow101 added a commit that referenced this pull request Feb 22, 2024
9dae3b9 [fuzz] Avoid partial negative result (Murch)

Pull request description:

  May address the problem reported by maflcko in #27877 (review).

  For some values, `MAX_MONEY - max_spendable - max_output_groups` could result in a partial negative value. By putting the addition of `group_pos.size()` first, all partial results in this line will be strictly positive.

  I opened this as a draft, since I was unable to reproduce the issue, so I’m waiting for confirmation whether this in fact mitigates the problem.

ACKs for top commit:
  maflcko:
    ACK 9dae3b9
  sipa:
    utACK 9dae3b9
  achow101:
    ACK 9dae3b9
  brunoerg:
    crACK 9dae3b9

Tree-SHA512: 744b4706268d8dfd77538b99492ecf3cf77d229095f9bcd416a412131336830e2f134f2b2846c79abd3d193426f97c1f71eeaf68b16ab00e76318d57ee3673c7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet