wallet: Remove redundant waste calculation #26061

yancyribbens · 2022-09-11T09:52:49Z

The SelectCoinsBnB algorithm computes the waste score while it's trying to find a match. Then, if it finds a match, it computes the waste score a second time here. To simplify the code (particularly since this is a complex section) its easier to digest and understand if it does one thing, to find a possible selection. I don't think having it calculate the waste score twice adds much and we can depend on the test harness to make sure the waste score is correct.

DrahtBot · 2022-09-12T16:59:19Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#25932 (refactor: Simplify backtrack logic by yancyribbens)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

S3RK · 2022-09-14T06:13:53Z

There are two uses for waste metric in coin selection:

BnB algorithm is designed to optimise for the waste. This is why keeping track of the current and best waste is required to make algorithm work
waste is used to compare solutions produced by different algorithms. This is why there is a generic waste calculation for the SelectionResult

Also, the two definitions of waste are coincidentally equal right now, but they don't have to. There is a possibility that we will use additional heuristics for 2) without changing 1).

S3RK · 2022-09-14T06:15:02Z

src/wallet/coinselection.cpp

        bool backtrack = false;
        if (curr_value + curr_available_value < selection_target || // Cannot possibly reach target with the amount remaining in the curr_available_value.
            curr_value > selection_target + cost_of_change || // Selected value is out of range, go back and try other branch
-            (curr_waste > best_waste && (utxo_pool.at(0).fee - utxo_pool.at(0).long_term_fee) > 0)) { // Don't select things which we know will be more wasteful if the waste is increasing


By removing this criteria for backtracking you significantly expand the search space that will lead a) to longer execution time b) overall less less solutions found due to hard cap of TOTAL_TRIES.

This doesn't change the search space. With this extra condition removed, this code path is used instead. And in the case where curr_waste > best_waste using the code path L98, nothing different happens because of this condition. Since best_waste is better than curr_waste, nothing happens and in both branches, backtrack=true is set. Running the benchmarks will show there's no difference.

@S3RK The search space is not changed. The code is now simpler, however, the change could be pulled from this PR and added to a new PR since it's independent of other improvements.

I agree that without limitation to the number of iterations you'd get the same result with your algorithm. But we actually have a hard limit of TOTAL_TRIES. That means if there are more combinations of coins than TOTAL_TRIES we're not going to explore all of them. This is why we want to exit as early as possible, in particular this condition allows to cut off branches where we didn't reach the target yet but the solution is already worse than our current best.

S3RK · 2022-09-14T06:16:53Z

src/wallet/coinselection.cpp

-            curr_waste += (curr_value - selection_target); // This is the excess value which is added to the waste for the below comparison
+
+            // This is the excess waste for the current iteration.
+            curr_waste = (curr_value - selection_target);


This is just the excess not the waste. Check line42 in this file or GetSelectionWaste for the waste formula.

@S3RK shouldn't the variable be called excess_val or something instead of curr_waste then? To me either would make sense although as the variable is named right now, it doesn't seem to be called excess.

No, waste Contains excess but there are other components to it

yancyribbens · 2022-09-14T15:59:46Z

@S3RK thanks for taking a look and your comment. I understand the idea to formalize a scoring system so we can have a metric that's algorithm independent. However, the boolean check done here is all that's needed to ensure we minimize our UTXO set when fees are expensive:

(utxo_pool.at(0).fee - utxo_pool.at(0).long_term_fee) > 0

Which as I pointed out in a previous PR is a more confusing way of saying:
utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee

I guess we could leave the code as is where it independently computes the waste, although as you can see by this PR, it's not actually needed. IE if you notice, all tests still pass even though it no longer tracks the "waste score". Instead I suppose we get rid of the other waste calulation.

Also, the two definitions of waste are coincidentally equal right now, but they don't have to. There is a possibility that we will use additional heuristics for 2) without changing 1).

I think what you're saying is we don't acctually need this assert then to check for equality?

S3RK · 2022-09-14T18:51:10Z

However, the boolean check done here is all that's needed to ensure we minimize our UTXO set when fees are expensive:

I addressed the question about the Boolean check in the inline thread. Here however, I want to warn against mis-representing the BnB algo. BnB doesn't just minimize UTXO set when fees are high, it optimizes for the waste metric that consists of two parts 1) excess 2) opportunity cost of spending inputs at long_term_fee_rate

I guess we could leave the code as is where it independently computes the waste, although as you can see by this PR, it's not actually needed. IE if you notice, all tests still pass even though it no longer tracks the "waste score".

If I remember the tests correctly, then this rather shows deficiency in tests rather than the correctness of implemented approach. 1) There is only one test where we hit TOTAL_TRIES limit and in this test we just verify there is no solution found 2) There are no tests asserting waste that has both components excess and input opportunity costs

I'd recommend you contribute these tests first before modifying the algorithm.

Instead I suppose we get rid of the other waste calulation.

The other waste calculation is used to set waste score for the BnB SelectionResult for the purpose of comparing it to other solution. See use-case 2 from my previous comment.

I think what you're saying is we don't acctually need this assert then to check for equality?

While I'd say we don't really need this assert, others might disagree and there's no harm in it.

achow101 · 2022-09-14T19:34:55Z

NACK

@S3RK is correct and this change would be detrimental. Waste in BnB is used as an optimizing parameter - it allows us to prune branches when searching. Removing it means that we are searching more branches (which impacts the iteration limit), and the change which makes it only the excess means that we may be pruning branches that we were not pruning previously, in addition to not pruning many branches that we would have.

murchandamus · 2022-09-15T19:44:29Z

Yeah, @S3RK and @achow101 are correct. In a high-feerate situation (i.e. feerate > long_term_feerate_estimate) adding more inputs will only increase the waste. So if a partial input set already has a higher waste than the current best solution we cannot find a better solution (one with a lower waste) by adding more inputs. Thus, if the waste of a partial input set is higher than the best waste, we can skip the entire subtree from the search space. I believe this backtrack clause to be a major speed-up, because otherwise evaluating all possible subsets of the cut subtree would take up to exponential work in the size of the subtree.

I think that you were onto this optimization though:
#26104

DrahtBot added the Wallet label Sep 11, 2022

yancyribbens force-pushed the remove-redundant-waste-calculation branch 3 times, most recently from eeb4b5e to d785893 Compare September 12, 2022 08:57

wallet: Remove redundant waste calculation

d785893

glozow requested a review from murchandamus September 12, 2022 13:32

DrahtBot mentioned this pull request Sep 13, 2022

refactor: Simplify backtrack logic #25932

Closed

S3RK reviewed Sep 14, 2022

View reviewed changes

achow101 closed this Sep 14, 2022

bitcoin locked and limited conversation to collaborators Sep 15, 2023

wallet: Remove redundant waste calculation #26061

wallet: Remove redundant waste calculation #26061

Uh oh!

Conversation

yancyribbens commented Sep 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DrahtBot commented Sep 12, 2022

Conflicts

Uh oh!

S3RK commented Sep 14, 2022

Uh oh!

S3RK Sep 14, 2022

Choose a reason for hiding this comment

Uh oh!

yancyribbens Sep 14, 2022

Choose a reason for hiding this comment

Uh oh!

yancyribbens Sep 14, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

S3RK Sep 14, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

S3RK Sep 14, 2022

Choose a reason for hiding this comment

Uh oh!

yancyribbens Sep 14, 2022

Choose a reason for hiding this comment

Uh oh!

S3RK Sep 14, 2022

Choose a reason for hiding this comment

Uh oh!

yancyribbens commented Sep 14, 2022

Uh oh!

S3RK commented Sep 14, 2022

Uh oh!

achow101 commented Sep 14, 2022

Uh oh!

murchandamus commented Sep 15, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

yancyribbens commented Sep 11, 2022 •

edited

Loading

yancyribbens Sep 14, 2022 •

edited

Loading

S3RK Sep 14, 2022 •

edited

Loading