Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix waste calculation in SelectionResult #28366

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

murchandamus
Copy link
Contributor

@murchandamus murchandamus commented Aug 29, 2023

PR #26152 moved waste calculation into SelectionResult to be able to correct the waste score on basis of the bump_fee_group_discount for overlapping ancestries. This left two functions with largely overlapping purpose, where one was simply a wrapper of the other. This PR cleans up the overlap, and fixes the double-meaning of change_cost where the GetChange() function assumed that no change was created when change_cost was set to 0. This behavior was exploited in a bunch of tests, but is problematic, because a change_cost of 0 is permitted with custom settings for feerate and discard_feerate (i.e. when they’re both 0).

@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 29, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK ismaelsadeeq
Stale ACK achow101, S3RK

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #30080 (wallet: add coin selection parameter add_excess_to_recipient_position for changeless txs with excess that would be added to fees by remyers)
  • #29523 (Wallet: Add max_tx_weight to transaction funding options (take 2) by ismaelsadeeq)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@murchandamus
Copy link
Contributor Author

Rebased to address conflict

@murchandamus
Copy link
Contributor Author

Rebased on latest #26152

@S3RK
Copy link
Contributor

S3RK commented Sep 14, 2023

Approach ACK

Commit message in 0a31864 says that GetChange() relied to special meaning of change_cost == 0, but actually it was GetSelectionWaste() function which was in fault.

@murchandamus murchandamus marked this pull request as ready for review September 14, 2023 20:25
@murchandamus
Copy link
Contributor Author

Approach ACK

Commit message in 0a31864 says that GetChange() relied to special meaning of change_cost == 0, but actually it was GetSelectionWaste() function which was in fault.

Thanks, I updated the commit message and rebased on master. Ready for review.

@aureleoules
Copy link
Member

The CoinSelection benchmarks fails when I run it with -min-time=1000

./src/bench/bench_bitcoin -filter=CoinSelection -min-time=1000
bench_bitcoin: bench/coin_selection.cpp:84: CoinSelection(ankerl::nanobench::Bench&)::<lambda()>: Assertion `result->GetSelectedValue() == 1003 * COIN' failed.
Aborted

@murchandamus murchandamus marked this pull request as draft January 16, 2024 19:53
@murchandamus
Copy link
Contributor Author

murchandamus commented Jan 16, 2024

Fixed the coin_selection benchmark, thanks @aureleoules.

@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/20547490347

@maflcko
Copy link
Member

maflcko commented Apr 5, 2024

Are you still working on this?

@murchandamus murchandamus force-pushed the 2023-08-fix-waste-calculation branch from 6a8dfa5 to 5aed48a Compare April 5, 2024 16:42
@murchandamus
Copy link
Contributor Author

murchandamus commented Apr 5, 2024

Are you still working on this?

Thanks, rebased and fixed the call from adding CoinGrinder that was using the removed function.

Copy link
Member

@achow101 achow101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK e95b915

} else {
m_waste = GetSelectionWaste(0, m_target, m_use_effective);
}
bool makes_change = (0 != GetChange(min_viable_change, change_fee));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 44db79b "Refactor ComputeAndSetWaste()"

nit: Unnecessary parentheses, also usually the check of the value is after the function. Could also use brace initialization.

Suggested change
bool makes_change = (0 != GetChange(min_viable_change, change_fee));
bool makes_change{GetChange(min_viable_change, change_fee) != 0};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adopted the proposed change, but those lines get removed in the next commit anyway.

@DrahtBot DrahtBot requested a review from S3RK April 24, 2024 18:00
Copy link
Member

@ismaelsadeeq ismaelsadeeq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

This is a nice cleanup 👍🏾

Comment on lines 382 to 384
* @param[in] change_cost The cost of creating change and spending it in the future.
* Only used if there is change, in which case it must be positive.
* Must be 0 if there is no change.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the commit message in e95b915 you mention we can have change_cost of 0 and also a change output, so this is no longer the case?

Suggested change
* @param[in] change_cost The cost of creating change and spending it in the future.
* Only used if there is change, in which case it must be positive.
* Must be 0 if there is no change.
* @param[in] change_cost The cost of creating change and spending it in the future.
* Only used if there is change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the "must be 0 if there is no change" is indeed outdated. I removed it.

/*tx_noinputs_size=*/ 0,
/*avoid_partial=*/ false,
};
auto group = wallet::GroupOutputs(wallet, available_coins, coin_selection_params, {{filter_standard}})[filter_standard];
bench.run([&] {
auto result = AttemptSelection(wallet.chain(), 1003 * COIN, group, coin_selection_params, /*allow_mixed_output_types=*/true);
auto result = AttemptSelection(wallet.chain(), 1002.99 * COIN, group, coin_selection_params, /*allow_mixed_output_types=*/true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you reducing 0.1 here?

And above why are effective_feerate, long_term_feerate, and discard_feerate now having values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question.

It has been a while since I made this change, but if I remember correctly, my fixing the duplicate meaning of change_cost broke this bench test due to the following:

Before my change, we would set change_cost to 0 as a signal that a transaction does not require a change output. This would happen when a coin selection algorithm found a changeless solution and after creating the recipient outputs there would not be sufficient funds to make a viable change output. However, change_cost will also be computed to be zero when effective_feerate, long_term_feerate, and discard_feerate are zero.

IIRC, this lead to some unintuitive behavior in coin selection:

  1. One coin selection algorithm may propose a changeless input set with the two UTXOs 1000+3 BTC. Its waste score would be calculated to be 0, as both the effective_feerate and the long_term_feerate are set to 0 and there is no excess.
  2. Another coin selection algorithm might propose a solution that should lead to a change output, e.g. two UTXOs 1000+1000 BTC. Because change_cost was computed to be 0, the transaction building would not create a change output, drop the excess 997 BTC to fees, and therefore assess the waste score to be 997 BTC of waste.

The changeless solution would be preferred, because of its waste score, and the bench would always succeed in finding a two input solution that uses exactly 1000+3 BTC in the inputs.

After my rewrite, change_cost only contains the computed value and is no longer used as a signal on whether we have sufficient budget to create a change output.

  1. As before, the changeless solution would have a waste score of 0, as both the effective_feerate and the long_term_feerate are set to 0 and there is no excess.
  2. However, the solution that creates change due to picking two UTXOs with 1000+1000 BTC, would now recognize that it has a ton of money left and that it should create a change output. Instead of dropping the excess 997 BTC to fees, it would create a change output of 997 BTC. Since the effective_feerate and discard_feerate is 0, the change_cost is 0. Because the effective_feerate and long_term_feerate are 0, the waste score on the inputs is also 0. Since effective_feerate is 0, the fees for transaction header and recipient outputs are also 0. Therefore, the transaction with a change output now also gets a waste score of 0, correctly.

All of the proposed input sets get presented to ChooseSelectionResult(…) which picks the lowest waste score, but all solutions are tied with the same waste score of zero. The tie-breaker is to prefer the input set with more inputs, but all solutions have two inputs and are tied on that as well. Finally, it picks the first element from the result vector. The bench test expects that 1003 BTC get selected, but when it selects 2000 BTC, the bench fails.

Therefore, I fixed the bench to not depend on using feerates that are do not appear in standard transactions and lead to quirky coin selection behavior, which required that I reduce the target to leave a money for fees. I would argue that we could probably do much better for a bench that tests coin selection in general.

Copy link
Contributor

@S3RK S3RK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK e95b915

I'm not sure whether 44db79b is useful since it's fully replaced by the following commit.

Also added a bunch of nits in the tests about comments and potential simplifications, but nothing blocking.


// The following tests that the waste is calculated correctly in various scenarios.
// ComputeAndSetWaste will first determine the size of the change output. We don't really
// RecalculateWaste will first determine the size of the change output. We don't really
// care about the change and just want to use the variant that always includes the change_cost,
// so min_viable_change and change_fee are set to 0 to ensure that.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment needs to be adjusted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Thanks, I adjusted the comment.

BOOST_CHECK_EQUAL(excess, selection.GetWaste());
}

{
// No Waste when fee == long_term_fee, no change, and no excess
// Waste is 0 when fee == long_term_fee, no change, and no excess
const CAmount exact_target{in_amt - fee * 2};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this could be pulled to the begging of the test next to other declarations and then reused in the tests above (and below as well)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced an exact_target variable above this test block and used it everywhere it applies

BOOST_CHECK_EQUAL(0, selection.GetWaste());
}

{
// Negative waste when the long term fee is greater than the current fee and the selected value == target
const CAmount exact_target{3 * COIN - 2 * fee};
const CAmount exact_target{in_amt - 2 * fee};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: same variable is already defined within another local scope above, could be reused. See another comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used exact_target as proposed

@@ -993,7 +994,7 @@ BOOST_AUTO_TEST_CASE(waste_test)
const CAmount target_waste2{-2 * large_fee_diff + change_cost}; // = (2 * fee) - (2 * (fee + large_fee_diff)) + change_cost
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: do you want to add assert for the comment above? It's not obvious that the statement holds given variable declarations are hundreds of lines above.

the comment:
"Negative waste when the long term fee is greater than the current fee and change_cost < - (inputs * (fee - long_term_fee))"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a calculation comment to illustrate the situation:

        // = (2 * fee) - (2 * (fee + large_fee_diff)) + change_cost
        // = (2 * 100) - (2 * (100 + 90)) + 125
        // = 200 - 380 + 125 = -55

and asserted that my calculation is correct.

// No Waste when (fee - long_term_fee) == (-excess), no change cost
const CAmount new_target{in_amt - fee * 2 - fee_diff * 2};
// Waste is 0 when (fee - long_term_fee) == (-excess), no change cost
const CAmount new_target{in_amt - fee * 2 - /*excess=*/fee_diff * 2};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could use exact_target - excess if you introduce exact_target var

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulled exact_target out as another variable

Copy link
Contributor Author

@murchandamus murchandamus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @achow101, @S3RK, and @ismaelsadeeq, I hope I addressed all of your comments satisfactorily.

} else {
m_waste = GetSelectionWaste(0, m_target, m_use_effective);
}
bool makes_change = (0 != GetChange(min_viable_change, change_fee));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adopted the proposed change, but those lines get removed in the next commit anyway.

/*tx_noinputs_size=*/ 0,
/*avoid_partial=*/ false,
};
auto group = wallet::GroupOutputs(wallet, available_coins, coin_selection_params, {{filter_standard}})[filter_standard];
bench.run([&] {
auto result = AttemptSelection(wallet.chain(), 1003 * COIN, group, coin_selection_params, /*allow_mixed_output_types=*/true);
auto result = AttemptSelection(wallet.chain(), 1002.99 * COIN, group, coin_selection_params, /*allow_mixed_output_types=*/true);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question.

It has been a while since I made this change, but if I remember correctly, my fixing the duplicate meaning of change_cost broke this bench test due to the following:

Before my change, we would set change_cost to 0 as a signal that a transaction does not require a change output. This would happen when a coin selection algorithm found a changeless solution and after creating the recipient outputs there would not be sufficient funds to make a viable change output. However, change_cost will also be computed to be zero when effective_feerate, long_term_feerate, and discard_feerate are zero.

IIRC, this lead to some unintuitive behavior in coin selection:

  1. One coin selection algorithm may propose a changeless input set with the two UTXOs 1000+3 BTC. Its waste score would be calculated to be 0, as both the effective_feerate and the long_term_feerate are set to 0 and there is no excess.
  2. Another coin selection algorithm might propose a solution that should lead to a change output, e.g. two UTXOs 1000+1000 BTC. Because change_cost was computed to be 0, the transaction building would not create a change output, drop the excess 997 BTC to fees, and therefore assess the waste score to be 997 BTC of waste.

The changeless solution would be preferred, because of its waste score, and the bench would always succeed in finding a two input solution that uses exactly 1000+3 BTC in the inputs.

After my rewrite, change_cost only contains the computed value and is no longer used as a signal on whether we have sufficient budget to create a change output.

  1. As before, the changeless solution would have a waste score of 0, as both the effective_feerate and the long_term_feerate are set to 0 and there is no excess.
  2. However, the solution that creates change due to picking two UTXOs with 1000+1000 BTC, would now recognize that it has a ton of money left and that it should create a change output. Instead of dropping the excess 997 BTC to fees, it would create a change output of 997 BTC. Since the effective_feerate and discard_feerate is 0, the change_cost is 0. Because the effective_feerate and long_term_feerate are 0, the waste score on the inputs is also 0. Since effective_feerate is 0, the fees for transaction header and recipient outputs are also 0. Therefore, the transaction with a change output now also gets a waste score of 0, correctly.

All of the proposed input sets get presented to ChooseSelectionResult(…) which picks the lowest waste score, but all solutions are tied with the same waste score of zero. The tie-breaker is to prefer the input set with more inputs, but all solutions have two inputs and are tied on that as well. Finally, it picks the first element from the result vector. The bench test expects that 1003 BTC get selected, but when it selects 2000 BTC, the bench fails.

Therefore, I fixed the bench to not depend on using feerates that are do not appear in standard transactions and lead to quirky coin selection behavior, which required that I reduce the target to leave a money for fees. I would argue that we could probably do much better for a bench that tests coin selection in general.

Comment on lines 382 to 384
* @param[in] change_cost The cost of creating change and spending it in the future.
* Only used if there is change, in which case it must be positive.
* Must be 0 if there is no change.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the "must be 0 if there is no change" is indeed outdated. I removed it.


// The following tests that the waste is calculated correctly in various scenarios.
// ComputeAndSetWaste will first determine the size of the change output. We don't really
// RecalculateWaste will first determine the size of the change output. We don't really
// care about the change and just want to use the variant that always includes the change_cost,
// so min_viable_change and change_fee are set to 0 to ensure that.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Thanks, I adjusted the comment.

// No Waste when (fee - long_term_fee) == (-excess), no change cost
const CAmount new_target{in_amt - fee * 2 - fee_diff * 2};
// Waste is 0 when (fee - long_term_fee) == (-excess), no change cost
const CAmount new_target{in_amt - fee * 2 - /*excess=*/fee_diff * 2};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulled exact_target out as another variable

BOOST_CHECK_EQUAL(0, selection.GetWaste());
}

{
// Negative waste when the long term fee is greater than the current fee and the selected value == target
const CAmount exact_target{3 * COIN - 2 * fee};
const CAmount exact_target{in_amt - 2 * fee};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used exact_target as proposed

@@ -993,7 +994,7 @@ BOOST_AUTO_TEST_CASE(waste_test)
const CAmount target_waste2{-2 * large_fee_diff + change_cost}; // = (2 * fee) - (2 * (fee + large_fee_diff)) + change_cost
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a calculation comment to illustrate the situation:

        // = (2 * fee) - (2 * (fee + large_fee_diff)) + change_cost
        // = (2 * 100) - (2 * (100 + 90)) + 125
        // = 200 - 380 + 125 = -55

and asserted that my calculation is correct.

BOOST_CHECK_EQUAL(excess, selection.GetWaste());
}

{
// No Waste when fee == long_term_fee, no change, and no excess
// Waste is 0 when fee == long_term_fee, no change, and no excess
const CAmount exact_target{in_amt - fee * 2};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced an exact_target variable above this test block and used it everywhere it applies

@murchandamus murchandamus force-pushed the 2023-08-fix-waste-calculation branch from e95b915 to 620cc6e Compare May 24, 2024 18:48
Both `GetSelectionWaste()` and `ComputeAndSetWaste()` now are part of
`SelectionResult`. Instead of `ComputeAndSetWaste()` being a wrapper for
`GetSelectionWaste()`, we combine them to a new function
`RecalculateWaste()`.

As I was combining the logic of the two functions, I noticed that
`GetSelectionWaste()` was making the odd assumption that the
`change_cost` being set to zero means that no change is created.
However, if we build transactions at a feerate of zero with the
`discard_feerate` also set to zero, we'd organically have a
`change_cost` of zero, even when we create change on a transaction.

This commit cleans up this duplicate meaning of `change_cost` and relies
on `GetChange()` to figure out whether there is change on basis of the
`min_viable_change` and whatever is left after deducting fees.

Since this broke a bunch of tests that relied on the double-meaning of
`change_cost` a bunch of tests had to be fixed.
@murchandamus murchandamus force-pushed the 2023-08-fix-waste-calculation branch from 620cc6e to 2c18b86 Compare May 24, 2024 18:54
@murchandamus
Copy link
Contributor Author

I'm not sure whether 44db79b is useful since it's fully replaced by the following commit.

Ah oops, yeah I guess the first commit was a vestige of my first attempt at this. I squashed it into the second commit.

Diff of before reviews to latest: https://github.com/bitcoin/bitcoin/compare/e95b9159380f2de7f9a6e7a202cc171ad285ee6c..2c18b86

@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/25393161083

@S3RK
Copy link
Contributor

S3RK commented May 27, 2024

Happy to reack, but you need to fix clang-tidy first

wallet/test/coinselector_tests.cpp:893:43: error: argument name 'current_fee' in comment does not match parameter name 'fee' [bugprone-argument-comment,-warnings-as-errors]
  893 |         add_coin(1 * COIN, 1, selection1, /*current_fee=*/fee, /*long_term_fee=*/fee - fee_diff);
      |                                           ^
wallet/test/coinselector_tests.cpp:61:90: note: 'fee' declared here
   61 | static void add_coin(const CAmount& nValue, int nInput, SelectionResult& result, CAmount fee, CAmount long_term_fee)
      | 

@murchandamus murchandamus force-pushed the 2023-08-fix-waste-calculation branch from 2c18b86 to bd34dd8 Compare May 28, 2024 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants