Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coin Selection with Murch's algorithm #10637

Merged
merged 13 commits into from Mar 14, 2018

Conversation

@achow101
Copy link
Member

achow101 commented Jun 20, 2017

This is an implementation of the Branch and Bound coin selection algorithm written by Murch (@Xekyo). I have it set so this algorithm will run first and if it fails, it will fall back to the current coin selection algorithm. The coin selection algorithms and tests have been refactored to separate files instead of having them all in wallet.cpp.

I have added some tests for the new algorithm and a test for all of coin selection in general. However, more tests may be needed, but I will need help with coming up with more test cases.

This PR uses some code borrowed from #10360 to use effective values when selecting coins.

@achow101 achow101 force-pushed the achow101:bnb-coin-select branch to 0246f1f Jun 20, 2017

@Xekyo
Copy link
Contributor

Xekyo left a comment

Concept ACK, looks good to me. AFAICT, there's an issue with the lookahead that should be addressed still.

// Calculate remaining
CAmount remaining = 0;
for (CInputCoin utxo : utxo_pool) {
remaining += utxo.txout.nValue;

This comment has been minimized.

Copy link
@Xekyo

Xekyo Jun 20, 2017

Contributor

Have you filtered utxo_pool to exclude utxo's that have a net-neg value? Otherwise you're underestimating the lookahead here. To get an accurate figure for what you may still collect downtree, you should only add utxo.txout.nValue >=0

This comment has been minimized.

Copy link
@instagibbs

instagibbs Jun 20, 2017

Member

@gmaxwell has concerns that Core wallet is only doing semi-sane utxo handling by spending these. With exact match + sane backoff algorithm this concern may be alleviated?

This comment has been minimized.

Copy link
@achow101

achow101 Jun 20, 2017

Author Member

Indeed, that may be a problem. I will add that in as it is still good to have additional checks here even if done elsewhere.

This comment has been minimized.

Copy link
@gmaxwell

gmaxwell Jun 20, 2017

Member

I don't have much of a concern here about the 0/negative effective value inputs: Failing to select negative effective value inputs for an exact match won't lead to a UTXO count inflation, because changeless transactions are by definition strictly UTXO reducing.

This comment has been minimized.

Copy link
@Xekyo

Xekyo Jun 20, 2017

Contributor

@instagibbs: I'm not completely opposed to spending net-negative UTXO, my concern here is primarily that it actually may cause the lookahead to be underestimated causing valid solutions not to be found.

I realize now that the knapsack algorithm would also not select uneconomic UTXO anymore, as if it had selected enough value before it reached them it would have already returned the set, and if it actually starts exploring them, cannot add more value in the first place.

Advocatus Diaboli: Would it be that terrible though, if UTXO were only considered when they actually have a net positive value? During times of low fees, they'd be used both during BnB and knapsack, during times of high fees, they wouldn't bloat the blocks and lose their owner money.

This comment has been minimized.

Copy link
@instagibbs

instagibbs Jun 20, 2017

Member

I am not so concerned, was making sure concerns are brought up.

This comment has been minimized.

Copy link
@gmaxwell

gmaxwell Jun 22, 2017

Member

@Xekyo we should assume that it would be terrible unless someone can show that it will not cause another massive UTXO bloat event... but thats offtopic here, as I don't think anyone has this concern with exact matches.

This comment has been minimized.

Copy link
@karel-3d

karel-3d Jul 1, 2017

Contributor

The utxos with negative effective values are filtered anyway in wallet/wallet.cpp, which is the only place (except for tests) from where SelectCoinsBnB is called.


// Select 5 Cent
add_coin(4 * CENT, 4, actual_selection);
add_coin(1 * CENT, 1, actual_selection);

This comment has been minimized.

Copy link
@Xekyo

Xekyo Jun 20, 2017

Contributor

AFAICT utxo_pool has : 4, 3, 2, & 1. Since you're exploring randomly selecting 5 then has two possible solutions: 4+1, 3+2.

This comment has been minimized.

Copy link
@achow101

achow101 Jun 20, 2017

Author Member

It is forced to be include first in these tests so the solution is deterministic.

add_coin(5 * CENT, 5, utxo_pool);
add_coin(5 * CENT, 5, actual_selection);
add_coin(4 * CENT, 4, actual_selection);
add_coin(1 * CENT, 1, actual_selection);

This comment has been minimized.

Copy link
@Xekyo

Xekyo Jun 20, 2017

Contributor

Under above assumptions, there is two solutions here as well: 5+4+1, or 5+3+2.

This comment has been minimized.

Copy link
@achow101

achow101 Jun 20, 2017

Author Member

It is forced to be include first in these tests so the solution is deterministic.

LogPrint(BCLog::SELECTCOINS, "total %s\n", FormatMoney(nBest));
}
CInputCoin coin(pcoin, i);
coin.txout.nValue -= (output.nInputBytes < 0 ? 0 : effective_fee.GetFee(output.nInputBytes));

This comment has been minimized.

Copy link
@Xekyo

Xekyo Jun 20, 2017

Contributor

It seems to me that you're also collecting coins that have a net-negative here. This will cause your lookahead to be underestimated, unless you cater to that case when calculating the remainder.

CoinSet actual_selection;
CAmount value_ret = 0;

/////////////////////////

This comment has been minimized.

Copy link
@Xekyo

Xekyo Jun 20, 2017

Contributor

I would perhaps add a test that checks what happens if the utxo_pool includes a UTXO that is more costly to spend than its own value. As far as I can tell, this would currently reduce your lookahead and may cause a premature search failure.

@Xekyo

This comment has been minimized.

Copy link
Contributor

Xekyo commented Jun 20, 2017

Have you tested the effect of random exploration vs largest first exploration?

  • Either way, BranchAndBound already guarantees that the global utxo set doesn't grow (for one output transactions) due to saving the change output.
  • LFE guarantees the creation of a minimal input set, and purposefully finds a possible solution. This should minimize the input set size variance. In my simulations BranchAndBound with LFE already caused a smaller average UTXO footprint than legacy Core selection.
  • Random Exploration could find a larger input set by skipping a key UTXO higher up in the tree. This could lead to the selection of a larger number of inputs, or in an edge case could even cause tries to be exhausted before a solution is found. This may increase input set variance, or could perhaps even exhaust small UTXOs too quickly for BnB to often find a viable solution.

I am not sure there is a significant privacy benefit for Random Exploration as for either selection method an attacker would already need to know about another eligible input that would achieve an exact match when switched out for one of the input set.

What benefit do you expect from using Random Exploration?

@achow101

This comment has been minimized.

Copy link
Member Author

achow101 commented Jun 20, 2017

@Xekyo I was thinking that Random Exploration would be better for privacy but I see that it probably wouldn't help. If you think it would be better to change to LFE, I can certainly do that.

@Xekyo

This comment has been minimized.

Copy link
Contributor

Xekyo commented Jun 20, 2017

@achow101: I don't know how strong the effect is, but I'd expect Random Exploration to increase the required computational effort.

@instagibbs

This comment has been minimized.

Copy link
Member

instagibbs commented Jun 20, 2017

Noting that this PR has fairly heavy overlap with #10360 .

From chatting with @achow101 the intention of this PR is to touch as little as possible while still getting BranchNBound coin selection.

To make this successful it should really only be run on first iteration of the loop in CreateTransaction, when nFeeRet == 0 and only use effective value for the BnB coin selection step, rather than the knapsack as well. Once nFeeRet becomes more than zero, interactions start to get strange without a more complete overhaul like #10360.

@instagibbs

This comment has been minimized.

Copy link
Member

instagibbs commented Jun 20, 2017

This PR I believe will still create just-over-dust change outputs when BnB finds an exact match. Whenever we are allowing BnB matches(first iteration) we should not make change outputs less than the exact match slack value.

// Calculate cost of change
// TODO: In the future, we should use the change output actually made for the transaction and calculate the cost
// requred to spend it.
CAmount cost_of_change = effective_fee.GetFee(148+34); // 148 bytes for the input, 34 bytes for making the output

This comment has been minimized.

Copy link
@Xekyo

Xekyo Jun 20, 2017

Contributor

This assumes that the input will be spent at a feerate at least as high as the current. This was a valid assumption in my thesis, as I was using a fixed fee rate. I'm not sure whether this a valid assumption for realnet transaction selection, as we've literally seen fees between 8-540 sat/byte in the past two weeks. We might want to consider discounting the cost of the input slightly.

This comment has been minimized.

Copy link
@instagibbs

instagibbs Jun 20, 2017

Member

Depends on user time preferences. Could be an option that is set for those who regularly consolidate.

This comment has been minimized.

Copy link
@achow101

achow101 Jun 20, 2017

Author Member

For now I think it is fine to use the current feerate.

@Xekyo

This comment has been minimized.

Copy link
Contributor

Xekyo commented Jun 20, 2017

@instagibbs: In fact, BnB is designed to only work when creating a transaction without a change output. If we were creating a change in the first place, the extensive search pattern would be unnecessarily wasteful.

@instagibbs

This comment has been minimized.

Copy link
Member

instagibbs commented Jun 20, 2017

To append onto my previous comments, any effective value match attempt should account for the fees just obtained by SelectCoins. Currently it completely ignores the newly-obtained fees, keeping the previous loop's value, and then asks if nFeeRet >= nFeeRequired to break from the loop(which currently is 0 on the first go-around).

@fanquake fanquake added the Wallet label Jun 20, 2017

@achow101

This comment has been minimized.

Copy link
Member Author

achow101 commented Jun 21, 2017

I have made the BnB selector to be only run on the first pass of the coin selection loop. It is now set so that effective value is only used for the BnB selector and not the knapsack one. I have also added the negative effective value check and test just as a belt-and-suspenders thing. I also made BnB use Largest First Exploration instead of Random Exploration.

backtrack = true;
} else if (value_ret >= target_value) { // Selected value is within range
done = true;
} else if (tries <= 0) { // Too many tries, exit

This comment has been minimized.

Copy link
@Xekyo

Xekyo Jun 21, 2017

Contributor

Here's a unexpected behavior in my algorithm: if there is a number of input combinations whose value_ret all exceed the target_value when tries == 0 is passed, tries can go into the negative.

The tries check should be moved to the top of the checks.

This comment has been minimized.

Copy link
@achow101

achow101 Jun 21, 2017

Author Member

Done

@sipa

This comment has been minimized.

Copy link
Member

sipa commented Jun 21, 2017

Perhaps generically, we should never create change if the amount is less than the cost of creating + spending it (regardless of whether BnB was used to find the inputs or not)?

@instagibbs

This comment has been minimized.

Copy link
Member

instagibbs commented Jun 22, 2017

@sipa one question is if we should allow the wallet to consider consolidation-level prices for that change. Perhaps the user is in a hurry now, but would consider spending that change at a much slower pace.

Maybe for a first pass only consider the selected feerate, then Future Work allow a parameter which has more aggressive change protection given longer timescales.

@sipa

This comment has been minimized.

Copy link
Member

sipa commented Jun 22, 2017

@instagibbs Yes, I agree; we should use long-eatimates for the spend part of change rather than the actual feerate the user is willing to pay now. Perhaps we can make it more conservative without doing that by using a factor 2 or 3 reduction?

// Calculate cost of change
// TODO: In the future, we should use the change output actually made for the transaction and calculate the cost
// requred to spend it.
CAmount cost_of_change = effective_fee.GetFee(148+34); // 148 bytes for the input, 34 bytes for making the output

This comment has been minimized.

Copy link
@gmaxwell

gmaxwell Jun 22, 2017

Member

not correct for segwit. If this code ends up being changed to follow pieter's suggestion of dividing the rate by two or three it should be bounded by the min relay fee. (I'm not super fond of that suggestion).

@gmaxwell

This comment has been minimized.

Copy link
Member

gmaxwell commented Jun 22, 2017

@sipa @achow101 it would be very very easy in the current PR to ask for another estimate for the change, I think ~two loc addition, and minor addition to the selectcoins arguments to pass down a second fee. I think this would be much more desirable than a fixed division. Future work could do things like make that second confirmation target configurable.

@@ -2562,7 +2562,7 @@ bool CWallet::CreateTransaction(const std::vector<CRecipient>& vecSend, CWalletT
}

const CAmount nChange = nValueIn - nValueToSelect;
if (nChange > 0)
if (nChange > 0 && (!first_pass || nFeeRet == 0)) // nFeeRet is only 0 on the first pass if BnB was not used.

This comment has been minimized.

Copy link
@gmaxwell

gmaxwell Jun 22, 2017

Member

Using nFeeRet to signal BNB usage is ugly. I think you shouldn't pass in nFeeRet at all, but have some explicit signal (e.g. boolean return) for BNB usage and if its set; after select coins set nFeeRet to nChange and use the same signal to bypass this branch.

I also think this condition is slightly incorrect but benign in the current code, lets say our configured feerate were zero: now BNB could find a solution and leave nFeeRet==0. (though nChange would currently be zero too, so it would be harmless but seems to me like the kind of thing to be brittle in future changes)

@achow101

This comment has been minimized.

Copy link
Member Author

achow101 commented Jun 23, 2017

Travis failure seems to be unrelated

@karel-3d

This comment has been minimized.

Copy link
Contributor

karel-3d commented Jul 2, 2017

just fyi, I have used your code as a reference for this code

bitcoinjs/coinselect#13

@karel-3d

This comment has been minimized.

Copy link
Contributor

karel-3d commented Jul 2, 2017

I have to say, I don't understand the target size; maybe there is a bug there.

In wallet.cpp, in CWallet::CreateTransaction, you create nValue, which seems to be the sum of all the outputs. Because the BnB is used only at the first pass, nFeeRet is 0 and nValueToSelect is just the sum of all the outputs.

This is then used as the exact target in the BnB algorithm.

However, you should add the cost of the outputs + the small cost of the tx overhead into the target (done here for the simple case on 1 output - https://github.com/Xekyo/CoinSelectionSimulator/blob/master/src/main/scala/one/murch/bitcoin/coinselection/StackEfficientTailRecursiveBnB.scala#L28 )

Maybe it's done somewhere, but I don't see it.

@achow101

This comment has been minimized.

Copy link
Member Author

achow101 commented Jul 2, 2017

@Runn1ng BnB uses effective values for the inputs so the fee is accounted for when coins are selected. The effective values are calculated in SelectCoinsMinConf

@karel-3d

This comment has been minimized.

Copy link
Contributor

karel-3d commented Jul 2, 2017

That eff. value accounts for the inputs of the new transaction, but not for the outputs (plus the overhead of the tx itself, but that is only about 10 bytes).

In SelectCoinsMinConf, you already have the target, which does not account for that.

@achow101

This comment has been minimized.

Copy link
Member Author

achow101 commented Jul 2, 2017

Ah, yes. That is a bug. Thanks for finding that!

@achow101 achow101 force-pushed the achow101:bnb-coin-select branch Jul 2, 2017

@ryanofsky
Copy link
Contributor

ryanofsky left a comment

utACK c3136ea7e936714886b1329402931a3ec514afaf with squash. Changes since last review: adding bnb_used variable, getting rid of depth variable, using coin effective value instead of nValue to check for dust and testing this, moving effective value assert out of search loop.

@instagibbs
Copy link
Member

instagibbs left a comment

utACK

src/wallet/test/coinselector_tests.cpp Outdated
bool bnb_used;
empty_wallet();
add_coin(1);
vCoins.at(0).nInputBytes = 40; // Make sure that it has a egative effective value. The next check should assert if this somehow got through. Otherwise it will fail

This comment has been minimized.

Copy link
@instagibbs

instagibbs Mar 13, 2018

Member

egative

This comment has been minimized.

Copy link
@achow101

achow101 Mar 13, 2018

Author Member

Fixed

achow101 added some commits Mar 5, 2018

Implement Branch and Bound coin selection in a new file
Create a new file for coin selection logic and implement the BnB algorithm in it.
Remove coinselection.h -> wallet.h circular dependency
Changes CInputCoin to coinselection and to use CTransactionRef in
order to avoid a circular dependency. Also moves other coin selection
specific variables out of wallet.h to coinselectoin.h
Move current coin selection algorithm to coinselection.{cpp,h}
Moves the current coin selection algorithm out of SelectCoinsMinConf
and puts it in coinselection.{cpp,h}. The new function, KnapsackSolver,
instead of taking a vector of COutputs, will take a vector of CInputCoins
that is prepared by SelectCoinsMinConf.
Use a struct for output eligibility
Instead of specifying 3 parameters, use a struct for those parameters
in order to reduce the number of arguments to SelectCoinsMinConf.
Have SelectCoinsMinConf and SelectCoins use BnB or Knapsack and use it
Allows SelectCoinsMinConf and SelectCoins be able to switch between
using BnB or Knapsack for choosing coins.

Has SelectCoinsMinConf do the preprocessing necessary to support either
BnB or Knapsack. This includes calculating the filtering the effective
values for each input.

Uses BnB in CreateTransaction to find an exact match for the output.
If BnB fails, it will fallback to the Knapsack solver.

@achow101 achow101 force-pushed the achow101:bnb-coin-select branch to 73b5bf2 Mar 13, 2018

@achow101

This comment has been minimized.

Copy link
Member Author

achow101 commented Mar 13, 2018

Squashed

@ryanofsky
Copy link
Contributor

ryanofsky left a comment

utACK 73b5bf2. Only change since last review is comment fix and squash.

@instagibbs

This comment has been minimized.

Copy link
Member

instagibbs commented Mar 13, 2018

utACK 73b5bf2

@laanwj laanwj merged commit 73b5bf2 into bitcoin:master Mar 14, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

laanwj added a commit that referenced this pull request Mar 14, 2018

Merge #10637: Coin Selection with Murch's algorithm
73b5bf2 Add a test to make sure that negative effective values are filtered (Andrew Chow)
76d2f06 Benchmark BnB in the worst case where it exhausts (Andrew Chow)
6a34ff5 Have SelectCoinsMinConf and SelectCoins use BnB or Knapsack and use it (Andrew Chow)
fab0488 Add a GetMinimumFeeRate function which is wrapped by GetMinimumFee (Andrew Chow)
cd927ff Move original knapsack solver tests to coinselector_tests.cpp (Andrew Chow)
fb716f7 Move current coin selection algorithm to coinselection.{cpp,h} (Andrew Chow)
4566ab7 Add tests for the Branch and Bound algorithm (Andrew Chow)
4b2716d Remove coinselection.h -> wallet.h circular dependency (Andrew Chow)
7d77eb1 Use a struct for output eligibility (Andrew Chow)
ce7435c Move output eligibility to a separate function (Andrew Chow)
0185939 Implement Branch and Bound coin selection in a new file (Andrew Chow)
f84fed8 Store effective value, fee, and long term fee in CInputCoin (Andrew Chow)
12ec29d Calculate and store the number of bytes required to spend an input (Andrew Chow)

Pull request description:

  This is an implementation of the [Branch and Bound coin selection algorithm written by Murch](http://murch.one/wp-content/uploads/2016/11/erhardt2016coinselection.pdf) (@Xekyo). I have it set so this algorithm will run first and if it fails, it will fall back to the current coin selection algorithm. The coin selection algorithms and tests have been refactored to separate files instead of having them all in wallet.cpp.

  I have added some tests for the new algorithm and a test for all of coin selection in general. However, more tests may be needed, but I will need help with coming up with more test cases.

  This PR uses some code borrowed from #10360 to use effective values when selecting coins.

Tree-SHA512: b0500f406bf671e74984fae78e2d0fbc5e321ddf4f06182c5855e9d1984c4ef2764c7586d03e16fa4b578c340b21710324926f9ca472d5447a0d1ed43eb4357e
@ghost

This comment has been minimized.

Copy link

ghost commented Mar 15, 2018

noice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.