Coin Selection with Murch's algorithm #10637

Open
wants to merge 26 commits into
from

7 participants
Contributor

achow101 commented Jun 20, 2017

This is an implementation of the Branch and Bound coin selection algorithm written by Murch (@Xekyo). I have it set so this algorithm will run first and if it fails, it will fall back to the current coin selection algorithm. The coin selection algorithms and tests have been refactored to separate files instead of having them all in wallet.cpp.

I have added some tests for the new algorithm and a test for all of coin selection in general. However, more tests may be needed, but I will need help with coming up with more test cases.

This PR uses some code borrowed from #10360 to use effective values when selecting coins.

@Xekyo

Xekyo suggested changes Jun 20, 2017 edited

Concept ACK, looks good to me. AFAICT, there's an issue with the lookahead that should be addressed still.

+ // Calculate remaining
+ CAmount remaining = 0;
+ for (CInputCoin utxo : utxo_pool) {
+ remaining += utxo.txout.nValue;
@Xekyo

Xekyo Jun 20, 2017

Contributor

Have you filtered utxo_pool to exclude UTXOs that have a net-negative value? Otherwise you're underestimating the lookahead here. To get an accurate figure for what you may still collect downtree, you should only add utxo.txout.nValue when it is >= 0.
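A minimal sketch of the lookahead concern raised in this comment, assuming the pool already holds effective values (value minus the fee to spend the input). The names `Lookahead` and `CanStillReachTarget` are hypothetical and are not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using CAmount = int64_t; // satoshis, mirroring Bitcoin Core's typedef

// Hypothetical helper: the most value that can still be collected further
// down the tree. Net-negative entries are skipped because selecting them
// can only lower the total, so they must not shrink the upper bound.
CAmount Lookahead(const std::vector<CAmount>& effective_values)
{
    CAmount remaining = 0;
    for (const CAmount value : effective_values) {
        if (value > 0) remaining += value;
    }
    return remaining;
}

// A branch may be cut only when even selecting everything that is left
// cannot reach the target.
bool CanStillReachTarget(CAmount selected, CAmount remaining, CAmount target)
{
    return selected + remaining >= target;
}
```

With effective values {5, -2, 3} and a target of 8, a naive sum gives 6 and would prune the branch, although 5 + 3 reaches the target. Summing only the positive entries gives 8 and keeps the branch alive.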

@instagibbs

instagibbs Jun 20, 2017

Member

@gmaxwell has concerns that Core wallet is only doing semi-sane utxo handling by spending these. With exact match + sane backoff algorithm this concern may be alleviated?

@achow101

achow101 Jun 20, 2017

Contributor

Indeed, that may be a problem. I will add that in as it is still good to have additional checks here even if done elsewhere.

@gmaxwell

gmaxwell Jun 20, 2017

Member

I don't have much of a concern here about the 0/negative effective value inputs: Failing to select negative effective value inputs for an exact match won't lead to a UTXO count inflation, because changeless transactions are by definition strictly UTXO reducing.

@Xekyo

Xekyo Jun 20, 2017 edited

Contributor

@instagibbs: I'm not completely opposed to spending net-negative UTXO, my concern here is primarily that it actually may cause the lookahead to be underestimated causing valid solutions not to be found.

I realize now that the knapsack algorithm would also not select uneconomic UTXO anymore, as if it had selected enough value before it reached them it would have already returned the set, and if it actually starts exploring them, cannot add more value in the first place.

Advocatus Diaboli: Would it be that terrible though, if UTXO were only considered when they actually have a net positive value? During times of low fees, they'd be used both during BnB and knapsack, during times of high fees, they wouldn't bloat the blocks and lose their owner money.

@instagibbs

instagibbs Jun 20, 2017

Member

I am not so concerned, was making sure concerns are brought up.

@gmaxwell

gmaxwell Jun 22, 2017

Member

@Xekyo we should assume that it would be terrible unless someone can show that it will not cause another massive UTXO bloat event... but that's off-topic here, as I don't think anyone has this concern with exact matches.

@runn1ng

runn1ng Jul 1, 2017 edited

The utxos with negative effective values are filtered anyway in wallet/wallet.cpp, which is the only place (except for tests) from where SelectCoinsBnB is called.

src/wallet/test/coinselector_tests.cpp
+ CoinSet actual_selection;
+ CAmount value_ret = 0;
+
+ /////////////////////////
@Xekyo

Xekyo Jun 20, 2017

Contributor

I would perhaps add a test that checks what happens if the utxo_pool includes a UTXO that is more costly to spend than its own value. As far as I can tell, this would currently reduce your lookahead and may cause a premature search failure.

src/wallet/test/coinselector_tests.cpp
+
+ // Select 5 Cent
+ add_coin(4 * CENT, 4, actual_selection);
+ add_coin(1 * CENT, 1, actual_selection);
@Xekyo

Xekyo Jun 20, 2017

Contributor

AFAICT utxo_pool has: 4, 3, 2, & 1. Since you're exploring randomly, selecting 5 then has two possible solutions: 4+1 and 3+2.

@achow101

achow101 Jun 20, 2017

Contributor

It is forced to try including first in these tests, so the solution is deterministic.

+ add_coin(5 * CENT, 5, utxo_pool);
+ add_coin(5 * CENT, 5, actual_selection);
+ add_coin(4 * CENT, 4, actual_selection);
+ add_coin(1 * CENT, 1, actual_selection);
@Xekyo

Xekyo Jun 20, 2017

Contributor

Under the above assumptions, there are two solutions here as well: 5+4+1, or 5+3+2.

@achow101

achow101 Jun 20, 2017

Contributor

It is forced to try including first in these tests, so the solution is deterministic.

src/wallet/wallet.cpp
- LogPrint(BCLog::SELECTCOINS, "total %s\n", FormatMoney(nBest));
- }
+ CInputCoin coin(pcoin, i);
+ coin.txout.nValue -= (output.nInputBytes < 0 ? 0 : effective_fee.GetFee(output.nInputBytes));
@Xekyo

Xekyo Jun 20, 2017

Contributor

It seems to me that you're also collecting coins that have a net-negative here. This will cause your lookahead to be underestimated, unless you cater to that case when calculating the remainder.
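A minimal sketch of the effective-value calculation in the quoted diff, assuming a fee rate expressed in satoshis per 1000 bytes. `FeeFor`, `EffectiveValue` and `IsEconomic` are hypothetical stand-ins; the real code uses `CFeeRate::GetFee`, whose rounding may differ from this simplification.

```cpp
#include <cassert>
#include <cstdint>

using CAmount = int64_t; // satoshis

// Simplified stand-in for a fee-rate type: satoshis per 1000 bytes.
CAmount FeeFor(CAmount sat_per_kb, int64_t bytes)
{
    return sat_per_kb * bytes / 1000;
}

// Effective value = nominal value minus the fee needed to spend the input.
// A negative input_bytes means "size unknown" and costs nothing, as in the
// quoted diff line.
CAmount EffectiveValue(CAmount value, int64_t input_bytes, CAmount sat_per_kb)
{
    const CAmount fee = input_bytes < 0 ? 0 : FeeFor(sat_per_kb, input_bytes);
    return value - fee;
}

// Hypothetical filter: only inputs with a positive effective value would
// enter the pool handed to the BnB search.
bool IsEconomic(CAmount value, int64_t input_bytes, CAmount sat_per_kb)
{
    return EffectiveValue(value, input_bytes, sat_per_kb) > 0;
}
```

At 10000 sat/kB a 148-byte input costs 1480 satoshis to spend, so a 1000-satoshi UTXO has an effective value of -480 and would be left out of the pool.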

Contributor

Xekyo commented Jun 20, 2017 edited

Have you tested the effect of random exploration vs largest first exploration?

  • Either way, BranchAndBound already guarantees that the global utxo set doesn't grow (for one output transactions) due to saving the change output.
  • LFE guarantees the creation of a minimal input set, and purposefully finds a possible solution. This should minimize the input set size variance. In my simulations BranchAndBound with LFE already caused a smaller average UTXO footprint than legacy Core selection.
  • Random Exploration could find a larger input set by skipping a key UTXO higher up in the tree. This could lead to the selection of a larger number of inputs, or in an edge case could even cause tries to be exhausted before a solution is found. This may increase input set variance, or could perhaps even exhaust small UTXOs too quickly for BnB to often find a viable solution.

I am not sure there is a significant privacy benefit for Random Exploration as for either selection method an attacker would already need to know about another eligible input that would achieve an exact match when switched out for one of the input set.

What benefit do you expect from using Random Exploration?

Contributor

achow101 commented Jun 20, 2017

@Xekyo I was thinking that Random Exploration would be better for privacy but I see that it probably wouldn't help. If you think it would be better to change to LFE, I can certainly do that.

Contributor

Xekyo commented Jun 20, 2017

@achow101: I don't know how strong the effect is, but I'd expect Random Exploration to increase the required computational effort.

Member

instagibbs commented Jun 20, 2017

Noting that this PR has fairly heavy overlap with #10360 .

From chatting with @achow101 the intention of this PR is to touch as little as possible while still getting BranchNBound coin selection.

To make this successful it should really only be run on first iteration of the loop in CreateTransaction, when nFeeRet == 0 and only use effective value for the BnB coin selection step, rather than the knapsack as well. Once nFeeRet becomes more than zero, interactions start to get strange without a more complete overhaul like #10360.

Member

instagibbs commented Jun 20, 2017

This PR I believe will still create just-over-dust change outputs when BnB finds an exact match. Whenever we are allowing BnB matches(first iteration) we should not make change outputs less than the exact match slack value.

src/wallet/wallet.cpp
+ // Calculate cost of change
+ // TODO: In the future, we should use the change output actually made for the transaction and calculate the cost
+ // required to spend it.
+ CAmount cost_of_change = effective_fee.GetFee(148+34); // 148 bytes for the input, 34 bytes for making the output
@Xekyo

Xekyo Jun 20, 2017

Contributor

This assumes that the input will be spent at a feerate at least as high as the current one. This was a valid assumption in my thesis, as I was using a fixed fee rate. I'm not sure whether this is a valid assumption for realnet transaction selection, as we've literally seen fees between 8-540 sat/byte in the past two weeks. We might want to consider discounting the cost of the input slightly.

@instagibbs

instagibbs Jun 20, 2017

Member

Depends on user time preferences. Could be an option that is set for those who regularly consolidate.

@achow101

achow101 Jun 20, 2017

Contributor

For now I think it is fine to use the current feerate.

Contributor

Xekyo commented Jun 20, 2017

@instagibbs: In fact, BnB is designed to only work when creating a transaction without a change output. If we were creating a change in the first place, the extensive search pattern would be unnecessarily wasteful.

Member

instagibbs commented Jun 20, 2017

To append onto my previous comments, any effective value match attempt should account for the fees just obtained by SelectCoins. Currently it completely ignores the newly-obtained fees, keeping the previous loop's value, and then asks if nFeeRet >= nFeeRequired to break from the loop(which currently is 0 on the first go-around).

fanquake added the Wallet label Jun 20, 2017

Contributor

achow101 commented Jun 21, 2017

I have made the BnB selector run only on the first pass of the coin selection loop. It is now set so that effective value is only used for the BnB selector and not the knapsack one. I have also added the negative effective value check and test just as a belt-and-suspenders thing. I also made BnB use Largest First Exploration instead of Random Exploration.

src/wallet/coinselection.cpp
+ backtrack = true;
+ } else if (value_ret >= target_value) { // Selected value is within range
+ done = true;
+ } else if (tries <= 0) { // Too many tries, exit
@Xekyo

Xekyo Jun 21, 2017 edited

Contributor

Here's an unexpected behavior in my algorithm: if there are a number of input combinations whose value_ret all exceed the target_value when tries == 0 is passed, tries can go into the negative.

The tries check should be moved to the top of the checks.
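A minimal recursive sketch of the search with the iteration check placed first, as suggested here. It is not the PR's iterative implementation: it works on plain effective values, explores the inclusion branch first on a pool sorted largest-first, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using CAmount = int64_t; // satoshis

// Depth-first branch and bound over effective values. Returns true when a
// selection lands in [target, target + cost_of_change].
bool BnBSearch(const std::vector<CAmount>& pool, std::size_t depth,
               CAmount selected, CAmount remaining, CAmount target,
               CAmount cost_of_change, int& tries, std::vector<bool>& picked)
{
    if (tries <= 0) return false;                         // 1. budget check comes first
    --tries;
    if (selected > target + cost_of_change) return false; // 2. overshoot: backtrack
    if (selected >= target) return true;                  // 3. inside the window: done
    if (depth == pool.size()) return false;               // 4. leaf without a match
    if (selected + remaining < target) return false;      // 5. lookahead cut

    const CAmount value = pool[depth];
    picked[depth] = true;                                 // inclusion branch first
    if (BnBSearch(pool, depth + 1, selected + value, remaining - value,
                  target, cost_of_change, tries, picked)) return true;
    picked[depth] = false;                                // then the omission branch
    return BnBSearch(pool, depth + 1, selected, remaining - value,
                     target, cost_of_change, tries, picked);
}

bool SelectBnB(std::vector<CAmount> pool, CAmount target, CAmount cost_of_change,
               int tries, std::vector<CAmount>& result)
{
    // Drop non-positive effective values, then sort descending (largest first).
    pool.erase(std::remove_if(pool.begin(), pool.end(),
                              [](CAmount v) { return v <= 0; }),
               pool.end());
    std::sort(pool.begin(), pool.end(), std::greater<CAmount>());
    CAmount remaining = 0;
    for (const CAmount v : pool) remaining += v;

    std::vector<bool> picked(pool.size(), false);
    result.clear();
    if (!BnBSearch(pool, 0, 0, remaining, target, cost_of_change, tries, picked)) {
        return false;
    }
    for (std::size_t i = 0; i < pool.size(); ++i) {
        if (picked[i]) result.push_back(pool[i]);
    }
    return true;
}
```

Because the budget is tested before anything else, `tries` can never drop below zero, and a run that exhausts its budget reports failure instead of continuing to examine overshooting combinations.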

@achow101

achow101 Jun 21, 2017

Contributor

Done

Owner

sipa commented Jun 21, 2017

Perhaps generically, we should never create change if the amount is less than the cost of creating + spending it (regardless of whether BnB was used to find the inputs or not)?
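A minimal sketch of this rule, assuming the P2PKH sizes quoted in the diff (148 bytes to spend the output later, 34 bytes to create it) and a fee rate in satoshis per 1000 bytes. `FeeFor` and `WorthCreatingChange` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

using CAmount = int64_t; // satoshis

// Simplified fee calculation: rate is in satoshis per 1000 bytes.
CAmount FeeFor(CAmount sat_per_kb, int64_t bytes)
{
    return sat_per_kb * bytes / 1000;
}

// Sizes quoted in the diff under review (P2PKH only).
constexpr int64_t CHANGE_SPEND_BYTES = 148;
constexpr int64_t CHANGE_OUTPUT_BYTES = 34;

// Hypothetical rule: create a change output only when it is worth more than
// the fee for creating it now plus the fee for spending it later.
bool WorthCreatingChange(CAmount change, CAmount sat_per_kb)
{
    const CAmount cost_of_change =
        FeeFor(sat_per_kb, CHANGE_SPEND_BYTES + CHANGE_OUTPUT_BYTES);
    return change > cost_of_change;
}
```

At 10000 sat/kB the cost of change is 1820 satoshis, so under this rule any leftover of 1820 satoshis or less would go to fees rather than into a new output.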

Member

instagibbs commented Jun 22, 2017

@sipa one question is if we should allow the wallet to consider consolidation-level prices for that change. Perhaps the user is in a hurry now, but would consider spending that change at a much slower pace.

Maybe for a first pass only consider the selected feerate, then Future Work allow a parameter which has more aggressive change protection given longer timescales.

Owner

sipa commented Jun 22, 2017

@instagibbs Yes, I agree; we should use long-term estimates for the spend part of change rather than the actual feerate the user is willing to pay now. Perhaps we can make it more conservative without doing that by using a factor 2 or 3 reduction?

src/wallet/wallet.cpp
+ // Calculate cost of change
+ // TODO: In the future, we should use the change output actually made for the transaction and calculate the cost
+ // required to spend it.
+ CAmount cost_of_change = effective_fee.GetFee(148+34); // 148 bytes for the input, 34 bytes for making the output
@gmaxwell

gmaxwell Jun 22, 2017

Member

Not correct for segwit. If this code ends up being changed to follow Pieter's suggestion of dividing the rate by two or three, it should be bounded by the min relay fee. (I'm not super fond of that suggestion.)

Member

gmaxwell commented Jun 22, 2017 edited

@sipa @achow101 it would be very very easy in the current PR to ask for another estimate for the change, I think ~two loc addition, and minor addition to the selectcoins arguments to pass down a second fee. I think this would be much more desirable than a fixed division. Future work could do things like make that second confirmation target configurable.
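A minimal sketch of a cost of change built from two fee rates: the current rate for creating the output and a separate long-horizon estimate for spending it later. Clamping the second rate at the minimum relay rate is an assumption borrowed from the review comment about dividing the rate. All names are hypothetical, and the byte sizes are the P2PKH figures from the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using CAmount = int64_t; // satoshis

// Simplified fee calculation: rate is in satoshis per 1000 bytes.
CAmount FeeFor(CAmount sat_per_kb, int64_t bytes)
{
    return sat_per_kb * bytes / 1000;
}

// Hypothetical two-rate cost of change:
//  - the 34-byte output is paid for now, at the current rate;
//  - the 148-byte input is paid for later, at a long-term estimate that is
//    never allowed to fall below the minimum relay rate.
CAmount CostOfChange(CAmount current_rate, CAmount long_term_rate,
                     CAmount min_relay_rate)
{
    const CAmount spend_rate = std::max(long_term_rate, min_relay_rate);
    return FeeFor(current_rate, 34) + FeeFor(spend_rate, 148);
}
```

For example, with a current rate of 50000 sat/kB and a long-term estimate of 5000 sat/kB, the cost of change is 1700 + 740 = 2440 satoshis, compared with 9100 satoshis if the current rate were applied to all 182 bytes.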

src/wallet/wallet.cpp
@@ -2562,7 +2562,7 @@ bool CWallet::CreateTransaction(const std::vector<CRecipient>& vecSend, CWalletT
}
const CAmount nChange = nValueIn - nValueToSelect;
- if (nChange > 0)
+ if (nChange > 0 && (!first_pass || nFeeRet == 0)) // nFeeRet is only 0 on the first pass if BnB was not used.
@gmaxwell

gmaxwell Jun 22, 2017

Member

Using nFeeRet to signal BNB usage is ugly. I think you shouldn't pass in nFeeRet at all, but have some explicit signal (e.g. a boolean return) for BNB usage; if it's set, then after SelectCoins set nFeeRet to nChange and use the same signal to bypass this branch.

I also think this condition is slightly incorrect but benign in the current code. Let's say our configured feerate were zero: now BNB could find a solution and leave nFeeRet==0. (nChange would currently be zero too, so it would be harmless, but it seems to me like the kind of thing that would be brittle in future changes.)
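A minimal sketch of the explicit signal suggested here, assuming the selector returns a small result struct. `SelectionResult`, `TxPlan` and `PlanOutputs` are hypothetical names and do not appear in the PR.

```cpp
#include <cassert>
#include <cstdint>

using CAmount = int64_t; // satoshis

// Hypothetical result type: which algorithm produced the selection is
// reported explicitly instead of being inferred from nFeeRet.
struct SelectionResult {
    bool success = false;
    bool used_bnb = false;
    CAmount selected = 0; // sum of the nominal values of the chosen inputs
};

struct TxPlan {
    CAmount fee;
    CAmount change;
};

// Caller-side handling: after an exact match the whole excess becomes the
// fee and no change output is made; otherwise the excess is change.
TxPlan PlanOutputs(const SelectionResult& sel, CAmount value_to_select,
                   CAmount fee_so_far)
{
    const CAmount excess = sel.selected - value_to_select;
    if (sel.used_bnb) return {excess, 0};
    return {fee_so_far, excess};
}
```

With this shape a zero feerate is no longer a special case: an exact match that leaves an excess of zero still takes the no-change path because `used_bnb` is set, whatever the fee value is.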

Contributor

achow101 commented Jun 23, 2017

Travis failure seems to be unrelated

runn1ng commented Jul 2, 2017

just fyi, I have used your code as a reference for this code

bitcoinjs/coinselect#13

runn1ng commented Jul 2, 2017

I have to say, I don't understand the target size; maybe there is a bug there.

In wallet.cpp, in CWallet::CreateTransaction, you create nValue, which seems to be the sum of all the outputs. Because the BnB is used only at the first pass, nFeeRet is 0 and nValueToSelect is just the sum of all the outputs.

This is then used as the exact target in the BnB algorithm.

However, you should add the cost of the outputs + the small cost of the tx overhead into the target (done here for the simple case on 1 output - https://github.com/Xekyo/CoinSelectionSimulator/blob/master/src/main/scala/one/murch/bitcoin/coinselection/StackEfficientTailRecursiveBnB.scala#L28 )

Maybe it's done somewhere, but I don't see it.

Contributor

achow101 commented Jul 2, 2017 edited

@runn1ng BnB uses effective values for the inputs so the fee is accounted for when coins are selected. The effective values are calculated in SelectCoinsMinConf

runn1ng commented Jul 2, 2017 edited

That eff. value accounts for the inputs of the new transaction, but not for the outputs (plus the overhead of the tx itself, but that is only about 10 bytes).

In SelectCoinsMinConf, you already have the target, which does not account for that.

Contributor

achow101 commented Jul 2, 2017

Ah, yes. That is a bug. Thanks for finding that!

@instagibbs

In general the semantics of first_run and used_bnb seem tightly linked, and are seemingly used interchangeably. Perhaps something to revisit.

src/wallet/wallet.cpp
@@ -2517,6 +2532,9 @@ bool CWallet::CreateTransaction(const std::vector<CRecipient>& vecSend, CWalletT
fFirst = false;
txout.nValue -= nFeeRet % nSubtractFeeFromAmount;
}
+ } else if (first_pass){
+ // On the first pass BnB selector, include the fee cost for outputs
+ output_fees += nFeeRateNeeded.GetFee(recipient.scriptPubKey.size());
@instagibbs

instagibbs Jul 3, 2017

Member

I think it may be better to directly check on serialized size of an output based on that pubkey
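A minimal sketch of why the serialized size differs from `scriptPubKey.size()`: a serialized output also carries an 8-byte amount and a CompactSize length prefix. The helper names are hypothetical; in the wallet itself one would presumably rely on Core's own serialization helpers rather than recompute this by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte length of Bitcoin's CompactSize prefix for a given value.
std::size_t CompactSizeLen(std::uint64_t n)
{
    if (n < 253) return 1;
    if (n <= 0xFFFF) return 3;
    if (n <= 0xFFFFFFFF) return 5;
    return 9;
}

// A serialized transaction output is an 8-byte amount, a CompactSize script
// length, and the script bytes themselves.
std::size_t SerializedOutputSize(std::size_t script_len)
{
    return 8 + CompactSizeLen(script_len) + script_len;
}
```

A 25-byte P2PKH script therefore serializes to 34 bytes, which matches the 34-byte output figure used elsewhere in this PR, so charging fees on the script length alone would undercount each output by 9 bytes.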

src/wallet/wallet.cpp
bool CWallet::SelectCoinsMinConf(const CAmount& nTargetValue, const int nConfMine, const int nConfTheirs, const uint64_t nMaxAncestors, std::vector<COutput> vCoins,
- std::set<CInputCoin>& setCoinsRet, CAmount& nValueRet) const
+ std::set<CInputCoin>& setCoinsRet, CAmount& nValueRet, CAmount& fee_ret, const CFeeRate effective_fee, bool& used_bnb, bool only_knapsack, int change_size) const
@instagibbs

instagibbs Jul 3, 2017

Member

right now it only uses one or the other, so !only_knapsack means used_bnb. I assume this interface is future-looking to where we may try multiple strategies?

@achow101

achow101 Jul 3, 2017

Contributor

The idea behind this was to have BnB be just strictly on top of the current behavior, and separating it like this makes that possible. The first time through the loop uses BnB, but then every time after that uses only the current selector. The loop behavior also stays the same since nFeeRet will remain 0 if the BnB fails.

src/wallet/wallet.cpp
+ // Get the fee rate to use for the change fee rate
+ CFeeRate change_feerate;
+ FeeCalculation feeCalc;
+ change_feerate = GetMinimumFeeRate(1008, ::mempool, ::feeEstimator, &feeCalc);
@instagibbs

instagibbs Jul 3, 2017

Member

just set it when declaring the variable two lines above and make it const

src/wallet/wallet.cpp
@@ -2544,6 +2490,7 @@ bool CWallet::CreateTransaction(const std::vector<CRecipient>& vecSend, CWalletT
AvailableCoins(vAvailableCoins, true, coinControl);
nFeeRet = 0;
+ bool first_pass = true;
@instagibbs

instagibbs Jul 3, 2017

Member

Add a comment saying this triggers BnB to be the only type tried when true

src/wallet/wallet.cpp
@@ -2556,7 +2503,22 @@ bool CWallet::CreateTransaction(const std::vector<CRecipient>& vecSend, CWalletT
CAmount nValueToSelect = nValue;
if (nSubtractFeeFromAmount == 0)
nValueToSelect += nFeeRet;
+
+ // Get the fee rate to use effective values in coin selection
@instagibbs

instagibbs Jul 3, 2017

Member

Since we're moving it already, there's no reason to not just move this block outside the loop, right? See: https://github.com/bitcoin/bitcoin/pull/10360/files#diff-b2bb174788c7409b671c46ccc86034bdR2476

src/wallet/wallet.cpp
+ return false;
+ }
+ }
+ if (first_pass) {
@instagibbs

instagibbs Jul 3, 2017

Member

this should be used_bnb? Kind of unclear what the difference is currently.

@@ -837,7 +850,8 @@ class CWallet : public CCryptoKeyStore, public CValidationInterface
* completion the coin set and corresponding actual target value is
* assembled
*/
- bool SelectCoinsMinConf(const CAmount& nTargetValue, int nConfMine, int nConfTheirs, uint64_t nMaxAncestors, std::vector<COutput> vCoins, std::set<CInputCoin>& setCoinsRet, CAmount& nValueRet) const;
+ // TODO: Change the hard coded change_size when we aren't only using P2PKH change outputs
@instagibbs

instagibbs Jul 3, 2017

Member

if we're going to change it later to something without a default/dynamic value, maybe just get rid of the default arg and pass it each time.

@@ -962,11 +976,23 @@ class CWallet : public CCryptoKeyStore, public CValidationInterface
*/
static CAmount GetMinimumFee(unsigned int nTxBytes, unsigned int nConfirmTarget, const CTxMemPool& pool, const CBlockPolicyEstimator& estimator, FeeCalculation *feeCalc = nullptr, bool ignoreGlobalPayTxFee = false);
/**
+ * Estimate the minimum fee rate considering user set parameters
+ * and the required fee
@instagibbs

instagibbs Jul 3, 2017

Member

perhaps note it doesn't have the maxtxfee check inside it, making it slightly asymmetrical to the total fee one.

runn1ng commented Jul 3, 2017

@achow101 for some reason, when I do simulations either on @xekyo set (in scala) or on bitcoinjs randomly generated data (with the algo rewritten into javascript), the total fees are actually lower when I make the target lower (that is, when I do not include the output cost in the target). So maybe tightening the target rejects more transactions and then the fallbacks somehow make better results.

I will investigate more when I have the time and write results here Xekyo/CoinSelectionSimulator#5

instagibbs and others added some commits May 5, 2017

@instagibbs @achow101 instagibbs add FeeRate companions to Get_Fee wallet helpers 6ace285
@achow101 achow101 Coin selection class with function for Branch and Bound coin selectio…
…n algo

Created a new class for handling coin selection. Added a function to this class
for the Branch and Bound coin selection algorithm described by Murch.
33fb5b4
@achow101 achow101 Test for coin selector 73562e8
@achow101 achow101 add Lookahead optimization
Optimization which will cut a branch if the remaining amount of the utxo
pool plus the amount already selected is less than the target. Removes
unnecessary iterations.
102b92c
@achow101 achow101 Move coin selection to wallet
Move the new coin selection code to wallet code. Change from using pair<CAmount, COutPoint> to CInputCoin
2af7aac
@achow101 achow101 Move original coin selector and tests
Moved the original coin selection algorithm to coinselection.cpp and its tests
to coin_selection_test.cpp
3ec4b08
@achow101 achow101 Use BnB selector before using original selector
Set SelectCoinsMinConf to use the new selector algorithm first before
falling back to the original selector algorithm.
7929ed8
@achow101 achow101 Calculate and use effective values for coin selection 6e032eb
@instagibbs @achow101 instagibbs Have COutput calculate size of the input required to spend it.
Note: This code was copied and pasted from a PR by @instagibbs
43fe0c8
@achow101 achow101 Add iteration count exhaustion test case
Added a test for iteration count exhaustion and fixed an issue where iteration
exhaustion did not cause failure.
1e4c48e
@achow101 achow101 Use the feerate to calculate the cost of change
Use the feerate passed into SelectCoins to calculate the cost of creating
a change output to be used in the branch and bound searcher.
b8ecbd1
@achow101 achow101 Use an actual fee rate for calculating effective values 1014949
@achow101 achow101 Test full coinselection algorithms
Test the full coin selection algorithm which begins with the branch and bound
and falls back to the original algo
ad09ecd
@achow101 achow101 Use BnB selector only on first pass, knapsack selector on rest
Separate SelectCoinsMinConf to allow for either BnB selector or
knapsack selector. This allows CreateTransaction to run the BnB
selector on the first pass and the knapsack selector on the rest
to keep the original selection algorithm as a fallback.
80556f8
@achow101 achow101 Ignore negative effective values and test for that
Do not add negative effective values to the utxo_pool for BnB selector.
Added a test to make sure that this happens
ab01003
@achow101 achow101 Use LFE instead of RE; no change with BnB
Instead of RE, use LFE.

Also don't make a change output if BnB works
4f9a8af
@achow101 achow101 Use CAmount instead of long; at() instead of []
Apparently don't use long :/

Switched to using at() instead of [] for better range checking
ee2cfdc
@achow101 achow101 assert that values in BnB are not negative
Added an assertion to check that the value being examined in the BnB
algo is never negative. The effective value calculation will remove
all of the negative values so this assert should never happen.
b9e9eab
@achow101 achow101 Have BnB selector calculate the transaction fee and then return that
Actually calculate the transaction fee with the effective values and
then return that so that the fee is correct.
f85ee5d
@achow101 achow101 Check for iteration exhaustion first before other checks 93227e2
@achow101 achow101 Do not make change if BnB selector was used ebd15e8
@achow101 achow101 Add a parameter to know that BnB was used for selection
Add a return parameter to selectcoins and selectcoinsminconf to allow
the caller to know whether bnb was used for the coinselection or not.
This allows us to cleanly skip change output creation since BnB
would only be successful if there is no change.
a087cca
@achow101 achow101 long target fee estimation for change cost and change size param
Use the largest possible fee estimation target for calculating the
cost of change.

Also pass in the size of the change in a parameter instead of
having it be a constant inside the function.
a3c1117
@achow101 achow101 Include output fee cost in the target value
Make sure to account for the cost of outputs in the target value.
3a38a00
@achow101 achow101 Various fixes 1ea8e55
@achow101 achow101 Unify use of used_bnb and first_pass; redefined selectcoins used_knap…
…sack

Changed used_bnb and first_pass to be just first_pass renamed as use_bnb

Redefined selections and selectcoinsminconf to have a use_bnb parameter
71e9d68
Member

instagibbs commented Jul 6, 2017

@runn1ng if you wouldn't mind, I'd like to know what the difference in rate of change creation is for each of those experiments as well.

Contributor

Xekyo commented Jul 6, 2017 edited

[…]the total fees are actually lower when I make the target lower (that is, when I do not include the output cost in the target).

@runn1ng: Um wait. "Target" is the amount to be selected. We are talking about the "cost of change" parameter that gives the leniency window for the exact match, right? Also, do you mean "input cost" instead of "output cost"?

It would be lovely if you could post your experiment's results somewhere, so we all have the same dataset to discuss.

runn1ng commented Jul 11, 2017

@Xekyo The problem with your experiment is that it's non-deterministic... but maybe I could put there some pre-set random seed

runn1ng commented Jul 13, 2017 edited

edit: ignore the graphs, see comment below

I am not sure if I should discuss the experiments here, or on murch's repo PRs :)

Anyway. I tried changing the cost of change on your scala code, as I wrote here - Xekyo/CoinSelectionSimulator#8 . Now I tried values from 0 to 100 as percent, and this is the result (note that left axis doesn't start at 0)

chart

google spreadsheet link

x axis is how much percent of the current cost of change is used; y axis is total cost on the big honeypot data set.

On the small random test cases there is no difference (what matters more there is the fallback, but that's for another experiment).

Note that there was a typo in the original experiments for the paper, which makes the cost of change "factor" 83%
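A minimal sketch of the parameter varied in this experiment, assuming the exact-match window is [target, target + cost of change] and the percentage simply scales the upper slack. `InMatchWindow` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

using CAmount = int64_t; // satoshis

// Returns true when `selected` counts as an exact match. `percent` scales
// how much of the cost of change is tolerated as overpayment: 100 keeps the
// full window, 0 accepts only selected == target.
bool InMatchWindow(CAmount selected, CAmount target, CAmount cost_of_change,
                   int percent)
{
    const CAmount slack = cost_of_change * percent / 100;
    return selected >= target && selected <= target + slack;
}
```

With a cost of change of 1820 satoshis, a selection that overshoots the target by 500 satoshis is accepted at 100% (slack 1820) but rejected at 25% (slack 455), so a lower percentage sends more transactions to the fallback selector.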

runn1ng commented Jul 13, 2017 edited

edit: ignore the graphs, see comment below

There is still the danger of overfitting to this one case though

If I try the same on the bitcoinjs example - defined here - small random examples with relatively few utxos - the graph looks completely different, and very dependent on what the "backup plan" is when no match is found

chart 1

google sheet

"rand" is random, "min" is sorting the utxos from the biggest to the lowest and starting from the biggest. (Both are total cost.) I am not sure what "min" strategy does on the big data.

(Just for interest, when I tried 100-200%, the graph goes up again, but not that quickly)

runn1ng commented Jul 14, 2017 edited

edit: ignore the graphs, see comment below

When I added bnb+min to the moneypot example, I got this result (x is still percentage of money cost)

chart 2

Again, it's always better to take the "minimal" strategy (that is, to sort utxos by value size descending, and then take from the start until it's enough).
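A minimal sketch of the "minimal" fallback described in this comment: sort the UTXOs by value in descending order and take from the front until the target is covered. It works on plain values and ignores fees, and `SelectLargestFirst` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

using CAmount = int64_t; // satoshis

// Largest-first selection. Returns the chosen values, or an empty vector
// when the funds are insufficient to reach the target.
std::vector<CAmount> SelectLargestFirst(std::vector<CAmount> utxos, CAmount target)
{
    std::sort(utxos.begin(), utxos.end(), std::greater<CAmount>());
    std::vector<CAmount> selected;
    CAmount total = 0;
    for (const CAmount value : utxos) {
        if (total >= target) break; // enough collected, stop taking inputs
        selected.push_back(value);
        total += value;
    }
    if (total < target) selected.clear();
    return selected;
}
```

For the pool {1, 5, 3, 2} and a target of 7, this picks 5 and 3 and leaves the two smallest UTXOs untouched.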

If you want to replicate this experiment - note that it takes, for reasons I don't understand, a terribly long time, and you will have to parallelize the simulation - luckily, that's trivial with scala parallel collections - see the commits at https://github.com/runn1ng/CoinSelectionSimulator/tree/exp_multi

I would like to hear @xekyo opinions :)

Also I would like to try this PR strategy, that is, to take the current core strategy as a backup.... but that is too complicated (especially in the javascript code), so I won't do it.

Contributor

achow101 commented Jul 14, 2017

@runn1ng would you be able to try the strategy with Core's current selector as fallback? The easiest way to do that would be to add/modify the test cases for coin selection.

Contributor

Xekyo commented Jul 14, 2017 edited

@runn1ng: re random data: I'd surmise that BnB doesn't perform well on small datasets as there are too few possible combinations. That could easily cause the fallback algorithm to dominate.

re moneypot:
What I do find confusing is that your total cost is so much higher than my result with Branch and Bound + Single Random Draw of 58,940,772.30. Were you still running with fixed fees of 10000 satoshi/kB?

I haven't comprehensively tested all possible fallback algorithms, it is possible that Largest First selection as a fallback to BnB is more efficient as it doesn't take away as many small utxo that can be used to create combinations.

Do I understand correctly that you calculated "cost of change" and then took a percentage of that, or is this percentage only on the cost of the input? If you did the former, it appears that using just the cost of an additional output as "cost of change" leads to a minimum, considering that 34 bytes is 18.7% of what I proposed as "cost of change" with output+input.

runn1ng commented Jul 15, 2017 edited

What I do find confusing is that your total cost is so much higher than my result with Branch and Bound + Single Random Draw of 58,940,772.30. Were you still running with fixed fees of 10000 satoshi/kB?

That is weird indeed.

I am running code from your repo. To be sure I reverted all my local changes and I still get 72506973.

When I made the correction here Xekyo/CoinSelectionSimulator#9 , I get total cost 70858076

I use only StackEfficientTailRecursiveBnB, should I try the other BnBs? edit: well, they get stack overflow, so I won't. :D

runn1ng commented Jul 15, 2017 edited

I am running the code through sbt run in the main directory. I look just at the total cost in the resulting csv.

runn1ng commented Jul 15, 2017 edited

I get totally different numbers than in your paper with the other scenarios too. The numbers don't correspond to any of the three tables, unfortunately.

edit: oooh, that's because I am running "MoneyPot After LF", which was the default scenario, but it's actually with additional UTXOs from a previous run. The actual scenario from the paper (the first one) is TestCaseMoneyPotEmpty, right.

Contributor

Xekyo commented Jul 15, 2017

Yes correct. The Moneypot after LF, is running the MoneyPot scenario starting with the resulting UTXO pool of running it with Largest First selection before.

runn1ng commented Jul 17, 2017

I found out that the two repos for coinselect simulation returned different results for the same strategy, so I painstakingly went through both of them and found where they differ... and put tons of PRs to both, so they now both return the same results with the same fees + setup

The differences in simulations were:

  • how they deal with insufficient funds (and how they detect it)
  • what is minimum change
  • what are the sizes of "defaut" input/output
  • some small bugs

...and those added up to significant differences. Anyway, when I fixed all the issues, these are the results/graphs I get:

this is for the moneypot scenario, with the fees at 10 sat/byte

chart 3
google sheet

this is for the moneypot scenario, when I increase the fee to 200 sat/byte (but I left the values, so more utxos become unspendable)

chart 6
google sheet

In both, rand is slightly better. I am not sure what happened, if it's because the scenario is different (without the large UTXO set) or because of the subtle differences in benchmarking. The shape is similar though.

This is the scenario of small randomly generated wallets

chart 5
google sheet

BnB+LF performs better, optimum about 50% cost of change.

So, different strategies/parameters are better at different scenarios. Again, there is danger of overfitting on one scenario - plus there might be some more subtle bugs in the benchmark code... in my wallet code, I will probably just use BnB+LF with 50% of cost and call it a day :)

I haven't shown this in graphs, but having BnB is always better than not having it. :)

If you want to repeat the tests, my forks of the repos are here 1 2

runn1ng commented Jul 17, 2017

@xekyo

Do I understand correctly that you calculated "cost of change" and then took a percentage of that, or is this percentage only on the cost of the input? If you did the former, it appears that using just the cost of an additional output as "cost of change" leads to a minimum, considering that 34 bytes is 18.7% of what I proposed as "cost of change" with output+input.

Hm, that doesn't seem to be the case, the minimum is not 18%, but 30% to 50% on these two scenarios.

Member

instagibbs commented Jul 17, 2017

@runn1ng is there a plausible explanation why not accounting for the full cost of the change is cheaper overall?

runn1ng commented Jul 18, 2017

@achow101

@runn1ng would you be able to try the strategy with Core's current selector as fallback? The easiest way to do that would be to add/modify the test cases for coin selection.

Hm, I already spent too much time on this... :/ I will see if I have time to look into the bitcoin coin selection tests and how to add benchmarks there, but not promising anything.

runn1ng commented Jul 18, 2017 edited

@instagibbs

I think - and that's a speculation - that it's because the target is "tighter", so the BnB will reject more "loose" matches and will continue searching until it finds a better match. So less fee is spent then, even when some matches are rejected that didn't have to be (and those spend more on fees).

Btw. An interesting thing I just noticed - in the "small random" example, there are not that many BnB matches in the first place! Around 30 (out of 10,000 transactions). It still has an effect on the result...

+ while (selection.at(depth).second) {
+ // Reset this utxo's selection
+ if (selection.at(depth).first) {
+ value_ret -= utxo_pool.at(depth).txout.nValue;
@runn1ng

runn1ng Jul 18, 2017 edited

This line never fires.

It never happens that a utxo is at the same time in an exclusion branch (which is what .second indicates) and also selected (which is what .first indicates). Which makes sense; you never select and not select a utxo at the same time :)

With all my simulations, this line never seems to fire (when I rewrote this to JS).

So the other line after if can also be deleted.

@runn1ng

runn1ng Jul 18, 2017

I also think that .second is not needed at all; all the information necessary is in the first and depth; the only situation where .first != !(.second) is after we backtrack here, but the information in .second is useless anyway (since we will change it anyway before we backtrack to it again).

@achow101

achow101 Jul 18, 2017

Contributor

Right. That appears to be a relic of when this randomly selected which branch to try first before I changed it to always try including first.

@runn1ng runn1ng added a commit to runn1ng/coinselect that referenced this pull request Jul 18, 2017

@runn1ng runn1ng Removing useless array 3d806ec
+ }
+ }
+
+ if (!done) {
@runn1ng

runn1ng Jul 18, 2017

This is never true here. done is never true when backtrack is true. (Istanbul caught that :))

@runn1ng

runn1ng Jul 18, 2017

I mean done is never true and this block always happens

@Xekyo

Xekyo Jul 18, 2017

Contributor

This block doesn't happen when backtrack is false and done is true which happens when a solution is found.

@runn1ng

runn1ng Jul 18, 2017

in that case, the while cycle terminates before that

@runn1ng

runn1ng Jul 18, 2017

In the if at the start of the while cycle, either backtrack or done is set, never both. We got here only when backtrack == true, so done cannot be true.
