
Revisit Coin Selection #7664

Closed
morcos opened this issue Mar 9, 2016 · 26 comments
@morcos
Member

morcos commented Mar 9, 2016

There is concern that the current coin selection algorithm is too biased towards creating small inputs. We should look at some data after the change for 0.12 to see whether that is actually the case, and regardless work on improving the coin selection algorithm to address that concern before the 0.13 release.

See #7657 and #4906 for back story.

Tag with 0.13 milestone please

@jonasschnelli
Contributor

Agree with @morcos.
Something we might want to work towards is flexible/modular coin selection. Coin selection depends on the use case, and could therefore be made modular.

We could allow different coin selection classes (some OO magic / interfacing) and let users select them in fundrawtransaction, or via a global option only adjustable at startup: -coinselection=*class*.
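
A minimal sketch of what such an interface could look like (names hypothetical, not actual Bitcoin Core code):

    #include <algorithm>
    #include <vector>

    using CAmount = long long;           // stand-in for the wallet's amount type
    struct COutput { CAmount value; };   // simplified stand-in

    // Abstract strategy that a -coinselection=<class> option could pick between.
    class CoinSelector {
    public:
        virtual ~CoinSelector() = default;
        // Choose a subset of candidates summing to at least nTarget;
        // returns true and fills vSelected on success.
        virtual bool Select(const std::vector<COutput>& vCandidates,
                            CAmount nTarget,
                            std::vector<COutput>& vSelected) const = 0;
    };

    // One concrete strategy, purely for illustration: largest-first.
    class LargestFirstSelector : public CoinSelector {
    public:
        bool Select(const std::vector<COutput>& vCandidates, CAmount nTarget,
                    std::vector<COutput>& vSelected) const override {
            std::vector<COutput> sorted = vCandidates;
            std::sort(sorted.begin(), sorted.end(),
                      [](const COutput& a, const COutput& b) { return a.value > b.value; });
            CAmount nTotal = 0;
            for (const COutput& out : sorted) {
                vSelected.push_back(out);
                if ((nTotal += out.value) >= nTarget) return true;
            }
            return false; // insufficient funds
        }
    };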

Bitcoinj has a similar approach.

@gmaxwell
Contributor

In my experience the biggest impediment to doing anything with coin selection is the current testing approach, which simply fails if anything changes at all. So when you intend to change it, you just get a pile of failures and nothing that tells you whether the change is behaving sanely.

I don't see any problem with having a selection, but I'm somewhat doubtful of the utility. Having a selection doesn't replace having a good default-- and right now we don't even have a good default; and most of the time the user doesn't really have much useful information here (nothing the user would call a "use case" probably directly translates into the right strategy)-- keep in mind that most users have no idea about the underlying "coins" behavior in the system.

Perhaps the primary strategy distinguisher would be the importance of privacy-- e.g. should sweeping be kept to "do no privacy harm" levels, or be more aggressive-- but the user is often poorly equipped to weigh the privacy considerations (e.g. they don't know what their future risks are, and may not know how powerful transaction analysis is), nor are they usually able to weigh the privacy impacts on their counterparties or the ecosystem as a whole.

Another thing to keep in mind is that implementation of GROUPCHECKSIGVERIFY would make strategies that try to spend many inputs at once much more attractive. (beyond the argument I've made before that you generally want to pay fees sooner rather than later, since fees tend to go up over time)

@RHavar
Contributor

RHavar commented Mar 10, 2016

One untested idea that I haven't seen suggested before (quite possibly because I'm overlooking something): try to minimize the value of the change output. If it's less than the dust threshold, it can be safely omitted and donated to miners.

It should help keep the unspent set in check, while generating outputs that are actually useful for the next transaction(s). It also has the benefit of actively avoiding pointlessly large inputs (which are best saved for another transaction, especially if the user is sending multiple transactions before the next block confirms), and will also help create outputs that are useful for exact-change sends.

The biggest disadvantage I can see is that it'll tend to make the change very distinguishable from the destination, but that's a problem that already exists.
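
A rough sketch of the decision at the end of selection, assuming a stand-in dust threshold (the wallet's real dust rule is derived from the feerate):

    using CAmount = long long;

    static const CAmount DUST_THRESHOLD = 546; // stand-in value, not the wallet's actual rule

    // After selecting inputs: compute the would-be change and decide whether to
    // create an output for it. Returning 0 means "omit the change output and
    // let the remainder go to miners as fee".
    CAmount ChangeToCreate(CAmount nSelectedTotal, CAmount nTarget, CAmount nFee)
    {
        CAmount nChange = nSelectedTotal - nTarget - nFee;
        if (nChange < DUST_THRESHOLD) return 0; // donate the sliver instead of making dust
        return nChange;
    }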

@gmaxwell
Contributor

@RHavar technically that's what Bitcoin has always done. The subset sum tries to minimize the change (which generally minimizes the input set), and tiny amounts of change are just turned into fees. This has a bad behavior, though, of often resulting in small change (above the threshold, but not very useful for spending in the future); so big coins get spent and ground down to near-dust.

See ApproximateBestSubset in wallet.cpp.

@laanwj laanwj added this to the 0.13.0 milestone Mar 11, 2016
@laanwj
Member

laanwj commented Mar 11, 2016

#1643 should also be mentioned here. Bad coin selection with lots of small outputs has been a long-running issue.

@maflcko
Member

maflcko commented Mar 14, 2016

which generally minimizes the input set

I doubt this is true. The algorithm will generally start by selecting some appropriately sized large coins (vValues is sorted) and then "fill up the gaps" with small coins such that it either hits the target value or overshoots the target value + MIN_CHANGE by the smallest amount possible. (see e.g. the issue @laanwj linked to)
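
A loose paraphrase of that behavior (illustrative only, not the actual ApproximateBestSubset code in wallet.cpp):

    #include <limits>
    #include <random>
    #include <vector>

    using CAmount = long long;

    // Randomly include coins from the descending-sorted value list over many
    // passes, keeping whichever combination reaches the target with the least
    // overshoot.
    CAmount ApproximateSubsetSketch(const std::vector<CAmount>& vValueSortedDesc,
                                    CAmount nTargetValue, std::vector<bool>& vfBest,
                                    int nIterations = 1000)
    {
        std::mt19937 rng{std::random_device{}()};
        std::bernoulli_distribution fCoinFlip(0.5);
        CAmount nBest = std::numeric_limits<CAmount>::max();

        for (int i = 0; i < nIterations && nBest != nTargetValue; ++i) {
            std::vector<bool> vfIncluded(vValueSortedDesc.size(), false);
            CAmount nTotal = 0;
            for (std::size_t j = 0; j < vValueSortedDesc.size(); ++j) {
                if (!fCoinFlip(rng)) continue; // randomly skip this coin
                vfIncluded[j] = true;
                nTotal += vValueSortedDesc[j];
                if (nTotal >= nTargetValue && nTotal < nBest) {
                    nBest = nTotal;     // new closest-from-above total
                    vfBest = vfIncluded;
                }
            }
        }
        return nBest;
    }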

@RHavar
Contributor

RHavar commented Mar 14, 2016

One parameter you could add without introducing a huge amount of complexity would be an "ideal" number of outputs in your wallet; let's call it N. Having a low N would mean you can easily/cheaply sweep your wallet (at the expense of linking your coins more heavily, and big transaction chains). Having a high N would give you more coins to select from, hopefully resulting in a closer match.

N could be a configuration option. When the number of outputs is less than N, the wallet could actively try to increase it (spending from the largest coins first is a good way), and when it is greater than N, it could actively try to decrease it (sourcing extraneous inputs is a good way).
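
A sketch of that policy switch (the -targetutxos=<n> idea is hypothetical, not an existing option):

    #include <cstddef>

    enum class SelectionBias { GrowUtxoSet, ShrinkUtxoSet, Neutral };

    SelectionBias ChooseBias(std::size_t nCurrentUtxos, std::size_t N)
    {
        if (nCurrentUtxos < N)
            return SelectionBias::GrowUtxoSet;   // spend largest coins first, creating change
        if (nCurrentUtxos > N)
            return SelectionBias::ShrinkUtxoSet; // sweep extra small inputs into each spend
        return SelectionBias::Neutral;
    }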

@murchandamus
Contributor

Would it perhaps make sense to increase the limit of what gets converted to fees? It would seem to me that this could reduce the number of tiny UTXOs being created.

E.g. if the change is an order of magnitude smaller than the fee and a small portion of the total value of the transaction, add it to the fee:

    if (change < 0.1 * fee && change < 0.001 * target)
        fee += change;

That way, it would only apply to transactions where the change is a tiny portion of the total value transacted, and it would still never donate more than fee-sized amounts. I.e. really large transactions wouldn't suddenly donate significant money, and micro-transactions wouldn't lose a significant portion of the transacted value.

I haven't tested whether it would have significant impact, and these numbers are just off the top of my head as well.

@maflcko maflcko modified the milestones: 0.13.0, 0.14 Jun 15, 2016
@maflcko
Member

maflcko commented Jun 15, 2016

Changed the milestone to 0.14

laanwj added a commit to laanwj/bitcoin that referenced this issue Jul 1, 2016
This reverts PR bitcoin#4906, "Coinselection prunes extraneous inputs from
ApproximateBestSubset".

Apparently the previous behavior of slightly over-estimating the set of
inputs was useful in cleaning up UTXOs.

See also bitcoin#7664, bitcoin#7657, as well as 2016-07-01 discussion on #bitcoin-core-dev IRC.
@dooglus
Contributor

dooglus commented Jul 4, 2016

@gmaxwell Re "technically thats what Bitcoin has always done. The subset sum tries to minimize the change (which generally minimizes the input set), and tiny amounts of change are just turned to fees" I don't think that's true. From CWallet::SelectCoinsMinConf():

    ApproximateBestSubset(vValue, nTotalLower, nTargetValue, vfBest, nBest);
    if (nBest != nTargetValue && nTotalLower >= nTargetValue + MIN_CHANGE)
        ApproximateBestSubset(vValue, nTotalLower, nTargetValue + MIN_CHANGE, vfBest, nBest);

The code is trying to make exactly the nTargetValue, and if it is unable to make that exact amount it instead tries to make nTargetValue + MIN_CHANGE instead.

So most transactions end up with a change amount of 0.010xxxxx and you end up with a wallet full of these individual bitcents which are expensive to spend. See this tx for an example of a high fee caused by this 0.01* pollution.

[Edit: this one is even crazier]

In the tx linked above, notice that the change output is 0.01000109 BTC, while two of the inputs are 0.01000112 BTC. So by donating just three satoshis to the miner we could have avoided the change output completely in this example. I suspect (based on how change outputs are very often less than a thousand satoshis over 0.01 BTC) that the majority of change outputs could be eliminated if the coin selection code was changed to donate up to 1000 satoshis to the miner if it allows change to be avoided.
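
A sketch of that check, using the 1000-satoshi cap suggested above:

    using CAmount = long long;

    static const CAmount MAX_EXTRA_FEE = 1000; // satoshis, the cap suggested above

    // If the selected inputs overshoot (payment + fee) by no more than the cap,
    // skip the change output entirely and let the overshoot go to the miner.
    bool ShouldOmitChange(CAmount nSelectedTotal, CAmount nTargetPlusFee)
    {
        CAmount nOvershoot = nSelectedTotal - nTargetPlusFee;
        return nOvershoot >= 0 && nOvershoot <= MAX_EXTRA_FEE;
    }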

@gmaxwell
Contributor

gmaxwell commented Jul 5, 2016

@dooglus indeed, I forgot that. A legacy of old code that overly prioritized free transactions. If there is no change it should treat up to some threshold over as being just as good as an exact match, and turn it into change. The threshold could be something like half the cost of spending a change output at a current feerate-- e.g. at 50 base units per byte that threshold would be about 3600 base units.
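
The arithmetic behind that figure, assuming the ~144-byte marginal p2pkh input size given two comments below:

    using CAmount = long long;

    // Half the cost of spending a change output at the current feerate.
    // 144 bytes = 32 (txid) + 4 (vout) + 1 (script length) + 73 (signature)
    // + 34 (pubkey push), the marginal size of a p2pkh input.
    CAmount ExactMatchWindow(CAmount nFeePerByte)
    {
        const int nInputBytes = 144;
        return nFeePerByte * nInputBytes / 2; // at 50/byte: 50 * 144 / 2 = 3600
    }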

@dooglus
Contributor

dooglus commented Jul 5, 2016

@gmaxwell "and turn it into change" .. you mean turn it into fee, right? We're talking about avoiding creating change in this instance.

How are we to determine the current feerate? Do we just use CWallet::GetMinimumFee() for some fixed nTxBytes size? Using a hardcoded '3600 base units' seems like the wrong way to go about it.

@gmaxwell
Contributor

gmaxwell commented Jul 5, 2016

Fee yes. :)

The size should depend on the kind of keys being used for change and reflect the marginal size increase from adding the input, e.g. ~32+4+1+73+34 bytes for a regular p2pkh change address. Effectively, a coin worth less than the fee for that size costs more to spend than it's worth-- so we really shouldn't be making outputs smaller than that, and preferably not smaller than 100 times that value, since presumably people would like to keep their fees a small percentage of the funds moved. :)
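
A sketch of that economic floor (illustrative helper names):

    using CAmount = long long;

    const int nP2pkhInputBytes = 32 + 4 + 1 + 73 + 34; // = 144, per the comment above

    // Below this, an output costs more to spend than it is worth.
    CAmount MinEconomicOutput(CAmount nFeePerByte)
    {
        return nFeePerByte * nP2pkhInputBytes;
    }

    // Preferred floor: keep the spending fee at or below 1% of the output's value.
    CAmount PreferredMinOutput(CAmount nFeePerByte)
    {
        return 100 * MinEconomicOutput(nFeePerByte);
    }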

@dooglus
Contributor

dooglus commented Jul 11, 2016

we really shouldn't be making outputs smaller than that, and preferably not smaller than 100 times that value, since presumably people would like to keep their fees a small percentage of the funds moved. :)

I'm not sure this logic is sound. We're looking for the threshold at which we decide to just include the 'change' in the miner fee instead of making a new output. That is effectively giving 100% of the change to the miner.

Suppose the cost (in fees) of spending an output is 0.0001 and the value of the output is 0.0010. We are choosing between:

A) create a change output; spending it will involve a 10% fee, but the user will still get a 0.0009 value from spending it
B) don't create a change output, just immediately give it all in extra fees.

While A) involves a 10% fee when the output is spent, B) involves an immediate 100% fee. Neither choice lets the user keep their fee as a small percentage. Best for the user would be if we simply never created a change output worth less than the expected cost of spending it.

@gmaxwell
Contributor

gmaxwell commented Jul 13, 2016

There are two ways to handle change that is too small: give it as fees, or add more inputs to grow it. Whichever we do, it should result in making change only if it is a large multiple of the cost to spend it. If we're unable to address it some other way, I think it wouldn't be unreasonable to turn it into fees, even right up to that threshold: better to spend it and get faster confirmation than to end up with a worthless output that you'll just abandon, bloating up the txo set.

@luke-jr
Member

luke-jr commented Jul 14, 2016

I don't see how bitcent change is "expensive to spend". Seems like a reasonable decision to make it configurable, though - ideally you'd want to target your average spend amount, which varies from user to user. Another possibility there is to try to guess based on historical spends.

@RHavar
Contributor

RHavar commented Jul 14, 2016

I don't see how bitcent change is "expensive to spend".

Well, even if you only have 10 BTC, if that gets churned into 1000 outputs over time (as currently happens in 0.12, depending on your spend:receive habits), it's pretty painful to spend from, regardless of whether your normal spend amount is 0.01 BTC. For instance, even though I normally send tiny transactions, I'm in a position where trying to sweep 50% of my wallet results in a "transaction too large" error. And when I lower that amount so I can actually make a transaction, it costs $8 (?) in transaction fees or something.

I really do think that if it were just a tiny bit more aggressive in creating change-less transactions, most of the problems would go away (and privacy would improve).

@rebroad
Contributor

rebroad commented Aug 1, 2016

Surely the maths behind this has existed for years - it feels like a wheel that has already been invented. I'd have thought a variety of input sizes would be ideal, perhaps binary based, i.e. 16, 8, 4, 2, 1, etc., but stopping (when creating them) at the point where they are close to current tx fees. The wallet should therefore perhaps take into account all inputs it currently tracks when deciding what sized inputs to create next. The age of an input no longer plays such a part (i.e. priority), as I suspect most miners are simply focused on getting the maximum for their blocks these days.
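
A sketch of the power-of-two decomposition idea (purely illustrative):

    #include <vector>

    using CAmount = long long;

    // Split an amount into descending power-of-two denominations, stopping
    // once a piece would fall below min_piece (e.g. a value near current tx
    // fees); any remainder is folded into the last piece.
    std::vector<CAmount> BinaryDenominations(CAmount nAmount, CAmount nMinPiece)
    {
        std::vector<CAmount> vPieces;
        CAmount nDenom = 1;
        while (nDenom * 2 <= nAmount) nDenom *= 2; // largest power of two <= amount
        while (nAmount > 0 && nDenom >= nMinPiece) {
            if (nAmount >= nDenom) {
                vPieces.push_back(nDenom);
                nAmount -= nDenom;
            }
            nDenom /= 2;
        }
        if (nAmount > 0 && !vPieces.empty()) vPieces.back() += nAmount; // remainder
        return vPieces;
    }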

@murchandamus
Contributor

murchandamus commented Nov 4, 2016

I finished my thesis on coin selection on Monday.

I describe an algorithm for coin selection on pages 46-49 of the document, which I call "Branch'n'Bound". (I only came up with it after Scaling Bitcoin, so it wasn't included in my presentation there.)

It is a two-step scheme.

  1. A randomized depth-first search on a binary tree generated by omitting or including the effective values of UTXOs, to find an exact match (sketched below).
  2. If no exact match is found, equi-probable random selection from the UTXO pool. (This provides differently sized outputs for a better chance of exact matches, and has performed quite nicely by itself in my simulations.)
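
A minimal sketch of step 1, under the simplifying assumption that an exact match is required to the base unit (the thesis allows a small window and randomizes branch order; the actual implementation landed later in #10637):

    #include <cstddef>
    #include <vector>

    using CAmount = long long;

    // Depth-first search over the include/omit tree of effective values.
    // nRemaining is the sum of all effective values from nDepth onward and
    // enables the "cannot reach the target anymore" bound.
    bool BnBSearch(const std::vector<CAmount>& vEffectiveValues, std::size_t nDepth,
                   CAmount nSelected, CAmount nRemaining, CAmount nTarget,
                   std::vector<bool>& vfIncluded)
    {
        if (nSelected == nTarget) return true;              // exact match found
        if (nSelected > nTarget) return false;              // overshoot: prune branch
        if (nSelected + nRemaining < nTarget) return false; // unreachable: prune branch
        if (nDepth == vEffectiveValues.size()) return false;

        CAmount nValue = vEffectiveValues[nDepth];
        // Branch 1: include this UTXO.
        vfIncluded[nDepth] = true;
        if (BnBSearch(vEffectiveValues, nDepth + 1, nSelected + nValue,
                      nRemaining - nValue, nTarget, vfIncluded))
            return true;
        // Branch 2: omit it.
        vfIncluded[nDepth] = false;
        return BnBSearch(vEffectiveValues, nDepth + 1, nSelected,
                         nRemaining - nValue, nTarget, vfIncluded);
    }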

Simulated on the dataset provided by RHavar in the other thread, and compared to Bitcoin Core's current strategy, it:

  • has a smaller tendency to produce large input sets
  • increases exact matches (transactions without a change output) by more than 50×
  • maintains a smaller number of UTXOs in the wallet in the face of 2.06 incoming transactions per outgoing transaction
  • has a smaller average input set size (and thus lower user fees)
  • is less computationally intensive
  • fixes an issue with Bitcoin Core's fee estimation

This algorithm could be the basis for a new coin selection algorithm in Bitcoin Core, and I would like to invite questions and commentary. (If this thread is perhaps not the best medium for such a discussion, I'd be open to another format, e.g. the mailing list?)

@laanwj
Member

laanwj commented Nov 11, 2016

@xekyo Very interesting work! So that algorithm does coin selection better (at least on that dataset), implementing it would untangle the current coin selection code mess, and it is also less computationally intensive. Sounds great to me. Looking forward to it.

To avoid having a pull request wither forever, I think we should get some people to commit to testing it. Also at least one large site that uses Bitcoin Core for its wallet.

@RHavar
Contributor

RHavar commented Nov 11, 2016

To avoid having a pull request wither forever, I think we should get some people to commit to testing it. Also at least one large site that uses Bitcoin Core for its wallet.

I'd be happy to test a PR on a production site once it's reasonably stable. My site has made 2583 outgoing transactions in the last 7 days (and received 4774 in the same time period), so it's probably a reasonable stress test. (If someone could just ping or mention me on the PR, so I don't miss it.)

@rebroad
Contributor

rebroad commented Nov 13, 2016

@xekyo Just reading your paper. It says "Transaction malleability was most notably blamed to have contributed to the demise of MtGox" - I think you'll find that the only person who blamed the demise on this was the owner of MtGox - and it was later shown not to be a significant contributor by various researchers with greater credibility than the person who made the claim you refer to. ;) Anyway, sorry, off-topic.

@murchandamus
Contributor

murchandamus commented Nov 13, 2016

@RHavar: Thank you very much, I'll be happy to ping you. Thank you also for providing the dataset that I used to evaluate my thesis on. :)

@rebroad: Yeah, I'm aware of that, but I agree that I should have been clearer in that paragraph.

@maflcko maflcko modified the milestones: 0.15.0, 0.14.0 Jan 6, 2017
@murchandamus
Contributor

I just saw the proposed changes in #9404, sorry I didn't get my proposal done for 0.14. :(

The proposed fix is much better than the previous behavior (#4082), especially in computational effort, because unnecessary repetitions are forestalled.

However, the issue is due to the order in which a transaction is created: since the fee is estimated before the inputs are selected, the fee estimate is off whenever the number of selected inputs deviates from the initial estimate. The problem stems from the effective target of the selection growing as inputs are collected, because each selected input adds the cost of one input to the target.

Instead we should estimate the fee rate first, before the inputs are selected, and deduct the cost of spending an input from each input's value upon selection.

effectiveValue = utxo.value - feePerByte × bytesPerInput

We'd measure each input's "effective contribution" toward the fixed cost of target + transaction overhead + output costs. That way, we always get the fee right.
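
A sketch of that computation (hypothetical names):

    using CAmount = long long;

    struct Utxo { CAmount nValue; };

    // Net each input's own spending cost out of its value at selection time...
    CAmount EffectiveValue(const Utxo& utxo, CAmount nFeePerByte, int nBytesPerInput)
    {
        return utxo.nValue - nFeePerByte * nBytesPerInput;
    }

    // ...so the selection target stays fixed, independent of how many inputs
    // end up being selected.
    CAmount FixedTarget(CAmount nPayment, int nOverheadBytes, int nOutputBytes,
                        CAmount nFeePerByte)
    {
        return nPayment + nFeePerByte * (nOverheadBytes + nOutputBytes);
    }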

@murchandamus
Contributor

@RHavar: The Branch and Bound algorithm is being implemented in this patch by Andrew Chow: #10637

@morcos
Member Author

morcos commented Jul 6, 2017

This can bump to 0.16

@laanwj laanwj modified the milestones: 0.16.0, 0.15.0 Jul 6, 2017
lateminer pushed a commit to lateminer/bitcoin that referenced this issue Jan 7, 2018
This reverts PR bitcoin#4906, "Coinselection prunes extraneous inputs from
ApproximateBestSubset".

Apparently the previous behavior of slightly over-estimating the set of
inputs was useful in cleaning up UTXOs.

See also bitcoin#7664, bitcoin#7657, as well as 2016-07-01 discussion on #bitcoin-core-dev IRC.
@laanwj laanwj modified the milestones: 0.16.0, 0.17.0 Jan 11, 2018
@maflcko maflcko closed this as completed Mar 22, 2018
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022