Issue#1643: Coinselection prunes extraneous inputs from ApproximateBestSubset #4906

Merged
merged 3 commits into from Dec 8, 2015

Conversation

Projects
None yet
7 participants
@Xekyo
Contributor

Xekyo commented Sep 13, 2014

Improvement for Issue#1643:
A further pass over the available inputs has been added to ApproximateBestSubset after a candidate set has been found. It will prune any extraneous inputs in the selected subset, in order to decrease the number of inputs and the resulting change.

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Sep 13, 2014

Contributor

This is the first time I am trying to collaborate on an open-source project, please feel free to point out when I am doing something wrong.

Contributor

Xekyo commented Sep 13, 2014

This is the first time I am trying to collaborate on an open-source project, please feel free to point out when I am doing something wrong.

@BitcoinPullTester

This comment has been minimized.

Show comment
Hide comment
@BitcoinPullTester

BitcoinPullTester Sep 13, 2014

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4906_3b92ce7ad950ceba14f1cc6ad637923f71ca61be/ for binaries and test log.
This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/
Contact BlueMatt on freenode if something looks broken.

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4906_3b92ce7ad950ceba14f1cc6ad637923f71ca61be/ for binaries and test log.
This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/
Contact BlueMatt on freenode if something looks broken.

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Sep 13, 2014

Contributor

I was made aware by @sipa that a candidate set that ended up matching the targetValue with the addition of a dust UTXO would still be picked immediately. That seems to be correct.

However, unless one is trying to make a minuscule payment, it seems that the likelihood to add a bigger UTXO at some point that would exceed the target should be sufficiently larger, than the chance that you'd only add inputs that inch closer and closer to the target to match it finally without exceeding it. It should rather be unlikely to even happen in 1000 iterations (gut feeling), so the patch should work fine in the majority of situations.

Contributor

Xekyo commented Sep 13, 2014

I was made aware by @sipa that a candidate set that ended up matching the targetValue with the addition of a dust UTXO would still be picked immediately. That seems to be correct.

However, unless one is trying to make a minuscule payment, it seems that the likelihood to add a bigger UTXO at some point that would exceed the target should be sufficiently larger, than the chance that you'd only add inputs that inch closer and closer to the target to match it finally without exceeding it. It should rather be unlikely to even happen in 1000 iterations (gut feeling), so the patch should work fine in the majority of situations.

@gmaxwell

This comment has been minimized.

Show comment
Hide comment
@gmaxwell

gmaxwell Sep 15, 2014

Member

Have you simulated this, e.g. how a wallet progressed over time? I would expect this to result in grinding down the wallet into lots of dust change over time even worse than the current approach.

Generally, so long as it doesn't result in bloating things up to the point where the transaction confirms slowly, we should generally prefer to make transactions bigger (under the rational that transaction fees will increase over time or at worst stay constant). E.g. If you already have a change output, a pass that looks at the scriptpubkeys you're spending and keeps adding more inputs assigned to the same pubkeys until the fee increase is some threshold or the change exceeds 2x the payment value (or something like that) would probably result in lower total transaction fees over time. (and better privacy)

Member

gmaxwell commented Sep 15, 2014

Have you simulated this, e.g. how a wallet progressed over time? I would expect this to result in grinding down the wallet into lots of dust change over time even worse than the current approach.

Generally, so long as it doesn't result in bloating things up to the point where the transaction confirms slowly, we should generally prefer to make transactions bigger (under the rational that transaction fees will increase over time or at worst stay constant). E.g. If you already have a change output, a pass that looks at the scriptpubkeys you're spending and keeps adding more inputs assigned to the same pubkeys until the fee increase is some threshold or the change exceeds 2x the payment value (or something like that) would probably result in lower total transaction fees over time. (and better privacy)

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Sep 18, 2014

Contributor

@gmaxwell I had not simulated it in advance. I will do so though, just haven't gotten around to it yet. Perhaps I'll get around to it this evening, otherwise I will probably get around to it tomorrow.

Contributor

Xekyo commented Sep 18, 2014

@gmaxwell I had not simulated it in advance. I will do so though, just haven't gotten around to it yet. Perhaps I'll get around to it this evening, otherwise I will probably get around to it tomorrow.

@Xekyo Xekyo closed this Sep 18, 2014

@Xekyo Xekyo reopened this Sep 18, 2014

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Sep 23, 2014

Contributor

@gmaxwell So, I have done some simulations.
My approach was:
• To recreate the wallet logic in Python.
• To add the pruning feature conditionally
• To have a wallet of each type, starting wtih the same utxopool, execute and receive a number of payments.
• They obviously end up with the same amount, and I can compare the number and sizes of final utxo.

So far I have tried a few different scenarios, but I ended up being limited at around 10k utxo and 1000 operations, as my python implementation was not as fast as I had hoped.
My initial experiments suggest that the pruning wallet does end up with a slightly bigger number of utxo, but it's only slightly bigger than the regular wallet. (I'll be adding the results later, when I am on the right computer.)
I have started implementing the simulation in a compiled language to speed it up in order to do some more and bigger experiments.

Contributor

Xekyo commented Sep 23, 2014

@gmaxwell So, I have done some simulations.
My approach was:
• To recreate the wallet logic in Python.
• To add the pruning feature conditionally
• To have a wallet of each type, starting wtih the same utxopool, execute and receive a number of payments.
• They obviously end up with the same amount, and I can compare the number and sizes of final utxo.

So far I have tried a few different scenarios, but I ended up being limited at around 10k utxo and 1000 operations, as my python implementation was not as fast as I had hoped.
My initial experiments suggest that the pruning wallet does end up with a slightly bigger number of utxo, but it's only slightly bigger than the regular wallet. (I'll be adding the results later, when I am on the right computer.)
I have started implementing the simulation in a compiled language to speed it up in order to do some more and bigger experiments.

@laanwj laanwj added the Wallet label Sep 25, 2014

@gmaxwell

This comment has been minimized.

Show comment
Hide comment
@gmaxwell

gmaxwell Oct 8, 2014

Member

Please keep us updated on your progress. If you'd like some more eyes on your testing code, feel free to point it out.

Member

gmaxwell commented Oct 8, 2014

Please keep us updated on your progress. If you'd like some more eyes on your testing code, feel free to point it out.

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Oct 11, 2014

Contributor

@gmaxwell The code for my simulation can be found here: https://github.com/Xekyo/CoinSelectionSimulator

It is working now, and can be executed in bearable time, but the code could be a bit clearer. I am currently in my exam period and haven't had as much time to work on it, as I would like.

I have tried a few different strategies to select coins, and have been experimenting with different distributions of incoming and outgoing payments.

Different strategies:

  • "Regular Wallet" implements the algorithm currently in use as I understand it, to be used as a baseline.
  • "Pruning Wallet" selects the coins just like the regular wallet, but afterwards discards inputs that are smaller than the remaining change in the order they were selected.
  • "Pruning Wallet with minimum number of Inputs", like PW, but when pruning keeps at least the miminum number of Inputs.
  • "Double Wallet": Like PW, but when unable to directly match the target, aims for twice the target, in order to create change in the magnitude of what the user was trying to transfer instead of dust.

Some results:

Experiment 1:
Start: 5000 random UTXO from range(2500, 250000) satoshis
10000 transactions (randomly 50/50 distributed incoming/outgoing) in the range(540,250000) satoshis
Regular Wallet: 4853 changes created, in range(1,1591) satoshis, average change 63.01±97.78 satoshi, spent (1,12) inputs per transaction sent, average 1.44±1.15

... writing up my results I realize that I will want to create a .csv instead of the current text-form result, sorry gonna go back to studying now, but I'll probably be able to do it in the next break.


Some questions I haven't been able to answer satisfyingly:

  • What would constitute realistic data for incoming and outgoing transactions of one wallet (how many incoming/outgoing transactions, what average size and distributions for each direction, is it necessary to regard the change in value over time)?
  • Haven't researched yet: When does the required transaction fee increase?
Contributor

Xekyo commented Oct 11, 2014

@gmaxwell The code for my simulation can be found here: https://github.com/Xekyo/CoinSelectionSimulator

It is working now, and can be executed in bearable time, but the code could be a bit clearer. I am currently in my exam period and haven't had as much time to work on it, as I would like.

I have tried a few different strategies to select coins, and have been experimenting with different distributions of incoming and outgoing payments.

Different strategies:

  • "Regular Wallet" implements the algorithm currently in use as I understand it, to be used as a baseline.
  • "Pruning Wallet" selects the coins just like the regular wallet, but afterwards discards inputs that are smaller than the remaining change in the order they were selected.
  • "Pruning Wallet with minimum number of Inputs", like PW, but when pruning keeps at least the miminum number of Inputs.
  • "Double Wallet": Like PW, but when unable to directly match the target, aims for twice the target, in order to create change in the magnitude of what the user was trying to transfer instead of dust.

Some results:

Experiment 1:
Start: 5000 random UTXO from range(2500, 250000) satoshis
10000 transactions (randomly 50/50 distributed incoming/outgoing) in the range(540,250000) satoshis
Regular Wallet: 4853 changes created, in range(1,1591) satoshis, average change 63.01±97.78 satoshi, spent (1,12) inputs per transaction sent, average 1.44±1.15

... writing up my results I realize that I will want to create a .csv instead of the current text-form result, sorry gonna go back to studying now, but I'll probably be able to do it in the next break.


Some questions I haven't been able to answer satisfyingly:

  • What would constitute realistic data for incoming and outgoing transactions of one wallet (how many incoming/outgoing transactions, what average size and distributions for each direction, is it necessary to regard the change in value over time)?
  • Haven't researched yet: When does the required transaction fee increase?
@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Oct 19, 2014

Contributor

Hi,
Finished with my exams for this semester, finally had time to pick this up again.

(When I created the .csv files with my output, I realized that my random models would sometimes generate spending instructions that asked for more than the wallet's current value. While those were were ignored as impossible, they still got into the statistics, so I wanted to fix that first.)

I've been thinking about what one might want to be optimizing for. This is what I got so far:

  1. Non-dust change: The creation of small change outputs should be avoided. They bloat the blockchain and are expensive to spend.
    -> If a change output has to be created, it would be preferable to create a change output of the magnitude of the payment value. (Which would also help obscure recipient output and change output.)
  2. Privacy: UTXO should be picked non-deterministically, and as few different pubkeys as possible should be involved.
    -> Whenever the CoinSelection selects a UTXO, any UTXO assigned to the same pubkey should be preferred in selection.
    -> Small inputs that are significantly smaller than the size of the change and don't share their pubkey with a larger input should be pruned, as they increase transaction fee and decrease privacy.
  3. Minimize transaction fee A transaction input set should be preferred when it costs less to send.
    -> If spending some input costs more than what it contributes to the outputs it should not be added. Also, inputs that are extraneous should be pruned if that would lower the transaction fee.
  4. Consolidate small UTXO?
    Once created, very small UTXO are in the blockchain anyway.
    Is it preferable to spend very small UTXO, in order to remove them from the UTXO-pool, or to ignore them until they become valuable enough to spend in their own right?

Here are the statistics from my latest experiment.
ex19

Contributor

Xekyo commented Oct 19, 2014

Hi,
Finished with my exams for this semester, finally had time to pick this up again.

(When I created the .csv files with my output, I realized that my random models would sometimes generate spending instructions that asked for more than the wallet's current value. While those were were ignored as impossible, they still got into the statistics, so I wanted to fix that first.)

I've been thinking about what one might want to be optimizing for. This is what I got so far:

  1. Non-dust change: The creation of small change outputs should be avoided. They bloat the blockchain and are expensive to spend.
    -> If a change output has to be created, it would be preferable to create a change output of the magnitude of the payment value. (Which would also help obscure recipient output and change output.)
  2. Privacy: UTXO should be picked non-deterministically, and as few different pubkeys as possible should be involved.
    -> Whenever the CoinSelection selects a UTXO, any UTXO assigned to the same pubkey should be preferred in selection.
    -> Small inputs that are significantly smaller than the size of the change and don't share their pubkey with a larger input should be pruned, as they increase transaction fee and decrease privacy.
  3. Minimize transaction fee A transaction input set should be preferred when it costs less to send.
    -> If spending some input costs more than what it contributes to the outputs it should not be added. Also, inputs that are extraneous should be pruned if that would lower the transaction fee.
  4. Consolidate small UTXO?
    Once created, very small UTXO are in the blockchain anyway.
    Is it preferable to spend very small UTXO, in order to remove them from the UTXO-pool, or to ignore them until they become valuable enough to spend in their own right?

Here are the statistics from my latest experiment.
ex19

@gmaxwell

This comment has been minimized.

Show comment
Hide comment
@gmaxwell

gmaxwell Oct 20, 2014

Member

Very interesting results!

Member

gmaxwell commented Oct 20, 2014

Very interesting results!

@gmaxwell

This comment has been minimized.

Show comment
Hide comment
@gmaxwell

gmaxwell Oct 20, 2014

Member

Consolidate small UTXO? Once created, very small UTXO are in the blockchain anyway. Is it preferable to spend very small UTXO, in order to remove them from the UTXO-pool

It's preferable to spend them: since it reduces the storage for a minimal full node (see the pruning patches in #4701)... subject to the restriction that you don't want someone to be able to gratitiously increase your transaction costs by sending you tiny utxo.

Member

gmaxwell commented Oct 20, 2014

Consolidate small UTXO? Once created, very small UTXO are in the blockchain anyway. Is it preferable to spend very small UTXO, in order to remove them from the UTXO-pool

It's preferable to spend them: since it reduces the storage for a minimal full node (see the pruning patches in #4701)... subject to the restriction that you don't want someone to be able to gratitiously increase your transaction costs by sending you tiny utxo.

@RHavar

This comment has been minimized.

Show comment
Hide comment
@RHavar

RHavar Feb 18, 2015

Contributor

Some questions I haven't been able to answer satisfyingly:

What would constitute realistic data for incoming and outgoing transactions of one wallet (how many incoming/outgoing transactions, what average size and distributions for each direction, is it necessary to regard the change in value over time)?

Perhaps I can help. I have data from MoneyPot.com's hot wallet which you can use for simulation: https://gist.github.com/RHavar/7cd6f3fcf2bd3e485458

The positive amounts are amounts deposited, the negative amounts are sends. The data is sorted over time (oldest at the top, newest at the bottom) denominated in bitcoins.

Things to note:

  • deposits are only added to the list when they get 1 confirmation (so they tend to come in batches) and send amounts are added to the list instantly
  • At various times, the cumulative total dips below 0. This happens when people have won enough money to be able to withdraw more than the total people have put in the site. For the purpose of simulation, you might want to assume there is an additional 50 BTC input at the very start, to avoid this case.
Contributor

RHavar commented Feb 18, 2015

Some questions I haven't been able to answer satisfyingly:

What would constitute realistic data for incoming and outgoing transactions of one wallet (how many incoming/outgoing transactions, what average size and distributions for each direction, is it necessary to regard the change in value over time)?

Perhaps I can help. I have data from MoneyPot.com's hot wallet which you can use for simulation: https://gist.github.com/RHavar/7cd6f3fcf2bd3e485458

The positive amounts are amounts deposited, the negative amounts are sends. The data is sorted over time (oldest at the top, newest at the bottom) denominated in bitcoins.

Things to note:

  • deposits are only added to the list when they get 1 confirmation (so they tend to come in batches) and send amounts are added to the list instantly
  • At various times, the cumulative total dips below 0. This happens when people have won enough money to be able to withdraw more than the total people have put in the site. For the purpose of simulation, you might want to assume there is an additional 50 BTC input at the very start, to avoid this case.
@jgarzik

This comment has been minimized.

Show comment
Hide comment
@jgarzik

jgarzik Jul 23, 2015

Contributor

This PR has been open for a while, but garnered no ACKs. The author seems to have put a fair amount of time and thought into it. However, this definitely needs more review and testing.

Ping, maintainers/testers/helpers?

Contributor

jgarzik commented Jul 23, 2015

This PR has been open for a while, but garnered no ACKs. The author seems to have put a fair amount of time and thought into it. However, this definitely needs more review and testing.

Ping, maintainers/testers/helpers?

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Jul 23, 2015

Contributor

@jgarzik: Hi Jeff,
Thanks for the ping.
I had lost track of this, but I see now that @RHavar posted data that might help to explore what the patch would do with some realistic data for transaction sizes. I'll have a look this weekend.

Contributor

Xekyo commented Jul 23, 2015

@jgarzik: Hi Jeff,
Thanks for the ping.
I had lost track of this, but I see now that @RHavar posted data that might help to explore what the patch would do with some realistic data for transaction sizes. I'll have a look this weekend.

@gmaxwell gmaxwell added the Privacy label Aug 18, 2015

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Sep 1, 2015

Contributor

I've created another testcase using the MoneyPot.com data. On that one, I get different results yet again, this time the Regular wallet has the least UTXO in the end.

Looking over the results, I noticed that all wallets created a change output in the single digit satoshis. Do I remember correctly that the wallet shouldn't create Dust Outputs? Shouldn't the change be at least 540 satoshis? If so, I should probably fix that behavior still and then have another look.

However, if it ends up looking as promising as today, I would propose to just expire this Pull-Request.

Contributor

Xekyo commented Sep 1, 2015

I've created another testcase using the MoneyPot.com data. On that one, I get different results yet again, this time the Regular wallet has the least UTXO in the end.

Looking over the results, I noticed that all wallets created a change output in the single digit satoshis. Do I remember correctly that the wallet shouldn't create Dust Outputs? Shouldn't the change be at least 540 satoshis? If so, I should probably fix that behavior still and then have another look.

However, if it ends up looking as promising as today, I would propose to just expire this Pull-Request.

@MarcoFalke

This comment has been minimized.

Show comment
Hide comment
@MarcoFalke

MarcoFalke Sep 23, 2015

Member

@Xekyo:

prune extraneous inputs

Wouldn't "extraneous inputs" imply an issue in the coin selection algorithm which should be fixed instead? I invite you to look at #6696 which is a different approach. I am happy to shoot up some simulations if anyone is interested. (Ping me at the other PR)

Member

MarcoFalke commented Sep 23, 2015

@Xekyo:

prune extraneous inputs

Wouldn't "extraneous inputs" imply an issue in the coin selection algorithm which should be fixed instead? I invite you to look at #6696 which is a different approach. I am happy to shoot up some simulations if anyone is interested. (Ping me at the other PR)

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Nov 20, 2015

Member

From what I understand ApproximateBestSubset is an approximate algorithm for the following:

Input: [sizes], nLow, nTarget
Output: X=subset([sizes]) for which nTargetValue <= sum(X) < nLow
        and sum(X)-nTargetValue as small as possible

The "approximate" part means that it may select a solution which is not the optimal one (e.g. sum(X)-nTargetValue is not really as small as possible).

Your fix drops elements i from the result X for which sum(X)-i is still >= nTargetvalue, which if possible leads to a trivially better solution.

From what I understand this can improve the result because vValue is sorted from low to high, and elements are added in that order, it can happen that an element is added to get the sum above nTargetValue, but makes earlier-added small elements redundant.

  • What about moving the post-processing step to the end? This removes the performance impact per checked subset, and still makes sure redundant outputs are removed from the final result.

I've managed to reproduce this as well. For e.g. these input values:

vValues = [10, 10, 10, 10, 10, 10, 10, 10, 10, 100, 1000]
nTargetValue = 251

ApproximateBestSubset sometimes selects [10, 10, 1000]. Postprocessing like this can drop the redundant 10 inputs.

Concept ACK - this cannot make things worse. Would be nice to have a unit test.

Member

laanwj commented Nov 20, 2015

From what I understand ApproximateBestSubset is an approximate algorithm for the following:

Input: [sizes], nLow, nTarget
Output: X=subset([sizes]) for which nTargetValue <= sum(X) < nLow
        and sum(X)-nTargetValue as small as possible

The "approximate" part means that it may select a solution which is not the optimal one (e.g. sum(X)-nTargetValue is not really as small as possible).

Your fix drops elements i from the result X for which sum(X)-i is still >= nTargetvalue, which if possible leads to a trivially better solution.

From what I understand this can improve the result because vValue is sorted from low to high, and elements are added in that order, it can happen that an element is added to get the sum above nTargetValue, but makes earlier-added small elements redundant.

  • What about moving the post-processing step to the end? This removes the performance impact per checked subset, and still makes sure redundant outputs are removed from the final result.

I've managed to reproduce this as well. For e.g. these input values:

vValues = [10, 10, 10, 10, 10, 10, 10, 10, 10, 100, 1000]
nTargetValue = 251

ApproximateBestSubset sometimes selects [10, 10, 1000]. Postprocessing like this can drop the redundant 10 inputs.

Concept ACK - this cannot make things worse. Would be nice to have a unit test.

@MarcoFalke

This comment has been minimized.

Show comment
Hide comment
@MarcoFalke

MarcoFalke Nov 20, 2015

Member

@laanwj Great that you had a look at this.

vValue is sorted from low to high ...

Guess I missed that during review.

unit test

@Xekyo are you still working on this? Should I take over?

Member

MarcoFalke commented Nov 20, 2015

@laanwj Great that you had a look at this.

vValue is sorted from low to high ...

Guess I missed that during review.

unit test

@Xekyo are you still working on this? Should I take over?

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Nov 21, 2015

Contributor

Hi,

Am 20.11.2015 11:11 schrieb "MarcoFalke" notifications@github.com:

@Xekyo are you still working on this? Should I take over?

I'd be interested to take another look, but I'm currently traveling. Please
feel free to take over. Otherwise, I'll check it out in December.

Xekyo

Contributor

Xekyo commented Nov 21, 2015

Hi,

Am 20.11.2015 11:11 schrieb "MarcoFalke" notifications@github.com:

@Xekyo are you still working on this? Should I take over?

I'd be interested to take another look, but I'm currently traveling. Please
feel free to take over. Otherwise, I'll check it out in December.

Xekyo

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Dec 6, 2015

Contributor

@laanwj I've edited my fork to move the post-processing step to the end of ApproximateBestSubset. However, this patch may cause fewer dust outputs to be spent which contradicts Greg's assessment above. Are you sure about "cannot make things worse"? I feel my simulations have been somewhat inconclusive on that point.

Contributor

Xekyo commented Dec 6, 2015

@laanwj I've edited my fork to move the post-processing step to the end of ApproximateBestSubset. However, this patch may cause fewer dust outputs to be spent which contradicts Greg's assessment above. Are you sure about "cannot make things worse"? I feel my simulations have been somewhat inconclusive on that point.

Coinselection prunes extraneous inputs from ApproximateBestSubset
A further pass over the available inputs has been added to ApproximateBestSubset after a candidate set has been found. It will prune any extraneous inputs in the selected subset, in order to decrease the number of input and the resulting change.
@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Dec 6, 2015

Contributor

@MarcoFalke I realized my mistake and fixed it. After all my commits, I did the rebase as you suggested, and pushed to "Fix-#1643". I hope I did that right. :)

Contributor

Xekyo commented Dec 6, 2015

@MarcoFalke I realized my mistake and fixed it. After all my commits, I did the rebase as you suggested, and pushed to "Fix-#1643". I hope I did that right. :)

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Dec 6, 2015

Contributor

I've added a simple test with

   vValues = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 100, 100]
   nTargetValue = 222

with an expected nBest == 230 on one iteration.

Sorry, I'm not sure how I would run the test on my machine right now, so I figured I'd just push it. I'll check back tomorrow.

Contributor

Xekyo commented Dec 6, 2015

I've added a simple test with

   vValues = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 100, 100]
   nTargetValue = 222

with an expected nBest == 230 on one iteration.

Sorry, I'm not sure how I would run the test on my machine right now, so I figured I'd just push it. I'll check back tomorrow.

@MarcoFalke

View changes

src/test/coinselection_tests.cpp
+
+BOOST_FIXTURE_TEST_SUITE(coinselection_tests, BasicTestingSetup)
+
+BOOST_AUTO_TEST_CASE(sanity)

This comment has been minimized.

@MarcoFalke

MarcoFalke Dec 7, 2015

Member

I think you can just add this to src/wallet/test/wallet_tests.cpp.

@MarcoFalke

MarcoFalke Dec 7, 2015

Member

I think you can just add this to src/wallet/test/wallet_tests.cpp.

This comment has been minimized.

@Xekyo

Xekyo Dec 7, 2015

Contributor

Ah thanks, I had been wondering why there was no wallet tests in the regular test directory.

@Xekyo

Xekyo Dec 7, 2015

Contributor

Ah thanks, I had been wondering why there was no wallet tests in the regular test directory.

@laanwj

View changes

src/wallet/wallet.cpp
+ //Reduces the approximate best subset by removing any inputs that are smaller than the surplus of nTotal beyond nTargetValue.
+ for (unsigned int i = 0; i < vValue.size(); i++)
+ {
+ if (vfBest[i] && (nBest - vValue[i].first) >= nTargetValue )

This comment has been minimized.

@laanwj

laanwj Dec 7, 2015

Member

Nit: indentation

@laanwj

laanwj Dec 7, 2015

Member

Nit: indentation

This comment has been minimized.

@Xekyo

Xekyo Dec 7, 2015

Contributor

I've replaced the tab with four spaces.

@Xekyo

Xekyo Dec 7, 2015

Contributor

I've replaced the tab with four spaces.

@laanwj

View changes

src/wallet/wallet.cpp
@@ -1625,13 +1625,21 @@ static void ApproximateBestSubset(vector<pair<CAmount, pair<const CWalletTx*,uns
nBest = nTotal;
vfBest = vfIncluded;
}
- nTotal -= vValue[i].first;

This comment has been minimized.

@laanwj

laanwj Dec 7, 2015

Member

This (along with && !fReachedTarget is a new, different change to the algorithm than the post-processing step initially proposed.

I think we should separate these into different PRs, the postprocessing to remove extraneous outputs below is obviously correct, but I'm not sure about this.

@laanwj

laanwj Dec 7, 2015

Member

This (along with && !fReachedTarget is a new, different change to the algorithm than the post-processing step initially proposed.

I think we should separate these into different PRs, the postprocessing to remove extraneous outputs below is obviously correct, but I'm not sure about this.

This comment has been minimized.

@Xekyo

Xekyo Dec 7, 2015

Contributor

@laanwj Fair enough, I'll take it out and create another PR later.

@Xekyo

Xekyo Dec 7, 2015

Contributor

@laanwj Fair enough, I'll take it out and create another PR later.

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Dec 7, 2015

Member

ACK

Member

laanwj commented Dec 7, 2015

ACK

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Dec 7, 2015

Contributor

I rebased and used fixup on the commits that only fixed previous mistakes.

Contributor

Xekyo commented Dec 7, 2015

I rebased and used fixup on the commits that only fixed previous mistakes.

@Xekyo

This comment has been minimized.

Show comment
Hide comment
@Xekyo

Xekyo Dec 7, 2015

Contributor

@laanwj Travis had cleared my patch just when I realized that I had also messed up the indentation on the test. I've rebased it into two commits, one commit for the code, and one for the test. I expect this to turn green again momentarily. Sorry for keeping Travis so busy. :(

Is there anything else that needs to be done about this PR?

Contributor

Xekyo commented Dec 7, 2015

@laanwj Travis had cleared my patch just when I realized that I had also messed up the indentation on the test. I've rebased it into two commits, one commit for the code, and one for the test. I expect this to turn green again momentarily. Sorry for keeping Travis so busy. :(

Is there anything else that needs to be done about this PR?

@laanwj laanwj merged commit fc0f52d into bitcoin:master Dec 8, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

laanwj added a commit that referenced this pull request Dec 8, 2015

Merge pull request #4906
fc0f52d Added a test for the pruning of extraneous inputs after ApproximateBestSet (Murch)
af9510e Moved set reduction to the end of ApproximateBestSubset to reduce performance impact (Murch)
5c03483 Coinselection prunes extraneous inputs from ApproximateBestSubset (AlSzacrel)
@MarcoFalke

This comment has been minimized.

Show comment
Hide comment
@MarcoFalke

MarcoFalke Dec 8, 2015

@Xekyo This diff is in the wrong commit. But now it's too late to fix.

You may have used

git reset HEAD~2

and then commit the two files separately.

@Xekyo This diff is in the wrong commit. But now it's too late to fix.

You may have used

git reset HEAD~2

and then commit the two files separately.

@MarcoFalke

This comment has been minimized.

Show comment
Hide comment
@MarcoFalke

MarcoFalke Dec 9, 2015

Member

Wallet code itself looks good! Post-merge Tested ACK.

@Xekyo Thanks for sticking with this so long! There seems to be a small issue with the test code but I will create another PR for this...

Note @laanwj: vValue is not sorted from low to high but from high to low but I think you meant it the right way. ;)

Member

MarcoFalke commented Dec 9, 2015

Wallet code itself looks good! Post-merge Tested ACK.

@Xekyo Thanks for sticking with this so long! There seems to be a small issue with the test code but I will create another PR for this...

Note @laanwj: vValue is not sorted from low to high but from high to low but I think you meant it the right way. ;)

@MarcoFalke MarcoFalke referenced this pull request Dec 9, 2015

Merged

[wallet] Cleanup tests #7193

@MarcoFalke

This comment has been minimized.

Show comment
Hide comment
@MarcoFalke

MarcoFalke Dec 9, 2015

Member

Oh, and you mentioned this will improve cases such as #1643. Which is not true, at least for this very transaction mentioned in #1643.

Still, I assume pruning will help for really large wallets (like the ones of exchanges or heavy users) when they have a odd distribution of input amount sizes.

Member

MarcoFalke commented Dec 9, 2015

Oh, and you mentioned this will improve cases such as #1643. Which is not true, at least for this very transaction mentioned in #1643.

Still, I assume pruning will help for really large wallets (like the ones of exchanges or heavy users) when they have a odd distribution of input amount sizes.

laanwj added a commit that referenced this pull request Dec 10, 2015

Coinselection prunes extraneous inputs from ApproximateBestSubset
This is a combination of 3 commits.

- Coinselection prunes extraneous inputs from ApproximateBestSubset
  A further pass over the available inputs has been added to ApproximateBestSubset after a candidate set has been found. It will prune any extraneous inputs in the selected subset, in order to decrease the number of input and the resulting change.
- Moved set reduction to the end of ApproximateBestSubset to reduce performance impact
- Added a test for the pruning of extraneous inputs after ApproximateBestSet

Github-Pull: #4906
Rebased-From: 5c03483 af9510e fc0f52d

@morcos morcos referenced this pull request Mar 9, 2016

Closed

Revisit Coin Selection #7664

laanwj added a commit to laanwj/bitcoin that referenced this pull request Jul 1, 2016

wallet: Revert input selection post-pruning
This reverts PR #4906, "Coinselection prunes extraneous inputs from
ApproximateBestSubset".

Apparently the previous behavior of slightly over-estimating the set of
inputs was useful in cleaning up UTXOs.

See also #7664, #7657, as well as 2016-07-01 discussion on #bitcoin-core-dev IRC.

lateminer added a commit to lateminer/bitcoin that referenced this pull request Jan 7, 2018

wallet: Revert input selection post-pruning
This reverts PR #4906, "Coinselection prunes extraneous inputs from
ApproximateBestSubset".

Apparently the previous behavior of slightly over-estimating the set of
inputs was useful in cleaning up UTXOs.

See also #7664, #7657, as well as 2016-07-01 discussion on #bitcoin-core-dev IRC.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment