
Make the Largest-First algorithm pay for outputs collectively. #73

Conversation

@jonathanknowles (Contributor) commented on May 7, 2020

Related Issue

#39

Summary

This PR changes the reference implementation of the Largest-First coin selection algorithm to match the specification.

After applying this PR, the algorithm should always succeed in paying for a given set of outputs of total value v, provided that the total value u of available inputs satisfies u ≥ v (unless the maximum input count is exceeded).

Old implementation (before applying this PR)

Rather than considering the total collective value of all outputs when attempting to find a selection of inputs, the old implementation considered each output independently, and for each output, attempted to find a selection of inputs to cover just that output.

This created an unnecessary cardinality restriction, namely that a single unique input could only be used to pay for at most one unique output, which meant that any set of n outputs could only be paid for if there were at least n inputs available. Under this restriction, the algorithm could fail even if the total value of inputs was sufficient to cover the required outputs.
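For concreteness, here is a minimal sketch of the old per-output scheme over bare integer values. The name `selectPerOutput` and the simplified types are stand-ins for illustration, not the actual code from this repository:

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- Old scheme (sketch): pay for each output separately, consuming
-- inputs from the remaining UTxO as we go. Because every output needs
-- its own selection, n outputs require at least n inputs, even when a
-- single large input could have covered them all.
selectPerOutput
    :: [Integer]          -- available input values
    -> [Integer]          -- requested output values
    -> Maybe [[Integer]]  -- one input selection per output
selectPerOutput inputs = go (sortOn Down inputs)
  where
    go _ [] = Just []
    go available (output : rest) = do
        (selected, remaining) <- coverOne output available
        selections <- go remaining rest
        pure (selected : selections)

    -- Take inputs from the head of the descending list until this one
    -- output is covered; fail if the inputs run out first.
    coverOne target = loop 0 []
      where
        loop acc sel rest
            | acc >= target = Just (reverse sel, rest)
        loop acc sel (i : is) = loop (acc + i) (i : sel) is
        loop _ _ [] = Nothing
```

Under this sketch, `selectPerOutput [4, 1, 1] [2, 1, 2]` evaluates to `Nothing`: the inputs total 6 and the outputs total 5, but the third output cannot be paired with enough remaining inputs.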

New implementation (after applying this PR)

The updated algorithm now considers the total value of all outputs collectively, rather than considering them individually.

Under this relaxed scheme, it will always be possible to pay for a given set of outputs of total value v if the total value u of available inputs satisfies u ≥ v (unless the maximum input count is exceeded).
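As a sketch of the new scheme over the same bare integer values (`largestFirstSketch` is an illustrative stand-in, not the real function, which works with full transaction inputs and outputs):

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- New scheme (sketch): sum all outputs, sort the inputs in descending
-- order, then take inputs from the head of the list until the running
-- total covers the grand total. (The real algorithm also enforces a
-- maximum input count, which is omitted here.)
largestFirstSketch
    :: [Integer]                   -- available input values
    -> [Integer]                   -- requested output values
    -> Maybe ([Integer], Integer)  -- (selected inputs, change value)
largestFirstSketch inputs outputs = go 0 [] (sortOn Down inputs)
  where
    total = sum outputs
    go acc sel rest
        | acc >= total = Just (reverse sel, acc - total)
    go acc sel (i : is) = go (acc + i) (i : sel) is
    go _ _ [] = Nothing  -- total input value is insufficient
```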

This change fixes a subtle problem with our implementation of Largest-First.

Previously, Largest-First would fail to produce a result if the number of available inputs was lower than the number of requested outputs. However, under the new relaxed scheme, Largest-First will always succeed in paying for a given set of outputs of total value v as long as the total value u of the UTxO satisfies u ≥ v, regardless of how many inputs there are.

This change updates several unit tests that previously expected failure to now expect a successful result. One test is no longer required at all, as Largest-First no longer imposes the cardinality restriction that the number of inputs must be greater than or equal to the number of outputs.

Add a function that indicates whether or not a given error value is one that can be returned by the Largest-First algorithm: Random-Improve and Largest-First now no longer fail for exactly the same cases, so we need to be a bit more careful when comparing failures.
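As a sketch of the shape such a function might take, with a hypothetical error type whose constructor names are invented for illustration (the real codebase defines its own coin selection error type):

```haskell
-- Hypothetical error type, invented for illustration only.
data ErrCoinSelection
    = ErrNotEnoughMoney          -- total input value < total output value
    | ErrMaximumInputsReached    -- the input count limit was exceeded
    | ErrInputCountInsufficient  -- fewer inputs than outputs (cardinality)
    deriving (Eq, Show)

-- True if the given error is one that Largest-First can still return.
isValidLargestFirstError :: ErrCoinSelection -> Bool
isValidLargestFirstError err = case err of
    ErrNotEnoughMoney         -> True
    ErrMaximumInputsReached   -> True
    -- Largest-First no longer imposes a cardinality restriction, so it
    -- can no longer fail this way, although other algorithms still can.
    ErrInputCountInsufficient -> False
```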
Update documentation for `largestFirst` to reflect that the cardinality restriction has been removed.

Reorganize the tests:

- Group similar tests together.
- Give each test a unique number that allows the test to be easily identified on failure.
@paweljakubas (Contributor) left a comment:

looks fine to me

Make the field names self-documenting.

@jonathanknowles force-pushed the jonathanknowles/remove-largest-first-cardinality-restriction branch from 469b042 to c83a9c3 on May 8, 2020, 06:07
To avoid too many instances of discarded data in properties that make
use of the CoinSelectionData type, incorporate a bias towards generating
sets of inputs that can completely pay for all the outputs.

Additionally, generate coin sizes that have a reasonably high chance of
collision with one another. This makes it more likely that coin
selection algorithms are able to select just enough inputs to pay for
the requested outputs (with zero excess value), which is an extremely
useful edge case.
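A sketch of how such a biased QuickCheck generator might look. `CoinSelectionData` is re-declared here as a simple stand-in with illustrative field names; the project's actual type differs:

```haskell
import Test.QuickCheck

-- Illustrative stand-in for the property test fixture described above.
data CoinSelectionData = CoinSelectionData
    { csdInputs  :: [Integer]
    , csdOutputs :: [Integer]
    } deriving Show

instance Arbitrary CoinSelectionData where
    arbitrary = do
        -- Draw coin values from a small range so that collisions are
        -- likely, making "zero excess" selections reachable.
        outputs <- listOf1 (choose (1, 10))
        inputs  <- frequency
            [ -- Bias: usually generate inputs that can pay for all the
              -- outputs in full, so that few test cases are discarded.
              (9, inputsCovering (sum outputs))
              -- Occasionally generate unconstrained inputs, which may
              -- or may not be sufficient.
            , (1, listOf1 (choose (1, 10)))
            ]
        pure (CoinSelectionData inputs outputs)
      where
        -- Keep drawing small coins until their total reaches the
        -- target, guaranteeing the inputs can cover all the outputs.
        inputsCovering target
            | target <= 0 = pure []
            | otherwise = do
                coin <- choose (1, 10)
                rest <- inputsCovering (target - coin)
                pure (coin : rest)
```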
We confirm that a selection is minimal by removing the smallest entry
from the inputs and verifying that the reduced input total is no longer
enough to pay for the total value of all outputs.
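Expressed as a QuickCheck property sketch, reusing the `largestFirstSketch` and `CoinSelectionData` stand-ins from the sketches above (not the project's actual property):

```haskell
import Data.List (delete)
import Test.QuickCheck

-- Property sketch: a selection is minimal if dropping its smallest
-- input leaves too little to pay for all the outputs.
prop_selectionIsMinimal :: CoinSelectionData -> Property
prop_selectionIsMinimal (CoinSelectionData inputs outputs) =
    case largestFirstSketch inputs outputs of
        Nothing -> discard  -- inputs insufficient: nothing to check
        Just (selected, _change)
            | null selected -> property True  -- nothing was needed
            | otherwise ->
                let reduced = delete (minimum selected) selected
                in  property (sum reduced < sum outputs)
```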
@jonathanknowles force-pushed the jonathanknowles/remove-largest-first-cardinality-restriction branch from c83a9c3 to 5543454 on May 8, 2020, 06:15
@paweljakubas (Contributor) left a comment:

Added tests and properties look good.

Q: So after collectively addressing all outputs to be covered (instead of one-to-one), we have a more robust algorithm. Let's assume we want to cover outputs of [2, 1, 2] and have UTxOs of [4, 1, 1]. In the old implementation, the 2 could be covered by the 4, but the next input of 1 would not be enough to cover the next 2, and we would end up with an error. Now, the 4 will cover [2, 2] and the [1] will cover the rest?

So now we can have a smaller number of UTxOs than outputs to be covered and still find a covering coin selection?

@jonathanknowles (Contributor, Author) replied:

> Q: So after collectively addressing all outputs to be covered (instead of one-to-one), we have a more robust algorithm. Let's assume we want to cover outputs of [2, 1, 2] and have UTxOs of [4, 1, 1]. In the old implementation, the 2 could be covered by the 4, but the next input of 1 would not be enough to cover the next 2, and we would end up with an error. Now, the 4 will cover [2, 2] and the [1] will cover the rest?

If we have inputs of [1, 1, 4] and want to pay for outputs of [2, 1, 2], the algorithm will:

  1. Calculate the total value of all outputs: 5.
  2. Place the inputs into a list in descending order: [4, 1, 1].
  3. Take just enough inputs from the head of the list to pay for the total value of all outputs. In this case, it will be [4, 1].
  4. Since we have paid for the outputs exactly, there will be no change.

If we have inputs of [1, 2, 4] and want to pay for outputs of [2, 1, 2], the algorithm will:

  1. Calculate the total value of all outputs: 5.
  2. Place the inputs into a list in descending order: [4, 2, 1].
  3. Take just enough inputs from the head of the list to pay for the total value of all outputs. In this case, it will be [4, 2], with a total selected input value of 6.
  4. Since there's an excess (6 - 5 = 1), the algorithm generates a change value of 1.
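Both walkthroughs can be reproduced with the `largestFirstSketch` stand-in from the earlier sketch:

```
ghci> largestFirstSketch [1, 1, 4] [2, 1, 2]
Just ([4,1],0)
ghci> largestFirstSketch [1, 2, 4] [2, 1, 2]
Just ([4,2],1)
```

In the first case the outputs are paid for exactly, so the change value is 0; in the second, the excess of 6 - 5 = 1 is returned as change.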

The newly added tests specifically cover cases where the number of available inputs is smaller than the number of required outputs.
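As an illustration of such a case, again using the `largestFirstSketch` stand-in rather than the project's actual test code, a single input can now pay for three outputs, which the old implementation would have rejected outright:

```haskell
-- One input, three outputs: the outputs total 9, so the input of 10
-- covers them collectively, with a change value of 1.
singleInputManyOutputs :: Bool
singleInputManyOutputs =
    largestFirstSketch [10] [2, 3, 4] == Just ([10], 1)
```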
@jonathanknowles force-pushed the jonathanknowles/remove-largest-first-cardinality-restriction branch from 885ea1f to 19edd71 on May 8, 2020, 07:48

@jonathanknowles self-assigned this on May 8, 2020

@jonathanknowles merged commit aae26dd into master on May 8, 2020

@jonathanknowles deleted the jonathanknowles/remove-largest-first-cardinality-restriction branch on May 8, 2020, 08:06