Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add types skeleton of coin selection and largestFirst algo implementation #121

Merged
merged 11 commits into from
Apr 1, 2019

Conversation

paweljakubas
Copy link
Contributor

Issue Number

#92

Overview

  • I have added coin selection proposal for coin selection and fee calculation

Comments

@paweljakubas paweljakubas self-assigned this Mar 25, 2019
@paweljakubas paweljakubas added this to the Submit Transactions milestone Mar 25, 2019
@paweljakubas paweljakubas force-pushed the paweljakubas/92/adding-types branch 2 times, most recently from e6b4d5a to 81e955f Compare March 25, 2019 16:51
src/Cardano/Wallet/CoinSelection.hs Outdated Show resolved Hide resolved
src/Cardano/Wallet/CoinSelection.hs Outdated Show resolved Hide resolved
src/Cardano/Wallet/CoinSelection.hs Outdated Show resolved Hide resolved
src/Cardano/Wallet/CoinSelection.hs Outdated Show resolved Hide resolved
src/Cardano/Wallet/CoinSelection.hs Outdated Show resolved Hide resolved
src/Cardano/Wallet/CoinSelection.hs Outdated Show resolved Hide resolved
src/Cardano/Wallet/CoinSelection.hs Outdated Show resolved Hide resolved
src/Cardano/Wallet/CoinSelection.hs Outdated Show resolved Hide resolved
src/Cardano/Wallet/CoinSelection.hs Outdated Show resolved Hide resolved
src/Cardano/Wallet/CoinSelection.hs Outdated Show resolved Hide resolved

estimateMaxTxInputs
:: ByteString
-- ^ Maximum size of a transaction
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size of tx in bytestring - how should one interpret this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we are ok with using serokell-utils or better define Bytes ourselves?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's okay to use libraries (from Serokell or anyone) when they are sound and solve an actual problem. Here, serokell-util isn't really sound nor useful for most of the types or functions it provides IMO.

Here, a size could just be a Quantity "byte" Word32 if you really want to disambiguate the type. Yet, I believe it's not needed to implement largest first 🤔 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it is not needed in the end I will definitely remove it. Otherwise, will rely on Quantity "byte" Word32

@paweljakubas paweljakubas force-pushed the paweljakubas/92/adding-types branch 2 times, most recently from 7591d95 to 7e7514b Compare March 27, 2019 13:32
@paweljakubas paweljakubas changed the title Add types skeleton of coin selection and fee calculation Add types skeleton of coin selection and largestFirst algo implementation Mar 27, 2019
@paweljakubas
Copy link
Contributor Author

just rebased to update the branch with master

Copy link
Member

@KtorZ KtorZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some good ideas in the test scenarios that would be better expressed as plain hand-written unit tests IMO;

let txOutputsSorted = NE.toList
$ NE.sortBy (flip $ comparing coin) txOutputs

-- Step 2. (TO-DO or not) we need to check if the transaction outputs are not redeemable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what "redeemable" means here?

when (utxoBalance < moneyRequested)
$ throwE $ UtxoExhausted utxoBalance moneyRequested

-- FIXME ? we need to check if the transaction outputs are not redeemable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered yesterday already, but what "redeemable" means in this context?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there was a check in the legacy code https://github.com/input-output-hk/cardano-wallet-legacy/blob/develop/src/Cardano/Wallet/Kernel/CoinSelection/FromGeneric.hs#L292
Basically, checking if the address in transaction output is redeemable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Okay. I think we don't care here:

1/ We can't generate redeem address anyway
2/ If, through manual selection or whatever process we end up constructing a transaction to a redeem address, the node will reject it.

I am really not sure what's the rational is here 🤔 ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed unneeded comment

$ throwE $ UtxoNotEnoughFragmented numberOfUtxoEntries numberOfTransactionOutputs

when (utxoBalance < moneyRequested)
$ throwE $ UtxoExhausted utxoBalance moneyRequested
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd actually swap those two checks, such that, in case there's no UTxO available, it still returns a UTxOExhausted error (which we could rename to NotEnoughMoney basically? Don't you think? The utxo can be exhausted in two ways, as described in a comment, so perhaps, it's better to have this error reflecting this as well).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think that, those checks where done in the Nothing branch previously, such that they will only occur after the coin selection has been tried once. I think the rational was that successful calls to this function are more likely than error calls and thus, rather than doing those checks on every call, we simply do it in case of error and save some CPU cycles along the way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

swapped the errors, checked it in unit tests, and changed the name of the error

(inp, out):utxo' = utxo
target' =
(fromIntegral (getCoin (coin txout)))
- (fromIntegral (getCoin (coin out)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems wrong? The new target should be based on the current target and not on the initial one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have foldM atLeast utxos outputs and in the atLeast we process every output per go. And inside, we are looking which first coin from utxo (previously taken the biggest maxnumberofinputs of them and then taken from the smallest) can cover this target. To my best knowledge, the target within go (I changed inner loop that searches for this to pickTheFirst) my stay intact and the above expression does its job.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... that isn't correct. The go may actually picks more than one UTxO (possibly depleting the whole list if there aren't enough to cover for the target), that's why it's an iterative approach.

The target starts being the txout's value. And then, after every pick, we makes it smaller by the amount of the corresponding UTxO entry we've picked, and iterate, until the target is covered. So, in the last branch, we should have:

targets' = target - fromIntegral (getCoin (coin out))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, finally got it - corrected the algorithm and included boundary and exemplary cases in unit tests to show the implementation is right now

let check =
invariant "utxo must cover all transaction outputs"
(NE.length txOuts)
(\_ -> condCoinsCovering txOuts utxo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use a precondition for this using QuickCheck's (==>)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

it "works as expected for zero-fee and not covered by utxo coin selection"
(property propLargestFirstNotCovered)
it "works as expected for zero-fee and fully covered by utxo coin selection when maximumNumberOfInputs is small"
(property propLargestFirstFullyCoveredSmallMaxInput)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a puzzled by the properties here 🤔 ? What properties are we actually looking for? I think you're mis-using property testing here. The testing logic is almost as complex as the implementation itself which is a good sign that tests are maybe doing too much. A good indicator is the way you can express the property in prose, what means "works as expected" here, if you're checking for a property, then you should be able to express it.

I'd suggest the following approach:

  • Reword these test cases into plain unit tests, where, given some inputs (that you can construct by hand, in a easy-to-read fashion), you obtain some expected output / behavior.

  • Add some property-based tests for actual properties of that largest first heuristic, for instance, I can see a few properties that I'd expect on such function:

    • forall (UTxO, NonEmpty TxOut), running largest first twice one the same UTxO yields exactly the same result
    • forall (UTXO, NonEmpty TxOut), for all selected input, there's no bigger input in the UTxO that is not already in the selected inputs.
    • forall (UTxO, NonEmpty TxOut), there's at least as many selected inputs as there are requested outputs

So in practice, the properties are very short to express, and it's really about performing the coin selection, and then, verifying that the output satisfies some conditions. All properties above assume that there are enough inputs to cover for all of the NonEmpty TxOut, so likely, they'll need preconditions (which could just be isRight $ runExceptT $ largestFirst ... ==> ...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair argument. Added unit tests already. Properties will be added in the next commit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one is I do not understand : forall (UTXO, NonEmpty TxOut), for all selected input, there's no bigger input in the UTxO that is not already in the selected inputs.

algo goes : we take maxNumberOfInputs biggest utxo entries, and then take from the smallest to cover transaction outputs taking from the smallest. It seems, but I may be wrong, that it contradicts ^^^

Copy link
Contributor Author

@paweljakubas paweljakubas Mar 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sticked to CoveringCase to run it quicker, as correct subset is significantly smaller to whole set of (Utxo, Empty TxOut) - that's my intuition to be verified

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

algo goes : we take maxNumberOfInputs biggest utxo entries,

Yes.

and then take from the smallest to cover transaction outputs taking from the smallest

No. We do cover biggest outputs first, and use the biggest inputs first to do so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right !!!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the property. in the lights of correct implementation it makes sense


-- FIXME ? we need to check if the transaction outputs are not redeemable

case foldM atLeast (L.reverse $ nLargest utxo, mempty) txOutputsSorted of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the list reversed here 🤔 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we start to consume utxo from the smallest cut obtained by maximumNumberInputs (which takes those numbers of biggest utxos entries) . taken from : https://github.com/input-output-hk/cardano-wallet-legacy/blob/develop/src/Cardano/Wallet/Kernel/CoinSelection/FromGeneric.hs#L215

-> NonEmpty Word64
-- ^ transaction outputs
-> Either CoinSelectionError [Word64]
-- ^ expecting CoinSelectionError or coins selected
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.
What about passing all the arguments above in a record type (instead of having comments here), so that at the call-site, it becomes even more readable ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

atLeast (utxo0, selection) txout =
pickTheFirst (fromIntegral $ getCoin $ coin txout, mempty) utxo0
where
pickTheFirst
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be called cover probably to better reflect intent

Map.toList (getUTxO utxo) L.\\ utxoEntriesInCoinSelection
let getExtremumValue f = f . map (getCoin . coin . snd)

if L.null utxo' then utxo' `shouldBe` utxo' else
Copy link
Contributor Author

@paweljakubas paweljakubas Mar 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there Expectation that is "True"/expectationSatisfied? why this checking? sometimes utxo is completely depleted after coin selection and minimum from [] does not make sense...maybe secure min with default 0...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm.. since you're in a Monad, you could use when or unless here (from Control.Monad)

Copy link
Member

@KtorZ KtorZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good tests. I haven't yet re-reviewed the implementation, as I wanted to put the emphasizes on the tests first.
Some remarks on the form and coding style however.

100
[10,10,17]
(17 :| [])
(Right [17])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 ... the suggestion of introducing a record type earlier was to use the named fields exactly here!

            coinSelectionUnitTest UnitTestInput
                { maxNumOfInputs = 100
                , utxoInputs = [10,10,17]
                , txOutputs = 10 :| []
                , expectedResult = Right [17]
                }

This is where we get most of the readability (at the call-site) :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, corrected

[12,10,17]
(1 :| [])
(Right [17])
))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the added value for this case compare to the previous one 🤔 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one is smaller than any input coin and the second is the biggest possible to cover by one input - it help me when checking algo impl

spec :: Spec
spec = do
describe "Coin selection : LargestFirst algorithm unit tests" $ do
it "happy path result in successful selection - one input per output - a" $ do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to remove "happy path result in successful selection", and just keep the second part 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

[1,2,10,6,5]
(11 :| [1])
(Left $ MaximumInputsReached 2)
))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good scenarios overall :+1

Properties and unit test generic scenario
-------------------------------------------------------------------------------}

data UnitTestInput = UnitTestInput
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat-picking but I'd just call that Fixture, because that's what it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Map.toList (getUTxO utxo) L.\\ utxoEntriesInCoinSelection
let getExtremumValue f = f . map (getCoin . coin . snd)

if L.null utxo' then utxo' `shouldBe` utxo' else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm.. since you're in a Monad, you could use when or unless here (from Control.Monad)

test/unit/Cardano/Wallet/CoinSelection/LargestFirstSpec.hs Outdated Show resolved Hide resolved
$ (Map.toList . getUTxO) utxo
rotate :: Int -> [a] -> [a]
rotate _ [] = []
rotate n xs = zipWith const (drop n (cycle xs)) xs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest a different approach here to avoid having too much logic in the test:

  • change the precondition to something like isRight . runExceptT . largestFirst .... In the end, that's really what the pre-condition is about here, checking that we can perform the coin selection! So, let's just perform it and assume that it's right. On small sets of value, it should be fast enough to not be too slow to evaluate with quickcheck.

Copy link
Contributor Author

@paweljakubas paweljakubas Mar 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really do not know how to use properly isRight with runExceptT ... but when I use CoverCase then it is not need I bet...but drawback is that logic stays


instance Arbitrary CoveringCase where
arbitrary = do
n <- choose (1, 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps slightly more than 2 ? What do you think ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now (1,10) and works quickly

arbitrary = do
n <- choose (1, 2)
txOutsNonEmpty <- NE.fromList <$> vectorOf n arbitrary
utxo <- arbitrary `suchThat` (condCoinsCovering txOutsNonEmpty)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... we should have this either as a precondition or in the generator. Having it in both places is redundant. I'd be in favor of generating only valid CoveringCase, so checking only here once (previous comment about simplifying the precondition still applies though).

Copy link
Contributor Author

@paweljakubas paweljakubas Mar 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it to have in generator, but when I would know how to put isRight in the property precondition I would remove suchThat and condCoinsCovering code. On the other case, I would need to make sure significant number of covering cases are tried => checkCoverage to be used

throwE $ MaximumInputsReached (fromIntegral n)


-- Selecting coins to cover at least the specified value
Copy link
Contributor Author

@paweljakubas paweljakubas Mar 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this extensive comment how the largestFirst works - hope it will save time anyone trying to understand this algorithm. This along with tests should explain the matter in a comprehensible way

@paweljakubas paweljakubas requested a review from KtorZ March 30, 2019 18:38
@KtorZ KtorZ force-pushed the paweljakubas/92/adding-types branch from 04ea9bc to 893002b Compare April 1, 2019 08:55
-- UTxO input, not participating in the coverage, is taken. We are back at (b)
-- step as a result
--
-- The steps are continued until all transaction are covered.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :)

I'd even add either, before (c) or at the end of (b) that, each output is treated independently wit the heuristic described in (c).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

-- which coin selection will be tried
-- (c) the biggest candidate UTxO input is tried first to cover the transaction
-- output. If the input is not enough, then the next biggest one is added
-- to check if they can cover the transaction output. This process id continued
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id continued -> is continued

Left err ->
result `shouldBe` (Left err)
Right expectedCoinsSel ->
result `shouldBe` (return expectedCoinsSel)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 5 lines above are strictly equivalent to

result `shouldBe`expected

The pattern-match is superfluous here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed!


-- | Largest-first input selection policy
largestFirst
:: forall m. MonadIO m
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IO isn't needed here, the constraint is too strong and I'd suggest to lower it down to Monad m

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right


-- | Largest-first input selection policy
largestFirst
:: forall m. Monad m
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was no need for a MonadIO m => constraint here (was actually too strong), so I've replaced it with a Monad m =>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

$ throwE $ NotEnoughMoney utxoBalance moneyRequested

when (numberOfUtxoEntries < numberOfTransactionOutputs)
$ throwE $ UtxoNotEnoughFragmented numberOfUtxoEntries numberOfTransactionOutputs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move these (+ the corresponding local declarations above) into the Nothing -> branch, to avoid computing them on every run, but only when the selection failed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

target' =
target - (fromIntegral (getCoin (coin out)))
in
coverOutput (target', ins ++ [(inp, out)]) utxo'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I'd suggest to always prepend elements to lists when using recursion with a list accumulator, and in the terminal case, reverse the list. Because built-in lists in Haskell are linked-list, adding elements to the end of a list is costly, while prepending elements is O(1). Here, it probably doesn't matter much because inputs will remain rather small in practice; but still 😶

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, and also updated tests for it

let opts = defaultCoinSelectionOptions 100
let resultOne = runIdentity $ runExceptT $ largestFirst opts utxo txOuts
let resultTwo = runIdentity $ runExceptT $ largestFirst opts utxo txOuts
resultOne === resultTwo
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed here the usage of MonadIO + shouldBe with just === & runIdentity; there's no need for full IO :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

:: CoveringCase
-> Property
propAtLeast (CoveringCase (utxo, txOuts)) =
isRight selection ==> let Right s = selection in prop s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here I've required that selection is Right as a precondition because our property only matters for valid selections.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beautiful 💯

@KtorZ KtorZ merged commit 82fa37c into master Apr 1, 2019
@KtorZ KtorZ deleted the paweljakubas/92/adding-types branch April 1, 2019 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants