Tearoff fixes #78

kasiastreich · 2016-12-20T10:35:12Z

Add timestamp to merkle tree calculation.
Add fix of duplicate node issue by changing duplicated hash by adding an index of duplicated leave at a given level in a tree.

The problem was that it was possible to construct 2 different transactions with the same ids. Having number of leaves that were not power of 2. However, it's hard to use such situation in a meaningful way (in most cases transactions like that shouldn't pass contract verification). We should also consider, if we really want to have a transaction with the same commands or attachments, the same inputs are impossible (double spend). Only way that it is reasonable to have the same entries in tx are outputs and it is theoretically possible to use leaves duplication fact to add additional outputs at the end. For this trick to work such transaction can consist only of inputs or outputs, no commands, attachments, timestamp (but as mentioned before, for cash it wouldn't pass contract verification - amounts don't match).

…n class. Small cleanup.

…ex of duplicated leave at a given level in a tree. The problem was that it was possible to construct 2 different transactions with the same ids. Trick worked for txs having number of leaves that were not power of 2.

mikehearn · 2016-12-20T10:44:09Z

Great! I'll take a look at the code soon but before I start, I was thinking yesterday that a simpler fix than including an index into the hash and checking it would be to just pad the tree with the zero hash. As nobody knows a pre-image for SHA2(x) == 0 that would also fix the issue in a simpler way. Does it make sense?

mikehearn · 2017-01-05T10:06:27Z

Code looks good - nice improvements - but:

I think there are still fields missing like 'type', 'notary' etc.
I think using the zero hash would be simpler than the current trickyness around repeated hashing, for the case of a duplicated last node.

WDYT?

kasiastreich · 2017-01-05T10:30:20Z

I thought about zero hashes, they are also fine, I can implement that if you think it's nicer. As to fields missing, from our previous conversations I understood that leaves should include inputs, outputs, attachments, commands and timestamp only (based on whitepaper tear-offs section: "Typically an oracle will be presented with the Merkle branches for the command or state that contains the data, and
the timestamp field, and nothing else."), unless you want that additional data to be just sent with FilteredTransaction.

mikehearn · 2017-01-05T10:32:48Z

The hash of the transaction has to cover everything. When the white paper says "presented" it means that the Merkle branches for (and contents of) those fields are presented, not that two transactions with different notaries should have an identical Merkle root.

Yeah I think the zero hash is a bit simpler, right?

kasiastreich · 2017-01-05T10:50:12Z

Ok, I misunderstood it at the beginning (conversations, jira etc.) - I will add type, notary, signers and implement zero hashes. Diagrams in the whitepaper also need to be changed.

…lation.

…en level to padding leaves' list with zero hash to size of the nearest power of 2 - so we always have a full binary tree.

…struction of Merkle trees - padding with zero hashes and including all WireTransaction fields in id computation.

kasiastreich · 2017-01-11T16:15:46Z

I addressed the comments.

mikehearn · 2017-01-12T11:29:05Z

core/src/main/kotlin/net/corda/core/transactions/MerkleTransaction.kt

    return x.serialize(kryo).hash
 }

+val zeroHash = SecureHash.SHA256(ByteArray(32, { 0.toByte() }))


This should probably be in the SHA256 companion object rather than in this time. That way it's available for everyone.

mikehearn · 2017-01-12T11:49:38Z

core/src/main/kotlin/net/corda/core/transactions/MerkleTransaction.kt

        val newHash = this.hash.hashConcat(right.hash)
        return Node(newHash, this, right)
    }

+    // Check if a tree is full binary tree. Returns the height of the tree if full, otherwise throws exception.
+    @Throws(MerkleTreeException::class)
+    fun checkFull(level: Int = 0): Int {


Is this intended to be a public API? If not, make it private and you can take out the @throws annotation (it's only useful for java interop). If it is, make the documentation use standard kdoc / javadoc syntax /** */ so it shows up in the generated docs.

mikehearn · 2017-01-12T11:54:12Z

core/src/main/kotlin/net/corda/core/transactions/MerkleTransaction.kt


-    fun hashNodes(right: MerkleTree): MerkleTree {
+    private fun hashNodes(right: MerkleTree): MerkleTree {


I guess this function can be moved into the function where it's used, as it has only one call site.

mikehearn · 2017-01-12T11:55:43Z

core/src/main/kotlin/net/corda/core/transactions/MerkleTransaction.kt

@@ -102,17 +105,22 @@ sealed class MerkleTree(val hash: SecureHash) {

 /**
 * Class that holds filtered leaves for a partial Merkle transaction. We assume mixed leaves types.
+ * Notice that we include only certain parts of wire transaction (no type, signers, notary). Timestamp is always added,


But why not allow all parts of the tx to be present in the filtered tree? I don't get why it's limited here.

It boils down to mixed information I got before. Why Oracle needs to see all signers of tx (I am again referring to tech WP - use case described there is different)? Or if it doesn't have to, we need to provide additional filters on transaction or user needs to check type, either way it's complication.

Ideally this API would be as generic as possible: it'd let you select the subset of the tx you reveal. The current API is sufficient for the primary use case, but it wouldn't necessarily stretch to others.

mikehearn · 2017-01-12T12:09:32Z

core/src/main/kotlin/net/corda/core/transactions/WireTransaction.kt

@@ -42,7 +43,7 @@ class WireTransaction(
    @Volatile @Transient private var cachedBytes: SerializedBytes<WireTransaction>? = null
    val serialized: SerializedBytes<WireTransaction> get() = cachedBytes ?: serialize().apply { cachedBytes = this }

-    //We need cashed leaves hashes and whole tree for an id and Partial Merkle Tree calculation.
+    // We need cashed leaves hashes and whole tree for an id and Partial Merkle Tree calculation.


"cached"

I'm not sure why we need this though. The comment says we do, but doesn't explain why. Why not just cache the id hash, why cache the intermediate parts?

That code is ancient, but I think that my reasoning behind this was that you may want to build many partial trees from one tx. Building is done using already calculated hash tree, probably not that costly though (but there is no upper bound on tx size). I guess it's not a problem so I can remove it.

Hm, if we find we need such a thing and need it so often that we have to optimise hash calculation, we can put it back. For now I'd like to keep it simple.

mikehearn · 2017-01-12T12:13:52Z

core/src/main/kotlin/net/corda/core/transactions/WireTransaction.kt

+    /**
+     * Build filtered transaction using provided filtering functions.
+     */
+    fun buildFilteredTransaction(filterFuns: FilterFuns): FilteredTransaction {


Let's do a bit of thinking around the FilterFuns API. It seems like it may be awkward to use from Java (functions with default arguments need a bit of annotating to work well from Java) and hard to extend in future. Why not just have an (Any) -> boolean function and expect the user to do type testing. That way, if we ever add a new kind of thing to transactions, existing code still works and doesn't break.

I did some reading on interoperability, thanks.
I have few questions to how you see FilterFuns API (as I don't know other use cases of more general approach and I would love to learn some).

Do you want to have possibility of filtering every field of a tx? Or, say timestamp/notary/type is always included?

What checks will be performed after receiving filtered transaction. In Oracle case I see type checks and contents, however I can imagine some more elaborate verification that starts to complicate in a direction of subtransaction verification (if inputs/outputs are included).

Or for now, just don't worry and make it as simple as possible with moving all filtering (with one function) and checking to the user, but this approach is flattening the transaction - I am not sure if it's what we want.

My inclination is to go for more flexibility here rather than less, as long as the receiver of a filtered TX can easily say "it must have components X, Y and Z" and the API requires them to do so. So trying to make the data format flexible but the receive API hard to misuse. So yeah - why not allow every field to be filtered.

The point here is that we will need to freeze the API this year, and then if we've provided something that's just not flexible enough and which we can't evolve, we'd need to provide a second API, which isn't so great.

mikehearn · 2017-01-12T12:15:42Z

core/src/main/kotlin/net/corda/core/transactions/WireTransaction.kt

+        entries.forEach { it.mapTo(resultHashes, { x -> serializedHash(x) }) }
+        if (timestamp != null) resultHashes.add(serializedHash(timestamp))
+        if (notary != null) resultHashes.add(serializedHash(notary))
+        resultHashes.add(serializedHash(mustSign.sortedBy { it.hashCode() }))


I think you can assume that List<> is always ordered, or should be, so there is no reason to sort here.

Isn't it dependent on the order the commands were added when building tx (TransactionBuilder)? Then you will have two txs with different ids and the same content.

Ordering is implicitly a part of the content in these sorts of structures. If you reverse the order of inputs, for instance, it's correct that this yields a different hash.

We don't really want to be incorporating java hashcodes into actual secure hashes used for transaction IDs. Ideally it is as close to hashing the serialised binary as possible.

Hm, 'ordering of commands' is a bad example, but there is a case of having transactions with the same contents ordered in exactly the same way on every field but signers in different. When you first add input with notary then commands, against: first commands then input. Is ordering of signers that important? It's a field derived from other fields in the transaction as a function of it's contents, not referentially transparent. Additionally I don't incorporate them in the tree separately - it's one hash - just want to be sure it's not wrong.

The mustSign field is nasty and I'd like to get rid of it, so this discussion may not matter much, but having the exact same tx hash result from the same semantically equivalent contents is not necessary.

…alculation. Instead of many filtering functions over a transaction only one needs to be provided. Additional change to check and verification of FilteredTransaction. Make WireTransaction and FilteredLeaves traversable with list functions.

…API.

mikehearn

Looking good! Just a few more things to resolve.

mikehearn · 2017-01-26T15:56:02Z

core/src/main/kotlin/net/corda/core/crypto/PartialMerkleTree.kt

            val tree = buildPartialTree(merkleRoot, includeHashes, usedHashes)
-            //Too much included hashes or different ones.
+            // Too much included hashes or different ones.


Too much -> Too many (indefinite vs definite quantities)

mikehearn · 2017-01-26T15:56:43Z

core/src/main/kotlin/net/corda/core/crypto/PartialMerkleTree.kt

@@ -118,7 +131,7 @@ class PartialMerkleTree(val root: PartialTree) {
    fun verify(merkleRootHash: SecureHash, hashesToCheck: List<SecureHash>): Boolean {
        val usedHashes = ArrayList<SecureHash>()
        val verifyRoot = verify(root, usedHashes)
-        //It means that we obtained more/less hashes than needed or different sets of hashes.
+        // It means that we obtained more/less hashes than needed or different sets of hashes.


less -> fewer (this is an obscure English grammar rule that many native speakers get wrong too!)

mikehearn · 2017-01-26T15:59:20Z

core/src/main/kotlin/net/corda/core/transactions/MerkleTransaction.kt

+    // We may want to specify our own behaviour on certain tx fields.
+    // Like if we include them at all, what to do with null values, if we treat list as one or not etc. for building
+    // torn-off transaction and id calculation.
+    fun <T>traverseWithListFun(traverseFun: (List<Any>) -> T): T {


space after <T>

Why not just make this an extension function on WireTransaction, why introduce a new interface?

Ah, hmm, I guess it is so the same interface is implemented by a filtered transaction as well. OK, that makes sense.

Why not just provide a property that returns all the members as a list? Why insist that a function be applied to the result?

mikehearn · 2017-01-26T15:59:49Z

core/src/main/kotlin/net/corda/core/transactions/MerkleTransaction.kt

 }

 /**
- * Holds filter functions on transactions fields.
- * Functions are used to build a partial tree only out of some subset of original transaction fields.
+ * Class that holds filtered leaves for a partial Merkle transaction. We assume mixed leaves types, notice that every


We assume mixed leaf types

mikehearn · 2017-01-26T17:05:31Z

core/src/main/kotlin/net/corda/core/transactions/MerkleTransaction.kt

@@ -101,40 +80,65 @@ sealed class MerkleTree(val hash: SecureHash) {
 }

 /**
- * Class that holds filtered leaves for a partial Merkle transaction. We assume mixed leaves types.
+ * It assures that we always calculate hashes in the same order. Plus lets us define which fields of WireTransaction will be


Maybe rename to TraversableTransaction? The "tx" abbreviation is usually kept for little local variables and other places where it's not a big deal to shorten. Also maybe the docs for this interface can be a bit clearer? The current phrasing is a bit awkward.

mikehearn · 2017-01-27T13:30:11Z

docs/source/merkle-trees.rst

@@ -15,17 +15,17 @@ You can read more on the concept `here <https://en.wikipedia.org/wiki/Merkle_tre
 Merkle trees in Corda
 ---------------------

-Transactions are split into leaves, each of them contains either input, output, command or attachment. Other fields like


For some reason github won't let me comment on line 9, but there's a typo above this section: achive -> achieve

mikehearn · 2017-01-27T13:31:19Z

docs/source/merkle-trees.rst

+        ...
+        // Performing validation of obtained FilteredLeaves.
+        fun commandValidator(elem: Command): Boolean {
+                if (!(identity.owningKey in elem.signers && elem.value is Fix))


The indentation of this line looks wrong.

mikehearn · 2017-01-27T13:32:56Z

docs/source/merkle-trees.rst


 In the Oracle example this step takes place in ``RatesFixFlow``:

 .. container:: codeset

   .. sourcecode:: kotlin

-        val flow = RatesFixFlow(partialTx, filterFuns, oracle, fixOf, "0.675".bd, "0.1".bd)
+        val flow = RatesFixFlow(partialTx, ::filtering, oracle, fixOf, "0.675".bd, "0.1".bd)
+

 ``FilteredTransaction`` holds ``filteredLeaves`` (data that we wanted to reveal) and Merkle branch for them.

 .. container:: codeset

   .. sourcecode:: kotlin


There's a way to include parts of source code files directly in the docsite - look at some other files for "literalinclude". That way you don't have to duplicate the code in the docs.

mikehearn · 2017-01-27T13:33:39Z

samples/irs-demo/src/main/kotlin/net/corda/irs/flows/RatesFixFlow.kt

@@ -29,7 +28,7 @@ import java.util.*
 */
 open class RatesFixFlow(protected val tx: TransactionBuilder,
                        /** Filtering functions over transaction, used to build partial transaction presented to oracle. */
-                        private val filterFuns: FilterFuns,
+                        private val filtering: (Any) -> Boolean, // Not the best solution, when lambda expression passed (serialization).


Is the function still needed after the first checkpoint? It could be marked as transient if we don't want it to survive checkpointing.

It's needed in the FixSignFlow subflow after querying the oracle for a fix (filtering is passed in assembleSharedTX to RatesFixFlow). I could just move it inside, but I guess it should belong to assembling transaction? If so, it should survive checkpointing.

mikehearn · 2017-01-27T13:34:32Z

samples/irs-demo/src/test/kotlin/net/corda/irs/testing/NodeInterestRatesTest.kt

+        }
+    }
+
+    fun filterAllCmds(elem: Any): Boolean {


This can just be fun filterAllCmds(elem: Any) = elem is Command - but then maybe the function name is a bit misleading ("All" in a name normally implies it takes a collection)

mikehearn · 2017-02-01T11:59:35Z

I thought a bit more about the function and the checkpointing.

You can store a lambda in a checkpoint. It works, technically. But the problem is that it's very likely that developers won't realise how much stuff they're capturing by doing that, and it could result in giant or unworkable checkpoints.

I suggest instead of passing in a lambda, you make the call site call a protected method. The user of the flow can subclass it in order to change what is filtered. This will encourage developers to pass in any data they need explicitly, and that will be more likely to lead to the scope of the checkpoint being limited. It should be a minor change, I think? You can note in the javadoc of the protected method that when overriding, be careful not to make the sub-class an anonymous or inner class that could capture huge chunks of the app.

mikehearn · 2017-02-01T12:00:37Z

Other than that, it's all looking fine now. This is the last thing we need to consider, I believe.

…ixFlow class. Comment on situation when capturing too much scope and connected problems with checkpointing. Change oracle and tear-offs documentation.

mikehearn · 2017-02-03T13:18:15Z

Looks good to me!

kasiastreich added 3 commits December 19, 2016 13:45

Move merkle building extension functions on wire tx to WireTransactio…

d4e1787

…n class. Small cleanup.

Extend Merkle tree calculation with adding timestamp to leaves.

6466496

kasiastreich self-assigned this Dec 20, 2016

kasiastreich requested a review from mikehearn December 20, 2016 10:35

kasiastreich changed the title ~~Kstreich tearoff fixes~~ Tearoff fixes Dec 20, 2016

mikehearn mentioned this pull request Dec 22, 2016

Merkle Tree code has various issues #73

Closed

kasiastreich added 4 commits January 5, 2017 14:09

Merge branch 'master' into kstreich-tearoff-fixes

61cf504

Add notary, transaction type and signers to wire transaction id calcu…

59c6034

…lation.

Change construction of MerkleTree from duplicating last node on a giv…

8ac59db

…en level to padding leaves' list with zero hash to size of the nearest power of 2 - so we always have a full binary tree.

Update tear-offs documentation and diagrams to reflect changes in con…

7e0cbb2

…struction of Merkle trees - padding with zero hashes and including all WireTransaction fields in id computation.

mikehearn reviewed Jan 12, 2017

View reviewed changes

kasiastreich added 5 commits January 19, 2017 14:08

Minor fixes after review.

defcbba

Documentation update.

e61c333

Update IRS oracle demo after changes in filtering of WireTransaction …

b90ed4e

…API.

Correct typos.

fed73ef

mikehearn suggested changes Jan 27, 2017

View reviewed changes

kasiastreich added 2 commits January 30, 2017 11:36

Address review comments. Typos and renames.

dcc0120

Remove comment

583a7ad

kasiastreich added 2 commits February 2, 2017 14:30

Merge branch 'master' into kstreich-tearoff-fixes

764fad2

IRS demo change. Make filtering function a protected method of RatesF…

ba61f4f

…ixFlow class. Comment on situation when capturing too much scope and connected problems with checkpointing. Change oracle and tear-offs documentation.

mikehearn approved these changes Feb 3, 2017

View reviewed changes

kasiastreich merged commit 45d8e0f into master Feb 3, 2017

kasiastreich deleted the kstreich-tearoff-fixes branch February 3, 2017 14:03


		fun hashNodes(right: MerkleTree): MerkleTree {
		private fun hashNodes(right: MerkleTree): MerkleTree {

Tearoff fixes #78

Tearoff fixes #78

Conversation

kasiastreich commented Dec 20, 2016

mikehearn commented Dec 20, 2016 • edited

mikehearn commented Jan 5, 2017

kasiastreich commented Jan 5, 2017

mikehearn commented Jan 5, 2017

kasiastreich commented Jan 5, 2017

kasiastreich commented Jan 11, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mikehearn left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mikehearn commented Feb 1, 2017

mikehearn commented Feb 1, 2017

mikehearn commented Feb 3, 2017

mikehearn commented Dec 20, 2016 •

edited