[soft fork] Block v3: miner commitments with compact proofs #3977

Closed
wants to merge 3 commits into from

Conversation

maaku
Contributor

@maaku maaku commented Mar 28, 2014

There are proposals in the pipeline, and some currently deployed applications, which require commitments of special forms of data to the blockchain by miners. For example, merged mining requires each block to have a data commitment at a single deterministically identifiable position. UTXO hash-tree commitments add the root hash of a Merkle tree of the current validation state at a similarly known or identifiable position such that clients using such an index structure can acquire UTXO proofs with SPV security. Compact SPV proofs require commitments of a Merkle heap or skip list structure of past block headers, for the purpose of constructing traversals through the block headers with logarithmic scaling in space. All of these miner-commitment applications require, to varying degrees, three essential properties:

  1. A unique or identifiable place within the block where the commitment is to be placed. For a given commitment type the miner must know where to put the commitment into a block template, and someone using it must know where to find it within a block or its Merkle proof (the transaction containing the commitment, the path to the transaction through the Merkle tree of the block, and the block header).
  2. Impossible for a miner to create multiple block commitments of the same type within the context of a single block. A user given just the transaction containing the commitment, its path through the Merkle tree, and the block header must have confidence that the commitment is of the indicated type, and the only one of that type in the block.
  3. Compact commitment proofs. The transaction containing the commitment should contain no more information than is necessary to uniquely identify the commitment type and the commitment itself, so as to minimize the size of commitment Merkle proofs. Achieving this property directly benefits protocols built on top of data commitments.
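
The Merkle proof referred to in these properties (transaction, path through the block's Merkle tree, block header) can be illustrated with a minimal sketch. It assumes Bitcoin's usual transaction tree construction (double SHA-256, last hash duplicated on odd-sized levels) and ignores byte-order display conventions; the function names are hypothetical and not taken from this patchset.

```python
import hashlib


def dsha(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin's transaction Merkle tree."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_and_branch(txids, index):
    """Return (root, branch) where branch lists the sibling hashes
    needed to link txids[index] to the root."""
    branch = []
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last hash on odd levels
        branch.append(level[index ^ 1])  # sibling of the node on our path
        level = [dsha(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index >>= 1
    return level[0], branch


def root_from_branch(txid, index, branch):
    """Recompute the Merkle root from one txid and its branch."""
    node = txid
    for sibling in branch:
        node = dsha(sibling + node) if index & 1 else dsha(node + sibling)
        index >>= 1
    return node


# Toy data: five stand-in transaction hashes.
txids = [dsha(bytes([i])) for i in range(5)]
root, branch = merkle_root_and_branch(txids, 4)
```

A verifier holding only the commitment transaction, `branch`, and a header containing `root` can check inclusion with `root_from_branch`. The size of `branch` grows with the logarithm of the number of transactions, which is why the size of the transaction itself dominates for small proofs.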

The only extant example of miner-committed data in Bitcoin Core is BIP 34, which adds the block height to the coinbase string in order to ensure coinbase transaction uniqueness. Higher-level protocols such as p2pool and namecoin's merged mining also require commitments via the coinbase transaction. However, such schemes fail to meet one or more of the properties above, especially the requirement for compact commitment proofs: coinbase transactions can be unreasonably large due to mining pool payout schedules and the increasing amount of data being stuffed in the coinbase string.

An alternative is to use transactions of a special form, and a separate transaction for each commitment. These commitment transactions have a single input whose scriptSig consists of a single data push, the contents of which is the commitment type:

scriptPubKey: <type> OP_EQUAL
scriptSig:    <type>

The outputs of the transaction are available for prune-able OP_RETURN data commitments. In order to meet requirements #1 and #2, a soft-fork validation rule is added such that any block containing two such special-form transactions with the same commitment type is rejected. In addition, an anyone-can-spend output of the form <type> OP_EQUAL must be present for each required commitment, in order to provide an output to be spent COINBASE_MATURITY blocks later. It is suggested that one data commitment be set aside for the purpose of document timestamping, merged mining, and other applications which do not need additional validation rules. This patchset sets aside the OP_0 commitment type for this purpose.
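
As a rough illustration of the uniqueness rule described above, the following sketch models a scriptSig as a tuple of data pushes and rejects a block that carries two special-form transactions with the same commitment type. The `Tx` class and function names are hypothetical simplifications, not the patchset's code; a real check would also have to confirm that the spent output has the form `<type> OP_EQUAL` and would treat the coinbase transaction separately.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Tx:
    # One entry per input; each scriptSig is modelled as a tuple of
    # data pushes (a simplification made for this sketch).
    script_sigs: Tuple[Tuple[bytes, ...], ...]


def commitment_type(tx: Tx) -> Optional[bytes]:
    """Return the commitment type if tx has the special form
    (single input, scriptSig is a single data push), else None."""
    if len(tx.script_sigs) != 1:
        return None
    pushes = tx.script_sigs[0]
    if len(pushes) != 1:
        return None
    return pushes[0]


def duplicate_commitment_types(block_txs):
    """Return the set of commitment types appearing more than once."""
    seen, duplicates = set(), set()
    for tx in block_txs:
        ctype = commitment_type(tx)
        if ctype is None:
            continue
        if ctype in seen:
            duplicates.add(ctype)
        seen.add(ctype)
    return duplicates


def block_passes_commitment_rule(block_txs) -> bool:
    """Soft-fork style check: no commitment type may repeat."""
    return not duplicate_commitment_types(block_txs)


free_form = Tx(((b"",),))              # OP_0 type: empty push
other = Tx(((b"\x01",),))              # some other commitment type
ordinary = Tx(((b"sig", b"pubkey"),))  # not special-form
```

Here `block_passes_commitment_rule([free_form, other, ordinary])` holds, while a block containing two transactions shaped like `free_form` would be rejected.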

@maaku
Contributor Author

maaku commented Mar 28, 2014

And immediately after making the pull request I realize half of it is missing - CreateNewBlock needs to be updated to create the required commitment outputs. I will make that fix.

@luke-jr: Any required commitment increases the block size linearly, by the very nature of what you are doing. By adding a commitment to a block you have to at the very least add the bytes of the hash you are committing. This does it in a way which is more performant and immediately pruned from the UTXO set. And I think you may have misread req 1. My intent was that given the entire block or some transaction(s) in a Merkle proof, you know what to scan for to find the commitment.

The best system for coinbase commitments is p2pool's method. Such an approach is really not ideal. Beyond midstate being an ugly hack (not the least because some library SHA-256 implementations don't expose the necessary functionality to verify), it is also not performant. It would require the proofs to carry 32 bytes of midstate data, plus an average of 32 bytes of data prior to the commitment, and some metadata to indicate the start of the commitment within the string. That is 65 bytes of overhead plus the commitments which follow that you don't care about, which could be sizable in itself. This is a rather large waste of space, especially for applications which demand compact proofs, e.g. merged mining headers and compact SPV proofs.

@maaku
Contributor Author

maaku commented Mar 29, 2014

Fixed CreateNewBlock() to include required commitments, and 'getblocktemplate' RPC command to include a 'budgets' field which mandates any required transaction outputs beyond the miner payout.

@petertodd
Contributor

It would require the proofs to carry 32 bytes of midstate data, plus an average of 32 bytes of data prior to the commitment, and some metadata to indicate the start of the commitment within the string. That is 65 bytes of overhead plus the commitments which follow that you don't care about, which could be sizable in itself. This is a rather large waste of space, especially for applications which demand compact proofs, e.g. merged mining headers and compact SPV proofs.

What does this overhead work out to as a % of the total cost of the proof? Also, if I need more than one commitment proof for my application, what is the total size of all proofs with your scheme vs. committing to a binary radix tree of commitments?

@maaku
Contributor Author

maaku commented Apr 1, 2014

Well, a proof is just 80 bytes block header + untaken Merkle paths + extra data at the (sub-)transaction level + a few bytes metadata. So for just the coinbase and one commitment that is 80 + 32 + 1-2 bytes. That's about a 50% increase on a minimal block, but percentage-wise that's a worst case since the overhead is constant but the proof size increases logarithmically with the number of transactions. With 1000 transactions in the block it is a 16% increase. Still significant.
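
The arithmetic behind these percentages can be reproduced with a small sketch. It assumes the overhead being compared is the 65 bytes of midstate-related data mentioned earlier in the thread, that the metadata takes 2 bytes, and that the Merkle tree is balanced; the constant and function names are illustrative only.

```python
HEADER_BYTES = 80       # serialized block header
HASH_BYTES = 32         # one untaken Merkle branch
METADATA_BYTES = 2      # "1-2 bytes" in the comment; 2 assumed here
MIDSTATE_OVERHEAD = 65  # midstate + average prefix + position byte


def proof_size(n_tx: int) -> int:
    """Baseline proof size in bytes for a block with n_tx transactions,
    excluding the commitment data itself."""
    depth = (n_tx - 1).bit_length()  # ceil(log2(n_tx)) for n_tx >= 1
    return HEADER_BYTES + HASH_BYTES * depth + METADATA_BYTES


def overhead_percent(n_tx: int) -> float:
    """Midstate overhead relative to the baseline proof size."""
    return 100.0 * MIDSTATE_OVERHEAD / proof_size(n_tx)


minimal = overhead_percent(2)     # coinbase plus one commitment
typical = overhead_percent(1000)  # ten Merkle levels
```

Under these assumptions the baseline is 114 bytes for the two-transaction block and 402 bytes for 1000 transactions, giving roughly 57% and 16% respectively, in line with the figures quoted above.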

Using the coinbase transaction with midstate also has one other glaring problem: you are unprotected from length-extension attacks. You can't know for certain if the string given is actually a commitment, or just something that looks like it inside of a giant PUSHDATA. This can be somewhat mitigated with economic arguments by using radix trees with a single root commitment, making this root commitment required, and making it illegal to put anything after it in the coinbase transaction. Then at least any PUSHDATA trickery would also invalidate the block.

Using a radix tree with a single exposed root hash (in the coinbase or its own transaction) is less space efficient for proofs. If the commitments are separate transactions, you can place them consecutively and align them on a corresponding power-of-2 boundary in order to reduce the number of transaction-tree Merkle branches required. The radix tree approach doesn't let you save space from the transaction Merkle tree, and it requires new untaken branches for commitments you don't care about (whereas that cost would otherwise be amortized over the size of the transaction list). It's also messier because you've added (1) the radix tree implementation (although, yes, I want that for other purposes), and (2) the tree data still needs to be transported with the block, since validation rules will depend on these commitments being correct (e.g. [U]TXO hash-tree commitments). You'd have to include the serialized radix tree in the p2p block message, on disk in the block files, and anywhere else in the bitcoin ecosystem which transmits or archives full blocks for consumption by a validator. That's a lot more code to change, and a bigger disruption to the bitcoin ecosystem.
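
The saving from placing commitment transactions consecutively on a power-of-2 boundary can be seen by counting the sibling hashes a combined proof needs. This is a sketch that assumes a perfectly balanced binary Merkle tree of a given depth; `multiproof_hash_count` is a hypothetical helper, not something from the patchset.

```python
def multiproof_hash_count(indices, depth: int) -> int:
    """Count the sibling hashes needed to prove all leaves in
    `indices` against the root of a balanced tree of `depth` levels.
    A sibling that is itself derivable from the proven leaves is free."""
    known = set(indices)
    needed = 0
    for _ in range(depth):
        parents = set()
        for node in known:
            if (node ^ 1) not in known:
                needed += 1  # sibling must be supplied in the proof
            parents.add(node >> 1)
        known = parents
    return needed


DEPTH = 10  # e.g. a block with up to 1024 transactions

aligned_pair = multiproof_hash_count({2, 3}, DEPTH)    # share a parent
unaligned_pair = multiproof_hash_count({1, 2}, DEPTH)  # adjacent, not aligned
two_separate = 2 * multiproof_hash_count({2}, DEPTH)   # independent proofs
```

For this depth the aligned pair needs 9 hashes and the adjacent but unaligned pair needs 10, compared with 20 for two independent branches, which is the kind of saving the alignment argument relies on.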

@forrestv
Contributor

forrestv commented Apr 4, 2014

A while ago I started writing this proposal for MM2 (merged mining 2), which is essentially the "radix tree with single exposed root hash" maaku described, which fulfills all three of his required properties, and which P2Pool is already using. The rest of this comment mostly summarizes things that have already been said.

Admittedly, using the midstate is a bit hackish, but it isn't vulnerable to a length-extension attack, as maaku said, since there's only one root commitment hash fixed at the end of the coinbase transaction.

To conclude, everything's possible without a Bitcoin rule change, without creating any new transactions and using constant space in Bitcoin blocks, at the expense of about (64 bytes) + (32 bytes)*log2(number of types of commitments present in a block) for every block used in a commitment proof. (If the commitment root were padded so that it starts on a SHA-block boundary, the 64 bytes would decrease to 32 bytes, wasting on average about 32 bytes in every Bitcoin block.)
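
The per-block cost quoted above can be written out as a short calculation. This sketch rounds log2 up to a whole number of tree levels, which is an assumption on top of the stated formula, and the function name is hypothetical.

```python
def mm2_proof_overhead(n_commitment_types: int, padded: bool = False) -> int:
    """Bytes of overhead per block in a commitment proof, following
    (64 bytes) + (32 bytes) * log2(number of commitment types).
    With the root padded to a SHA-block boundary the fixed part is 32."""
    fixed = 32 if padded else 64
    levels = (n_commitment_types - 1).bit_length()  # ceil(log2(n))
    return fixed + 32 * levels


single = mm2_proof_overhead(1)
eight = mm2_proof_overhead(8)
eight_padded = mm2_proof_overhead(8, padded=True)
```

With eight commitment types this gives 160 bytes, or 128 bytes in the padded variant; a single commitment type costs only the fixed 64 bytes.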

The other wrinkle is that if Bitcoin miners started validating the UTXO commitment, (only) the relevant Merkle branch within the radix tree would have to be forwarded around the Bitcoin network among nodes that care, requiring database and protocol changes for them.

@petertodd
Contributor

@forrestv Oh, so p2pool already supports committing to data in addition to the p2pool share state through the radix tree it commits in that OP_RETURN output?

@forrestv
Contributor

forrestv commented Apr 5, 2014

@petertodd The P2Pool protocol supports it, but the P2Pool software doesn't currently provide an API for adding other commitments. Once finished, my MM2 proposal will declare an API, and then I'll add support to P2Pool.

@maaku
Contributor Author

maaku commented Apr 5, 2014

@forrestv you need a bitcoin rule change, otherwise you can't trust the midstate, as I explained above. Separating the commitments out of the coinbase transaction results in smaller proofs and fewer changes to existing infrastructure, at least once the commitments are used for anything validation related.

@petertodd
Contributor

You realise you're claiming that p2pool is insecure?

Anyway coinbase mid state as used by p2pool is unambiguous as the commitment is always the last txout of the transaction and the merkle path to the header has a fixed format.

@forrestv Excellent! That's exactly what I've been hoping we'd do for better merge mining/commitments. Speaking of, I was considering implementing my tree chains idea as a merge mined chain, possibly the only merge mined chain that will ever be needed if client side validation proves practical. p2pool is the right audience for it; I would have asked you to add commitments to p2pool if you hadn't already.

It would be good to do a BIP for it too.


@forrestv
Contributor

forrestv commented Apr 6, 2014

@petertodd, the insecurity he's talking about has to do with miners validating commitments. It would be possible to create a commitment that validating miners would treat as not a commitment at all, by stuffing it in a PUSHDATA. In the same vein, you'd need to make sure that a block doesn't have a commitment for a certain validation type if you think the block doesn't have it, which would be difficult to prove with P2Pool's approach.

For commitments that will require validation (SPV/non-SPV UTXO trees), P2Pool's solution is probably not the best choice. For everything else, it probably is.

@petertodd
Contributor

BTW, I forgot to mention it before, but a serious issue with this scheme is that you can't prove the absence of a given commitment compactly without forcing the commitments to all be in the same index in the merkle tree, which in turn creates an ugly need for a central registration authority every time you add a new commitment. Equally, the fact that a block is now invalid just for having two commitments of the same type conflicts with future scalability solutions like sharding.

@BitcoinPullTester

Automatic sanity-testing: FAILED MERGE, see http://jenkins.bluematt.me/pull-tester/p3977_26592396ce233e3e58736ee7dc346be20ad68b0f/ for test log.

This pull does not merge cleanly onto current master
This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/
Contact BlueMatt on freenode if something looks broken.

@luke-jr
Member

luke-jr commented Sep 23, 2014

Wouldn't it be simpler to just require the generation transaction produce a single dummy output, then spend that from 100 blocks ago with the first commitment, and produce an output that can be consumed by the second commitment of the same block? So instead of OP_EQUAL, you'd do OP_BLOCKCOMMITMENT (OP_NOP3 with no behaviour changes) and ignore it in scriptSig.

@petertodd I don't believe proving absence of a commitment is ever possible securely. Blocks could create a "commitment" of a given type with data of undeterminable validity for that type.

@petertodd
Contributor

@luke-jr You're thinking too generally: I'm simply saying that with a more appropriate commitment scheme a proof can be created that a given block doesn't have a certain type of commitment; I'm not saying that such a proof can always be created without the consent of the miner. On the other hand, @maaku's scheme is flawed in that such a proof can never be created. (modulo providing the entire block!)

In any case since then I've written a library for creating such commitment proofs: https://github.com/petertodd/python-merbinnertree. Plenty more work to do on it, but @phantomcircuit has already been looking into using it for MM-like commitments.

@petertodd
Contributor

BTW (U)TXO commitments are a concrete example where this is useful. A soft-fork to make (U)TXO commitments always required is problematic, because we can't change the format of those commitments in the future in another soft-fork - the exact form is baked in stone. However we can do a slightly less drastic soft-fork to require (U)TXO commitments to be valid, while still allowing miners to choose to not make the commitment. In this case the miners who chose not to calculate the commitment would provide a short and simple proof that the commitment didn't exist in their block. Completing the upgrade would be another soft-fork to require that the commitment not exist, thus upgrading the (U)TXO format and allowing the related code and indexes to be eventually removed.
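
To make the idea of a short absence proof concrete, here is a toy sketch using a fixed-depth sparse binary tree keyed by an 8-bit commitment type. This is a stand-in technique chosen for brevity, not the API or data structure of python-merbinnertree; the key width, the all-zero `EMPTY` marker, and every function name are assumptions, and a production design would also need domain separation between leaves and inner nodes.

```python
import hashlib
from typing import Dict, List

DEPTH = 8             # toy key space: 8-bit commitment type ids
EMPTY = b"\x00" * 32  # marker for "no commitment in this subtree"


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def combine(left: bytes, right: bytes) -> bytes:
    """Parent hash; an entirely empty subtree stays EMPTY."""
    if left == EMPTY and right == EMPTY:
        return EMPTY
    return h(left + right)


def subtree_root(leaves: Dict[int, bytes], levels: int, prefix: int) -> bytes:
    """Root of the subtree `levels` high whose keys start with `prefix`."""
    if levels == 0:
        return leaves.get(prefix, EMPTY)
    return combine(subtree_root(leaves, levels - 1, prefix << 1),
                   subtree_root(leaves, levels - 1, (prefix << 1) | 1))


def prove(leaves: Dict[int, bytes], key: int) -> List[bytes]:
    """Sibling hashes from the leaf level up to just below the root."""
    return [subtree_root(leaves, level, (key >> level) ^ 1)
            for level in range(DEPTH)]


def root_from(key: int, leaf: bytes, siblings: List[bytes]) -> bytes:
    """Recompute the root assuming `leaf` sits at position `key`."""
    node = leaf
    for level, sibling in enumerate(siblings):
        if (key >> level) & 1:
            node = combine(sibling, node)
        else:
            node = combine(node, sibling)
    return node


commitments = {3: h(b"utxo-root"), 200: h(b"merged-mining")}
root = subtree_root(commitments, DEPTH, 0)
absent_ok = root_from(7, EMPTY, prove(commitments, 7)) == root
present_ok = root_from(3, commitments[3], prove(commitments, 3)) == root
fake_absent = root_from(3, EMPTY, prove(commitments, 3)) == root
```

In this toy, a miner who made no commitment of type 7 can show that by supplying the siblings for key 7 and letting the verifier recompute the root with `EMPTY` as the leaf. The same trick fails for key 3, where a commitment exists.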

@maaku
Contributor Author

maaku commented Oct 10, 2014

Fixed bitrot. This could use some tests still.

@maaku
Contributor Author

maaku commented Oct 10, 2014

kerami, using the transaction ordering is doable, but fairly ugly in my own opinion. It is also strictly speaking less capable as it doesn't allow for frictionless non-consensus commitments--with the current solution you can pick a random UUID as the commitment type and have some assurance against namespace collision. I have to admit this isn't a strong argument however, as it is not clear to me that such an application would be useful.

More seriously, however, it's hard to avoid the spend-a-coinbase-output scheme. The inputs need to be restricted such that no fees can be given, otherwise you break the security model by opening up the possibility of paying for incorrect or malicious commitments. In the coinbase commitment scheme this is trivial -- the one and only input is an anyone-can-spend. If you instead specify commitment type by position, then you still need at least one input by protocol rules, but because it is not required to be an anyone-can-spend you've now got to either make the same arbitrary restrictions on input value, or worry about economic attacks.

@maaku maaku force-pushed the coinbase-commitments branch 2 times, most recently from 4bdf409 to 7bda21d Compare October 13, 2014 22:44
@petertodd
Contributor

@maaku

otherwise you break the security model by opening up the possibility of paying for incorrect or malicious commitments.

Don't get caught up in notions of 'security model' - I can nearly as easily pay for those commitments out-of-band and enforce them with social or legal means or, if all else fails, even fidelity bonds. The idea that avoiding in-band fee payments will help anything in the real world is silly.

This provides infrastructure for soft-forking bitcoin to include extra per-block committed data in transactions separate from the coinbase, so that Merkle proofs of these commitments do not have to include the entire coinbase transaction. This is accomplished by means of anyone-can-spend outputs in the coinbase which are spent COINBASE_MATURITY blocks later by the miner who finds the block, embedding the block's commitment in an OP_RETURN output of that spending transaction.

Includes one such commitment with no attached validation rules, suitable for document time-stamping, merged mining, and other free-form miner-committed data applications.
@maaku
Contributor Author

maaku commented Feb 23, 2015

I am no longer convinced that this is the optimal approach, and am therefore closing this pull request.

@maaku maaku closed this Feb 23, 2015
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021

6 participants