Permalink
Switch branches/tags
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
257 lines (167 sloc) 10.9 KB
  CIP: 6
  Title: P2SH data encoding; at least it's not P2PKH
  Authors: Ruben de Vries
  Discussions-To: https://counterpartytalk.org/t/cip-proposal-p2sh-data-encoding/2169/22
  Status: Accepted
  Type: Standards
  Created: 2016-06-24
  Updated: 2017-11-16

Abstract

With chaining 2 transactions together we can embed data in the scriptSigs of P2SH outputs.

Motivation

Counterparty currently has the following data encoding schemes:

  • opreturn the best solution, doesn't pollute the UTXO set, is the cheapest, but limited to a maximum 80 bytes
  • multisig the 2nd best solution, only temporarly pollutes the UTXO set, has no upper limit
  • pubkeyhash the worst solution, permanently pollutes the UTXO set, but also has no upper limit

For (literally) 99% of Counterparty transactions opreturn is enough (though some people forcefully use multisig encoding). Previously for the rest we'd fall back to using multisig encoding since it has no real drawbacks.

In Bitcoin Core v0.12.1 there was a DoS protection (called bytespersigop) that essentially made bare multisig non-standard. Bare multisig is discouraged by Bitcoin Core developers and may likely be limited or removed in a future Bitcoin Core release.

So we need an alternative for larger amounts of data and we want to avoid pubkeyhash encoding because it pollutes the UTXO set too much.

Rationale

It's possible to embed data in P2SH redeemScripts by doing 2 transactions where the first one sets up P2SH outputs with redeemScripts that allow us to put the data in the scriptSig of the spending transaction.

With this method we can easily put in a lot more data than the other methods. This does not polute the UTXO set and signature data can be pruned by Bitcoin nodes that wish to do so.

The original proof of concept, by Peter Todd, of this can be found here: https://github.com/petertodd/python-bitcoinlib/blob/master/examples/publish-text.py

Specification

The Basics

This process requires 2 transactions, the first to setup 1 or more P2SH outputs and then the second spending the P2SH outputs and placing the data we wish to embed in the scriptSig.

The Redeem Script

The data is split into chunks of max 520 size, then for each chunk we create a redeemScript as follows:

redeemScript = {{ data }} OP_DROP {{ pubkey }} OP_CHECKSIGVERIFY {{ n }} OP_DROP OP_DEPTH 0 OP_EQUAL

dissecting this into pieces we have:

{{ data }}

this is the counterparty operation data. Since this data is part of the redeemScript it is included in the hash of the P2SH address.

{{ pubkey }} OP_CHECKSIGVERIFY

this ensures the output needs to be signed by the owner. Any arbitrary script operation is allowed here as well.

{{ n }} OP_DROP

n is an incrementing number to ensure that each output is unique even when the data chunks aren't.

OP_DEPTH 0 OP_EQUAL

this prevents scriptSig malleability.

The Output Script

The output script placed in the first transaction is then:

ouputScript = OP_HASH160 {{ hash160([redeemScript]) }} OP_EQUALVERIFY

The real deal

Below we'll describe how a Counterparty send would look.

Transaction 1
Inputs
  • 1 + n UTXOs from any address with enough BTC to pay the fees for both the first and the second transaction.
Outputs
  • 1 + n P2SH outputs following the above method with at least DUST value. The sum of these outputs must contain enough value to pay the fees for the second transaction.
  • 1 change output to send any excess BTC to the sender or a new change address. This is optional.
Transaction 2
Inputs
  • 1 input spending a source output from the source. (This is optional. It can be used for defining a source address when sending from a P2SH address. See below.)
  • 1 + n inputs spending the P2SH data outputs. The transaction data is included in the scriptSig
Outputs
  • 1 output to specify the destination of the Counterparty transaction with DUST value if needed. In most transactions this can be omitted.
  • 1 OP_RETURN output encoding the data 'CNTRPTY' + 'P2SH' to signal that the data is found in the P2SH inputs, with 0 value.
  • 1 change output to any address. This is optional.
Fees & Coin Selection

The first transaction has to send enough BTC to the second transaction so the second transaction can pay for it's own fee without having to add extra inputs. So we calculate the value of the output in the first transaction to be:

estimated_size_of_tx2 = 10  # base size of  TX
estimated_size_of_tx2 += 181  # for source input
estimated_size_of_tx2 += sizeof(data)  # for the data
estimated_size_of_tx2 += count(data_p2sh_outputs) * (181 + 9)  # for the overhead of each data output being spend
estimated_fee_for_tx2 = (estimated_size_of_tx2 / 1000) * fee_per_kb

This means the amount of BTC to require when doing coin selection is:

estimated_size_of_tx1_without_inputs = 10  # base size of a TX
estimated_size_of_tx1_without_inputs += count(data_p2sh_outputs) * 29  # for P2SH data outputs

inputs = []
for UTXO in coinselection:
   inputs.append(UTXO)
   estimated_size_of_tx1 = estimated_size_of_tx1_without_inputs + count(inputs) * 181
   estimated_fee_for_tx1 = (estimated_size_of_tx1 / 1000) * fee_per_kb

   if sum(inputs) >= estimated_fee_for_tx1 + estimated_fee_for_tx2:
       break

Determining the source address

If the redeem script contains {{ pubkey }} OP_CHECKSIGVERIFY, then the source address will be constructed from the pubkey provided in the redeem script.

When the source address is a P2SH address, the redeem script will not contain any pubkey. In this case, the source address will be determined by the first input. If transaction 2 spends an input from a P2SH script and there is no {{ pubkey }} OP_CHECKSIGVERIFY in the redeem script, then the source address will be the P2SH address that provided the first input spent in transaction 2.

Example 1: A standard (P2PKH) address

Say we have source address of 1aP2PKHaddressxxxxxxxxxxxxxy43CZ9j and a data script hash of 3aP2SHDataAddress001xxxxxxxy4zdFf3.

Transaction 1 Input:

Input Value
1 0.100

Transaction 1 Outputs:

Output Value Destination Note
1 0.001 3aP2SHDataAddress001xxxxxxxy4zdFf3 Fund Transaction 2
2 0.098 1aP2PKHaddressxxxxxxxxxxxxxy43CZ9j Change (optional)

Fee Paid: 0.001

Transaction 2 Input:

Input Value Paid to ScriptSig
1 0.001 3aP2SHDataAddress001xxxxxxxy4zdFf3 <signature> {{ data }} OP_DROP {{ pubkey }} OP_CHECKSIGVERIFY {{ n }} OP_DROP OP_DEPTH 0 OP_EQUAL

Transaction 2 Output:

Output Value Destination Note
1 0 OP_RETURN CNTRPTYP2SH

Fee Paid: 0.001

Example 2: A multisig (P2SH) address

Say we have a source address that is a 2-of-3 multisig address with an address of 3aMultisigAddressxxxxxxxxxxy1QHYVH and a data script hash of 3aP2SHDataAddress002xxxxxxxy4zdFf3.

Transaction 1 Input:

Input Value
1 0.100

Transaction 1 Outputs:

Output Value Destination Note
1 0.001 3aP2SHDataAddress002xxxxxxxy4zdFf3 funds transaction 2
2 0.098 3aMultisigAddressxxxxxxxxxxy1QHYVH change (optional)

Fee Paid: 0.001

Transaction 2 Inputs:

Input Value Paid to ScriptSig
1 0.098 3aMultisigAddressxxxxxxxxxxy1QHYVH 0 <signature1> <signature2> 2 {{ pubkey1 }} {{ pubkey2 }} {{ pubkey3 }} 3 OP_CHECKMULTISIG
2 0.001 3aP2SHDataAddress002xxxxxxxy4zdFf3 0 <signature1> <signature2> {{ data }} OP_DROP 2 {{ pubkey1 }} {{ pubkey2 }} {{ pubkey3 }} 3 OP_CHECKMULTISIGVERIFY {{ n }} OP_DROP OP_DEPTH 0 OP_EQUAL

Transaction 2 Outputs:

Output Value Destination Note
1 0 OP_RETURN CNTRPTYP2SH
2 0.098 3aMultisigAddressxxxxxxxxxxy1QHYVH change (optional)

Fee Paid: 0.001

The Counterparty API

In practice this means the create_* API calls will have to start returning a list of 1 or more transactions which the client signs and broadcasts, so clients need to adapt to this new style and always assume they need to sign and broadcast N transactions.

Pre-segwit we won't be able to have the second transaction spend from the first transaction until the first transaction has been signed, this means the client actually needs to do 2 API calls, where the second has the txId of the signed first transaction as a parameter.

Once we can use segwit this problem is gone! However not all clients will straight away be able to sign segwit transactions (requires upgrades of the libraries they use).

Because of this being quite a hassle we propose to have 2 configuration and API parameters to control all of this: segwit and old_style_api.

old_style_api

When old_style_api = True the API will continue to function as normal, returning a single transaction, as string.

Once old_style_api = False the API will (across all create_* transactions) return a list of 1+ transactions (even when it's just 1).

While old_style_api = True P2SH encoding will not be used unless explcitily set (so - the default - encoding=auto will not use P2SH encoding).

segwit

When segwit = False the first API call will return [tx_hex, None] signalling that it needs a second call with p2sh_pretx_txid added as param, which should be the txId of the signed first transaction. The second API call will then return [None, tx_hex].

When segwit = True the API call will return [tx_hex, tx_hex] and will no longer require a second API call.

Child pays for Parent

The current scheme ensures both transactions pay fee for their own size, in the (near) future when the 'Child pays for Parent' functionality that has been added to Bitcoin Core is widely adopted we can change this so that the second transaction pays a larger portion of the total fee (still needs to pay enough to be relayed).

This will ensure that the first transaction is never mined without the second transaction.

Backwards Compatibility

This is a new encoding that will be completely unrecognised by older clients, any clients who don't upgrade would loose consensus with the nodes that did upgrade. It also affects how the Counterparty API works (see above).

Copyright

This document is placed in the public domain.