BIP 136: Bech32 Encoded Tx Position References #555

Open
wants to merge 3 commits into
base: master
from

Conversation

@veleslavs
veleslavs commented Jul 9, 2017

This proposed BIP was posted to the mailing list just under two months ago. As there have been no serious objections (unfortunately, there has been no mailing list feedback at all), I wish to formally ask for a BIP number to be assigned to our proposal.

https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-May/014396.html

@veleslavs

Author

veleslavs commented Jul 9, 2017

@jonasschnelli I think that you would like to subscribe to this pull request.

@veleslavs

Author

veleslavs commented Jul 14, 2017

@jonasschnelli @ChristopherA @shannona @kimdhamilton I've updated this draft specification to include a Litecoin Display Format.

@veleslavs veleslavs force-pushed the veleslavs:Bech32_Encoded_TxRef branch 3 times, most recently from a960100 to 6b04a2f Jul 14, 2017

@veleslavs

Author

veleslavs commented Jul 14, 2017

@luke-jr Can you please allocate a new block of BIP numbers to WoT / ID based proposals; I think that this block should be about 20 numbers in width.

@luke-jr luke-jr added the New BIP label Jul 14, 2017

bip-XXXX-Bech32_Encoded_Transaction_Postion_References.mediawiki Outdated

==== Encoding ====
* All display formats MUST be encoded with standard Bech32<ref>'''Why use Bech32 Encoding for Confirmed Transaction References?''' The error detection and correction properties of this encoding format make it very attractive. We expect that it will be reasonable for software to correct a maximum of two characters; however, we haven’t specified this yet.</ref> encoding as defined within the [https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki BIP 0173] specification.
* The Human Readable Part of the Bech32 encoding MUST be "tx"<ref>'''Why use the same Human Readable Part for different systems or blockchains?''' The ‘tx1’ prefix is to namespace the Bech32 encoding to this specification. We use Magic Values internally to namespace between projects.</ref>.

@luke-jr

luke-jr Jul 14, 2017

Member

Seems to defeat the purpose of the human readable part.

@jonasschnelli

jonasschnelli Jul 14, 2017

Member

Why? Isn't the human-readable part ideal for detecting (on the human level) whether this is a testnet, mainnet or, let's say, Litecoin txref?

@luke-jr

luke-jr Jul 14, 2017

Member

Exactly, it should be "tx" + "bc" in some order or another, not just "tx". :)

@veleslavs

veleslavs Jul 14, 2017

Author

The human-readable part is there to inform users that the string should conform to this particular BIP. We have our own magic and versioning system so that we can upgrade and extend the format transparently for our users.

==== Encoding ====
* All display formats MUST be encoded with standard Bech32<ref>'''Why use Bech32 Encoding for Confirmed Transaction References?''' The error detection and correction properties of this encoding format make it very attractive. We expect that it will be reasonable for software to correct a maximum of two characters; however, we haven’t specified this yet.</ref> encoding as defined within the [https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki BIP 0173] specification.
* The Human Readable Part of the Bech32 encoding MUST be "tx"<ref>'''Why use the same Human Readable Part for different systems or blockchains?''' The ‘tx1’ prefix is to namespace the Bech32 encoding to this specification. We use Magic Values internally to namespace between projects.</ref>.
* All Non Bech32 alphabet characters after the Bech32 code separator MUST be ignored when phasing, (except for terminating characters).<ref>'''Why Strip all Non-Bech32 Alphabet Characters?''' We do not wish to expect the users to keep their TxRef's in good unicode form. We expect them to copy, paste, write by-hand, write in a mix of character sets, etc. Phasers should automaticity correct for all of these sorts of errors. However, we only specify the removal of non-bech32-alphabet characters.

@luke-jr

luke-jr Jul 14, 2017

Member

By specifying the removal, you actually prohibit correction...

@veleslavs

veleslavs Jul 14, 2017

Author

I do not see how this is the case; can you please provide an example?

|'''‘r’'''
|'''Bitcoin Main Chain'''
|-
|'''Litecoin'''

@luke-jr

luke-jr Jul 14, 2017

Member

Litecoin stuff is inappropriate for BIPs.

@jonasschnelli

jonasschnelli Jul 14, 2017

Member

As a definition, yeah, maybe. But we already have Litecoin references in BIP44 and BIP122.

@luke-jr

luke-jr Jul 14, 2017

Member

As examples of altcoins, not as specification.

@veleslavs

veleslavs Jul 14, 2017

Author

If there is a strong objection to including a Litecoin display format as an appendix, I will be happy to remove it. It was included by request, as Litecoin is used by Bitcoin developers as a legitimate and functional testing ground.


This appendix contains a table of the allocated or reserved magic codes.
* If you wish to claim a magic code, please open a pull request adding the code to this list.
* If this list becomes too popular, we will move it to another location.

@luke-jr

luke-jr Jul 14, 2017

Member

Suggest just reusing SLIP 44 (or better yet 173), possibly with a bitshift or sub-namespace.

@ChristopherA

ChristopherA commented Jul 14, 2017

@luke-jr wrote: "Suggest just reusing SLIP 44 (or better yet 173), possibly with a bitshift or sub-namespace." & "Litecoin stuff is inappropriate for BIPs."

I agree that this may be more appropriate for SLIP (though I have no experience with their process) but unfortunately both SLIP 44 and 173 don't work well as bech32 prefixes because of the limitations of that encoding.

I personally would like to see a SLIP that includes a number of prefixes, not only for blockchain types, but also reserved placeholders for private key, public key, xprv, xpub, ECDSA signature, Schnorr signature.

@ChristopherA

ChristopherA commented Jul 14, 2017

I'm also hoping to see this BIP use tx1: rather than tx1-, as that allows it to conform better with W3C URN/URL standards: TX1: is a method, and the data after it is the encoding of that method. If someone really preferred to use base58 for transaction references, it would be TX58: followed by its encoding, etc.

@luke-jr

Member

luke-jr commented Jul 14, 2017

> I agree that this may be more appropriate for SLIP (though I have no experience with their process) but unfortunately both SLIP 44 and 173 don't work well as bech32 prefixes because of the limitations of that encoding.

SLIP 173 is specifically designed for Bech32 prefixes...

> I personally would like to see a SLIP that includes a number of prefixes, not only for blockchain types, but also reserved placeholders for private key, public key, xprv, xpub, ECDSA signature, Schnorr signature.

That's more of a BIP thing, and could be part of this if appropriate.

@veleslavs

Author

veleslavs commented Jul 14, 2017

@ChristopherA Thank you for the feedback about using the colon instead of the dash as the separating character after the Bech32 separator. I am going to update this proposal to adopt this change.

@jonasschnelli any comments / objections?

@veleslavs veleslavs force-pushed the veleslavs:Bech32_Encoded_TxRef branch from 6b04a2f to a54360e Jul 14, 2017

@veleslavs veleslavs changed the title New BIP: Bech32 Encoded Transaction Postion References New BIP: Bech32 Encoded Transaction Position References Jul 14, 2017

@veleslavs

Author

veleslavs commented Jul 14, 2017

@ChristopherA I have updated this bit to use the colon instead of the dash for the first visual breaker, as recommended. @jonasschnelli and @shannona please take note that your code will need to be updated.

@veleslavs

Author

veleslavs commented Jul 14, 2017

@luke-jr I have moved the non-bitcoin related appendixes to satoshilabs/slips#89

However, I still haven't decided what to do with the magic values. In this case I don't think that https://github.com/satoshilabs/slips/blob/master/slip-0044.md suits our purposes, as keeping the codes very concise is a high priority.

@luke-jr

Member

luke-jr commented Jul 14, 2017

I think the codes should be human-readable more than concise, so SLIP 173 is a good match. But that's up to you.

@veleslavs

Author

veleslavs commented Jul 14, 2017

@luke-jr I think that it is the job of the GUI to provide this information in a clear way to the user. Users should not be expected to understand the difference between x-coin and y-scam-coin. Implementers should take care to display this information in a clear way to the users.

However, I think that the non-Bitcoin-related magic code listing should be moved to a SLIP. I will leave Litecoin as an exception, as I've granted it a first-class 5-bit code (maybe this should be reviewed and a second-class 10-bit magic code should be used for Litecoin). @jonasschnelli what do you think?

==== Encoding ====
* All display formats MUST be encoded with standard Bech32<ref>'''Why use Bech32 Encoding for Confirmed Transaction References?''' The error detection and correction properties of this encoding format make it very attractive. We expect that it will be reasonable for software to correct a maximum of two characters; however, we haven’t specified this yet.</ref> encoding as defined within the [https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki BIP 0173] specification.
* The Human Readable Part of the Bech32 encoding MUST be "tx"<ref>'''Why use the same Human Readable Part for different systems or blockchains?''' The ‘tx1’ prefix is to namespace the Bech32 encoding to this specification. We use Magic Values internally to namespace between projects.</ref>.
* All Non Bech32 alphabet characters after the Bech32 code separator MUST be ignored when phasing, (except for terminating characters).<ref>'''Why Strip all Non-Bech32 Alphabet Characters?''' We do not wish to expect the users to keep their TxRef's in good unicode form. We expect them to copy, paste, write by-hand, write in a mix of character sets, etc. Phasers should automaticity correct for all of these sorts of errors. However, we only specify the removal of non-bech32-alphabet characters.

@clarkmoody

clarkmoody Jul 14, 2017

Contributor

Shouldn't "phasers/phasing" be "parsers/parsing" here and elsewhere?

@clarkmoody

Contributor

clarkmoody commented Jul 14, 2017

I agree with @luke-jr that the human-readable part should inform the user that this is more than just "some transaction reference." For instance, as the user, I have to understand that tx1:r... is a Bitcoin transaction because of the r in the data part.

Since the human-readable part is specifically for humans, why not use bc-tx or similar string for the HRP? The spec could read

use the SLIP-0173 coin prefix, followed by `-tx` as the human-readable part

For instance, I would read bc-tx1:rqqq-qqqq-qmhu-qk as "bitcoin-transaction blah."
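
The suggested rule can be sketched in a few lines of Python. This is only an illustration of the idea in this comment, not text from the BIP: the function name `txref_hrp` and the lookup table are hypothetical, and the table assumes the SLIP-0173 prefixes `bc` (Bitcoin mainnet), `tb` (Bitcoin testnet) and `ltc` (Litecoin mainnet).

```python
# Hypothetical lookup table; the prefixes are assumed to match SLIP-0173.
SLIP173_PREFIXES = {
    "bitcoin": "bc",
    "bitcoin-testnet": "tb",
    "litecoin": "ltc",
}


def txref_hrp(network: str) -> str:
    """Build a human-readable part of the form '<coin prefix>-tx'."""
    hrp = SLIP173_PREFIXES[network] + "-tx"
    # BIP 173 limits the HRP to 1..83 US-ASCII characters in the range 33..126.
    if not 1 <= len(hrp) <= 83:
        raise ValueError("HRP length out of range")
    if any(not 33 <= ord(ch) <= 126 for ch in hrp):
        raise ValueError("HRP contains a character outside the allowed range")
    return hrp


print(txref_hrp("bitcoin"))          # bc-tx
print(txref_hrp("bitcoin-testnet"))  # tb-tx
```

With such an HRP the chain is visible before the `1` separator, so a reader does not have to interpret the magic code inside the data part.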

|'''Testnet'''
|'''0x6'''
|'''‘x’'''
|'''Any Test Network'''

@clarkmoody

clarkmoody Jul 14, 2017

Contributor

Why "any test network"? This sort of ambiguity means that multiple transactions have identical reference strings.

You could use the 10-bit codes for testnets (or stick with the SLIP-0173 prefixes in the human-readable part).

@jonasschnelli

jonasschnelli Jul 14, 2017

Member

Different test-nets can use different magic values. The idea behind "Any Test Network" is probably to have a longer and clear visually different txref form for testnets (longer bech32 part, longer HRP).

@veleslavs

veleslavs Jul 15, 2017

Author

I do not want to assign different Magic values to different test networks; Test Networks are for testing purposes only. Developers should be aware of which network they are testing on (without needing a separate testing magic value).

For example, should testnet v4 and (future) testnet v5 be assigned different testing magic values? Such ideas quickly get ridiculous.

@clarkmoody

clarkmoody Jul 17, 2017

Contributor

Some wallet interacting with two testnet chains, say for two altcoins, should be enough motivation to assign testnets separate Magic values. If not, the wallet will need to carry around metadata with each transaction reference, since all the references will be for "Any Test Network." There would also be extra code to check if it's a testnet or mainnet and switch around handling based on that.

|27 – 39
|13
|8191 transactions. (A full Bitcoin Block without Hard Fork).
|}

@clarkmoody

clarkmoody Jul 14, 2017

Contributor

I can understand the desire to keep all reference strings to the nice 14-character version by keeping the data payload to these 40 bits, but it seems to place artificial limitations on the format (year 2048 & 8191 transactions). I also understand that this might be addressed with Version 1 encoding. But current blocks are not that far from having 8191 transactions.

You could go with a variable-length encoding similar to Bitcoin's variable ints and gain the benefit of having a format that will work for very large blocks and the very far future.

Also, the Bech32 reference libraries allow converting byte arrays into the 5-bit arrays native to Bech32. It seems like bit-packing to these 40 bits might be overkill. As an alternative you could have one bit-packed byte to start:

# First two bits are the protocol version, supporting values 0-3
V = (protocol_version & 0x03) << 6
# Next two bits are magic for the blockchain
# 0x00 = Bitcoin
# 0x01 = Testnet3
# 0x02 = Byte1 is another coin's magic code (gives 256 options)
# 0x03 = Byte1-2 is treated as the coin magic code (gives 65280 more options)
M = (magic & 0x03) << 4
# Next two bits are the byte length of the block reference
B = (block_ref_byte_length & 0x03) << 2
# Final two bits are the byte length of the transaction index
T = tx_index_byte_length & 0x03
# Assemble into the first byte
Byte0 = V | M | B | T

This gives you up to 3 bytes for each block and transaction reference, which is 16.7 M blocks, or year 2336, and 16.7 M transaction slots.

Data part: [Byte0][optional magic bytes 1-2][block reference bytes][tx reference bytes]

So the shortest data part would have 3 bytes in it, with the reference version 0 genesis coinbase transaction having data part 0x050000.

I know this is a departure from your vision, but it would be much more flexible for the long term (in my opinion).
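
The layout proposed in this comment could be sketched as below. This is a rough illustration only: the function names are hypothetical, and big-endian byte order for the block and transaction fields is an assumption, since the comment does not specify one.

```python
# Number of extra magic bytes implied by the 2-bit magic field.
MAGIC_EXTRA_BYTES = {0: 0, 1: 0, 2: 1, 3: 2}


def min_bytes(value: int) -> int:
    """Smallest byte count (1..3) that can hold value."""
    if not 0 <= value < 1 << 24:
        raise ValueError("value does not fit in 3 bytes")
    return max(1, (value.bit_length() + 7) // 8)


def encode_data_part(version: int, magic: int, block_height: int,
                     tx_index: int, magic_bytes: bytes = b"") -> bytes:
    """Pack [Byte0][optional magic bytes][block bytes][tx bytes]."""
    if not 0 <= version <= 3 or not 0 <= magic <= 3:
        raise ValueError("version and magic are 2-bit fields")
    if len(magic_bytes) != MAGIC_EXTRA_BYTES[magic]:
        raise ValueError("wrong number of magic bytes for this magic value")
    b_len = min_bytes(block_height)
    t_len = min_bytes(tx_index)
    byte0 = (version << 6) | (magic << 4) | (b_len << 2) | t_len
    # Assumption: multi-byte fields are written big-endian.
    return (bytes([byte0]) + magic_bytes
            + block_height.to_bytes(b_len, "big")
            + tx_index.to_bytes(t_len, "big"))


def decode_data_part(data: bytes):
    """Inverse of encode_data_part."""
    byte0 = data[0]
    version = byte0 >> 6
    magic = (byte0 >> 4) & 0x03
    b_len = (byte0 >> 2) & 0x03
    t_len = byte0 & 0x03
    pos = 1
    m_len = MAGIC_EXTRA_BYTES[magic]
    magic_bytes = data[pos:pos + m_len]
    pos += m_len
    block_height = int.from_bytes(data[pos:pos + b_len], "big")
    pos += b_len
    tx_index = int.from_bytes(data[pos:pos + t_len], "big")
    pos += t_len
    if pos != len(data):
        raise ValueError("data part has the wrong length")
    return version, magic, block_height, tx_index, magic_bytes


# Version 0, Bitcoin, genesis block, coinbase transaction.
print(encode_data_part(0, 0, 0, 0).hex())  # 050000
```

The genesis coinbase reference encodes to the three bytes `0x050000` mentioned above: one header byte with both length fields set to 1, followed by one zero byte each for the block and the transaction index.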

@jonasschnelli

Member

jonasschnelli commented Jul 14, 2017

We should move the technical discussion to the mailing list where it belongs: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-May/014396.html.

@clarkmoody

Contributor

clarkmoody commented Jul 14, 2017

Posted to mailing list. My comments will have no context, since the original BIP content was not sent to the list, only a GitHub link.

@veleslavs veleslavs force-pushed the veleslavs:Bech32_Encoded_TxRef branch 4 times, most recently from 38007c6 to c61649f Sep 12, 2018

@clarkmoody

Contributor

clarkmoody commented Sep 12, 2018

I remain unconvinced that this is useful as written. @gmaxwell 's comments about archival data notwithstanding, it seems overly complex for its purpose.

Let's consider two cases:

Machine-to-machine

If two machines want to communicate the position of a transaction, they would simply use the block height and txid. There is no real need for brevity or additional checksums between machines, and a txid lookup within a block is a small linear search. Checksum information would be present in the machine connection itself, so it is not needed in the message format.

Person-to-person

The spec says "As TxRef's are short, we expect that they will be quoted via voice or written by hand." It may take a shorter amount of time to communicate two numbers for block height and tx position than the fewer-but-random letters. So @luke-jr's proposal above of something like tx:<chain>/<block>/<position> would yield tx:bc/0/0 for the genesis transaction. It remains human-readable but also becomes human-understandable. All coinbase transactions would have /0, making them recognizable.

You could add a variable number of bytes of the txid as a "checksum" of sorts, similar to Git using 4 bytes of the commit hash (tx:bc/541091/4/a51f appends two bytes). You could also change the / to be something else (-,:,_).

This alternate proposal suffers from no limitations on ultimate block height or transaction count.
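
A parser for this alternate notation might look like the sketch below. The names `TxPosition`, `parse_tx_position` and `matches_txid` are hypothetical, as is the exact grammar (lowercase chain prefix, decimal numbers, optional hex suffix made of whole bytes). Treating the suffix as the leading hex characters of the txid as usually displayed is also an assumption; the comment only says "a variable number of bytes of the txid".

```python
import re
from typing import NamedTuple, Optional

# Assumed grammar: tx:<chain>/<block>/<position>[/<hex bytes of txid>]
_TX_REF = re.compile(
    r"tx:([a-z0-9]+)/([0-9]+)/([0-9]+)(?:/((?:[0-9a-f]{2})+))?"
)


class TxPosition(NamedTuple):
    chain: str
    block_height: int
    tx_index: int
    txid_prefix: Optional[str]


def parse_tx_position(ref: str) -> TxPosition:
    """Parse a reference such as 'tx:bc/541091/4/a51f'."""
    match = _TX_REF.fullmatch(ref)
    if match is None:
        raise ValueError("not a valid tx position reference: " + repr(ref))
    chain, block, position, prefix = match.groups()
    return TxPosition(chain, int(block), int(position), prefix)


def matches_txid(pos: TxPosition, txid_hex: str) -> bool:
    """Use the optional suffix as a weak check against a looked-up txid."""
    if pos.txid_prefix is None:
        return True
    return txid_hex.lower().startswith(pos.txid_prefix)


genesis = parse_tx_position("tx:bc/0/0")
print(genesis.tx_index == 0)  # True: position 0 is the coinbase transaction
print(parse_tx_position("tx:bc/541091/4/a51f"))
```

Because the numbers are plain decimal, nothing in this notation caps the block height or the transaction count.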

@ChristopherA

ChristopherA commented Sep 22, 2018

We have been using a slight variant of txref in the #RebootingWebOfTrust for the BTC Decentralized Identifier method called BTCR (https://github.com/WebOfTrustInfo/rebooting-the-web-of-trust-spring2018/blob/master/draft-documents/btcr_resolver.md implementations at https://github.com/dcdpr/btcr-DID-method and https://github.com/WebOfTrustInfo/txref-conversion-java). The biggest difference is that we added an optional output index. If the output index is absent, it is considered 0 (and thus matches the older TXREF exactly), and if it is encoded it is the index.

In addition to our use, a particular advantage of the encoding with the output index is that this is basically what the Lightning Network uses for nodeids. Adding this to TXREF means that a bech32 txref can also be used as a shorter, more readable alternative to the current hex notation for Lightning Nodes. (cc @rustyrussell)

@danpape

danpape commented Sep 23, 2018

@veleslavs The changes @ChristopherA is referring to are detailed in a branch I made from an old version of your BIP. We proposed adding extra bits to encode the output index in the txref. I modified your old BIP with two new sections for display formats "0-ext" and "A-ext". You can see those additions here: danpape@c8ab8c9. I'm sorry to have dragged my feet asking you to review them. Now that you have made significant changes to your BIP, I can revise my changes and get them to you in a few days if you are interested in incorporating them into this BIP.

In addition to the C and Go implementations you note in your new BIP, @kimdhamilton has one for javascript: https://github.com/WebOfTrustInfo/txref-conversion-java , @rxgrant and I have one for C++ within https://github.com/dcdpr/btcr-DID-method. Both implementations support the "txref-ext" proposal. Finally, I have changes I can submit to jonasschnelli for adding "txref-ext" support in the original C code: danpape/bitcoin_txref_code@1d04b9d

@rxgrant

rxgrant commented Sep 24, 2018

@clarkmoody By not using Bech32 there is only guess-and-check to repair transmission error.

@gmaxwell This BIP is not an implicit requirement that bitcoind offer generally queryable access to any txrefs.

@clarkmoody

Contributor

clarkmoody commented Sep 24, 2018

@rxgrant I'm not sure you read my critique.

  • You can use the partial txid as a checksum
  • There are other ways to generate checksums than Bech32, while retaining human-readable encoding. A simple modulus operation between the block height, the tx position, and some prime number could probably provide an adequate checksum that could even be computed by hand.
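
One possible reading of the modulus idea is sketched below. The constants and function names are hypothetical; the comment does not say how the block height, the transaction position and the prime should be combined, so the weighting here is purely an assumption.

```python
PRIME = 97          # hypothetical modulus; any small prime would do
HEIGHT_WEIGHT = 31  # hypothetical weight so height and position are not interchangeable


def position_checksum(block_height: int, tx_index: int) -> int:
    """Small checksum over (block height, tx position), computable by hand."""
    if block_height < 0 or tx_index < 0:
        raise ValueError("block height and tx index must be non-negative")
    return (block_height * HEIGHT_WEIGHT + tx_index) % PRIME


def verify_position(block_height: int, tx_index: int, checksum: int) -> bool:
    """Return True when the supplied checksum matches the two numbers."""
    return position_checksum(block_height, tx_index) == checksum


check = position_checksum(541091, 4)
print(verify_position(541091, 4, check))  # True
print(verify_position(541092, 4, check))  # False: the height is off by one
```

A check of this kind only reports that the numbers and the checksum disagree; it does not by itself indicate which digit is wrong.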

@veleslavs veleslavs force-pushed the veleslavs:Bech32_Encoded_TxRef branch from c61649f to 6d9433f Sep 24, 2018

@veleslavs

Author

veleslavs commented Sep 24, 2018

@clarkmoody ,
Thank you for your response. I quite agree: some sort of human-understandable transaction URI format would be very useful. However, that is out of scope for this design.

Such a format could be naturally extended to something like:

(for Bitcoin):

block:        block://bc/<block-height>
transaction:  tx://bc/<block-height>/<transaction-index>
input:        input://bc/<block-height>/<transaction-index>/<input-index>
output:       output://bc/<block-height>/<transaction-index>/<output-index>

I think that such a standard format would be very useful for URLs and other web applications. However, the format of this particular BIP is more focused and has a different use case in mind. I would suggest that such a URI design be made in a new BIP.

@veleslavs

Author

veleslavs commented Sep 24, 2018

@ChristopherA and @danpape
I think that this proposal could easily be extended to support referencing either inputs or outputs.

We have a single version bit that is currently unspecified. By setting this bit we can easily support an extra field.

About the other display formats: I'm sorry that I have removed the concept from the latest edition of this BIP. Of course, there is nothing stopping you from defining your own; just please take a new magic code.

@rxgrant

rxgrant commented Sep 24, 2018

@clarkmoody When the checksum does not match, Bech32 offers a better user experience than guess-and-check repairs. This is what I mean by "repair transmission error".

@rxgrant

rxgrant commented Sep 24, 2018

@veleslavs Since BTCR references transactions on the Bitcoin blockchain, it does not look like a new magic code is useful.

The beginning of our URLs, which look like did:btcr:<bech32-data>, offers the same contextual cues as the HRP, so we do not waste characters on it. However, we do keep the txref magic code in the Bech32 data part, to identify testnet references. How do you feel about loosening the encoding section to not require the HRP in circumstances where it is obvious from context?

@danpape

danpape commented Sep 24, 2018

@veleslavs A question regarding the version bit--perhaps I don't understand, but why just one bit? Wouldn't it be better to have a few more bits to support more than two versions in the future?

@clarkmoody

Contributor

clarkmoody commented Sep 24, 2018

@danpape Version 1 would specify additional space for further versions.

* A Colon<ref>'''Why add a colon here?''' This allows it to conform better with W3C URN/URL standards.</ref> '''":"''' added after '1'.
* Hyphens<ref>'''Why hyphens to the TxRef?''' As TxRef's are short, we expect that they will be quoted via voice or written by hand. The inclusion of hyphens every 4 characters breaks the string and means people don't loose their place so easily.</ref> '''"-"''' added after every 4 characters beyond the colon.
All non-bech32-alphabet characters after the bech32 code separator MUST be ignored/removed when parsing (except for terminating characters).<ref>'''Why strip all non-bech32-alphabet characters?''' We do not wish to expect the users to keep their TxRef's in good unicode form (hyphens, colons, invisible spaces, random unicode characters, etc). We expect them to copy, paste, write by-hand, write in a mix of character sets, etc. Parsers should automatically correct for all sorts these sorts of these errors.
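
The display rules quoted above (a colon after the `1` separator, a hyphen after every 4 characters, and removal of non-bech32-alphabet characters when parsing) can be illustrated with the sketch below. It is not taken from any of the reference implementations: the function names are hypothetical, splitting on the first `1` and lowercasing the input are assumptions, and neither the "terminating characters" exception nor Bech32 checksum verification is handled.

```python
from typing import Tuple

# The Bech32 character set from BIP 173.
BECH32_ALPHABET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def split_txref(text: str) -> Tuple[str, str]:
    """Return (hrp, data) with every non-bech32 character removed from data."""
    # Assumption: the HRP contains no '1', so the first '1' is the separator.
    hrp, sep, rest = text.partition("1")
    if not sep:
        raise ValueError("missing Bech32 separator '1'")
    data = "".join(ch for ch in rest.lower() if ch in BECH32_ALPHABET)
    return hrp.lower(), data


def format_txref(hrp: str, data: str) -> str:
    """Render hrp and data as '<hrp>1:xxxx-xxxx-...'."""
    groups = [data[i:i + 4] for i in range(0, len(data), 4)]
    return hrp + "1:" + "-".join(groups)


# Spaces and a zero-width space stand in for copy-and-paste damage.
hrp, data = split_txref("txtest1: x7ll llqq\u200b q9lp 6pe")
print(data)                     # x7llllqqq9lp6pe
print(format_txref(hrp, data))  # txtest1:x7ll-llqq-q9lp-6pe
```

Note that under this rule a mistyped character that falls outside the alphabet (such as `b`, `i` or `o`) is dropped rather than reported.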

@DanielWeigl

DanielWeigl Sep 26, 2018

Contributor

for all sorts these sorts

typo

@veleslavs

veleslavs Sep 29, 2018

Author

@DanielWeigl thank you, corrected.

@veleslavs veleslavs force-pushed the veleslavs:Bech32_Encoded_TxRef branch 2 times, most recently from 6671e1b to d31c7be Sep 29, 2018

@luke-jr

Member

luke-jr commented Oct 16, 2018

@veleslavs Let me know when this is ready to merge

@veleslavs veleslavs force-pushed the veleslavs:Bech32_Encoded_TxRef branch from d31c7be to 5a31841 Dec 14, 2018

veleslavs and others added some commits Nov 6, 2018

Include Optional Encoded Outpoints
With thanks to Daniel Pape for the work behind this idea.

Please note that the test vectors still need to be updated (again).

@veleslavs veleslavs force-pushed the veleslavs:Bech32_Encoded_TxRef branch from 5a31841 to e9ff93f Dec 14, 2018

@veleslavs

Author

veleslavs commented Dec 14, 2018

@danpape @clarkmoody @rxgrant @ChristopherA @jonasschnelli I think that we are ready for a quick final review (particularly focusing on the test vectors). Then we can get this document merged.

@luke-jr

Member

luke-jr commented Feb 15, 2019

@veleslavs FYI you don't need to wait for reviews if you want me to just merge this...

@danpape


danpape commented Feb 18, 2019

I made the last edits and had double checked all the test vectors at that time. Perhaps someone else should review as well, but I'm happy with it. Thanks.

There are two sets of Test Vectors included here:

* Bech32 Encoding Test Vectors. These are to test if a implementation accepts the encoding, with the correct human readable part, and separator.
* Bitcoin TxRef Test Vectors. These test the full specification, in particular, correct values for block hight and the transaction index.

@rex4539

rex4539 Apr 3, 2019

Typo: "hight"

* <tt>txtest1:x7ll-llqq-q9lp-6pe</tt>: <tt>(0xFFFFFF, 0x0)</tt>
* <tt>txtest1:x7ll-llll-lew2-gqs</tt>: <tt>(0xFFFFFF, 0x7FFF)</tt>
The following list gives valid (though strangley formatted) Bitcoin TxRef's and the values in hex. (block height, transaction index)

@rex4539

rex4539 Apr 3, 2019

Typo: "strangley"
