BIP draft: Terminology for transaction components and aspects #1
e3871e7 to f75f597
f75f597 to 4f2eb73
To be continued..
bip-tx-terminology.mediawiki
Outdated
: [AT] The variant of the Script language used in P2TR Leaf Scripts (see BIP342).

; Transaction Header
: [AT] Collective term for the serialization artifacts and transaction fields that appear only once in the transaction serialization regardless of lengths of the Input and Output Lists: Transaction Version, Marker (segwit only), Flag (segwit only), length of Transaction Input List, length of Transaction Output List, Locktime
Are the terms "transaction field" and "transaction component" interchangeable?
That’s a good question which I shall ponder further
bip-tx-terminology.mediawiki
Outdated
: [AT] A script template that forwards input validation to the Witness Stack. Witness Programs are a type of Forwarding Script. Witness Programs appear in the Output Script for native segwit outputs and in the Input Script for wrapped segwit inputs.

; Witness Structure
: [TC] The part of the serialized transaction that contains the witness stacks for each input.
Since it's part of a serialized transaction, and because the Witness Stack is a TC, should this (Witness Structure) be an SA instead of a TC?
Actually, since the count of Witness Stacks is defined by the input counter, it’s literally just the concatenation of all Witness Stacks, so there are no additions to the serialization. I’ve redefined it as an [AT].
Gonna leave it open, since it still is a bit of a head-scratcher actually.
Nicely done! I think it is tremendously helpful to have a single source of truth for this stuff, especially as things have evolved over time and some terms in BIP-144 aren't as concise as they could be, and we have new fields for taproot transactions.

I left comments on parts of this document that I felt were especially useful to have explained. I didn't go through the glossary with a fine toothed comb, but left notes on suggestions/questions that immediately jumped out at me. After a first pass, I think I mostly understand all the terms, but I'll be the first to admit I know next to nothing about P2TR, so cannot be of much help in reviewing those sections.

Do you plan on adding some diagrams to this? Here is one I had to draw out and continued to reference as I made my way through the draft:

As I look at it now, I'm a little confused by redeem script and bare output script being ATs and not TCs. My brain wants all little blue blocks on the bottom row to be TCs.

Right now I don't think the motivation section does enough justice to the rest of the document, especially considering conversations we've had on the general confusion around some of these terms. I'm a little afraid to dig up my old notes on this stuff because even my attempts to tease apart some of these terms confused me! At the moment, I don't have any concrete suggestions on ways to enrich this section other than anecdotal examples (i.e. "people often use locking script, output script, and redeem script interchangeably"). But I will come back if I think of ideas.

Overall I think this is a really good idea, and certainly not a small task! Excited for this draft to continue to evolve!
The graphic is an excellent idea. I’ll incorporate that soon. Gotta run in a minute, will just push the quick fixes meanwhile.
Also: • mention endianness of txid in Outpoint • amend comment about replaceability
36e5de9 to cbac6f5
Concept ACK. Nit: document uses the term "UTXO" without defining it. Would love if you could normalize the usage of the term TXO (a transaction output that may be spent or not) with this document as people sometimes complain when I use it (and prefer the term txout).
bip-tx-terminology.mediawiki
Outdated
: While BIP144 refers to witness stacks as “script witnesses”, they are not scripts. Strictly speaking, they’re also not stacks, because some Witness Items that appear in Witness Stacks are not added to the stack, such as Control Blocks. We prefer Witness Stack as it is well-established.

; Output Script vs Locking Script
: The scriptPubKey is also sometimes referred to as a "locking script". However, we aim to emphasize the position of the field in the transaction, as it can either take the function of a condition or forwarding script. We therefore prefer a name that references the location in the transaction rather than a function it does not always have.
First of all, great initiative and great job explaining the terminology. I only have a couple of related suggestions. Rather than commenting in several places I would argue for a couple of changes here, in one place.
I would suggest that Locking Script and Unlocking Script are also mentioned 'aka' to Output Script and Input Script respectively.
I found that a lot of people have been using them this way and I have adopted it with success, i.e. people understood much more easily what the purpose of the script was. I have also never seen condition script and script arguments associated with locking and unlocking script respectively, but maybe I just never happened to come across those!
Rationale
I agree with emphasizing the position of the field. Output Script (as well as Input Script) should be the main entry.
However, an output script is what effectively locks the funds and I argue that it always has this function. e.g. P2SH locks the funds to an unknown script that will be revealed later (similarly for taproot alternative paths/scripts).
Similarly, an input script is always what unlocks the funds. It includes everything required to unlock them (incl. any forwarding scripts, etc.).
Suggested changes:
Input Script entry: include "aka scriptSig or Unlocking Script"
Output Script entry: include "aka scriptPubKey or Locking Script".
Condition Script entry: just remove "aka locking script"
Script Arguments entry: just remove "aka Unlocking Script"
Rephrase (or maybe remove) this section.
Thanks again for your efforts.
Hi @karask, thanks for your comment. I should clarify that condition script and script arguments are newly introduced terms per this proposal specifically to decouple the position of the input script and output script from the function of locking and unlocking, and to underscore the evolution of output types from executable scripts to templates with additional meaning.
As output types have evolved over the years, we have departed from relying on fully specified executable scripts and instead imbued certain templates with additional constraints. For example, a P2SH program as written only checks that the redeem script revealed in the input script is the preimage of the hash in the output script, but the implied meaning of the P2SH program additionally requires the redeem script to be evaluated satisfactorily. Later, with wrapped P2WSH, we didn’t even require the witness script to contain all the arguments with push operations directly in the script, but rather provided them as separate items akin to a pre-built stack. These stand-alone witness items are what I refer to as script arguments. Finally, with native segwit outputs, the input script is empty altogether, and it would feel weird to me to still refer to it as “unlocking script” when it no longer holds any sway in the authorization of spending a UTXO.
I think I’ve identified a few points that I need to rework, and I will look to incorporate my explanation into the rationale as well as the descriptions of input script and output script. I’ll think about how to best mention “locking script” and “unlocking script”. I’ll also try to rework my description of forwarding scripts to honor their function in committing the spender to the condition script even when they no longer directly express the spending conditions.
Murch - Are Forwarding Scripts (Witness Programs and P2SH Programs) not considered to be locking scripts?
Is that why "locking script" is specifically associated with Condition Scripts and not general Output Scripts, as karask suggests?
I ask because I agree with the first two suggestions of this comment, but also am worried taking that position is reinforcing confusion of the terms "locking script" and "unlocking script". I second karask's point that locking script and unlocking script are comparatively easier to understand, and I feel like they have seen a lot of success. Maybe because they naturally invoke an image of a lock and key.
They definitely should be incorporated in this BIP, as you have done, though the devil is in the details of how they should be defined/represented. I think what I'm trying to say is the terms locking/unlocking script will likely be easy and accessible entry points for people trying to understand this doc and level up their transaction terminology so care must be taken in how they are represented! 😄
I just realized while I was typing my comment I missed your response. Please disregard if I've said anything that has already been answered 🙈
That’s actually a good point. Perhaps “forwarding scripts” and “condition scripts” are both subtypes of “locking scripts”.
Added as a todo to elaborate on
Nit: document uses the term "UTXO" without defining it. Would love if you could normalize the usage of the term TXO (a transaction output that may be spent or not) with this document as people sometimes complain when I use it (and prefer the term txout).
Defined the term UTXO and replaced multiple uses of UTXO with “TXO”.
@vostrnad, @karask, @satsie: Addressed most open feedback.
I’m in the process of lowercasing and italicizing all defined terms, see first few in last commit, will continue in that style.
Hi @xekyo
I don't see my locking/unlocking concerns being addressed (with changes or counter-explanations). Not sure if you are still contemplating this, but I wanted to raise it again now (or else I will probably forget! :-)). It might seem pedantic, but I think it is important.
Locking and unlocking scripts already have a high-level semantic meaning. Indeed, I don't believe they are mentioned anywhere in BIPs. If this BIP did not mention locking and unlocking scripts I would not raise it as an issue, esp. given that the terminology is targeted more towards BIP/technical writing.
I have already mentioned how it has been used until now and if we mention it in this BIP it should be aligned. My concern is that associating them with "extra"/specific meaning (condition script/script arguments) is going to further confuse.
Hi @karask, I added it as a todo in the opening comment, I am currently reworking the capitalization and formatting issues, while I mull over how I want to incorporate aliases better.
Do you happen to know some examples for resources that use the terminology of “locking script” and “unlocking script”? I think it might be used in Mastering Bitcoin, but I don’t know other cases.
Mastering Bitcoin is the main resource I had in mind as well; it has proliferated from there to several blogs and other posts though.
@karask: I now mention “locking script” as a synonym of output script, and “unlocking script” as synonym for input script. I also rewrote the rationale sections regarding them. They could probably still use some polish, but I’m starting to suffer from tunnel vision—I think I’ve rewritten it four times meanwhile. 😬
I could use a set of fresh eyes to tell me whether it makes any sense, if you have a moment to glance at those definitions and rationales specifically. :)
This [WIP] commit mostly fixes formatting:
@karask: There is a very rough rationale on locking/unlocking which I pushed because I have to run now, and might want to look at it more from another device, but I wouldn’t consider this to be fully addressing your concerns yet. Thanks for your patience.
Please let me know if you notice any synonyms that I’m still missing and should be listing.
b180277 to 730a813
I think it's actually quite clear from the definition of forwarding script: "Witness programs and P2SH programs are forwarding scripts. Forwarding scripts make use of script templates that imply additional evaluation steps beyond the explicitly expressed conditions." Thus, a P2TR output script is a forwarding script that either implies a signature check or forwards to a leaf script. The taproot output key is a witness program, it's not a condition script since it doesn't include any explicit conditions and isn't even a script (unlike every other type of condition script).
P2WPKH doesn't either, the situation is nearly identical to P2TR keypath.
Thanks @xekyo for initiating this much-needed endeavor and to all the contributors working to enhance our shared language. Speaking of language, I experimented with an LLM as a co-pilot for this task. phind, a new LLM-powered search engine for developers appears to be a great fit. Unfortunately, I will be occupied in the upcoming weeks, but someone else may find value in utilizing this tool for further exploration: Example the 3 first prompts:
I noticed that both 'witness structure' and 'witness stack' have 'witness' as a synonym. While it has been used that way, i.e. a 'witness' in the context of a tx is the witness structure but in the context of an input it is a witness stack, I believe that having it as a synonym for both might be confusing (at least without clarification). If we were to remove one of the two, I would suggest removing it from witness structure, only because of relative frequency, i.e. we talk about witness in the context of inputs more often than of txs.
This looks good so far. I haven't read through the "Rationale" section yet, but here are some initial comments.
: [Artifact] 1-byte serialization artifact indicating that a type of extended serialization is being used for this transaction. Must always be <code>00</code> (see [[https://github.com/bitcoin/bips/blob/master/bip-0144.mediawiki|BIP144]]). (Note: Non-segwit nodes will only accept stripped segwit transactions, because the marker appears at the position where non-segwit nodes expect the input counter. The input counter may not be zero, so a complete (non-stripped) segwit transaction appears invalid to a non-segwit node.)

; outpoint
: [Component] Identifies the transaction output (TXO) being spent by a transaction input. Consists of a txid and output index. The txid is serialized in little-endian but displayed in big-endian.
I have seen prevout commonly used to refer to an outpoint.
: [Component] Identifies the transaction output (TXO) being spent by a transaction input. Consists of a txid and output index. The txid is serialized in little-endian but displayed in big-endian.
: [Component] Identifies the transaction output (TXO) being spent by a transaction input. Consists of a txid and output index. The txid is serialized in little-endian but displayed in big-endian.
: Synonym: <code>prevout</code>

We also introduce some umbrella terms, concepts, and ideas that are useful to describe aspects of transactions (labeled [Concept]).

===Anatomy of a serialized transaction===
I think this section would make more sense if it were put below "Definition of Terms" as an example of some of the things that were discussed in that section.
; output index
: [Component] Part of an outpoint. The position of the output in a transaction’s output list that created the identified TXO.
I have seen this referred to as vout.
; transaction input list
: [Concept] The enumeration of all transaction inputs of a transaction.
: Synonym: <code>tx_ins</code>, inputs
This is often called the vin.
; transaction output list
: [Concept] The enumeration of all transaction outputs of a transaction.
: Synonym: outputs, <code>tx_outs</code>
This is often called the vout.
** '''Transaction Header''' (cont.)
*** '''Lock Time''' <code>ffd30a00</code>: the 4-byte lock time field, little-endian for 709631

===Definition of Terms===
Strong Concept ACK! Thank you for volunteering to wade through all the bikeshedding in this noble endeavor :) Still reading and digesting but will leave feedback in the next few days.
Concept ACK
** '''Transaction Output List'''
*** Length of '''Transaction Output List''' <code>01</code>: serialization artifact, <code>CompactSize</code> here indicating 1 output, considered part of the Transaction Header
*** First '''Output'''
**** '''Amount''' <code>b4ba0e0000000000</code>: field defining that 965300 satoshi are assigned to this output
satoshi is used as the plural version here, but in other places the plural is written as satoshis.
Maybe the term ought to be defined separately as well? (E.g., a satoshi is the smallest denomination of a coin. Each coin represents 100,000,000 satoshis.)
: [Artifact] A serialization artifact indicating features used by the transaction. As of writing, the only allowed value is <code>01</code> which indicates that the transaction serialization has a witness structure (see [[https://github.com/bitcoin/bips/blob/master/bip-0144.mediawiki|BIP144]]).

; forwarding script
: [Concept] A collective term for scripts that redirect input validation to another script or data structure. Witness programs and P2SH Programs are forwarding scripts. Forwarding scripts make use of script templates that imply additional evaluation steps beyond the explicitly expressed conditions. In the case of P2SH, the output script in verbatim only implies that the redeem script must be the preimage of the hash in the output script, but the template prescribes that the redeem script must additionally be satisfied. For witness programs, the output script is even less verbose with more implied meaning.
the output script in verbatim only implies
I think just verbatim is more correct.
Maybe a better way of expressing this idea is:
"In the case of P2SH, the output script itself only explicitly specifies that the redeem script must be the preimage of the hash in the output script, but the template requires..."
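To make the "explicit versus implied" distinction concrete, here is a minimal Python sketch that recognizes the P2SH output script template (OP_HASH160, a 20-byte push, OP_EQUAL, as defined in BIP16). The function name `is_p2sh_program` and the all-zero hash are hypothetical and only for illustration. The sketch matches the template only; the implied step, evaluating the revealed redeem script, is not shown.

```python
OP_HASH160 = 0xA9  # opcode that hashes the top stack item with HASH160
OP_EQUAL = 0x87    # opcode that compares the two top stack items


def is_p2sh_program(output_script: bytes) -> bool:
    """Return True if the script matches the 23-byte P2SH template.

    Hypothetical helper: this only recognizes the template. A validator
    that sees this template must additionally evaluate the redeem script,
    which the script text itself never states.
    """
    return (
        len(output_script) == 23
        and output_script[0] == OP_HASH160
        and output_script[1] == 0x14  # push of exactly 20 bytes
        and output_script[22] == OP_EQUAL
    )


# Placeholder hash of 20 zero bytes, not taken from any real transaction.
example_script = bytes([OP_HASH160, 0x14]) + bytes(20) + bytes([OP_EQUAL])
print(is_p2sh_program(example_script))
```

Read as plain Script, these 23 bytes only compare a hash; everything else follows from the template being recognized.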
Concept ACK 👍👍👍
: [Component] Part of an outpoint. The position of the output in a transaction’s output list that created the identified TXO.

; output script
: [Component] Must be present in each transaction output. Contains either a condition script or a forwarding script. Originally, the Bitcoin code base used Hungarian notation. This field was presumably named <code>scriptPubKey</code> to refer to the transaction field of the type script that contained the public key. The use of this field had evolved beyond that interpretation even when Bitcoin was published since the field could already contain more complicated scripts. Today, this field is referred to as the output script.
The "presumably" bit doesn't seem right to me? I would have assumed the thinking was "okay, so we have an elliptic-curve public key to decide who can spend it, and then an ECDSA signature from that public key to actually spend it" "yeah, but what about generalising it for multisig or only-if-you-win-this-hand-of-poker or whatever?" "oh, cool cool cool. so let's have a scripting language so you can specify arbitrary conditions?" "and then we'll just turn both the pubkey and signature into scripts -- the scriptPubKey will be a public script that only the people allowed to spend can satisfy, and they'll do that with a script of their own which we'll call the scriptSig" "nice! you're so smart satoshi!" "thanks satoshi!".
That is, "scriptPubKey" is a script (that's its type) and it's also a "PubKey" (that's its purpose -- it's the public part that everyone can know, for which there's a corresponding private part that someone can use to generate a signature), it's not that the "scriptPubKey" is a script that happens to contain ECC pubkeys. That got changed with p2sh and segwit, where additional implicit constraints beyond what is specified in the script are imposed; but in that case "output script" isn't really an accurate description either for the same reason: p2sh, p2wpkh, p2wsh and p2tr scriptPubKeys aren't really scripts in any meaningful sense.
I'm not really sure it even makes sense to come up with a single term to describe all the current scriptPubKey possibilities -- p2pk, p2pkh, p2ms, p2anchor are all scripts, but p2sh, p2wsh are just pushing hashes of scripts, p2wpkh is a push of a pubkey hash, and p2tr pushes a direct pubkey that may also involve many hidden script hashes. Not all of those involve a "script", not all of them involve an ECC pubkey, p2anchor doesn't even put any constraints on spending. Maybe "spending constraint commitment" would be accurate? (The constraint itself is "a signature by this pubkey" or "give me a script that hashes to this, and then an initial stack for the script that results in successful execution")
I like your interpretation of the origin of scriptPubKey, perhaps it could be included like this? (I'm sure someone can write it more elegantly)
: [Component] Must be present in each transaction output. Contains either a condition script or a forwarding script. Originally, the Bitcoin code base used Hungarian notation. This field was presumably named <code>scriptPubKey</code> to refer to the transaction field of the type script that contained the public key. The use of this field had evolved beyond that interpretation even when Bitcoin was published since the field could already contain more complicated scripts. Today, this field is referred to as the output script.
: [Component] Must be present in each transaction output. Contains either a condition script or a forwarding script. Originally, the Bitcoin code base used Hungarian notation. This field was presumably named <code>scriptPubKey</code> as it was of the type script and either because it contained the public key or because it fulfilled the role of a public key. The use of this field had evolved beyond that interpretation even when Bitcoin was published since the field could already contain more complicated scripts. Today, this field is referred to as the output script.
However, "output script" is, in my opinion, by far the best name for this field, even for scripts with implied meaning ("forwarding scripts"), unless someone can convince me that this shouldn't be called a script:
OP_0 OP_PUSHBYTES_20 7de8b69f55bebe02f99f77b2c8f9848b915f1953
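For illustration, the script above can be parsed mechanically. The following Python sketch checks whether an output script has the witness program shape described in BIP141 (a version opcode followed by a single direct push of 2 to 40 bytes). The function name `parse_witness_program` is hypothetical; the hex string is the script quoted above, serialized as `00` (OP_0), `14` (OP_PUSHBYTES_20), and the 20-byte hash.

```python
from typing import Optional, Tuple


def parse_witness_program(output_script: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (witness version, program) if the script is a witness program.

    Hypothetical helper following the BIP141 shape: a version opcode
    (OP_0 or OP_1..OP_16) followed by one direct push of 2 to 40 bytes.
    """
    if not 4 <= len(output_script) <= 42:
        return None
    version_opcode = output_script[0]
    if version_opcode == 0x00:             # OP_0
        version = 0
    elif 0x51 <= version_opcode <= 0x60:   # OP_1 .. OP_16
        version = version_opcode - 0x50
    else:
        return None
    push_length = output_script[1]
    # The push must cover exactly the rest of the script.
    if not 2 <= push_length <= 40 or push_length != len(output_script) - 2:
        return None
    return version, output_script[2:]


script = bytes.fromhex("00147de8b69f55bebe02f99f77b2c8f9848b915f1953")
result = parse_witness_program(script)
print(result)
```

For this input the sketch yields version 0 and a 20-byte program, the shape of a P2WPKH output. Whether those bytes count as a script is the terminology question under discussion here; the parse itself is unambiguous.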
Concept ACK
Thanks for working on this!
Looking at the transaction dissection, I noticed that for some of the integer fields (e.g. nLockTime) the endianness used for serialization is included, while it's missing in others (Transaction Version, Output Index, Sequence, Amount). To be consistent, could either include it in all of them or remove it from all and write some general sentence that every integer field is serialized in little-endian, if not explicitly stated otherwise? IIUC that's the case and we don't ever serialize in big-endian (except for txids, where the concept of endianness IMHO doesn't make that much sense, see comment below).
Left two additional comments below, feel free to ignore. Only looked at the anatomy section so far, will continue with the definition of terms section soon.
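As a quick check of the little-endian integer fields mentioned above, this Python sketch decodes two values from the draft's transaction dissection: the lock time `ffd30a00` and the amount `b4ba0e0000000000`. The helper name `decode_le_uint` is hypothetical.

```python
def decode_le_uint(field_hex: str) -> int:
    """Interpret a hex-encoded field as an unsigned little-endian integer."""
    return int.from_bytes(bytes.fromhex(field_hex), byteorder="little")


lock_time = decode_le_uint("ffd30a00")         # 4-byte lock time field
amount = decode_le_uint("b4ba0e0000000000")    # 8-byte amount field

print(lock_time)  # 709631
print(amount)     # 965300
```

Both results match the values stated in the dissection, which is consistent with the suggestion that one general sentence about little-endian integer serialization could replace the per-field notes.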
*** Second '''Witness Stack''': The witness data corresponding to the second input.
**** Item count for the second '''Witness Stack''' <code>03</code>: serialization artifact, two or more Witness Items indicate a scriptpath spend.
**** First '''Witness Item'''
***** '''Length of the first Witness Item''' <code>40</code>: 64 bytes indicate a signature
nit: this could make readers think that signatures in Bitcoin are always 64 bytes long. Not sure if it's worth going into more detail here (e.g. by adding a footnote "note that for pre-taproot spends, signatures were larger and varied in size; only since segwit v1 do Schnorr signatures always have a fixed size of 64 bytes (or 65 bytes with a non-default sighash flag)"). Maybe just s/signature/Schnorr signature/ would already be a slight improvement?
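A small Python sketch of the point above: the length of a witness item is only a hint, and the 64-byte (or 65-byte) rule applies to Schnorr signatures specifically. The function name `plausible_schnorr_signature_length` is hypothetical, and a length match alone does not prove that an item is a signature.

```python
def plausible_schnorr_signature_length(witness_item: bytes) -> bool:
    """Heuristic only: Schnorr signatures in taproot spends are 64 bytes,
    or 65 bytes when a non-default sighash flag is appended."""
    return len(witness_item) in (64, 65)


# Placeholder items of the relevant lengths, not real signatures.
print(plausible_schnorr_signature_length(bytes(64)))  # True
print(plausible_schnorr_signature_length(bytes(65)))  # True
print(plausible_schnorr_signature_length(bytes(71)))  # False
```

The 71-byte case stands in for a pre-taproot signature, which is larger and varies in size, so it fails the check.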
*** Length of '''Transaction Input List''' <code>02</code>: serialization artifact, <code>CompactSize</code> here indicating 2 inputs, considered part of the transaction header
*** First '''Input'''
**** '''Outpoint'''
***** '''txid''' <code>7bc0bba407bc67178f100e352bf6e047fae4cbf960d783586cb5e430b3b700e7</code>: little-endian txhash indicating that the spent TXO was created by the transaction e700b7b330e4b56c5883d760f9cbe4fa47e0f62b350e108f1767bc07a4bbc07b.
pedantic mode re "little-endian": a txid is a hash result and hence just a sequence of bytes; the concept of endianness only becomes relevant if we ever want to treat it as an integer (which we do for block hashes w.r.t. PoW calculations, but AFAIK never for txids). It's just that for some reason txids are all shown in reverse when presented to / parsed from the user (see uint256.GetHex()/SetHex()), but on the wire the txid is serialized in the same order as the double-SHA256 hash result. So in my world-view, if we want to keep the endianness terminology, the way we store txids internally and on the wire would be "big-endian", and "little-endian" is used for displaying and parsing them. I wrote about other unfortunate consequences of that need for txid reversing a while ago here: bitcoin/bitcoin#24952 (comment).
I'd also mildly agree with the answer in https://bitcoin.stackexchange.com/questions/2063/why-does-the-bitcoin-protocol-use-the-little-endian-notation/2069#2069: "Hashes are defined by the standards as being big-endian, and crypto libraries deal with them in that form, so hashes are transmitted in big-endian. Bitcoin displays hashes in little-endian because Bitcoin sometimes considers hashes to be little-endian integers instead of strings." (that "sometimes" seems to be only for block hashes, as far as I'm aware)
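To make the byte-order point concrete, here is a minimal Python sketch using the txid from the diff excerpt above. The function names (`double_sha256`, `wire_to_display`, `display_to_wire`) are hypothetical helpers chosen for illustration, not part of the draft or of any library; the sketch only assumes that the displayed form is the byte-wise reversal of the serialized form, as described in the comment.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Return SHA256(SHA256(data)); the 32 bytes come out in 'wire' order."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def wire_to_display(txid_wire_hex: str) -> str:
    """Reverse the byte order of a serialized txid to get the displayed form."""
    return bytes.fromhex(txid_wire_hex)[::-1].hex()


def display_to_wire(txid_display_hex: str) -> str:
    """Reverse the byte order of a displayed txid to get the serialized form."""
    return bytes.fromhex(txid_display_hex)[::-1].hex()


# Outpoint txid bytes exactly as they appear in the serialized transaction.
WIRE = "7bc0bba407bc67178f100e352bf6e047fae4cbf960d783586cb5e430b3b700e7"

# The form a user would see or type, e.g. in a block explorer.
DISPLAY = wire_to_display(WIRE)
print(DISPLAY)
```

Reversal is its own inverse, so the same operation converts in both directions; the two named helpers exist only to make call sites self-describing.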
Would it be out of scope to describe attributes that pertain to an entire block, and not a specific transaction? In particular, the peculiarities around the |
Nice BIP! Is this mainly intended for human language rather than naming things in libraries of various programming languages? It might be nicer to make some of the names shorter for programming. Or perhaps have alternative names used in programming. For instance "output index" is a good name to use without context since "index" alone would be confusing but within context "index" can be better. e.g. This is also not including block components. Is that explicitly out of scope? Might be nice to have that also covered. |
*** Length of '''Transaction Input List''' <code>02</code>: serialization artifact, <code>CompactSize</code> here indicating 2 inputs, considered part of the transaction header
*** First '''Input'''
**** '''Outpoint'''
***** '''txid''' <code>7bc0bba407bc67178f100e352bf6e047fae4cbf960d783586cb5e430b3b700e7</code>: little-endian txhash indicating that the spent TXO was created by the transaction e700b7b330e4b56c5883d760f9cbe4fa47e0f62b350e108f1767bc07a4bbc07b.
Shouldn't this be "Transaction ID"? Stands out as the only acronym here.
Seems like some names could have a "short-name" etc. E.g. "Transaction ID" ( |
Strong concept ACK
I updated 400 pages of documentation to use the terms suggested in this document and didn't encounter anything objectionable. A few open questions I had are mentioned below.
Although I may have been the one to suggest to @xekyo to add synonyms, I think it might be worth moving the Synonyms lists to a table on the Bitcoin.it wiki (or some other user-editable location). That way adding new synonyms won't require editing this BIP after it is finalized. Additionally, a list of synonyms might be a good place to put common short aliases for programmers as mentioned by @Kixunil in #1 (comment) Even from the original draft of what became BIP125, we included a link to a wiki page for attracting user-contributed information about replacement policies and support, so that sort of forward outsourcing has been done before.
*** First '''Output'''
**** '''Amount''' <code>b4ba0e0000000000</code>: field defining that 965300 satoshi are assigned to this output
**** '''Output Script (scriptPubKey)'''
***** Length of the '''scriptPubKey''' <code>16</code>: serialization artifact, here instructing the interpreter to read 22 bytes
Why is scriptPubKey offered in parentheses as an alias for output script and used on the following line when we don't follow a similar convention for input scripts (scriptSigs)?
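For readers following the excerpt above, the values quoted for the amount and the length prefixes can be reproduced with a short Python sketch. `read_compact_size` is a hypothetical helper written for this illustration; it assumes the usual CompactSize encoding (a single byte below 0xfd, otherwise a marker byte followed by a 2-, 4- or 8-byte little-endian integer).

```python
from typing import Tuple


def read_compact_size(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode a CompactSize at `offset`; return (value, offset after it)."""
    first = data[offset]
    if first < 0xFD:
        # Values below 0xfd are stored directly in a single byte.
        return first, offset + 1
    # Marker byte selects the width of the little-endian integer that follows.
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    start = offset + 1
    value = int.from_bytes(data[start:start + width], "little")
    return value, start + width


# Length of the transaction input list: `02` means 2 inputs.
input_count, _ = read_compact_size(bytes.fromhex("02"))

# Length of the output script: `16` (hex) means 22 bytes follow.
script_len, _ = read_compact_size(bytes.fromhex("16"))

# The amount is a fixed-width 8-byte little-endian integer, not a CompactSize.
amount_sat = int.from_bytes(bytes.fromhex("b4ba0e0000000000"), "little")

print(input_count, script_len, amount_sat)
```

The sketch does no bounds checking or canonical-encoding checks; a real parser would need both.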
; P2SH program
: [Concept] A script template that forwards input validation to the redeem script. P2SH programs are a type of forwarding script.

; redeem script
Should this be "redeem script" or "redemption script"? The original version of BIP16 (pre-fork) uses "redeemScript", but a post-fork coda added after the discovery of an unintended consequence (the 520 byte limit) uses "redemption script".
To me, "redemption script" feels more in line with this document's focus on proper English phrasing for terms; however, "redeem script" is probably by far the more widely used term at this point, given that Bitcoin Core's API calls it "redeemScript". I don't have a preference either way.
; input index
: [Concept] The position of an input in a transaction’s input list.

; input script
What do we call the coinbase field? In Bitcoin Core's API, a regular transaction's scriptSig is what this document calls an input script:
    $ bitcoin-cli getblock $( bitcoin-cli getbestblockhash ) 2 | jq '.tx[1].vin[0].scriptSig'
    {
      "asm": "",
      "hex": ""
    }
But in that same API, the field at the same position in the first transaction of a block is called a coinbase field:
    $ bitcoin-cli getblock $( bitcoin-cli getbestblockhash ) 2 | jq '.tx[0].vin[0].coinbase'
    "03b1270c04f4119f642f466f756e6472792055534120506f6f6c202364726f70676f6c642f2d7cf2e20529010000000000"
My preference would be to call that special field the "coinbase" field (like Bitcoin Core does), which follows historical convention and helps people understand why we call that transaction the coinbase transaction.
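As an illustration of how code consuming that JSON might carry the distinction through, here is a small Python sketch. The function `classify_input_field` and the labels it returns are hypothetical; the sketch only assumes the two `vin` shapes shown in the commands above (a `coinbase` hex string for the first transaction's input, a `scriptSig` object with `asm` and `hex` otherwise).

```python
from typing import Dict, Tuple


def classify_input_field(vin: Dict) -> Tuple[str, str]:
    """Return (term, hex) for the script-position field of a decoded input."""
    if "coinbase" in vin:
        # Coinbase inputs expose the field under the "coinbase" key.
        return "coinbase", vin["coinbase"]
    # Regular inputs expose it as the scriptSig object.
    return "input script", vin.get("scriptSig", {}).get("hex", "")


# Shapes copied from the two jq outputs above.
regular_vin = {"scriptSig": {"asm": "", "hex": ""}}
coinbase_vin = {
    "coinbase": "03b1270c04f4119f642f466f756e6472792055534120506f6f6c"
                "202364726f70676f6c642f2d7cf2e20529010000000000"
}

print(classify_input_field(regular_vin))
print(classify_input_field(coinbase_vin)[0])
```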
This document proposes a set of terms for referring to the various aspects and components of transactions.
TODOs:
varInt vs compactSize
varInt
ScriptCode