BIP drafts: SwiftSync Specification#2152
BIP drafts: SwiftSync Specification#2152rustaceanrob wants to merge 3 commits intobitcoin:masterfrom
Conversation
murchandamus
left a comment
There was a problem hiding this comment.
Just a quick first glance, but could you please break your text into shorter lines? That makes it easier to leave review and track what changed between commits. Either 100 or 120 characters per line seems to work well enough.
| Status: Draft | ||
| Type: Specification | ||
| Assigned: ? | ||
| License: BSD-3-Clause |
There was a problem hiding this comment.
Could you please add links to the relevant discussions on the mailing list and Delving on this document and the others?
| License: BSD-3-Clause | |
| License: BSD-3-Clause | |
| Discussion: https://groups.google.com/g/bitcoindev/c/FpSWUxItXQs/m/pnfjP6rFCgAJ |
| License: BSD-3-Clause | ||
| Requires: BIP ?, BIP ? | ||
| ``` | ||
| # Abstract |
There was a problem hiding this comment.
IIRC, Markdown only allows a single first-level header per document, so you probably want to bump all of the headers up one level.
| # Abstract | |
| ## Abstract |
|
FWIW, I don't mind the unbroken lines and even prefer them. Avoids rejigging line lengths to keep them consistent when updating or having lines with very different lengths. |
danielabrozzoni
left a comment
There was a problem hiding this comment.
I did an initial pass and left some comments. I read the BIPs in the commit order (block undo -> histfile -> swiftsync) and it was pretty easy to follow.
| # Abstract | ||
|
|
||
| Inputs of a Bitcoin block are referenced by the outpoint data structure. This poses a limitation during initial block download (IBD), such that a client must process blocks sequentially to validate the chain history. The SwiftSync protocol allows blocks to be evaluated in arbitrary order, however additional data is required that must be served over the peer-to-peer network. Namely, the creation height, coinbase flag, input script, and amount for each spent coin must be accessible to fully validate a block in a state-less manner. This data cannot be trusted by a peer under usual conditions, however SwiftSync allows a client performing IBD to validate the correctness of this data. | ||
| # Motivation |
There was a problem hiding this comment.
The motivation is a bit repetitive with what's in the abstract, and there are a lot of technical details. I would reshape this a little in order to give a stronger motivation ("IBD is slow because it must be done sequentially, SwiftSync speeds this up by removing this limitation")
|
|
||
| # Abstract | ||
|
|
||
| Inputs of a Bitcoin block are referenced by the outpoint data structure. This poses a limitation during initial block download (IBD), such that a client must process blocks sequentially to validate the chain history. The SwiftSync protocol allows blocks to be evaluated in arbitrary order, however additional data is required that must be served over the peer-to-peer network. Namely, the creation height, coinbase flag, input script, and amount for each spent coin must be accessible to fully validate a block in a state-less manner. This data cannot be trusted by a peer under usual conditions, however SwiftSync allows a client performing IBD to validate the correctness of this data. |
There was a problem hiding this comment.
While reading the spec we notice that the bip is about sharing undo data, but that was not clear from the abstract.
I would add a sentence along the lines of "This document describes how to share [this additional data] over the peer to peer network"
| ## Data structures | ||
| #### Height Code | ||
|
|
||
| The height and coinbase flag are encoded as a 32 bit integer. To encode the height and flag, binary left shift the height one bit, treat the boolean as a bit, insert it into the newly opened bit position. To decode the height, binary right shift the code. To decode the coinbase flag, mask the first bit position of the header code and interpret the bit as a boolean. |
There was a problem hiding this comment.
"To encode the height and flag, binary left shift the height one bit, treat the boolean as a bit, insert it into the newly opened bit position."
I would be slightly more explicit and say "the coinbase flag" instead of "the boolean"
| | `len` | `CompactSize(Len(vector<Coins>))` | The length of the coins vector | | ||
| | `coins` | `vector<Coin>` | Coins that were spent, after filtering on the request `cutoff` | | ||
|
|
||
| A client supporting the `bspent` MUST include coins created _before_ the `cutoff` field in `getbspent` requests. A client receiving a `bspent` message with un-requested or missing coins MUST disconnect from the serving peer. A client supporting `bspent` MUST adhere to the format of `Coin` specified in the `verion` of the request. |
| - `[]byte`: arbitrary byte vector | ||
| - `<N bytes>`: size `N` byte vector | ||
| - `vector<Foo>`: arbitrary sized vector of `Foo` |
There was a problem hiding this comment.
nit: I was a bit confused at first, I would reshape this a little to make it clearer:
[]byte: arbitrary sequence of bytes with no fixed length<N bytes>: byte vector of sizeN, whereNis specified inline (e.g. <32 bytes>)vector<Foo>: vector of arbitrary length of elements of typeFoo
| - `<N bytes>`: size `N` byte vector | ||
| - `vector<Foo>`: arbitrary sized vector of `Foo` | ||
| - `CompactSize`: encoding of unsigned integers defined in peer-to-peer messages | ||
| - `CompressAmount`: defined in the Function Appendix section |
There was a problem hiding this comment.
nit: I would be a bit more explicit here:
CompressAmount: compression function for integer amounts, as defined in the Function Appendix section
| | `0x07` | `P2WPKH` | `<20 bytes>` | `OP_0 20 <20 bytes>` | | ||
| | `0x08` | `P2TR` | `<32 byte X-only public key>` | `OP_1 32 <32 bytes>` | | ||
|
|
||
| Scripts are serialized in this format by concatenating the `Prefix` and `Format` fields. |
There was a problem hiding this comment.
I would move this above the table, I kept wondering how to encode until I read this :)
| | Field | Type | Serialization | Description | | ||
| | :----------------------- | :---------------------------- | ------------- | ----------------------------------------------------------------------------------------------------------------- | | ||
| | Input index | 32 bit unsigned integer | Little endian | The index in the block inputs for which this coin corresponds. The coinbase inputs are _excluded_ from this index | | ||
| | Height and coinbase flag | Height code | Little endian | | |
There was a problem hiding this comment.
I think this should have "Defined above" in "Serialization"
| A current limitation of IBD is that it must be done sequentially. This is a result of the height, coinbase flag, input script, and amount of the block inputs being omitted from the data committed to by proof of work in the current block, and, thus, this data cannot be trusted if received over the wire naively. Using the SwiftSync protocol, a client is able to verify the correctness of this data, even if served by a potentially untrusted party. This is a property of the SwiftSync hash aggregate, which commits to the height, coinbase flag, script, and amount when creating and deleting coins. | ||
| # Specification | ||
|
|
||
| In Bitcoin Core, to roll-back the chain state in the event of a block reorganization, the height, coinbase flag, script and amount metadata for each input of a block are stored in a data structure known colloquially as "undo data". This terminology stems from its use to "undo" the effect of a block by repopulating the UTXO set with the coins that were spent by the reorganized block. To remain general in language, this data will be referred as "spent coins." |
There was a problem hiding this comment.
nit: it took me a while to figure out that the coinbase flag is needed in order to check that a transaction is not spending an immature coinbase... It would be nice to have this specified somewhere to avoid confusion, but I'm not sure if it really fits here, so, as you wish
jurraca
left a comment
There was a problem hiding this comment.
some writing nits but overall the concept is clear enough.
|
|
||
| # Abstract | ||
|
|
||
| Inputs of a Bitcoin block are referenced by the outpoint data structure. This poses a limitation during initial block download (IBD), such that a client must process blocks sequentially to validate the chain history. The SwiftSync protocol allows blocks to be evaluated in arbitrary order, however additional data is required that must be served over the peer-to-peer network. Namely, the creation height, coinbase flag, input script, and amount for each spent coin must be accessible to fully validate a block in a state-less manner. This data cannot be trusted by a peer under usual conditions, however SwiftSync allows a client performing IBD to validate the correctness of this data. |
There was a problem hiding this comment.
however additional data is required that must be served over the peer-to-peer network
I would combine this and the next sentence so it's clear what the data is required for at the end of this sentence, e.g. "... in order to validate the blocks, namely the creation height, coinbase flag...".
This data cannot be trusted by a peer under usual conditions
"under usual conditions" doesn't provide much value here, imo it's either trusted or validated, in this case it can't be trusted when served by a peer.
| # Abstract | ||
|
|
||
| Inputs of a Bitcoin block are referenced by the outpoint data structure. This poses a limitation during initial block download (IBD), such that a client must process blocks sequentially to validate the chain history. The SwiftSync protocol allows blocks to be evaluated in arbitrary order, however additional data is required that must be served over the peer-to-peer network. Namely, the creation height, coinbase flag, input script, and amount for each spent coin must be accessible to fully validate a block in a state-less manner. This data cannot be trusted by a peer under usual conditions, however SwiftSync allows a client performing IBD to validate the correctness of this data. | ||
| # Motivation |
There was a problem hiding this comment.
The Abstract and Motivation repeat similar ideas. imo the Abstract part should focus more on what this proposal provides, instead of the reasoning and context that led to it. Could also merge them into one section if you want.
|
|
||
| In Bitcoin Core, to roll-back the chain state in the event of a block reorganization, the height, coinbase flag, script and amount metadata for each input of a block are stored in a data structure known colloquially as "undo data". This terminology stems from its use to "undo" the effect of a block by repopulating the UTXO set with the coins that were spent by the reorganized block. To remain general in language, this data will be referred as "spent coins." | ||
|
|
||
| Bitcoin Core full archival nodes store spent coins for all blocks. This is useful in the context of SwiftSync, as no additional index must be created or maintained to serve this data to peers. There are, however, some discrepancies between how this data is serialized on disk in Bitcoin core and how this proposal seeks to serialize this data over the peer-to-peer protocol, which are detailed in the rationale section. |
SwiftSync is a protocol for clients to parallelize initial block download, based on the original writeup.