Dev Docs: Describe P2P Messages That Request Or Reply With Data #642

Merged
harding merged 4 commits into bitcoin-dot-org:master from harding:data-messages on Nov 16, 2014

2 participants
Contributor

harding commented Nov 12, 2014

P2P Reference Preview: http://dg0.dtrt.org/en/developer-reference#p2p-network
P2P Examples Preview: http://dg0.dtrt.org/en/developer-examples#p2p-network

Adds to the devel reference page detailed documentation on the following messages: block, getblocks, getdata, getheaders, headers, inv, mempool, merkleblock, notfound, and tx.

Adds to the devel examples page an example of requesting and parsing a merkleblock message.

Adds to the devel docs overview pages links to the above two new P2P sections.

Tweaks the autocrossref plugin ignore pattern to not crossref in the middle of a GIF image name; this allows the inclusion of animated GIFs.

Note: I'm working on documenting the remaining P2P messages in a separate branch.

harding and others added some commits Oct 28, 2014

Dev Docs: Add P2P Messages That Request Or Reply With Data

@saivann saivann commented on an outdated diff Nov 15, 2014

_includes/ref_p2p_networking.md
+{% endhighlight %}
+
+{% endautocrossref %}
+
+### Data Messages
+
+{% autocrossref %}
+
+The following network messages all request or provide data related to
+transactions and blocks.
+
+![Overview Of P2P Protocol Data Request And Reply Messages](/img/dev/en-p2p-data-messages.svg)
+
+Many of the data messages use
+[inventories][inventory]{:#term-inventory}{:.term} as unique identifiers for
+for transactions and blocks. Inventories have a simple 36-byte
@saivann

saivann Nov 15, 2014

Contributor

s/for for transactions/for transactions/

@saivann saivann commented on an outdated diff Nov 15, 2014

_includes/ref_p2p_networking.md
+The following annotated hexdump shows a `headers` message. (The message
+header has been omitted.)
+
+{% highlight text %}
+01 ................................. Header count: 1
+
+02000000 ........................... Block version: 2
+b6ff0b1b1680a2862a30ca44d346d9e8
+910d334beb48ca0c0000000000000000 ... Hash of previous block's header
+9d10aa52ee949386ca9385695f04ede2
+70dda20810decd12bc9b048aaab31471 ... Merkle root
+24d95a54 ........................... Unix time: 1415239972
+30c31b18 ........................... Target (bits)
+fe9f0864 ........................... Nonce
+
+00 ......... Transaction Count (0x00)
@saivann

saivann Nov 15, 2014

Contributor

Small alignment issue, should probably be:
00 ................................. Transaction Count (0x00)
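
For readers following the hexdump in this thread, here is a minimal Python sketch of decoding it. It assumes the payload carries exactly one header and that the count fits in a single compactSize byte. The function name `parse_single_header` and the dictionary keys are illustrative only and do not come from any Bitcoin library.

```python
import struct

# Bytes copied from the annotated `headers` hexdump quoted above
# (the message header is omitted, as in the original).
PAYLOAD = bytes.fromhex(
    "01"                                # header count
    "02000000"                          # block version
    "b6ff0b1b1680a2862a30ca44d346d9e8"
    "910d334beb48ca0c0000000000000000"  # hash of previous block's header
    "9d10aa52ee949386ca9385695f04ede2"
    "70dda20810decd12bc9b048aaab31471"  # merkle root
    "24d95a54"                          # Unix time
    "30c31b18"                          # target (bits)
    "fe9f0864"                          # nonce
    "00"                                # transaction count
)


def parse_single_header(payload):
    """Decode a `headers` payload that carries exactly one block header."""
    count = payload[0]  # assumption: count < 0xfd, so compactSize is one byte
    if count != 1:
        raise ValueError("this sketch only handles a single header")
    # The 80-byte header is little-endian: version, two 32-byte hashes,
    # then time, bits, and nonce as 32-bit unsigned integers.
    version, prev_hash, merkle_root, timestamp, bits, nonce = struct.unpack_from(
        "<I32s32sIII", payload, 1
    )
    return {
        "count": count,
        "version": version,
        "prev_hash": prev_hash,
        "merkle_root": merkle_root,
        "time": timestamp,
        "bits": bits,
        "nonce": nonce,
        "tx_count": payload[81],  # trailing byte after the 80-byte header
    }


header = parse_single_header(PAYLOAD)
print(header["version"], header["time"], header["tx_count"])
```

The decoded time agrees with the "Unix time: 1415239972" annotation in the hexdump.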

@saivann saivann commented on an outdated diff Nov 15, 2014

_includes/ref_p2p_networking.md
+de55ffd709ac1f5dc509a0925d0b1fc4
+42ca034f224732e429081da1b621f55a ... Hash (TXID)
+
+01000000 ........................... Type: MSG_TX
+91d36d997037e08018262978766f24b8
+a055aaf1d872e94ae85e9817b2c68dc7 ... Hash (TXID)
+{% endhighlight %}
+
+{% endautocrossref %}
+
+#### MemPool
+
+{% autocrossref %}
+
+The `mempool` message requests the TXIDs of transactions that the
+receiving node has verified are valid but which have not yet appeared in
@saivann

saivann Nov 15, 2014

Contributor

s/verified are valid/verified as valid/, maybe?
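
As a side note on the `inv` hexdump quoted in this thread, each inventory entry is 36 bytes: a 4-byte little-endian type followed by a 32-byte hash. The sketch below decodes the second entry shown above. The names `parse_inventory` and `INVENTORY_TYPES` are hypothetical, and the type name `MSG_MERKLEBLOCK` follows the wording used in this pull request.

```python
import struct

# Type identifiers; MSG_TX = 1 matches the "01000000 ... Type: MSG_TX" line above.
INVENTORY_TYPES = {1: "MSG_TX", 2: "MSG_BLOCK", 3: "MSG_MERKLEBLOCK"}

# One inventory entry copied from the quoted hexdump.
ENTRY = bytes.fromhex(
    "01000000"                          # type
    "91d36d997037e08018262978766f24b8"
    "a055aaf1d872e94ae85e9817b2c68dc7"  # hash (TXID), internal byte order
)


def parse_inventory(entry):
    """Split a 36-byte inventory entry into (type name, raw 32-byte hash)."""
    if len(entry) != 36:
        raise ValueError("an inventory entry is exactly 36 bytes")
    type_id, raw_hash = struct.unpack("<I32s", entry)
    return INVENTORY_TYPES.get(type_id, "unknown"), raw_hash


inv_type, inv_hash = parse_inventory(ENTRY)
print(inv_type, inv_hash.hex())
```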

@saivann saivann commented on an outdated diff Nov 15, 2014

_includes/ref_p2p_networking.md
+{% endautocrossref %}
+
+#### MerkleBlock
+
+{% autocrossref %}
+
+The `merkleblock` message is a reply to a `getdata` message which
+requested a block using the inventory type `MSG_MERKLEBLOCK`. It is
+only part of the reply: if any matching transactions are found, they will
+be sent separately as `tx` messages.
+
+This message is part of the bloom filters described in BIP37, added in
+protocol version 70001 and implemented in Bitcoin Core 0.8.0
+(February 2013).
+
+If a filter has been previous set with the `filterload` message, the
@saivann

saivann Nov 15, 2014

Contributor

s/previous set/previously set/, maybe?

@saivann saivann commented on an outdated diff Nov 15, 2014

_includes/ref_p2p_networking.md
+43b1c1ce3d248cbfc6c15870f6c5daa2 ... Hash #1
+019f5b01d4195ecbc9398fbf3c3b1fa9
+bb3183301d7a1fb3bd174fcfa40a2b65 ... Hash #2
+41ed70551dd7e841883ab8f0b16bf041
+76b7d1480e4f0af9f3d4c3595768d068 ... Hash #3
+20d2a7bc994987302e5b1ac80fc425fe
+25f8b63169ea78e68fbaaefa59379bbf ... Hash #4
+
+01 ................................. Flag bytes: 1
+1d ................................. Flags: 1 0 1 1 1 0 0 0
+{% endhighlight %}
+
+Note: when fully decoded, the above `merkleblock` message provided the
+TXID for a single transaction that matched the filter. In the network
+traffic dump this output was taken from, the full transaction belonging
+to that TXID was sent immediately after the the `merkleblock` message as
@saivann

saivann Nov 15, 2014

Contributor

s/after the the/after the/
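
The "Flags: 1 0 1 1 1 0 0 0" annotation for the byte `1d` in the hexdump above reflects that flag bits are read least significant bit first within each byte. A short Python sketch of that decoding follows; the function name `decode_flag_bits` is illustrative only.

```python
def decode_flag_bits(flag_bytes):
    """Expand flag bytes into a list of bits, least significant bit first."""
    return [(byte >> bit) & 1 for byte in flag_bytes for bit in range(8)]


flags = decode_flag_bits(bytes.fromhex("1d"))
print(flags)
```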

@saivann saivann commented on an outdated diff Nov 15, 2014

_includes/ref_p2p_networking.md
+present) and further descendants as necessary.
+
+However, if you find a node whose left and right children both have the
+same hash, fail. This is related to CVE-2012-2459.
+
+Continue descending and ascending until you have enough information to
+obtain the hash of the merkle root node. If you run out of flags or
+hashes before that condition is reached, fail. Then perform the
+following checks (order doesn't matter):
+
+* Fail if there are unused hashes in the hashes list.
+
+* Fail if there are unused flag bits---except for the minimum number of
+ bits necessary to pad up to the next full byte.
+
+* Fail unless the hash of the merkle root node is identical to the
@saivann

saivann Nov 15, 2014

Contributor

Feel free to ignore this one; it's just a small suggestion. It seems a bit clearer to me to reverse the negation so that it is consistent with the other sentences:

"Fail if the hash of the merkle root node is not identical to the merkle root in the block header."

@saivann saivann and 1 other commented on an outdated diff Nov 15, 2014

_includes/ref_p2p_networking.md
+ancestor nodes.
+
+![Example Of Creating A MerkleBlock Message](/img/dev/animated-en-merkleblock-creation.gif)
+
+Start processing the tree with the merkle root node. The table below
+describes how to process both TXID nodes and non-TXID nodes based on
+whether the node is a match, a match ancestor, or neither a match nor a
+match ancestor.
+
+| | TXID Node | Non-TXID Node
+|--------------------------------------|------------------------------------------------------------------------|----
+| **Neither Match Nor Match Ancestor** | Append a 0 to the flag list; append this node's TXID to the hash list. | Append a 0 to the flag list; append this node's hash to the hash list. Do not descend into its child nodes.
+| **Match Or Match Ancestor** | Append a 1 to the flag list; append this node's TXID to the hash list. | Append a 1 to the flag list; process the left child node. Then, if the node has a right child, process the right child. Do not append a hash to the hash list for this node.
+
+Any time you descend into a node for the first time, a flag should be
+appended to the flag list. Never put a flag on the list at any other
@saivann

saivann Nov 15, 2014

Contributor

To my understanding, given that you need to put a flag when going from the left child to the right child, would it be more accurate to say the following?

"Never put a flag on the list when ascending to a previously processed node"

@harding

harding Nov 15, 2014

Contributor

Yes, when descending into the right child, you will need to put a flag on the list. Also, "Never put a flag..." is a correct statement.

However, the current statement looks accurate to me, so I'm not sure that the revised statement is more accurate. I'm guessing something in the current statement looked confusing to you---could you point that out for me? (Is it just that the current statement has the subjunctive clause? "except when...")

@saivann

saivann Nov 15, 2014

Contributor

Maybe it's just the way I read "descending" (which I interpret as going from the parent to the child, not going from the left node to the right node).

@harding

harding Nov 15, 2014

Contributor

Oh, hmm. I guess I conceptualized things the way the animated GIF shows them, with little arrows showing:

  1. going down into the left child
  2. going back up into the parent
  3. going down into the right child

How about if we just s/descend into/begin processing/ in the text so it reads:

Any time you begin processing a node for the first time, a flag should be
appended to the flag list. Never put a flag on the list at any other
time, except when processing is complete to pad out the flag list to a
byte boundary.

If that sounds good to you, I'll probably also see about making the same change in the parsing instructions.

@saivann

saivann Nov 15, 2014

Contributor

@harding Sounds perfect to me!
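
The creation rules discussed in this thread (one flag each time a node is first processed, a hash only for TXID nodes and for nodes that are neither a match nor a match ancestor) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the Bitcoin Core implementation: `build_partial_tree` and its arguments are hypothetical names, the TXID list must be non-empty, and TXIDs are raw 32-byte values in internal byte order.

```python
import hashlib


def dsha256(data):
    """Double SHA-256, as used for merkle tree nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def build_partial_tree(txids, matched):
    """Return (hashes, flag_bytes, merkle_root) for a merkleblock message."""
    # Build every level of the full tree; level 0 holds the TXIDs.
    levels = [list(txids)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([
            # A node without a right sibling is paired with itself.
            dsha256(cur[i] + (cur[i + 1] if i + 1 < len(cur) else cur[i]))
            for i in range(0, len(cur), 2)
        ])
    # A node is a match ancestor if any TXID beneath it matched.
    match_levels = [[txid in matched for txid in txids]]
    for level in levels[1:]:
        below = match_levels[-1]
        match_levels.append(
            [any(below[2 * i:2 * i + 2]) for i in range(len(level))]
        )

    flags, hashes = [], []

    def process(height, pos):
        is_match = match_levels[height][pos]
        flags.append(1 if is_match else 0)  # one flag per node, on first visit
        if height == 0 or not is_match:
            hashes.append(levels[height][pos])
            return  # do not descend any further
        process(height - 1, 2 * pos)
        if 2 * pos + 1 < len(levels[height - 1]):
            process(height - 1, 2 * pos + 1)

    process(len(levels) - 1, 0)
    # Pack flags least significant bit first, zero-padded to a byte boundary.
    flag_bytes = bytearray((len(flags) + 7) // 8)
    for i, bit in enumerate(flags):
        flag_bytes[i // 8] |= bit << (i % 8)
    return hashes, bytes(flag_bytes), levels[-1][0]


# Four made-up TXIDs; only the third one matches the filter.
txids = [bytes([n]) * 32 for n in range(1, 5)]
hashes, flag_bytes, root = build_partial_tree(txids, {txids[2]})
print(len(hashes), flag_bytes.hex())
```

In this example the flag list is 1, 0, 1, 1, 0 (root, left subtree skipped, right subtree entered, matching TXID, non-matching TXID), which packs into the single byte `0d`.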

@saivann saivann and 1 other commented on an outdated diff Nov 15, 2014

_includes/ref_p2p_networking.md
+
+Any time you descend into a node for the first time, a flag should be
+appended to the flag list. Never put a flag on the list at any other
+time, except when processing is complete to pad out the flag list to a
+byte boundary.
+
+When processing a child node, you may need to process its children (the
+grandchildren of the original node) or further-descended nodes before
+returning to the parent node. This is expected---keep processing depth
+first until you reach a TXID node or a node which is neither a TXID nor
+a match ancestor.
+
+After you process a TXID node or a node which is neither a TXID nor a
+match ancestor, stop processing and begin to ascend the tree until you
+find a node with a right child you haven't processed yet. Descend into
+that right child and begin processing again.
@saivann

saivann Nov 15, 2014

Contributor

I tend to read this one as "begin processing after descending into the right child", in other words, ignore the right node. Would it be clearer to say the following?

...until you find a node with a right child you haven’t processed yet and begin processing again.

@harding

harding Nov 15, 2014

Contributor

We already have an "and" in that sentence, so I'm changing the final sentence in that paragraph to: "Descend into that right child and process it."

@saivann

saivann Nov 15, 2014

Contributor

Makes sense, thanks!

@saivann saivann commented on an outdated diff Nov 15, 2014

_includes/ref_p2p_networking.md
+
+Note: the receiving peer itself may respond with an `inv` message
+containing header hashes of stale blocks. It is up to the requesting
+peer to poll all of its peers to find the best block chain.
+
+If the receiving peer does not find a common header hash within the
+list, it will assume the last common block was the genesis block (block
+zero), so it will reply with in `inv` message containing header hashes
+starting with block one (the first block after the genesis block).
+
+| Bytes | Name | Data Type | Description
+|----------|----------------------|------------------|----------------
+| 4 | version | uint32_t | The protocol version number; the same as sent in the `version` message.
+| *Varies* | hash count | compactSize uint | The number of header hashes provided not including the stop hash. There is no limit except that the byte size of the entire message must be below the [`MAX_SIZE`][max_size] limit; typically from 1 to 200 hashes are sent.
+| *Varies* | block header hashes | char[32] | One or more block header hashes (32 bytes each) in internal byte order. Hashes should be provided in reverse order of block height, so highest-height hashes are listed first and lowest-height hashes are listed last.
+| 32 | stop hash | char[32] | The header hash of the last header hash being requested; set to all zeroes to request an `inv` message with 500 header hashes (the maximum which will ever be sent as a reply to this message).
@saivann

saivann Nov 15, 2014

Contributor

If I understand correctly, would it be more accurately phrased this way?

"The header hash of the last header hash being requested; set to all zeroes to request an inv message with all subsequent header hashes (the maximum which will ever be sent as a reply to this message will be 500 header hashes)."

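To make the field table in this thread concrete, the following Python sketch serializes a `getblocks` payload in the order listed: version, hash count, block header hashes, stop hash. The helper names `compact_size` and `build_getblocks_payload` are illustrative, and the hashes used are placeholder bytes rather than real header hashes.

```python
import struct


def compact_size(n):
    """Encode a non-negative integer as a compactSize uint."""
    if n < 0xFD:
        return struct.pack("<B", n)
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def build_getblocks_payload(version, header_hashes, stop_hash=bytes(32)):
    """Serialize a getblocks payload (without the message header).

    header_hashes: 32-byte hashes in internal byte order, highest block first.
    stop_hash: all zeroes asks for as many header hashes as the peer will send.
    """
    for h in list(header_hashes) + [stop_hash]:
        if len(h) != 32:
            raise ValueError("every hash must be exactly 32 bytes")
    return (
        struct.pack("<I", version)
        + compact_size(len(header_hashes))  # count excludes the stop hash
        + b"".join(header_hashes)
        + stop_hash
    )


# Placeholder hashes for illustration only.
payload = build_getblocks_payload(70001, [b"\xaa" * 32, b"\xbb" * 32])
print(len(payload))
```

With two hashes the payload is 4 + 1 + 64 + 32 = 101 bytes.
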
@saivann saivann commented on the diff Nov 15, 2014

_includes/ref_p2p_networking.md
+![Example Of Parsing A MerkleBlock Message](/img/dev/animated-en-merkleblock-parsing.gif)
+
+Keep the hashes and flags in the order they appear in the `merkleblock`
+message. When we say "next flag" or "next hash", we mean the next flag
+or hash on the list, even if it's the first one we've used so far.
+
+Start with the merkle root node and the first flag. The table below
+describes how to evaluate a flag based on whether the node being
+processed is a TXID node or a non-TXID node. Once you apply a flag to a
+node, never apply another flag to that same node or reuse that same
+flag again.
+
+| Flag | TXID Node | Non-TXID Node
+|-------|------------------------------------------------------------------------------------------|----
+| **0** | Use the next hash as this node's TXID, but this transaction didn't match the filter. | Use the next hash as this node's hash. Don't process any descendant nodes.
+| **1** | Use the next hash as this node's TXID, and mark this transaction as matching the filter. | The hash needs to be computed. Process the left child node to get its hash; process the right child node to get its hash; then concatenate the two hashes as 64 raw bytes and hash them to get this node's hash.
@saivann

saivann Nov 15, 2014

Contributor

Given that one may need to get the hashes of additional children, would it be more accurate to say the following?

The hash needs to be computed. Process the child nodes to get its hash, starting with the left child node. Once you have the hashes of the two child nodes, concatenate the two hashes as 64 raw bytes and hash them to get this node’s hash.

@harding

harding Nov 15, 2014

Contributor

@saivann Hmm, it feels better to me to use the table to define the conditionals the way you might write a recursive function rather than use it to explain how the function will look as it operates over the tree. For example:

process(node) {
  if (flag == 1 && type == "non-txid") {
    return hash(process(left_child) + process(right_child))
  }
  [...]
}

Although I think describing that descendant nodes may need to be hashed before this node is processed may help people who only read the table, I think the way the current version describes that behavior in detail in one of the following paragraphs allows us to keep the table neat and also be very clear.

Does that make sense?

@saivann

saivann Nov 15, 2014

Contributor

@harding Ah... right, yes, please ignore my comment (sorry!). After reading the text again, I now think your version was appropriate.
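
For reference, the recursive shape described in this thread can be written out as a runnable Python sketch. It follows the flag table and the failure checks quoted in the diff hunks above, but it is an illustration under assumptions rather than the reference implementation: `parse_partial_tree`, `level_width`, and the sample values are hypothetical, and the caller is still expected to compare the returned root with the merkle root in the block header.

```python
import hashlib


def dsha256(data):
    """Double SHA-256, as used for merkle tree nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def level_width(tx_count, height):
    """Number of nodes at `height` (0 = TXID level) for tx_count TXIDs."""
    return (tx_count + (1 << height) - 1) >> height


def parse_partial_tree(tx_count, hashes, flag_bytes):
    """Return (merkle_root, matched_txids); raise ValueError if malformed."""
    if tx_count < 1:
        raise ValueError("a block has at least one transaction")
    flags = [(byte >> bit) & 1 for byte in flag_bytes for bit in range(8)]
    used = {"flags": 0, "hashes": 0}
    matched = []

    def take(kind, items):
        # Consume the next flag or hash; each one is used exactly once.
        if used[kind] >= len(items):
            raise ValueError("ran out of " + kind)
        used[kind] += 1
        return items[used[kind] - 1]

    def process(height, pos):
        flag = take("flags", flags)
        if height == 0 or flag == 0:
            node_hash = take("hashes", hashes)
            if height == 0 and flag == 1:
                matched.append(node_hash)
            return node_hash
        left = process(height - 1, pos * 2)
        if pos * 2 + 1 < level_width(tx_count, height - 1):
            right = process(height - 1, pos * 2 + 1)
            if right == left:
                raise ValueError("identical children (see CVE-2012-2459)")
        else:
            right = left  # no right child: the left hash is paired with itself
        return dsha256(left + right)

    height = 0
    while level_width(tx_count, height) > 1:
        height += 1
    root = process(height, 0)
    if used["hashes"] != len(hashes):
        raise ValueError("unused hashes")
    if len(flag_bytes) != (used["flags"] + 7) // 8:
        raise ValueError("unused flag bits beyond byte padding")
    return root, matched


# Two made-up TXIDs; flags 1, 1, 0 (root, matching TXID, other TXID) -> 0x03.
txid_a, txid_b = b"\x01" * 32, b"\x02" * 32
root, matched = parse_partial_tree(2, [txid_a, txid_b], b"\x03")
print(root == dsha256(txid_a + txid_b), len(matched))
```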

Contributor

saivann commented Nov 15, 2014

@harding I've commented on everything that seemed worth reporting. Really amazing work!!!

Contributor

harding commented Nov 15, 2014

@saivann thank you so much for your review! (I know it was a giant pull request---thanks for taking the time.) All your comments look reasonable to me; I'll start implementing them and comment on any that I discover don't work. Thanks again!

harding added some commits Nov 15, 2014

Contributor

harding commented Nov 15, 2014

Updated text and preview with corrections suggested by @saivann. (Thanks!)

In the absence of critical feedback, I'll merge this pull around 18:00 UTC Sunday.

@harding harding merged commit 9ee7b8b into bitcoin-dot-org:master Nov 16, 2014

harding added a commit that referenced this pull request Nov 16, 2014

@harding harding deleted the harding:data-messages branch Feb 25, 2015
