Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

rational to rationale and little and big endian mixup fix #1102

Closed
wants to merge 1 commit into
from

Conversation

Projects
None yet
3 participants
Contributor

mruddy commented Oct 22, 2015

Two small fixes:

  1. Fixed a typo "rational" -> "rationale"

  2. Fixed a mixup between big and little endian.

About the big and little endian mixup, here's my reasoning:

https://en.bitcoin.it/wiki/Block_hashing_algorithm:
"Note that the hash, which is a 256-bit number, has lots of leading zero bytes when stored or printed as a big-endian hexadecimal constant, but it has trailing zero bytes when stored or printed in little-endian."

Example, double sha256 hash of block 300,000 header:

echo -n '020000007ef055e1674d2e6551dba41cd214debbee34aeb544c7ec670000000000000000d3998963f80c5bab43fe8c26228e98d030edf4dcbe48a666f5c39e2d7a885c9102c86d536c890019593a470d' | xxd -r -p | sha256sum -b | xxd -r -p | sha256sum -b
5472ac8b1187bfcf91d6d218bbda1eb2405d7c55f1f8cc820000000000000000

The result is internal byte order (verified by checking the previous block hash of block header 300,001) and little endian (trailing zeros).

Also, as a double check, this verifies that the definition of "internal byte order" is consistent with the first example in the table and should be labeled as little endian:

echo -n '00' | xxd -r -p | sha256sum -b | xxd -r -p | sha256sum -b
1406e05881e299367766d313e26c05564ec91bf721d31726bd6e46e60689539a

@harding harding added the Dev Docs label Oct 22, 2015

Contributor

harding commented Oct 22, 2015

Regarding rational→rationale, if you split that off into a separate commit or PR, I'll apply that right away.

For endianness, we were discussing that in #1061 and I think we're waiting for @carnesen to provide a patch. If that doesn't happen soon, I think I'm going to put forward my solution of just dropping the endianness labels and they're a source of never-ending confusion and debate.

Contributor

carnesen commented Oct 22, 2015

My bad, I think I mostly figured out what's going on under the hood so far
as endianness in the hashes (at a point in the algorithm, a sequence of
bytes is interpreted as a big-endian integer) but never did write anything
up. @harding or anyone else please feel free to jump in with a PR for that
section.

On Thu, Oct 22, 2015 at 9:44 AM David A. Harding notifications@github.com
wrote:

Regarding rational→rationale, if you split that off into a separate commit
or PR, I'll apply that right away.

For endianness, we were discussing that in #1061
#1061 and I think
we're waiting for @carnesen https://github.com/carnesen to provide a
patch. If that doesn't happen soon, I think I'm going to put forward my
solution of just dropping the endianness labels and they're a source of
never-ending confusion and debate.


Reply to this email directly or view it on GitHub
#1102 (comment)
.

Contributor

mruddy commented Oct 22, 2015

@harding just broke out the typo into #1103.

I quickly looked through the Bitcoin Core code and the FIPS PUB 180-4 (that specifies SHA-256) and I haven't made up my mind on the endian-ness wording yet. The spec probably defines the endian-ness, but it is a matter of constructing a few sections together to get that interpretation and I need to re-read it to make sure.

EDIT: Also, I hadn't seen the discussion in #1061 so I want to check that out to take it into consideration too.

harding added a commit to harding/bitcoin.org that referenced this pull request Oct 23, 2015

Contributor

mruddy commented Oct 23, 2015

Here is my current thinking on the endianness topic.

First, we must determine whether the SHA-256 result should be considered a 256-bit integer or simply an array of 32 bytes.

If considered to be a 256-bit integer, then based on sections 2.2.1, 3.1 2, and 6.2.2 of FIPS PUB 180-4 (http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf) and Bitcoin Core's implementation of SHA-256 (https://github.com/bitcoin/bitcoin/blob/c719cefc417cc578f48b33069b764339a61054ce/src/crypto/sha256.cpp#L174-L181) I think that a fair case could be made that it is a big-endian value.

For our purposes though, I think it is more natural to think of the SHA-256 result as being simply an array of 32 8-bit bytes (aka a blob).
I say this because the Bitcoin Core code treats the (previous) block, Merkle root, and transaction hashes as being of type uint256 which, from a comment in the code, "is called uint256 for historical reasons only. It is an opaque blob of 256 bits and has no integer operations.".
Treating the hash as a blob also makes the most sense when considering how the proof-of-work check against the target threshold is carried out (https://github.com/bitcoin/bitcoin/blob/c719cefc417cc578f48b33069b764339a61054ce/src/pow.cpp#L100). The POW check basically just treats the blob as a single 256-bit value stored as a little-endian when comparing against the target threshold.

Basically, depending on your perspective, you could correctly consider the same sequence of bytes in memory as being either a blob, a single big-endian value, or a single little-endian value.
That's too damn confusing. I think the endian terminology should be avoided and "reverse of the computed hash" style terminology should be used instead.

Thus, I essentially agree with Sergio's comment #580 (comment)

I've adapted my proposed commit. Looks like @harding was doing some work concurrently. However you want to handle this is fine with me. I might have to fix some formatting in this latest commit anyways.

Contributor

harding commented Oct 23, 2015

@mruddy thanks! I did indeed just open #1105 -- what do you think about the proposal of dropping the endian references in the text and then summarizing the issues in a Bitcoin Wiki page, which Bitcoin.org will then link to?

Contributor

harding commented Oct 23, 2015

I just reviewed this updated PR, and I'll probably adapt some of it into #1105 (particularly the terminology).

Contributor

harding commented Oct 23, 2015

@luke-jr wow. And I thought nBits was weirdest data type in Bitcoin.

Contributor

mruddy commented Oct 23, 2015

@harding Yes, dropping endian references in this DevDoc and then linking out to the wiki for more detail could make sense. The FIPS pub and code references probably make more sense in a more detailed wiki entry or appendix rather than in-line. I'll be happy to review what you do for #1105 if you want me to close this PR. It's funny how I initially thought this PR would be a quick update and then how much more complex it really was in the code and FIPS spec when I went down the rabbit hole.

Contributor

mruddy commented Oct 28, 2015

handled by #1105

@mruddy mruddy closed this Oct 28, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment