Endianness reversed #1061

Closed
bedeho opened this issue Sep 16, 2015 · 11 comments

@bedeho

bedeho commented Sep 16, 2015

The "Hash Byte Order" section does a good job of trying to highlight the issue of txid/blockid byte order confusion, but the titles of the last table unfortunately give the opposite interpretation of what it should be.

As written, block IDs in RPC byte order have leading zeros. This would be recognized as big endian, since those leading digits are the most significant, given the very fact that hashing involves minimizing this value. However, the titles on the table have this reversed in the parentheses (“Big Endian”) and (“Little Endian”).
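
To make the leading-zeros argument concrete, here is a minimal Python sketch. It assumes the widely published hash of the Bitcoin genesis block as the sample value, and the variable names are illustrative only.

```python
# Genesis block hash as printed by RPC and block explorers (assumed sample value).
rpc_hex = "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f"

rpc_bytes = bytes.fromhex(rpc_hex)   # bytes in the order they are displayed
internal_bytes = rpc_bytes[::-1]     # byte-reversed copy of the displayed bytes

# Reading the displayed string left to right as a number is a big-endian reading.
value_from_display = int.from_bytes(rpc_bytes, "big")
# The reversed bytes give the same number only under a little-endian reading.
value_from_internal = int.from_bytes(internal_bytes, "little")

print(internal_bytes.hex())                       # zeros now trail instead of lead
print(value_from_display == value_from_internal)  # both readings agree on the number
```

In this sketch the displayed form puts the most significant digits first, which is the sense in which the RPC byte order reads as big endian.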

@harding
Contributor

harding commented Sep 16, 2015

The quoted text is the phrasing I found other documentation using, which is why it says:

Off-site documentation such as the Bitcoin Wiki tends to use the terms big endian and little endian as shown in the table below, but they aren’t always consistent. Worse, these two different ways of representing a hash digest can confuse anyone who looks at the Bitcoin Core source code and finds a so-called “big endian” value being stored in a little-endian data type.

I think maybe the best solution is to remove the labels "big endian" and "little endian" so that the examples speak for themselves. Does that sound like an adequate solution to you? (If so, I'll update the doc.)

@bedeho
Author

bedeho commented Sep 16, 2015

I would recommend removing it. I have never seen RPC byte order be referred to as little endian anywhere else. There is indeed at least one possible contradiction on the Bitcoin Wiki on this:

https://en.bitcoin.it/wiki/Block_hashing_algorithm states
"The output of blockexplorer displays the hash values as big-endian numbers; notation for numbers is usual (leading digits are the most significant digits read from left to right)."

https://en.bitcoin.it/wiki/Protocol_documentation
"Note: Hashes in Merkle Tree displayed in the Block Explorer are of little-endian notation."

One refers to merkle tree hashes, and one refers to hashes in general. Perhaps byte order varies with different fields/values?

@acityinohio

Yeah, I agree with you both (@bedeho and I were diving into endianness in a few exchanges). I think removing the labels "big endian" and "little endian" makes sense, along with providing another concrete example, perhaps a block header hash: its value always has to be less than the target, and as the wiki link illustrates, it can provide a nice anchor for describing byte order.

@carnesen
Contributor

"big endian" and "little endian" are the proper technical terms for describing the two possible byte orderings. I think @bedeho is right that the big/little column headings are currently swapped. To me the pedagogically confusing part of the section is the use of "standard" and "reversed". From a human perspective, big-endian numbers are "standard". For an x86 Linux computer, little-endian numbers are "standard". There is no "standard" Unix endianness. Please allow me a couple days to take a stab at rewriting the section.

@harding
Copy link
Contributor

harding commented Sep 17, 2015

@carnesen sure, please feel free to rewrite. (But note that I used standard to mean the byte order universally used to display SHA256 output, which is meant as a string not a number.)

@luke-jr
Contributor

luke-jr commented Sep 17, 2015

Note: The displayed block hashes are indeed little endian. Bitcoin's PoW comparison with the target flips the hash around.
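
One way to read the comment above, sketched in Python rather than taken from Bitcoin Core: the raw 32-byte digest is interpreted as a little-endian integer before it is compared with the target. The helper names `compact_to_target` and `meets_target` are hypothetical, and the sample values assume the genesis block hash and its compact target field `0x1d00ffff`.

```python
def compact_to_target(bits: int) -> int:
    """Expand a compact 32-bit target field into an integer (exponent >= 3 assumed)."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    return mantissa << (8 * (exponent - 3))


def meets_target(internal_digest: bytes, bits: int) -> bool:
    # Read the raw digest as a little-endian number, then compare with the target.
    return int.from_bytes(internal_digest, "little") <= compact_to_target(bits)


# Displayed genesis block hash (assumed sample value); reverse it to get the raw digest order.
displayed = "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f"
internal_digest = bytes.fromhex(displayed)[::-1]

print(meets_target(internal_digest, 0x1D00FFFF))        # raw digest order
print(meets_target(internal_digest[::-1], 0x1D00FFFF))  # displayed order fed in by mistake
```

Feeding the displayed byte order into the same little-endian reading produces a much larger number, which is why mixing up the two orders matters for the comparison.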

@bedeho
Author

bedeho commented Sep 17, 2015

This issue has been dealt with before in #580, and @SergioDemianLerner seemed to conclude that: "there is not such thing as little-endian or big-endian transaction hash digest."

I am not familiar with the structure of the output of the various hashing algorithms, but I agree that endianness is only relevant insofar as bytes actually vary in significance (i.e. most vs. least). The byte representation of lots of data (e.g. text) has no such significance, yet it certainly matters in what order it is interpreted, hence the two are not the same.

@carnesen
Contributor

Ah, I see. The output of SHA256 is a byte array (or, equivalently, its hex-encoded string), not an integer. Therefore "standard" and "reversed" are probably better than "little/big endian". I've been meaning to understand this corner of Bitcoin in more detail and would appreciate being motivated to do so by working on a PR for this issue. Stay tuned ...

Edit: By "byte array" I meant "byte array with an ordering uniquely specified by the algorithm"
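
A small Python sketch of that distinction, using only the standard `hashlib` module. The input string is an arbitrary placeholder; the point is that the digest is an ordered byte sequence, and "reversed" just means the same bytes listed last to first, with no integer interpretation involved.

```python
import hashlib

# 32 bytes whose order is fixed by the SHA256 algorithm itself.
digest = hashlib.sha256(b"example payload").digest()

standard_hex = digest.hex()        # bytes shown in the order the algorithm emits them
reversed_hex = digest[::-1].hex()  # the same bytes, displayed last to first

print(standard_hex)
print(reversed_hex)
```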

@bedeho
Author

bedeho commented Sep 17, 2015

@carnesen I basically see it exactly as you just described it.

@harding
Contributor

harding commented Oct 23, 2015

Opened #1105 to drop endian references. That PR also provides a suggestion for an alternative way to describe the issues for anyone who really likes thinking about endianness problems. :-)

@luke-jr
Contributor

luke-jr commented Oct 23, 2015

FWIW, more endian "fun" coming soon... apparently versionbits will number the bits:

7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24

I don't care enough to argue for fixing this, so I guess it'll be left for documentation to deal with. :(
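
For readers trying to see the pattern in that listing, the following Python sketch reproduces it. It assumes the listing is meant to cover all 32 bits (0 through 31), grouped byte by byte with the lowest byte first and the highest bit first within each byte. The function name is hypothetical, and nothing here is taken from the versionbits proposal itself.

```python
def byte_swapped_bit_order(width_bytes: int = 4) -> list[int]:
    """List bit numbers byte by byte: lowest byte first, high bit first within a byte."""
    return [8 * byte + bit for byte in range(width_bytes) for bit in range(7, -1, -1)]


order = byte_swapped_bit_order()
print(" ".join(map(str, order)))
```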
