Reverse bitcoin_hash in memory #144

swansontec · 2014-12-02T22:42:03Z

This is a pretty big change. Please see the email on the mailing list. Let's hold off on merging this for a day or two, to be sure it's stable.

evoskuil · 2014-12-03T00:11:13Z

This looks good. I'm queuing up patches for -protocol/-client/-explorer.

evoskuil · 2014-12-03T00:38:05Z

So is it true that the obelisk wire protocol hash is reversed and the in-memory representation is now "unreversed", meaning that we need to explicitly reverse in -protocol (and will need to in new -server code)? Doing so fixes test failures in -protocol.

evoskuil · 2014-12-03T01:42:53Z

I'm going to templatize encode/decode_hash as follows:

template <typename Hash>
std::string encode_hash(const Hash& hash);

template <typename Hash>
bool decode_hash(Hash& out, const std::string& in)

Because we lost implementation for the other hash sizes.

evoskuil · 2014-12-03T02:38:05Z

We have to make a decision for BX on hash encoding. A large number of tests are failing due to the reversal. Wire encoding is what is typically presented, such as in blockchain.info info. The easiest way to repair the tests, existing documentation, and to retain correlation with other services, is to reverse the hash for all presentation. That sort of begs the question however, why we wouldn't just treat the reversed format as canonical internally.

evoskuil · 2014-12-03T03:21:19Z

I tried reversing the output for hash encode and decode, but the problem is much deeper than that. Any libbitcoin function that accepts or returns, directly or indirectly, a bitcoin_hash or a bitcoin_short_hash will exhibit the behavior change. So all code that has taken a dependency on these functions will need to be modified, and without a comprehensive test suite it would be nearly impossible to do this reliably. Even with such coverage in BX I cringe at having to make this change. Given that the public and wire representation of these hashes is generally the reversed representation I think we are much better off not attempting this change and instead treating these hashes as opaque data.

genjix · 2014-12-03T03:23:49Z

As an API our constraints are different to a monolithic blob, so I tried to keep the representation close to what people expect. There are 2 options: custom hash type with reversed representation, or reversing the data in hashing functions (which is what libbitcoin does).

evoskuil · 2014-12-03T03:48:09Z

@genjix I think that they key question is "what people expect". While there are a number of conversions internally to accomplish this, the API does expose what people expect to see. I think this is the proper choice as it hides the confusion in the library implementation instead of exposing it to API users.

BX uses most of the libbitcoin APIs and after looking through the tests failures, and trying to make it work, it's clear that in order to represent what people expect there would be conversions in a lot of places, where there are none presently.

swansontec · 2014-12-03T06:40:23Z

I'm sorry, but both you guys you have it all wrong. In the old system, everything followed this double-reversal pattern:

sha256(sha256) → reverse → memory → reverse → serializer
sha256(sha256) → reverse → memory → reverse → signature
sha256(sha256) → reverse → memory → reverse → script ops
sha256(sha256) → reverse → memory → reverse → checksum
sha256(sha256) → reverse → memory → reverse → block difficulty
sha256(sha256) → reverse → memory → base16

As you can see, the only operation that isn't double-reversed is the base16 conversion. This means that everything external to the libraries is already the right way around! The obelisk wire protocol, for instance, has the bytes in the same order you would get from sha256(sha256), since the serializer was un-reversing them.

In the new system, things work like this:

sha256(sha256) → memory → serializer
sha256(sha256) → memory → signature
sha256(sha256) → memory → script ops
sha256(sha256) → memory → checksum
sha256(sha256) → memory → block difficulty
sha256(sha256) → memory → reverse → text

Now, instead of reversing bytes everywhere, we reverse the bytes in one place - the text conversion. This is why encode_hash and decode_hash must NOT be templates, since they ONLY for bitcoin hashes. Nothing else has this crazy reversing requirement.

So, I have good confidence that I can fix BX completely, reliably, and correctly by swapping out a grand total of 3 functions. None of the other libraries should need any changes whatsoever.

As far as outside library users like are concerned, yes, this breaks API. However, the current API is insane, and is not what anybody would expect. It is utterly inconsistent with all other Bitcoin software. Just look at that first chart again, and notice that every operation that touches the blockchain has to un-reverse the hash before it can proceed.

Here at AirBitz, these reversed hashes have bitten us pretty hard. Any time we go to cobble together a tool in another language, like Python, we have to account for the fact that the binary data inside our database is reversed. We would not be having these nightmares if the binary data had been the right way round in the first place.

evoskuil · 2014-12-03T07:26:20Z

The experience I had after checking out the changes was that compilation was relatively easy to achieve and I had a handful of test failures in libbitcoin-protocol (all that hit the decode_hash calls) and a large number in libbitcoin-explorer.

The libbitcoin-protocol failures are fixed by reversing the bc::decode_hash() calls. This doesn't sound like what you expected, so please confirm.

In bx, after reversing primitives::btc160 and primitives::btc256 serialization << and deserialization >> there are still had a large number of test failures. But I'll take another look at it, since what you are saying makes sense. I may have missed something.

swansontec · 2014-12-03T08:01:19Z

I'll take a poke as well. One requirement with this new branch is that all bitcoin_hashes must go through encode_hash and decode_hash for their text representations, and that no other hashes should go through these functions. We never had this requirement before. As a result, we are quite sloppy with our text ↔ binary conversions, especially in the unit tests. For example, we were using decode_hash to decode EC private keys in some of our unit tests, which obviously won't work now. The new base16_literal provides a nice alternative to decode_hash, at least for the unit tests.

evoskuil · 2014-12-03T08:04:42Z

I noticed something along these lines, which is why I want to take another look. I'm tied up in the clang/osx Travis build configuration at the moment.

swansontec · 2014-12-03T08:25:29Z

This looks like a bug in libbitcoin-protocol! The protobuf field is "bytes", and the description indicates that it's bytes, but the converter is actually shoving in base16 ASCII text.

swansontec · 2014-12-03T09:35:48Z

I just ran the following procedure, so it looks like the hash reversal doesn't actually affect libbitcoin-protocol:

Compile libbitcion at reverse~1, which is the last commit where the hashes are the old way around.
Fix libbitcoin-protocol to use the proper functions, so it compiles and passes its tests again.
Upgrade libbtitcon to reverse, where the branches are the right way around.
Compile and run libbitcoin-protocol unit tests again.

Things still work at step 4, so we're good here. Pull requests coming soon, once I add a 'hash_literal' to libbitcoin.

evoskuil · 2014-12-03T09:49:23Z

Cool, I won't get to this until the morning but I'm hopeful based on seeing something I missed last time around.

swansontec · 2014-12-03T11:26:32Z

Ok, I've updated the branch based on my work with libbitcoin-protocol. The base16 and hash conversions now have a bettery symmetry:

encode_base16 / encode_hash
decode_base16 / decode_hash
base16_literal / hash_literal

swansontec · 2014-12-03T11:49:28Z

The protocol changes are in libbitcoin/libbitcoin-protocol#36

swansontec · 2014-12-03T14:12:39Z

I have finished fixing bx. I was able to do the same 4-step process using reverse~1, so again, the change doesn't actually affect anything (assuming you use the right API's).

As with libbitcoin-protocol, there were some cases where decode_hash was being used for ec_secret's. There were also a lot of cases where base16(hash) was being used instead of the more-appropriate btc256(hash). That was about the extent of the problem, though.

The pull request is libbitcoin/libbitcoin-explorer#171

swansontec · 2014-12-03T15:32:10Z

I have just done another update. This one fixes the problem with bx fetch-history. Apparently, our short_hashes are reversed in the obelisk binary protocol (and absolutely nowhere else). Ugh.

swansontec · 2014-12-04T13:10:10Z

I have created a new pull request, #147, which includes just the base16 changes. We can go ahead and pull that while we wait to decide on the hash reversal.

evoskuil · 2014-12-07T01:41:20Z

After further discussion with @swansontec, and narrowing the actual issue down to this simple commit, I'm convinced we should take this change. The expected representation is achieved through the encode/decode functions, which should always be used with to textually serialize our hashes in any case.

With this change, the reversal only takes place in the encode_hash and decode_hash functions.

Reverse bitcoin_hash in memory

swansontec force-pushed the reverse branch from 111f07f to 1a35d64 Compare December 3, 2014 11:24

swansontec force-pushed the reverse branch from 1a35d64 to 2c751cb Compare December 3, 2014 13:47

swansontec force-pushed the reverse branch from 2c751cb to 0ab36db Compare December 3, 2014 15:30

swansontec mentioned this pull request Dec 4, 2014

Base16 #147

Merged

swansontec force-pushed the reverse branch 2 times, most recently from 0befb8f to 198effd Compare December 5, 2014 15:23

swansontec mentioned this pull request Dec 6, 2014

Use build_data instead of serializer for obelisk libbitcoin/libbitcoin-client#58

Merged

swansontec force-pushed the reverse branch from 198effd to c4d0221 Compare December 6, 2014 23:58

Un-reverse the bitcoin_hash type in memory

7ea06f8

With this change, the reversal only takes place in the encode_hash and decode_hash functions.

swansontec force-pushed the reverse branch from c4d0221 to 7ea06f8 Compare December 10, 2014 17:12

evoskuil added a commit that referenced this pull request Dec 11, 2014

Merge pull request #144 from libbitcoin/reverse

c8f5698

Reverse bitcoin_hash in memory

evoskuil merged commit c8f5698 into version2 Dec 11, 2014

evoskuil deleted the reverse branch December 13, 2014 09:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reverse bitcoin_hash in memory #144

Reverse bitcoin_hash in memory #144

swansontec commented Dec 2, 2014

evoskuil commented Dec 3, 2014

evoskuil commented Dec 3, 2014

evoskuil commented Dec 3, 2014

evoskuil commented Dec 3, 2014

evoskuil commented Dec 3, 2014

genjix commented Dec 3, 2014

evoskuil commented Dec 3, 2014

swansontec commented Dec 3, 2014

evoskuil commented Dec 3, 2014

swansontec commented Dec 3, 2014

evoskuil commented Dec 3, 2014

swansontec commented Dec 3, 2014

swansontec commented Dec 3, 2014

evoskuil commented Dec 3, 2014

swansontec commented Dec 3, 2014

swansontec commented Dec 3, 2014

swansontec commented Dec 3, 2014

swansontec commented Dec 3, 2014

swansontec commented Dec 4, 2014

evoskuil commented Dec 7, 2014

Reverse bitcoin_hash in memory #144

Reverse bitcoin_hash in memory #144

Conversation

swansontec commented Dec 2, 2014

evoskuil commented Dec 3, 2014

evoskuil commented Dec 3, 2014

evoskuil commented Dec 3, 2014

evoskuil commented Dec 3, 2014

evoskuil commented Dec 3, 2014

genjix commented Dec 3, 2014

evoskuil commented Dec 3, 2014

swansontec commented Dec 3, 2014

evoskuil commented Dec 3, 2014

swansontec commented Dec 3, 2014

evoskuil commented Dec 3, 2014

swansontec commented Dec 3, 2014

swansontec commented Dec 3, 2014

evoskuil commented Dec 3, 2014

swansontec commented Dec 3, 2014

swansontec commented Dec 3, 2014

swansontec commented Dec 3, 2014

swansontec commented Dec 3, 2014

swansontec commented Dec 4, 2014

evoskuil commented Dec 7, 2014