New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade consensus hashing to cover more of the state #226

Merged
merged 21 commits into from Jan 4, 2016

Conversation

Projects
None yet
3 participants
@zathras-crypto
Copy link

zathras-crypto commented Sep 6, 2015

Note: tests will fail at the moment.

Upgrades the consensus hashing process to include the following data:

  • DEx sell offers
  • DEx accepts
  • MetaDEx trades
  • Crowdsales
  • Property IDs & totals

Also adds support for omnidebug=consensus_hash_every_transaction for more granular consensus hash generation for debugging.

Feedback welcomed :)

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Sep 6, 2015

Wow, this is cool! And not intrusive (which I like very much!).

I'm wondering: we already include all balances via the tally in the hash, which seems to make the total tokens a duplicated effort? However, I do think we should include the SP ID peeks.

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Sep 6, 2015

Out of curiousity: if one would have all that data in clear text (i.e. not hashed), would that be sufficient to reconstruct the whole state?

We may also include the exodus_prev.

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Sep 6, 2015

I'm wondering: we already include all balances via the tally in the hash, which seems to make the total tokens a duplicated effort? However, I do think we should include the SP ID peeks.

Yes you're right, good call thanks...

We may also include the exodus_prev.

I'm not sure about this as that's an Omni Core specific variable that's not part of the state. The reason I went with hashing strings in the first place was to enable other implementations/services to generate their own consensus hashes which should match - if we include Omni Core specific data we restrict the scope of use to just Omni Core. The Exodus balance at the time the hash is generated is in with the rest of balances though :)

Out of curiousity: if one would have all that data in clear text (i.e. not hashed), would that be sufficient to reconstruct the whole state?

Not in all things (eg property metadata) but from a math perspective yes I believe that's all you would need to reconstruct the state because we do exactly that (reconstruct a state) when we load a state from persistence, and this is basically the same data.

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Sep 6, 2015

I'm not sure about this as that's an Omni Core specific variable that's not part of the state.

That's a very good point. :)

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Sep 6, 2015

To pick up the discussion via email: at this point the OmniJ consensus checks are mostly nice to have, but the checkpoints are far more powerful. We should make it common practise to verify the checkpoints on multiple plattforms.

At least for me this is a bit time-consuming, due to the extra effort of generating cross plattform binaries and running the synchronization several times. Currently I also have no fully synchronized chain in my OS X VM. Of course, especially we, should setup everything and not complain too much here, but I'm wondering, if we may integrate this directly into the workflow.

We can't use Travis CI for consensus/checkpoint checks, because the build and test environment has no fully synchronized chain, and there is a time limit of 50 minutes per build target, which we'd likely exceed.

So what are the alternatives? We might spin up some AWS instances. I assume EC2 would be the way to go for Linux and Windows? Is there any service that supports OS X?

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Sep 7, 2015

The unit tests fail, because the mechanism changed. It's more tricky to test now, given that the whole state is included.

Back then, I created them by starting fresh, and doing each step, then took the hash and added the hash in the test. Since we have no good way to change the state arbitrarily (except for balance operations for example), I'm not really sure how to continue from here. But let's try to tackle this step by step:

First, another note:

I'm not sure, if we need to include the reserved balances for DEx offers, because again, we already include balances via the tally hashing.

Further, I'm wondering, whether we should add checks, to ensure no "empty" offers (e.g. MetaDEx) are included in the hash. What do you think? Is this an issue?

Except for the "buyer" of DEx offers and accepts, we always have an object, such as CMPTally, CMPOffer, CMPAccept, CMPCrowd and CMPMetaDEx.

Maybe we can add:

std::string GenerateConsensusString(const CMPTally& obj, const std::string& address, const uint32_t propertyId)
{
    // need to check, whether the tally is "empty" before calling this function!
    // ...
    return strprintf("%s|%d|%d|%d|%d|%d",
            address, propertyId, balance, sellOfferReserve, acceptReserve, metaDExReserve);
}

std::string GenerateConsensusString(const CMPOffer& obj, const std::string& address, const uint32_t propertyId)
{
    return strprintf("%s|%s|%d|%d|%d|%d|%d",
            obj.getHash().GetHex(), address, propertyId, obj.getOfferAmountOriginal(),
            obj.getBTCDesiredOriginal(), obj.getMinFee(), obj.getBlockTimeLimit());
}

std::string GenerateConsensusString(const CMPAccept& obj, const std::string& address)
{
    return strprintf("%s|%s|%d|%d|%d",
            obj.getHash().GetHex(), address, obj.getAcceptAmount(), obj.getAcceptAmountRemaining(),
            obj.getAcceptBlock());
}

std::string GenerateConsensusString(const CMPCrowd& obj)
{
    return strprintf("%d|%d|%d|%d|%d",
            obj.getPropertyId(), obj.getCurrDes(), obj.getDeadline(), obj.getUserCreated(),
            obj.getIssuerCreated());
}

std::string GenerateConsensusString(const CMPMetaDEx& obj)
{
    return strprintf("%s|%s|%d|%d|%d|%d|%d",
            obj.getHash().GetHex(), obj.getAddr(), obj.getProperty(), obj.getAmountForSale(),
            obj.getDesProperty(), obj.getAmountDesired(), obj.getAmountRemaining());
}

This allows easier testing of each string.

In addition, we may also split GetConsensusHash() into smaller units, where one function handles the tally hashes, another this, another that, ... once again to allow easier testing.

Unfortunally we don't have "plain data collections" everywhere, like the tally map, because then we could just pass the global state object, say for example like this:

// tally can't be passed const &, due to the iteration via init(), next()
void AddConsesusHash(std::map<string, CMPTally>& tallyMap, SHA256_CTX* pShaCtx)
{
    for (std::map<string, CMPTally>::iterator it = tallyMap.begin(); it != tallyMap.end(); ++it) {
        // ...
    }
}

The upside here would be that we could easily create a new map for the tests, which is not dependant on the global state, to avoid the extra clearing or restoring of the original state.

Not sure, if all this is really useful, but just some ideas here..

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Sep 7, 2015

At least for me this is a bit time-consuming, due to the extra effort of generating cross plattform binaries and running the synchronization several times

Perhaps we can automate the build process via Travis, and pull those builds via scripting to instances where we have the blockchain stored to run a further set of tests?

We might spin up some AWS instances. I assume EC2 would be the way to go for Linux and Windows? Is there any service that supports OS X?

I have the blockchain on AWS for OmniChest.info so can make that available as a regular snapshot perhaps? I'm not sure about any services that support OS X, I googled and found a few 'cloud' offerings.

but I'm wondering, if we may integrate this directly into the workflow.

Sure sounds good - I'm not sure exactly how we'd do it but it would be worth the effort to automate testing of checkpoints.

I'm not sure, if we need to include the reserved balances for DEx offers, because again, we already include balances via the tally hashing.

OK, that makes sense. Will amend.

Further, I'm wondering, whether we should add checks, to ensure no "empty" offers (e.g. MetaDEx) are included in the hash. What do you think? Is this an issue?

I think that's implied - a trade with empty values is not a valid active trade and thus shouldn't be in the consensus hash. If there are "empty" offers in the MetaDEx maps we have a bug that should be fixed within the MetaDEx code.

Note: I really should find some time to refresh the auditor branch :)

Maybe we can add ...std::string GenerateConsensusString...
This allows easier testing of each string.

I like this idea, anything I can do to make testing easier leave it with me :)

In addition, we may also split GetConsensusHash() into smaller units

I'll start with GenerateConsensusString and we'll take it from there...

Not sure, if all this is really useful, but just some ideas here..

Thanks for the feedback, always useful dude :)

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Sep 7, 2015

Perhaps we can automate the build process via Travis, and pull those builds via scripting to instances where we have the blockchain stored to run a further set of tests?

Yeah, there are probably a few options, but I'm not yet sure about specifics. Just thinking about an ideal scenario:

  1. someone pushes
  2. in the top commit, or PR, there is a magic keyword, like: [ci check consensus] (because parsing takes time, and it's probably wasteful to do it for every commit)
  3. the Travis CI build passes
  4. if there was the magic string, Travis CI uploads the build results to the consensus checking instance
  5. this initiates the checkpoint verification
  6. the outcome is notified via the GitHub API, so it's shown as status of the PR

All this isn't really trivial, but more or less straight forward and doable with some reading and testing.

But maybe we can start small, like having a Windows instance, which can manually be used to verify checkpoints, and as second step, add the automatic upload via Travis, etc...

I have the blockchain on AWS for OmniChest.info so can make that available as a regular snapshot perhaps?

I'm not entierly sure where you're going: I assumed we'd have a dedicated instance with synchronized chain, which runs all the time. Or were you referring to something like: we initiate the verification, create a new instance on the fly, the instance then pulls the chain from Chest, ...?

Are you able to estimate the costs of running a full node on a Windows instance on AWS? I have no idea actually, For those 1000+ combined builds via Travis, the total cost for S3 over the last weeks was like 10 cents or so (the Mac OS X SDK is hosted there).

Let's also ping @msgilligan, because we have a dedicated CI server for consensus tests already! :) Though none for Windows or OS X.

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Sep 7, 2015

I'm not entierly sure where you're going: I assumed we'd have a dedicated instance with synchronized chain, which runs all the time. Or were you referring to something like: we initiate the verification, create a new instance on the fly, the instance then pulls the chain from Chest, ...?

Well, I was thinking using the AWS API to spawn an instance, pull the build from Travis, mount a snapshot of the blockchain, run any tests, notify results and shutdown.

We can have a dedicated instance sure, that's fine too - I always have cost optimization on the brain hehe :)

Are you able to estimate the costs of running a full node on a Windows instance on AWS?

Sure, OmniChest.info used to cost about $90 USD a month for the infrastructure on AWS (this includes storage, bandwidth and 2 small instances; one Ubuntu for Omni Core and one Windows for OmniChest.info). That's recently gone up to $125ish USD a month due to increased power to support some pretty heavy API usage but we wouldn't be worrying about anything like that for testing. So I'd estimate somewhere around the $50 mark for an always on micro windows node and necessary storage etc. I'd be happy to cover that cost.

Those Travis costs are incredible btw!

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Sep 7, 2015

Well, I was thinking using the AWS API to spawn an instance, pull the build from Travis, mount a snapshot of the blockchain, run any tests, notify results and shutdown.

Ah, I see! I'm not very familiar with AWS, but if this is possible, it would be great!

Those Travis costs are incredible btw!

Just to clarify: Travis CI is free for open source projects, but I use S3 to host the OS X SDK, which is pulled via Travis. To take some load from Chest, we may also consider to move the binaries to S3.

Speaking of, I'd really love to have something like https://builds.jonasschnelli.ch/ for Omni Core. It's shiny, and the binaries are build via Gitian. @jonasschnelli: sorry for the ninja-mention, but I was wondering, if this is a custom solution that you build there, or if you could give us a hint how we may build a similar system for this project (which is Bitcoin Core + Mastercoin)? Any input would be very appreciated! :)

@jonasschnelli

This comment has been minimized.

Copy link

jonasschnelli commented Sep 7, 2015

@dexX7: I'm using a script to autobuild (nightly) or trigger build (PR). The script itself is very ugly and totally tied to my environment (won't share! to ugly.. :) ).
But it's pretty trivial. Nightly build gets kicked by cron, PR by a command called ./build <Github PRNUM>.
The script itself checkouts the master, and does the per platform gitian building.
If a PR gets built, it loads some PR infos over https://api.github.com/repos/bitcoin/bitcoin/pulls/<PRNUM>, does some json parsing and extract the correct URL to pull in the PR (some git reset --hard magic first, etc.).

Tell me if you need help.

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Sep 7, 2015

Thanks @jonasschnelli! This is already very helpful! Once I find some time, I'm going to give it a try.

@jonasschnelli

This comment has been minimized.

Copy link

jonasschnelli commented Sep 7, 2015

I forgot to mention that the webpage displaying the binaries, etc. is done over apache apaxy (http://adamwhitcroft.com/apaxy/). The important files (binaries, build.logs) are just copied to the web httpdocs dir. The headers is a static html part that gets included over apache/apaxy.

Would definitively be better to use some PHP magic for this.

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Sep 7, 2015

@jonasschnelli: Yeah, I found apaxy over the source code already. :) This sounds pretty straight forward! I assume you generate the headers for each build with commit information as part of the build process?

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Jan 1, 2016

Note to self, include administrative address for properties (to capture state changes from 'change issuer' transactions).

@zathras-crypto zathras-crypto force-pushed the zathras-crypto:0.0.10-Z-UpgradeConsensusHash branch to 7b7fd42 Jan 1, 2016

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Jan 1, 2016

Rebased on latest tip, still WIP.

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Jan 1, 2016

Just to give you a headsup: I support this PR, and it's great you picked it up. :)

edit: eventually I'd like to extract the actual string creation as described in #226 (comment), but this isn't a requirement for now, just something to have in mind for later.

@dexX7 dexX7 modified the milestone: 0.0.10.1 Jan 1, 2016

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Jan 3, 2016

@dexX7 can you think of anything else I've missed (state-relevant data) that we could add into the consensus hash process, or do you think it's reasonably well covered?

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Jan 3, 2016

Just to give you a headsup: I support this PR, and it's great you picked it up. :)

Awesome, it seems pretty straight forward going back to it, I think it's getting close to merge-ready :)

edit: eventually I'd like to extract the actual string creation as described in #226 (comment), but this isn't a requirement for now, just something to have in mind for later.

Thanks for the pointers, I've broken out the string creation in 959264f as you suggested, thanks!

for (uint32_t propertyId = startPropertyId; propertyId < _my_sps->peekNextSPID(ecosystem); propertyId++) {
CMPSPInfo::Entry sp;
{
LOCK(cs_tally);

This comment has been minimized.

@dexX7

dexX7 Jan 3, 2016

Member

cs_tally is already locked at the beginning. This one can be removed.

This comment has been minimized.

@zathras-crypto

zathras-crypto Jan 3, 2016

Good call, thanks

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Jan 3, 2016

On the first glimpse this looks good and pretty complete. :)

Regarding the tests: you may just remove them. Once this PR is in, I'll take a look at adding some new unit tests.

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Jan 3, 2016

Cool cool, I think this is ready to go then if you're happy with it mate :)

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Jan 3, 2016

Very nice! :) I'll take a second more detailed look at it tomorrow.

On a related note:

I've been thinking about our reorg tests. Currently the tests are very targeted (e.g.: create some property, rewind and confirm it's gone), and with consensus hashes, which cover the whole state, we might try something else:

  1. Create random transactions of any kind
  2. Mine block(s)
  3. Store consensus hash
  4. Repeat 1-4 until many transactions were created
  5. Invalidate last block(s)
  6. ???
  7. Compare consensus hash
  8. Repeat 5-7 until genesis block

The upside I see here is that we don't need to hassle with coding specific tests only for reorganizations, if we could show that the state is the same after the blocks were rolled back by comparing the consensus hashes.

Now the tricky part is that the whole state-rolling-back stuff only fully kicks in once a new block is mined (e.g. invalidateblock followed by setgenerate), which has an impact on the results.

I'm not sure, if this is solvable, or if there might be an alternative, like moving the reorg logic from mastercore_handler_block_begin() into mastercore_handler_disc_begin().

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Jan 3, 2016

Very nice! :) I'll take a second more detailed look at it tomorrow.

Thanks bud, much appreciated :)

On a related note ... Now the tricky part is that the whole state-rolling-back stuff only fully kicks in once a new block is mined (e.g. invalidateblock followed by setgenerate), which has an impact on the results.

Can you clarify what you mean by having an impact on the results dude? As far as I'm aware there is no scenario where a block can be disconnected from the tip without being replaced by another, so invalidateblock followed by setgenerate (ie disconnect one block and connect another) seems to me to be the appropriate model for mirroring the behaviour we'd see on-chain?

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Jan 3, 2016

Can you clarify what you mean by having an impact on the results dude?

Haha, yeah, you're right regarding block disconnection followed by a new block, however, when testing, I'm not sure where to go from here. Maybe I miss something though. :)

Say:

  • Block 1: balance is 3 | store consensus hash, send a simple send (say 5 tokens)
  • Block 2: balance is 9 | no action

If block 2 is now invalidated, the mempool cleared (to get rid of the send), and a new block 2b mined, then even though the balance is back to 3, we'd compare the generated hash of block 1 to the one from the new block 2b.

Issues that come to my mind here are for example additional DevMSC, or in another context maybe an already timed out DEx offer, closed crowdsale etc., because we're one block ahead.

edit: haven't tested it, but say we go back to block 0, then mine one block, and compare block 1 to block 1b (the new one), then there might also be some impact due to the different block timestamps.

So basically, ideally I'd like to store the hash of block 1, then do something, and go back to block 1, and compare that hash, instead of the hash of the next block.

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Jan 3, 2016

I think I see where you're going (the Dev MSC example helped, since that's based on block timestamp), thanks :)

Rather than moving logic which might be intrusive, could we achieve the same result by adding debug options to include consensus hash generation in additional areas (eg include a -omnidebug=consensus_hash_block_disconnect or something similar which causes a hash to be generated at the completion of the block disconnection process)?

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Jan 3, 2016

Ah, that's a good idea. I'm going to play around with it a bit.

It probably won't work though, because the state isn't reversed during the disconnect, and in the example above the hash would then still map to block 2, even though we may as well be before block 1 (or so).

@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Jan 3, 2016

It's been a while since I looked at the reorg stuff (IIRC Bart wrote that bit), but I believe the state is rolled back by loading from persistence, so perhaps we could do hash generation post-loading a state or similar.

Let me know how you get on playing around and if you need more ideas I'll dig into the reorg stuff for inspiration :)

@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Jan 4, 2016

Alright. I looked through the code once more and locally build and tested it with the regtests we have. The effort is somewhat redundant, given the tests via Travis CI, but still.

I'm going to merge in just a second. :)

@dexX7 dexX7 merged commit 4d06a8d into OmniLayer:omnicore-0.0.10 Jan 4, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

dexX7 added a commit that referenced this pull request Jan 4, 2016

Merge pull request #226
4d06a8d Remove checkpoint tests for old hashing process (zathras-crypto)
755abb5 Remove second tally lock (thanks @dexX7) (zathras-crypto)
959264f Break out consensus string generation (thanks @dexX7) (zathras-crypto)
1a8ad4f Update checkpoints to reflect new consensus hashing process (zathras-crypto)
18669b4 Add current issuer address for properties to consensus hash (zathras-crypto)
7b7fd42 Update checkpoints for new consensus hash algo (zathras-crypto)
5af9547 Remove reserved amounts from DEx sell data added to consensus hash (zathras-crypto)
a902c4d Remove unused GetTallyPropertyTotal() (zathras-crypto)
d51edb2 Switch properties to next available instead of token counts (thanks @dexX7) (zathras-crypto)
66561dd Tidy up consensus hashing changes (zathras-crypto)
af151cb Sort DEx accepts by matching txid then buyer for consensus hashing (zathras-crypto)
95e92ea Order DEx offers and MetaDEx trades by txid in consensus hash (zathras-crypto)
a98132a Order crowd data in consensus hash by property id (zathras-crypto)
6cbd79d Add DEx accept to consensus hash (zathras-crypto)
f726c50 Add property id and total amount of tokens to consensus hash (zathras-crypto)
86a6176 Add crowdsale data to consensus hash (zathras-crypto)
ffe6206 Add MetaDEx trade data to consensus hash (zathras-crypto)
abeba7a Allow generating consensus hashes after each Omni transaction (zathras-crypto)
fdee9d5 Fix format specifiers for DEx data in consensus hash (zathras-crypto)
f394ed1 Add DEx offers to consensus hash calculations (zathras-crypto)
1eafd66 Extract GetConsensusHash() into consensushash.h/cpp (zathras-crypto)
@dexX7

This comment has been minimized.

Copy link
Member

dexX7 commented Jan 4, 2016

A few minor things I noticed, which are not included:

  • the action of CMPMetaDEx
  • the issuer bonus of CMPCrowd
  • the early bird bonus of CMPCrowd
@zathras-crypto

This comment has been minimized.

Copy link

zathras-crypto commented Jan 4, 2016

Ahh good call, for completeness we should put those in, leave it with me :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment