Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split header MMR (and sync MMR) out from txhashset #3004

Merged
merged 10 commits into from
Sep 6, 2019

Conversation

antiochp
Copy link
Member

@antiochp antiochp commented Aug 27, 2019

This PR introduces an explicit separation between the txhashset and the header MMR.
We need both to fully validate full blocks (the output, rangeproofs and kernels are validated w.r.t. the corresponding header, which in turn is validated against the existing header MMR).

But we also need to be able to manipulate the header MMR in isolation - for example when validating "header first" and during initial header sync.

Some thoughts around what we do currently -

  • txhashset is received from a peer as part of fast sync
    • header MMR is built locally from sync'd headers
    • we do some funky temp dir stuff when validating a txhashset (and jump through hoops to copy the local header MMR around with it)
  • rebuilding the header MMR during fast sync is expensive (and we don't need to do it as we own the local header MMR)
  • we process "header first" and need to be able to mutate our header MMR in isolation
  • processing a full block involves processing the header first (is PoW valid etc.)
    • process the output|rangeproof|kernel MMRs based on pre-existing header MMR
  • processing a fork/reorg is tricky
    • our header chain may diverge from our main (full block) chain due to sync
    • so rewinding to process a block on a fork involves multiple stages of rewind

This PR moves the header MMR out from under the txhashset and into the local chain instance directly.

We now need a "pair" of extensions when processing full blocks -

  • the txhashset extension
  • the corresponding header extension

Both extensions are rewound independently, but consistently to support various fork/reorg scenarios. We rewind to a single consistent common ancestor (similar to a 3 way merge in git).
i.e. Our header MMR has diverged from the main chain due to a peer advertising previously unseen headers.

Making these changes has allowed a lot of code to be simplified and removed where we no longer need to handle edge cases and complex scenarios across the various MMRs when keeping them all consistent.


TODO -

  • missing docs and other build warnings
  • more testing

@garyyu
Copy link
Contributor

garyyu commented Aug 29, 2019

👍 this makes a lot of sense to me, to split it out from txhashset. I will take some tests in next days.

btw, May I ask why we constructed a MMR for header? what's the main benefit comparing to original pure database solution? I searched the origin PR #1716 but it didn't mention these.

@antiochp
Copy link
Member Author

btw, May I ask why we constructed a MMR for header? what's the main benefit comparing to original pure database solution? I searched the origin PR #1716 but it didn't mention these.

A header MMR gets us a step closer to being able to support FlyClient - #1555

@garyyu
Copy link
Contributor

garyyu commented Aug 29, 2019

Oh, Yes! now I remember it:-) thanks for your answer.

@antiochp antiochp force-pushed the split_header_mmr branch 3 times, most recently from 4436cb6 to 2f380cd Compare August 29, 2019 17:41
@antiochp antiochp mentioned this pull request Sep 3, 2019
@antiochp antiochp force-pushed the split_header_mmr branch 3 times, most recently from b873c4d to a06900a Compare September 4, 2019 08:20
@antiochp
Copy link
Member Author

antiochp commented Sep 4, 2019

@garyyu Did you get chance to test this? I'd like to merge soon if we can as it simplifies a lot of the rewind logic.

return Some(header);
if let Ok(hash_at_height) = header_pmmr.get_header_hash_by_height(header.height) {
if let Ok(header_at_height) = self.chain().get_block_header(&hash_at_height) {
if header.hash() == header_at_height.hash() {
Copy link
Contributor

@DavidBurkett DavidBurkett Sep 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check shouldn't be necessary anymore since we're retrieving by hash.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh looks like we can simplify this, but we do still need the check.
The core of this is to check the provided header against the header at that height on the current chain - if they have the same hash then the header is on the current chain.

  1. look up the header based on hash
  2. look up the header on our chain based on this height
  3. check these are the same header by hash

But we don't need to actually look the header up based on hash_at_height as we can just compare this hash directly.

if let Ok(header) = self.chain().get_block_header(&hash) {
	if let Ok(hash_at_height) = header_pmmr.get_header_hash_by_height(header.height) {
		if header.hash() == hash_at_height {
			return Some(header);
		}
	}
}

Copy link
Contributor

@garyyu garyyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take some time to finish reading the chain.rs, still need some time to read others on this big PR.

chain/src/chain.rs Show resolved Hide resolved
chain/src/chain.rs Outdated Show resolved Hide resolved
chain/src/chain.rs Show resolved Hide resolved
chain/src/chain.rs Show resolved Hide resolved
chain/src/chain.rs Show resolved Hide resolved
chain/src/pipe.rs Outdated Show resolved Hide resolved
chain/src/txhashset/txhashset.rs Show resolved Hide resolved
chain/src/txhashset/txhashset.rs Show resolved Hide resolved
and pass in header_pmmr for header lookups
@antiochp
Copy link
Member Author

antiochp commented Sep 5, 2019

Rebased on master to pull in #3015
Had to refactor rebuild_height_pos_index() moving it from the extension to the txhashset itself.
We only really want to use extensions when mutating the underlying MMRs in the txhashset or when viewing rewound MMR state.
We want to rebuild the height pos index based on current state so using the txhashset directly makes more sense here. We also need access to the header_pmmr so we can lookup headers by their height.

These changes has made the LOC grow quite a bit in this PR but its mainly just a refactor to deal with the breakout of the header_pmmr from the txhashset.

@garyyu
Copy link
Contributor

garyyu commented Sep 6, 2019

@antiochp Please let me know if it's ready for another reviewing, thanks.

@antiochp
Copy link
Member Author

antiochp commented Sep 6, 2019

@garyyu yes please!

Copy link
Contributor

@garyyu garyyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good for me, thanks again for this nice refactoring 👍
(a little clean-up needed for the unused old output pos index, but for making this PR not extending more contents, I can write another PR for that.)

@garyyu garyyu merged commit 3839852 into mimblewimble:master Sep 6, 2019
garyyu pushed a commit to gottstech/grin that referenced this pull request Sep 7, 2019
* wip

* sync sort of works now

* get rid of the deadlock during compaction
there is *always* a deadlock in there when we make changes like this...

* cleanup how we rebuild the sync MMR on init

* cleanup rewind logic

* roll the "fix invalid root" changes into this PR

* move rebuild_height_pos_index into txhashset
and pass in header_pmmr for header lookups

* cleanup and remember to setup sync head on init

* cleanup unnecessary ref muts

* rebuild_height_pos_index when writing txhashset
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants