
implement Flyclient by Benedikt Bunz, Lucianna Kiffer, Loi Luu and Mahdi Zamani's #1555

Open
tromp opened this issue Sep 19, 2018 · 26 comments

Comments

@tromp
Contributor

tromp commented Sep 19, 2018

Replacing the prev_hash with the MMR root of all previous block hashes, as presented in
https://www.youtube.com/watch?time_continue=8404&v=BPNs9EVxWrA

@tromp tromp changed the title implement Benedikt Buenz's Flyclient implement Benedikt Bunz's Flyclient Sep 19, 2018
@ignopeverell
Contributor

To complement, this could have 2 applications:

  1. Light clients.
  2. Really fast sync, with full headers downloaded in the background afterward.

@ignopeverell ignopeverell added this to the Testnet4 milestone Sep 19, 2018
@ignopeverell
Contributor

Marked this for T4 just for the prev_hash header change. Actual sync can come later.

@antiochp
Member

@tromp @ignopeverell Starting to look a bit closer at this (been thinking about it in the background for a bit as well).

Our current MMRs (output, rangeproofs and kernel) all implement a "data" file and a "hash" file.

Are we thinking of doing something similar here?
Or do we ideally want to try and get away with just a hash file?

We have the block headers in the db/index, so we do not necessarily need a data file?
Or we could put headers in the data file and remove the need to store them all in the db?

The only issue I see with that approach is that headers are not a "fixed" size, specifically the PoW.
And our MMR impl assumes/requires data to be of a fixed size.
We get around this with the rangeproof MMR by using a max size - which implies we are "wasting" space in the rangeproof MMR (though not sure how much).

So -

  • store just header/pow hashes in the hash file of a modified MMR, or
  • store headers in a full MMR and find a solution for variable-sized data?

@tromp
Contributor Author

tromp commented Sep 25, 2018

We get around this with the rangeproof MMR by using a max size

I thought bulletproofs are all 674 bytes?
It's of course quite possible that future advances will allow smaller rangeproofs,
so having our MMRs able to handle variable sized data would be beneficial.

For the purpose of verifying accumulated difficulty, we need the sipkeys for each pow.
Assuming we don't want to store those, we need to derive them from the header.

So conceptually, having the data be all headers rather than just all pow is quite attractive.

The header MMR would be the one MMR to rule them all.

@antiochp
Member

We do this to define the len of a rangeproof in ser.rs -

grin/core/src/ser.rs

Lines 381 to 385 in 11f2d7b

impl PMMRable for RangeProof {
    fn len() -> usize {
        MAX_PROOF_SIZE + 8
    }
}

MAX_PROOF_SIZE comes from a const in the underlying rust-secp lib.

I was assuming this was because they are potentially of varying lengths.
Maybe they are technically, but all of our rangeproofs in Grin are the same length?

How much variation is there in the size of pow (solution?) in our headers?
Wondering if we could get away with just defining a MAX_HEADER_SIZE or similar (and just taking the hit on the wasted bytes)?
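
For illustration, a minimal sketch of the MAX_HEADER_SIZE idea, mirroring the RangeProof impl above (the const name and its value are made up, not actual grin code):

```rust
// Hypothetical sketch, mirroring the RangeProof impl above.
// MAX_HEADER_SIZE (and its value) are made up for illustration.
const MAX_HEADER_SIZE: usize = 256;

impl PMMRable for BlockHeader {
    fn len() -> usize {
        // Every data file entry occupies the max size; smaller headers
        // "waste" the difference, as with rangeproofs today.
        MAX_HEADER_SIZE
    }
}
```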

@tromp
Contributor Author

tromp commented Sep 25, 2018

How much variation is there in the size of pow (solution?) in our headers?

That depends on the choice of 2nd PoW on mainnet obviously, but if we have both
size 29 and size 32 cuckatoo, then there's a 15-byte difference (between 153 and 168 bytes).
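
(For the arithmetic: a cuckatoo solution is a 42-edge cycle with each edge index packed at edge-bits width, so 42 × 29 = 1218 bits rounds up to 153 bytes, while 42 × 32 = 1344 bits is exactly 168 bytes.)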

@antiochp
Member

antiochp commented Sep 25, 2018

Actually - it's probably not too complex to have a modified PMMRBackend that simply delegates storing of hashes to a "hash file" and storing of the actual data to the database (lmdb).
That way we can store arbitrary data without worrying about the data file format.
We don't actually need the data file backing the PMMR.

To append data to this backend -

  • hash the header, store hash in the hash file, note the pos we inserted into
  • store the header in the db
  • maintain an index in the db of pos (the new pos) to the header hash

We will basically have the existing header_by_height index alongside an additional header_by_pos index. These two indices will be specific to the most work chain.
And all the headers (even outside the most work chain) will be in the db, indexed directly by their hash (same as they are currently).
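
A rough sketch of that append path (types and method names here are hypothetical, not the actual grin backend API):

```rust
// Hypothetical sketch of the proposed backend - names and signatures
// are illustrative, not the actual grin PMMRBackend API.
struct HeaderMMRBackend {
    hash_file: AppendOnlyFile, // hash file backing the PMMR
    batch: store::Batch,       // lmdb batch for header + index writes
}

impl HeaderMMRBackend {
    fn append_header(&mut self, header: &BlockHeader) -> Result<u64, Error> {
        // Next insertion position in the MMR (parent hashes along
        // the peaks omitted for brevity).
        let pos = self.hash_file.size_in_hashes();
        // Hash the header together with its pos (hash_with_index).
        let hash = header.hash_with_index(pos);
        self.hash_file.append(&hash)?;
        // Headers themselves live in the db keyed by hash (as today) -
        self.batch.save_header(header)?;
        // - plus a pos -> header hash index, specific to the most-work chain.
        self.batch.save_header_pos_index(pos, header.hash())?;
        Ok(pos)
    }
}
```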

@tromp
Contributor Author

tromp commented Sep 25, 2018

hash the header, store hash in the hash file, note the pos we inserted into

Note that "hash the header" is already done together with the pos (hash_with_index).

@antiochp
Member

Yeah - I just mean we need to know the pos to index the header against, based on where we insert the hash into the hash file.

@ignopeverell
Contributor

The len() is a leftover from when we had Borromean-sig-based range proofs. Maybe some code to remove...? I'm ok with the approach otherwise.

@antiochp
Member

I carried out some refactoring to make the header processing (both sync and during block propagation) a bit easier to reason about.

I spiked (a couple of different times) trying to wire a basic "header MMR" into the processing pipeline.
Tried adding this to both the existing txhashset (as a fourth PMMR) and as a standalone txhashset-type thing, just for headers.

Each time I came up against roadblocks getting it to work in practice.

Putting down some thoughts around what I believe to be the complexity here -

Blocks vs. Headers

  • We process blocks through either normal propagation via gossip/broadcast or when syncing.
    During normal block propagation a block either increases total work and becomes the new head of the chain or exists on a fork.
    • If it becomes the new head of the chain we do not need to rewind.
    • If it is on a fork we rewind our MMRs, apply it as necessary, then discard the rewound MMRs.
  • At any given point in time we have one "chain head" (the block with most total work).
  • Everything else is a losing fork until such time as any fork increases the total work and becomes the main chain.
  • We do not need to track these losing forks other than storing the blocks in the db/index so that we can rewind and reapply to the MMRs as required.
  • The output|rangeproof|kernel MMRs (on disk) simply track the main chain.

Headers are different though.

  • Any peer may advertise a fork with higher total work at any time.
  • We do not want to switch over to tracking this new fork until we can verify it is indeed "most work".
    • To facilitate this we track both a "header head" and a "sync head" for headers.
  • The "header head" is the header we know to be "most work".
    • This is the chain we are currently tracking.
  • During sync we request blocks along the chain toward the "header head" if we do not have them yet.
  • The "sync head" allows us to track a potential alternative fork so we can request headers on this fork to verify it.
  • Once verified as "most work" we update the "header head", switch to this new chain and request blocks.

The critical point here is we need to support two competing header chains at any given time.
Supporting these with a single header MMR involves either -

  1. constant flip-flopping, rewinding and reapplying to a single MMR (which gets prohibitive during sync)
  2. choosing a single header fork to track (DoS vector from a mischievous peer lying about competing forks)

So coming to the conclusion that we cannot do a "Header MMR" without actually maintaining two MMRs internally.
Any header received requires us to rebuild (via rewind and reapply if necessary) the full header MMR up to and including that header.
This would be a lot easier to implement if we maintained a "header head MMR" and a "sync head MMR".

Going to explore this a bit further but I'm pretty sure we cannot do this with a single header MMR.
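
Roughly the shape I have in mind, as a hedged sketch (types and helper fns are hypothetical):

```rust
// Hypothetical sketch - two independent header PMMRs, one per tracked
// chain. Types and helper fns are illustrative only.
struct HeaderChains {
    // Tracks the chain we currently know to be most-work (header_head).
    header_pmmr: PMMRHandle,
    // Tracks the candidate fork we are syncing toward (sync_head).
    sync_pmmr: PMMRHandle,
}

impl HeaderChains {
    fn process_sync_headers(&mut self, headers: &[BlockHeader]) -> Result<(), Error> {
        // Rewind/apply against the sync MMR only; header_head stays
        // stable so a new block at the current head can still be
        // handled while header sync is in progress.
        rewind_and_apply(&mut self.sync_pmmr, headers)?;
        // Only once the fork proves most-work do we touch the header
        // MMR and switch over.
        if total_work(&self.sync_pmmr)? > total_work(&self.header_pmmr)? {
            rewind_and_apply(&mut self.header_pmmr, headers)?;
        }
        Ok(())
    }
}
```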

@tromp
Contributor Author

tromp commented Sep 28, 2018

Replacing prev_hash by a header MMR root presents obvious difficulties for syncing, as one needs the frontier of a block's header MMR to verify the next block's header MMR. So when learning of a higher cumdiff block, all one can do is check its PoW, ask for the preceding header (whose id we cannot know), and repeat until the branch merges onto the history we do have. Only at that point can we go forward and fully check the branch, including block downloads. Once done we would switch head to sync head.
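
In rough pseudocode (every helper fn here is hypothetical):

```rust
// Hedged sketch of the backward walk (every helper fn is hypothetical).
fn sync_higher_cumdiff_branch(mut header: BlockHeader) -> Result<(), Error> {
    let mut branch = vec![];
    // Walk backward until the branch merges onto history we do have;
    // we cannot know header ids ahead of time, only ask for "previous".
    while !on_our_chain(&header) {
        check_pow(&header)?; // cheap, verified instantly
        branch.push(header.clone());
        header = request_previous_header(&header)?;
    }
    // Only now can we go forward and fully check the branch
    // (including block downloads), then switch head to sync head.
    branch.reverse();
    validate_and_apply(branch)
}
```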

@ignopeverell
Contributor

ignopeverell commented Oct 1, 2018

  1. constant flip-flopping, rewinding and reapplying to a single MMR (which gets prohibitive during sync)
  2. choosing a single header fork to track (DoS vector from a mischievous peer lying about competing forks)

I'm curious why you think either of these is a problem. We already do 2, as we will pick the most-work header, which could be advertised by a single peer. There is no way around that. And the way you address the DoS is to ban that peer as soon as you realize it lied (see #1139). Regarding 1, rewind should be cheap and headers aren't too large to re-load and hash, so it should be a really minor cost in sync overall (and generally null due to 2).

@tromp
Contributor Author

tromp commented Oct 1, 2018

There is still a danger of a DoS attack if we believe a claim of higher cumdiff with very low current diff.
But in this case few resources are wasted, as we will keep asking for the previous header,
and checking its PoW instantly.
If we don't want our bandwidth wasted, then we could compute the minimum number of blocks needed to drop difficulty by, say, a factor of 10, and ask the peer to just provide headers from that height downward. Then they can no longer fake too low a diff...
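
As a back-of-the-envelope sketch, assuming difficulty can fall by at most some factor per block (the exact bound depends on our difficulty adjustment and its dampening, so the parameter here is an assumption):

```rust
// Hedged sketch: smallest n such that r^n <= 1/factor, i.e.
// n = ceil(ln(factor) / ln(1/r)). `max_drop_per_block` (r < 1.0) is an
// assumed bound, not an actual grin constant.
fn min_blocks_for_drop(max_drop_per_block: f64, factor: f64) -> u64 {
    (factor.ln() / (1.0 / max_drop_per_block).ln()).ceil() as u64
}

// e.g. if difficulty can fall by at most 1% per block:
// min_blocks_for_drop(0.99, 10.0) == 230
```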

@ignopeverell
Contributor

I guess what I was trying to say is that referencing the MMR root in itself doesn't increase those risks.

@antiochp
Member

antiochp commented Oct 2, 2018

I don't think I explained my concerns very well earlier, so let me try and describe the scenario that's bothering me -

For simplicity we assume height is a reasonable proxy for cumulative difficulty.

  • node is fully in sync and up to date at height 100,000.
  • a peer surprises everyone and advertises (via ping/pong) a new chain head with more work
    • new height is 100,005
    • but forked off at 99,000 (1,000 blocks prior)

Our node then does the following -

  • switches to header_sync mode
  • inits its sync_head to current header_head (height 100,000)
  • builds a locator from height 100,000 as necessary
  • asks the peer with the advertised head for headers passing the locator

The peer uses the locator to identify an approximate fork point and sends back headers on its chain from that point.
This will be at least 1,000 blocks prior and may be considerably earlier than this.

Let's assume for simplicity the peer sends back headers [99000, 99001, ..., 99511].

Our node then processes these new headers.
It needs to -

  • rewind the header MMR to height 98,999
  • reapply any headers necessary (in this case none)
  • apply new headers

The issue is we're actually tracking two chains at this point -

  1. the main chain (both head and header_head are pointing at the old height 100,000).
    And we need this behavior to be stable so we can safely handle block height 100,001 if it gets propagated while we're in the middle of the header_sync.
    We need this because we don't yet know this new set of headers will turn out to be valid.

  2. the new "candidate" header chain based on the peer advertising a new head (and we track our progress on this via our sync_head).

The header MMR will only be able to track one of these in its persisted state. Presumably it tracks header_head and not sync_head.

Back to the example - our node now builds a new locator and asks the peer for more headers.
We get back the headers [99512, 99513, ..., 100004, 100005].
Our node now needs to -

  • rewind all the way back to height 98,999 (where the fork occurred)
  • reapply headers [99000, 99001, ..., 99511] (we have them in the db, but not in the header MMR)
  • apply new headers [99512, 99513, ..., 100004, 100005]

This "rewind and reapply" is not cheap (relatively speaking, compared to normal header sync).

Our "streaming headers" works against us here as well.
We don't just need to rewind and reapply a couple of batches of 500 headers at a time, we actually have to rewind and reapply for each batch of 8 headers, each time rewinding all the way to the fork point and reapplying all subsequent batches on top of this.

  • rewind 1000 headers, apply 8 headers, then
  • rewind 1000 headers, apply 16 headers, then
  • rewind 1000 headers, apply 24 headers, then
  • ...
  • rewind 1000 headers, apply 1005 headers

And this gets really expensive with a long fork.
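
To put rough numbers on the example: 1,005 headers in batches of 8 is ~126 batches, so on the order of 126 × 1,000 = 126,000 headers rewound, plus 8 + 16 + ... + 1,005 (about 64,000) reapplied. For a fixed batch size the cost grows quadratically with the fork length.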


This is not an issue currently because we track enough state for the two chains, with our sync_head and our header_head. And by storing all the headers we have ever seen in the db, not as an explicit chain, but simply keyed by hash.

Once we introduce state that makes the chain explicit (i.e. the header MMR which represents a single chain), then "rewind and reapply" to flip-flop between the two chains gets expensive for longer forks.

Does that make sense?

@ignopeverell
Contributor

It does make sense but it doesn't seem too hard to optimize (for a relative version of hard). Here are a few ideas, which you probably already thought of but maybe not in conjunction:

  1. Past headers in the alt chain don't need to be re-validated, they just need to be appended again to the MMR after rewind. This is cheap (most expensive is reading them again from db, which isn't bad especially with the header cache).
  2. The batch size of 8 can definitely be optimized. Especially if we implement your iterator idea, we could read a lot more from the iterator when a rewind is required.

To put this in perspective, consider kernels. A block ultimately could reasonably have a few thousand of those. Even a short fork of a few blocks would need to rewind and reapply 10k or 20k elements arranged in larger batches (blocks). So we do need to be fast at doing that regardless.

@bbuenz

bbuenz commented Oct 2, 2018

Awesome! Two things: Please also credit my coauthors Lucianna Kiffer, Loi Luu and Mahdi Zamani. And we are working hard to have the paper up as soon as possible. That will hopefully create some clarity. We have a new updated analysis that gives you the optimal sampling strategy and even takes into account changing difficulty. I guess the most important thing is that the MMR (https://eprint.iacr.org/2015/718.pdf) structure remains the same and is the only thing that needs to be implemented in the blockchain. Everything else, like how exactly the proof is constructed, is purely a client-side change.

@bbuenz

bbuenz commented Oct 2, 2018

I think that Beam has also been working on an implementation. Not sure what the politics are but perhaps it's possible to work together?

@tromp
Contributor Author

tromp commented Oct 2, 2018

Excuse my ignorance, but what is a locator and how does it allow identification of an approximate fork point?

Regarding "have to rewind and reapply for each batch of 8 headers", why is that quadratic behaviour needed? Why not rewind 1000 headers and apply 1005 headers the first time?

Am I right in thinking that rewinding an MMR n steps takes time linear in n?

@ignopeverell
Contributor

what is a locator and how does it allow identification of an approximate fork point?

Same idea as in bitcoin. In short, a node sends the header hashes it knows at heights 2^n back from the head, down to genesis. The receiving node identifies the common headers and sends what comes after that (on a fork or not).
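
A minimal self-contained sketch of the height selection (illustrative only, not the actual grin code):

```rust
// Illustrative sketch of locator height selection - steps back from
// the head with doubling gaps, down to genesis (not the actual grin fn).
fn locator_heights(head: u64) -> Vec<u64> {
    let mut heights = vec![];
    let mut step = 1u64;
    let mut height = head;
    loop {
        heights.push(height);
        if height == 0 {
            break; // genesis reached
        }
        height = height.saturating_sub(step);
        step *= 2;
    }
    heights
}

// locator_heights(100_000) -> [100000, 99999, 99997, 99993, 99985, ..., 0]
```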

why not rewind 1000 headers and apply 1005 headers the first time?

Just an idiosyncrasy of parsing headers from the network; it just happens to be fastest if we read them in batches of 8. But as I mentioned, we can definitely introduce larger batches when we know there's more handling involved.

Am I right in thinking that rewinding an MMR n steps takes time linear in n?

Rewind itself is mostly constant. We just jump back to the fork point in one go. That's another nice thing with MMRs. There's a caveat around spent bitmaps, which need to be reconstructed; that part is linear.

@tromp tromp changed the title implement Benedikt Bunz's Flyclient implement Flyclient by Benedikt Bunz, Lucrecia Kiffer, Loi Luu and Mahdi Zamani's Oct 4, 2018
@antiochp
Member

See #1726 for concrete tasks for testnet4.

@antiochp antiochp removed this from the Testnet4 milestone Oct 17, 2018
@antiochp
Member

FlyClient (in loose terms) allows us to validate the full header chain by randomly sampling a small subset of headers and providing Merkle proofs that those headers are included in the full header chain (based on the previous header_root committed to in the header).

Can we then do something similar for the kernel MMR? Can we randomly sample a small subset of kernels and do the same thing?

i.e.

  • random sample of headers, then for each header
    • Merkle proof to prove inclusion in header chain based on header_root of latest header
    • random sample of kernels (from set of kernels up to this header), then for each kernel
      • Merkle proof to prove inclusion in the kernel set based on kernel_root of the header

So we prove that n headers are included in the current header chain.
And that for each of those headers, we prove that m kernels are included in the kernel MMR committed to in the header.

Does this allow us to avoid verifying the full kernel sum for the full header chain during fast sync?
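
For the header half of this, verification might look roughly like the following hedged sketch (the helpers and the exact MerkleProof::verify signature are hypothetical):

```rust
// Hedged sketch of the header-sampling half (helpers and the exact
// MerkleProof::verify signature are hypothetical).
fn verify_sampled_headers(
    latest: &BlockHeader,
    samples: &[(BlockHeader, MerkleProof)],
) -> Result<(), Error> {
    for (header, proof) in samples {
        // Proof that this header is included in the header MMR whose
        // root the latest header commits to (height -> MMR pos mapping
        // elided for brevity).
        proof.verify(latest.header_root, header, header.height)?;
        // The sampled header's PoW is checked directly.
        check_pow(header)?;
    }
    Ok(())
}
```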

@tromp
Contributor Author

tromp commented Dec 12, 2018

If a random sample of PoWs that the latest block commits to are valid, then you know that the cumulative difficulty is approximately correct.

There is no similar guarantee for a sample of kernels. You need to verify every single kernel to know that the current UTXO set is correct. Knowing that 99.999% of kernels are correct is not enough.

@antiochp
Member

Shame.

@tromp tromp changed the title implement Flyclient by Benedikt Bunz, Lucrecia Kiffer, Loi Luu and Mahdi Zamani's implement Flyclient by Benedikt Bunz, Lucianna Kiffer, Loi Luu and Mahdi Zamani's Oct 3, 2019
@tromp
Contributor Author

tromp commented Oct 3, 2019

Latest version of flyclient paper at https://eprint.iacr.org/2019/226
