Recovery of PoVs #388
Comments
Yeah, we are aware of it. And somewhere I have a todo to implement recovery from the relay chain. |
@bkchr thank you, looking forward to updates. |
I've explained this in chat and paritytech/polkadot#2203, but might as well spell it out here.
Safer Option: At least 2f+1 out of 3f+1 validators have signed that they hold availability pieces of Dave's block, so a collator could contact any f+1 of those, obtain their pieces, and reconstruct Dave's block from the erasure coding. This works fine, but requires opening connections to 256 < f+1 < 512 validators. It's fine if only the next upcoming collator does so, but not good if many do so.
Faster Option: Three classes of nodes know the full parachain block: the collator who created it (Dave), the two-ish validators who acted as backing checkers for it (maybe Dave's buddies), and the approval checkers who checked the block. Approval checkers can only be known by listening to the relay chain gossip, but Alice, Bob, and Charlie already do so anyway. Approval checkers do not currently retain Dave's block, but they could do so temporarily, which enables downloading the whole block by talking to only one validator. Approval checkers cannot be known by Dave in advance, meaning he cannot plan on this working. Yet, if Dave controls proportions q < 1/3 of the collators and p < 1/3 of the validators then he waits an expected 6 p^{-16}/q seconds between attempts, roughly 1 year with q=p=1/3. Acceptable risk since stalls are not soundness violations.
There is nothing wrong with either option, so we'll implement whichever looks simplest, almost surely the safer slower option. If everyone behaves then no problem. :) We could later add an opt-in form of the faster option, if we're worried both about this attack and about approval checkers holding blocks too long.
We've the same issues for XCMP messages because parachains must process incoming ones. We create XCMP messages as outputs of the sending chain's PVF, so again they're known by anyone who checks the sending block, i.e. the sending block's collator, backing checkers, or approval checkers. In this case however, we've cross parachain logic, so there is usually no simple honest path via which the message arrives. Also, receivers only want the message itself, not to rerun the sender's block, so the safer option really sucks. We'll therefore make receiving collators ask for their messages from the sender's backing and approval checkers, i.e. the faster approach. Again stalls remain an acceptable risk, but if abuse happens then we'll code up the safer slower approach as a fallback. It's kinda interesting that XCMP favors the opposite method, but not problematic. |
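To make the counts in the safer option concrete, here is a minimal sketch of the arithmetic, assuming the usual bound of n >= 3f+1 validators. The function names are hypothetical and chosen for illustration; they are not taken from the Polkadot or Cumulus code.

```rust
/// Largest number of faulty validators `f` such that `n >= 3f + 1`.
/// (Assumed Byzantine bound; hypothetical helper for illustration.)
fn max_faulty(n_validators: usize) -> usize {
    n_validators.saturating_sub(1) / 3
}

/// Number of erasure-coded pieces a collator must fetch to
/// reconstruct the block: `f + 1`.
fn pieces_needed(n_validators: usize) -> usize {
    max_faulty(n_validators) + 1
}

/// Number of validators that signed they hold their piece: `2f + 1`.
fn availability_signers(n_validators: usize) -> usize {
    2 * max_faulty(n_validators) + 1
}
```

With 1000 validators this gives f = 333, so a collator would need 334 pieces out of the 667 validators that signed availability, which is inside the 256 < f+1 < 512 range mentioned above.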
That works as long as we have a consensus that knows who the next collator is. Currently, with the relay chain based consensus, we cannot know this. |
We could provide detached proofs of being the next collator with any of aura, babe, and sassafras, but.. It's also fine if you think the fast option of asking backing and approval checkers might be less code or might carry enough additional value in terms of simplifying the XCMP implementation. We'll discuss retaining the candidate longer with Rob since that's our only sticking point. |
Yeah, I know, that will be easy :) |
I hijack this issue now ;) As described above, we need to implement support for PoV recovery through the relay chain's availability recovery. The PoV recovery should only be done by collators, to avoid putting too much pressure on the relay chain. The rough idea of how it should work: After we have imported a relay chain block that contains an unknown parachain block, we need to start the recovery:
After having recovered the PoV, we can decode it, import the inner block, and announce this block to the network. As we are currently running with a 12s block time, we should have enough time to recover the block before the next round starts and a new block needs to be produced. |
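The decision logic described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: `Hash`, `Action`, and `RecoveryTracker` are hypothetical names, not Cumulus types, and the actual fetching of the PoV from the relay chain, its decoding, and the block import are left out.

```rust
use std::collections::HashSet;

/// Hypothetical 32-byte block hash used only for this sketch.
type Hash = [u8; 32];

/// What to do for a parachain block seen in an imported relay chain block.
#[derive(Debug, PartialEq)]
enum Action {
    /// Block is already known or already being recovered.
    Nothing,
    /// Only collators recover PoVs, to limit load on the relay chain.
    SkipNotCollator,
    /// Start availability recovery for this block.
    StartRecovery(Hash),
}

/// Hypothetical bookkeeping for known and in-flight blocks.
struct RecoveryTracker {
    is_collator: bool,
    known_blocks: HashSet<Hash>,
    in_flight: HashSet<Hash>,
}

impl RecoveryTracker {
    fn new(is_collator: bool) -> Self {
        Self {
            is_collator,
            known_blocks: HashSet::new(),
            in_flight: HashSet::new(),
        }
    }

    /// Called after importing a relay chain block that includes `included`.
    fn on_relay_block_imported(&mut self, included: Hash) -> Action {
        if self.known_blocks.contains(&included) || self.in_flight.contains(&included) {
            return Action::Nothing;
        }
        if !self.is_collator {
            return Action::SkipNotCollator;
        }
        self.in_flight.insert(included);
        Action::StartRecovery(included)
    }

    /// Called once the PoV was recovered and decoded. Returns `true` if the
    /// inner block should now be imported and announced to the network.
    fn on_recovered(&mut self, hash: Hash) -> bool {
        if self.in_flight.remove(&hash) {
            self.known_blocks.insert(hash);
            true
        } else {
            false
        }
    }
}
```

Tracking in-flight hashes separately keeps a node from starting a second recovery for the same block if it shows up again before the first recovery finishes.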
Are parachain nodes meant to first seek the block from the parachain's own network? Or are we already too late for that by the time it appears on the relay chain? As an aside, after we have upcoming block authorizations or PrePVF or whatever we call them, then I think upcoming block producers could ask random approval checkers for the whole block, under the theory that outing yourself to a random validator causes little censorship risk. We should not be distracted by this sort of thing right now though, I suppose. |
We don't have any support for searching the parachain network for a given block :(
I would still like to see backers give all connected collators of a given parachain the chance to download a seconded PoV. If we got support for this, availability recovery should probably only be required if the backers cannot give us the data. |
I meant whether parachains should attempt to gossip blocks internally. We cleaned up the pre-collation ideas somewhat in paritytech/polkadot#2888 (comment). In this, we've some advantages to the majority of the parachain seeing the collation as originating from the backing validators, even if it later gets gossiped around the parachain. |
The whole point of using the recovery is to recover a PoV that wasn't gossiped in the parachain network, either because the collator was malicious or because the collator crashed before being able to gossip the block to other parachain nodes. |
I was wondering whether this is planned to be fixed before auctions go live on Kusama. |
I have it working in a local branch. I'm currently cleaning it up. I want to have this out asap |
Description
Let parachain A have four collators: Alice, Bob, Charlie, and Dave, and let Dave be a malicious collator. Currently, a single malicious collator is enough to stop block production on a parachain. Consider the following steps:
Possible solutions
A collator should probably be able to recover the block from the relay chain when it cannot get the best block through the sync process.