
Assignment of availability-chunk indices to validators #47

Merged
merged 11 commits into polkadot-fellows:main on Jan 25, 2024

Conversation

@alindima (Contributor) commented Nov 13, 2023

Rendered

This RFC proposes a way of permuting the availability chunk indices assigned to validators for a given core and relay chain block, in the context of recovering available data from systematic chunks.

Signed-off-by: alindima <alin@parity.io>
@alindima (Contributor, Author) commented:

CC: @sandreim, @burdges, @ordian, @eskimor, @alexggh

@alindima alindima marked this pull request as draft November 13, 2023 12:09
Comment on lines 147 to 148
This message type will include the relay block hash where the candidate was included. This information will be used
in order to query the runtime API and retrieve the core index that the candidate was occupying.
@alindima (Contributor, Author):

On second thought: is this even possible? If the validator is not aware of the fork, how can it call a runtime API on that fork?


if the validator hasn't seen that fork, it can't call into the runtime API of that fork

@ordian left a comment:

I do agree we need such a mapping, but I'm not convinced it needs to be a runtime API (the only benefit to me seems to be simplifying alternative client impls, but not by much).

Such a mapping doesn't enable systematic chunk recovery - it's possible already - but it spreads the load more evenly across validators (right now all systematic chunks are assigned to the same validators per session regardless of block or core id).

My suggestion is to use a hash of ParaId for now instead of core_index, since ParaId is readily available in the candidate receipt.

Comment on lines 104 to 105
1. mitigates the problem of third-party libraries changing the implementations of the `ChaCha8Rng` or the `rand::shuffle`
that could be introduced in further versions, which would stall parachains. This would be quite an "easy" attack.

I didn't get this argument. Are you talking about supply chain attacks? How is it specific to this RFC?

@alindima (Contributor, Author):

here's an example:

  1. all validators are running normally, using systematic recovery.
  2. a contributor on the rand crate (malicious or not) makes a perfectly valid change to the shuffling algorithm, which results in a different output for shuffle().
  3. there's a polkadot node release that bumps rand to this new version.
  4. some of the validators upgrade their node.
  5. as a result, some/all parachains are stalled, because some validators have a differing view of the validator->chunk mapping. Also, the validators that upgraded stop receiving rewards

IMO this has a high chance of happening in the future.

I view this mapping in a similar way to the per-session validator shuffle we do in the runtime to choose the validator active set.
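For reference, a minimal sketch of the kind of seeded shuffle under discussion (the seed derivation and function name are illustrative, not the RFC's final design):

```rust
use rand::{seq::SliceRandom, SeedableRng};
use rand_chacha::ChaCha8Rng;

// Illustrative only: derive a deterministic permutation of chunk indices from a
// 32-byte seed (e.g. derived from the relay parent). Any behavioural change to
// `shuffle` or `ChaCha8Rng` across crate versions changes this output, which is
// exactly the churn risk described above.
fn shuffled_chunk_indices(seed: [u8; 32], n_validators: u32) -> Vec<u32> {
    let mut rng = ChaCha8Rng::from_seed(seed);
    let mut indices: Vec<u32> = (0..n_validators).collect();
    indices.shuffle(&mut rng);
    indices
}
```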

@alindima (Contributor, Author):

Now, we could implement our own version of shuffle, which would mitigate this issue.
The thing I'm concerned about is implementing our own version of ChaCha8Rng. How feasible is it to assume that it won't change in the future? CC: @burdges

@burdges commented Nov 23, 2023:

It's not a supply chain attack so much as rand having a history of churn & instability.

1. relatively quick to compute and resource-efficient.
1. when considering the other params besides `validator_index` as fixed,
the function should describe a random permutation of the chunk indices
1. considering `session_index` and `block_number` as fixed arguments, the validators that map to the first N/3 chunk indices should

Thinking more about my argument to use session_index here, I think it's a very niche edge-case that's not worth worrying about. An alternative to (session_index, block_number) would be block hash which would also sidestep that issue.

@alindima (Contributor, Author):

Sounds fair. Another issue I just thought of with using block number is that for disputes happening on unimported forks, the ChainAPI call for getting the block number would also fail.

Comment on lines 155 to 159
As an alternative to core_index, the `ParaId` could be used. It has the advantage of being readily available in the
`CandidateReceipt`, which would enable the dispute communication protocol to not change and would simplify the
implementation.
However, in the context of [CoreJam](https://github.com/polkadot-fellows/RFCs/pull/31), `ParaId`s will no longer exist
(at least not in their current form).
@ordian commented Nov 13, 2023:

IMHO we shouldn't worry too much about CoreJam compatibility. IIUC there might be a drop-in replacement for ParaId (AuthId), so we should first explore an avenue with ParaId perhaps.

The (bigger) problem with ParaId is that it's claimable by users, so in theory one could create a collision attack where multiple paras have the same systematic chunks. In practice, I believe such an attack would be high cost and low benefit. And, perhaps, it could be mitigated by using a cryptographic hash of ParaId (or AuthId or core_index in the future).


We should use the core index here probably, not the para id. I'd thought the core indexes could be spaced somewhat evenly, but with the actual sequence being random.


I changed my view here above in #47 (comment)

It's probably more future proof if we do not require the core index here, so the map uses num_cores, num_validators, relay parent, and paraid. We've other options but this looks pretty future proof. CoreJam cannot really work without something like paraid.

@alindima (Contributor, Author):

Using paraid was my first attempt, but it complicates things quite a bit.
As Andronik notes, it's claimable by a user. I don't think that's the biggest problem though; they can't get a different ParaId every day (can they?).

paraid's biggest problem is that it doesn't form a strictly monotonically increasing sequence, so we'd have to do something else with it. My initial attempt was to seed a ChaCha8 RNG with it and generate one random number. Then this number was the offset into the chunk index vector.

But then if we complicate it so much, we may want to embed it into the runtime, so that future updates to the rand crate don't break us (which creates all sorts of problems that could be avoided).

We've had this paraid -> core_index back and forth for a while now, it'd be great if we could all (the ones interested in this RFC) hop on a call or somehow reach a conclusion. There are pros and cons to both, AFAICT.

I think this remains the only bit that prevents this RFC from moving on (anybody correct me if I'm wrong)


@alindima (Contributor, Author) commented:

Considering feedback and latest discoveries, I'll rewrite the proposal using ParaId instead of CoreIndex and using relay_parent_hash instead of (session_index, block_number)

@alindima (Contributor, Author) commented:

I've rewritten the proposal. Have a look.

@ordian left a comment:

Thanks for addressing the comments.

I am still not convinced we need runtime API for this, but curious what others think. If we put the idea to an extreme, we could put most of the client code into the runtime, so it can be automatically and atomically upgraded. But do we want to?

Comment on lines 123 to 124
mapping, this mitigates the problem of third-party libraries changing the implementations of the `ChaCha8Rng` or the `rand::shuffle`
that could be introduced in further versions. This would stall parachains if only a portion of validators upgraded the node.

We should obviously not rely in consensus on something that is not specified to be deterministic, like https://polkadot.network/blog/a-polkadot-postmortem-24-05-2021#the-good. And try to reduce third-party dependencies in general.
I would assume that a library that implements ChaCha8Rng adheres to the spec, otherwise it's a bug.
To mitigate supply-chain issues including bugs, we should probably use cargo-vet or a similar tool, but again, this is out of scope for, and not limited to, this RFC.


Agreed. What we would want is some version/spec identifier that is only allowed to change at session boundaries. Then we can put a requirement on nodes to implement that spec and, once enough clients did, we can do the switch.

While we are at it, this should probably take into account the used erasure coding itself as well. We should be able to swap it out for a better implementation if the need arises.

@tomaka (Contributor) commented Nov 23, 2023:

This is off-topic for this RFC, but as a heads up we already use ChaCha20 and shuffle for the validators gossip topology: https://github.com/paritytech/polkadot-sdk/blob/2d09e83d0703ca6bf6aba773e80ea14576887ac7/polkadot/node/network/gossip-support/src/lib.rs#L601-L610

@alindima (Contributor, Author):

> Agreed. What we would want is some version/spec identifier that is only allowed to change at session boundaries. Then we can put a requirement on nodes to implement that spec and, once enough clients did, we can do the switch.
>
> While we are at it, this should probably take into account the used erasure coding itself as well. We should be able to swap it out for a better implementation if the need arises.

Yes. We'll use the new NodeFeatures runtime API for that: paritytech/polkadot-sdk#2177

If we make changes in the future to either the shuffling algorithm or the underlying Reed-Solomon algorithm, we can add a new feature bit there.

@alindima (Contributor, Author) commented:

Thanks for the reviews and suggestions Andronik!
Curious to see what others think as well!


) -> ChunkIndex {
let threshold = systematic_threshold(n_validators); // Roughly n_validators/3
let seed = derive_seed(relay_parent);
let mut rng: ChaCha8Rng = SeedableRng::from_seed(seed);

Why do we need this? Can we not deterministically arrive at an assignment based on block number and para ids, in a way that perfectly evens out load?

For example: We use `block_number % n_validators` as a starting index. Then we take `threshold` validators starting from that position to be the systematic chunk indices for the first para in that block. The next para starts at the additional offset `threshold`, so `(block_number + threshold) % n_validators`, and so on (a sketch follows this comment).

We could shuffle validator indices before doing that, but I don't see how this gains us anything.

Now, using information that is not available as part of the candidate receipt was part of the problem we wanted to avoid: what the "next" para is, is not necessarily known in disputes. Essentially this means we are using the core number.

But:

  1. It usually is, most validators should have seen a block that got disputed, otherwise the candidates could never have gotten included.
  2. Disputes are not the hot path. They are an exceptional event, that should barely ever happen. It should not be an issue, if disputes are not able to use systemic chunks always or even ever.

There are other cases, e.g. collators recovering availability because the block author is censoring. But also those should be exceptional. If systemic chunk recovery would not be possible here, it would not be a huge problem either. On top of the fact that this should also not be a common case, the recovery here is also not done by validators - so worse recovery performance would not be an issue to the network.

Summary:

Systemic chunk recovery should only be important/relevant in approval voting, where we have to recover a whole lot of data every single relay chain block.
Therefore it would be good, if systemic chunks worked always, but I would consider it totally acceptable if it were an optimization that would only be supported in approval voting.
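A minimal sketch of the round-robin assignment described above (block-number start position plus per-core `threshold` offsets); the exact formula is illustrative, not taken from the RFC:

```rust
// Illustrative sketch of the scheme described in this comment, not the final RFC mapping.
// Validators [core_start_pos, core_start_pos + threshold) hold the systematic chunks
// for the given core, and the window shifts with the block number.
fn chunk_index_round_robin(
    n_validators: u32,
    validator_index: u32,
    block_number: u32,
    core_index: u32,
) -> u32 {
    // Roughly n_validators / 3 (see the threshold discussion further down).
    let threshold = (n_validators - 1) / 3;
    let core_start_pos = (block_number + core_index * threshold) % n_validators;
    (validator_index + n_validators - core_start_pos) % n_validators
}
```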


Avoid paraid here imho, not overly keen on relay parent either.

Instead use era/session, slot, and chain spec to define the validator sequence, and then core index to define the start position in the validator sequence.


Validators are already randomized at the beginning of the session. Core index for start position makes sense. What is wrong with using the block number in addition?

The reason I would like to have it dependent on the block (could also be slot, I just don't see the benefit) is that by having a start position per core index, we ensure equal distribution of systematic chunks across a block, but paras are not all equal. Some could be heavier than others, hence it would be beneficial to change the mapping each block.

In my opinion we could also use the hash instead of the block number, here I think anything is likely better than static.

@burdges commented Nov 23, 2023:

We want slot or block I think. We're not overly adversarial here, but you can manipulate block numbers easily, while slot numbers represent an opportunity for doing something, so you always pay 19 DOTs or more to choose another one. I'd say slot not block.

We randomize the validator list based upon the randomness two epochs/sessions ago? Cool. Any idea if we similarly randomize the map from paraids to cores per era/session too? If yes, then maybe we could just progress sequentially through them?

let k = num_validators / num_cores;
let first_validator_index_for_core = ((core_id - slot) * k % num_validators) as u32;

We'd prefer the randomization of core_id for this because otherwise you could still make hot spots. We could have hot spots even with this scheme, but they would not be so easy to make. We'd avoid those if we randomize per slot.

Also, this exact computation does not work due to signed vs unsigned arithmetic, and the fact that it suggests things progress backwards as time progresses, which again was an attempt to avoid hot spots.

@alindima (Contributor, Author):

> Any idea if we similarly randomize the map from paraids to cores per era/session too?

AFAICT from the scheduler code, we don't (at least for the parachain auction model; for on-demand, I see more complicated logic which takes into account core affinities).

> We're not overly adversarial here, but you can manipulate block numbers easily, while slot numbers represent an opportunity for doing something, so you always pay 19 DOTs or more to choose another one. I'd say slot not block.

I'm having a hard time understanding the advantage of slot number vs block number. This may be simply because I don't know that much about slots. AFAICT, slots are equally useful as block number for the mapping function (monotonically increasing by 1), except that there may be slots that go unoccupied (if the chain is stalled for example) and are therefore skipped. Is that correct?

> so you always pay 19 DOTs or more to choose another one

What is this fee? Where can I read more about this?


@burdges I don't get your arguments. How can block numbers be manipulated? They are always increasing by 1, so you clearly must be talking about forks. So an adversary could create multiple forks, all with the same load distribution, and try to overload validators this way? If we create forks or reversions we already have a performance problem anyway.

Really not getting how slot numbers are better here. I also don't get your argument about hot spots; my proposal above was precisely to avoid hot spots (by not using randomness). What do you mean by hot spots and how would randomness help here?

A single validator not providing its systemic chunk would be enough to break systemic recovery. I don't see how randomization schemes help here. If we were smart we could somehow track the validators withholding systemic chunks and then make an assignment where they are all pooled together into one candidate. This way, at least only one systemic recovery fails. (We could equally well just remove them entirely.)

But honestly, I would not worry too much about this here. If we ever found that validators try to mess with this on purpose, the threat is low enough that social (calling them out) and governance would be adequate measures to deal with them.


actually the ideal would be to use the relay parent hash directly. Then we don't even need to be able to lookup the block number to determine systemic chunk indices. We will eventually have the core index in the candidate receipt - it would be really good to have this self contained at least once we have that.

Obviously hashes can be influenced trivially by block authors ... but is this a real issue? Assuming we have enough cores, load will be pretty much evenly distributed among validators no matter the hash. The only thing that changes is to which core one gets assigned. I don't mind too much if this can be influenced ... Are there any real concerns here?*)

*) As the system matures, I would assume (especially with CoreJam or similar developments) that candidates will be pretty evened out in load (maxed out). So a validator should not gain much by picking which parachain it wants to have a systemic chunk for.

@alindima (Contributor, Author) commented Dec 4, 2023:

> A single validator not providing its systemic chunk would be enough to break systemic recovery

Slightly offtopic: In my WIP PR I added functionality to request up to 5 systematic chunks from the backing group as a backup solution, so that a couple of validators not returning their systematic chunks would not invalidate the entire procedure


We could permit requesting them all from the backing group, but make the availability providers the first choice, meaning maybe: 1st, try all systemic chunk providers. 2nd, try remaining systemic chunks from backers. 3rd, fetch random non-systemic chunks. The concern is just that we overload the backers.

It'll also make rewards somewhat more fragile, but likely worth the difficulty for the performance.

(from backing to inclusion).

Availability-recovery can currently be triggered by the following phases in the polkadot protocol:
1. During the approval voting process.

As long as recovery is possible in disputes, it should be fine if it cannot always use systemic recovery. What is missing for me is to understand why this needs to be an incompatible change. Shouldn't recovery be possible always, even if you don't know what the systemic chunks are?

@ordian commented Nov 23, 2023:

Currently, when you request a chunk, you need to know which chunk to request. If you specify a wrong chunk, you'll get an empty response. So everyone should be on board with how chunks are shuffled in order for recovery to work, not just recovery from systematic chunks. Unless we rely on backers to have all the chunks, which we can't in case of a dispute.

We could add an additional request type to request any chunk you have. That could probably be a reasonable fallback in this case.

@alindima (Contributor, Author):

> We could add an additional request type to request any chunk you have. That could probably be a reasonable fallback in this case.

I don't think this achieves the purpose of availability-recovery working with an arbitrary mix of upgraded & non-upgraded validators. Adding a new request type (or even changing the meaning of the existing one) would also mean a node upgrade (because responding to the new type of request is only useful for the nodes that haven't yet upgraded to use the new mapping). So validators might as well just upgrade to the version that uses the new shuffle. I'm not sure this has any benefit


I meant as a fallback in case we want to use core_ids which might be not readily available in some cases of disputes.

@alindima (Contributor, Author):

> I meant as a fallback in case we want to use core_ids which might be not readily available in some cases of disputes.

I see. In this case, it would be a reasonable fallback. But this would mean giving up systematic recovery in those cases. I'm wondering if it's worth doing all of this just to replace the para-id thingy with the core_id


Ahh fine. We'll lose the mappings sometime before the data expires? We should probably avoid doing that. We can obviously ask everyone and account for the DLs differently, but I'd prefer to be more like bittorrent trackers here, so that parachains can fetch data this way.

@alindima (Contributor, Author):

> We'll lose the mappings sometime before the data expires?

It can happen if a validator imports dispute statements for a dispute ongoing on some fork that the node has not imported yet.
Or even if a validator was just launched and has no data recorded about what core the candidate was occupying while pending availability.

@ordian or @eskimor correct me if I'm wrong


Yes. All validators will know the correct mapping, if they have seen (and still know) the relay parent. There is a good chance that this is the case even in disputes, but it is not guaranteed and recovery must work even in that case (but can be slower).

Using the core index should be fine, what makes that even more acceptable to me is that we actually want (and need to) include the core index in the candidate receipt at some point. So the moment we have this change, then the block availability will cease to matter.

@alindima (Contributor, Author):

> Using the core index should be fine, what makes that even more acceptable to me is that we actually want (and need to) include the core index in the candidate receipt at some point.

That's needed for CoreJam I would assume?


elastic scaling also.

@alindima alindima changed the title Random assignment of availability-chunk indices to validators Assignment of availability-chunk indices to validators Nov 27, 2023
@alindima (Contributor, Author) commented Nov 27, 2023

Thanks to everybody for the feedback!
Based on this and especially my discussions with @eskimor , I will modify the RFC so that:

  1. The shuffling function does not contain any randomness (or other complex third party libraries). This mitigates the problem of the rand crate changing implementations and enables us to not embed the algorithm in the runtime. We want an implementation that can be easily coded and has minimal points of failure in terms of updates/supply chain issues. It'll also make it easier for alternate clients to implement the same algorithm.
  2. The shuffling function uses core_index instead of para_id. This enables us to easily implement an even distribution, because core indices form a monotonically increasing sequence. At the same time, core_index cannot be chosen by a parachain to influence the chunk assignment.
  3. The ChunkFetchingRequest will not ask for a specific chunk index. Validators will respond with the chunk they should be holding according to the assignment function. This enables availability-recovery in scenarios where we don't know the mapping (for example for collators or disputes happening on unimported forks).
  4. Disputes protocol will not always be able to perform systematic recovery. Since not all disputes have easy access to the core_index occupied by a candidate, this would be very complicated to implement. We'll consider this acceptable, as it's a very rare circumstance. Regular systematic recovery will work as before.

Note that we'll keep asserting that enabling this feature needs to happen atomically between validators and collators.

One thing that I've not decided on is the exact assignment function implementation. I think it could be (inspired by other comments):

pub fn get_chunk_index(
  n_validators: u32,
  validator_index: ValidatorIndex,
  block_number: BlockNumber,
  core_index: CoreIndex
) -> ChunkIndex {
  let threshold = systematic_threshold(n_validators); // Roughly n_validators/3
  let core_start_pos = abs(core_index - block_number) * threshold;

  (core_start_pos + validator_index) % n_validators
}

@ordian commented Nov 28, 2023

> The ChunkFetchingRequest will not ask for a specific chunk index. Validators will respond with the chunk they should be holding according to the assignment function. This enables availability-recovery in scenarios where we don't know the mapping (for example for collators or disputes happening on unimported forks).

Do you mean introducing v3 version of the req-resp protocol?
And how would that work for availability when we need to get our chunk from backers?

> As a result, the implementation of this RFC will no longer be a breaking change for collators.

If the answer to the previous question is yes, I don't see how this is not a breaking change. If no, the receiving side is validating the chunk index matches the response, so I don't see how it's not a breaking change.

I think it would be reasonable to introduce req-resp v3 and later after some time enable this assignment via the runtime feature flag.

@alindima (Contributor, Author) commented:

> Do you mean introducing v3 version of the req-resp protocol?

No, I didn't want to add a new version to the protocol. The idea was that we assume all validators have upgraded to have the same validator->index mapping and we also assume that two thirds of them are honest and responded with the right chunk they should be holding according to the mapping (but we don't check that they responded with the right chunk, because we may not know it beforehand).

> And how would that work for availability when we need to get our chunk from backers?

That's a good point. Same as before. The validator would issue the request, and in this specific case also check the index.

The idea is that validators will usually check the indices they get (during availability-distribution or systematic recovery during approval checking), but the regular recovery will also work when the entity (collator or validator participating in a dispute) does not know/check the indices against the canonical mapping.

> If no, the receiving side is validating the chunk index matches the response, so I don't see how it's not a breaking change.

you're right, it's a breaking change. I guess what I meant is that, after the upgrade, regular recovery will still work even if the collator does not have access to the canonical mapping.

I'll reword my comment.

> I think it would be reasonable to introduce req-resp v3 and later after some time enable this assignment via the runtime feature flag.

Experimenting a bit more with what I proposed above, I realised that we either bump the protocol or we do some other arguably hacky way of discovering the chunk index on the receiving side (iterating through keys after reconstructing the partial trie from the proof and checking which value matches the chunk hash).

Considering that you called out the fact that even not bumping the protocol version would still be a breaking change to collators, I think it's best to bump the protocol version and add the chunk_index to the response.
This new protocol version would also have the semantic meaning that the requester MAY not always check the chunk index against the canonical mapping
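A rough sketch of what such a v2 response could carry (type and field names are assumed here, not the final protocol definition):

```rust
// Placeholder types for the sketch; in Polkadot these are the real primitives.
type ChunkIndex = u32;
type Proof = Vec<Vec<u8>>;

/// Assumed sketch of a v2 chunk-fetching response: the chunk is returned together
/// with the chunk index it corresponds to, so a requester that does not know the
/// canonical mapping can still verify the chunk and feed it to reconstruction.
enum ChunkFetchingResponseV2 {
    Chunk {
        index: ChunkIndex,
        chunk: Vec<u8>,
        proof: Proof,
    },
    NoSuchChunk,
}
```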

@alindima alindima marked this pull request as ready for review November 28, 2023 15:08
@eskimor commented Nov 30, 2023

> At the same time, core_index cannot be chosen by a parachain to influence the chunk assignment.

This is actually not true with agile core time: A parachain bids specifically on one particular core.

But it also should not matter much: We are rotating the mapping every block. On top of that, there is only so much a parachain can do here anyway.

> The ChunkFetchingRequest will not ask for a specific chunk index. Validators will respond with the chunk they should be holding according to the assignment function. This enables availability-recovery in scenarios where we don't know the mapping (for example for collators or disputes happening on unimported forks).

This is not what we discussed. I suggested to keep requesting via validator index. Right now we also store in the av-store per validator index - if we don't change that, then a node can provide a chunk as requested by validator index, even if it does not know the mapping itself*). Hence everything would keep working with neither the requester nor the responder still knowing the mapping.

*) Assuming the chunk data itself gets stored together with the chunk index.

Once we have the core index in the receipt, this would strictly speaking no longer be necessary, but I also don't see much harm and if we really wanted to, we could change the network protocol again once we have this. Given that changing the candidate receipt is already a pita ... would not be that much of an additional annoyance.

@eskimor left a comment:

Looks good! Ideally we could at least eventually make this self-contained, but block number or slot number of the including block would be a deal breaker. Either we figure out something better or we have to conclude that fully self contained mapping is not possible even with an updated candidate receipt.

I mean it is possible - worst thing that can happen is that load distribution is not optimal and that it can be influenced by carefully picking relay parents. 🤔

pub fn get_chunk_index(
n_validators: u32,
validator_index: ValidatorIndex,
block_number: BlockNumber,

If we used block hash (relay parent), then by putting the core index into the candidate receipt we would have everything needed for the lookup to be self contained.


Ok, using the relay parent of the candidate is at the very least not ideal, because with async backing the used relay parent could vary from candidate to candidate in the same block, which means that the equal load distribution might end up not being that equal after all.

@ordian:

In theory, we could avoid using block hash or number entirely and just use the core index. Assuming all cores are occupied, it will just mean fixed mapping within a session from validators to bulk parachains chunks (on-demand would still rotate their mapping to cores I assume). That way it might be easier to screw (intentionally or not) the systematic recovery for a particular parachain for an entire session. OTOH, we need to handle one (or two) missing systematic recovery chunk in practice and fall back to using backers to some extent. Maybe worth mentioning the fallback bit in the RFC.

@alindima (Contributor, Author):

> In theory, we could avoid using block hash or number entirely and just use the core index. Assuming all cores are occupied, it will just mean fixed mapping within a session from validators to bulk parachains chunks (on-demand would still rotate their mapping to cores I assume). That way it might be easier to screw (intentionally or not) the systematic recovery for a particular parachain for an entire session.

Yeah, as you said, I think it would be too easy to screw up. It could result in a not-so-even load distribution, because sessions can be quite long and some validators would be too unlucky to be assigned to a high-throughput para for several hours. We also don't know which validators are lazier or ill-intended and switching only once per session would make this visible for some parachains more than the others.

> OTOH, we need to handle one (or two) missing systematic recovery chunk in practice and fall back to using backers to some extent. Maybe worth mentioning the fallback bit in the RFC.

Yes, I added this bit to the RFC. I suggest we request at most one chunk from each validator in the backing group as a fallback.


The question is whether this matters? Yes indeed, validators could then consistently mess with the systemic recovery of one particular parachain, instead of messing with systemic recovery of multiple parachains.... so what?

  1. It should not make that much of a difference: They are slowing down other validators. This is also in availability recovery, not distribution: Therefore the only effect that such an attack can have is that finality is a bit slower or that validators get overloaded and fail to back things for example. But both are independent of a parachain - it does not matter which parachain causes this.
  2. Backers are still rotating. So if some validators refuse to provide systemic chunks, we can still fetch them from the backers.

Now the only real argument for rotation every block in my opinion is, again, to even out load. In this particular case it would make a difference if some paras always fully fill their blocks, while others keep them mostly empty. But I would argue that we should solve this problem by incentivizing full block utilization and not worry about this here too much, at least not until it manifests in a real problem. In fact, we also have another way of solving this if it ever proves beneficial: We could rotate para id to core id assignments instead.

TL;DR: I like @ordian 's idea to just not rotate. There are downsides to this, but having the recovery be self contained is quite valuable. Let's start simple and go only more complex if it proves necessary?

@burdges am I missing something?

@alindima (Contributor, Author):

> It's unclear what you mean here. We'll have the candidate receipt when fetching data, no?

Yes, and we'd like availability-recovery to be possible when having nothing more than the candidate receipt. The problem is that the receipt does not contain any info about the slot or block number.
For the moment, the core index isn't there either, but the plan is to add it.

@burdges commented Dec 20, 2023:

I see, thank you. We could only use some historical slot, not the slot where the candidate receipt gets backed.

All these options give some control of course, even the slot where backed, so relay parent hash works better than any slot choice, due to simplicity.

As I understand it @eskimor suggests our map depend upon only num_cores, num_validators, and core index, so each core has its systemic validators fixed for the session. It avoids bias except through selecting your core. I'd wager it worsens user experience, but only slightly.

We do however rotate backing groups and backers provide systemic chunks too, hence the "slightly" above. How do we determine the backers from the candidate receipt? Just because they signed the candidate receipt?

It's likely fine either way. We'll anyways have systemic reconstruction if each systemic chunk can be fetched from some backers or its one availability provider.

Actually who determines core index? We had ideas where this occurs after the candidate receipt. We'd enable these if the map depends upon only num_cores, num_validators, relay parent, and paraid.

@alindima (Contributor, Author):

> All these options give some control of course, even the slot where backed, so relay parent hash works better than any slot choice, due to simplicity.

As @eskimor said, "using the relay parent of the candidate is at the very least not ideal because with async backing, the used relay parent could vary from candidate to candidate in the same block, which means that the equal load distribution might end up not being that equal after all."

> As I understand it @eskimor suggests our map depend upon only num_cores, num_validators, and core index, so each core has its systemic validators fixed for the session. It avoids bias except through selecting your core. I'd wager it worsens user experience, but only slightly.

yes, that's the suggestion AFAIU.

> How do we determine the backers from the candidate receipt? Just because they signed the candidate receipt?

No, we only use the backers currently during approval-voting, when we have access to the relay block and we can query the session's validator groups from the runtime. For complexity reasons, we don't even verify that validators in the group all backed the candidate. We just assume that to be true (in practice, it's mostly true).

> Actually who determines core index? We had ideas where this occurs after the candidate receipt. We'd enable these if the map depends upon only num_cores, num_validators, relay parent, and paraid.

Currently, for the existing lease parachains, there's a fixed assignment between the paraid and the core index. @eskimor you know more here


Let's just go with only the core index, from the discussion I don't see any real problems with that and once we have the core index in the candidate receipt we are golden from the simplicity/robustness perspective.

@alindima (Contributor, Author):

sounds good, I'll update the document
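For illustration, a mapping that depends only on the core index could look roughly like the sketch below (names assumed; not necessarily the exact formula that ended up in the RFC):

```rust
type ValidatorIndex = u32;
type ChunkIndex = u32;
type CoreIndex = u32;

/// Sketch: shift each core's window of systematic-chunk holders by `threshold`
/// positions, so load is spread across validators, with no dependence on block
/// number, slot, or relay parent.
fn get_chunk_index(
    n_validators: u32,
    validator_index: ValidatorIndex,
    core_index: CoreIndex,
) -> ChunkIndex {
    let threshold = (n_validators - 1) / 3; // roughly the systematic threshold
    let core_start_pos = (core_index * threshold) % n_validators;
    (core_start_pos + validator_index) % n_validators
}
```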

this will generally not be true.

The requester will send the request to validator with index `V`. The responder will map the `V` validator index to the
`C` chunk index and respond with the `C`-th chunk.

The responder does not even need to do that, if we keep storing per validator index in the av store.

@alindima (Contributor, Author):

I think it does. It needs to supply the chunk index to the requester (for verification purposes and because the reconstruction algorithm seems to need it)


I could have sworn that I wrote somewhere that we would need to store the chunk index with the chunk in the av-store for this, of course.

@alindima (Contributor, Author):

right. yeah, that makes sense now 👍🏻

@alindima (Contributor, Author):

I thought about this a bit more.

The backing subsystem will issue a request to av-store to store all chunks. For that, we need to know the core index in order to store the ValidatorIndex -> Chunk as you suggest (so that we can compute the mapping).

Since we don't yet have the core index in the receipt, the backing subsystem needs to know the per-relay parent core_index assignment of the local validator.
From my knowledge, this would be just fine. When doing attestation, the availability_cores runtime API already gets the core_index for us (but it isn't stored yet).

The slight caveat is that, when importing a statement, we may also have to call the availability_cores runtime API to see which core our para has been scheduled on. but it's no big deal, we need to have access to the relay parent anyway when doing candidate validation.

@eskimor please weigh in on my analysis. until elastic scaling, a para id couldn't be scheduled on multiple cores and the core assignment could only change on a relay-block boundary. And when we'll get elastic scaling, we'll have the core_index in the receipt anyway. so all good.
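A minimal sketch of what storing the chunk index alongside the chunk in the av-store could look like (the record layout is assumed here, not the actual av-store schema):

```rust
/// Assumed sketch of an av-store record keyed by (candidate_hash, validator_index):
/// the chunk index is persisted together with the data, so a response can include
/// it even if the responder no longer knows the mapping.
struct StoredChunk {
    /// Index of the chunk in the erasure coding, per the canonical mapping.
    chunk_index: u32,
    /// The erasure-coded chunk itself.
    data: Vec<u8>,
    /// Merkle proof of the chunk against the erasure root.
    proof: Vec<Vec<u8>>,
}
```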


On the other hand, collators will not be required to upgrade, as regular chunk recovery will work as before, granted
that version 1 of the networking protocol has been removed. However, they are encouraged to upgrade in order to take
advantage of the faster systematic recovery.

I think collators have to upgrade as well.

@alindima (Contributor, Author):

Why?

I think they just need to upgrade to the v2 networking protocol. Once that's done, they don't really need to upgrade to use systematic recovery. They'll request regular chunks as before.
That's why I added this bit only in step 2 (which enables systematic recovery). Upgrade for step 1 will still be needed.

@ordian:

Slightly unrelated, but collators don't use recovery in a happy path. It can be needed in case there are malicious collators withholding the data. So it doesn't need to be optimized.


> think they just need to upgrade to the v2 networking protocol.

Ok, to me that sounded like an upgrade. In other words: If you already have the requirement that they upgrade their network protocol, the rest does not sound like an issue anymore.

@burdges commented Dec 2, 2023

> At the same time, core_index cannot be chosen by a parachain to influence the chunk assignment.
>
> This is actually not true with agile core time: A parachain bids specifically on one particular core.
>
> But it also should not matter much: We are rotating the mapping every block. On top of that, there is only so much a parachain can do here anyway.

It's likely fine. We want the systemic chunks spread evenly enough to minimize other protocol manipulation. It's possible some parachains choose their cores to make friendly validators more profitable, which overworks some other validators and causes reconstructions. It's probably fine though.

@alindima (Contributor, Author) commented Dec 4, 2023

> This is actually not true with agile core time: A parachain bids specifically on one particular core.
>
> But it also should not matter much: We are rotating the mapping every block. On top of that, there is only so much a parachain can do here anyway.

Noted 👍🏻

> This is not what we discussed. I suggested to keep requesting via validator index. Right now we also store in the av-store per validator index - if we don't change that, then a node can provide a chunk as requested by validator index, even if it does not know the mapping itself*). Hence everything would keep working with neither the requester nor the responder still knowing the mapping.

Yes, that's what I meant - requesting by validator index just as before. About storing in the DB by validator index - that would still require that one of the actors (requester/responder) needs to know the chunk index because the reconstruction algorithm needs the indices. As discussed, the requester may not know this, so that's why I suggested we return the ChunkIndex in the ChunkFetchingResponse. Whether we store in the DB the chunk index or the validator index I don't think is very relevant, but using the chunk index would be easier to use IMO

@ordian left a comment:

Left small remarks, looking good otherwise


#### Step 1: Enabling new network protocol
In the beginning, both `/polkadot/req_chunk/1` and `/polkadot/req_chunk/2` will be supported, until all validators and
collators have upgraded to use the new version. V1 will be considered deprecated.

just a heads up that parachains may take a very long time to upgrade their collators' polkadot-sdk branch, many of them are still based on 0.9.x versions


Comment on lines +155 to +156
/// The index of the chunk to fetch.
pub index: ValidatorIndex,

this will stay the same IIUC in v2, but the meaning (and the doc comment) will be different

@alindima (Contributor, Author):

this will stay the same, indeed. The meaning will change, in the sense that we cannot expect any more that the ValidatorIndex will be equal to the returned ChunkIndex. I hope I describe this in sufficient detail in the following paragraph

}
}

Decode::decode(&mut &systematic_bytes[..]).unwrap()

Does this remove the end zero padding if odd length?

Anyways we should be careful about the boundary here, like maybe this should live in the erasure coding crate.

@alindima (Contributor, Author):

yes, scale decoding ignores the trailing zeros

@alindima (Contributor, Author):

In order to know the exact number of zeroed bytes added as padding, we need to know the size of the input data that was encoded.
Unfortunately, we don't have easy access to that in polkadot, unless we add it to the CandidateReceipt.

But SCALE-decoding it works.

@alindima (Contributor, Author):

I ran some roundtrip quickcheck tests on regular chunk reconstruction. The regular reed-solomon code can have extra zeroed padding when reconstructing. So truncation to the expected size was already needed.
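A small sketch of the padding behaviour being discussed, assuming the reconstructed bytes are the SCALE encoding of the data followed by zero padding (using parity-scale-codec):

```rust
use parity_scale_codec::{Decode, Encode};

fn main() {
    let available_data: Vec<u8> = vec![1, 2, 3, 4, 5];

    // Simulate reconstruction output: the SCALE encoding plus trailing zero
    // padding left over from the erasure coding.
    let mut reconstructed = available_data.encode();
    reconstructed.extend_from_slice(&[0u8; 7]);

    // Decoding reads only as many bytes as the encoding declares, so the
    // trailing zeros are simply left unread.
    let decoded = Vec::<u8>::decode(&mut &reconstructed[..]).unwrap();
    assert_eq!(decoded, available_data);
}
```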

let mut threshold = (n_validators - 1) / 3;
if !is_power_of_two(threshold) {
threshold = next_lower_power_of_2(threshold);
}

Yes, next lower power of two is correct here.

We encode 2^{x+1} chunks where 2^x < n_validators <= 2^{x+1} of course. We need threshold < (n_validators-epsilon)/3 though, so assuming epsilon=1 then we reconstruct from 2^{x-1} chunks where 2^{x-1} <= (n_validators-1) / 3 < 2^x.

If n_validators = 2^{x+1}, so no wasted encoding, then we reconstruct from n_validators / 4 chunks, and our reconstruction threshold winds up (n_validators - 4) / 12 smaller than necessary. If OTOH (n_validators-1)/3 = 2^{x-1}, so optimal reconstruction threshold, then n_validators = 3 * 2^{x-1} + 1 unnecessarily encodes 2^{x-1} - 1 extra chunks.
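A small numeric illustration of the rounding described in the snippet above, with a `next_lower_power_of_2` helper assumed to behave as its name suggests:

```rust
fn next_lower_power_of_2(n: u32) -> u32 {
    // Largest power of two less than or equal to n (n > 0 assumed).
    1 << (31 - n.leading_zeros())
}

fn systematic_threshold(n_validators: u32) -> u32 {
    let mut threshold = (n_validators - 1) / 3;
    if !threshold.is_power_of_two() {
        threshold = next_lower_power_of_2(threshold);
    }
    threshold
}

fn main() {
    // e.g. 1000 validators: (1000 - 1) / 3 = 333, rounded down to 256 systematic chunks.
    assert_eq!(systematic_threshold(1000), 256);
    // 300 validators: (300 - 1) / 3 = 99, rounded down to 64.
    assert_eq!(systematic_threshold(300), 64);
}
```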

@alindima (Contributor, Author) commented Jan 8, 2024

I think that the RFC is in a form that can be proposed and voted on. Please give another review and approve it if you think so.

alindima added a commit to paritytech/polkadot-sdk that referenced this pull request Jan 10, 2024
Previously, it was only possible to retry the same request on a
different protocol name that had the exact same binary payloads.

Introduce a way of trying a different request on a different protocol if
the first one fails with Unsupported protocol.

This helps with adding new req-response versions in polkadot while
preserving compatibility with unupgraded nodes.

The way req-response protocols were bumped previously was that they were
bundled with some other notifications protocol upgrade, like for async
backing (but that is more complicated, especially if the feature does
not require any changes to a notifications protocol). Will be needed for
implementing polkadot-fellows/RFCs#47

TODO:
- [x]  add tests
- [x] add guidance docs in polkadot about req-response protocol
versioning
@eskimor commented Jan 18, 2024

/rfc propose

@paritytech-rfc-bot (Contributor) commented:

Hey @eskimor, here is a link you can use to create the referendum aiming to approve this RFC number 0047.

Instructions
  1. Open the link.
  2. Switch to the Submission tab.
  3. Adjust the transaction if needed (for example, the proposal Origin).
  4. Submit the Transaction.


It is based on commit hash 4ae75296bfdeb1b2ca1e9d20f78c1a783475de47.

The proposed remark text is: RFC_APPROVE(0047,c36b1b369cadec4eaa46673d980076381d969ed22844fa871aec964ad302af59).

@alindima (Contributor, Author) commented:

/rfc process

@paritytech-rfc-bot (Contributor) commented:

Please provide a block hash where the referendum confirmation event is to be found.
For example:

/rfc process 0x39fbc57d047c71f553aa42824599a7686aea5c9aab4111f6b836d35d3d058162
Instructions to find the block hash

Here is one way to find the corresponding block hash.
  1. Open the referendum on Subsquare.
  2. Switch to the Timeline tab.
  3. Go to the details of the Confirmed event.
  4. Go to the details of the block containing that event.
  5. Here you can find the block hash.

@alindima (Contributor, Author) commented:

/rfc process 0xe750513eb583410686c094e8e5f00c0b18dc33e0296c1c8572c2e83f021c03a5

@paritytech-rfc-bot (Contributor) commented:

The on-chain referendum has approved the RFC.

@paritytech-rfc-bot paritytech-rfc-bot bot merged commit efc6b7a into polkadot-fellows:main Jan 25, 2024
bgallois pushed a commit to duniter/duniter-polkadot-sdk that referenced this pull request Mar 25, 2024