This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

(async backing) parachain-system: track limitations for unincluded blocks #2438

Merged: 26 commits, Apr 29, 2023

Conversation

slumber (Contributor) commented Apr 6, 2023

Resolves #1866

@slumber slumber requested a review from rphmeier April 6, 2023 19:23
pallets/parachain-system/src/unincluded_segment.rs (outdated, resolved)
pallets/parachain-system/src/lib.rs (outdated, resolved)
slumber (Author) commented Apr 6, 2023

Also, #1866 mentions:

> UMP messages processed

Was it supposed to be dmp?

slumber (Author) commented Apr 6, 2023

And do we want it in master?

rphmeier (Contributor) commented Apr 7, 2023

> And do we want it in master?

Only if it's fully backwards-compatible.

> Was it supposed to be dmp?

Yes, just an oversight. I've edited the original issue to fix this as well. Thanks.

slumber (Author) commented Apr 10, 2023

Do we want to account for max_ancestry_len here?

rphmeier (Contributor) commented:

> Do we want to account for max_ancestry_len here?

This comment doesn't point to any specific code, not sure what you mean by this.

slumber (Author) commented Apr 11, 2023

> Do we want to account for max_ancestry_len here?
>
> This comment doesn't point to any specific code, not sure what you mean by this.

It's a general question: we have a maximum unincluded segment length, which is mapped to max_candidate_depth.

But the async backing parameters also include allowed_ancestry_len, which we're not accounting for in this PR.

rphmeier (Contributor) commented:

> But async backing parameters also have allowed_ancestry_len, which we're not accounting for in this PR

We could account for it in the runtime, but allowed_ancestry_len is a parameter of the relay-chain block where the parachain block gets backed, whereas max_candidate_depth is a property of the relay-chain block that is set as the relay parent. We don't know ahead of time at which point a fresh parablock will get posted to the relay chain, so I'd imagined that the consensus/authoring layer that chooses relay parents would account for that instead.
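The division of labor described here can be sketched as a node-side check. This is a minimal illustration under assumed names (neither the real runtime nor the real node API): the runtime enforces the segment-depth limit, while the authoring layer picks a relay parent within `allowed_ancestry_len` of the best relay head.

```rust
/// Illustrative node-side check, not actual cumulus/polkadot code: the
/// authoring layer rejects relay parents that fall too far below the best
/// known relay-chain block to satisfy `allowed_ancestry_len`.
fn relay_parent_is_acceptable(
    best_relay_number: u32,
    relay_parent_number: u32,
    allowed_ancestry_len: u32,
) -> bool {
    // The relay parent must not be ahead of the best known relay block,
    // and must lie within the allowed ancestry window behind it.
    relay_parent_number <= best_relay_number
        && best_relay_number - relay_parent_number <= allowed_ancestry_len
}
```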

@@ -428,6 +547,7 @@ pub mod pallet {
<RelayStateProof<T>>::put(relay_chain_state);
<RelevantMessagingState<T>>::put(relevant_messaging_state.clone());
<HostConfiguration<T>>::put(host_config);
<PendingDownwardMessages<T>>::put(downward_messages.clone());
slumber (Author):

I believe there's no other way to get this value for the bandwidth limits besides putting it into storage.
A second option is to introduce a migration that saves its length along with ProcessedDownwardMessages and renames it to something like DownwardQueueSize.

Contributor:

If it's cleared in on_finalize then it never hits storage in practice, only the in-memory overlay.

slumber (Author):

It's inserted in set_validation_data, fetched in on_finalize, cleared in on_initialize
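That lifecycle can be modeled with a minimal in-memory sketch (hypothetical type, not the pallet's real storage API): the inherent inserts the value, `on_finalize` reads only its length, and the next block's `on_initialize` clears it, so the data only ever lives in the per-block overlay.

```rust
/// Toy model of the storage lifecycle described in the thread; not the
/// real `PendingDownwardMessages` storage item.
#[derive(Default)]
struct PendingDownwardMessages(Option<Vec<Vec<u8>>>);

impl PendingDownwardMessages {
    /// Clears leftover data from the previous block.
    fn on_initialize(&mut self) {
        self.0 = None;
    }
    /// The inherent (`set_validation_data`) inserts the messages.
    fn set_validation_data(&mut self, msgs: Vec<Vec<u8>>) {
        self.0 = Some(msgs);
    }
    /// `on_finalize` only needs the queue length for bandwidth accounting.
    fn on_finalize(&self) -> usize {
        self.0.as_ref().map(|m| m.len()).unwrap_or(0)
    }
}
```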

Contributor:

two things stand out to me:

  • only the length of the list is used, not the messages themselves
  • it could be cleared in on_finalize too, right?

Contributor:

it also doesn't seem well-defined to me what pending_downward_messages is in this context.

i.e. is it all pending downward messages that were present in the relay-parent state? If so, then we need to update process_inbound_downward_messages to avoid processing them in duplicate.

or is it only the pending downward messages which haven't been processed? In that case, we shouldn't have to make any changes to the processing logic.

For efficiency's sake, I'd favor the latter - we shouldn't include the same downward messages into parachain blocks multiple times.

horizontal logic probably needs the same, no?

slumber (Author):

I followed the pattern of cleaning everything in on_initialize.

As for the messages, yes, I suggested in the first comments to either put only the len, or:

> Second option is to introduce migration and save its len along with ProcessedDownwardMessages and rename it into something like DownwardQueueSize.

slumber (Author):

> i.e. is it all pending downward messages that were present in the relay-parent state? If so, then we need to update process_inbound_downward_messages to avoid processing them in duplicate.

These are all messages from the downward queue. They're not used for processing in the current code, so it's not an issue. We should probably just use the len.
But the thing is, according to the code, each candidate exhausts the dm queue:

let dm_count = downward_messages.len() as u32;
// ...
ProcessedDownwardMessages::<T>::put(dm_count);

Do we even need to track this limitation at all?

Contributor:

If we assume that ParachainInherentData only contains messages which have not been posted to any prior parachain block (which is how it should be, to avoid redundancy), I don't know that the current bandwidth-limits approach works.

The bandwidth limits for DMP/HRMP-in should be based on the total size of the incoming queues at the relay parent, whereas some (or most) of those messages may have already been processed.
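A minimal sketch of that accounting, under assumed names (not the pallet's actual code): the bandwidth available to a new block is the total queue size observed at the relay parent, minus whatever the unincluded ancestors have already consumed.

```rust
/// Illustrative bandwidth calculation, not real cumulus code: subtract the
/// amount consumed by blocks in the unincluded segment from the total
/// queue size seen in the relay-parent state.
fn remaining_dmp_bandwidth(
    queue_size_at_relay_parent: u32,
    consumed_by_unincluded_segment: u32,
) -> u32 {
    // Saturating subtraction: the segment should never legitimately
    // consume more than the queue held, but avoid underflow regardless.
    queue_size_at_relay_parent.saturating_sub(consumed_by_unincluded_segment)
}
```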

Contributor:

> according to the code each candidate exhausts the dm queue

Yes, this looks currently guaranteed. We could avoid tracking this property as a result, and reintroduce the concept if we get to the point where we implement things more piecemeal.

@slumber slumber requested a review from rphmeier April 12, 2023 22:32
@slumber slumber added labels Apr 12, 2023:

  • D9-needsaudit 👮 PR contains changes to fund-managing logic that should be properly reviewed and externally audited
  • A3-in_progress Pull request is in progress. No review needed at this stage.
  • B1-note_worthy Changes should be noted in the release notes
  • C1-low PR touches the given topic and has a low impact on builders.

// sanity-check there's place for the block at finalization phase.
//
// TODO: this potentially restricts parachains from decreasing `MaxUnincludedLen` value.
Contributor:

Yes, it does - parachains manually adjusting MaxUnincludedLen would potentially brick their chain. This is fine as long as there is a warning in the storage item doc comment.

We will need good ways for parachains to adjust this.

slumber (Author):

I basically referenced #2438 (comment); I need to link to a GitHub issue too before the merge.

rphmeier (Contributor) left a comment:

I just realized something - I don't think that we can actually access the relay-chain state proof within the on_initialize area of code. It only gets handled later, in the set_inherent_data inherent. The logic we currently have in on_initialize needs to go there, actually.

slumber (Author) commented Apr 13, 2023

> I just realized something - I don't think that we can actually access the relay-chain state proof within the on_initialize area of code. It only gets handled later, in the set_inherent_data inherent.

Data put into storage by set_validation_data (= create_inherent) is available in on_finalize and dropped in on_initialize, so why isn't the state proof accessible?

@slumber slumber force-pushed the slumber-track-blocks-bandwidth branch from ae42744 to 0e4b14a Compare April 19, 2023 21:19
@slumber slumber changed the base branch from master to slumber-async-backing-feature April 19, 2023 21:19

// sanity-check there's place for the block at finalization phase.
//
// TODO: this potentially restricts parachains from decreasing `MaxUnincludedLen` value.
Contributor:

I think the answer here, now that we've moved this to run in the inherent, is "it doesn't".

on_initialize always runs before inherents, so any code that sets MaxUnincludedLen in a pallet's on_initialize would execute before this check, making it safe.

slumber (Author):

What about the scenario where there are 4 blocks in the unincluded segment, and the len is reduced to e.g. 1?

rphmeier (Contributor) commented Apr 20, 2023:

That is fine. For example, assume that there is some unconditional code that sets the maximum length to 1 in on_initialize at a certain block height. For more dynamically controlled systems, the reasoning is the same but slightly more complicated.

Let's say we are trying to build on a parent parachain block B, which has unincluded segment length 4. As long as the unincluded segment has length > 1, we can't author a new block. But new relay-parents will "drain" the unincluded segment from the back, so eventually* we will have a (relay-parent, B) pair where the unincluded segment will be short enough to push a new block B' onto. This is the desired behavior of shrinking the maximum segment length.

*eventually assuming B and its ancestors actually get backed and included in the relay chain
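The drain-from-the-back behavior described above can be modeled as a short sketch (illustrative names only, not the pallet's real API): authoring stays blocked until enough ancestors are included on the relay chain to bring the segment under the new, smaller maximum.

```rust
/// Illustrative model of shrinking `MaxUnincludedLen`: a new block can
/// only be pushed while the segment is strictly below the maximum.
fn can_author(unincluded_segment_len: u32, max_unincluded_len: u32) -> bool {
    unincluded_segment_len < max_unincluded_len
}

/// New relay parents "drain" the segment from the back as ancestors get
/// backed and included on the relay chain.
fn drain_segment(unincluded_segment_len: u32, included: u32) -> u32 {
    unincluded_segment_len.saturating_sub(included)
}
```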

slumber (Author):

I see: failing in create_inherent is ok, while failing in on_initialize isn't. Thanks for the explanation.

rphmeier (Contributor) commented Apr 26, 2023:

It's rather that we are creating a block validity rule based on the relay-parent. It'll be impossible to create valid blocks on top of some (para parent, relay parent) pairs, and that failure will manifest in the inherent that handles the storage proof. Anything that runs in on_initialize could alter the block validity condition (e.g. altering the maximum unincluded segment length). We will probably also want hooks for people to write dynamic block validity rules based on the relay-parent in question. These dynamic rules would need the relay parent and storage proof as a parameter, so they could only run in the inherent, once those things have been supplied.

In principle, it's OK to fail in on_initialize as well, but there are footguns like changing the unincluded segment length that could lead the chain to brick.

// the inherent shouldn't contain messages that were already processed by any of the
// ancestors.
//
// This invariant should be upheld by the node-side.
Contributor:

A comment for the benefit of other reviewers: the "node-side" here could also include the ProvideInherent implementation. That is probably the best way to ensure this in practice. Just trimming some MQCs.
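A rough sketch of that trimming, as it might look inside a ProvideInherent implementation. The struct and function names here are hypothetical (the real cumulus InboundDownwardMessage does carry a `sent_at` relay-chain block number, but this sketch is self-contained): messages already covered by the unincluded ancestors are filtered out of the inherent data before it reaches the runtime.

```rust
/// Hypothetical stand-in for cumulus's downward-message type.
#[derive(Clone)]
struct DownwardMessage {
    sent_at: u32,
    payload: Vec<u8>,
}

/// Drop every message whose `sent_at` is at or below the relay-chain block
/// number already processed by the unincluded ancestors, so no message is
/// included in a parachain block twice.
fn drop_processed_messages(
    msgs: Vec<DownwardMessage>,
    processed_up_to: u32,
) -> Vec<DownwardMessage> {
    msgs.into_iter()
        .filter(|m| m.sent_at > processed_up_to)
        .collect()
}
```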

rphmeier (Contributor) left a comment:

Looks good!

@slumber slumber marked this pull request as ready for review April 21, 2023 22:52
@rphmeier rphmeier merged commit 86b5d3d into slumber-async-backing-feature Apr 29, 2023
@rphmeier rphmeier deleted the slumber-track-blocks-bandwidth branch April 29, 2023 17:19
paritytech-processbot bot pushed a commit that referenced this pull request Aug 18, 2023
* Update substrate & polkadot

* min changes to make async backing compile

* (async backing) parachain-system: track limitations for unincluded blocks (#2438)

* unincluded segment draft

* read para head from storage proof

* read_para_head -> read_included_para_head

* Provide pub interface

* add errors

* fix unincluded segment update

* BlockTracker -> Ancestor

* add a dmp limit

* Read para head depending on the storage switch

* doc comments

* storage items docs

* add a sanity check on block initialize

* Check watermark

* append to the segment on block finalize

* Move segment update into set_validation_data

* Resolve para head todo

* option watermark

* fix comment

* Drop dmq check

* fix weight

* doc-comments on inherent invariant

* Remove TODO

* add todo

* primitives tests

* pallet tests

* doc comments

* refactor unincluded segment length into a ConsensusHook (#2501)

* refactor unincluded segment length into a ConsensusHook

* add docs

* refactor bandwidth_out calculation

Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>

* test for limits from impl

* fmt

* make tests compile

* update comment

* uncomment test

* fix collator test by adding parent to state proof

* patch HRMP watermark rules for unincluded segment

* get consensus-common tests to pass, using unincluded segment

* fix unincluded segment tests

* get all tests passing

* fmt

* rustdoc CI

* aura-ext: limit the number of authored blocks per slot (#2551)

* aura_ext consensus hook

* reverse dependency

* include weight into hook

* fix tests

* remove stray println

Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>

* fix test warning

* fix doc link

---------

Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
Co-authored-by: Chris Sosnin <chris125_@live.com>

* parachain-system: ignore go ahead signal once upgrade is processed (#2594)

* handle goahead signal for unincluded segment

* doc comment

* add test

* parachain-system: drop processed messages from inherent data (#2590)

* implement `drop_processed_messages`

* drop messages based on relay parent number

* adjust tests

* drop changes to mqc

* fix comment

* drop test

* drop more dead code

* clippy

* aura-ext: check slot in consensus hook and remove all `CheckInherents` logic (#2658)

* aura-ext: check slot in consensus hook

* convert relay chain slot

* Make relay chain slot duration generic

* use fixed velocity hook for pallets with aura

* purge timestamp inherent

* fix warning

* adjust runtime tests

* fix slots in tests

* Make `xcm-emulator` test pass for new consensus hook (#2722)

* add pallets on_initialize

* tests pass

* add AuraExt on_init

* ".git/.scripts/commands/fmt/fmt.sh"

---------

Co-authored-by: command-bot <>

---------

Co-authored-by: Ignacio Palacios <ignacio.palacios.santos@gmail.com>

* update polkadot git refs

* CollationGenerationConfig closure is now optional (#2772)

* CollationGenerationConfig closure is now optional

* fix test

* propagate network-protocol-staging feature (#2899)

* Feature Flagging Consensus Hook Type Parameter (#2911)

* First pass

* fmt

* Added as default feature in tomls

* Changed to direct dependency feature

* Dealing with clippy error

* Update pallets/parachain-system/src/lib.rs

Co-authored-by: asynchronous rob <rphmeier@gmail.com>

---------

Co-authored-by: asynchronous rob <rphmeier@gmail.com>

* fmt

* bump deps and remove warning

* parachain-system: update RelevantMessagingState according to the unincluded segment (#2948)

* mostly address 2471 with a bug introduced

* adjust relevant messaging state after computing total

* fmt

* max -> min

* fix test implementation of xcmp source

* add test

* fix test message sending logic

* fix + test

* add more to unincluded segment test

* fmt

---------

Co-authored-by: Chris Sosnin <chris125_@live.com>

* Integrate new Aura / Parachain Consensus Logic in Parachain-Template / Polkadot-Parachain (#2864)

* add a comment

* refactor client/service utilities

* deprecate start_collator

* update parachain-template

* update test-service in the same way

* update polkadot-parachain crate

* fmt

* wire up new SubmitCollation message

* some runtime utilities for implementing unincluded segment runtime APIs

* allow parachains to configure their level of sybil-resistance when starting the network

* make aura-ext compile

* update to specify sybil resistance levels

* fmt

* specify relay chain slot duration in milliseconds

* update Aura to explicitly produce Send futures

also, make relay_chain_slot_duration a Duration

* add authoring duration to basic collator and document params

* integrate new basic collator into parachain-template

* remove assert_send used for testing

* basic-aura: only author when parent included

* update polkadot-parachain-bin

* fmt

* some fixes

* fixes

* add a RelayNumberMonotonicallyIncreases

* add a utility function for initializing subsystems

* some logging for timestamp adjustment

* fmt

* some fixes for lookahead collator

* add a log

* update `find_potential_parents` to account for sessions

* bound the loop

* restore & deprecate old start_collator and start_full_node functions.

* remove unnecessary await calls

* fix warning

* clippy

* more clippy

* remove unneeded logic

* ci

* update comment

Co-authored-by: Marcin S. <marcin@bytedude.com>

* (async backing) restore `CheckInherents` for backwards-compatibility (#2977)

* bring back timestamp

* Restore CheckInherents

* revert to empty CheckInherents

* make CheckInherents optional

* attempt

* properly end system blocks

* add some more comments

* ignore failing system parachain tests

* update refs after main feature branch merge

* comment out the offending tests because CI runs ignored tests

* fix warnings

* fmt

* revert to polkadot master

* cargo update -p polkadot-primitives -p sp-io

---------

Co-authored-by: asynchronous rob <rphmeier@gmail.com>
Co-authored-by: Ignacio Palacios <ignacio.palacios.santos@gmail.com>
Co-authored-by: Bradley Olson <34992650+BradleyOlson64@users.noreply.github.com>
Co-authored-by: Marcin S. <marcin@bytedude.com>
Co-authored-by: eskimor <eskimor@users.noreply.github.com>
Co-authored-by: Andronik <write@reusable.software>

Successfully merging this pull request may close these issues.

Computation of real message bandwidth available in asynchronous backing
3 participants