Approve multiple candidates with a single signature #1191
Merged
alexggh
merged 124 commits into
master
from
alexaggh/feature/approve_multiple_candidates_polkadot_sdk
Dec 13, 2023
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
The pr migrates: - paritytech/polkadot#7554 Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
alexggh
force-pushed
the
alexaggh/feature/approve_multiple_candidates_polkadot_sdk
branch
from
August 27, 2023 11:20
342308e
to
619fff2
Compare
…tiple_candidates_polkadot_sdk
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
alexggh
force-pushed
the
alexaggh/feature/approve_multiple_candidates_polkadot_sdk
branch
from
August 28, 2023 09:22
1cb26cd
to
7bc13d3
Compare
…reim/the_v2_assignments
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
…reim/the_v2_assignments Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
…reim/the_v2_assignments Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
This reverts commit 5e004e1.
…o feature/approve_multiple_candidates_polkadot_sdk_v2
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Closed
…tiple_candidates_polkadot_sdk_v3
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Performed the final sanity checks on versi before merging, tested the following scenarios:
Checked:
V2 was not put into the list of fallbacks for the validation protocol, so the test wrongly fell back to v1. Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
alexggh
deleted the
alexaggh/feature/approve_multiple_candidates_polkadot_sdk
branch
December 13, 2023 06:43
This pull request has been mentioned on Polkadot Forum. There might be relevant details there.
github-merge-queue bot
pushed a commit
that referenced
this pull request
Feb 5, 2024
## Summary

Built on top of the tooling and ideas introduced in #2528, this PR introduces a synthetic benchmark for measuring and assessing the performance characteristics of the approval-voting and approval-distribution subsystems.

Currently this allows us to simulate the behaviours of these systems based on the following dimensions:

```
TestConfiguration:
# Test 1
- objective: !ApprovalsTest
    last_considered_tranche: 89
    min_coalesce: 1
    max_coalesce: 6
    enable_assignments_v2: true
    send_till_tranche: 60
    stop_when_approved: false
    coalesce_tranche_diff: 12
    workdir_prefix: "/tmp"
    num_no_shows_per_candidate: 0
    approval_distribution_expected_tof: 6.0
    approval_distribution_cpu_ms: 3.0
    approval_voting_cpu_ms: 4.30
  n_validators: 500
  n_cores: 100
  n_included_candidates: 100
  min_pov_size: 1120
  max_pov_size: 5120
  peer_bandwidth: 524288000000
  bandwidth: 524288000000
  latency:
    min_latency:
      secs: 0
      nanos: 1000000
    max_latency:
      secs: 0
      nanos: 100000000
  error: 0
  num_blocks: 10
```

## The approach

1. We build a real overseer with the real implementations of the approval-voting and approval-distribution subsystems.
2. For a given network size, we pre-compute for each validator all the potential assignments and approvals it would send. Because this is a computation-heavy operation, the result is cached in a file on disk and re-used if the generation parameters don't change.
3. The messages are sent according to the configured parameters, which are split into 3 main benchmarking scenarios.

## Benchmarking scenarios

### Best case scenario *approvals_throughput_best_case.yaml*

It sends to approval-distribution only the minimum tranches required to gather the needed_approvals, so that a candidate is approved.

### Behaviour in the presence of no-shows *approvals_no_shows.yaml*

It sends the tranches needed to approve a candidate when we have a maximum of *num_no_shows_per_candidate* tranches with no-shows for each candidate.

### Maximum throughput *approvals_throughput.yaml*

It sends all the tranches for each block and measures the CPU used and the network bandwidth needed by the approval-voting and approval-distribution subsystems.

## How to run it

```
cargo run -p polkadot-subsystem-bench --release -- test-sequence --path polkadot/node/subsystem-bench/examples/approvals_throughput.yaml
```

## Evaluating performance

### Use the real subsystems metrics

If you follow the steps in https://github.com/paritytech/polkadot-sdk/tree/master/polkadot/node/subsystem-bench#install-grafana for installing prometheus and grafana locally, all real metrics for `approval-distribution`, `approval-voting` and the overseer are available. E.g.:

<img width="2149" alt="Screenshot 2023-12-05 at 11 07 46" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/cb8ae2dd-178b-4922-bfa4-dc37e572ed38">
<img width="2551" alt="Screenshot 2023-12-05 at 11 09 42" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/8b4542ba-88b9-46f9-9b70-cc345366081b">
<img width="2154" alt="Screenshot 2023-12-05 at 11 10 15" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/b8874d8d-632e-443a-9840-14ad8e90c54f">
<img width="2535" alt="Screenshot 2023-12-05 at 11 10 52" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/779a439f-fd18-4985-bb80-85d5afad78e2">

### Profile with pyroscope

1. Set up pyroscope following the steps in https://github.com/paritytech/polkadot-sdk/tree/master/polkadot/node/subsystem-bench#install-pyroscope, then run any of the benchmark scenarios with `--profile` as an argument.
2. Open the pyroscope dashboard in grafana, e.g.:

<img width="2544" alt="Screenshot 2024-01-09 at 17 09 58" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/58f50c99-a910-4d20-951a-8b16639303d9">

### Useful logs

1. Network bandwidth requirements:

```
Payload bytes received from peers: 503993 KiB total, 50399 KiB/block
Payload bytes sent to peers: 629971 KiB total, 62997 KiB/block
```

2. CPU usage by the approval-distribution/approval-voting subsystems:

```
approval-distribution CPU usage 84.061s
approval-distribution CPU usage per block 8.406s
approval-voting CPU usage 96.532s
approval-voting CPU usage per block 9.653s
```

3. Time passed until a given block is approved:

```
Chain selection approved after 3500 ms hash=0x0101010101010101010101010101010101010101010101010101010101010101
Chain selection approved after 4500 ms hash=0x0202020202020202020202020202020202020202020202020202020202020202
```

### Using benchmark to quantify improvements from #1178 + #1191

Using a versi node, we compare the scenario where all new optimisations are disabled with a scenario where tranche0 assignments are sent in a single message and a conservative simulation where the coalescing of approvals gives us just a 50% reduction in the number of messages we send.

Overall, what we see is a speedup of around 30-40% in the time it takes to process the necessary messages and a 30-40% reduction in the necessary bandwidth.

#### Best case scenario comparison (minimum required tranches sent)

Unoptimised

```
Number of blocks: 10
Payload bytes received from peers: 53289 KiB total, 5328 KiB/block
Payload bytes sent to peers: 52489 KiB total, 5248 KiB/block
approval-distribution CPU usage 6.732s
approval-distribution CPU usage per block 0.673s
approval-voting CPU usage 9.523s
approval-voting CPU usage per block 0.952s
```

vs optimisation enabled

```
Number of blocks: 10
Payload bytes received from peers: 32141 KiB total, 3214 KiB/block
Payload bytes sent to peers: 37314 KiB total, 3731 KiB/block
approval-distribution CPU usage 4.658s
approval-distribution CPU usage per block 0.466s
approval-voting CPU usage 6.236s
approval-voting CPU usage per block 0.624s
```

#### Worst case: all tranches sent (very unlikely, happens when sharding breaks)

Unoptimised

```
Number of blocks: 10
Payload bytes received from peers: 746393 KiB total, 74639 KiB/block
Payload bytes sent to peers: 729151 KiB total, 72915 KiB/block
approval-distribution CPU usage 118.681s
approval-distribution CPU usage per block 11.868s
approval-voting CPU usage 124.118s
approval-voting CPU usage per block 12.412s
```

vs optimised

```
Number of blocks: 10
Payload bytes received from peers: 503993 KiB total, 50399 KiB/block
Payload bytes sent to peers: 629971 KiB total, 62997 KiB/block
approval-distribution CPU usage 84.061s
approval-distribution CPU usage per block 8.406s
approval-voting CPU usage 96.532s
approval-voting CPU usage per block 9.653s
```

## TODOs

- [x] Polish implementation.
- [x] Use what we have so far to evaluate #1191 before merging.
- [x] List of features and additional dimensions we want to use for benchmarking.
- [x] Run benchmark on hardware similar to versi and kusama nodes.
- [ ] Add benchmark to be run in CI for catching regressions in performance.
- [ ] Rebase on latest changes for network emulation.

---------

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Co-authored-by: Andrei Sandu <andrei-mihail@parity.io>
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
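The unoptimised and optimised figures quoted in the commit message above can be turned into per-metric reductions with a few lines of arithmetic. The following is only a sketch: the helper names `reduction_percent` and `report` are hypothetical, and the constants are the totals copied from the logs in that message.

```rust
/// Relative reduction of `optimised` versus `baseline`, in percent.
fn reduction_percent(baseline: f64, optimised: f64) -> f64 {
    (baseline - optimised) / baseline * 100.0
}

/// (metric, unoptimised, optimised) totals for the best-case scenario.
const BEST_CASE: [(&str, f64, f64); 4] = [
    ("payload received (KiB)", 53289.0, 32141.0),
    ("payload sent (KiB)", 52489.0, 37314.0),
    ("approval-distribution CPU (s)", 6.732, 4.658),
    ("approval-voting CPU (s)", 9.523, 6.236),
];

/// (metric, unoptimised, optimised) totals for the worst-case scenario.
const WORST_CASE: [(&str, f64, f64); 4] = [
    ("payload received (KiB)", 746393.0, 503993.0),
    ("payload sent (KiB)", 729151.0, 629971.0),
    ("approval-distribution CPU (s)", 118.681, 84.061),
    ("approval-voting CPU (s)", 124.118, 96.532),
];

/// Format one line per metric with the reduction rounded to one decimal.
fn report(scenario: &str, rows: &[(&str, f64, f64)]) -> Vec<String> {
    rows.iter()
        .map(|(metric, baseline, optimised)| {
            format!(
                "{}: {} reduced by {:.1}%",
                scenario,
                metric,
                reduction_percent(*baseline, *optimised)
            )
        })
        .collect()
}
```

Calling `report("best case", &BEST_CASE)` and `report("worst case", &WORST_CASE)` gives one line per metric. On the quoted figures the best-case reductions land roughly between 29% and 40%, while some worst-case metrics improve by less than that.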
alexggh
added a commit
to alexggh/runtimes
that referenced
this pull request
Feb 28, 2024
... to add the approval_voting_params API, which will allow us to enable the approvals coalescing implementation from: - paritytech/polkadot-sdk#1191 Note! Bumping the version will not enable the new logic; that will be enabled at a later date, when we decide to call set_approval_voting_params with max_approval_coalesce_count greater than 1. Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
fellowship-merge-bot bot
pushed a commit
to polkadot-fellows/runtimes
that referenced
this pull request
Feb 29, 2024
... to add the approval_voting_params API, which will allow us to enable the approvals coalescing implementation from: - paritytech/polkadot-sdk#1191 Note! Bumping the version will not enable the new logic; that will be enabled at a later date, when we decide to call set_approval_voting_params with max_approval_coalesce_count greater than 1. --------- Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
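The gating described in this commit message can be pictured as follows. This is a simplified sketch, not the runtime's actual definition: the struct only models the one field named in the message, and the default of 1 and the `coalescing_enabled` helper are assumptions made for illustration.

```rust
/// Hypothetical, simplified mirror of the parameters set through
/// `set_approval_voting_params`; only the field named in the commit
/// message is modelled here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ApprovalVotingParams {
    max_approval_coalesce_count: u32,
}

impl Default for ApprovalVotingParams {
    // Assumption: a count of 1 means one candidate per signature,
    // which corresponds to the behaviour before coalescing.
    fn default() -> Self {
        Self { max_approval_coalesce_count: 1 }
    }
}

impl ApprovalVotingParams {
    /// Coalescing only takes effect once the count is greater than 1.
    fn coalescing_enabled(&self) -> bool {
        self.max_approval_coalesce_count > 1
    }
}
```

Under these assumptions, shipping the API leaves behaviour unchanged until the parameter is explicitly raised above 1.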
serban300
pushed a commit
to serban300/polkadot-sdk
that referenced
this pull request
Apr 8, 2024
* transactions mortality in message and complex relays * logging + enable in test deployments * spellcheck * fmt
bkchr
pushed a commit
that referenced
this pull request
Apr 10, 2024
* transactions mortality in message and complex relays * logging + enable in test deployments * spellcheck * fmt
Labels
R0-silent
Changes should not be mentioned in any release notes
T8-polkadot
This PR/Issue is related to/affects the Polkadot network.
This PR migrates paritytech/polkadot#7554; preliminary measurements and tests are discussed there.
Initial implementation of the plan discussed in #701, built on top of #1178.
Overall idea
When approval-voting has checked a candidate and is ready to advertise the approval, it defers the approval in a per-relay-chain-block queue until either MAX_APPROVAL_COALESCE_COUNT candidates are ready to be signed or a candidate has stayed in the queue for MAX_APPROVALS_COALESCE_TICKS. In both cases we sign all the candidates available at that point with a single signature.
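The deferral rule above can be sketched as a small queue keyed by relay-chain block. This is a minimal illustration and not the code in this PR: `DeferredApprovals`, `CoalesceConfig`, `defer`, `take_ready` and the type aliases are hypothetical names, ticks are modelled as plain integers, and signing itself is left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-ins for the node's tick, block hash and candidate index types.
type Tick = u64;
type BlockHash = [u8; 32];
type CandidateIndex = u32;

/// Illustrative limits corresponding to MAX_APPROVAL_COALESCE_COUNT
/// and MAX_APPROVALS_COALESCE_TICKS.
#[derive(Clone, Copy)]
struct CoalesceConfig {
    max_approval_coalesce_count: usize,
    max_approvals_coalesce_ticks: Tick,
}

/// Candidates that were checked but not yet signed, grouped per relay-chain block.
#[derive(Default)]
struct DeferredApprovals {
    // For each block: the waiting candidates and the tick at which each was queued.
    pending: BTreeMap<BlockHash, Vec<(CandidateIndex, Tick)>>,
}

impl DeferredApprovals {
    /// Queue a checked candidate instead of signing it immediately.
    fn defer(&mut self, block: BlockHash, candidate: CandidateIndex, now: Tick) {
        self.pending.entry(block).or_default().push((candidate, now));
    }

    /// If either trigger fired for `block`, drain its queue and return the
    /// candidates that one signature should cover; otherwise return `None`.
    fn take_ready(
        &mut self,
        block: &BlockHash,
        now: Tick,
        cfg: CoalesceConfig,
    ) -> Option<Vec<CandidateIndex>> {
        let queue = self.pending.get(block)?;
        let oldest = queue.iter().map(|(_, queued_at)| *queued_at).min()?;
        // Trigger 1: enough candidates are waiting.
        let full = queue.len() >= cfg.max_approval_coalesce_count;
        // Trigger 2: the oldest candidate has waited long enough.
        let expired = now.saturating_sub(oldest) >= cfg.max_approvals_coalesce_ticks;
        if !(full || expired) {
            return None;
        }
        let queue = self.pending.remove(block)?;
        Some(queue.into_iter().map(|(candidate, _)| candidate).collect())
    }
}
```

In this sketch a count limit of 1 makes every deferred candidate fire immediately, so each approval is signed on its own; larger limits trade a bounded delay (the tick limit) for fewer signatures.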
This should allow us to reduce the number of approval messages we have to create, send and verify. The parameters are configurable, so we should find values that balance:
Other fixes:
TODO: