subsystem-bench: add regression tests for availability read and write #3311

Merged: 56 commits, Mar 1, 2024. Changes shown are from 51 commits.

Commits
c01894b  Split subsystem-bench to lib and cli tool (AndreiEres, Feb 13, 2024)
f71c502  Remove unused values from BenchCli (AndreiEres, Feb 13, 2024)
733f0eb  Move usage display to tests (AndreiEres, Feb 13, 2024)
fdc1392  Update displaying of the configuration (AndreiEres, Feb 13, 2024)
815c023  fixup: add log target (AndreiEres, Feb 13, 2024)
0a65126  Remove toml fmt (AndreiEres, Feb 13, 2024)
42963a9  fixup: remove toml fmt (AndreiEres, Feb 13, 2024)
98224fd  Move approval to lib (AndreiEres, Feb 13, 2024)
d98b899  Move availability to lib (AndreiEres, Feb 13, 2024)
5175897  Return usage (AndreiEres, Feb 14, 2024)
a721bca  Add skeleton tests (AndreiEres, Feb 14, 2024)
b27edaa  Add approval tests (AndreiEres, Feb 15, 2024)
9136955  Add regression tests (AndreiEres, Feb 15, 2024)
18a0c7e  Address clippy warnings (AndreiEres, Feb 16, 2024)
d72a035  Address clippy warnings (AndreiEres, Feb 16, 2024)
d161de1  Address clippy warnings (AndreiEres, Feb 16, 2024)
ab2e179  Update (AndreiEres, Feb 16, 2024)
1e59be1  Add draft ci run (AndreiEres, Feb 16, 2024)
b54b85d  Filter regression tests (AndreiEres, Feb 16, 2024)
0e4432a  Update nexttest config (AndreiEres, Feb 16, 2024)
3b6a3e9  Skip benches (AndreiEres, Feb 16, 2024)
2fec1c1  Skip benches (AndreiEres, Feb 16, 2024)
2657432  Rename test job (AndreiEres, Feb 16, 2024)
c037885  Fix label (AndreiEres, Feb 16, 2024)
d46e9fa  Update values (AndreiEres, Feb 16, 2024)
708e002  Update availability recovery bench values (AndreiEres, Feb 19, 2024)
d1b3631  Use average run (AndreiEres, Feb 19, 2024)
282745a  Address clippy errors (AndreiEres, Feb 19, 2024)
f6df907  Update README (AndreiEres, Feb 19, 2024)
f0aea37  Update values (AndreiEres, Feb 19, 2024)
ff7dac3  Update values (AndreiEres, Feb 19, 2024)
e1e6f2d  Fix md line length (AndreiEres, Feb 20, 2024)
68ec964  Update a wrong range (AndreiEres, Feb 20, 2024)
184eee1  Merge branch 'master' into AndreiEres/subsystem-bench-lib (AndreiEres, Feb 20, 2024)
f03f4a7  Disable linting in code block (AndreiEres, Feb 20, 2024)
4abf5d8  Update .gitlab/pipeline/test.yml (alvicsam, Feb 20, 2024)
9df54ec  Update values (AndreiEres, Feb 20, 2024)
3acfe18  Update values (AndreiEres, Feb 20, 2024)
3868962  Try warming up (AndreiEres, Feb 22, 2024)
e4f3c87  Fix format (AndreiEres, Feb 22, 2024)
f55ae88  Address clippy errors (AndreiEres, Feb 22, 2024)
222f791  Warm up availability-distribution-regression-bench (AndreiEres, Feb 22, 2024)
ddf4408  Run benches (AndreiEres, Feb 22, 2024)
4f40d1f  Update warm up (AndreiEres, Feb 22, 2024)
0e1b59e  Run benchmarks (AndreiEres, Feb 22, 2024)
31a5090  Run benchmarks (AndreiEres, Feb 22, 2024)
732b4ba  Disable approval-voting-regression-bench (AndreiEres, Feb 22, 2024)
c0e6f31  Update values (AndreiEres, Feb 22, 2024)
02d63b8  Update benches (AndreiEres, Feb 22, 2024)
341c8b7  Disable approval-voting tests (AndreiEres, Feb 23, 2024)
a03fd98  Merge branch 'master' into AndreiEres/subsystem-bench-lib (AndreiEres, Feb 23, 2024)
77f3da5  Remove approval-voting tests (AndreiEres, Feb 29, 2024)
36b3f6c  Remove regression tests from CI (AndreiEres, Feb 29, 2024)
ff91819  Add comments (AndreiEres, Feb 29, 2024)
46ec768  Merge branch 'master' into AndreiEres/subsystem-bench-lib (AndreiEres, Mar 1, 2024)
07b4662  Add comments (AndreiEres, Mar 1, 2024)
16 changes: 15 additions & 1 deletion .gitlab/pipeline/test.yml
@@ -25,6 +25,7 @@ test-linux-stable:
   # "upgrade_version_checks_should_work" is currently failing
   - |
     time cargo nextest run \
+      --filter-expr 'not deps(/polkadot-subsystem-bench/)' \
Contributor: I'd move these to a separate pipeline; that allows CI to provide the same hw specs as for validators and guarantee CPU/mem resources per POD.

Contributor: Ok, my bad, we aren't running the tests here 🙈. Let's address this properly in #3530.

Contributor: @AndreiEres could you please add a comment above the command with a brief description of why you added the filter, or a link to your PR.

       --workspace \
       --locked \
       --release \
@@ -68,7 +69,7 @@ test-linux-stable-runtime-benchmarks:
   # but still want to have debug assertions.
   RUSTFLAGS: "-Cdebug-assertions=y -Dwarnings"
   script:
-    - time cargo nextest run --workspace --features runtime-benchmarks benchmark --locked --cargo-profile testnet
+    - time cargo nextest run --filter-expr 'not deps(/polkadot-subsystem-bench/)' --workspace --features runtime-benchmarks benchmark --locked --cargo-profile testnet
Contributor: Why isn't required-features = ["subsystem-benchmarks"] enough?

Contributor (author): It's a good question, and I wish I knew the answer. Somehow nextest keeps running the regression tests, while cargo test excludes them based on the required feature.


# can be used to run all tests
# test-linux-stable-all:
@@ -492,3 +493,16 @@ test-syscalls:
printf "The x86_64 syscalls used by the worker binaries have changed. Please review if this is expected and update polkadot/scripts/list-syscalls/*-worker-syscalls as needed.\n";
fi
allow_failure: false # this rarely triggers in practice

subsystem-regression-tests:
  stage: test
  extends:
    - .docker-env
    - .common-refs
    - .run-immediately
  script:
    # - cargo test --profile=testnet -p polkadot-node-core-approval-voting --test approval-voting-regression-bench --features subsystem-benchmarks
    - cargo test --profile=testnet -p polkadot-availability-recovery --test availability-recovery-regression-bench --features subsystem-benchmarks
    - cargo test --profile=testnet -p polkadot-availability-distribution --test availability-distribution-regression-bench --features subsystem-benchmarks
  tags:
    - benchmark
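The same commands reproduce the runs locally; the benchmark tag pins the job to dedicated runners, which matters because (per the review thread further down) the baseline values were calibrated on reference hardware.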
3 changes: 3 additions & 0 deletions Cargo.lock
(Generated file; diff not rendered.)
11 changes: 11 additions & 0 deletions polkadot/node/core/approval-voting/Cargo.toml
@@ -53,3 +53,14 @@ kvdb-memorydb = "0.13.0"
test-helpers = { package = "polkadot-primitives-test-helpers", path = "../../../primitives/test-helpers" }
log = { workspace = true, default-features = true }
env_logger = "0.9.0"

polkadot-subsystem-bench = { path = "../../subsystem-bench" }

[[test]]
name = "approval-voting-regression-bench"
path = "tests/approval-voting-regression-bench.rs"
harness = false
required-features = ["subsystem-benchmarks"]

[features]
subsystem-benchmarks = []
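A note on the [[test]] stanza above (general Cargo behavior, not specific to this PR): harness = false disables the default libtest harness, so the test binary supplies its own fn main and fully controls its exit status. That is what lets the regression benches below fail the CI job by returning an Err:

    // Minimal shape of a harness-less test binary (sketch; the real files follow below).
    fn main() -> Result<(), String> {
        // run benchmarks, collect regression messages, fail via Err(...)
        Ok(())
    }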
176 changes: 176 additions & 0 deletions polkadot/node/core/approval-voting/tests/approval-voting-regression-bench.rs
@@ -0,0 +1,176 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.

// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.

// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.

use polkadot_subsystem_bench::{
    approval::{bench_approvals, prepare_test, ApprovalsOptions},
    configuration::TestConfiguration,
    usage::BenchmarkUsage,
};

const BENCH_COUNT: usize = 3;
const WARM_UP_COUNT: usize = 20;
const WARM_UP_PRECISION: f64 = 0.01;
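For readability (inferred from how the constants are used below): BENCH_COUNT is the number of measured runs averaged into the reported figures, WARM_UP_COUNT caps the number of warm-up iterations before giving up, and WARM_UP_PRECISION is the relative CPU-usage delta (1%) between consecutive runs taken to mean the system is warmed up.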

fn main() -> Result<(), String> {
    let mut messages = vec![];

    messages.extend(approvals_no_shows()?);
    // messages.extend(approvals_throughput()?);
    // messages.extend(approvals_throughput_best_case()?);

    if messages.is_empty() {
        Ok(())
    } else {
        eprintln!("{}", messages.join("\n"));
        Err("Regressions found".to_string())
    }
}

fn test_configuration() -> TestConfiguration {
    let mut config = TestConfiguration::default();
    config.n_validators = 500;
    config.n_cores = 100;
    config.min_pov_size = 1120;
    config.max_pov_size = 5120;
    config.peer_bandwidth = 524288000000;
Contributor: Is this value correct as per validator specs (500 Mbit/s = 62.5 MB/s)?

Contributor (author): @alexggh what do you think? I just copied it from your yaml.
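A back-of-envelope check (my arithmetic, not from the thread): 500 Mbit/s works out to 62.5 MB/s, while the configured value is roughly 8,400 times that, i.e. effectively unthrottled.

    // Assumed units: both the spec and the config value are bytes per second.
    const SPEC_BYTES_PER_S: u64 = 500 * 1_000_000 / 8; // 62_500_000 B/s (500 Mbit/s)
    const CONFIG_BYTES_PER_S: u64 = 524_288_000_000; // ~8_389x the spec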

    config.bandwidth = 524288000000;
    config.num_blocks = 10;
    config.generate_pov_sizes();

    config
}

fn warm_up(test_case: &str, options: ApprovalsOptions) -> Result<(), String> {
    println!("Warming up...");
    let mut prev_run: Option<BenchmarkUsage> = None;
    for _ in 0..WARM_UP_COUNT {
        let curr = run(test_case, options.clone());
        if let Some(ref prev) = prev_run {
            let ap_distr_diff =
                curr.cpu_usage_diff(prev, "approval-distribution").expect("Must exist");
            let ap_vot_diff = curr.cpu_usage_diff(prev, "approval-voting").expect("Must exist");
            if ap_distr_diff < WARM_UP_PRECISION && ap_vot_diff < WARM_UP_PRECISION {
                return Ok(())
            }
        }
        prev_run = Some(curr);
    }

    Err("Can't warm up".to_string())
}
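My reading of the loop above (cpu_usage_diff is defined in polkadot-subsystem-bench, so this is an assumption about its semantics): warm-up succeeds once two consecutive runs report per-subsystem CPU usage within 1% of each other, presumably a relative difference along the lines of:

    // Hypothetical sketch of the relative-difference measure assumed above;
    // the real cpu_usage_diff lives in polkadot-subsystem-bench and may differ.
    fn relative_diff(curr: f64, prev: f64) -> f64 {
        (curr - prev).abs() / prev
    }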

fn benchmark(test_case: &str, options: ApprovalsOptions) -> BenchmarkUsage {
    println!("Benchmarking...");
    let usages: Vec<BenchmarkUsage> =
        (0..BENCH_COUNT).map(|_| run(test_case, options.clone())).collect();
    let usage = BenchmarkUsage::average(&usages);
    println!("{}", usage);
    usage
}

fn run(test_case: &str, options: ApprovalsOptions) -> BenchmarkUsage {
    let (mut env, state) = prepare_test(test_configuration(), options.clone(), false);
    env.runtime().block_on(bench_approvals(test_case, &mut env, state))
}

fn approvals_no_shows() -> Result<Vec<String>, String> {
    let mut messages = vec![];
    let test_case = "approvals_no_shows";
    let options = ApprovalsOptions {
        coalesce_mean: 3.0,
        coalesce_std_dev: 1.0,
        enable_assignments_v2: true,
        last_considered_tranche: 89,
        stop_when_approved: true,
        coalesce_tranche_diff: 12,
        num_no_shows_per_candidate: 10,
        workdir_prefix: "/tmp".to_string(),
    };

    warm_up(test_case, options.clone())?;
    let usage = benchmark(test_case, options.clone());

    messages.extend(usage.check_network_usage(&[
        ("Received from peers", 7000.0, 0.05),
        ("Sent to peers", 8100.0, 0.05),
    ]));
    messages.extend(usage.check_cpu_usage(&[
        ("approval-distribution", 0.990, 0.05),
        ("approval-voting", 1.310, 0.05),
    ]));

    Ok(messages)
}
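The check tuples read as (label, baseline, tolerance), with 0.05 meaning a ±5% band around the baseline; a hypothetical sketch of such a check (the actual logic lives in BenchmarkUsage in polkadot-subsystem-bench):

    // Returns a regression message when the measured value leaves the tolerance band.
    fn check(label: &str, measured: f64, baseline: f64, tolerance: f64) -> Option<String> {
        let deviation = (measured - baseline).abs() / baseline;
        (deviation > tolerance)
            .then(|| format!("{label}: {measured} deviates {:.1}% from {baseline}", deviation * 100.0))
    }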

#[allow(dead_code)]
fn approvals_throughput() -> Result<Vec<String>, String> {
    let mut messages = vec![];
    let test_case = "approvals_throughput";
    let options = ApprovalsOptions {
        coalesce_mean: 3.0,
        coalesce_std_dev: 1.0,
        enable_assignments_v2: true,
        last_considered_tranche: 89,
        stop_when_approved: false,
        coalesce_tranche_diff: 12,
        num_no_shows_per_candidate: 0,
        workdir_prefix: "/tmp".to_string(),
    };

    warm_up(test_case, options.clone())?;
    let usage = benchmark(test_case, options.clone());

    messages.extend(usage.check_network_usage(&[
        ("Received from peers", 52850.0, 0.05),
        ("Sent to peers", 63500.0, 0.05),
    ]));
    messages.extend(usage.check_cpu_usage(&[
        ("approval-distribution", 8.645, 0.05),
        ("approval-voting", 11.085, 0.05),
    ]));

    Ok(messages)
}

#[allow(dead_code)]
fn approvals_throughput_best_case() -> Result<Vec<String>, String> {
    let mut messages = vec![];
    let test_case = "approvals_throughput_best_case";
    let options = ApprovalsOptions {
        coalesce_mean: 3.0,
        coalesce_std_dev: 1.0,
        enable_assignments_v2: true,
        last_considered_tranche: 89,
        stop_when_approved: true,
        coalesce_tranche_diff: 12,
        num_no_shows_per_candidate: 0,
        workdir_prefix: "/tmp".to_string(),
    };

    warm_up(test_case, options.clone())?;
    let usage = benchmark(test_case, options.clone());

    messages.extend(usage.check_network_usage(&[
        ("Received from peers", 2950.0, 0.05),
Contributor: Have these been calibrated on reference hw?

Contributor (author): These tests themselves run on reference hw.
("Sent to peers", 3250.0, 0.05),
]));
messages.extend(usage.check_cpu_usage(&[
("approval-distribution", 0.48, 0.05),
("approval-voting", 0.625, 0.05),
]));

Ok(messages)
}
11 changes: 11 additions & 0 deletions polkadot/node/network/availability-distribution/Cargo.toml
@@ -36,3 +36,14 @@ sc-network = { path = "../../../../substrate/client/network" }
futures-timer = "3.0.2"
assert_matches = "1.4.0"
polkadot-primitives-test-helpers = { path = "../../../primitives/test-helpers" }
polkadot-subsystem-bench = { path = "../../subsystem-bench" }


[[test]]
name = "availability-distribution-regression-bench"
path = "tests/availability-distribution-regression-bench.rs"
harness = false
required-features = ["subsystem-benchmarks"]

[features]
subsystem-benchmarks = []
102 changes: 102 additions & 0 deletions polkadot/node/network/availability-distribution/tests/availability-distribution-regression-bench.rs
@@ -0,0 +1,102 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.

// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.

// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.

use polkadot_subsystem_bench::{
    availability::{benchmark_availability_write, prepare_test, TestDataAvailability, TestState},
    configuration::{PeerLatency, TestConfiguration},
    usage::BenchmarkUsage,
};

const BENCH_COUNT: usize = 3;
const WARM_UP_COUNT: usize = 20;
Contributor: How many benches in practice are needed until WARM_UP_PRECISION is achieved?

Contributor (author): Based on previous runs you never know; it can be 3, it can be 13.
const WARM_UP_PRECISION: f64 = 0.01;

fn main() -> Result<(), String> {
    let mut messages = vec![];
    let mut config = TestConfiguration::default();
    config.latency = Some(PeerLatency { mean_latency_ms: 30, std_dev: 2.0 });
    config.n_validators = 1000;
    config.n_cores = 200;
    config.max_validators_per_core = 5;
    config.min_pov_size = 5120;
    config.max_pov_size = 5120;
    config.peer_bandwidth = 52428800;
Contributor: This is different than the approval-voting setup. It would be better to have the correct value set in the default impl of the config, and include a comment when we override it.

    config.bandwidth = 52428800;
    config.connectivity = 75;
Contributor: Why 75?

Contributor (author): Copied from availability_write.yaml.

    config.num_blocks = 3;
    config.generate_pov_sizes();

    warm_up(config.clone())?;
    let usage = benchmark(config.clone());

    messages.extend(usage.check_network_usage(&[
        ("Received from peers", 4330.0, 0.05),
        ("Sent to peers", 15900.0, 0.05),
    ]));
    messages.extend(usage.check_cpu_usage(&[
        ("availability-distribution", 0.025, 0.05),
        ("bitfield-distribution", 0.085, 0.05),
        ("availability-store", 0.180, 0.05),
    ]));

    if messages.is_empty() {
        Ok(())
    } else {
        eprintln!("{}", messages.join("\n"));
        Err("Regressions found".to_string())
    }
}

fn warm_up(config: TestConfiguration) -> Result<(), String> {
    println!("Warming up...");
    let mut prev_run: Option<BenchmarkUsage> = None;
    for _ in 0..WARM_UP_COUNT {
        let curr = run(config.clone());
        if let Some(ref prev) = prev_run {
            let av_distr_diff =
                curr.cpu_usage_diff(prev, "availability-distribution").expect("Must exist");
            let bitf_distr_diff =
                curr.cpu_usage_diff(prev, "bitfield-distribution").expect("Must exist");
            let av_store_diff =
                curr.cpu_usage_diff(prev, "availability-store").expect("Must exist");
            if av_distr_diff < WARM_UP_PRECISION &&
                bitf_distr_diff < WARM_UP_PRECISION &&
                av_store_diff < WARM_UP_PRECISION
            {
                return Ok(())
            }
        }
        prev_run = Some(curr);
    }

    Err("Can't warm up".to_string())
}

fn benchmark(config: TestConfiguration) -> BenchmarkUsage {
    println!("Benchmarking...");
    let usages: Vec<BenchmarkUsage> = (0..BENCH_COUNT).map(|_| run(config.clone())).collect();
    let usage = BenchmarkUsage::average(&usages);
    println!("{}", usage);
    usage
}

fn run(config: TestConfiguration) -> BenchmarkUsage {
    let mut state = TestState::new(&config);
    let (mut env, _protocol_config) =
        prepare_test(config.clone(), &mut state, TestDataAvailability::Write, false);
    env.runtime()
        .block_on(benchmark_availability_write("data_availability_write", &mut env, state))
}
7 changes: 7 additions & 0 deletions polkadot/node/network/availability-recovery/Cargo.toml
@@ -41,6 +41,13 @@ sc-network = { path = "../../../../substrate/client/network" }

polkadot-node-subsystem-test-helpers = { path = "../../subsystem-test-helpers" }
polkadot-primitives-test-helpers = { path = "../../../primitives/test-helpers" }
polkadot-subsystem-bench = { path = "../../subsystem-bench" }

[[test]]
name = "availability-recovery-regression-bench"
path = "tests/availability-recovery-regression-bench.rs"
harness = false
required-features = ["subsystem-benchmarks"]

[features]
subsystem-benchmarks = []