Introduce block authorship soft deadline #9663

tomusdrw · 2021-08-31T15:19:41Z

Currently, when packing transactions into a block, there are a few conditions under which we decide the block is ready to be signed by the consensus engine and gossiped to the network:

1. the block size limit is reached,
2. the block weight limit is reached (indirectly, through "exhausts resources" errors),
3. we run out of transactions in the transaction pool,
4. the deadline to produce the block is reached.

The first 3 I'd consider "regular" conditions; the last one is rather a safety valve, ensuring that the block producer does not miss its slot due to some irregular slowness.

However, since optimal block packing is a difficult problem (i.e. producing a block with optimal utilisation while maximising the cumulative priority of all transactions in that block), we currently use a very simple greedy heuristic, with a twist: whenever we run into the first transaction that reports resource exhaustion, we attempt to insert at least MAX_SKIPPED_TRANSACTIONS more into the block before concluding it's actually full.

This heuristic is obviously sub-optimal and can be gamed, hence the PR introduces another variant of the heuristic to potentially increase block utilisation, but without reaching the hard deadline defined by the consensus engine.

The PR introduces a "soft deadline" (half of the hard deadline time). Before the soft deadline is reached, we can try as many transactions as desired from the transaction pool, even if they report resource exhaustion. After the soft deadline, we switch to the previous heuristic of trying at most MAX_SKIPPED_TRANSACTIONS.
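A minimal sketch of the heuristic described above, not the code from this PR. The `PushResult` type, the `try_push` and `clock` hooks, the `soft_deadline_at` helper and the value of `MAX_SKIPPED_TRANSACTIONS` are all assumptions made for illustration.

```rust
use std::time::{Duration, Instant};

/// Assumed cap on tolerated "exhausts resources" failures (illustrative value).
const MAX_SKIPPED_TRANSACTIONS: usize = 8;

/// Simplified stand-in for the result of pushing one transaction into the block.
#[derive(Clone, Copy, Debug, PartialEq)]
enum PushResult {
    Included,
    ExhaustsResources,
}

/// The soft deadline sits at half of the time budget given by the hard deadline.
fn soft_deadline_at(start: Instant, hard_budget: Duration) -> Instant {
    start + hard_budget / 2
}

/// Tries transactions `0..pool_len` in order and returns the indices that were included.
/// `try_push` and `clock` are hypothetical hooks standing in for the runtime and the timer.
fn pack_block(
    pool_len: usize,
    mut try_push: impl FnMut(usize) -> PushResult,
    mut clock: impl FnMut() -> Instant,
    hard_deadline: Instant,
    soft_deadline: Instant,
) -> Vec<usize> {
    let mut included = Vec::new();
    let mut skipped = 0;
    for tx in 0..pool_len {
        let now = clock();
        if now >= hard_deadline {
            break; // safety valve: never miss the slot
        }
        match try_push(tx) {
            PushResult::Included => included.push(tx),
            PushResult::ExhaustsResources if skipped < MAX_SKIPPED_TRANSACTIONS => {
                skipped += 1; // previous heuristic: tolerate a bounded number of misses
            }
            PushResult::ExhaustsResources if now < soft_deadline => {
                // new behaviour: before the soft deadline, keep trying further transactions
            }
            PushResult::ExhaustsResources => break, // past the soft deadline: treat the block as full
        }
    }
    included
}
```

With a pool whose first ten transactions all report resource exhaustion, this sketch still reaches the later transactions as long as the clock is before the soft deadline. Once the soft deadline has passed, it stops at the first miss beyond the skip budget.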

gilescope · 2021-09-02T08:05:49Z

What's the smallest size a tx can be? Maybe we should be checking if block_size + MIN_POSSIBLE_TX_SIZE > block_size_limit because there's always going to be a little bit left over but there may be zero chance that any tx can realistically fit in there.

tomusdrw · 2021-09-02T11:51:48Z

What's the smallest size a tx can be?

That's a very good point, but it's up to the runtime to decide and we currently have no good way to ask the runtime about that. Also note that any additional heuristic we add should be added for both the size limit and the "resource/weight" limit, to make sure we cater for all kinds of runtimes.
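For illustration only, the check suggested above could look roughly like this when applied to both limits. `MIN_POSSIBLE_TX_SIZE` and `MIN_POSSIBLE_TX_WEIGHT` are hypothetical constants with made-up values; as noted in the reply, the client currently has no way to obtain them from the runtime.

```rust
/// Hypothetical lower bounds for a single transaction (made-up values).
const MIN_POSSIBLE_TX_SIZE: usize = 100;
const MIN_POSSIBLE_TX_WEIGHT: u64 = 1_000;

/// Resources already consumed by the block under construction.
struct BlockUsage {
    size: usize,
    weight: u64,
}

/// Limits imposed on the block.
struct BlockLimits {
    size: usize,
    weight: u64,
}

/// True when not even the smallest possible transaction could still fit,
/// either by size or by weight.
fn is_effectively_full(used: &BlockUsage, limits: &BlockLimits) -> bool {
    used.size.saturating_add(MIN_POSSIBLE_TX_SIZE) > limits.size
        || used.weight.saturating_add(MIN_POSSIBLE_TX_WEIGHT) > limits.weight
}
```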

tomusdrw · 2021-09-13T13:50:47Z

@bkchr can you take a look at whether the soft deadline is okay, or whether you'd prefer to add transactions up to the hard deadline?

gww-parity · 2021-09-28T15:13:43Z

What about failed test:

failures:
---- basic_authorship::tests::should_not_remove_invalid_transactions_when_skipping stdout ----
thread 'basic_authorship::tests::should_not_remove_invalid_transactions_when_skipping' panicked at 'assertion failed: `(left == right)`
  left: `1`,
 right: `2`', client/basic-authorship/src/basic_authorship.rs:759:13
stack backtrace:
   0: rust_begin_unwind
             at /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b/library/std/src/panicking.rs:515:5
   1: core::panicking::panic_fmt
             at /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b/library/core/src/panicking.rs:92:14
   2: core::panicking::assert_failed_inner
   3: core::panicking::assert_failed
   4: sc_basic_authorship::basic_authorship::tests::should_not_remove_invalid_transactions_when_skipping::{{closure}}
   5: sc_basic_authorship::basic_authorship::tests::should_not_remove_invalid_transactions_when_skipping
   6: core::ops::function::FnOnce::call_once
             at /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
failures:
    basic_authorship::tests::should_not_remove_invalid_transactions_when_skipping
test result: FAILED. 6 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.25s
error: test failed, to rerun pass '-p sc-basic-authorship --lib'

?

gww-parity

what's below

Introduce block authorship soft deadline #9663 (comment) of test

gww-parity · 2021-09-29T13:41:04Z

client/basic-authorship/src/basic_authorship.rs

+		);
+
+		// when
+		let deadline = time::Duration::from_secs(300);


as this number is hardcoded:

should it be explained in a few words why 300, and when a user might want to tweak it?

or maybe make const out of it?

This is a test.

Still, I don't know why this value was chosen for the test. To me tests are documentation, and I don't know how to read it: is it random? Or semi-random? etc. Overall, yes, it's not blocking approval, just something that would imho be nice to polish.

Seeing tests as docs is really a bad idea.

substrate/primitives/consensus/common/src/lib.rs

Line 208 in 9d058d4

/// Create a proposal.

here are the docs

The deadline is that high to not have it triggered.

Seeing tests as docs is really a bad idea.

I see we come from different background and experience ;).

Otherwise, thank you for pointers and links.

Again, I agree with you that this is just a detail, so whatever will be fine with me (may be even ignored) and as you see my approval for PR was already given regardless :).

I see we come from different background and experience ;).

I mean, I would not say that I do not read tests from time to time, but this normally means that the docs are shit and I need some examples on how to use it :D

:). I consider that good tests have a lot of value. They can document expectations in a non-ambiguous way (contrary to human natural language) and keep assumptions in check (e.g. setting time limits without too much slack can help you detect when someone accidentally introduces a performance regression that may not be caught by benchmarking, or make diagnosis easier... maybe not in this particular piece of code, but in general).
Ideally, tests cover expected and edge cases, so I can read from them what edge cases may look like, but yeah, it's not always worth the effort.

Regarding the code review process, I consider asking questions about tests a practice that motivates rethinking hidden assumptions and whether we can do better. E.g. here, maybe 300 is 10x too much and maybe we can go 10x lower? Maybe it will help to catch other issues? Or maybe it's not worth it and we keep 300 and move on. :)

TL;DR: to me, documentation and tests have the potential to (but do not always have to) document different things, from different angles :).

bkchr · 2021-09-29T19:12:57Z

client/basic-authorship/src/basic_authorship.rs

+		);
+
+		// when
+		let deadline = time::Duration::from_secs(300);


This is a test.

bkchr · 2021-09-29T19:19:28Z

client/basic-authorship/src/basic_authorship.rs

@@ -386,6 +390,13 @@ where
 						MAX_SKIPPED_TRANSACTIONS - skipped,
 					);
 					continue
+				} else if (self.now)() < soft_deadline {


Suggested change

-				} else if (self.now)() < soft_deadline {
+				} else if now < soft_deadline {

I wonder, isn't the original (self.now)() actually intentional, to get a more recent reading at the time of the test?

Otherwise, to me both are good enough to not block approval.

It's a good suggestion: there is nothing intensive happening between these calls, and it's worth saving this one extra syscall. That was my intention when I initially introduced the now variable as well, but it somehow got lost in all the refactorings.
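A small sketch of the idea behind the suggestion, assuming a hypothetical `Proposer` that stores its clock as a boxed closure: the clock is read once and the same reading serves both deadline comparisons. The `counting_clock` helper is made up here to make the number of clock reads observable.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::time::Instant;

/// Hypothetical proposer holding an injectable clock, so tests can control time.
struct Proposer {
    now: Box<dyn Fn() -> Instant>,
}

impl Proposer {
    /// Returns `(past_hard_deadline, before_soft_deadline)` from a single clock reading.
    fn check_deadlines(&self, hard_deadline: Instant, soft_deadline: Instant) -> (bool, bool) {
        // The parentheses are required to call a closure stored in a struct field.
        let now = (self.now)();
        (now >= hard_deadline, now < soft_deadline)
    }
}

/// Builds a clock that always returns `fixed` and counts how often it is read.
fn counting_clock(fixed: Instant) -> (Box<dyn Fn() -> Instant>, Rc<Cell<u32>>) {
    let calls = Rc::new(Cell::new(0));
    let counter = Rc::clone(&calls);
    let clock: Box<dyn Fn() -> Instant> = Box::new(move || {
        counter.set(counter.get() + 1);
        fixed
    });
    (clock, calls)
}
```

The trade-off raised in the thread still applies: a cached reading is slightly older than a fresh one, which only matters if something slow runs between the two comparisons.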

bkchr · 2021-09-29T19:21:36Z

What about failed test:

failures:
---- basic_authorship::tests::should_not_remove_invalid_transactions_when_skipping stdout ----

?

This is a flaky test

tomusdrw · 2021-10-04T14:30:34Z

bot merge

ghost · 2021-10-04T14:30:46Z

Trying merge.

* master: (125 commits)
  Update multiple dependencies (#9936)
  Speed up timestamp generation when logging (#9933)
  First word should be Substrate not Polkadot (#9935)
  Improved file not found error message (#9931)
  don't read events in elections anymore. (#9898)
  Remove incorrect sanity check (#9924)
  Require crypto scheme for `insert-key` (#9909)
  chore: refresh of the substrate_builder image (#9808)
  Introduce block authorship soft deadline (#9663)
  Rework Transaction Priority calculation (#9834)
  Do not propagate host RUSTFLAGS when checking for WASM toolchain (#9926)
  Small quoting comment fix (#9927)
  add clippy to CI (#9694)
  Ensure BeforeBestBlockBy voting rule accounts for base (#9920)
  rm `.maintain` lock (#9919)
  Downstream `node-template` pull (#9915)
  Implement core::fmt::Debug for BoundedVec (#9914)
  Quickly skip invalid transactions during block authorship. (#9789)
  Add SS58 prefix for Automata (#9805)
  Clean up sc-peerset (#9806)
  ...

crystalin · 2021-10-24T18:23:47Z

@tomusdrw it looks great. Would it be possible to make either MAX_SKIPPED_TRANSACTIONS or the soft_deadline configurable? In our case we often have to skip transactions. With this PR we can use half of the time, which is better than before, but we still can't fill a block, as MAX_SKIPPED_TRANSACTIONS can easily be exceeded by the time half of the allocated block time is reached.

bkchr · 2021-10-24T18:52:11Z

@crystalin why do you often have to skip transactions?

crystalin · 2021-10-24T21:00:20Z

@bkchr
Because in Moonbeam we allow native Ethereum transactions. To deal with weight, we simply multiply the provided gas_limit by a given ratio.
However, this allows a user to submit a transaction with a very high gas_limit, and so with a very high associated weight, even if the weight actually used would be a lot lower.

On Ethereum mainnet the logic is to try executing the transaction even if the current block gas plus the transaction's gas_limit is over the block maximum gas.
But in Substrate, transactions with an expected weight that might go over the block weight limit are skipped (which could be a good optimization).

We observed, however, that when someone spams a few transactions with a very high gas_limit, it can prevent the block from including more than 8 transactions.

This PR improves the situation by making sure we use at least 1/2 of the block production time, no matter how many transactions are skipped.
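To make the effect described above concrete, here is a rough sketch. `WEIGHT_PER_GAS` and `MAX_BLOCK_WEIGHT` are made-up numbers, not Moonbeam's actual configuration, and `is_skipped` only models the "would exceed the block weight limit" check.

```rust
/// Hypothetical conversion ratio and block limit, chosen only for illustration.
const WEIGHT_PER_GAS: u64 = 25_000;
const MAX_BLOCK_WEIGHT: u64 = 500_000_000_000;

/// Weight charged up front: derived from the declared gas_limit,
/// not from the gas the transaction would actually use.
fn declared_weight(gas_limit: u64) -> u64 {
    gas_limit.saturating_mul(WEIGHT_PER_GAS)
}

/// True when the transaction is skipped because its declared weight
/// does not fit into the remaining block weight.
fn is_skipped(block_weight: u64, gas_limit: u64) -> bool {
    block_weight.saturating_add(declared_weight(gas_limit)) > MAX_BLOCK_WEIGHT
}
```

Under these assumed numbers, a transaction declaring a gas_limit of 15,000,000 is skipped once the block is 40% full, even if it would only have consumed 21,000 gas. Each such skip counts against MAX_SKIPPED_TRANSACTIONS.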

bkchr · 2021-10-25T11:31:13Z

@crystalin if you prepare a pr, we can merge it.

crystalin · 2021-10-25T12:43:57Z

@bkchr What would be the preferred way? Allow control of MAX_SKIPPED_TRANSACTIONS, of the soft_deadline, or of both?

bkchr · 2021-10-25T17:27:06Z

Both? IDK. What do you need?

tomusdrw · 2021-10-27T10:23:51Z

I think soft_deadline is less of a footgun (basically it should be configured as a percentage of the hard deadline); with a configurable max skipped transactions you risk reaching the hard_deadline and potentially missing a slot.

I find it a bit surprising that the current hardcoded soft deadline is insufficient though. If we assume the block time of 6s and hard deadline of 3s, having around 1.5s (soft deadline) to produce a block should be more than enough - if block production/import takes more than this I feel that maybe the client is not running on powerful enough hardware or the maximal block weight is quite high.

crystalin · 2021-10-27T12:02:54Z

@tomusdrw the issue is not with a blockchain but with a parachain:
The hard_deadline is set to 333ms (2/3 of the max block time of 500ms), so the soft_deadline is around 166ms. It is enough to include some transactions but still very limiting.

tomusdrw · 2021-10-30T08:22:27Z

I see, making soft_deadline(_percentage) configurable makes sense then.
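As a sketch of what a percentage-based setting could compute (the function name and signature are assumptions, not the API of the follow-up PR), using the numbers quoted in this thread:

```rust
use std::time::Duration;

/// Hypothetical helper: the soft deadline as a percentage of the hard deadline.
/// Percentages above 100 are clamped so the soft deadline never exceeds the hard one.
fn soft_deadline(hard_deadline: Duration, percent: u8) -> Duration {
    hard_deadline * u32::from(percent.min(100)) / 100
}
```

At 50%, a 3s hard deadline gives 1.5s and a 333ms hard deadline gives 166.5ms, matching the figures above. Raising the percentage to 80 would give a parachain 266.4ms instead.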

tomusdrw added 6 commits August 24, 2021 10:45

Soft limit.

f862f1a

Merge branch 'master' into td-limit

08dee90

Merge branch 'master' into td-limit

d81aa66

Merge branch 'master' into td-limit

e8e6e82

Add soft deadline tests.

d4f8473

Merge branch 'master' into td-limit

411d428

tomusdrw added the A0-please_review (Pull request needs code review), B5-clientnoteworthy and C3-medium (PR touches the given topic and has a medium impact on builders) labels Aug 31, 2021

tomusdrw requested review from pepyakin and gww-parity August 31, 2021 15:19

tomusdrw added the D5-nicetohaveaudit (PR contains trivial changes to logic that should be properly reviewed) label Aug 31, 2021

cargo +nightly fmt --all

f0dcb32

tomusdrw requested a review from bkchr September 13, 2021 13:50

tomusdrw added 4 commits September 13, 2021 15:51

Merge branch 'master' into td-limit

fad8c2f

Fix sc-service test.

b62732d

Merge branch 'master' into td-limit

959e05e

Merge branch 'master' into td-limit

9d058d4

gww-parity reviewed Sep 29, 2021

bkchr approved these changes Sep 29, 2021

gww-parity approved these changes Sep 29, 2021

tomusdrw added 3 commits September 30, 2021 16:57

Merge branch 'master' into td-limit

cae5871

Improving tests

04903c9

Merge branch 'master' into td-limit

0b5244b

ghost merged commit 125092f into master Oct 4, 2021

ghost deleted the td-limit branch October 4, 2021 14:30

tomusdrw mentioned this pull request Oct 30, 2021

Make authorship soft deadline configurable. #10125

Merged

ghzlatarev mentioned this pull request Nov 2, 2021

[Manta-PC] Update dependencies to v0.9.12 Manta-Network/Manta#242

Merged

19 tasks

github-actions bot mentioned this pull request Nov 2, 2021

Update substrate/polkadot/cumulus from v0.9.11 to v0.9.12 moonbeam-foundation/moonbeam#951

Closed

This pull request was closed.
