ISSUE
Overview
This issue concerns the following passage from the documentation of tokio::task::spawn_blocking, which we use to create threads in the worker process:
"Tokio will spawn more blocking threads when they are requested through this function until the upper limit configured on the Builder is reached. After reaching the upper limit, the tasks are put in a queue. The thread limit is very large by default, because spawn_blocking is often used for various kinds of IO operations that cannot be performed asynchronously. When you run CPU-bound code using spawn_blocking, you should keep this large upper limit in mind."
Why is this an issue? Because deterministic results require that all worker threads start at the same time rather than staggered. If a task is queued instead of started, its thread begins late. For example, if the CPU time monitor thread is late to start, we may time out later than other validators and accept candidates that others reject. (We may also at some point start using the measurements from the memory stats thread to reject PVFs, though right now this is used in preparation only, so inaccurate measurements cannot lead to disputes.)
Proposal
If we cannot spawn all the threads immediately, we should return an internal error (i.e. one that does not raise a dispute). This would lead to a retry (since paritytech/polkadot#7011).
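One way to detect a late start is a start-up handshake: each thread signals as its first action, and the spawner waits for all signals within a short deadline. The following is a minimal sketch under stated assumptions, not the worker's actual code. It uses std::thread rather than tokio to stay self-contained, and `SpawnError`, `spawn_all_or_fail`, and the deadline parameter are hypothetical names introduced for illustration. The same handshake could presumably wrap closures passed to spawn_blocking, where a queued task would simply never signal in time.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical internal error: should lead to a retry, not a dispute.
#[derive(Debug, PartialEq)]
pub enum SpawnError {
    /// The OS refused to create a thread.
    SpawnFailed(String),
    /// Not every thread reported that it was running before the deadline.
    StartedLate { started: usize, expected: usize },
}

/// Spawns one thread per name and waits until each has signalled that it
/// is running. Returns an error if any thread fails to start in time.
pub fn spawn_all_or_fail(
    names: &[&str],
    start_deadline: Duration,
) -> Result<Vec<thread::JoinHandle<()>>, SpawnError> {
    let (started_tx, started_rx) = mpsc::channel::<()>();
    let mut handles = Vec::with_capacity(names.len());

    for name in names {
        let tx = started_tx.clone();
        let handle = thread::Builder::new()
            .name(name.to_string())
            .spawn(move || {
                // First action: report that this thread is actually running.
                let _ = tx.send(());
                // The real job or monitoring work would go here.
            })
            .map_err(|e| SpawnError::SpawnFailed(e.to_string()))?;
        handles.push(handle);
    }
    // Drop the original sender so only the spawned threads hold senders.
    drop(started_tx);

    // Wait for one start signal per thread, all within a single deadline.
    let deadline = Instant::now() + start_deadline;
    for started in 0..names.len() {
        let remaining = deadline.saturating_duration_since(Instant::now());
        if started_rx.recv_timeout(remaining).is_err() {
            return Err(SpawnError::StartedLate {
                started,
                expected: names.len(),
            });
        }
    }
    Ok(handles)
}
```

A caller would map either error variant to the internal-error path so that the job is retried instead of producing a verdict based on late-started monitors.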
That said, because the default limit in tokio is very high, it is not clear how it could ever be reached in practice. Each execution job uses 2 threads, and each prepare job uses 3. So I would say this is low priority, but something to keep in mind.
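To put the thread counts above in perspective, here is a rough back-of-the-envelope sketch. It assumes the runtime keeps tokio's documented default of 512 for max_blocking_threads and that nothing else uses the blocking pool. Both are assumptions, and the constant and function names are hypothetical.

```rust
/// Assumption: the runtime uses tokio's documented default blocking-thread limit.
pub const ASSUMED_MAX_BLOCKING_THREADS: usize = 512;
/// Thread counts per job, as stated in this issue.
pub const THREADS_PER_EXECUTE_JOB: usize = 2;
pub const THREADS_PER_PREPARE_JOB: usize = 3;

/// Number of concurrent jobs that fit under `limit` before tasks would queue.
pub fn jobs_that_fit(limit: usize, threads_per_job: usize) -> usize {
    limit / threads_per_job
}
```

Under these assumptions, `jobs_that_fit(ASSUMED_MAX_BLOCKING_THREADS, THREADS_PER_EXECUTE_JOB)` is 256 and the prepare-job equivalent is 170, which is far more concurrent jobs than a single worker process would be expected to run.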