Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Initial: Offchain Workers #1942

Merged
merged 46 commits into from Mar 25, 2019
Merged

Initial: Offchain Workers #1942

merged 46 commits into from Mar 25, 2019

Conversation

tomusdrw
Copy link
Contributor

@tomusdrw tomusdrw commented Mar 7, 2019

Initial version of offchain workers (#1458)

  • The worker task is started after every block.
  • All modules by default do nothing in the offchain task, they can implement a behaviour by specifying fn offchain_worker() function in decl_module (alike on_initialise)
  • Runtime uses Executive module to dispatch the call to all modules.
  • Offchain tasks run with a different Context of execution, it means that additional externalities are available for those functions (currently only submit_extrinsic).
  • All inherents produced during execution end up in a very simple InherentsPool and are included as extrinsics (after regular inherents, but before transactions) when new block is created.

Additional work to folllow:

  • Implement a concrete example for delegated voting.
  • Support async fetch.
  • Handle long-running tasks gracefuly.
  • Require extrinsics to be included before a particular block
  • Use OpaqueTransaction for submit_extrinsic transaction type

@tomusdrw tomusdrw added A0-please_review Pull request needs code review. M4-core labels Mar 7, 2019
@@ -364,6 +364,7 @@ where
syncing: cli.syncing_execution.into(),
importing: cli.importing_execution.into(),
block_construction: cli.block_construction_execution.into(),
offchain_worker: ExecutionStrategy::NativeWhenPossible,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it always be NativeWhenPossible or configurable by the user at some point?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be configurable primarily to keep it consistent with all the other wasm execution. But I don't see any instance where you would not want to use native if it's there.

Maybe in debugging...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was not sure as well, but if it does not need to be configurable, we don't need to touch this struct at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add the option for consistency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that native workers are not consensus critical, NativeWhenPossible seems like the most reasonable strategy

/// or to the next produced block (inherent).
fn submit_extrinsic(&mut self, extrinsic: Vec<u8>);
}
impl<T: OffchainExt + ?Sized> OffchainExt for Box<T> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do wen need this implementation? Box supports DerefMut, shouldn't that be enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but DerefMut just means that we can use functions from that trait directly on Box<T>, we need to pass Box<T> to a generic function accepting T: OffchainExt though.

I wasn't sure about the API initially and chose to handle generic parameters instead of Box, but currently the only way we actually pass the offchain extensions is by ExecutionContext, and making it generic was introducing too much noise in the code so I chose to Box, but only there.

Let me know if you think it's better to change the other apis and structs to store Box instead of generic param.

[dependencies]
client = { package = "substrate-client", path = "../../core/client" }
consensus = { package = "substrate-consensus-common", path = "../../core/consensus/common" }
futures = "0.1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
futures = "0.1"
futures = "0.1.25"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why it's better. Cargo.lock manages the concrete version anyway and it's just misleading to see 0.1.17 in some other packages (like core/client/Cargo.toml) while after all 0.1.25 is used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomusdrw

Cargo.lock manages the concrete version anyway

not on downstream crates. we've been bitten by that before.

You should use futures = "0.1.x" where x is whatever you need for the features you use in this crate. Probably some of them were added after 0.1 -- your Cargo.toml says that any futures crate including 0.1 is supposed to work.

The best strategy is as Basti suggested; just use the latest patch version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I forgot that substrate is used as a library, I'm still not convinced though that it's a good strategy to just use latest patch version, the minimal futures version might be just managed in a single place (say service), cause one crate in the dependency tree is enough to define the minimal required version (specifying latest patch might be overly restrictive), that would also give us some consistency in other crates, right now it seems mostly random.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have said the easiest strategy, not best :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough :)

authors = ["Parity Technologies <admin@parity.io>"]
edition = "2018"

[lib]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm?

|| self.prepare_environment_block(at),
manager,
native_call,
Some(&mut ext),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not just do:

match context {
    ExecutionContext::OffchainWorker(mut ext) => Some(&mut ext),
    _ => None,
}

or something comparable, instead of cloning this function call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we can do that although it will result in a different type of Option<>. The NeverOffchainExt should be optimized out by the compiler, since it's not instantiable, but I guess it won't matter that much.

#![warn(missing_docs)]

use client::decl_runtime_apis;
use runtime_primitives::traits::{Header as HeaderT, Block as BlockT};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
use runtime_primitives::traits::{Header as HeaderT, Block as BlockT};
use runtime_primitives::traits::NumberFor;

core/offchain/src/api.rs Show resolved Hide resolved
let has_api = runtime.has_api::<OffchainWorkerApi<Block>>(&at);
debug!("Checking offchain workers at {:?}: {:?}", at, has_api);

if let Ok(true) = has_api {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if let Ok(true) = has_api {
if has_api.unwrap_or(false) {

/// Depending on the kind of extrinsic it will either be:
/// 1. scheduled to be included in the next produced block (inherent)
/// 2. added to the pool and propagated (transaction)
pub fn submit_extrinsic(data: Vec<u8>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we maybe want to switch to the following interface?

pub fn submit_extrinsic<T: Encode>(data: T) {

@@ -518,6 +518,14 @@ impl_function_executor!(this: FunctionExecutor<'e, E>,

Ok(0)
},
ext_submit_extrinsic(msg_data: *const u8, len: u32) => {
let extrinsic = this.memory.get(msg_data, len as usize)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we maybe want to fail, when someone tries to call it on-chain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this.ext.submit_extrinsic is going to fail or (as currently) trigger a warning. We can either panic (which I'm not a fan of in consensus code) or just return an Error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah return an Error. I don't know what happens with this error, but we will see it :D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke with Sergei and errors are converted to traps in the wasm environment, so I've also added a panicking implementation in case of std environment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke with Sergei and the Errors are converted to traps in wasm environment so I've also added a panicking implementation in sr_io::with_std

@@ -329,6 +340,17 @@ where
self.changes_trie_transaction = root_and_tx;
root
}

fn submit_extrinsic(&mut self, extrinsic: Vec<u8>) -> Result<(), ()> {
let _guard = panic_handler::AbortGuard::new(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could possibly be wrong, but could you, please, verify @tomusdrw :

  1. most of times runtime calls are performed with NeverOffchainExt;
  2. NeverOffchainExt::submit_extrinsic is simply unreachable!();
  3. this guard means that if we met panic!(), unreachable!(), etc in the code, the process will exit(1) (see panic_handler);
  4. the runtime could try to call ext_submit_extrinsic anytime (i.e. from initialise_block), and, given (1)..(3) the process will exit with 1 instead of failing execution?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wait :) NeverOffchainExt::new returns None, so we will go to else branch here :)

// You should have received a copy of the GNU General Public License
// along with Substrate. If not, see <http://www.gnu.org/licenses/>.

//! Substrate service. Starts a thread that spins up the network, client, and extrinsic pool.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like docs are for substrate_service::Service

@@ -81,23 +80,33 @@ where
/// `storage_changes_root` is called matters + we need to remember additional
/// data at this moment (block number).
changes_trie_transaction: Option<(u64, MemoryDB<H>, H::Out)>,
/// Additional externalities for offchain workers.
/// If None the some methods from the trait might not supported.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// If None the some methods from the trait might not supported.
/// If None some methods from the trait might not supported.

My english is bad but this is a typo isn't it ?

Copy link
Contributor

@pepyakin pepyakin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! A few nits and one Q

@@ -375,7 +375,7 @@ mod tests {
};

let changes_trie_storage = InMemoryChangesTrieStorage::new();
let mut ext = Ext::new(&mut overlay, &backend, Some(&changes_trie_storage));
let mut ext = Ext::new(&mut overlay, &backend, Some(&changes_trie_storage), crate::NeverOffchainExt::new());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we wrap this?

Suggested change
let mut ext = Ext::new(&mut overlay, &backend, Some(&changes_trie_storage), crate::NeverOffchainExt::new());
let mut ext = Ext::new(
&mut overlay,
&backend,
Some(&changes_trie_storage),
crate::NeverOffchainExt::new(),
);

@@ -324,7 +347,8 @@ impl<'a, H, B, T, Exec> StateMachine<'a, H, B, T, Exec> where
R: Decode + Encode + PartialEq,
NC: FnOnce() -> result::Result<R, &'static str> + UnwindSafe,
{
let mut externalities = ext::Ext::new(self.overlay, self.backend, self.changes_trie_storage);
let offchain = self.offchain_ext.as_mut();
let mut externalities = ext::Ext::new(self.overlay, self.backend, self.changes_trie_storage, offchain.map(|x| &mut **x));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we wrap this?

Suggested change
let mut externalities = ext::Ext::new(self.overlay, self.backend, self.changes_trie_storage, offchain.map(|x| &mut **x));
let mut externalities = ext::Ext::new(
self.overlay,
self.backend,
self.changes_trie_storage,
offchain.map(|x| &mut **x),
);

///
/// The extrinsic will either go to the pool (signed)
/// or to the next produced block (inherent).
fn submit_extrinsic(&mut self, extrinsic: Vec<u8>);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be cool if we introduced a type alias for an OpaqueExtrinsic. This way it will be a) more apparent from the code that we deal with an extrinsic and b) will allow us to add some docs on why we use Vec for representing an extrinsic.

(Not to be addressed in this PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm going to address that when valid_until parameter is introduced, then this will be a common type for Vec<u8> and BlockNumber

config.offchain_worker = match (cli.offchain_worker, role) {
(params::OffchainWorkerEnabled::WhenValidating, service::Roles::AUTHORITY) => true,
(params::OffchainWorkerEnabled::Always, _) => true,
_ => false,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use exhaustive matching so we don't trip up accidently if we add a new variant here.

Suggested change
_ => false,
(params::OffchainWorkerEnabled::Never, _) => false,


if has_api.unwrap_or(false) {
let (api, runner) = api::Api::new(pool.clone(), self.inherents_pool.clone(), at.clone());
self.executor.spawn(runner.process());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't know much about Rust futures, but can you ELI5 how this terminates?
Is the receiver returns Ready when tx is dropped?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so when the work runtime is doing is finished the api transmitting end will be dropped, that in turn will cause the receiving.for_each stream to be terminated, so the entire future returned from process() will terminate.

But we currently allow the offchain workers to run arbitrarily long, it might make sense actually to kill them after some timeout, but that's something I don't really know how to do:

  1. I think it's not possible to terminate wasm task currently
  2. We could error when the worker attempts to use ext_submit_extrinsic (or even any other externality), but obviously it can also never happen.

A: ChainApi<Block=Block> + 'static,
{
let runtime = self.client.runtime_api();
let at = BlockId::number(*number);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last question: why block is identified by number (i.e. not the hash)? What if not best block is imported? Then the offchain worker could be executed with state of some other block (and probably it could be executed several times if there are several competing forks of the height number). Is this intended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that's expected. Since the offchain worker produces inherents for next block they might look different if an alternative fork is imported.
We might implement a way to invalidate previously generated inherents/extrinsics though, however
offchain workers are also supposed to have a local storage in the future, so two "competing" tasks could have some sort of communication between runs i.e. the second run for block at height X will be able to tell it's on a fork that now became canon and for instance submit extrinsics that will replace the previous ones.

Numeric block number will also later be used by the offchain worker code to calculate lifetime of produced extrinsics (i.e. you can say I submit this inherent after performing a super-long-running computation, but it's only valid before block bn + 3), or you can also use that to execute some computation every X blocks.

@gavofyork gavofyork merged commit a30d6a1 into master Mar 25, 2019
@gavofyork gavofyork deleted the td-offlineworker branch March 25, 2019 22:23
MTDK1 pushed a commit to bdevux/substrate that referenced this pull request Apr 12, 2019
* Refactor state-machine stuff.

* Fix tests.

* WiP

* WiP2

* Service support for offchain workers.

* Service support for offchain workers.

* Testing offchain worker.

* Initial version working.

* Pass side effects in call.

* Pass OffchainExt in context.

* Submit extrinsics to the pool.

* Support inherents.

* Insert to inherents pool.

* Inserting to the pool asynchronously.

* Add test to offchain worker.

* Implement convenience syntax for modules.

* Dispatching offchain worker through executive.

* Fix offchain test.

* Remove offchain worker from timestamp.

* Update Cargo.lock.

* Address review comments.

* Use latest patch version for futures.

* Add CLI parameter for offchain worker.

* Fix compilation.

* Fix test.

* Fix extrinsics format for tests.

* Fix RPC test.

* Bump spec version.

* Fix executive.

* Fix support macro.

* Address grumbles.

* Bump runtime
@Polkadot-Forum
Copy link

This pull request has been mentioned on Polkadot Forum. There might be relevant details there:

https://forum.polkadot.network/t/offchain-workers-design-assumptions-vulnerabilities/2548/1

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants