Delegated Consensus: working example with a devnet version genesis file #1766

Merged Aug 15, 2022 (20 commits)

@aakoshh aakoshh commented Aug 3, 2022

Summary of changes
Changes introduced in this pull request:

  • Adds the devnet CIDs as acceptable for the builtin actors.
  • Adds devnet as an acceptable network name, so it can be used in a config file. Previously it had to be calibnet or mainnet.
  • Adds a workaround for the mishandling of t and f prefixes in keys.
  • Adds an option to set tipset_sample_size to 0 to get a standalone node that doesn't try to sync with any network. It goes straight into following mode so that it can start producing blocks, since it has no peers to sync with.
  • Adds config.toml files for a standalone proposer and a follower bootstrapping from it, to try delegated consensus.
  • Prints the PeerId during startup, so it's easier to add it to the config of a node that wants to bootstrap from the one we're running.
  • Factors out the reward message calculation from the VM into a trait, which is passed to the VM as Arc<dyn RewardCalc>. I tried static typing, but the cascading effect is huge: it requires adding a generic type to every JSON-API method if the type parameter is added to StateManager, or to every method of StateManager if it's added only to the ones that create VMs, which again spirals out because even a key-to-id resolution can result in the execution of state changes.
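
The last point can be sketched roughly as follows. This is a minimal illustration of handing a consensus-specific reward calculation to the VM as a trait object, not the code from this PR: only the `RewardCalc` name and the `Arc<dyn RewardCalc>` shape come from the description above, while `RewardMessage`, `FixedReward`, `NoReward`, `Vm` and all method signatures are hypothetical.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the implicit message that pays the block producer.
#[derive(Debug, PartialEq)]
struct RewardMessage {
    miner_id: u64,
    amount: u128,
}

/// Consensus-specific reward calculation. It is object safe, so it can be
/// shared as `Arc<dyn RewardCalc>` without adding a generic parameter to callers.
trait RewardCalc: Send + Sync {
    /// Returns the reward message for a block, or `None` if nothing is paid.
    fn reward_message(&self, miner_id: u64, gas_reward: u128) -> Option<RewardMessage>;
}

/// Hypothetical calculation: a fixed block reward plus the collected gas reward.
struct FixedReward(u128);

impl RewardCalc for FixedReward {
    fn reward_message(&self, miner_id: u64, gas_reward: u128) -> Option<RewardMessage> {
        Some(RewardMessage { miner_id, amount: self.0 + gas_reward })
    }
}

/// Hypothetical calculation for a consensus that pays no rewards at all.
struct NoReward;

impl RewardCalc for NoReward {
    fn reward_message(&self, _miner_id: u64, _gas_reward: u128) -> Option<RewardMessage> {
        None
    }
}

/// The VM only knows about the trait object, not the concrete consensus.
struct Vm {
    reward_calc: Arc<dyn RewardCalc>,
}

impl Vm {
    fn new(reward_calc: Arc<dyn RewardCalc>) -> Self {
        Vm { reward_calc }
    }

    /// Dispatches dynamically to whichever calculation was injected.
    fn block_reward(&self, miner_id: u64, gas_reward: u128) -> Option<RewardMessage> {
        self.reward_calc.reward_message(miner_id, gas_reward)
    }
}

fn main() {
    let vm = Vm::new(Arc::new(FixedReward(10)));
    println!("{:?}", vm.block_reward(1000, 5));

    let vm = Vm::new(Arc::new(NoReward));
    println!("{:?}", vm.block_reward(1000, 5));
}
```

Because `Vm` stores a trait object, nothing that constructs or calls it needs a generic parameter, which is the cascading change the static approach would have required.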

Troubles generating a genesis.car file

First I tried to generate a genesis.car file using Lotus with the actor CIDs that Forest expects: either calibnet or mainnet.

Here are the steps:

make lotus-seed
mkdir -p genesis-files
rm -rf genesis-files/*
lotus-seed --sector-dir ./genesis-files pre-seal
lotus-seed --sector-dir ./genesis-files genesis new "./genesis-files/genesis.json"
lotus-seed --sector-dir ./genesis-files genesis add-miner "./genesis-files/genesis.json" "./genesis-files/pre-seal-f01000.json"
lotus-seed --sector-dir ./genesis-files genesis car --out "./genesis-files/genesis.car" "./genesis-files/genesis.json"

This fails with the following error:

2022-08-02T16:15:10.771Z	WARN	lotus-seed	lotus-seed/main.go:54	make genesis block: failed to verify presealed data: failed to create verifier: doExec apply message failed: implicit message failed with exit code: 16 and error: message failed with backtrace:
00: f06 (method 2) -- Allowance 2048 below minimum deal size for add verifier f081 (16)

It didn't matter if I changed the NetworkVersion to 16 or left it at 0.

Generating a CAR file for devnet worked before, so that's what I did. For that, Lotus needs to be compiled differently:

GOFLAGS=-tags=2k make lotus-seed

or any of the following:

make debug
make 2k

Debugging

I built Forest with the deleg_cns feature:

cargo build --release --features deleg_cns --bin forest

Then I ran the node with the following command:

RUST_LOG=debug RUST_BACKTRACE=1 ./target/release/forest --encrypt-keystore false --target-peer-count 1 \
        --genesis blockchain/consensus/deleg_cns/genesis-files/genesis.car \
        --config blockchain/consensus/deleg_cns/configs/proposer-config.toml \
        2>&1 | tee debug.log

The logs show that the node can finally process its own blocks:

2022-08-03T14:11:43.220Z INFO  deleg_cns::proposer      > Proposed block bafy2bzacebaegphuiuz3xb3lnuazloebi4kim7agqyk5vxvw4hkaq5olnsgi4 with 0 messages, fn_name=<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h5da7cfd51bda61a3
Downloading headers 1 / 1 [===============================] 100.00 % 21742.91/s  2022-08-03T14:11:43.220Z WARN  forest_libp2p::service   > Failed to send gossipsub message: InsufficientPeers, fn_name=forest_libp2p::service::Libp2pService<DB>::run::{{closure}}::h1a3d7edb53382ff1
 2022-08-03T14:11:43.220Z INFO  chain_sync::tipset_syncer > Validating tipset: EPOCH = 1, fn_name=<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hf83b47905b70b3ed
 2022-08-03T14:11:43.220Z INFO  chain_sync::tipset_syncer > Validating tipset: EPOCH = 2, fn_name=<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hf83b47905b70b3ed
 2022-08-03T14:11:43.222Z INFO  chain::store::chain_store > New heaviest tipset: TipsetKeys { cids: [Cid(bafy2bzacebaegphuiuz3xb3lnuazloebi4kim7agqyk5vxvw4hkaq5olnsgi4)] }, fn_name=<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::ha892825728132b94
 2022-08-03T14:11:43.222Z INFO  chain_sync::tipset_syncer > Successfully synced tipset range: [1, 2], fn_name=<chain_sync::tipset_syncer::TipsetProcessor<DB,C> as core::future::future::Future>::poll::h0ea8dac023eee333
 2022-08-03T14:11:49.554Z INFO  deleg_cns::proposer       > Proposing a block on top 2 in epoch bafy2bzacebaegphuiuz3xb3lnuazloebi4kim7agqyk5vxvw4hkaq5olnsgi4, fn_name=<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h5da7cfd51bda61a3
 2022-08-03T14:11:49.561Z INFO  deleg_cns::proposer       > Proposed block bafy2bzacebhihvs676diohdnlba75q5i7p3grdnq6ac7rcwotkkdupmea5fgu with 0 messages, fn_name=<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h5da7cfd51bda61a3
Downloading headers 1 / 1 [===============================] 100.00 % 19461.31/s  2022-08-03T14:11:49.561Z WARN  forest_libp2p::service    > Failed to send gossipsub message: InsufficientPeers, fn_name=forest_libp2p::service::Libp2pService<DB>::run::{{closure}}::h1a3d7edb53382ff1
Downloading headers 1 / 1 [===============================] 100.00 % 15641.62/s  2022-08-03T14:11:49.561Z INFO  chain_sync::tipset_syncer > Validating tipset: EPOCH = 2, fn_name=<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hf83b47905b70b3ed
 2022-08-03T14:11:49.561Z INFO  chain_sync::tipset_syncer > Validating tipset: EPOCH = 3, fn_name=<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hf83b47905b70b3ed
 2022-08-03T14:11:49.564Z INFO  chain::store::chain_store > New heaviest tipset: TipsetKeys { cids: [Cid(bafy2bzacebhihvs676diohdnlba75q5i7p3grdnq6ac7rcwotkkdupmea5fgu)] }, fn_name=<core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::ha892825728132b94
 2022-08-03T14:11:49.564Z INFO  chain_sync::tipset_syncer > Successfully synced tipset range: [2, 3], fn_name=<chain_sync::tipset_syncer::TipsetProcessor<DB,C> as core::future::future::Future>::poll::h0ea8dac023eee333

Started a second node with the following command and saw it sync:

RUST_BACKTRACE=1 ./target/release/forest --encrypt-keystore false --target-peer-count 1 \
        --genesis blockchain/consensus/deleg_cns/genesis-files/genesis.car \
        --config blockchain/consensus/deleg_cns/configs/delegator-config.toml

Other information and links

This is a follow-up to #1714, where I concluded that the genesis file didn't work. I assumed that was because Lotus put together a state tree with NetworkVersion 0. I was looking at how to replicate the state tree construction in Forest when I noticed that it can actually produce genesis files with any network version; in fact it was version 16 last time, and only mainnet and calibnet would use version 0. That led to a second round of investigation into why Forest didn't accept it, which led to the discovery that it was the devnet CIDs it didn't like. This allowed me to finish fixing and testing Delegated Consensus, so it's not dead code.

Member

@LesnyRumcajs LesnyRumcajs left a comment

Looks really good to me, just a few trifles. Feel free to rebase on top of the current main, there was a dependency issue that triggered audit errors which I temporarily silenced.

@@ -92,6 +95,11 @@ impl DelegatedConsensus {
let state_cid = genesis.state_root();
let work_addr = state_manager.get_miner_work_addr(*state_cid, &self.chosen_one)?;

info!(
"The work address of the chosen proposer {} is {}",
self.chosen_one, work_addr
Member

hopefully, it won't end up killing a bunch of younglings 🤣

Comment on lines 158 to 163
// Reward calculation is needed by the VM to calculate state, which can happen essentially anywhere the `StateManager` is called.
// It is consensus specific, but threading it through the type system would be a nightmare, which is why dynamic dispatch is used.
#[cfg(all(feature = "fil_cns", not(any(feature = "deleg_cns"))))]
let reward_calc = fil_cns::reward_calc();
#[cfg(feature = "deleg_cns")]
let reward_calc = deleg_cns::reward_calc();
Member

I'm completely fine with dynamic dispatch, I bet Forest has bigger bottlenecks than looking up the vtable. What I'm unsure of are the features. Ideally, there would be only one binary to do everything. Forest recently switched to a config-based approach for mainnet/calibnet selection (thanks to @elmattic), rather than a compile-time one as is still the case in Lotus. If the performance hit is small then it should be fine to have it set by configuration to increase usability. What do you think? Or am I missing something?
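
A rough sketch of what a configuration-driven selection could look like, assuming every consensus type is compiled into the binary. `ConsensusKind`, the `name` method, and the `"fil_cns"` / `"deleg_cns"` config strings are hypothetical; only the idea of returning the reward calculation as a trait object mirrors the snippet under review.

```rust
use std::str::FromStr;
use std::sync::Arc;

/// Minimal placeholder for the consensus-specific behaviour (hypothetical method).
trait RewardCalc: Send + Sync {
    fn name(&self) -> &'static str;
}

struct FilRewardCalc;
struct DelegRewardCalc;

impl RewardCalc for FilRewardCalc {
    fn name(&self) -> &'static str {
        "filecoin"
    }
}

impl RewardCalc for DelegRewardCalc {
    fn name(&self) -> &'static str {
        "delegated"
    }
}

/// Hypothetical config value replacing the compile-time `#[cfg(feature = ...)]` switch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ConsensusKind {
    Filecoin,
    Delegated,
}

impl FromStr for ConsensusKind {
    type Err = String;

    /// Parses the (assumed) string used in the config file.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "fil_cns" => Ok(ConsensusKind::Filecoin),
            "deleg_cns" => Ok(ConsensusKind::Delegated),
            other => Err(format!("unknown consensus type: {}", other)),
        }
    }
}

/// The single runtime switch: picks the implementation based on configuration.
fn reward_calc(kind: ConsensusKind) -> Arc<dyn RewardCalc> {
    match kind {
        ConsensusKind::Filecoin => Arc::new(FilRewardCalc),
        ConsensusKind::Delegated => Arc::new(DelegRewardCalc),
    }
}

fn main() {
    let kind: ConsensusKind = "deleg_cns".parse().expect("valid consensus type");
    let calc = reward_calc(kind);
    println!("selected reward calculation: {}", calc.name());
}
```

The trade-off of this shape is that every variant of the enum, with all of its dependencies, has to be built into the same binary.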

Contributor

I agree with @LesnyRumcajs to have the consensus built as a runtime feature.

Collaborator Author

Thanks for advising on this issue. I'm not happy about all the noise of setting up the different consensus types in daemon.rs and thought about moving the stuff into their respective modules to only keep a single switch here. It would be more boilerplate in the consensus modules because they could no longer capture variables from the code, but being explicit would most likely be a good thing in this case.

I agree that dynamic dispatch shouldn't be a problem in this case because rewards are calculated once per block. Validation and block creation are similarly coarse-grained. I thought the tipset weight calculation might benefit from being statically dispatched, as it could be called multiple times, and it doesn't have an instance either, so there's nothing to dispatch on at the moment.

But I'm not sure you would want to compile all kinds of experimental consensus types, which are heavier than this Delegated Consensus, into every Forest binary. They can vary wildly in their dependencies; in the current examples there are ones that talk to Tendermint, while others do their own networking. The vision is that a parent subnet should not necessarily be aware of what consensus the child subnet is running, and I wanted to avoid sleepwalking into hardcoding the possible types in the Forest codebase.

Let me know what you think!

Collaborator Author

I moved consensus initialisation out of daemon.rs, so you can see what it would look like.

Member

@aakoshh Thanks for the explanation, it makes sense now.

{ height = "Turbo", epoch = -1 },
{ height = "Hyperdrive", epoch = -1 },
{ height = "Chocolate", epoch = -1 },
{ height = "OhSnap", epoch = -1 },
Contributor

Is it ok/safe to leave the pre-Skyr heights at -1?

Collaborator Author

I found that the part that looks for the current network version traverses height_infos in reverse until it finds the first suitable entry. At least for that code it doesn't matter if the other entries would also be usable at height 0, as long as the entries are in the correct order in the config file.

I made sure everything has an entry that exists in the calibnet version, but I found it confusing that it has a non-monotonic height schedule, unlike the mainnet variant. Maybe the order of the entries is not as significant as I thought and you can have some activated early, others later.
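
To illustrate the reverse traversal, here is a simplified sketch. It assumes an entry is suitable when its epoch is less than or equal to the current epoch; the `Height` enum, `HeightInfo` struct, `current_height` function, and the Skyr epoch of 0 are hypothetical stand-ins rather than Forest's actual types or the values in this PR's config.

```rust
/// Upgrade names as they appear in the config snippet, plus Skyr.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Height {
    Turbo,
    Hyperdrive,
    Chocolate,
    OhSnap,
    Skyr,
}

/// One entry of the schedule: the upgrade and the epoch at which it activates.
#[derive(Debug, Clone, Copy)]
struct HeightInfo {
    height: Height,
    epoch: i64,
}

/// A devnet-like schedule: older upgrades at -1, Skyr at an assumed epoch 0.
fn devnet_like_schedule() -> Vec<HeightInfo> {
    vec![
        HeightInfo { height: Height::Turbo, epoch: -1 },
        HeightInfo { height: Height::Hyperdrive, epoch: -1 },
        HeightInfo { height: Height::Chocolate, epoch: -1 },
        HeightInfo { height: Height::OhSnap, epoch: -1 },
        HeightInfo { height: Height::Skyr, epoch: 0 },
    ]
}

/// Walks the schedule from the last entry backwards and returns the first
/// upgrade whose activation epoch has already been reached.
fn current_height(height_infos: &[HeightInfo], epoch: i64) -> Option<Height> {
    height_infos
        .iter()
        .rev()
        .find(|info| info.epoch <= epoch)
        .map(|info| info.height)
}

fn main() {
    let schedule = devnet_like_schedule();
    println!("{:?}", current_height(&schedule, 0)); // Some(Skyr)
    println!("{:?}", current_height(&schedule, -1)); // Some(OhSnap)
}
```

With a lookup like this, several entries sharing epoch -1 are harmless: the one listed last wins, which is why the order of the entries in the file matters for this code path.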

let miner_addr = header.miner_address();

// Workaround for the bug where Forest strips the network type from the Address
Contributor

Do we have an issue for this bug in the forest repo?

Collaborator Author

There were some issues related to this:

As I understand it, the problem is that the protocol doesn't serialise the network to bytes, so once we convert to ID the information is lost, and to be consistent with that, the string formatting is also lossy. It probably wasn't meant to appear on its own; I'm not sure whether its use in JSON config and the wallet was part of the design.

I don't think there is an issue to recognise the friction it causes, but it's possible it just wasn't revealed in my quick search.

So the issue is that

let a0 = Address::from_str("t01000").unwrap();
assert_eq!(a0.network, Network::Testnet);
if let &Payload::ID(id) = a0.payload() {
    let a1 = Address::from_id(id);
    assert_eq!(a1.network, Network::Mainnet);
}

This is basically what's happening in StateManager::lookup_id.

As I reported on Slack earlier, this also affects methods like Address::new_bls and Address::new_secp256k1, where the default Network::Mainnet is attached, which wallet_helpers::new_address doesn't try to alter with set_network because it's not part of the WalletImportParams in the first place. As I found out later, this might actually be cancelled out by the fact that the state tree produced by Lotus also has f addresses, even though in the genesis.json file they were t addresses 🤷

I generally found working with the Address confusing: some code turns it from one format into another and clearly expects the Payload to be of a certain variant, but the type system doesn't indicate this. So I opened some PRs in the actor code of a fork we work on for Hierarchical Consensus to improve the situation by adding a TAddress which tells exactly what it can be:

This is similar to my earlier PR about typed CIDs which now live here.

Let me know if you're interested in adding these to Forest. I'm not sure what the best place for them would be, but I wouldn't want to live without them.
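
For illustration, the typed-address idea might look something like the sketch below. The real `TAddress` in the fork may be shaped quite differently; the simplified `Address` and `Payload` types, the `IdKind` marker and the `id` accessor are all hypothetical and only show how a type parameter can record which payload variant a value is known to hold.

```rust
use std::convert::TryFrom;
use std::marker::PhantomData;

/// Simplified payload with just two of the possible variants.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Payload {
    Id(u64),
    Secp256k1([u8; 20]),
}

/// Simplified untyped address: nothing tells the caller which variant it holds.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Address {
    payload: Payload,
}

/// Marker type meaning "this address is known to be an ID address".
#[derive(Debug)]
struct IdKind;

/// An address tagged with the payload kind that was checked at construction.
#[derive(Debug)]
struct TAddress<K> {
    inner: Address,
    _kind: PhantomData<K>,
}

impl TryFrom<Address> for TAddress<IdKind> {
    type Error = String;

    /// The variant is checked once, here, instead of at every use site.
    fn try_from(addr: Address) -> Result<Self, Self::Error> {
        match addr.payload {
            Payload::Id(_) => Ok(TAddress { inner: addr, _kind: PhantomData }),
            _ => Err("expected an ID address".to_string()),
        }
    }
}

impl TAddress<IdKind> {
    /// Infallible accessor: the type guarantees an ID payload.
    fn id(&self) -> u64 {
        match self.inner.payload {
            Payload::Id(id) => id,
            _ => unreachable!("variant was checked at construction"),
        }
    }
}

fn main() {
    let typed = TAddress::<IdKind>::try_from(Address { payload: Payload::Id(1000) })
        .expect("an ID address");
    println!("actor id: {}", typed.id());

    let key_addr = Address { payload: Payload::Secp256k1([0u8; 20]) };
    println!("key address accepted: {}", TAddress::<IdKind>::try_from(key_addr).is_ok());
}
```

A function that takes `TAddress<IdKind>` then documents its expectation in the signature, rather than through a match that callers have to discover by reading the body.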

Member

@LesnyRumcajs LesnyRumcajs left a comment

Really solid PR, as always. Thanks a lot @aakoshh

@aakoshh aakoshh merged commit 8f68fbe into ChainSafe:main Aug 15, 2022