-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delegated Consensus: working example with a devnet version genesis file #1766
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good to me, just a few trifles. Feel free to rebase on top of the current main
, there was a dependency issue that triggered audit errors which I temporarily silenced.
@@ -92,6 +95,11 @@ impl DelegatedConsensus { | |||
let state_cid = genesis.state_root(); | |||
let work_addr = state_manager.get_miner_work_addr(*state_cid, &self.chosen_one)?; | |||
|
|||
info!( | |||
"The work address of the chosen proposer {} is {}", | |||
self.chosen_one, work_addr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hopefully, it won't end up killing a bunch of younglings 🤣
forest/src/daemon.rs
Outdated
// Reward calculation is needed by the VM to calculate state, which can happen essentially anywhere the `StateManager` is called. | ||
// It is consensus specific, but threading it through the type system would be a nightmare, which is why dynamic dispatch is used. | ||
#[cfg(all(feature = "fil_cns", not(any(feature = "deleg_cns"))))] | ||
let reward_calc = fil_cns::reward_calc(); | ||
#[cfg(feature = "deleg_cns")] | ||
let reward_calc = deleg_cns::reward_calc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm completely fine with dynamic dispatch, I bet Forest has bigger bottlenecks than looking up the vtable. What I'm unsure of are the features. Ideally, there would be only one binary to do everything. Forest switched recently to a config-based approach when it comes to mainnet/calibnet selection (thanks to @elmattic) and not a compile-based one like it is still in Lotus. If the performance hit is small then it should be fine to have it set by configuration to increase usability. What do you think? Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @LesnyRumcajs to have the consensus built as a runtime feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for advising on this issue. I'm not happy about all the noise of setting up the different consensus types in daemon.rs
and thought about moving the stuff into their respective modules to only keep a single switch here. It would be more boilerplate in the consensus modules because they could no longer capture variables from the code, but being explicit would most likely be a good thing in this case.
I agree that dynamic dispatch shouldn't be a problem in this case because rewards are calculated once per block. Validation and block creation are similarly coarse grained. I thought maybe the tipset weight calculation benefits from being statically dispatched as it could be called multiple times, and it doesn't have an instance either, so there's nothing to dispatch on at the moment.
But I'm not sure if you would want to compile all kinds of experimental consensus types which are more heavy than this Delegated Consensus into every Forest binary. They can vary wildly in their dependencies; in the current examples there are ones that talk to Tendermint, others do their own networking. The vision is that a parent subnet should not necessarily be aware of what consensus the child subnet is running, and wanted to avoid sleepwalking into harcoding the possible types in the Forest codebase.
Let me know what you think!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved consensus initialisation out of daemon.rs
, so you see what it would look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aakoshh Thanks for the explanation, it makes sense now.
{ height = "Turbo", epoch = -1 }, | ||
{ height = "Hyperdrive", epoch = -1 }, | ||
{ height = "Chocolate", epoch = -1 }, | ||
{ height = "OhSnap", epoch = -1 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it ok/safe to leave the pre Skyr
heights at -1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the part that looks for the current network version traverses height_infos
in reverse until it finds the first suitable entry. At least for that code it doesn't matter if the others entries would also be usable at height 0, as long as the entries are in correct order in the config file.
I made sure everything has an entry that exists in the calibnet
version, but I found it confusing that it has a non-monotonic height schedule, unlike the mainnet variant. Maybe the order of the entries is not as significant as I thought and you can have some activated early, others later.
let miner_addr = header.miner_address(); | ||
|
||
// Workaround for the bug where Forest strips the network type from the Address |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have an issue for this bug in the forest repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were some issues related to this:
- Include the networkID when converting an address from str #610
- Fix string decode handling network #611
- Allow Address network prefix to be overriden for printing #233
As I understand the problem is that the protocol doesn't serialise the network to bytes, so once we convert to ID
the information is lost, and to be consistent with that, the string formatting is also lossy. It probably wasn't meant to appear on its own, I'm not sure if its use in JSON config and the wallet was part of the design.
I don't think there is an issue to recognise the friction it causes, but it's possible it just wasn't revealed in my quick search.
So the issue is that
let a0 = Address::from_str("t01000").unwrap();
assert_eq!(a0.network, Network::Testnet);
if let &Payload::ID(id) = a0.payload() {
let a1 = Address::from_id(id);
assert_eq!(a1.network, Network::Mainnet);
}
This is basically what's happening in StateManager::lookup_id.
As I reported on Slack earlier, this also affects methods like Address::new_bls
and Address::new_secp256k1
, where the default Network::Mainnet
is attached, which the wallet_helpers::new_address doesn't try to alter with set_network
because it's not part of the WalletImportParams in the first place. As I found out later it might actually be cancelled by the fact that the state tree produced by Lotus also has f
addresses, even though in the genesis.json
file they were t
addresses 🤷
I generally found working with the Address
confusing, seeing some code that turns it from one format to another format and clearly expecting that the Payload
will be of a certain variant, but without the type system indicating this, so I opened some PRs in the actor code of a fork we work on for Hierarchical Consensus to improve the situation by adding a TAddress
which tells exactly what it can be:
- https://github.com/adlrocha/builtin-actors/pull/15
- https://github.com/adlrocha/builtin-actors/pull/16
This is similar to my earlier PR about typed CIDs which now live here.
Let me know if you find interest in adding these to Forest, I'm not sure what would be the best place for them, but I would not live without them.
forest/src/daemon.rs
Outdated
// Reward calculation is needed by the VM to calculate state, which can happen essentially anywhere the `StateManager` is called. | ||
// It is consensus specific, but threading it through the type system would be a nightmare, which is why dynamic dispatch is used. | ||
#[cfg(all(feature = "fil_cns", not(any(feature = "deleg_cns"))))] | ||
let reward_calc = fil_cns::reward_calc(); | ||
#[cfg(feature = "deleg_cns")] | ||
let reward_calc = deleg_cns::reward_calc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @LesnyRumcajs to have the consensus built as a runtime feature.
f178144
to
2d78707
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really solid PR, as always. Thanks a lot @aakoshh
Summary of changes
Changes introduced in this pull request:
devnet
CIDs as acceptable for the builtin actors.devnet
as an acceptable network name, so we can use it in a config file. Previously it had to becalibnet
ormainnet
.t
andf
prefixes in keys.tipset_sample_size
to 0 to have a standalone node that doesn't try to sync with any network, it goes straight into following mode, so that it can start producing blocks (it doesn't have peers to sync with).config.toml
files for a standalone proposer and a follower bootstrapping from it, to try delegated consensus.PeerId
during startup so it's easier to add it to the config of a node that wants to bootstrap from the one we're running.VM
in the form of a trait which is given to it likeArc<dyn RewardCalc>
. Tried to use static typing but the cascading effect is huge: requires adding a generic type to each and every JSON-API method if it's added toStateManager
type, or every method of theStateMangager
if it's added to the ones that create VMs, which again spirals out as even a key-to-id resolution can result in the execution of state changes.Troubles generating a genesis.car file
First I tried to generate a
genesis.car
file using Lotus with the actor CIDs that Forest expects: eithercalibnet
ormainnet
.Here are the steps:
This fails with the following error:
It didn't matter if I changed the
NetworkVersion
to 16 or left it at 0.Generating a CAR file for
devnet
worked before, so that's what I did. For that, Lotus needs to be compiled differently:or any of the following:
Debugging
I built Forest with the
deleg_cns
feature:Then I ran the node with the following command:
The logs show that the node can finally process its own blocks:
Started a second node with the following command and saw it sync:
Other information and links
This is a followup for #1714 where I concluded that the genesis file didn't work. I assumed that was because Lotus put together a state tree with NetworkVersion 0. I was looking at how to replicate the state tree construction in Forest when I noticed that it can actually produce genesis files with any network version, in fact it was version 16 last time; only mainnet and calibnet would use version 0. That lead to a second round of investigation on why Forest didn't accept it, which lead to the discovery that it was the devnet CIDs it didn't like. This allowed me to finish fixing and testing Delegated Consensus, so it's not dead code.