Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Running an unstaked Access Node #911

Merged
merged 33 commits into from
Jul 21, 2021
Merged

Running an unstaked Access Node #911

merged 33 commits into from
Jul 21, 2021

Conversation

vishalchangrani
Copy link
Contributor

@vishalchangrani vishalchangrani commented Jun 29, 2021

This is the first set of changes needed to run an Unstaked Access node (an access node whose identity is not part of the identity table). This brings up the AN as a stand alone program or as part of the localnet.

There are two different networks - unstaked network and staked network. An unstaked access node only joins the unstaked network. A staked access node joins both the unstaked and the staked network.

UnstakedAN_(1)

This PR introduces the following changes:

  1. Changes to scaffold.go to allow initializing an unstaked access node a little differently than a regular node via new PreInit functions and exported public functions.
  2. Adding AccessNodeBuilder to initialize and run a staked or an unstaked access node.
  3. Changes to localnet to generate an unstaked access node along with other nodes. The keys for the unstaked AN are generated like any other node but the root snapshot excludes the unstaked node.

Currently, the unstaked Access node does expect the genesis data (root snapshot) AND the node-info.json. It reads the nodeid and the networking key from it (not the staking key).

I could generate the networking key and the node id on the fly as the node starts but I am wondering if there is value in asking the node operator to provide a node id (maybe for auditing traffic coming into a staked access node).

the unstaked access node can be run using the args:

--nodeid=a0e1d65a9a63accda037691321df0de0605e36577ce1eba6683ace6751ea9aea
--bootstrapdir=/Users/vishalchangrani/go/src/github.com/onflow/flow-go/integration/localnet/bootstrap
--datadir=/Users/vishalchangrani/go/src/github.com/onflow/flow-go/integration/localnet/data
--loglevel=DEBUG
--staked=false
--unstaked-bind-addr=localhost:9658
--staked-access-node-id eb5baa26e18f98f90769c844bd2414a465dc6f8e50f33d73985d65b0c03b0aa8

the staked access node can be run using the args:

--nodeid=eb5baa26e18f98f90769c844bd2414a465dc6f8e50f33d73985d65b0c03b0aa8
--bootstrapdir=/Users/vishalchangrani/go/src/github.com/onflow/flow-go/integration/localnet/bootstrap
--datadir=/Users/vishalchangrani/go/src/github.com/onflow/flow-go/integration/localnet/data
--loglevel=DEBUG
--staked=true
--unstaked-bind-addr=localhost:9658
--bind=localhost:8569

To run in IDE specify params here -
Screen Shot 2021-06-29 at 3 19 03 PM

@@ -124,6 +128,15 @@ func main() {
panic(err)
}

fmt.Println("Node bootstrapping data generated...")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to help distinguish between the staked versus the unstaked AN.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. output

Node bootstrapping data generated...
1: execution-9897aa81bbf63ff72dab471ae7f025af81c8227704a7287ce04f6411663bf36b@execution_1:2137=1000
2: collection-71b7873bc7421c7e6c5c269768c49f07e1de7e1e7a886b20f678625c7b04b29a@collection_1:2137=1000
3: consensus-d039bfff32380e4482a5272c35a822d303ac2cd97fc1b5ca2242bff72039c81d@consensus_1:2137=1000
4: verification-e07013fb410df9f9d3a19566d6341bf7a77dd6eeefc3cb1a9412969fdca6cead@verification_1:2137=1000
5: access-4b4381b604a5fcb60e98a485c29adef262e5ab1617b43a1ec3137a2b8563e8b4@access_1:2137=1000
6: access-4f86c7a6a33ff21d4cf3e4612840a8fb5f90788fb2c7ef0c71a4f7d524ddac8b@access_2:2137=1000 (unstaked)

@@ -583,19 +584,43 @@ func BootstrapNetwork(networkConf NetworkConfig, bootstrapDir string) (*flow.Blo
return nil, nil, nil, nil, fmt.Errorf("failed to setup keys: %w", err)
}

// write private key files for each node
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved up to separate out staked vs non-staked nodes before calling runDKG for only the staked nodes.

@codecov-commenter
Copy link

codecov-commenter commented Jun 29, 2021

Codecov Report

Merging #911 (0a677c6) into master (20969da) will decrease coverage by 0.05%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #911      +/-   ##
==========================================
- Coverage   54.85%   54.79%   -0.06%     
==========================================
  Files         284      285       +1     
  Lines       18872    18876       +4     
==========================================
- Hits        10352    10343       -9     
- Misses       7123     7138      +15     
+ Partials     1397     1395       -2     
Flag Coverage Δ
unittests 54.79% <0.00%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
network/p2p/middleware.go 0.00% <0.00%> (ø)
network/p2p/network.go 0.00% <ø> (ø)
network/topology/fixedListTopology.go 0.00% <0.00%> (ø)
...sus/approvals/assignment_collector_statemachine.go 40.38% <0.00%> (-9.62%) ⬇️
cmd/util/ledger/migrations/storage_v4.go 42.16% <0.00%> (+0.60%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 20969da...0a677c6. Read the comment docs.

@@ -114,7 +114,7 @@ func NewMiddleware(log zerolog.Logger,
}
}

func defaultValidators(log zerolog.Logger, flowID flow.Identifier) []network.MessageValidator {
func DefaultValidators(log zerolog.Logger, flowID flow.Identifier) []network.MessageValidator {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making public so that it can be used in scaffold.

cmd/scaffold.go Outdated
@@ -135,6 +136,7 @@ type FlowNodeBuilder struct {
postInitFns []func(*FlowNodeBuilder)
stakingKey crypto.PrivateKey
networkKey crypto.PrivateKey
Unstaked bool
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the flag used to determine if the node should run as staked (Unstaked=false) or unstaked (Unstaked=true). Although it is only used for an access node, it is define here in scaffold because it needs to influence how the node is initialized by the flow node builder.

cmd/access/main.go Outdated Show resolved Hide resolved
cmd/scaffold.go Outdated
Comment on lines 210 to 212
// filter out messages sent by this node itself
validator.NewSenderValidator(fnb.Me.NodeID()),
// but retain all the 1-k messages even if they are not intended for this node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this part a little more? Why do we filter out messages sent by this node itself, and what is the significance of 1-k?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure..currently, when you send a 1-k message (in other words, publish a message on a topic), the network layers seems to also loopback the message to the sender if the sender has also subscribed to the topic. This is not desired and hence a validator was added to filter out such messages which have the Origin field in the message equal to the node ID of the node (loop back messages).

Also, when you send a 1-k message, the sender has to specify a list of TargetIDs which indicate the nodes for which the message is intended. This is generally a subset of nodes subscribed on the topic. e.g. If a collection node sends a tx to collection node 1, 2 and 3, it will be received by ALL collection nodes in the system since they are all subscribed to the send-transaction channel. The target validator makes sure that only the intended targets receive the message and the other nodes just drop it.

Copy link
Contributor

@synzhu synzhu Jul 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my question is, why does the behavior of the staked vs unstaked node need to be different?

Can you explain how exactly the behavior will be different here between the staked and unstaked version?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smnzhu I think what @vishalchangrani is saying is that the normal pattern is to address messages to a TargetID which requires to have the AN maintain current a set of unstaked ANs. By not doing that, we remove the duty to care about that set, but we also make validators that look for a node's own targetID pointless at the unstaked receiver AN.

cmd/access/main.go Outdated Show resolved Hide resolved
if staked {
return
}
if strings.TrimSpace(stakedAccessNodeIDHex) == "" {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yet to add code to actually use the staked access node id

…zation; changing scafold to to seperate out initialition from construction
@vishalchangrani
Copy link
Contributor Author

I may be missing something, but would it be possible to have an interface at the package level of scaffold, and the implementation of the AccessNodeBuilder would embed a FlowNodeBuilder, with the proper indirections / reimplementations on top?

I think that will be possible. Lemme work on that. I didn't realize I would need so many changes on the base class until after I finished it.

@vishalchangrani
Copy link
Contributor Author

vishalchangrani commented Jul 15, 2021

Refactored the code to address this comment.

Now, the structure is as follows:

  1. A new interface NodeBuilder defines all methods needed to bootstrap a Flow node.
  2. Existing FlowNodeBuilder and the new AccessNodeBuilder implement this interface.
  3. A new interface AccessNodeBuilder which extends the NodeBuilder interface and adds Access Node specific functions.
  4. Separate StakedAccessNodeBuilder and UnstakedAccessNodeBuilder (which are composed of AccessNodeBuilder) to take care of bringing up Staked and Unstaked AN resp.
  5. Changes to all existing startup files (main.go) to use the NodeBuilder functions instead of directly using the implementation.

@vishalchangrani
Copy link
Contributor Author

I have a couple questions on concept for the setup of both the unstaked node and that staked access node that may connect to them:

  • since unstaked access nodes don't suffer from leaving the network, how do we make sure the access node periodically trims a connection to them if they are down?
  • could we bootstrap the unstaked access node from a separate fileset, even in our localnet? And could we have the feature that Init() crashes on unstaked nodes if they have a set of participants that includes anything with a staking key but their upstream AN? Could we even not put the upstream AN's staking key in their config, so we're sure they don't get more info than needed? That way we detect if we have unduly given staked network info to unstaked nodes.
  • that configuration file could be read by the staked AN to bootstrap its unstaked network as well.
  • Would a crasher in the unstaked network's libp2p Host bring down the staked access node? if so, how do we track that risk?

On the code more specifically:

  • 👍 on the breakdown of the FlowNode config, Init / PreInit makes sense,
  • with that said, we're opening a very wide door in making baseFlags, enqueueNetworkInit, etc public. Might we leave at least some of them private, hide them behind an interface, have both nodes (staked & unstaked) implement that and run one single public method run through the initialization sequence on the interface?

@huitseeker - To address the issue of not letting the unstaked AN know about the identities of the staked node at all, I have created this new issue The unstaked Access Node should bootstrap without the identity list of the staked network

@vishalchangrani
Copy link
Contributor Author

@jordanschalm @huitseeker @smnzhu - To help make the code review a little easier, I moved all the bootstrap changes to scaffold into its own separate PR here - #983. This new PR is based off of this branch (unstaked_an). I reverted unstaked_an to what it was before those changes.

cmd/access/access_node_builder.go Outdated Show resolved Hide resolved
integration/localnet/Makefile Outdated Show resolved Hide resolved
unstakedMiddleware *p2p.Middleware
}

func FlowAccessNode() *AccessNodeBuilder {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could move this and the existing FlowNodeBuilder to a scaffold package together to avoid all the newly public methods in FlowNodeBuilder and keep the scaffolding logic together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lemme do this in the other PR since that one is primarily for such scaffold related changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, lemme do that as different PR all together once I have 983 merged into this one otherwise it will just become even more difficult to review.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

created this issue to address this comment

// It is composed of the FlowNodeBuilder
type AccessNodeBuilder struct {
*cmd.FlowNodeBuilder
staked bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is a different term than "unstaked" that we can use for this new mode of operation for the Access Node. Being unstaked doesn't really describe the new mode of operation very well IMO, and the notions of stake and weight are already a bit overloaded and frankly confusing in parts in the protocol code (see here for an example).

For example, in the NodeConfig we have a field called Stake, which really means Weight and we have a field called Unstaked which only applies to access nodes, doesn't have much to do with the field called Stake, and determines whether an AN runs in this new external network or as part of the core Flow network.

Some ideas: External AN / network, Proxy AN / network, Adjunct AN / network

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Public network? (versus our current private network)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh - i like the word unstaked network it succinctly describes this network as a network of unstaked actors. Any other name seems to be requiring more explanation since our flow network is public and external.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe in places where the new field / name is near an existing stake/weight reference we can expand the name to differentiate or add some comments to differentiate


// SupportUnstakedNodes returns True if this a staked Access Node which also participates in the unstaked network,
// False otherwise
func (builder *StakedAccessNodeBuilder) supportUnstakedNodes() bool {
Copy link
Contributor

@synzhu synzhu Jul 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should put this function on the parent interface AccessNodeBuilder, and rename to something like ParticipatesInUnstakedNetwork.

This is because the main.go will need to use this function to determine whether to create a splitter engine and relay engine for a staked access node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@vishalchangrani
Copy link
Contributor Author

@jordanschalm @huitseeker @smnzhu - thanks for the diligent review on this PR so far! ❤️
I have merged in the other PR that was reviewed by y'all into this one.
Please can you review and approve this PR as well? I believe I have addressed all the outstanding comments.
thanks 🙇

func (anb *FlowAccessNodeBuilder) parseFlags() {
// supportUnstakedNodes returns True if this a staked Access Node which also participates in the unstaked network,
// False otherwise
func (builder *FlowAccessNodeBuilder) supportUnstakedNodes() bool {
Copy link
Contributor

@synzhu synzhu Jul 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant was to add it to the AccessNodeBuilder interface, not only the FlowAccessNodeBuilder implementation. See this for example.

We would rename the method as ParticipatesInUnstakedNetwork (or some other name which does not imply whether the node is staked or not), and it would always return true for unstaked nodes.

Of course, instead of doing this I could just do a type assertion here, but I think this way is a little bit cleaner.

But if we don't want to add the method to the interface, then IMO we should probably just leave it on the StakedAccessNodeBuilder and do the type assertion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh shoot! I forgot to push the commit with that change. Yes, 100% agree that method should be on the interface.

Copy link
Contributor

@huitseeker huitseeker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superb work!

Copy link
Member

@jordanschalm jordanschalm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 🎸

@vishalchangrani
Copy link
Contributor Author

Nice work 🎸

thanks for all the amazing feedback..really appreciate it!

@vishalchangrani vishalchangrani merged commit 5c9a7c9 into master Jul 21, 2021
@vishalchangrani vishalchangrani deleted the vishal/unstaked_an branch July 21, 2021 22:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants