Integrate secure GRPC client to DKG and QC clients #1224
- removes ability for LN and SN to be run without machine account config
- adds new flags --insecure-access-api (allows GRPC connection to access node to be insecure) & --access-node-grpc-public-key (networking public key of the access node being connected to)
- adds func to get GRPC dial options with TLS config to be used with the flow-client
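The description mentions a func that builds GRPC dial options with a TLS config. The heart of such a config is checking the server's key against the expected access node key. Below is a minimal, stdlib-only sketch of that check, not the helper added in this PR: `pinnedTLSConfig` and its `expectedSPKI` parameter are hypothetical names, and the real code may derive and compare the key differently.

```go
package main

import (
	"bytes"
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// pinnedTLSConfig returns a client TLS config that accepts a server only if
// the public key in its leaf certificate equals expectedSPKI (the DER-encoded
// SubjectPublicKeyInfo). Hypothetical helper, not the flow-go implementation.
func pinnedTLSConfig(expectedSPKI []byte) *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS13,
		// CA chain validation is skipped on purpose: the pinned-key check
		// below is the only authentication step in this sketch.
		InsecureSkipVerify: true,
		VerifyPeerCertificate: func(rawCerts [][]byte, _ [][]*x509.Certificate) error {
			if len(rawCerts) == 0 {
				return errors.New("server presented no certificate")
			}
			leaf, err := x509.ParseCertificate(rawCerts[0])
			if err != nil {
				return fmt.Errorf("could not parse server certificate: %w", err)
			}
			if !bytes.Equal(leaf.RawSubjectPublicKeyInfo, expectedSPKI) {
				return errors.New("server public key does not match the pinned key")
			}
			return nil
		},
	}
}
```

With grpc-go, a config like this would typically be wrapped with `credentials.NewTLS` and passed to `grpc.WithTransportCredentials` to obtain a dial option.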
Codecov Report
@@ Coverage Diff @@
## master #1224 +/- ##
==========================================
- Coverage 56.16% 56.08% -0.08%
==========================================
Files 484 485 +1
Lines 29807 29870 +63
==========================================
+ Hits 16742 16754 +12
- Misses 10782 10831 +49
- Partials 2283 2285 +2
- adds additional flags for secure GRPC conf during networking bootstrapping
- updates integration tests to now use secured GRPC conn
- access node needed for secure GRPC conn
cmd/collection/main.go
Outdated
accessAddress string
securedAccessAddress string
I feel like it is simpler to only have one access address flag, since we will only ever connect to one of them based on the --insecure-access-api flag.
- accessAddress string
- securedAccessAddress string
+ accessAPIAddress string
cmd/collection/main.go
Outdated
err = flowClient.Ping(context.Background())
if err != nil {
	return nil, fmt.Errorf("failed to ping, flow client may be misconfigured check GRPC options %w", err)
}
JFYI this will be replaced by https://github.com/dapperlabs/flow-go/issues/5792, which will check for connectivity as well as correct configuration.
integration/localnet/bootstrap.go
Outdated
for i, c := range containers {
	fmt.Printf("%d: %s", i+1, c.Identity().String())
	if c.Unstaked {
		fmt.Printf(" (unstaked)")
	}

	if c.Role == flow.RoleAccess {
- if c.Role == flow.RoleAccess {
+ if c.Role == flow.RoleAccess && !c.Unstaked {
I don't think we want to use unstaked ANs for this.
integration/localnet/bootstrap.go
Outdated
	fmt.Sprintf("--insecure-access-api=false"),
	fmt.Sprintf("--secured-access-address=%s", securedAccessAddress),
	fmt.Sprintf("--access-node-grpc-public-key=%s", accessNodeGRPCPubKey),
)

// IMPORTANT: additional flags will contain correct flags for secure GRPC conn
service.Command = append(service.Command, container.AdditionalFlags...)
If we are going to make use of the AdditionalFlags (which seems like a good idea to me), why do we also need to specify the flag values on lines 411-413? What is in AdditionalFlags which is not on lines 411-413, and can we just put that in AdditionalFlags to avoid needing to pass in the new arguments?
// construct QC contract client
- qcContractClient, err := createQCContractClient(node, accessAddress)
+ qcContractClient, err := createQCContractClient(node, accessAddress, flowClient)
@@ -138,6 +143,9 @@ func main() {

  // epoch qc contract flags
  flags.StringVar(&accessAddress, "access-address", "", "the address of an access node")
+ flags.StringVar(&securedAccessAddress, "secured-access-address", "", "the address for secured GRPC conn to an access node")
+ flags.StringVar(&accessApiNodePubKey, "access-node-grpc-public-key", "", "the networking public key of the secured access node being connected to")
+ flags.BoolVar(&insecureAccessAPI, "insecure-access-api", true, "required if insecure GRPC connection should be used")
This should default to false. For the upcoming spork, a node operator should not need to specify this flag at all.
Integration tests are not configured properly to handle the new flags. Can we create a separate ticket to make the secure API the default and update the integration tests?
👍 Sure, sounds good
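For reference, the end state discussed in this thread (a single address flag, with --insecure-access-api defaulting to false) might look roughly like the sketch below. It uses the standard library `flag` package rather than the node's own flag set, and `accessConfig` and `parseAccessFlags` are hypothetical names introduced only for illustration. Reusing `access-address` as the single flag name is also an assumption.

```go
package main

import (
	"errors"
	"flag"
)

// accessConfig holds the flag values discussed in the review (hypothetical type).
type accessConfig struct {
	accessAddress     string
	accessNodePubKey  string
	insecureAccessAPI bool
}

// parseAccessFlags parses args and enforces that the secure path, which is
// the default, has the public key it needs.
func parseAccessFlags(args []string) (*accessConfig, error) {
	var cfg accessConfig
	fs := flag.NewFlagSet("node", flag.ContinueOnError)
	fs.StringVar(&cfg.accessAddress, "access-address", "", "the address of an access node")
	fs.StringVar(&cfg.accessNodePubKey, "access-node-grpc-public-key", "", "the networking public key of the access node")
	// Defaults to false, so operators get a secure connection unless they opt out.
	fs.BoolVar(&cfg.insecureAccessAPI, "insecure-access-api", false, "use an insecure GRPC connection to the access node")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if cfg.accessAddress == "" {
		return nil, errors.New("invalid flag, --access-address required")
	}
	if !cfg.insecureAccessAPI && cfg.accessNodePubKey == "" {
		return nil, errors.New("invalid flag, --access-node-grpc-public-key required for a secure connection")
	}
	return &cfg, nil
}
```

With this shape, an operator who passes only the address and the key gets a secure connection without ever mentioning --insecure-access-api.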
cmd/consensus/main.go
Outdated
accessAddress string
securedAccessAddress string
Same comment here about using one flag.
integration/localnet/bootstrap.go
Outdated
fmt.Sprintf("%d:%d", AccessAPIPort+i, RPCPort),
fmt.Sprintf("%d:%d", AccessAPIPort+(i+1), SecuredRPCPort),
I think this will overlap ports with >1 AN
- fmt.Sprintf("%d:%d", AccessAPIPort+i, RPCPort),
- fmt.Sprintf("%d:%d", AccessAPIPort+(i+1), SecuredRPCPort),
+ fmt.Sprintf("%d:%d", AccessAPIPort+2*i, RPCPort),
+ fmt.Sprintf("%d:%d", AccessAPIPort+(2*i+1), SecuredRPCPort),
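To see why the original mapping collides, here is a small worked sketch of the arithmetic. `hostPorts` and `firstCollision` are hypothetical helpers written for this illustration, and the `AccessAPIPort` value is the one from the localnet constants quoted elsewhere in this conversation.

```go
package main

// AccessAPIPort is the base host port, as in integration/localnet/bootstrap.go.
const AccessAPIPort = 3569

// hostPorts returns the host ports published for access node i: one for the
// insecure RPC server and one for the secured one. stride is 1 for the
// original code (base+i, base+i+1) and 2 for the suggested fix.
func hostPorts(i, stride int) (insecure, secure int) {
	return AccessAPIPort + stride*i, AccessAPIPort + stride*i + 1
}

// firstCollision returns the first host port that is assigned twice across
// n access nodes, or 0 if every port is unique.
func firstCollision(n, stride int) int {
	seen := make(map[int]bool)
	for i := 0; i < n; i++ {
		insecure, secure := hostPorts(i, stride)
		for _, p := range []int{insecure, secure} {
			if seen[p] {
				return p
			}
			seen[p] = true
		}
	}
	return 0
}
```

With a stride of 1, node 0 publishes 3569 and 3570 while node 1 publishes 3570 and 3571, so 3570 is claimed twice. With a stride of 2, node 1 moves to 3571 and 3572 and the overlap disappears.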
- now LN and SN nodes depend on the AN
- only use staked AN for secured GRPC connection
- fix port assignment
…into 5793/secured-grpc-client
- --secure-access-node-id will be used to look up networking key of node from state
Co-authored-by: Jordan Schalm <jordan@dapperlabs.com>
One final suggestion, otherwise this looks good to me.
Where is this ticket at with the latest updates? Have we tested an epoch transition with these changes on localnet?
services[container.ContainerName] = prepareConsensusService(
	container,
	numConsensus,
	"access_1:9001",
This address is essentially a constant. I think it would be better to store it with the other constants and use the value directly in prepareConsensusService.
flow-go/integration/localnet/bootstrap.go
Lines 38 to 40 in 1ccefc0
AccessAPIPort = 3569
MetricsPort = 8080
RPCPort = 9000
🎸
I'm a bit worried this may mix two distinct notions:
- Do we connect securely to a TLS-ed GRPC server, using a pre-shared key to check authentication?
  That's the server introduced in bringing up a GRPC secure sever for the access node #989.
  This is about using a capability.
- Do we connect securely (where securely is defined as above) to our "favorite" access node, which is the one node we trust above others for various reasons?
  That's the new functionality this PR introduces; it is meant to make an identity selection.
I note that the former has a wider range of application than the latter: I might want to connect to any, no or every access node that runs the necessary server using GRPC + TLS + pre-shared key. I may also, besides that, restrict the access nodes I rely on ("trust") to any subset of those I connect to securely.
I would welcome:
- sunsetting insecure encryption for everything but tests (but Auto Cadence Update: Add container mutation tests for inserting functions into dictionaries #1574 is enough for me),
- encryption flags that do not require specifying a NodeID (but that may require a fallback, for a while, to insecure access for nodes that happen to not run the secure server => open an issue to mark it for later?),
- and on top of that, an option that clearly states "only pick this access node, it's my favorite" (and implies that a secure connection is the only one that can be used)
WDYT?
/cc @vishalchangrani
	}
} else {
	if secureAccessNodeID == "" {
		return nil, fmt.Errorf("invalid flag --secure-access-node-id required")
- return nil, fmt.Errorf("invalid flag --secure-access-node-id required")
+ return nil, fmt.Errorf("invalid flag, --secure-access-node-id required")
Otherwise the reader may think the "invalid" qualifier applies to your requirement.
The main issue is that within the protocol state, we know an access node's networking public key and libp2p networking address, but we do not know its GRPC server address (hence the |
That assumption (same host, port 9001) is always true for our access nodes, and since transport authentication against a pre-shared key (itself determined by convention from libp2p identifiers, btw) is a prerequisite to using the connection in any way, there is no risk of misunderstanding on method or identity.
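The "same host, port 9001" convention described here can be sketched in a few lines. This is an illustration only: `secureGRPCAddress` and `securedGRPCPort` are hypothetical names, and the input is assumed to be a plain `host:port` networking address taken from the protocol state.

```go
package main

import (
	"fmt"
	"net"
)

// securedGRPCPort is the port the secure GRPC server is assumed to listen on,
// following the convention stated in this thread.
const securedGRPCPort = "9001"

// secureGRPCAddress keeps the host of a node's networking address and swaps
// the port for the conventional secure GRPC port.
func secureGRPCAddress(networkAddress string) (string, error) {
	host, _, err := net.SplitHostPort(networkAddress)
	if err != nil {
		return "", fmt.Errorf("invalid networking address %q: %w", networkAddress, err)
	}
	return net.JoinHostPort(host, securedGRPCPort), nil
}
```

Because the connection is only usable after the server authenticates against the expected key, a wrong guess about the address would fail at the handshake rather than silently reach the wrong node.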
// create flow client with correct GRPC configuration for QC contract client
var flowClient *client.Client
if insecureAccessAPI {
	flowClient, err = common.InsecureFlowClient(accessAddress)
if the grpc client is created here and never re-created, what happens if the access node is restarted and the connection is broken?
I will investigate this.
From this issue grpc/grpc-go#351 it looks like the GRPC client should try to automatically reconnect. Because nothing changed as far as where the client is being created, we should see the same behavior for insecure and secure client connections. This probably needs its own separate ticket for investigation.
yeah we probably need a factory to be passed in, instead of the actual client since the client will only be used once in an epoch and it is highly probable that the access node gets restarted in-between.
Separate ticket also sounds good.
Should be able to test this by running localnet and restarting the access node before the epoch setup phase starts. I would agree with khalil and would expect that the client would recreate the connection as needed
	return nil, fmt.Errorf("could not find identity of secure access node: %s", secureAccessNodeID)
}

flowClient, err = common.SecureFlowClient(accessAddress, identities[0].NetworkPubKey.String()[2:])
Can you throw in a comment for the [2:]?
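One way to make the intent self-documenting, assuming the `[2:]` is there to drop a leading `0x` from the key's string form (the thread does not say so explicitly), is a small named helper. `trimHexPrefix` is a hypothetical name used only for this sketch.

```go
package main

import "strings"

// trimHexPrefix removes a leading "0x" if present and leaves other input
// untouched, unlike s[2:], which always drops the first two bytes.
func trimHexPrefix(s string) string {
	return strings.TrimPrefix(s, "0x")
}
```

Unlike a bare slice expression, this does not corrupt the key or panic if the string form ever stops carrying the prefix.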
…into 5793/secured-grpc-client
@kc1116 My comments were definitely meant as non-blocking.
lgtm (left some minor nit comments)
bors merge
This PR integrates secure access client into collection/consensus node