Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate secure GRPC client to DKG and QC clients #1224

Merged
merged 36 commits into from
Sep 1, 2021

Conversation

kc1116
Copy link
Contributor

@kc1116 kc1116 commented Aug 28, 2021

- removes ability for LN and SN to be run without machine account config

- adds new flags --insecure-access-api (allows GRPC connection to access node to be insecure) & --access-node-grpc-public-key (networking public key of the access node being connected to)

- adds func to get GRPC dial options with TLS config to be used with the flow-client
@codecov-commenter
Copy link

codecov-commenter commented Aug 28, 2021

Codecov Report

Merging #1224 (ad89c53) into master (b322b29) will decrease coverage by 0.07%.
The diff coverage is 12.90%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1224      +/-   ##
==========================================
- Coverage   56.16%   56.08%   -0.08%     
==========================================
  Files         484      485       +1     
  Lines       29807    29870      +63     
==========================================
+ Hits        16742    16754      +12     
- Misses      10782    10831      +49     
- Partials     2283     2285       +2     
Flag Coverage Δ
unittests 56.08% <12.90%> (-0.08%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
engine/access/rpc/backend/connection_factory.go 75.51% <ø> (ø)
engine/execution/rpc/engine.go 46.37% <ø> (ø)
module/ingress/ingress.go 13.95% <ø> (ø)
utils/grpcutils/grpc.go 51.85% <ø> (ø)
cmd/collection/main.go 2.85% <3.84%> (-0.48%) ⬇️
cmd/consensus/main.go 3.34% <3.84%> (-0.35%) ⬇️
utils/grpcutils/grpc_client.go 60.00% <60.00%> (ø)
cmd/bootstrap/cmd/machine_account_key.go 61.90% <0.00%> (-6.52%) ⬇️
fvm/script.go 89.74% <0.00%> (-4.86%) ⬇️
fvm/fvm.go 82.05% <0.00%> (-4.44%) ⬇️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b322b29...ad89c53. Read the comment docs.

Comment on lines 88 to 89
accessAddress string
securedAccessAddress string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it is simpler to only have one access address flag, since we will only ever connect to one of them based on the --insecure-access-api flag.

Suggested change
accessAddress string
securedAccessAddress string
accessAPIAddress string

Comment on lines 420 to 423
err = flowClient.Ping(context.Background())
if err != nil {
return nil, fmt.Errorf("failed to ping, flow client may be misconfigured check GRPC options %w", err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JFYI this will be replaced by https://github.com/dapperlabs/flow-go/issues/5792, which will check for connectivity as well as correct configuration.

for i, c := range containers {
fmt.Printf("%d: %s", i+1, c.Identity().String())
if c.Unstaked {
fmt.Printf(" (unstaked)")
}

if c.Role == flow.RoleAccess {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if c.Role == flow.RoleAccess {
if c.Role == flow.RoleAccess && !c.Unstaked {

I don't think we want to use unstaked ANs for this

Comment on lines 411 to 418
fmt.Sprintf("--insecure-access-api=false"),
fmt.Sprintf("--secured-access-address=%s", securedAccessAddress),
fmt.Sprintf("--access-node-grpc-public-key=%s", accessNodeGRPCPubKey),
)

// IMPORTANT: additional flags will contain correct flags for secure GRPC conn
service.Command = append(service.Command, container.AdditionalFlags...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to make use of the AdditionalFlags (which seems like a good idea to me), why do we also need to specify the flag values on line 411-413 as well?

What is in AdditionalFlags which is not on lines 411-413, and can we just put that in AdditionalFlags to avoid needing to pass in the new arguments?

// construct QC contract client
qcContractClient, err := createQCContractClient(node, accessAddress)
qcContractClient, err := createQCContractClient(node, accessAddress, flowClient)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Another reason to not have 2 flags for the access address. Here we are checking the validity w.r.t. the unsecured access address, regardless of whether we will be connecting to it.

@@ -138,6 +143,9 @@ func main() {

// epoch qc contract flags
flags.StringVar(&accessAddress, "access-address", "", "the address of an access node")
flags.StringVar(&securedAccessAddress, "secured-access-address", "", "the address for secured GRPC conn to an access node")
flags.StringVar(&accessApiNodePubKey, "access-node-grpc-public-key", "", "the networking public key of the secured access node being connected to")
flags.BoolVar(&insecureAccessAPI, "insecure-access-api", true, "required if insecure GRPC connection should be used")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should default to false. For the upcoming spork, a node operator should not need to specify this flag at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Integration tests are not configured properly to handle new flags , can we create separate ticket to make secure API default and update integration tests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Sure, sounds good

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 96 to 97
accessAddress string
securedAccessAddress string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here about using one flag

cmd/consensus/main.go Show resolved Hide resolved
cmd/test/main.go Outdated Show resolved Hide resolved
Comment on lines 465 to 466
fmt.Sprintf("%d:%d", AccessAPIPort+i, RPCPort),
fmt.Sprintf("%d:%d", AccessAPIPort+(i+1), SecuredRPCPort),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will overlap ports with >1 AN

Suggested change
fmt.Sprintf("%d:%d", AccessAPIPort+i, RPCPort),
fmt.Sprintf("%d:%d", AccessAPIPort+(i+1), SecuredRPCPort),
fmt.Sprintf("%d:%d", AccessAPIPort+2*i, RPCPort),
fmt.Sprintf("%d:%d", AccessAPIPort+(2*i+1), SecuredRPCPort),

@kc1116 kc1116 requested a review from tarakby August 31, 2021 13:52
Copy link
Member

@jordanschalm jordanschalm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One final suggestion, otherwise this looks good to me.

Where is this ticket at with the latest updates? Have we tested an epoch transition with these changes on localnet?

services[container.ContainerName] = prepareConsensusService(
container,
numConsensus,
"access_1:9001",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This address is essentially a constant, I think it would be better to store it with the other constants and use the value directly in preparareConsensusService.

AccessAPIPort = 3569
MetricsPort = 8080
RPCPort = 9000

Copy link
Member

@jordanschalm jordanschalm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎸

Copy link
Contributor

@huitseeker huitseeker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit worried this may mix two distinct notions:

  • do we connect securely to a TLS-ed GRPC server using a pre-shared key to check authentication?
    That's the server that's introduced in bringing up a GRPC secure sever for the access node #989
    This is about using a capability.

  • do we connect securely (where securely is defined as above) to our "favorite" access node, which is the one node we trust above others for various reasons ?
    That's the new functionality this PR introduces, this is meant to make an identity selection.

I note that the former has a wider range of application than the later: I might want to connect to any, no or every access node that runs the necessary server using GRPC + TLS + pre-shared key. I may also, besides that, restrict the access nodes I rely on ("trust") to any subset of those I connect to securely.

I would welcome:

  • sunsetting insecure encryption for everything but tests (but Auto Cadence Update: Add container mutation tests for inserting functions into dictionaries #1574 is enough for me),
  • encryption flags that do not require specifying a NodeID (but that may require a fallback, for a while, to insecure access for nodes that happen to not run the secure server => open an issue to mark it for later?),
  • and on top of that, an option that clearly states "only pick this access node, it's my favorite" (and imply a secure connection is the only one that can be used)

WDYT?
/cc @vishalchangrani

}
} else {
if secureAccessNodeID == "" {
return nil, fmt.Errorf("invalid flag --secure-access-node-id required")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, fmt.Errorf("invalid flag --secure-access-node-id required")
return nil, fmt.Errorf("invalid flag, --secure-access-node-id required")

otherwise the reader may think the invalid qualifier applies to your requirement

@jordanschalm
Copy link
Member

jordanschalm commented Aug 31, 2021

@huitseeker

encryption flags that do not require specifying a NodeID (but that may require a fallback, for a while, to insecure access for nodes that happen to not run the secure server => open an issue to mark it for later?),

The main issue is that within the protocol state, we know an access node's networking public key and libp2p networking address, but we do not know its GRPC server address (hence the --access-address flag). We could make the assumption that a particular relationship exists between the libp2p networking address and the GRPC server address (for example, it uses the canonical port on the same hostname), but we would need to make sure that assumption is always true. Personally I'm more confident specifying the parameters explicitly.

@huitseeker
Copy link
Contributor

T We could make the assumption that a particular relationship exists between the libp2p networking address and the GRPC server address, but we would need to make sure that assumption is always true

That assumption (same host, port 9001) is always true for our access nodes, and since transport authentication against a pre-shared key (itself determined by convention from libp2p identifiers, btw) is a prerequisite to using the connection in any way, there is not risk of misunderstanding on method or identity.

@kc1116
Copy link
Contributor Author

kc1116 commented Aug 31, 2021

I'm a bit worried this may mix two distinct notions:

  • do we connect securely to a TLS-ed GRPC server using a pre-shared key to check authentication?
    That's the server that's introduced in bringing up a GRPC secure sever for the access node #989
    This is about using a capability.
  • do we connect securely (where securely is defined as above) to our "favorite" access node, which is the one node we trust above others for various reasons ?
    That's the new functionality this PR introduces, this is meant to make an identity selection.
  • secured implies TLS-ed GRPC connection , to the secure GRPC sever AN bring up
  • There is no concept of a "favorite AN", this PR does not introduce that concept. Your only required to provide the node ID of the access node you are already connecting to in order to use secure GRPC . Under the hood the node ID is used to find the networking key from protocol state of that AN to create TLS'ed GRPC conn. Using NodeID allows networking keys to change.

I note that the former has a wider range of application than the later: I might want to connect to any, no or every access node that runs the necessary server using GRPC + TLS + pre-shared key. I may also, besides that, restrict the access nodes I rely on ("trust") to any subset of those I connect to securely.

  • I agree , this is nice to have and is not immediate requirement

I would welcome:

  • sunsetting insecure encryption for everything but tests (but Auto Cadence Update: Add container mutation tests for inserting functions into dictionaries #1574 is enough for me),
  • encryption flags that do not require specifying a NodeID (but that may require a fallback, for a while, to insecure access for nodes that happen to not run the secure server => open an issue to mark it for later?),
  • and on top of that, an option that clearly states "only pick this access node, it's my favorite" (and imply a secure connection is the only one that can be used)

WDYT?
/cc @vishalchangrani

@huitseeker

// create flow client with correct GRPC configuration for QC contract client
var flowClient *client.Client
if insecureAccessAPI {
flowClient, err = common.InsecureFlowClient(accessAddress)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the grpc client is created here and never re-created, what happens if the access node is restarted and the connection is broken?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will investigate this .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this issue grpc/grpc-go#351 it looks like the GRPC client should try to automatically reconnect. Because nothing changed as far as where the client is being created, we should see the same behavior for insecure and secure client connections. This probably needs it's own separate ticket for investigation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah we probably need a factory to be passed in, instead of the actual client since the client will only be used once in an epoch and it is highly probable that the access node gets restarted in-between.
Separate ticket also sounds good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be able to test this by running localnet and restarting the access node before the epoch setup phase starts. I would agree with khalil and would expect that the client would recreate the connection as needed

return nil, fmt.Errorf("could not find identity of secure access node: %s", secureAccessNodeID)
}

flowClient, err = common.SecureFlowClient(accessAddress, identities[0].NetworkPubKey.String()[2:])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you throw in a comment for the [2:]

@huitseeker
Copy link
Contributor

@kc1116 My comments were definitely meant as non-blocking.

Copy link
Contributor

@vishalchangrani vishalchangrani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm (left some minor nit comments)

@jordanschalm jordanschalm changed the title 5793/secured grpc client Integrate secure GRPC client to DKG and QC clients Sep 1, 2021
@jordanschalm
Copy link
Member

bors merge

@bors
Copy link
Contributor

bors bot commented Sep 1, 2021

@bors bors bot merged commit b75db19 into master Sep 1, 2021
@bors bors bot deleted the 5793/secured-grpc-client branch September 1, 2021 01:19
@huitseeker huitseeker mentioned this pull request Sep 1, 2021
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants