Admin interface for node operators #1222

Merged
merged 29 commits into from Sep 21, 2021
synzhu (Contributor) commented Aug 28, 2021

closes https://github.com/dapperlabs/flow-go/issues/5818

TODO

  • Write tests
  • Document the protobuf file
  • Provide a REST interface using gRPC-Gateway

codecov-commenter commented Aug 28, 2021

Codecov Report

Merging #1222 (4998a6a) into master (a783037) will increase coverage by 0.10%.
The diff coverage is 76.51%.

@@            Coverage Diff             @@
##           master    #1222      +/-   ##
==========================================
+ Coverage   54.70%   54.81%   +0.10%     
==========================================
  Files         502      504       +2     
  Lines       31769    31918     +149     
==========================================
+ Hits        17380    17496     +116     
- Misses      12026    12048      +22     
- Partials     2363     2374      +11     
Flag        Coverage Δ
unittests   54.81% <76.51%> (+0.10%) ⬆️

Impacted Files                                          Coverage Δ
admin/server.go                                         50.00% <50.00%> (ø)
admin/command_runner.go                                 79.69% <79.69%> (ø)
engine/collection/synchronization/engine.go             62.90% <0.00%> (ø)
...ngine/common/synchronization/finalized_snapshot.go   72.91% <0.00%> (+4.16%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a783037...4998a6a.

synzhu marked this pull request as ready for review September 8, 2021 00:28

huitseeker (Contributor) left a comment

Thanks for getting started on the admin interface! Besides the points I made inline:

  1. Authentication:
    I would note we already have tools in utils/grpcutils/grpc.go for server-side authentication: creating a gRPC server with a self-signed TLS certificate and a client that checks against the corresponding (pre-shared) PubKey. It would only require minor changes for the server to check the client against a certificate that the gRPC server would have been started with. Not that this is enough - I'd expect this to eventually require token auth - but since it's there ...

  2. Security:
    In order to augment the predictability of this interface - incidentally limiting opportunities for mischief: would it be possible to have RegisterHandler|RegisterValidator be part of the API of a CommandRunnerBootstrapper, rather than part of the runtime?

@@ -722,6 +743,10 @@ func (fnb *FlowNodeBuilder) Initialize() NodeBuilder {
fnb.RegisterBadgerMetrics()
}

if fnb.adminAddr != NotSet {
Contributor

What about the case where adminAddr is NotSet but adminHttpAddr is specified?

Contributor Author

In this case we would just ignore the adminHttpAddr. I can add validation for this although I don't think it's really that necessary?

The adminHttpAddr only makes sense if adminAddr is set. We cannot have an HTTP server if the underlying gRPC service does not even exist.
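
If such validation were added, it could be a small startup check. The following is a minimal sketch only: the value of the NotSet sentinel is not shown in this thread, so "N/A" is an assumption, and validateAdminAddrs is a hypothetical helper name rather than anything in the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// NotSet mirrors the sentinel used in the diff; the value "N/A" is an
// assumption made for this sketch.
const NotSet = "N/A"

var errHTTPWithoutGRPC = errors.New("admin HTTP address set without an admin gRPC address")

// validateAdminAddrs (hypothetical helper) rejects the one inconsistent
// combination: an HTTP gateway address with no gRPC service behind it.
func validateAdminAddrs(adminAddr, adminHttpAddr string) error {
	if adminAddr == NotSet && adminHttpAddr != NotSet {
		return errHTTPWithoutGRPC
	}
	return nil
}

func main() {
	fmt.Println(validateAdminAddrs(NotSet, "localhost:8080"))
	fmt.Println(validateAdminAddrs("localhost:9002", "localhost:8080"))
}
```

Failing fast here, instead of silently ignoring adminHttpAddr, would make the misconfiguration visible to the operator at boot.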

Contributor

First, I think you can serve gRPC over a Unix domain socket rather than a TCP socket (and in this context, this might actually not even be such a bad idea).

But more to my original point, you can interpret the presence of an adminHttpAddr alone as an authorization to bind to a random port for the gRPC endpoint, especially if you document that it is the contract.

Contributor Author

Sure, I guess we can do that.

Contributor Author

Actually on second thought, we cannot just choose a random port because we also need to know which interface to bind to.

As I mentioned in the previous comment, I'm not really a fan of limiting the interface to localhost only.

Contributor

I think we can have a safe default, and that an address parameter that specifies, e.g. just a port, can be interpreted as a default binding to localhost.
With that said, we distribute this through Docker, which won't expose an unexported port, so at this stage it's a nit.
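
The "port only means localhost" interpretation could be sketched as below. normalizeAdminAddr is a hypothetical helper, not part of the PR; it accepts a bare port ("9002"), a port with an empty host (":9002"), or a full host:port, and fills in localhost only when no host is given.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// normalizeAdminAddr (hypothetical helper) applies a safe default: when the
// operator supplies only a port, bind to localhost. An explicit host is kept.
func normalizeAdminAddr(addr string) (string, error) {
	// A bare port such as "9002".
	if _, err := strconv.ParseUint(addr, 10, 16); err == nil {
		return net.JoinHostPort("localhost", addr), nil
	}
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return "", err
	}
	if host == "" {
		host = "localhost"
	}
	return net.JoinHostPort(host, port), nil
}

func main() {
	for _, in := range []string{"9002", ":9002", "0.0.0.0:9002"} {
		out, err := normalizeAdminAddr(in)
		fmt.Println(in, "->", out, err)
	}
}
```

An operator who wants remote access can still opt in by naming the interface explicitly, which keeps both positions in this thread satisfiable.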

Comment on lines 96 to 97
adminAddr string
adminHttpAddr string
Contributor

How about making these booleans, binding only to localhost? This way we're less vulnerable to inadvertent firewall openings.

Contributor Author

But we need some way to specify the port?

Contributor

That's fine, let's specify just the port then.

Contributor Author

I feel like there are very valid reasons why one might want to bind to an interface other than localhost (for example, if they want the admin service to be accessible remotely).

synzhu (Contributor Author) commented Sep 13, 2021

Thanks for getting started on the admin interface! Besides the points I made inline:

  1. Authentication:
    I would note we already have tools in utils/grpcutils/grpc.go for server-side authentication: creating a gRPC server with a self-signed TLS certificate and a client that checks against the corresponding (pre-shared) PubKey. It would only require minor changes for the server to check the client against a certificate that the gRPC server would have been started with. Not that this is enough - I'd expect this to eventually require token auth - but since it's there ...
  2. Security:
    In order to augment the predictability of this interface - incidentally limiting opportunities for mischief: would it be possible to have RegisterHandler|RegisterValidator be part of the API of a CommandRunnerBootstrapper, rather than part of the runtime?

@huitseeker

  1. In my mind, I don't think we really need authentication at the gRPC server level. I think that securing the admin interface should be something that is left up to the node operator. In particular, they should be responsible for ensuring that the address the admin service is configured to run on (via command line arguments) has proper firewall rules set up to restrict incoming traffic on the specified interface / port. For example, one could set up a firewall rule to only allow incoming connections to the admin service from a specified bastion host. Then, one could set up SSH rules on the bastion host so that only certain users can access it.
  2. Not entirely sure what you mean here, can you add more details?

huitseeker (Contributor) commented Sep 14, 2021

In my mind, I don't think we really need authentication at the gRPC server level. I think that securing the admin interface should be something that is left up to the node operator.

In theory, I agree with you. In practice, assuming our app is always deployed in a context where a vigilant and wise sysadmin has the time to secure it well is a non-starter, unfortunately. Moreover, the code necessary to perform this "give credential at bootup, check it later when receiving commands" flow is, as I mentioned, quite close by.

I mean that in the current approach of the admin interface API it would be easy for a distant code component to dynamically add (or worse, remove) a handler / validator to the admin interface, possibly under certain conditions. That makes it more complex to reason about whether the admin interface is exploitable, because it opens the possibility that an adversary would try to manipulate those conditions to activate or deactivate the handlers / validators it wishes. On the other hand, if handlers and validators can only be mutated at the start of the node, and not later, that reasoning is much simpler (hence the builder pattern).
Moreover, if we end up really needing dynamic handlers / validators, nothing will prevent us from re-adding them at that point.

synzhu (Contributor Author) commented Sep 14, 2021

In my mind, I don't think we really need authentication at the gRPC server level. I think that securing the admin interface should be something that is left up to the node operator.

In theory, I agree with you. In practice, assuming our app is always deployed in a context where a vigilant and wise sysadmin has the time to secure it well is a non-starter, unfortunately. Moreover, the code necessary to perform this "give credential at bootup, check it later when receiving commands" flow is, as I mentioned, quite close by.

I mean that in the current approach of the admin interface API it would be easy for a distant code component to dynamically add (or worse, remove) a handler / validator to the admin interface, possibly under certain conditions. That makes it more complex to reason about whether the admin interface is exploitable, because it opens the possibility that an adversary would try to manipulate those conditions to activate or deactivate the handlers / validators it wishes. On the other hand, if handlers and validators can only be mutated at the start of the node, and not later, that reasoning is much simpler (hence the builder pattern).
Moreover, if we end up really needing dynamic handlers / validators, nothing will prevent us from re-adding them at that point.

@huitseeker Okay, I agree with 2 and will make that change.

I'll look into 1. I will have to enable mutual authentication right?

(Reference for self: grpc/grpc-go#403)

huitseeker (Contributor) commented

I'll look into 1. I will have to enable mutual authentication right?

Essentially, yes. @vishalchangrani is quite familiar with the feature and can probably help.

huitseeker (Contributor) left a comment

Very nice work! Thanks a lot!

}

if handler := r.getHandler(command.command); handler != nil {
// TODO: we can probably merge the command context with the worker context
Contributor

That's an excellent idea, which is probably worth tracking in an issue, and has links with #1275 (or at least the context-derived variant thereof I suggest in #1308).

Comment on lines +139 to +141
fnb.flags.StringVar(&fnb.BaseConfig.adminCert, "admin-cert", defaultConfig.adminCert, "admin cert file (for TLS)")
fnb.flags.StringVar(&fnb.BaseConfig.adminKey, "admin-key", defaultConfig.adminKey, "admin key file (for TLS)")
fnb.flags.StringVar(&fnb.BaseConfig.adminClientCAs, "admin-client-certs", defaultConfig.adminClientCAs, "admin client certs (for mutual TLS)")
Contributor

We can probably open an issue for the convenience of auto-generating this certificate pair (the generation would happen at boot, for a later use).
Auto-generation may fit the needs of e.g. a partner running a single node, whereas we'd be more interested in passing our own cert, to manage a group of nodes with the same creds.

vishalchangrani (Contributor) left a comment

lgtm

peterargue (Contributor) left a comment

LGTM. great work

synzhu (Contributor Author) commented Sep 21, 2021

bors merge

bors bot added a commit that referenced this pull request Sep 21, 2021
1222: Admin interface for node operators r=smnzhu a=smnzhu

closes https://github.com/dapperlabs/flow-go/issues/5818

### TODO
- [x] Write tests
- [x] Document the protobuf file
- [x] Provide a rest interface using grpc Gateway

Co-authored-by: Simon Zhu <simon.zsiyan@gmail.com>
bors bot (Contributor) commented Sep 21, 2021

Build failed:

synzhu (Contributor Author) commented Sep 21, 2021

bors merge

bors bot added a commit that referenced this pull request Sep 21, 2021
1222: Admin interface for node operators r=smnzhu a=smnzhu

closes https://github.com/dapperlabs/flow-go/issues/5818

### TODO
- [x] Write tests
- [x] Document the protobuf file
- [x] Provide a rest interface using grpc Gateway

Co-authored-by: Simon Zhu <simon.zsiyan@gmail.com>
bors bot (Contributor) commented Sep 21, 2021

Build failed:

synzhu (Contributor Author) commented Sep 21, 2021

bors retry

bors bot (Contributor) commented Sep 21, 2021

bors bot merged commit b4e29d9 into master Sep 21, 2021
bors bot deleted the smnzhu/access-node-controls branch September 21, 2021 06:14
5 participants