Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for server discovery, TLS and ACL login #143

Merged
merged 12 commits into from
Jun 23, 2023

Conversation

Ganeshrockz
Copy link
Contributor

@Ganeshrockz Ganeshrockz commented Jun 22, 2023

Changes proposed in this PR:

This PR introduces the following changes to consul ECS

  • Adds consul server connection manager discover the consul server agents from the ECS control plane in the new architecture
  • Adds support for ACL login in the Control plane module
  • Adds support for protocol specific TLS (HTTPS & gRPC) configurations
  • Adds the consulServers field to schema.json. The top level field looks like
{
   "consulServers": {
          "hosts": "<addresses>",
          "https": true, // <Defaults to true>, 
          "tls": true, // <Defaults to true>,
          "httpPort":  8501, //<Defaults to 8501 since https is enabled by default>
          "grpcPort": 8503, //<Defaults to 8503 since tls is enabled by default>
          "caCertFile": "",
          "httpsCACertFile": "",
          "skipServerWatch": false,
          "tlsServerName": ""
    }
}
  • Configures the new client which can talk to the server directly.
  • Stores the ACL token returned by the server connection manager in a separate file and writes it to a shared volume. This will be later used by the dataplane to configure itself.
  • Server watch isn't implemented yet. Will be done in subsequent iterations.

How I've tested this PR:

Unit tests for now. I have decided to split the agentless implementation into multiple PRs and have the ganeshrockz/agentless branch as the target branch. Once all the necessary changes go into it, we need to rely on acceptance/integration tests before merging to main.

How I expect reviewers to test this PR:

Checklist:

  • Tests added
  • CHANGELOG entry added (Not needed)

@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 22, 2023 08:11 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 22, 2023 08:14 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 22, 2023 08:52 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 22, 2023 09:11 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 22, 2023 09:39 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 22, 2023 09:58 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 22, 2023 10:21 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 22, 2023 10:43 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz changed the title Added support for server discover, TLS and ACL login Added support for server discovery, TLS and ACL login Jun 22, 2023
@Ganeshrockz Ganeshrockz marked this pull request as ready for review June 22, 2023 10:50
@Ganeshrockz Ganeshrockz requested review from a team and pglass and removed request for a team June 22, 2023 10:50
subcommand/mesh-init/command_test.go Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
config/config.go Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
subcommand/mesh-init/command.go Outdated Show resolved Hide resolved
subcommand/mesh-init/command.go Show resolved Hide resolved
subcommand/mesh-init/command.go Outdated Show resolved Hide resolved
subcommand/mesh-init/command.go Show resolved Hide resolved
testutil/config.go Outdated Show resolved Hide resolved
@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 23, 2023 09:54 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 23, 2023 10:20 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 23, 2023 10:26 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz requested a review from pglass June 23, 2023 10:29
Copy link

@pglass pglass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one comment that is a "must fix" is setting the partition in the login data.

The rest of the comments are minor/non-blocking (feel free to address/ignore how you feel is best).

Otherwise, approved so you can merge when ready. This is great work!

config/config.go Outdated Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
config/config.go Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
config/config_test.go Outdated Show resolved Hide resolved
Comment on lines +15 to +23
"hosts": "",
"https": null,
"httpPort": null,
"grpcPort": null,
"httpsCACertFile": null,
"caCertFile": null,
"tls": null,
"tlsServerName": null,
"skipServerWatch": null
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking / thinking aloud: What do you think about restructuring the config fields with nested objects for the http/grpc settings?

This would look like the following.

{
  "consulServers": {
    "hosts": "",
    "skipServerWatch": null,
    "defaults": {   ## Common/default TLS settings for both http and grpc
      "caCertFile": null,
      "tls": true,
      "tlsServerName": null
    },
    "http": {   ## HTTP-only settings (overrides `defaults` for http)
      "caCertFile": null,
      "port": null,
      "tls": null,
      "tlsServerName": null
    },
    "grpc": {  ## GRPC-specific settings (overrides `defaults` for gRPC)
      "caCertFile": null,
      "port": null,
      "tls": null,
      "tlsServerName": null
    }
  }
}

Although, looking at https://developer.hashicorp.com/consul/docs/k8s/helm#externalservers I think consul-k8s assumes the same TLS settings for https and grpc+tls to the Consul servers? So perhaps we follow their lead (I know I previously said we should have separate settings for those, but maybe it's uncommon to do so) 🤔

Maybe this is overcomplicating. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think consul-k8s assumes the same TLS settings for https and grpc+tls to the Consul servers?

We support configuring individual TLS CA certs for HTTP and gRPC in the current version of Consul ECS. The mesh task's terraform module exposes two variables to control that. So I guess we should continue maintaining both the settings going forward as well. What do you think?

That said, I do like the way you restructured these inputs so that users can go with a default setting for both the protocols. I will merge this PR in (since it has already become huge) and raise a separate one to address this restructuring.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should continue maintaining both the settings going forward as well

Yeah, that makes sense. Though, the existing RPC cert was technically for Consul client agent's internal RPC (port 8300, not gRPC which is 8502/8503). Anyway, we can make breaking changes in a new version, so if you feel that it's easiest/simplest to start out with one set of settings for both grpc and https, I'm good with that.

I do like the shared block of tls settings, because it's probably pretty common that the TLS settings for http and grpc are the same. We also probably want to also support other tls settings, like client certificates, skip tls verification, so I like the nested objects to organize the many tls settings (especially if we have defaults and settings for each specific protocol).

subcommand/mesh-init/command.go Show resolved Hide resolved
testutil/consul.go Show resolved Hide resolved
@pglass
Copy link

pglass commented Jun 23, 2023

I suspect the CI tests are failing with Consul 1.13 because of gRPC TLS settings that were weird in 1.13.

You could try to fix this, or we can drop tests for Consul 1.13. We typically maintain and test with N-2 versions of Consul. So with Consul 1.16 GA that would be 1.16, 1.15, and 1.14.

@Ganeshrockz
Copy link
Contributor Author

Ganeshrockz commented Jun 23, 2023

We typically maintain and test with N-2 versions of Consul.

If that's the case, I can merge this PR first and later on drop the tests once Consul 1.16 becomes GA. Since this isn't getting merged to main I think it should be fine.

@Ganeshrockz Ganeshrockz temporarily deployed to dockerhub/hashicorpdev June 23, 2023 15:18 — with GitHub Actions Inactive
@Ganeshrockz Ganeshrockz merged commit a965b10 into ganeshrockz/agentless Jun 23, 2023
@Ganeshrockz Ganeshrockz deleted the net-4565/server-discovery branch June 23, 2023 15:37
Ganeshrockz added a commit that referenced this pull request Aug 21, 2023
#161)

* Added support for server discovery, TLS and ACL login (#143)

* Added support for server discover, TLS and ACL login

* Fix lint

* Test fix

* Fix config tests

* Fix tests

* Still fixing tests

* Fix tests

* Fix tests

* Address comments

* Added protocol specific TLS configs

* Fix tests

* Address comments

* Ability to register services, gateway and proxy (#144)

* Ability to register services, gateway and proxy

* Fix lint

* Refactor registration logic

* Add node meta

* Refactor functions

* Address comments

* Fix unit tests

* Set additionalProperties to false for service object

* Added ability to generate dataplane configuration  (#145)

* Added ability to generate dataplane configuration

* Remove default configs

* Modify approach to use encoding/json

* Added ability to sync health checks back to Consul

* Fix main

* Fix lint

* Write dataplane config to shared volume

* Fix tests

* Check enterprise CI

* Fix tests

* Fix enterprise tests

* Address comments

* Refactor task meta

* Refactoring tests

* Address nit comments

* Support to gracefully shutdown control-plane (#147)

* Support to gracefully shutdown control-plane

* Fix lint

* Address comments

* Address comments

* Add tests

* Deprecate mesh-init in favour of control-plane (#148)

* Deprecate mesh-init in place of control-plane

* Address comments

* Controller changes  to support new ECS architecture - Part 1 (#150)

* Controller changes for agentless

* Fix tests

* Add node tests

* Fix comments

* Fix tests

* Server watch for agentless (#149)

* Server watch implemented

* Add tests

* Address comments

* Added mocks for server connection manager

* Deprecate health-sync command (#152)

* Deprecated health sync command

* Fix tests

* Fix lint

* Fix lint

* Fix tests

* Controller changes to support new architecture - Part 2 (#153)

* Make controller reconcile services

* Fix tests

* Fix tests

* Fix lint

* Fix tests

* Fix enterprise tests

* Fix tests

* Remove dataplane tag

* Refactor reconcile services

* Address comments

* Fix tests

* Fix lint

* Fix lint

* Fix non enterprise tests

* Fixing enterprise tests

* Still trying to fix enterprise tests :(

* Address comments

* Fix tests

* Fix null ref

* Fix comments

* Address comments

* Rename acl controller

* Refactor TLS inputs in ECS_CONFIG_JSON (#154)

* Refactor server TLS inputs

* Fix tests

* Fix merge issues

* Miscellaneous changes for dataplane architecture (#156)

* Added ability to write CA cert to shared volume

* Fix tests

* Comments

* Address comments

* Fix addresses for service and proxy registrations (#157)

* Added ability to write CA cert to shared volume

* Fix tests

* Comments

* Pass ReadyBindAddress and Port to dataplane

* Address comments

* Fix lint

* Address comments

* Fix lint

* Add destinationPeer to upstream config (#159)

* Fix clusterARN login meta for EC2 (#158)

* Fix clusterARN login meta for EC2

* Address comments

* Added go-discover binary to the ECS image (#160)

* Added go-discover binary to the ECS image

* Bump golang version

* Add changelog and make some minor changes

* Fix tests

* Address comments

* Remove breaking changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants