add root_cert_ttl option for consul connect, vault ca providers #11428

Merged: acpana merged 5 commits into main from ffmmm/f-10762 on Nov 2, 2021

Contributor

@acpana acpana commented Oct 26, 2021

Overview

This change adds a root_cert_ttl option to the CommonCA config. The value controls the TTL (time to live) of root certs issued by supported CA providers. It's mostly plumbing.

PR Notes/ Callouts

  • At present the only providers supported are consul connect and vault.
  • Let's focus on the functionality (going QSF -- quality, security, feature), before delving into all the refactoring opportunities
  • No corresponding "cli flag" option added as it's not a pattern for our current ca config

Config checklist

I used the config file checklist here. Click below to expand on a "filled in" version of it.

Click to expand!

Legend

  • n/a -- not applicable
  • ? -- I believe I may have fully completed it

Adding a Simple Config Field for Client Agents

  • Add the field to the Config struct (or an appropriate sub-struct) in
    agent/config/config.go.
  • Add the field to the actual RuntimeConfig struct in
    agent/config/runtime.go.
  • Add an appropriate parser/setter in agent/config/builder.go to
    translate.
  • Add the new field with a random value to both the JSON and HCL files in
    agent/config/testdata/full-config.*, which should cause the test to fail.
    Then update the expected value in TestLoad_FullConfig in
    agent/config/runtime_test.go to make the test pass again.
  • Run go test -run TestRuntimeConfig_Sanitize ./agent/config -update to update
    the expected value for TestRuntimeConfig_Sanitize. Look at git diff to
    make sure the value changed as you expect.
  • [?] If your new config field needed some validation as it's only valid in
    some cases or with some values (often true).
    • [?] Add validation to Validate in agent/config/builder.go.
    • [?] Add a test case to the table test TestLoad_IntegrationWithFlags in
      agent/config/runtime_test.go.
  • If your new config field needs a non-zero-value default.
    • Add that to DefaultSource in agent/config/defaults.go.
    • Add a test case to the table test TestLoad_IntegrationWithFlags in
      agent/config/runtime_test.go.
  • [?] If your config should take effect on a reload/HUP.
    • Add necessary code to trigger a safe (locked or atomic) update to
      any state the feature needs changing. This needs to be added to one or
      more of the following places:
      • ReloadConfig in agent/agent.go if it needs to affect the local
        client state or another client agent component.
      • ReloadConfig in agent/consul/client.go if it needs to affect
        state for client agent's RPC client.
    • Add a test to agent/agent_test.go similar to others with prefix
      TestAgent_reloadConfig*.
  • [?] Add documentation to website/content/docs/agent/options.mdx.

Adding a Simple Config Field for Servers

  • [?] Do all of the steps in Adding a Simple Config Field for Client Agents.
  • Add the new field to Config struct in agent/consul/config.go
  • [?] Add code to set the values from the RuntimeConfig in the confusingly
    named consulConfig method in agent/agent.go
  • [n/a] If needed, add a test to agent_test.go if there is some non-trivial
    behavior in the code you added in the previous step. We tend not to test
    simple assignments from one to the other since these are typically caught by
    higher-level tests of the actual functionality that matters but some examples
    can be found prefixed with TestAgent_consulConfig*
  • [n/a] If your config should take effect on a reload/HUP
    • [n/a] Add necessary code to ReloadConfig in agent/consul/server.go this
      needs to be adequately synchronized with any readers of the state being
      updated.
      • [n/a] Add a new test or a new assertion to TestServer_ReloadConfig

Issues related

TODO:

Steps to expand!
  1. make linux
  2. Use the binary with https://github.com/dhiaayachi/consul-local-cluster
  3. Add a RootCertTTL value to the vault-provider.json.template file locally to something like:
{
  "Provider": "vault",
  "Config": {
      "LeafCertTTL": "72h",
      "Address": "http://myvault:8200",
      "Token": {{VAULT_TOKEN}},
      "RootPKIPath": "connect-root",
      "IntermediatePKIPath": "connect-intermediate",
      "RootCertTTL": "8761h"
  },
  "ForceWithoutCrossSigning": false
}
  4. Assert root cert properties are as expected -- consul connect ca get-config -token=<TOKEN>

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>

@acpana acpana requested a review from a team as a code owner October 26, 2021 19:27
@github-actions github-actions bot added theme/api Relating to the HTTP API interface theme/config Relating to Consul Agent configuration, including reloading theme/connect Anything related to Consul Connect, Service Mesh, Side Car Proxies type/docs Documentation needs to be created/updated/clarified labels Oct 26, 2021
@hashicorp-ci
Contributor

🤔 This PR has changes in the website/ directory but does not have a type/docs-cherrypick label. If the changes are for the next version, this can be ignored. If they are updates to current docs, attach the label to auto cherrypick to the stable-website branch after merging.

Contributor

@kisunji kisunji left a comment

Left a bunch of nit-picky comments to consider!

agent/config/testdata/full-config.json Outdated
agent/connect/ca/provider_consul_test.go Outdated
agent/connect/ca/provider_consul_test.go Outdated
agent/connect/ca/provider_consul_test.go Outdated
agent/structs/connect_ca.go Outdated
agent/structs/connect_ca.go Outdated
agent/structs/connect_ca.go Outdated
@@ -38,6 +38,7 @@ type CAConfig struct {
// CommonCAProviderConfig is the common options available to all CA providers.
type CommonCAProviderConfig struct {
	LeafCertTTL time.Duration
	RootCertTTL time.Duration
Contributor

For my own understanding, why is this under Common and not Consul (which has a field called RootCert)?

Contributor Author

Good question!

IIUC, the ConsulCAProvider is one of the 3 providers we support. RootCertTTL is an option common to all providers, so I thought we should add it to CommonCAProviderConfig once rather than 3 times, once per provider.

sdk/testutil/server.go Outdated
FFMMM and others added 2 commits November 1, 2021 09:50
Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Contributor

@kyhavlov kyhavlov left a comment

Looking good, left a couple comments but nothing blocking - once the doc change is done this should be good to merge 👍

@@ -380,6 +382,17 @@ func (c CommonCAProviderConfig) Validate() error {
		return nil
	}

	// if the root cert ttl is not set, set it to the default
	if c.RootCertTTL == 0 {
		c.RootCertTTL = 10 * 365 * 24 * time.Hour
Contributor

I think we can do c.RootCertTTL = time.ParseDuration(DefaultRootCertTTL) here and use the default that's already defined above

Contributor

This function is Validate, but here we are normalizing (not validating).

I suspect this should return an error instead? In normal operations we expect RuntimeConfig to always have set a default, so users should only encounter this problem if they explicitly set a 0 for TTL (I think).

Contributor Author

"but here we are normalizing"
"In normal operations we expect RuntimeConfig to always have set a default,"

Hey both, thanks for the engagement here -- per Daniel's comment, I'm inclined to take out this chunk of code.

If a caller actually sets the value to less than the intermediate cert TTL, then the right error should come out.

I also see no such "normalization" / defaulting for the leaf and intermediate TTLs.

@@ -1267,6 +1267,11 @@ bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr
for more than twice the _current_ `leaf_cert_ttl`, it will be removed
from the trusted list.

- `root_cert_ttl` ((#todo)) todo. Defaults to 10 years as `87600h`.
todo restrictions.
Contributor

Still needs a description (leftover todo)

Contributor

By restrictions I guess you mean "is only used when the CA system is first initialized, or when a configuration change causes a rotation of the root certificate"? I think we should say something about that.

Contributor Author

I was punting on the docs a bit to make sure that the stuff here was something we eventually wanted to check in.

It looks like that is the case, so I added the docs now :)

Member

@FFMMM Can you also make sure this field gets documented under https://github.com/hashicorp/consul/blob/main/website/content/partials/http_api_connect_ca_common_options.mdx?

The options in this file are displayed under the provider-specific CA config options. For example, https://www.consul.io/docs/connect/ca/vault#common-ca-config-options.

Contributor

@dnephin dnephin left a comment

Nice work! I see you found all the places where we plumb this config.

Mostly just small suggestions about testing values and docs.

.changelog/11428.txt Outdated
agent/config/runtime_test.go Outdated
agent/config/testdata/full-config.hcl Outdated
Comment on lines +173 to +177
// the max lease ttl denotes the maximum ttl that secrets are created from the engine
// the default lease ttl is the kind of ttl that will *reliably* set the ttl to v.config.RootCertTTL
// https://www.vaultproject.io/docs/secrets/pki#configure-a-ca-certificate
MaxLeaseTTL: v.config.RootCertTTL.String(),
DefaultLeaseTTL: v.config.RootCertTTL.String(),
Contributor

🙌

Signed-off-by: FFMMM <FFMMM@users.noreply.github.com>
Co-authored-by: Daniel Nephin <dnephin@hashicorp.com>
Contributor

@kyhavlov kyhavlov left a comment

Looks good to merge, just had one tiny suggestion 👍

website/content/docs/agent/options.mdx Outdated
Co-authored-by: Kyle Havlovitz <kylehav@gmail.com>
@acpana acpana merged commit 4ddf973 into main Nov 2, 2021
@acpana acpana deleted the ffmmm/f-10762 branch November 2, 2021 18:02
@hc-github-team-consul-core
Collaborator

🍒 If backport labels were added before merging, cherry-picking will start automatically.

To retroactively trigger a backport after merging, add backport labels and re-run https://circleci.com/gh/hashicorp/consul/491685.

8 participants