Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ca: properly handle the case where the secondary initializes after the primary #11514

Merged
merged 4 commits into from
Nov 9, 2021

Conversation

dnephin
Copy link
Contributor

@dnephin dnephin commented Nov 5, 2021

Fixes #11367

Previously secondaryInitialize would return nil in this case, which prevented the
deferred initialize from happening, and left the CA in an uninitialized state until a config
update or root rotation.

To fix this I extracted the common parts into the delegate implementation. However looking at this
again, it seems like the handling in secondaryUpdateRoots is impossible, because that function
should never be called before the secondary is initialzied. I beleive we can remove some of that
logic in a follow up.

TODO:

  • how can we test this?
  • why is ClusterID even in the config?

Previously secondaryInitialize would return nil in this case, which prevented the
deferred initialize from happening, and left the CA in an uninitialized state until a config
update or root rotation.

To fix this I extracted the common parts into the delegate implementation. However looking at this
again, it seems like the handling in secondaryUpdateRoots is impossible, because that function
should never be called before the secondary is initialzied. I beleive we can remove some of that
logic in a follow up.
The secondary must get the clusterID from the primary
@dnephin dnephin added the theme/certificates Related to creating, distributing, and rotating certificates in Consul label Nov 5, 2021
@github-actions github-actions bot removed the theme/certificates Related to creating, distributing, and rotating certificates in Consul label Nov 5, 2021
@@ -1110,15 +1109,18 @@ func TestLeader_SecondaryCA_UpgradeBeforePrimary(t *testing.T) {
require.Equal(r, roots1[0].ID, roots2[0].ID)
require.Equal(r, roots1[0].RootCert, roots2[0].RootCert)

secondaryProvider, _ := getCAProviderWithLock(s2)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This had to move into the retry loop, otherwise it was pointing at the old not initialized provider that existed before the secondary could be initialized.

Comment on lines -183 to -184
// Allow the ClusterID to change if it's provided by an internal operation, such
// as a primary datacenter being switched to secondary mode.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted this comment since I came across it while tracing this code and it is misleading. We always allow changing the ClusterID. All these next lines do is default it when the new config has it unset.

@dnephin dnephin added pr/no-changelog PR does not need a corresponding .changelog entry theme/certificates Related to creating, distributing, and rotating certificates in Consul labels Nov 5, 2021
Currently getCARoots could return an empty object with an empty trust
domain before the CA is initialized. This commit returns an error while
there is no CA config or no trust domain.

There could be a CA config and no trust domain because the CA config can
be created in InitializeCA before initialization succeeds.
@vercel vercel bot temporarily deployed to Preview – consul November 8, 2021 23:54 Inactive
@vercel vercel bot temporarily deployed to Preview – consul-ui-staging November 8, 2021 23:54 Inactive
Copy link
Contributor

@kyhavlov kyhavlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good - we could probably remove ClusterID from the config altogether in the future (we don't document it as a configurable field) and just persist it in the roots object in the state store. I think having it in two places is definitely a source of unnecessary confusion.

@freddygv freddygv added backport/1.10 and removed pr/no-changelog PR does not need a corresponding .changelog entry labels Nov 9, 2021
@vercel vercel bot temporarily deployed to Preview – consul November 9, 2021 00:14 Inactive
@vercel vercel bot temporarily deployed to Preview – consul-ui-staging November 9, 2021 00:14 Inactive
@freddygv freddygv merged commit 7d95d90 into main Nov 9, 2021
@freddygv freddygv deleted the dnephin/ca-fix-secondary-init branch November 9, 2021 00:16
@hc-github-team-consul-core
Copy link
Collaborator

🍒 If backport labels were added before merging, cherry-picking will start automatically.

To retroactively trigger a backport after merging, add backport labels and re-run https://circleci.com/gh/hashicorp/consul/496536.

@hc-github-team-consul-core
Copy link
Collaborator

🍒❌ Cherry pick of commit 7d95d90 onto release/1.10.x failed! Build Log

freddygv added a commit that referenced this pull request Nov 9, 2021
ca: properly handle the case where the secondary initializes after the primary
freddygv added a commit that referenced this pull request Nov 9, 2021
Backport of #11514

---------

Fixes #11367

Previously `secondaryInitialize` would return nil in this case, which prevented the
deferred initialize from happening, and left the CA in an uninitialized state until a config
update or root rotation.

To fix this I extracted the common parts into the delegate implementation. However looking at this
again, it seems like the handling in secondaryUpdateRoots is impossible, because that function
should never be called before the secondary is initialized. I believe we can remove some of that
logic in a follow up.
Comment on lines +33 to +34
indexedRoots.TrustDomain = signingID.Host()
if indexedRoots.TrustDomain == "" {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Host() returns clusterID.Domain, and Domain is hardcoded to consul in SpiffeIDSigningForCluster, so I don't think this will ever eventuate to true. Should this only check the ClusterID from config?

cc @freddygv

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theme/certificates Related to creating, distributing, and rotating certificates in Consul
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ca: do not generate a ClusterID in the secondary
4 participants