Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP certrotation: don't let multiple controllers rewrite metadata of the shared resource #1693

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

vrutkovs
Copy link
Member

This prevents http 409 errors when both labels and content need updating in one sync. See this test failure when new library-go is being pulled in cluster-kube-apiserver-operator

@openshift-ci openshift-ci bot requested review from mfojtik and stlaz March 13, 2024 15:51
@vrutkovs
Copy link
Member Author

/hold

Testing this in openshift/cluster-kube-apiserver-operator#1655

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 13, 2024
@vrutkovs
Copy link
Member Author

/hold cancel

Seems to work as expected

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 14, 2024
@@ -64,10 +64,11 @@ func (c CABundleConfigMap) EnsureConfigMapCABundle(ctx context.Context, signingC
}
needsMetadataUpdate = c.AdditionalAnnotations.EnsureTLSMetadataUpdate(&caBundleConfigMap.ObjectMeta) || needsMetadataUpdate
if needsMetadataUpdate && len(caBundleConfigMap.ResourceVersion) > 0 {
_, _, err := resourceapply.ApplyConfigMap(ctx, c.Client, c.EventRecorder, caBundleConfigMap)
actualCABundleConfigMap, _, err := resourceapply.ApplyConfigMap(ctx, c.Client, c.EventRecorder, caBundleConfigMap)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mind adding a unit test ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, not sure if it really ensures hotlooping is not happening (imo fuzzing would confirm it)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't solve the hotloop issue. The hotloop issue was resolved by openshift/cluster-kube-apiserver-operator#1661.

There were 4 controllers fighting over kube-control-plane-signer-ca

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this by allowing metadata change to the first controller only (whoever has set the first ownership ref).

I'll run this a few time to make sure this won't make TLS metadata jitter from run to run

@p0lyn0mial
Copy link
Contributor

In the future we could refactor the controller to only issue a single update when meta and content require updating.

@vrutkovs vrutkovs force-pushed the EnsureConfigMapCABundle-apply-fix branch from 7b0793e to 6df5e95 Compare April 12, 2024 14:08
@p0lyn0mial
Copy link
Contributor

@vrutkovs a wrong update ? I don't see a unit test.

@vrutkovs vrutkovs force-pushed the EnsureConfigMapCABundle-apply-fix branch from 6df5e95 to 9f8acda Compare April 12, 2024 14:23
This prevents http 409 errors when both labels and content need updating in one sync
@vrutkovs vrutkovs force-pushed the EnsureConfigMapCABundle-apply-fix branch from 9f8acda to 079ddac Compare April 12, 2024 14:33
@p0lyn0mial
Copy link
Contributor

See this test failure when new library-go is being pulled in cluster-kube-apiserver-operator

For the record, the above failure was due to openshift/cluster-kube-apiserver-operator#1661 not due to the issue addressed by this PR.

if !actions[2].Matches("get", "configmaps") {
t.Error(actions[2])
}
if !actions[3].Matches("update", "configmaps") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the second call is unnecessary, we should refactor the controller to issue only a single API call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in addition it is worth to also examine the payload of the request (updated configmap), please add it to your todo list.

@p0lyn0mial
Copy link
Contributor

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 15, 2024
@p0lyn0mial
Copy link
Contributor

/assign @stlaz

for approval

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 16, 2024
Copy link
Contributor

openshift-ci bot commented Apr 16, 2024

New changes are detected. LGTM label has been removed.

Copy link
Contributor

openshift-ci bot commented Apr 16, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: p0lyn0mial, vrutkovs
Once this PR has been reviewed and has the lgtm label, please ask for approval from stlaz. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@vrutkovs vrutkovs changed the title certrotation: update caBundleConfigMap after labels have been applied certrotation: don't let multiple controllers rewrite metadata of the shared resource Apr 16, 2024
Ensure that multiple controllers managing single resource don't
overwrite each other metadata changes. This is archived by checking
ownership references - if current controller is set as first owner
other metadata changes are being ignored. This ensures that multiple
cert rotation controllers can handle single CA bundle configmap - but
only one of them will be updating configmap metadata - whichever is
started first
@vrutkovs vrutkovs force-pushed the EnsureConfigMapCABundle-apply-fix branch from db12f27 to 21aeec1 Compare April 16, 2024 12:50
…y metadata

In this case cert-rotation won't convert secret type as `ensureSecretTLSTypeSet` would never run
Copy link
Contributor

openshift-ci bot commented Apr 16, 2024

@vrutkovs: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

needsMetadataUpdate := false
if c.Owner != nil {
needsMetadataUpdate = ensureOwnerReference(&caBundleConfigMap.ObjectMeta, c.Owner)
if caBundleConfigMap.ObjectMeta.OwnerReferences[0] == *c.Owner {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we just have a single owner?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CA bundle can contain multiple signers, controlled by different controllers. We could have a single ownerRef, but that wouldn't reflect reality. It also makes easier to find such CA bundles in the must-gathers

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CA bundle can contain multiple signers, controlled by different controllers

This is wrong and should be fixed. Why can't we have a single signer ?
Could you point me to code that does this ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two controllers:

I'm planning to rewrite this as "NewCertRotationController should accept an array of RotatedSelfSignedCertKeySecrets" later to avoid it, but in general it may happen accidentally and we'd need a way to prevent that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, there is a single signer and a CA bundle.
Also as far as I can tell kas-o doesn't make use of the Owner field for both signer and CA resources.

Copy link
Member Author

@vrutkovs vrutkovs Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Owner denotes a controller, not the signer or CA. So yes, these are derived from single signer - but controlled by different controllers.

Do you think a better option would be implementing a controller which controls multiple target certs instead?

UPD: created #1722 to try this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why we have to have multiple controllers controlling a single resource. I would prefer to have a single controller for that.

It seems to me that we would like to mark which components are using the resource, which could be done with a single controller.

@vrutkovs vrutkovs changed the title certrotation: don't let multiple controllers rewrite metadata of the shared resource WIP certrotation: don't let multiple controllers rewrite metadata of the shared resource Apr 24, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants