
'ca.crt' is removed from 'kubernetes.io/tls' secret after a while #80

Closed
dejwsz opened this issue Oct 19, 2020 · 10 comments

@dejwsz

dejwsz commented Oct 19, 2020

I have cert-manager deployed at the latest version, 1.0.3, on OpenShift 3.11. I also deployed cert-utils-operator v0.1.0; newer versions unfortunately do not work in this environment (it would be good to look into that).
With cert-utils-operator in place, when I create a new certificate from a self-signed Issuer, I initially see three keys in the generated Secret: tls.key, tls.crt, and ca.crt. After a few minutes the Secret is updated and ca.crt is removed completely, which is a problem because the Secret then becomes useless for my use case.
I found that cert-utils-operator is somehow responsible, because after I remove it this no longer happens.
I tried newer operator versions, but they do not work on OpenShift 3.11.

@mathianasj
Contributor

Hi, this is by design. The route gets updated with the contents of the provided secret; if ca.crt is not provided in the secret, that is why it is being overwritten. This behavior should also be present in the latest version.

@mathianasj
Contributor

See here

if route.Spec.TLS.CACertificate != string(value) {
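To make the check concrete, here is a self-contained sketch of that overwrite logic. The struct shapes below are simplified stand-ins for the OpenShift route/v1 and core/v1 types, not the operator's actual code; only the quoted comparison is from the source:

```go
package main

import "fmt"

// Minimal stand-ins for the OpenShift route types (assumed shapes, for
// illustration only; the real types live in the route/v1 API group).
type TLSConfig struct{ CACertificate string }
type RouteSpec struct{ TLS *TLSConfig }
type Route struct{ Spec RouteSpec }

// syncCA mirrors the comparison referenced above: the route's CA certificate
// is overwritten whenever it differs from the secret's ca.crt value. A secret
// with no ca.crt key yields an empty string, so the route's CA gets blanked.
func syncCA(route *Route, secretData map[string][]byte) bool {
	value := secretData["ca.crt"] // missing key -> nil -> ""
	if route.Spec.TLS.CACertificate != string(value) {
		route.Spec.TLS.CACertificate = string(value)
		return true
	}
	return false
}

func main() {
	r := &Route{Spec: RouteSpec{TLS: &TLSConfig{CACertificate: "my-ca"}}}
	// Secret carries tls.crt but no ca.crt: the route CA is wiped.
	changed := syncCA(r, map[string][]byte{"tls.crt": []byte("cert")})
	fmt.Println(changed, r.Spec.TLS.CACertificate == "")
}
```

This illustrates why a secret lacking ca.crt ends up clearing the CA on the route side.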

@dejwsz
Author

dejwsz commented Oct 19, 2020

This secret was not used for a route at all; I did not put any cert-utils-operator annotation on it.
It should not work like this: the operator should not touch secrets that are not related to it, I believe.
This looks like a side effect to me, or I am misunderstanding something. The operator should only act when I consciously use one of its annotations on a secret, a configmap, or a route. In this case, I did not.

@mathianasj
Contributor

I misread this at first. Would you be able to provide a sample of the cert-manager Certificate and the output of the Secret so I can review and reproduce the issue? What version of cert-manager do you have deployed?

@dejwsz
Author

dejwsz commented Oct 20, 2020

Hmm, I need to apologize; maybe I had some other problem. I just wanted to be sure and give you all the needed info, so I removed everything and reinstalled on a different OpenShift 3.11 cluster, and now I cannot reproduce the problem. Maybe I had some weird state in etcd in my test cluster, as I can see errors like this in the operator logs:

{"level":"error","ts":1603182665.7103422,"logger":"controller_ca_injection","msg":"unable to retrive ca from secret","secret":"my-test-project4/testdejw-tls","error":"Secret \"testdejw-tls\" not found","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/travis/gopath/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\ngithub.com/redhat-cop/cert-utils-operator/pkg/controller/cainjection.(*ReconcileConfigmap).Reconcile\n\t/home/travis/gopath/src/github.com/redhat-cop/cert-utils-operator/pkg/controller/cainjection/configmap_controller.go:133\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/travis/gopath/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/travis/gopath/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/home/travis/gopath/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/home/travis/gopath/pkg/mod/k8s.io/apimachinery@v0.0.0-20191004115801-a2eda9f80ab8/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/home/travis/gopath/pkg/mod/k8s.io/apimachinery@v0.0.0-20191004115801-a2eda9f80ab8/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/home/travis/gopath/pkg/mod/k8s.io/apimachinery@v0.0.0-20191004115801-a2eda9f80ab8/pkg/util/wait/wait.go:88"}

The thing is, the secret mentioned there does not exist anymore, yet the operator was still trying to look it up. Maybe I had recreated such a secret before, it carried some stale state, the operator tried to sync it, and then ca.crt was removed somehow.
I cannot reproduce it in the second cluster.

@dejwsz
Author

dejwsz commented Oct 20, 2020

I am using the latest cert-manager v1.0.3 and cert-utils-operator v0.1.0.

@dejwsz
Author

dejwsz commented Oct 20, 2020

Hmm, I think I reproduced it; it took more time than I anticipated. On a second OpenShift 3.11 cluster I have cert-manager v1.0.3 and cert-utils-operator v0.1.0. In one of my namespaces I created a local Issuer and then a Certificate, and just let them sit for a while. After some time (I checked again after 1 hour) ca.crt was removed from the generated secret. If I do the same without cert-utils-operator (uninstalled), this no longer happens. I attached YAMLs with the issuer, the certificate, and the generated secret (with ca.crt inside).

issuer.txt
certificate.txt
created-secret.txt
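For readers without the attachments, a minimal self-signed Issuer/Certificate pair of the kind described might look roughly like this. All names here are illustrative assumptions (the secret name follows the one visible in the log earlier in the thread), not the actual attached content:

```yaml
# Hypothetical reproduction manifests (cert-manager v1 API).
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: selfsigned-issuer
  namespace: my-test-project4
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: testdejw
  namespace: my-test-project4
spec:
  secretName: testdejw-tls   # cert-manager writes tls.key, tls.crt, ca.crt here
  commonName: testdejw.example.com
  dnsNames:
    - testdejw.example.com
  issuerRef:
    name: selfsigned-issuer
    kind: Issuer
```

With a self-signed issuer, cert-manager populates ca.crt in the resulting kubernetes.io/tls secret, which is the key reported as disappearing.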

@ron1

ron1 commented Nov 12, 2020

We also run cert-utils-operator v0.1.0 on OCP 3.11 and have seen this behavior. Since we do not have cert-manager installed, I do not think this bug has anything to do with cert-manager.

@mathianasj
Contributor

@dejwsz I believe this should be fixed with #84

@Kajot-dev

@mathianasj Actually, it is not fixed by #84 (although it is fixed in v1.0.0+).

The real issue comes from the UpdateFunc in the isAnnotatedSecret predicate within controller/cainjection/secret_controller.go:
it does not check whether the secret carries the annotation at all! The operator then proceeds to flood the logs, reconciling every TLS secret on the cluster regardless of whether it has the required annotation. This also explains the "random" delay: processing all secrets in the cluster simply takes time ;)

If someone wants to fix the v0.1 version, here is a patch (the predicate for isAnnotatedSecret backported from a more recent version):

From af627038a239aad8a130b3db43009d7bc94657bf Mon Sep 17 00:00:00 2001
From: jjaruszewski <jjaruszewski@man.poznan.pl>
Date: Wed, 10 May 2023 18:22:24 +0200
Subject: [PATCH] fix/reconciling all secrets in cluster

---
 pkg/controller/cainjection/secret_controller.go | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/pkg/controller/cainjection/secret_controller.go b/pkg/controller/cainjection/secret_controller.go
index 8ae1288..da5a9b3 100644
--- a/pkg/controller/cainjection/secret_controller.go
+++ b/pkg/controller/cainjection/secret_controller.go
@@ -49,7 +49,9 @@ func addSecretReconciler(mgr manager.Manager, r reconcile.Reconciler) error {
 			if newSecret.Type != util.TLSSecret {
 				return false
 			}
-			return true
+			oldSecretAnn, _ := e.MetaOld.GetAnnotations()[certAnnotationSecret]
+			newSecretAnn, _ := e.MetaNew.GetAnnotations()[certAnnotationSecret]
+			return oldSecretAnn != newSecretAnn
 		},
 		CreateFunc: func(e event.CreateEvent) bool {
 			secret, ok := e.Object.(*corev1.Secret)
-- 
2.40.1

4 participants