Stuck on "Provisioning MinIO Headless Service" #632

Closed
djhoese opened this issue May 11, 2021 · 21 comments

Comments

@djhoese

djhoese commented May 11, 2021

I'm attempting to install the MinIO Operator on a new test cluster at my work. The cluster was created with Rancher (RKE) and has a default StorageClass using Rancher's local-path-provisioner. After following the README (using the krew plugin) to install the operator and create a tenant, the status of the tenant stays at "Provisioning MinIO Headless Service". There is a good chance I'm doing something wrong, but it isn't clear to me what. The web UI says "Unable to get tenant usage" when I go to the page for the tenant.

Expected Behavior

Create tenant, see resources being created in the tenant workspace (PV/PVCs), and see the status of the tenant in the MinIO Operator web UI.

Current Behavior

See above. Tenant stuck in "Provisioning MinIO Headless Service".

Additionally, the operator logs show:

kubectl -n minio-operator logs minio-operator-85dc48fc66-mplj6
I0511 19:25:19.455309       1 main.go:74] Starting MinIO Operator
I0511 19:25:19.948228       1 main.go:146] caBundle on CRD updated
I0511 19:25:19.948941       1 main-controller.go:272] Setting up event handlers
I0511 19:25:19.949047       1 main-controller.go:656] Starting Tenant controller
I0511 19:25:19.949068       1 main-controller.go:659] Waiting for informer caches to sync
I0511 19:25:20.049470       1 main-controller.go:664] Starting workers
I0511 19:25:20.088866       1 main-controller.go:620] operator TLS secret not found%!(EXTRA string=secrets "operator-tls" not found)
W0511 19:25:20.150612       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:25:20.279329       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
I0511 19:25:20.279427       1 csr.go:217] Start polling for certificate of csr/operator-minio-operator-csr, every 5s, timeout after 20m0s
W0511 19:25:25.287545       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
...

Possible Solution

May be related to #483

Steps to Reproduce (for bugs)

  1. Start with RKE cluster (v1.20.6 in my case) and local-path-provisioner installed.
  2. kubectl minio init
  3. kubectl minio tenant create minio-tenant-1 --servers 3 --volumes 12 --capacity 16Gi --namespace minio-tenant-1 --storage-class local-path
  4. Check the tenant status:
$ kubectl minio tenant info -n minio-tenant-1 minio-tenant-1

Tenant 'minio-tenant-1/minio-tenant-1', total capacity 16 GiB

  Current status: Provisioning MinIO Headless Service
  MinIO version: minio/minio:RELEASE.2021-04-06T23-11-00Z
  MinIO service: minio/ClusterIP (port 443)

  Console version: minio/console:v0.6.8
  Console service: minio-tenant-1-console/ClusterIP (port 9443)

+------+---------+--------------------+---------------------+
| POOL | SERVERS | VOLUMES PER SERVER | CAPACITY PER VOLUME |
+------+---------+--------------------+---------------------+
| 0    | 3       | 4                  | 1431655765          |
+------+---------+--------------------+---------------------+
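The CAPACITY PER VOLUME value looks like the requested capacity in bytes split evenly across all volumes (16Gi over 12 volumes here). A minimal sketch of that arithmetic, assuming plain integer division; the function name per_volume_bytes is made up for illustration and is not part of the plugin:

```shell
#!/bin/sh
# per_volume_bytes CAPACITY_GIB TOTAL_VOLUMES
# Convert GiB to bytes, then divide evenly (integer division) across the volumes.
per_volume_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 / $2 ))
}

per_volume_bytes 16 12   # prints 1431655765
```

This matches both the table above and the storage request that ends up in the tenant's volumeClaimTemplate.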

Context

Note I have very little experience with MinIO. I used the old helm chart once and am trying to switch to the Operator now.

My work just started using rancher as an on-premises solution for creating k8s clusters. I'm helping our IT department out by testing some things on a "test" cluster (3 very small nodes) that I want to use in our operational clusters.

Side note: I'm very confused by the certificate/TLS stuff in https://github.com/minio/operator/blob/master/README.md#3-connect-to-the-tenant. Is that on the operator pod? Is that supposed to be on cluster nodes? When I tried to do this on the operator pod (kubectl exec), the destination directory didn't exist and couldn't be created.

Your Environment

  • Version used (minio-operator): krew plugin says v4.0.9
  • Environment name and version (e.g. kubernetes v1.17.2): RKE
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.6", GitCommit:"8a62859e515889f07e3e3be6a1080413f17cf2c3", GitTreeState:"clean", BuildDate:"2021-04-15T03:19:55Z", GoVersion:"go1.15.10", Compiler:"gc", Platform:"linux/amd64"}
  • Server type and version:
  • Operating System and version (uname -a):
  • Link to your deployment file:
@djhoese
Author

djhoese commented May 11, 2021

Forgot to mention the CSR exists and is approved:

$ kubectl get csr/operator-minio-operator-csr -o yaml | tail
  - client auth
  username: system:serviceaccount:minio-operator:minio-operator
status:
  conditions:
  - lastTransitionTime: "2021-05-11T01:55:35Z"
    lastUpdateTime: "2021-05-11T01:55:35Z"
    message: Automatically approved by MinIO Operator
    reason: MinIOOperatorAutoApproval
    status: "True"
    type: Approved

@dvaldivia
Collaborator

@djhoese can you share both the logs of the minio-operator pod and the tenant yaml for this tenant?

@djhoese
Author

djhoese commented May 12, 2021

Operator Log
I0511 19:25:19.455309       1 main.go:74] Starting MinIO Operator
I0511 19:25:19.948228       1 main.go:146] caBundle on CRD updated
I0511 19:25:19.948941       1 main-controller.go:272] Setting up event handlers
I0511 19:25:19.949047       1 main-controller.go:656] Starting Tenant controller
I0511 19:25:19.949068       1 main-controller.go:659] Waiting for informer caches to sync
I0511 19:25:20.049470       1 main-controller.go:664] Starting workers
I0511 19:25:20.088866       1 main-controller.go:620] operator TLS secret not found%!(EXTRA string=secrets "operator-tls" not found)
W0511 19:25:20.150612       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:25:20.279329       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
I0511 19:25:20.279427       1 csr.go:217] Start polling for certificate of csr/operator-minio-operator-csr, every 5s, timeout after 20m0s
W0511 19:25:25.287545       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:25:30.283385       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:25:35.284425       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:25:40.283410       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:25:45.286688       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:25:50.288868       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:25:55.290992       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:26:00.290091       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:26:05.286581       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:26:10.291832       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
E0511 19:26:13.409378       1 main-controller.go:742] error syncing 'minio-tenant-1/minio-tenant-1': secrets "operator-tls" not found
W0511 19:26:15.286213       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:26:20.287611       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
...
W0511 19:45:20.287991       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
E0511 19:45:20.288226       1 operator.go:111] Unexpected error during the creation of the csr/operator-minio-operator-csr: timeout during certificate fetching of csr/operator-minio-operator-csr
I0511 19:45:20.288250       1 main-controller.go:623] Waiting for the operator certificates to be issued timeout during certificate fetching of csr/operator-minio-operator-csr
E0511 19:45:23.624301       1 main-controller.go:742] error syncing 'minio-tenant-1/minio-tenant-1': secrets "operator-tls" not found
I0511 19:45:30.295939       1 main-controller.go:620] operator TLS secret not found%!(EXTRA string=secrets "operator-tls" not found)
W0511 19:45:30.298548       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0511 19:45:30.308986       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
I0511 19:45:30.309081       1 csr.go:217] Start polling for certificate of csr/operator-minio-operator-csr, every 5s, timeout after 20m0s
W0511 19:45:35.315380       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest

And this continues.

Tenant YAML
apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  creationTimestamp: "2021-05-11T19:26:08Z"
  generation: 1
  name: minio-tenant-1
  namespace: minio-tenant-1
  resourceVersion: "1066063"
  uid: 2cb97d7d-054c-43b3-84cb-015a77606d4f
scheduler:
  name: ""
spec:
  certConfig: {}
  console:
    consoleSecret:
      name: minio-tenant-1-console-secret
    image: minio/console:v0.6.8
    replicas: 2
    resources: {}
  credsSecret:
    name: minio-tenant-1-creds-secret
  image: minio/minio:RELEASE.2021-04-06T23-11-00Z
  imagePullSecret: {}
  mountPath: /export
  pools:
  - affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: v1.min.io/tenant
              operator: In
              values:
              - minio-tenant-1
          topologyKey: kubernetes.io/hostname
    resources: {}
    servers: 3
    volumeClaimTemplate:
      apiVersion: v1
      kind: persistentvolumeclaims
      metadata: {}
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: "1431655765"
        storageClassName: local-path
      status: {}
    volumesPerServer: 4
  requestAutoCert: true
status:
  availableReplicas: 0
  certificates:
    autoCertEnabled: true
  currentState: Provisioning MinIO Headless Service
  pools: null
  revision: 0
  syncVersion: ""

@puertal

puertal commented May 13, 2021

Same problem here with minio-operator 4.0.10.
Including the secrets "operator-tls" not found Error in the log file.

@harshavardhana
Member

Same problem here with minio-operator 4.0.10.
Including the secrets "operator-tls" not found Error in the log file.

Remove your existing CSRs or disable autoCert to deploy. This will generate new CSRs that would fix the operator-tls secret @puertal

@djhoese
Author

djhoese commented May 13, 2021

@harshavardhana Does that apply to both of us?

I tried deleting the CSR, it was recreated, but no change on the operator-tls secret error.

I tried editing the tenant resource live (kubectl edit ...) by changing requestAutoCert to false. That didn't change the status of the tenant or change the secret error messages. I created the tenant with the krew plugin. Is there a way to disable this from that interface?

@harshavardhana
Member

@harshavardhana Does that apply to both of us?

I tried deleting the CSR, it was recreated, but no change on the operator-tls secret error.

I tried editing the tenant resource live (kubectl edit ...) by changing requestAutoCert to false. That didn't change the status of the tenant or change the secret error messages. I created the tenant with the krew plugin. Is there a way to disable this from that interface?

Run kubectl minio create -o yaml to generate the YAML, then set requestAutoCert: false in it and create the tenant from that file. Using kubectl edit won't work @djhoese
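As a rough sketch of that step: the helper below flips requestAutoCert from true to false in YAML read on stdin. The function name disable_autocert is hypothetical, and it assumes the field appears on its own line as it does in the tenant YAML earlier in this thread.

```shell
#!/bin/sh
# disable_autocert: read tenant YAML on stdin, write it to stdout with
# "requestAutoCert: true" changed to "requestAutoCert: false".
# Indentation is preserved; all other lines pass through untouched.
disable_autocert() {
  sed 's/^\([[:space:]]*requestAutoCert:\)[[:space:]]*true[[:space:]]*$/\1 false/'
}

# Possible usage (needs the krew plugin and cluster access; flags copied
# from the original report, with the plugin's -o dry-run output flag):
#   kubectl minio tenant create minio-tenant-1 --servers 3 --volumes 12 \
#     --capacity 16Gi --namespace minio-tenant-1 --storage-class local-path -o \
#     | disable_autocert > minio-tenant-1.yaml
#   kubectl create -f minio-tenant-1.yaml
```

Note that this only changes the tenant's own certificate request; it does not touch the operator-level operator-tls secret.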

@djhoese
Author

djhoese commented May 13, 2021

Had a small hiccup, the -o flag doesn't actually take any arguments. Did what you said. Here's the YAML:

Modified resource YAML
apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  creationTimestamp: null
  name: minio-tenant-1
  namespace: minio-tenant-1
scheduler:
  name: ""
spec:
  certConfig: {}
  console:
    consoleSecret:
      name: minio-tenant-1-console-secret
    image: minio/console:v0.6.8
    replicas: 2
    resources: {}
  credsSecret:
    name: minio-tenant-1-creds-secret
  image: minio/minio:RELEASE.2021-04-06T23-11-00Z
  imagePullSecret: {}
  mountPath: /export
  pools:
  - affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: v1.min.io/tenant
              operator: In
              values:
              - minio-tenant-1
          topologyKey: kubernetes.io/hostname
    resources: {}
    servers: 3
    volumeClaimTemplate:
      apiVersion: v1
      kind: persistentvolumeclaims
      metadata:
        creationTimestamp: null
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: "1431655765"
        storageClassName: local-path
      status: {}
    volumesPerServer: 4
  requestAutoCert: false
status:
  availableReplicas: 0
  certificates: {}
  currentState: ""
  pools: null
  revision: 0
  syncVersion: ""

---
apiVersion: v1
data:
  accesskey: ZjEyYmE5MDItMjMyMi00NTk5LThkZjgtMjJkNDYwMzJhZGE2
  secretkey: NDQ1MjE3YjQtOGU2ZS00YWY1LWEzMDAtOTE1MDI1ZTRhMjZj
kind: Secret
metadata:
  creationTimestamp: null
  name: minio-tenant-1-creds-secret
  namespace: minio-tenant-1

---
apiVersion: v1
data:
  CONSOLE_ACCESS_KEY: YWRtaW4=
  CONSOLE_PBKDF_PASSPHRASE: ZjhmYmVlNmUtZmE5YS00MjlkLWE0OTAtZDExMTAzMzI1ZDNh
  CONSOLE_PBKDF_SALT: MDE4OTQ3YzMtNDQ2Zi00OGU5LWI0ZTEtZGRlOWFkZTQ5NDE4
  CONSOLE_SECRET_KEY: NDQyMDIwMjQtMDkxMC00ZmVjLWFhM2QtMGMxNWY5YThmN2Yx
kind: Secret
metadata:
  creationTimestamp: null
  name: minio-tenant-1-console-secret
  namespace: minio-tenant-1

I ran kubectl create -f minio-tenant-1.yaml, but the log from the operator looks the same:

W0513 16:43:03.891275       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
E0513 16:43:07.095521       1 main-controller.go:742] error syncing 'minio-tenant-1/minio-tenant-1': secrets "operator-tls" not found
W0513 16:43:08.892397       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0513 16:43:13.890325       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
E0513 16:43:17.030601       1 main-controller.go:742] error syncing 'minio-tenant-1/minio-tenant-1': secrets "operator-tls" not found

@harshavardhana
Member

W0513 16:43:03.891275 1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
E0513 16:43:07.095521 1 main-controller.go:742] error syncing 'minio-tenant-1/minio-tenant-1': secrets "operator-tls" not found
W0513 16:43:08.892397 1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0513 16:43:13.890325 1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
E0513 16:43:17.030601 1 main-controller.go:742] error syncing 'minio-tenant-1/minio-tenant-1': secrets "operator-tls" not found

This is because the operator CSR, whether approved or not, has to exist for operator-tls to be present. If the CSR is already approved, then we need to wait for the secret associated with it to be created.

The CSR problem today is that once it's approved, k8s doesn't give out new approvals, so it has to be manually purged.

Run kubectl get csr and delete all of them @djhoese

@harshavardhana
Member

For example, this is what should happen at the operator level:

I0513 20:31:19.336567       1 main.go:74] Starting MinIO Operator
I0513 20:31:19.807609       1 main.go:146] caBundle on CRD updated
I0513 20:31:19.808126       1 main-controller.go:272] Setting up event handlers
I0513 20:31:19.808192       1 main-controller.go:656] Starting Tenant controller
I0513 20:31:19.808201       1 main-controller.go:659] Waiting for informer caches to sync
I0513 20:31:19.838415       1 main-controller.go:620] operator TLS secret not found%!(EXTRA string=secrets "operator-tls" not found)
W0513 20:31:19.842965       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0513 20:31:19.872186       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0513 20:31:19.879084       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
I0513 20:31:19.879347       1 csr.go:217] Start polling for certificate of csr/operator-minio-operator-csr, every 5s, timeout after 20m0s
I0513 20:31:20.008707       1 main-controller.go:664] Starting workers
W0513 20:31:24.886026       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
I0513 20:31:24.886414       1 csr.go:242] Certificate successfully fetched, creating secret with Private key and Certificate
I0513 20:31:24.893343       1 main-controller.go:623] Waiting for the operator certificates to be issued waiting for Operator cert
I0513 20:31:34.901374       1 main-controller.go:647] Starting api server
~ kubectl get secrets -n minio-operator
NAME                         TYPE                                  DATA   AGE
console-sa-token-lr5wm       kubernetes.io/service-account-token   3      72s
default-token-m54f9          kubernetes.io/service-account-token   3      72s
minio-operator-token-6vthl   kubernetes.io/service-account-token   3      72s
operator-tls                 Opaque                                2      63s

Once certs are issued the operator-tls should be present
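One way to avoid deploying a tenant before the secret exists is to poll for it first. This is only a sketch: wait_for is a made-up generic retry helper, and the attempt count and delay in the usage comment are arbitrary choices.

```shell
#!/bin/sh
# wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-run COMMAND until it exits 0 or ATTEMPTS tries have been used.
# Returns 0 on success, 1 if the command never succeeded.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Possible usage (needs cluster access): wait up to about 5 minutes for
# operator-tls, then deploy the tenant.
#   wait_for 60 5 kubectl get secret operator-tls -n minio-operator \
#     && kubectl apply -f tenant.yaml
```

If the helper gives up, the operator log and kubectl get csr are the places to look next.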

And then you deploy your tenant

~ kubectl apply -f tenant.yaml
I0513 20:31:24.886414       1 csr.go:242] Certificate successfully fetched, creating secret with Private key and Certificate
I0513 20:31:24.893343       1 main-controller.go:623] Waiting for the operator certificates to be issued waiting for Operator cert
I0513 20:31:34.901374       1 main-controller.go:647] Starting api server
I0513 20:33:12.705370       1 main-controller.go:989] Deploying pool ss-0
I0513 20:33:13.727103       1 main-controller.go:994] Deploying pool failed ss-0
E0513 20:33:13.727189       1 main-controller.go:742] error syncing 'altinityy/altinity': MinIO is not ready
I0513 20:33:22.637684       1 main-controller.go:989] Deploying pool ss-0
W0513 20:33:22.640891       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
I0513 20:33:22.655437       1 csr.go:73] Generating private key
I0513 20:33:22.655612       1 csr.go:86] Generating CSR with CN=*.altinity-hl.altinityy.svc.cluster.local
W0513 20:33:22.662686       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0513 20:33:22.674657       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
I0513 20:33:22.674976       1 csr.go:217] Start polling for certificate of csr/altinity-altinityy-csr, every 5s, timeout after 20m0s
W0513 20:33:27.686440       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
I0513 20:33:27.686951       1 csr.go:242] Certificate successfully fetched, creating secret with Private key and Certificate
E0513 20:33:27.697112       1 main-controller.go:742] error syncing 'altinityy/altinity': waiting for minio cert

And now this cert granting takes a godforsaken amount of time on k8s :-)

@dvaldivia
Collaborator

We've seen this on RKE because when you create the RKE cluster you need to configure the CSR feature

https://rancher.com/docs/rke/latest/en/installation/certs/
rancher/rancher#14041

The symptom is that the CSR shows as approved but not issued
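To spot that symptom quickly, the CONDITION column of kubectl get csr can be classified: a signed request reads "Approved,Issued", while a stuck one stays at "Approved". The sketch below uses a made-up helper name, csr_state; the CSR name in the usage comment is the one from the operator logs in this thread.

```shell
#!/bin/sh
# csr_state CONDITION
# Map the CONDITION column printed by `kubectl get csr` to a short state.
csr_state() {
  case "$1" in
    *Issued*)   echo "issued" ;;               # certificate was signed
    *Approved*) echo "approved-not-issued" ;;  # approved, but no signer acted
    *Denied*)   echo "denied" ;;
    *)          echo "pending" ;;
  esac
}

# Possible usage (needs cluster access); CONDITION is the last column:
#   cond=$(kubectl get csr operator-minio-operator-csr --no-headers | awk '{print $NF}')
#   csr_state "$cond"
```

A result of approved-not-issued that persists for minutes suggests that nothing in the cluster is signing the request.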

@djhoese
Author

djhoese commented May 14, 2021

Ah I didn't realize the CSR stuff was needed/expected before the tenant should be created. At least that takes the tenant out of the puzzle. However, in my last comment I mentioned that I had tried deleting the CSR and I didn't see a change. Hopefully I'm not missing something.

$ kubectl get csr
NAME                          AGE     SIGNERNAME                     REQUESTOR                                             CONDITION
operator-minio-operator-csr   7h13m   kubernetes.io/legacy-unknown   system:serviceaccount:minio-operator:minio-operator   Approved

Delete:

$ kubectl delete csr/operator-minio-operator-csr
certificatesigningrequest.certificates.k8s.io "operator-minio-operator-csr" deleted

Operator logs:

W0513 23:06:14.419841       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
E0513 23:06:14.419959       1 csr.go:238] Unexpected error during certificate fetching of csr/operator-minio-operator-csr: certificatesigningrequests.certificates.k8s.io "operator-minio-operator-csr" not found
E0513 23:06:14.419982       1 operator.go:111] Unexpected error during the creation of the csr/operator-minio-operator-csr: certificatesigningrequests.certificates.k8s.io "operator-minio-operator-csr" not found
I0513 23:06:14.420006       1 main-controller.go:623] Waiting for the operator certificates to be issued certificatesigningrequests.certificates.k8s.io "operator-minio-operator-csr" not found
I0513 23:06:24.429102       1 main-controller.go:620] operator TLS secret not found%!(EXTRA string=secrets "operator-tls" not found)
W0513 23:06:24.431908       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0513 23:06:24.440983       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0513 23:06:24.446934       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
I0513 23:06:24.447148       1 csr.go:217] Start polling for certificate of csr/operator-minio-operator-csr, every 5s, timeout after 20m0s
W0513 23:06:29.453054       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest
W0513 23:06:34.451394       1 warnings.go:70] certificates.k8s.io/v1beta1 CertificateSigningRequest is deprecated in v1.19+, unavailable in v1.22+; use certificates.k8s.io/v1 CertificateSigningRequest

And so on and on (put my kid to bed) and on (made a drink) and on..

Regarding your comment @dvaldivia, you're saying what I'm seeing is a symptom, meaning that deleting the csr should work as a workaround, right? If I need to configure my RKE cluster a certain way, how should it be done?

@puertal

puertal commented May 17, 2021

I still have the same problem as @djhoese: the "Start polling for certificate..." message repeats forever and the "Starting api server" message never appears.

To give more info, I am deploying minio-operator using your Helm Chart package v4.0.10 downloaded from: https://github.com/minio/operator/tree/master/helm-releases

And I configured it with:

  operator:
    imagePullSecrets:
      - name: dockerhub-registry-secret
    replicaCount: 1
    image:
      pullPolicy: Always
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
        ephemeral-storage: 500Mi
      limits:
        cpu: 500m
        memory: 512Mi
        ephemeral-storage: 500Mi

Checking the "operator-deployment.yaml" file inside the TGZ file, I cannot see many other configuration options.

If I delete the CSR, it is always recreated when the Deployment is created, but the "operator-tls" Secret is never created.
Here you can find the CSR:

apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  creationTimestamp: "2021-05-17T10:53:47Z"
  managedFields:
  - apiVersion: certificates.k8s.io/v1beta1
    fieldsType: FieldsV1
...
    manager: minio-operator
    operation: Update
    time: "2021-05-17T10:53:47Z"
  name: operator-minio-operator-csr
  ownerReferences:
  - apiVersion: minio.min.io/v2
    blockOwnerDeletion: true
    controller: true
    kind: Tenant
    name: minio-operator
    uid: 87384795-df46-4a44-8374-f2636d96c617
  resourceVersion: "16556290"
  selfLink: /apis/certificates.k8s.io/v1/certificatesigningrequests/operator-minio-operator-csr
  uid: d26f6562-829b-4915-b9e2-0c513d2ed899
spec:
  groups:
  - system:serviceaccounts
  - system:serviceaccounts:minio-operator
  - system:authenticated
  request: LS0...
  signerName: kubernetes.io/legacy-unknown
  uid: fc98c3f7-6b85-42df-b020-1a2a9d69d8fd
  usages:
  - digital signature
  - server auth
  - client auth
  username: system:serviceaccount:minio-operator:minio-operator
status:
  conditions:
  - lastTransitionTime: "2021-05-17T10:53:47Z"
    lastUpdateTime: "2021-05-17T10:53:47Z"
    message: Automatically approved by MinIO Operator
    reason: MinIOOperatorAutoApproval
    status: "True"
    type: Approved

No Tenant defined at this point.
Thanks!!!!!!!

@steve-todorov

Well it looks like I need to join this party as well..
I'm having the exact same issue running k8s cluster started via RKE (k8s version is 1.20.5).
I have tried to follow the instruction above but I absolutely never ever get to having

$ kubectl get secrets -n minio-operator
NAME                         TYPE                                  DATA   AGE
console-sa-token-lr5wm       kubernetes.io/service-account-token   3      72s
default-token-m54f9          kubernetes.io/service-account-token   3      72s
minio-operator-token-6vthl   kubernetes.io/service-account-token   3      72s
operator-tls                 Opaque                                2      63s

The operator-tls secret never appears. I'm not entirely sure what we're expected to do to fix it.
Any more ideas / suggestions?

@steve-todorov

I've had some more time to play around with this and I confirm @dvaldivia's comment:

We've seen this on RKE because when you create the RKE cluster you need to configure the CSR feature

https://rancher.com/docs/rke/latest/en/installation/certs/
rancher/rancher#14041

The symptom is that the CSR shows as approved but not issued

If you follow down the links in rancher/rancher#14041 you'll eventually end up at cockroachdb/cockroach#28075 (comment) which suggests:

# cluster.yaml file
services:
  kube-controller:
    extra_args:
      cluster-signing-cert-file: "/etc/kubernetes/ssl/kube-ca.pem"
      cluster-signing-key-file: "/etc/kubernetes/ssl/kube-ca-key.pem"

Once you add the extra arguments to the kube-controller you should do rke up which will redeploy/restart the necessary containers (don't use the --update-only argument). Essentially these were the steps I took afterwards:

#!/bin/bash

NAMESPACE=minio-operator  # or whatever namespace you use
# Go to the Rancher UI, get to a project, copy the project id from the url:
# i.e. https://domain.com/p/a-bbbbbb:c-dddddd/workloads
RANCHER_PROJECT_ID=a-bbbbbb:c-dddddd

# Remove the stale CSR and the old secrets so they get regenerated.
kubectl delete "csr/operator-$NAMESPACE-csr"
kubectl delete secrets --namespace="$NAMESPACE" $(kubectl get secrets --namespace="$NAMESPACE" | grep -iE "minio|console" | awk '{print $1}' | xargs)
sleep 5
kubectl minio delete --namespace "$NAMESPACE"

kubectl minio init --image minio/operator:v4.0.11 --namespace "$NAMESPACE"

# Associate the namespace with a rancher project.
if [[ $NAMESPACE != "default" && $NAMESPACE != "kube-system" && $RANCHER_PROJECT_ID != "" ]]; then
  kubectl annotate namespace "$NAMESPACE" "field.cattle.io/projectId=$RANCHER_PROJECT_ID" --overwrite
fi

kubectl minio tenant create "$NAMESPACE" --servers 4 --volumes 4 --capacity 10Gi --storage-class local-path --namespace "$NAMESPACE"

I waited for a minute or so and all worked as expected. :)
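Before reinstalling, it may be worth confirming that the controller manager actually picked up both signing flags after rke up. The sketch below only checks a string of arguments; the helper name has_signing_flags is made up, and the container name kube-controller-manager and the --key=value rendering of extra_args are assumptions about how RKE runs that component.

```shell
#!/bin/sh
# has_signing_flags: read a process argument list on stdin and succeed
# only if both cluster-signing flags are present.
has_signing_flags() {
  args=$(cat)
  case "$args" in
    *--cluster-signing-cert-file=*) ;;
    *) return 1 ;;
  esac
  case "$args" in
    *--cluster-signing-key-file=*) ;;
    *) return 1 ;;
  esac
  return 0
}

# Possible usage on a control-plane node (container name is an assumption):
#   docker inspect kube-controller-manager --format '{{.Args}}' \
#     | has_signing_flags && echo "signing flags present"
```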

@harshavardhana
Member

This needs to be reported upstream to rancher. Looks like quite a lot of other projects have been affected as well.

@harshavardhana
Member


Closing this as per this comment.

@harshavardhana
Member

@ravindk89 we may have to document this #632 (comment)

@ravindk89
Contributor

Acknowledged - We can integrate this into the dedicated platform-based installation documentation. Though this is a longstanding issue, so I'll prioritize something for Rancher.

@steve-todorov

@harshavardhana I agree - the RKE documentation is a bit confusing on this topic. You could have Rancher set up behind a LB (i.e. Traefik) which generates valid ingress certificates. This might mislead you into thinking you don't need to set any additional configuration parameters (which is what I mistakenly thought). Furthermore, these parameters aren't actually mentioned at all in the docs, which is weird.

I've opened a ticket there as well rancher/rke#2550

@ravindk89
Contributor

For those visiting this topic via github search, please see our docs for additional guidance.

In my recent testing with the latest stable K3s, I did not encounter issues provisioning TLS certificates. This might vary depending on how you set up K3s or Rancher.
