Installation stuck at PodInitializing #56

Closed
shishkin opened this issue Dec 19, 2022 · 18 comments

@shishkin

I've installed the chart as described in the guide, but the installation is stuck at PodInitializing:

❯ k get -n zitadel all
NAME                     READY   STATUS     RESTARTS   AGE
pod/crdb-0               1/1     Running    0          2m35s
pod/zitadel-init-nrfnq   0/1     Init:0/1   0          116s

NAME                  TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)              AGE
service/crdb-public   ClusterIP   10.43.27.85   <none>        26257/TCP,8080/TCP   2m35s
service/crdb          ClusterIP   None          <none>        26257/TCP,8080/TCP   2m35s

NAME                    READY   AGE
statefulset.apps/crdb   1/1     2m35s

NAME                                           SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob.batch/crdb-rotate-self-signer-client   0 0 */26 * *   False     0        <none>          2m35s

NAME                     COMPLETIONS   DURATION   AGE
job.batch/zitadel-init   0/1           116s       116s
❯ k logs -n zitadel pod/zitadel-init-nrfnq
Defaulted container "zitadel-init" out of: zitadel-init, chown (init)
Error from server (BadRequest): container "zitadel-init" in pod "zitadel-init-nrfnq" is waiting to start: PodInitializing

I had a successful test with the docker-compose setup, but now with the K8s charts I have no idea what is going wrong.
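(For completeness: the logs of the init container can be selected explicitly with the -c flag, although while the pod is still stuck in PodInitializing there may be nothing to show yet.)

# select the "chown" init container explicitly instead of the defaulted one
k logs -n zitadel pod/zitadel-init-nrfnq -c chown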

@eliobischof
Member

@shishkin, can you please share the events output from kubectl -n zitadel describe pod zitadel-init-nrfnq?

@shishkin
Author

shishkin commented Dec 20, 2022

Thanks for the hint @eliobischof. The init pod is not able to mount SSL certificate secrets for the DB:

Events:
  Type     Reason       Age               From               Message
  ----     ------       ----              ----               -------
  Normal   Scheduled    17s               default-scheduler  Successfully assigned zitadel/zitadel-init-k9724 to k3d-k3s-cluster-server-0
  Warning  FailedMount  2s (x6 over 17s)  kubelet            MountVolume.SetUp failed for volume "db-ssl-client-crt" : secret "crdb-client-secret" not found
  Warning  FailedMount  2s (x6 over 17s)  kubelet            MountVolume.SetUp failed for volume "db-ssl-root-crt" : secret "crdb-ca-secret" not found

I don't think these secrets are mentioned in the guide.

I'm installing the charts with the Terraform Helm provider instead of the Helm CLI. The values I provide are:

For CRDB:

{
  fullnameOverride: "crdb",
  conf: {
    "single-node": true,
  },
  statefulset: {
    replicas: 1,
  },
  tls: {
    enabled: false,
  },
}

For Zitadel:

{
  replicaCount: 1,
  zitadel: {
    masterkey: "...",
    configmapConfig: {
      ExternalSecure: false,
      TLS: {
        Enabled: false,
      },
    },
  },
  ingress: {
    enabled: true,
    hosts: [
      {
        host: "...",
        paths: [
          {
            path: "/",
            pathType: "Prefix",
          },
        ],
      },
    ],
  },
}

@eliobischof eliobischof self-assigned this Jan 3, 2023
@eliobischof
Member

Sorry, I forgot this one. Do you still have the issue?

@shishkin
Author

shishkin commented Jan 4, 2023

Do you still have the issue?

@eliobischof I got stuck with it and wasn't able to make progress. I would appreciate a hint about which minimal set of values is mandatory to get Zitadel running.

@eliobischof
Member

The cockroach chart should have created certificate secrets. Apparently, they are not called crdb-ca-secret and crdb-client-secret, which are the names the zitadel pods try to mount.

You can configure the cert secret names as references in the zitadel chart's values.yaml: https://github.com/zitadel/zitadel-charts/blob/main/charts/zitadel/values.yaml#L59.
You should be able to see their names with kubectl get -n zitadel secrets.
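As an illustration only, the override could look roughly like the sketch below. The value keys shown are assumptions, so verify them against the linked values.yaml before using them:

# list the secrets the cockroach chart actually created
kubectl get -n zitadel secrets

# then reference them in the zitadel chart values (key names are illustrative assumptions)
zitadel:
  dbSslRootCrtSecret: "<name of the CA secret>"
  dbSslClientCrtSecret: "<name of the client certificate secret>"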

@hifabienne
Member

@shishkin Did Elio's proposal work for you?

@shishkin
Author

I tried to deploy it again, still with no success.

The cockroach chart should have created certificate secrets. Apparently, they are not called crdb-ca-secret and crdb-client-secret, as the zitadel pods try to mount them.

After setting the CRDB chart value tls.enabled: true, I do see the secrets created, and they're named as zitadel expects:

❯ k get -n zitadel secrets
NAME                                TYPE                 DATA   AGE
crdb-ca-secret                      Opaque               2      23m
crdb-client-secret                  kubernetes.io/tls    3      23m
crdb-node-secret                    kubernetes.io/tls    3      23m

Still, the zitadel init job fails:

❯ k describe -n zitadel pod/zitadel-init-6bjmj
Name:             zitadel-init-6bjmj
Namespace:        zitadel
Priority:         0
Service Account:  zitadel
Node:             k3d-k3s-cluster-server-0/172.20.0.3
Start Time:       Mon, 23 Jan 2023 10:53:55 +0100
Labels:           app.kubernetes.io/component=init
                  app.kubernetes.io/instance=zitadel
                  app.kubernetes.io/name=zitadel
                  controller-uid=d85138b0-722e-462a-a9df-06083890d7d4
                  job-name=zitadel-init
Annotations:      <none>
Status:           Running
IP:               10.42.0.7
IPs:
  IP:           10.42.0.7
Controlled By:  Job/zitadel-init
Init Containers:
  chown:
    Container ID:  containerd://ed4fdff7ffff449b347e2fdcc8aeacdaaeabc3c63bd6c0a3debd78a3aceb1520
    Image:         alpine:3.11
    Image ID:      docker.io/library/alpine@sha256:bcae378eacedab83da66079d9366c8f5df542d7ed9ab23bf487e3e1a8481375d
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
    Args:
      cp /db-ssl-client-crt/* /chowned-secrets/ && cp /db-ssl-root-crt/* /chowned-secrets/ &&  chown -R 1000:1000 /chowned-secrets/* && chmod 400 /chowned-secrets/*
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 23 Jan 2023 10:54:00 +0100
      Finished:     Mon, 23 Jan 2023 10:54:00 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /chowned-secrets from chowned-secrets (rw)
      /db-ssl-client-crt from db-ssl-client-crt (rw)
      /db-ssl-root-crt from db-ssl-root-crt (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kkmld (ro)
Containers:
  zitadel-init:
    Container ID:  containerd://20a0a47eacefece99c302dde1852a56de8196a30610a147771b45d11a5c8abd4
    Image:         ghcr.io/zitadel/zitadel:v2.15.0
    Image ID:      ghcr.io/zitadel/zitadel@sha256:446b3fb7613b2b88851ba0319add33596913ffe3abf2b6432d46cd86a91023e2
    Port:          <none>
    Host Port:     <none>
    Args:
      init
      --config
      /config/zitadel-config-yaml
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Mon, 23 Jan 2023 10:54:48 +0100
      Finished:     Mon, 23 Jan 2023 10:54:48 +0100
    Ready:          False
    Restart Count:  3
    Environment:
      POD_IP:                                          (v1:status.podIP)
      ZITADEL_DATABASE_COCKROACH_USER_SSL_ROOTCERT:   /.secrets/ca.crt
      ZITADEL_DATABASE_COCKROACH_ADMIN_SSL_ROOTCERT:  /.secrets/ca.crt
      ZITADEL_DATABASE_COCKROACH_ADMIN_SSL_CERT:      /.secrets/tls.crt
      ZITADEL_DATABASE_COCKROACH_ADMIN_SSL_KEY:       /.secrets/tls.key
    Mounts:
      /.secrets from chowned-secrets (rw)
      /config from zitadel-config-yaml (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kkmld (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  zitadel-config-yaml:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      zitadel-config-yaml
    Optional:  false
  db-ssl-root-crt:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  crdb-ca-secret
    Optional:    false
  db-ssl-client-crt:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  crdb-client-secret
    Optional:    false
  chowned-secrets:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-kkmld:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  79s                default-scheduler  Successfully assigned zitadel/zitadel-init-6bjmj to k3d-k3s-cluster-server-0
  Normal   Pulling    79s                kubelet            Pulling image "alpine:3.11"
  Normal   Pulled     74s                kubelet            Successfully pulled image "alpine:3.11" in 4.327597044s
  Normal   Created    74s                kubelet            Created container chown
  Normal   Started    74s                kubelet            Started container chown
  Normal   Pulling    73s                kubelet            Pulling image "ghcr.io/zitadel/zitadel:v2.15.0"
  Normal   Pulled     65s                kubelet            Successfully pulled image "ghcr.io/zitadel/zitadel:v2.15.0" in 8.327137128s
  Normal   Pulled     26s (x3 over 64s)  kubelet            Container image "ghcr.io/zitadel/zitadel:v2.15.0" already present on machine
  Normal   Created    26s (x4 over 65s)  kubelet            Created container zitadel-init
  Normal   Started    26s (x4 over 65s)  kubelet            Started container zitadel-init
  Warning  BackOff    1s (x7 over 63s)   kubelet            Back-off restarting failed container

@eliobischof
Member

Sorry @shishkin, I was on vacation. Can you show me the log output of the init job's container, please?

@shishkin
Author

shishkin commented Feb 9, 2023

This is the log from the last run:

Defaulted container "zitadel-init" out of: zitadel-init, chown (init)
time="2023-02-09T15:44:40Z" level=info msg="initialization started" caller="/home/runner/work/zitadel/zitadel/cmd/initialise/init.go:72"
time="2023-02-09T15:44:40Z" level=info msg="verify user" caller="/home/runner/work/zitadel/zitadel/cmd/initialise/verify_user.go:38" username=zitadel
time="2023-02-09T15:44:40Z" level=info msg="verify database" caller="/home/runner/work/zitadel/zitadel/cmd/initialise/verify_database.go:38" database=zitadel
time="2023-02-09T15:44:40Z" level=info msg="verify grant" caller="/home/runner/work/zitadel/zitadel/cmd/initialise/verify_grant.go:33" database=zitadel user=zitadel
time="2023-02-09T15:44:40Z" level=info msg="verify zitadel" caller="/home/runner/work/zitadel/zitadel/cmd/initialise/verify_zitadel.go:69" database=zitadel
time="2023-02-09T15:44:40Z" level=fatal msg="unable to initialize ZITADEL" caller="/home/runner/work/zitadel/zitadel/cmd/initialise/init.go:68" error="ID=DATAB-0pIWD Message=Errors.Database.Connection.Failed Parent=(failed to connect to `host=crdb-public user=zitadel database=zitadel`: failed SASL auth (ERROR: password authentication failed for user zitadel (SQLSTATE 28P01)))"

Seems like zitadel is not using the credentials that CRDB created.
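One way to rule that out is to pass the database password to ZITADEL explicitly. A minimal sketch, assuming the chart reads it from zitadel.secretConfig (the same key path as in the helmfile example further down in this thread) and assuming the password matches what was provisioned for the zitadel user in CockroachDB:

zitadel:
  secretConfig:
    Database:
      cockroach:
        User:
          Password: "password"  # must match the password of the zitadel database user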

@shishkin
Author

shishkin commented Feb 9, 2023

Maybe you could provide a tested and working example because the one in the docs doesn't seem to work?

@eliobischof
Member

@shishkin CockroachDB is not embedded in the ZITADEL chart anymore, so passing crdb values no longer has any effect. We fixed that in the docs.

How did you deploy CockroachDB, and how do you pass the credentials to ZITADEL?

@eliobischof
Member

eliobischof commented May 26, 2023

@shishkin I think I now get your problem. I'll close this one in favor of #91. Please track and participate in #91.

@shishkin
Author

@eliobischof yes, "an example for a fast PoC" captures it perfectly. Looking forward to that.

@Congee

Congee commented Jun 26, 2023

Note that the documentation at https://zitadel.com/docs/self-hosting/deploy/kubernetes#setup-zitadel-and-a-human-admin is outdated. I encountered the exact same problem described in this thread.

@Congee

Congee commented Jun 26, 2023

For those who may step into this pit again, the issue of

MountVolume.SetUp failed for volume "db-ssl-client-crt" : secret "crdb-client-secret" not found

is probably caused by a race condition. Wait a while until "crdb-client-secret" eventually becomes available.

Then you may still see an authentication failure. Somehow the zitadel-init job fails to create the user "zitadel" with a password. Taking inspiration from #25, manually creating the user zitadel with a password worked for me.
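A rough sketch of that manual step (the pod name is taken from earlier output in this thread, but the certs directory and flags are assumptions and may differ depending on the CockroachDB chart version and TLS setup):

# open a SQL shell in the CockroachDB pod and create the user the zitadel chart expects
kubectl exec -n zitadel -it crdb-0 -- \
  cockroach sql --certs-dir=/cockroach/cockroach-certs \
  -e "CREATE USER IF NOT EXISTS zitadel WITH PASSWORD 'password';"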

@shishkin It is very interesting that line 65, with the result of initialise(...), did not emit an error, but line 68, with verifyZitadel(...), did. This seems fishy. It is even more problematic that the aforementioned user zitadel wasn't created in the preceding VerifyUser(...) step.

https://github.com/zitadel/zitadel/blob/2c6a2a376c6e66d659e25bca461265e0a125c557/cmd/initialise/init.go#L59-L69

As a user rather than a contributor, my exploration ends here. But hopefully this comment gives you some pointers for fixing related issues.

@Congee

Congee commented Jun 26, 2023

A working helmfile definition is:

repositories:
  - name: cockroachdb 
    url: https://charts.cockroachdb.com/
  - name: zitadel 
    url: https://charts.zitadel.com

releases:
  - name: cockroachdb
    chart: cockroachdb/cockroachdb
    set:
      - name: fullnameOverride
        value: crdb
      - name: conf.single-node
        value: true
      - name: statefulset.replicas
        value: 1
      - name: init.provisioning.enabled
        value: true
    values:
      - init:
          provisioning:
            users:
              - name: zitadel
                password: password
                options: [LOGIN]

  - name: zitadel
    needs: [cockroachdb]
    chart:  zitadel/zitadel
    atomic: true
    set:
      - name: zitadel.masterkey
        value: "MasterkeyNeedsToHave32Characters"  # 32 chars
      - name: zitadel.configmapConfig.ExternalSecure
        value: false
      - name: zitadel.configmapConfig.TLS.Enabled
        value: false
      - name: zitadel.configmapConfig.Database.cockroach.Host
        value: crdb-public.default
      - name: zitadel.secretConfig.Database.cockroach.User.Password
        value: "password"
      - name: replicaCount
        value: 1

@shishkin
Author

Thanks @Congee, I will try your example.

@shishkin
Author

I've tried that, but I still get an error from the zitadel initializer pod:

│ time="2023-06-27T19:56:18Z" level=info msg="initialization started" caller="/home/runner/work/zitadel/zitadel/cmd/initialise/init.go:72"           │
│ time="2023-06-27T19:56:18Z" level=fatal msg="unable to initialize the database" caller="/home/runner/work/zitadel/zitadel/cmd/initialise/init.go:6 │
