OLM operator installation stuck for v0.5.0 #429

czunker · 2022-07-05T12:18:03Z

Describe the bug
The bundle container is stuck in CrashLoopBackOff.
Error from the logs:

mkdir: can't create directory '/database': Permission denied

To Reproduce
Steps to reproduce the behavior:

1. Start a fresh minikube.
2. operator-sdk olm install
3. kubectl create ns mondoo-operator
4. operator-sdk run bundle ghcr.io/mondoohq/mondoo-operator-bundle:v0.5.0 --namespace mondoo-operator --timeout 3m0s
5. Check the Pods in the mondoo-operator namespace: kubectl -n mondoo-operator get pod
6. Get the logs of ghcr-io-mondoohq-mondoo-operator-bundle-v0-5-0: kubectl -n mondoo-operator logs ghcr-io-mondoohq-mondoo-operator-bundle-v0-5-0

Expected behavior
Pod should start.

Screenshots

k get pod -A
NAMESPACE         NAME                                             READY   STATUS    RESTARTS        AGE
kube-system       coredns-64897985d-w4nlc                          1/1     Running   0               2m49s
kube-system       etcd-minikube                                    1/1     Running   0               3m3s
kube-system       kube-apiserver-minikube                          1/1     Running   0               3m4s
kube-system       kube-controller-manager-minikube                 1/1     Running   0               3m3s
kube-system       kube-proxy-r9f4l                                 1/1     Running   0               2m49s
kube-system       kube-scheduler-minikube                          1/1     Running   0               3m3s
kube-system       storage-provisioner                              1/1     Running   1 (2m19s ago)   3m2s
mondoo-operator   ghcr-io-mondoohq-mondoo-operator-bundle-v0-5-0   0/1     Error     3 (33s ago)     61s
olm               catalog-operator-67bcbb4f5d-jjsck                1/1     Running   0               2m5s
olm               olm-operator-d9f76fdc9-2pm47                     1/1     Running   0               2m5s
olm               operatorhubio-catalog-jxwg8                      1/1     Running   0               118s
olm               packageserver-685694fddd-g2hxh                   1/1     Running   0               117s
olm               packageserver-685694fddd-lmnb8                   1/1     Running   0               117s

Additional context
PR #424 was merged, but the securityContext of the failed Pod does not reflect it. The Pod-level securityContext is empty and the container sets none:

kubectl -n mondoo-operator get po ghcr-io-mondoohq-mondoo-operator-bundle-v0-5-0 -o yaml 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2022-07-05T12:04:59Z"
  name: ghcr-io-mondoohq-mondoo-operator-bundle-v0-5-0
  namespace: mondoo-operator
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    name: mondoo-operator-catalog
    uid: 4451e898-54ca-485f-9693-cde291f57daf
  resourceVersion: "957"
  uid: 28aa6f3b-91fa-4993-9a1b-697aaaa5ddef
spec:
  containers:
  - command:
    - sh
    - -c
    - |
      mkdir -p /database && \
      opm registry add -d /database/index.db -b ghcr.io/mondoohq/mondoo-operator-bundle:v0.5.0 --mode=semver --skip-tls-verify=false --use-http=false && \
      opm registry serve -d /database/index.db -p 50051
    image: quay.io/operator-framework/opm:latest
    imagePullPolicy: Always
    name: registry-grpc
    ports:
    - containerPort: 50051
      name: grpc
      protocol: TCP
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-xc9kq
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: minikube
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-xc9kq
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-07-05T12:04:59Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-07-05T12:04:59Z"
    message: 'containers with unready status: [registry-grpc]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-07-05T12:04:59Z"
    message: 'containers with unready status: [registry-grpc]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-07-05T12:04:59Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://4a9c7a8bc05d5b87a77eaa43caf3bdbf22c094c54f82d807ffecdac63e393951
    image: quay.io/operator-framework/opm:latest
    imageID: docker-pullable://quay.io/operator-framework/opm@sha256:fda1d54edaf76df5baad31dc01954bb0112d36743cfe5b70d2d22fcc314ab36c
    lastState:
      terminated:
        containerID: docker://4a9c7a8bc05d5b87a77eaa43caf3bdbf22c094c54f82d807ffecdac63e393951
        exitCode: 1
        finishedAt: "2022-07-05T12:06:36Z"
        reason: Error
        startedAt: "2022-07-05T12:06:36Z"
    name: registry-grpc
    ready: false
    restartCount: 4
    started: false
    state:
      waiting:
        message: back-off 1m20s restarting failed container=registry-grpc pod=ghcr-io-mondoohq-mondoo-operator-bundle-v0-5-0_mondoo-operator(28aa6f3b-91fa-4993-9a1b-697aaaa5ddef)
        reason: CrashLoopBackOff
  hostIP: 192.168.49.2
  phase: Running
  podIP: 172.17.0.8
  podIPs:
  - ip: 172.17.0.8
  qosClass: BestEffort
  startTime: "2022-07-05T12:04:59Z"
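
The full manifest is long, and only a few fields matter for this bug. As a rough sketch, the filter below keeps just the image and security-related lines from a Pod manifest read on stdin. `pod_security_summary` is a hypothetical helper name, and the match is purely line-based, so it is not a real YAML parser.

```shell
#!/bin/sh
# Hypothetical helper: print only the image and securityContext-related
# lines of a Pod manifest read from stdin. Line-based matching only.
pod_security_summary() {
  grep -E '^[[:space:]]*(-[[:space:]]+)?(image|securityContext|runAsUser|runAsNonRoot):'
}

# Example with an inline manifest fragment taken from the output above.
cat <<'EOF' | pod_security_summary
spec:
  containers:
  - name: registry-grpc
    image: quay.io/operator-framework/opm:latest
    imagePullPolicy: Always
  securityContext: {}
EOF
```

Against a live cluster, the same filter could be fed from `kubectl -n mondoo-operator get po ghcr-io-mondoohq-mondoo-operator-bundle-v0-5-0 -o yaml`.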


czunker · 2022-07-06T03:56:33Z

The container is running with uid 1001 and this user cannot create the directory:

/ $ id
uid=1001 gid=0(root)
/ $ ls -l
total 60
drwxr-xr-x    1 root     root          4096 Jul  4 13:58 bin
drwxr-xr-x    2 root     root          4096 Jan  1  1970 boot
drwxr-xr-x    2 root     root         12288 Jan  1  1970 busybox
drwxr-xr-x    5 root     root           360 Jul  6 03:51 dev
drwxr-xr-x    1 root     root          4096 Jul  6 03:51 etc
drwxr-xr-x    3 nonroot  nonroot       4096 Jan  1  1970 home
drwxr-xr-x    2 root     root          4096 Jan  1  1970 lib
dr-xr-xr-x  524 root     root             0 Jul  6 03:51 proc
drwx------    2 root     root          4096 Jan  1  1970 root
drwxr-xr-x    2 root     root          4096 Jan  1  1970 run
drwxr-xr-x    2 root     root          4096 Jan  1  1970 sbin
dr-xr-xr-x   13 root     root             0 Jul  6 03:51 sys
drwxrwxrwt    2 root     root          4096 Jan  1  1970 tmp
drwxr-xr-x    9 root     root          4096 Jan  1  1970 usr
drwxr-xr-x    1 root     root          4096 Jan  1  1970 var
/ $ mkdir -p /database
mkdir: can't create directory '/database': Permission denied
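
To confirm this kind of failure without running the real command, one option is to test whether the current user may write to the parent directory. The sketch below uses a hypothetical helper (`can_create_dir` is not part of opm or OLM). It only inspects the immediate parent, which is enough for `/database` because its parent is `/`.

```shell
#!/bin/sh
# Hypothetical helper, not part of opm or OLM.
# Succeeds if the current user could create the directory "$1",
# judged only by its immediate parent: the parent must exist and be
# writable and searchable by the current user.
can_create_dir() {
  parent=$(dirname "$1")
  [ -d "$parent" ] && [ -w "$parent" ] && [ -x "$parent" ]
}

# Report the effective uid and whether /database could be created.
if can_create_dir /database; then
  echo "uid $(id -u) can create /database"
else
  echo "uid $(id -u) cannot create /database"
fi
```

Run as uid 0 this normally reports success. Run as an unprivileged uid against a root-owned `/` with mode `drwxr-xr-x`, it should report failure, which matches the `Permission denied` error above.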

czunker · 2022-07-06T04:03:12Z

The cause seems to be the latest opm image: the `latest` tag currently references v1.23.2.

Starting the Pod with quay.io/operator-framework/opm:v1.23.1, the container is running as root and can create the directory:

/ # id
uid=0(root) gid=0(root)
/ # mkdir -p /database
/ # ls -l
total 64
drwxr-xr-x    1 root     root          4096 Jul  1 14:08 bin
drwxr-xr-x    2 root     root          4096 Jan  1  1970 boot
drwxr-xr-x    2 root     root         12288 Jan  1  1970 busybox
drwxr-xr-x    2 root     root          4096 Jul  6 04:00 database
drwxr-xr-x    5 root     root           360 Jul  6 03:59 dev
...
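
Since the regression arrived through the floating `latest` tag, a guard in CI could reject unpinned image references. This is a minimal sketch under that assumption. `is_pinned_image` is a hypothetical helper name, and the check is purely textual (it does not contact a registry).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the image reference carries a digest
# or an explicit tag other than "latest".
is_pinned_image() {
  ref=$1
  name=${ref##*/}            # last path component, e.g. "opm:v1.23.1"
  case "$name" in
    *@sha256:*) return 0 ;;  # pinned by digest
    *:latest)   return 1 ;;  # floating tag
    *:*)        return 0 ;;  # explicit version tag
    *)          return 1 ;;  # no tag at all resolves to "latest"
  esac
}

# Classify the two image references discussed in this issue.
for ref in quay.io/operator-framework/opm:latest \
           quay.io/operator-framework/opm:v1.23.1; do
  if is_pinned_image "$ref"; then
    echo "pinned:   $ref"
  else
    echo "floating: $ref"
  fi
done
```

Only the last path component is inspected, so a registry port such as `localhost:5000/opm` is not mistaken for a tag.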

czunker · 2022-07-06T07:01:26Z

Upstream issue: operator-framework/operator-registry#984

czunker self-assigned this Jul 5, 2022

czunker added the bug Something isn't working label Jul 5, 2022

czunker added a commit that referenced this issue Jul 6, 2022

🐛 Pin opm image to prevent permission denied errors

1d6d279

Workaround for operator-framework/operator-registry#984 Fixes #429 Signed-off-by: Christian Zunker <christian@mondoo.com>

czunker mentioned this issue Jul 6, 2022

🐛 Pin opm image to prevent permission denied errors #432

Merged

chris-rock closed this as completed in #432 Jul 7, 2022

chris-rock pushed a commit that referenced this issue Jul 7, 2022

🐛 Pin opm image to prevent permission denied errors (#432)

df63b7e

Workaround for operator-framework/operator-registry#984 Fixes #429 Signed-off-by: Christian Zunker <christian@mondoo.com>

czunker added a commit that referenced this issue Jul 7, 2022

🐛 Pin opm image in GitHub actions

efb7b65

This prevents `permission denied` errors. Workaround for operator-framework/operator-registry#984 Fixes #429 Signed-off-by: Christian Zunker <christian@mondoo.com>

czunker mentioned this issue Jul 7, 2022

🐛 Pin opm image in GitHub actions #434

Merged

czunker added a commit that referenced this issue Jul 7, 2022

🐛 Pin opm image in GitHub actions (#434)

48f3332

This prevents `permission denied` errors. Workaround for operator-framework/operator-registry#984 Fixes #429 Signed-off-by: Christian Zunker <christian@mondoo.com>
