
simple ClickHouse instance with PVC Installation is failing while starting pod #1464

Closed
Rajpratik71 opened this issue Jul 24, 2024 · 24 comments


@Rajpratik71

The operator install succeeded.

When I tried to deploy a ClickHouse instance using the YAML below:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "clickhouse-olap"
spec:
  defaults:
    templates:
      dataVolumeClaimTemplate: data-volume-template
      logVolumeClaimTemplate: log-volume-template
  configuration:
    clusters:
      - name: "clickhouse-olap"
        layout:
          shardsCount: 1
          replicasCount: 1
  templates:
    volumeClaimTemplates:
      - name: data-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Gi
      - name: log-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Gi

I get the following error while the pod is starting:

pratikraj@Pratiks-MacBook-Pro common % oc -n clickhouse get po,svc,pvc,ClickHouseInstallation                          
NAME                                            READY   STATUS             RESTARTS       AGE
pod/chi-clickhouse-olap-clickhouse-olap-0-0-0   1/2     CrashLoopBackOff   10 (72s ago)   27m

NAME                                              TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
service/chi-clickhouse-olap-clickhouse-olap-0-0   ClusterIP   None         <none>        9000/TCP,8123/TCP,9009/TCP   22m
service/clickhouse-clickhouse-olap                ClusterIP   None         <none>        8123/TCP,9000/TCP            12m

NAME                                                                                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
persistentvolumeclaim/data-volume-template-chi-clickhouse-olap-clickhouse-olap-0-0-0   Bound    pvc-6e2da356-cd7d-4c39-98be-a047235b1fc4   100Gi      RWO            rook-ceph-block   27m
persistentvolumeclaim/log-volume-template-chi-clickhouse-olap-clickhouse-olap-0-0-0    Bound    pvc-f2bd4c06-32ef-411f-aeb8-1b6b417fc457   100Gi      RWO            rook-ceph-block   27m

NAME                                                             CLUSTERS   HOSTS   STATUS      HOSTS-COMPLETED   AGE
clickhouseinstallation.clickhouse.altinity.com/clickhouse-olap   1          1       Completed                     28m
pratikraj@Pratiks-MacBook-Pro common % 
pratikraj@Pratiks-MacBook-Pro common % 
pratikraj@Pratiks-MacBook-Pro common % oc -n clickhouse logs -f --tail=10 pod/chi-clickhouse-olap-clickhouse-olap-0-0-0
Defaulted container "clickhouse" out of: clickhouse, clickhouse-log
Code: 36. DB::Exception: Group 0 is not found in the system. (BAD_ARGUMENTS) (version 24.6.2.17 (official build))
Couldn't create necessary directory: /var/lib/clickhouse/
pratikraj@Pratiks-MacBook-Pro common % 
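The `Group 0 is not found in the system` error comes from a failed group lookup at server startup. One plausible mechanism (an assumption, not confirmed later in the thread) is that the arbitrary UID/GID injected by the cluster has no entry in the container's group database, so lookups against it fail. A minimal sketch of such a lookup on Linux (the helper name and GIDs are illustrative):

```python
import grp

def group_exists(gid: int) -> bool:
    """Return True if the numeric GID has an entry in the group database."""
    try:
        grp.getgrgid(gid)
        return True
    except KeyError:
        return False

# GID 0 (root's group) exists on a stock Linux image, while an arbitrary
# OpenShift-assigned GID such as 1000730000 normally has no entry, so a
# lookup against it fails in the same style as the server's startup did.
print(group_exists(0))           # True on a standard Linux system
print(group_exists(1000730000))
```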
@Slach
Collaborator

Slach commented Jul 24, 2024

Let's check: is /var/lib/clickhouse present in the pod mounts?
Could you share?
oc -n clickhouse get pod chi-clickhouse-olap-clickhouse-olap-0-0-0 -o yaml

@Rajpratik71
Author

It is there:

pratikraj@Pratiks-MacBook-Pro common % oc get pod -n clickhouse chi-clickhouse-olap-clickhouse-olap-0-0-0 -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["10.254.22.217/22"],"mac_address":"0a:58:0a:fe:16:d9","gateway_ips":["10.254.20.1"],"routes":[{"dest":"10.254.0.0/16","nextHop":"10.254.20.1"},{"dest":"172.30.0.0/16","nextHop":"10.254.20.1"},{"dest":"100.64.0.0/16","nextHop":"10.254.20.1"}],"ip_address":"10.254.22.217/22","gateway_ip":"10.254.20.1"}}'
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "ovn-kubernetes",
          "interface": "eth0",
          "ips": [
              "10.254.22.217"
          ],
          "mac": "0a:58:0a:fe:16:d9",
          "default": true,
          "dns": {}
      }]
    openshift.io/scc: restricted-v2
    seccomp.security.alpha.kubernetes.io/pod: runtime/default
  creationTimestamp: "2024-07-24T19:42:43Z"
  generateName: chi-clickhouse-olap-clickhouse-olap-0-0-
  labels:
    clickhouse.altinity.com/app: chop
    clickhouse.altinity.com/chi: clickhouse-olap
    clickhouse.altinity.com/cluster: clickhouse-olap
    clickhouse.altinity.com/namespace: clickhouse
    clickhouse.altinity.com/ready: "yes"
    clickhouse.altinity.com/replica: "0"
    clickhouse.altinity.com/shard: "0"
    controller-revision-hash: chi-clickhouse-olap-clickhouse-olap-0-0-6d59bbb7c
    statefulset.kubernetes.io/pod-name: chi-clickhouse-olap-clickhouse-olap-0-0-0
  name: chi-clickhouse-olap-clickhouse-olap-0-0-0
  namespace: clickhouse
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: StatefulSet
    name: chi-clickhouse-olap-clickhouse-olap-0-0
    uid: 3410fc06-876c-4748-814c-c4b511685b60
  resourceVersion: "1682310"
  uid: 25afed6c-7481-4284-99a3-97caf8516279
spec:
  containers:
  - image: clickhouse/clickhouse-server:latest
    imagePullPolicy: Always
    livenessProbe:
      failureThreshold: 10
      httpGet:
        path: /ping
        port: http
        scheme: HTTP
      initialDelaySeconds: 60
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    name: clickhouse
    ports:
    - containerPort: 9000
      name: tcp
      protocol: TCP
    - containerPort: 8123
      name: http
      protocol: TCP
    - containerPort: 9009
      name: interserver
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /ping
        port: http
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 3
      successThreshold: 1
      timeoutSeconds: 1
    resources: {}
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1000730000
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/clickhouse-server/config.d/
      name: chi-clickhouse-olap-common-configd
    - mountPath: /etc/clickhouse-server/users.d/
      name: chi-clickhouse-olap-common-usersd
    - mountPath: /etc/clickhouse-server/conf.d/
      name: chi-clickhouse-olap-deploy-confd-clickhouse-olap-0-0
    - mountPath: /var/lib/clickhouse
      name: data-volume-template
    - mountPath: /var/log/clickhouse-server
      name: log-volume-template
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-r2djq
      readOnly: true
  - args:
    - while true; do sleep 30; done;
    command:
    - /bin/sh
    - -c
    - --
    image: registry.access.redhat.com/ubi8/ubi-minimal:latest
    imagePullPolicy: Always
    name: clickhouse-log
    resources: {}
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1000730000
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/lib/clickhouse
      name: data-volume-template
    - mountPath: /var/log/clickhouse-server
      name: log-volume-template
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-r2djq
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostAliases:
  - hostnames:
    - chi-clickhouse-olap-clickhouse-olap-0-0
    ip: 127.0.0.1
  hostname: chi-clickhouse-olap-clickhouse-olap-0-0-0
  imagePullSecrets:
  - name: default-dockercfg-v9z72
  nodeName: worker1.gi-tracing-poc.cp.fyre.ibm.com
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 1000730000
    seLinuxOptions:
      level: s0:c27,c14
    seccompProfile:
      type: RuntimeDefault
  serviceAccount: default
  serviceAccountName: default
  subdomain: chi-clickhouse-olap-clickhouse-olap-0-0
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: data-volume-template
    persistentVolumeClaim:
      claimName: data-volume-template-chi-clickhouse-olap-clickhouse-olap-0-0-0
  - name: log-volume-template
    persistentVolumeClaim:
      claimName: log-volume-template-chi-clickhouse-olap-clickhouse-olap-0-0-0
  - configMap:
      defaultMode: 420
      name: chi-clickhouse-olap-common-configd
    name: chi-clickhouse-olap-common-configd
  - configMap:
      defaultMode: 420
      name: chi-clickhouse-olap-common-usersd
    name: chi-clickhouse-olap-common-usersd
  - configMap:
      defaultMode: 420
      name: chi-clickhouse-olap-deploy-confd-clickhouse-olap-0-0
    name: chi-clickhouse-olap-deploy-confd-clickhouse-olap-0-0
  - name: kube-api-access-r2djq
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
      - configMap:
          items:
          - key: service-ca.crt
            path: service-ca.crt
          name: openshift-service-ca.crt
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-07-24T19:42:44Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2024-07-24T19:42:44Z"
    message: 'containers with unready status: [clickhouse]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2024-07-24T19:42:44Z"
    message: 'containers with unready status: [clickhouse]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2024-07-24T19:42:44Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://3a9e1ba9e1cb22a9f9a3b962500d244fe0717c8471848c177f60ac059fec089c
    image: docker.io/clickhouse/clickhouse-server:latest
    imageID: docker.io/clickhouse/clickhouse-server@sha256:00d808c094fa0e790b662f4ee5b7a7476c990c79907c997ac2a1484a8833ab70
    lastState:
      terminated:
        containerID: cri-o://3a9e1ba9e1cb22a9f9a3b962500d244fe0717c8471848c177f60ac059fec089c
        exitCode: 1
        finishedAt: "2024-07-24T20:19:41Z"
        reason: Error
        startedAt: "2024-07-24T20:19:40Z"
    name: clickhouse
    ready: false
    restartCount: 12
    started: false
    state:
      waiting:
        message: back-off 5m0s restarting failed container=clickhouse pod=chi-clickhouse-olap-clickhouse-olap-0-0-0_clickhouse(25afed6c-7481-4284-99a3-97caf8516279)
        reason: CrashLoopBackOff
  - containerID: cri-o://b2c34c6fcabcbe6aa3b97512ea692bb3453ae012b09404a9d2c29456b9885760
    image: registry.access.redhat.com/ubi8/ubi-minimal:latest
    imageID: registry.access.redhat.com/ubi8/ubi-minimal@sha256:a6e546ff72e0eca114e0bfee08aa5b1bba726fc3986a8fa1e453629e054c4357
    lastState: {}
    name: clickhouse-log
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2024-07-24T19:42:49Z"
  hostIP: 10.17.60.70
  phase: Running
  podIP: 10.254.22.217
  podIPs:
  - ip: 10.254.22.217
  qosClass: BestEffort
  startTime: "2024-07-24T19:42:44Z"

@Slach
Collaborator

Slach commented Jul 25, 2024

Which component is responsible for the second container with sleep 30 and the securityContext?
I don't see where exactly you configured it; it looks like these customizations were added outside of clickhouse-operator.

I think the root cause is

    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1000730000

too strict a security context.

Could we change it to the following, to match the UID used inside the Docker image?

    securityContext:
      runAsUser: 101
      runAsGroup: 101
      fsGroup: 101
      allowPrivilegeEscalation: false
      capabilities:
        drop: [ "ALL" ]
        add: [ "SYS_NICE", "IPC_LOCK" ]

If we can't change the securityContext, could we add

  env:
  - name: CLIKCHOUSE_UID
    value: 1000730000  

to podTemplate?

Something like this should work:

spec:
  defaults:
    templates:
      podTemlate: custom-uid

  templates:
    podTemplates:
    - name: custom-uid
      spec:
        containers:
        - name: clickhouse
          image: clickhouse/clickhouse-server:latest
          env:
          - name: CLIKCHOUSE_UID
            value: 1000730000  
          - name: CLIKCHOUSE_GID
            value: 1000730000              

@Rajpratik71
Author

I tried the snippet below, but adding it resulted in the instance creation staying pending.

spec:
  defaults:
    templates:
      podTemlate: custom-uid

  templates:
    podTemplates:
    - name: custom-uid
      spec:
        containers:
        - name: clickhouse
          image: clickhouse/clickhouse-server:latest
          env:
          - name: CLIKCHOUSE_UID
            value: 1000730000  
          - name: CLIKCHOUSE_GID
            value: 1000730000              

Even after 3 minutes, it remains in Pending state.


While without the above snippet, it got created in a few seconds.


@Slach
Collaborator

Slach commented Jul 25, 2024

Which clickhouse-operator version do you use? Could you share the output of:

oc get pods --all-namespaces -l app=clickhouse-operator -o jsonpath="{.items[*].spec.containers[*].image}"

@Rajpratik71
Author

altinity/clickhouse-operator:0.23.6 altinity/metrics-exporter:0.23.6

@Slach
Collaborator

Slach commented Jul 25, 2024

Apply the changes from #1464 (comment) again, and when the status is InProgress, share the Events section of:

oc describe chi -n clickhouse clickhouse-olap

@Rajpratik71
Author

No success.

Even after 65m, no update.

pratikraj@Pratiks-MacBook-Pro common % oc -n clickhouse get po,svc,pvc,ClickHouseInstallation
NAME                                                             CLUSTERS   HOSTS   STATUS   HOSTS-COMPLETED   AGE
clickhouseinstallation.clickhouse.altinity.com/clickhouse-olap                                                 65m
pratikraj@Pratiks-MacBook-Pro common % 
pratikraj@Pratiks-MacBook-Pro common % oc describe chi -n clickhouse clickhouse-olap         
Name:         clickhouse-olap
Namespace:    clickhouse
Labels:       <none>
Annotations:  <none>
API Version:  clickhouse.altinity.com/v1
Kind:         ClickHouseInstallation
Metadata:
  Creation Timestamp:  2024-07-25T12:47:20Z
  Generation:          1
  Resource Version:    3999789
  UID:                 20e63d44-dc3d-4b29-a850-6b688637655f
Spec:
  Configuration:
    Clusters:
      Layout:
        Replicas Count:  1
        Shards Count:    1
      Name:              clickhouse-olap
  Defaults:
    Templates:
      Data Volume Claim Template:  data-volume-template
      Log Volume Claim Template:   log-volume-template
  Templates:
    Pod Templates:
      Name:  custom-uid
      Spec:
        Containers:
          Env:
            Name:   CLIKCHOUSE_UID
            Value:  1000730000
            Name:   CLIKCHOUSE_GID
            Value:  1000730000
          Image:    clickhouse/clickhouse-server:latest
          Name:     clickhouse
    Volume Claim Templates:
      Name:  data-volume-template
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:  100Gi
      Name:           log-volume-template
      Spec:
        Access Modes:
          ReadWriteOnce
        Resources:
          Requests:
            Storage:  100Gi
Events:               <none>
pratikraj@Pratiks-MacBook-Pro common % 
pratikraj@Pratiks-MacBook-Pro common % 

One thing I notice is that "podTemlate: custom-uid" is missing from the Defaults → Templates section of the describe output.

@Slach
Collaborator

Slach commented Jul 25, 2024

Check the clickhouse-operator logs:

oc logs --all-namespaces -l app=clickhouse-operator --container clickhouse-operator --since=2h

If that doesn't work, check:

oc get pods --all-namespaces -l app=clickhouse-operator

and:

oc logs -n <your-namespace-where-operator-installed> deployment/clickhouse-operator --container clickhouse-operator --since=2h

@Slach
Collaborator

Slach commented Jul 25, 2024

Also check:
oc get sts -n clickhouse

@Rajpratik71
Author

In the operator log, I found the issue:

W0726 03:29:44.109340       1 reflector.go:533] pkg/client/informers/externalversions/factory.go:132: failed to list *v1.ClickHouseInstallation: json: cannot unmarshal number into Go struct field EnvVar.items.spec.templates.podTemplates.spec.containers.env.value of type string
E0726 03:29:44.109373       1 reflector.go:148] pkg/client/informers/externalversions/factory.go:132: Failed to watch *v1.ClickHouseInstallation: failed to list *v1.ClickHouseInstallation: json: cannot unmarshal number into Go struct field EnvVar.items.spec.templates.podTemplates.spec.containers.env.value of type string

This is fixed by adding quotes around the env values.
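The typing rule behind that fix: Kubernetes declares `EnvVar.value` as a string in its API schema, so a bare YAML/JSON number cannot be decoded into it. A small illustration of the distinction (Python's `json` stands in for the API machinery's decoding; the Go unmarshal error itself is from the log above, not reproduced here):

```python
import json

# Kubernetes' EnvVar "value" field is a string in the API schema, so a
# decoder targeting it rejects a JSON number where a string is required.
unquoted = json.loads('{"name": "CLIKCHOUSE_UID", "value": 1000730000}')
quoted = json.loads('{"name": "CLIKCHOUSE_UID", "value": "1000730000"}')

print(type(unquoted["value"]).__name__)  # int -> "cannot unmarshal number" in Go
print(type(quoted["value"]).__name__)    # str -> decodes cleanly
```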

But the same issue occurs again while starting the pod:

pratikraj@Pratiks-MacBook-Pro common % 
pratikraj@Pratiks-MacBook-Pro common % oc -n clickhouse get po,svc,pvc,ClickHouseInstallation             
NAME                                            READY   STATUS             RESTARTS         AGE
pod/chi-clickhouse-olap-clickhouse-olap-0-0-0   1/2     CrashLoopBackOff   18 (2m46s ago)   70m

NAME                                              TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
service/chi-clickhouse-olap-clickhouse-olap-0-0   ClusterIP   None         <none>        9000/TCP,8123/TCP,9009/TCP   65m
service/clickhouse-clickhouse-olap                ClusterIP   None         <none>        8123/TCP,9000/TCP            55m

NAME                                                                                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
persistentvolumeclaim/data-volume-template-chi-clickhouse-olap-clickhouse-olap-0-0-0   Bound    pvc-6f901528-fc9d-468f-a1c6-5d0d1ef687a4   100Gi      RWO            rook-ceph-block   70m
persistentvolumeclaim/log-volume-template-chi-clickhouse-olap-clickhouse-olap-0-0-0    Bound    pvc-6952669c-93ff-4e31-8e74-c6819fa3668f   100Gi      RWO            rook-ceph-block   70m

NAME                                                             CLUSTERS   HOSTS   STATUS      HOSTS-COMPLETED   AGE
clickhouseinstallation.clickhouse.altinity.com/clickhouse-olap   1          1       Completed                     70m
pratikraj@Pratiks-MacBook-Pro common % 
pratikraj@Pratiks-MacBook-Pro common % oc logs -n clickhouse chi-clickhouse-olap-clickhouse-olap-0-0-0    
Defaulted container "clickhouse" out of: clickhouse, clickhouse-log
Code: 36. DB::Exception: Group 0 is not found in the system. (BAD_ARGUMENTS) (version 24.6.2.17 (official build))
Couldn't create necessary directory: /var/lib/clickhouse/
pratikraj@Pratiks-MacBook-Pro common % 
pratikraj@Pratiks-MacBook-Pro common % 

@Slach
Collaborator

Slach commented Jul 26, 2024

We need to figure out which component added the security context in your OpenShift.

@Slach
Collaborator

Slach commented Jul 26, 2024

oc exec -n clickhouse chi-clickhouse-olap-clickhouse-olap-0-0-0 --container clickhouse-log -- ls -la /var/lib/clickhouse

oc exec -n clickhouse chi-clickhouse-olap-clickhouse-olap-0-0-0 --container clickhouse-log -- whoami

@Rajpratik71
Author

We need to figure out which component added the security context in your OpenShift.

I think the default "PodSecurityPolicy" is enabled. I don't have any other security or policy enforcement tool installed.
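For what it's worth, the pod YAML earlier in the thread carries the annotation `openshift.io/scc: restricted-v2`, which points at OpenShift's Security Context Constraints rather than PodSecurityPolicy. Under the restricted SCC, `runAsUser` is drawn from the namespace annotation `openshift.io/sa.scc.uid-range`, formatted as `<base>/<size>`. A sketch of splitting such a value (the sample value is an assumption, chosen to match the UID seen in this issue):

```shell
# Sample value of the openshift.io/sa.scc.uid-range namespace annotation;
# on a real cluster it can be read with something like:
#   oc get ns clickhouse -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.uid-range}'
range="1000730000/10000"

# The annotation is "<base>/<size>"; the base is the first runAsUser
# that the restricted SCC hands out in this namespace.
uid_base="${range%%/*}"
range_size="${range#*/}"

echo "base UID:   ${uid_base}"
echo "range size: ${range_size}"
```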

@Rajpratik71
Author

oc exec -n clickhouse chi-clickhouse-olap-clickhouse-olap-0-0-0 --container clickhouse-log -- ls -la /var/lib/clickhouse

oc exec -n clickhouse chi-clickhouse-olap-clickhouse-olap-0-0-0 --container clickhouse-log -- whoami
pratikraj@Pratiks-MacBook-Pro ~ % 
pratikraj@Pratiks-MacBook-Pro ~ % oc exec -n clickhouse chi-clickhouse-olap-clickhouse-olap-0-0-0 --container clickhouse-log -- ls -la /var/lib/clickhouse
total 20
drwxrwsrwx. 3 root 1000730000  4096 Jul 26 03:31 .
drwxr-xr-x. 1 root root          24 Jul 26 03:31 ..
drwxrws---. 2 root 1000730000 16384 Jul 26 03:31 lost+found
pratikraj@Pratiks-MacBook-Pro ~ % 
pratikraj@Pratiks-MacBook-Pro ~ % 
pratikraj@Pratiks-MacBook-Pro ~ % oc exec -n clickhouse chi-clickhouse-olap-clickhouse-olap-0-0-0 --container clickhouse-log -- whoami
1000730000
pratikraj@Pratiks-MacBook-Pro ~ % 
pratikraj@Pratiks-MacBook-Pro ~ % 

@Slach
Collaborator

Slach commented Jul 29, 2024

Let's apply CLICKHOUSE_DO_NOT_CHOWN=1:

spec:
  defaults:
    templates:
      podTemlate: custom-uid

  templates:
    podTemplates:
    - name: custom-uid
      spec:
        containers:
        - name: clickhouse
          image: clickhouse/clickhouse-server:latest
          env:
          - name: CLIKCHOUSE_UID
            value: "1000730000"  
          - name: CLIKCHOUSE_GID
            value: "1000730000"
          - name: CLICKHOUSE_DO_NOT_CHOWN
            value: "1"

@Rajpratik71
Author

same issue

@alex-zaitsev
Member

@Rajpratik71, is it resolved? What is the reason for the CrashLoopBackOff? (You should be able to see it in the container logs.)

Do you need the clickhouse-log container, btw? It is rarely useful.

@Rajpratik71
Author

@Rajpratik71, is it resolved? What is the reason for the CrashLoopBackOff? (You should be able to see it in the container logs.)

Do you need the clickhouse-log container, btw? It is rarely useful.

@alex-zaitsev same issue.

I am getting the below in the log:

pratikraj@Pratiks-MacBook-Pro common % oc logs -n clickhouse chi-clickhouse-olap-clickhouse-olap-0-0-0    
Defaulted container "clickhouse" out of: clickhouse, clickhouse-log
Code: 36. DB::Exception: Group 0 is not found in the system. (BAD_ARGUMENTS) (version 24.6.2.17 (official build))
Couldn't create necessary directory: /var/lib/clickhouse/
pratikraj@Pratiks-MacBook-Pro common % 

@alex-zaitsev
Member

@Rajpratik71, have you tried adding the security context as suggested above?

    securityContext:
      runAsUser: 101
      runAsGroup: 101
      fsGroup: 101
      allowPrivilegeEscalation: false

Could you post the full CHI spec here, with sensitive data removed?

@keyute

keyute commented Oct 25, 2024

I have run into the same issue when deploying on OpenShift. I tried the above (with the spelling mistakes fixed) without success. It might be related to ClickHouse/ClickHouse#59141 as well.

@keyute

keyute commented Oct 25, 2024

@Rajpratik71, have you tried adding the security context as suggested above?

    securityContext:
      runAsUser: 101
      runAsGroup: 101
      fsGroup: 101
      allowPrivilegeEscalation: false

Could you post the full CHI spec here, with sensitive data removed?

This works because we are running with the UID and GID the Docker image expects. Unfortunately, this means we need a custom SCC (or anyuid) to run the pod. Hopefully the above-mentioned issue gets fixed so that we can run the container under an arbitrary non-root UID.

@keyute

keyute commented Oct 25, 2024

For those on OpenShift, you can try this workaround in every namespace:

1. Create the service account:
   kubectl create sa clickhouse -n test-clickhouse
2. Give the service account anyuid permissions (or a more restrictive custom SCC that allows running as UID 101):
   oc adm policy add-scc-to-user anyuid -z clickhouse -n test-clickhouse
3. Run the pod with the service account:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: clickhouse
  namespace: test-clickhouse
spec:
  defaults:
    templates:
      podTemplate: custom-uid
  configuration:
    clusters:
      - name: clickhouse
        layout:
          shardsCount: 1
          replicasCount: 1
  templates:
    podTemplates:
      - name: custom-uid
        spec:
          securityContext:
            runAsUser: 101
            runAsGroup: 101
            fsGroup: 101
            allowPrivilegeEscalation: false
          serviceAccountName: clickhouse
          automountServiceAccountToken: false

@Slach
Collaborator

Slach commented Oct 25, 2024

@keyute, thanks for the OpenShift workaround. Let's close the issue.

@Slach Slach closed this as completed Oct 25, 2024