[bitnami/ghost] Pods fail on init with GKE Autopilot and K8s Docker Desktop #25439

Closed
javajon opened this issue Apr 28, 2024 · 5 comments
Labels: solved, stale (15 days without activity), tech-issues (the user has a technical issue about an application), triage (triage is needed)

javajon commented Apr 28, 2024

Name and Version

bitnami/ghost 20.0.3

What architecture are you using?

amd64

What steps will reproduce the bug?

This is with a GKE Autopilot K8s cluster with the GKE Autopilot defaults.

helm install ghost bitnami/ghost -n ghost --create-namespace --version 20.0.3

Then follow the instructions printed by Helm:

kubectl get svc --namespace ghost -w ghost

Wait for the external IP, then:

export APP_HOST=$(kubectl get svc --namespace ghost ghost --template "{{ range (index .status.loadBalancer.ingress 0) }}{{ . }}{{ end }}")
export GHOST_PASSWORD=$(kubectl get secret --namespace "ghost" ghost -o jsonpath="{.data.ghost-password}" | base64 -d)
export MYSQL_ROOT_PASSWORD=$(kubectl get secret --namespace "ghost" ghost-mysql -o jsonpath="{.data.mysql-root-password}" | base64 -d)
export MYSQL_PASSWORD=$(kubectl get secret --namespace "ghost" ghost-mysql -o jsonpath="{.data.mysql-password}" | base64 -d)

Then apply the upgrade:

helm upgrade --namespace ghost ghost oci://registry-1.docker.io/bitnamicharts/ghost --set service.type=LoadBalancer,ghostHost=$APP_HOST,ghostPassword=$GHOST_PASSWORD,mysql.auth.rootPassword=$MYSQL_ROOT_PASSWORD,mysql.auth.password=$MYSQL_PASSWORD

Are you using any custom parameters or values?

Occurs with just the chart defaults.

I made several attempts with other parameters, such as setting "ephemeral-storage: 10Mi" or moving ephemeral storage to another volume, but those either did not work or were not configured correctly.

What is the expected behavior?

Pods and other objects remain healthy and running without restarts, and the Ghost web interface is accessible.

What do you see instead?

The result is:

NAME                     READY   STATUS                        RESTARTS       AGE
ghost-7b5474f8df-2zpkp   0/1     ContainerStatusUnknown        1              7m31s
ghost-7b5474f8df-467v2   0/1     PodInitializing               0              19m
ghost-7b5474f8df-4kf89   0/1     PodInitializing               0              12m
ghost-7b5474f8df-4whfk   0/1     PodInitializing               0              2m16s
ghost-7b5474f8df-8bn5l   0/1     ContainerStatusUnknown        1              9m39s
ghost-7b5474f8df-8rpj9   0/1     Init:ContainerStatusUnknown   1              8m13s
ghost-7b5474f8df-bblrw   0/1     PodInitializing               0              15m
ghost-7b5474f8df-bx89l   0/1     ContainerStatusUnknown        1              20m
ghost-7b5474f8df-f5pxv   0/1     ContainerStatusUnknown        1              14m
ghost-7b5474f8df-fnpf5   0/1     Init:0/1                      0              16s
ghost-7b5474f8df-gzzn9   0/1     Error                         0              16m
ghost-7b5474f8df-hfsrr   0/1     ContainerStatusUnknown        1              21m
ghost-7b5474f8df-jq2s4   0/1     ContainerStatusUnknown        1              5m41s
ghost-7b5474f8df-kgtdl   0/1     ContainerStatusUnknown        1              3m41s
ghost-7b5474f8df-lmhl9   0/1     PodInitializing               0              6m16s
ghost-7b5474f8df-lthbb   0/1     ContainerStatusUnknown        1              101s
ghost-7b5474f8df-nr7zf   0/1     ContainerStatusUnknown        1              13m
ghost-7b5474f8df-p2m97   0/1     PodInitializing               0              4m16s
ghost-7b5474f8df-vd27c   0/1     Error                         0              11m
ghost-7b5474f8df-vlb2b   0/1     Error                         0              18m
ghost-7b5474f8df-vxdqp   0/1     PodInitializing               0              17m
ghost-7b5474f8df-x4rmg   0/1     Init:ContainerStatusUnknown   1              10m
ghost-mysql-0            0/1     CrashLoopBackOff              7 (5m7s ago)   22m

Additional information

Kubernetes on GKE Autopilot

This is with a GKE Autopilot K8s cluster with the GKE Autopilot defaults.

Same problem happens consistently with chart versions 20.0.3 (current) and 19.11.7.

MySQL appears healthy and running after the first install. The errors start on the helm upgrade, while Ghost is initializing.

The mysql pod reports:

[ERROR] [MY-013178] [Server] Execution of server-side SQL statement 'ALTER TABLE user MODIFY ssl_type enum('','ANY','X509', 'SPECIFIED') NOT NULL; ' failed with error code = 3664, error message = 'Failed to delete SDI 'mysql.user' in tablespace 'mysql'.'.

The ghost pod reports:

message: 'Pod ephemeral local storage usage exceeds the total limit of containers 50Mi.'

More details:

mysql pod:

kubectl logs -n ghost ghost-mysql-0

mysql 11:33:40.66 INFO  ==> 
mysql 11:33:40.67 INFO  ==> Welcome to the Bitnami mysql container
mysql 11:33:40.67 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
mysql 11:33:40.70 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
mysql 11:33:40.71 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
mysql 11:33:40.71 INFO  ==>
mysql 11:33:40.72 INFO  ==> ** Starting MySQL setup **
mysql 11:33:40.74 INFO  ==> Validating settings in MYSQL_*/MARIADB_* env vars
mysql 11:33:40.81 INFO  ==> Initializing mysql database
mysql 11:33:40.82 WARN  ==> The mysql configuration file '/opt/bitnami/mysql/conf/my.cnf' is not writable. Configurations based on environment variables will not be applied for this file.
mysql 11:33:40.82 INFO  ==> Using persisted data
mysql 11:33:40.93 INFO  ==> Running mysql_upgrade
mysql 11:33:40.93 INFO  ==> Starting mysql in background
2024-04-28T11:33:41.738861Z 0 [Warning] [MY-011068] [Server] The syntax 'skip_slave_start' is deprecated and will be removed in a future release. Please use skip_replica_start instead.
2024-04-28T11:33:41.739039Z 0 [Warning] [MY-010918] [Server] 'default_authentication_plugin' is deprecated and will be removed in a future release. Please use authentication_policy instead.
2024-04-28T11:33:41.739069Z 0 [System] [MY-010116] [Server] /opt/bitnami/mysql/bin/mysqld (mysqld 8.0.36) starting as process 47
2024-04-28T11:33:41.806649Z 0 [Warning] [MY-013242] [Server] --character-set-server: 'utf8' is currently an alias for the character set UTF8MB3, but will be an alias for UTF8MB4 in a future release. Please consider using UTF8MB4 in order to be unambiguous.
2024-04-28T11:33:41.816532Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2024-04-28T11:33:44.032310Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2024-04-28T11:33:44.453194Z 4 [System] [MY-013381] [Server] Server upgrade from '80036' to '80036' started.
2024-04-28T11:34:34.814647Z 4 [ERROR] [MY-013178] [Server] Execution of server-side SQL statement 'ALTER TABLE user MODIFY ssl_type enum('','ANY','X509', 'SPECIFIED') NOT NULL; ' failed with error code = 3664, error message = 'Failed to delete SDI 'mysql.user' in tablespace 'mysql'.'. 
2024-04-28T11:34:34.819139Z 0 [ERROR] [MY-013380] [Server] Failed to upgrade server.
2024-04-28T11:34:34.819184Z 0 [ERROR] [MY-010119] [Server] Aborting
2024-04-28T11:34:35.946709Z 0 [System] [MY-010910] [Server] /opt/bitnami/mysql/bin/mysqld: Shutdown complete (mysqld 8.0.36)  Source distribution.

ghost pod (see the status: section):

apiVersion: v1
kind: Pod
metadata:
  annotations:
    checksum/secrets: d4211c28527e68d6c3f7ef94d1b280d05de2f276e1ee249e70eb3baafebca437
  creationTimestamp: "2024-04-28T03:49:23Z"
  generateName: ghost-6b68b4b666-
  labels:
    app.kubernetes.io/component: ghost
    app.kubernetes.io/instance: ghost
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ghost
    app.kubernetes.io/version: 5.82.2
    helm.sh/chart: ghost-20.0.3
    pod-template-hash: 6b68b4b666
  name: ghost-6b68b4b666-nxzqr
  namespace: ghost
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: ghost-6b68b4b666
    uid: 609b963b-682e-4474-ae9d-eae26007641f
  resourceVersion: "6787543"
  uid: e5d31619-9ce0-4432-a98f-dd68fb865273
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/instance: ghost
              app.kubernetes.io/name: ghost
          topologyKey: kubernetes.io/hostname
        weight: 1
  automountServiceAccountToken: false
  containers:
  - env:
    - name: BITNAMI_DEBUG
      value: "false"
    - name: ALLOW_EMPTY_PASSWORD
      value: "yes"
    - name: GHOST_DATABASE_HOST
      value: ghost-mysql
    - name: GHOST_DATABASE_PORT_NUMBER
      value: "3306"
    - name: GHOST_DATABASE_NAME
      value: bitnami_ghost
    - name: GHOST_DATABASE_USER
      value: bn_ghost
    - name: GHOST_DATABASE_PASSWORD
      valueFrom:
        secretKeyRef:
          key: mysql-password
          name: ghost-mysql
    - name: GHOST_HOST
      value: 35.223.95.160/
    - name: GHOST_PORT_NUMBER
      value: "2368"
    - name: GHOST_USERNAME
      value: user
    - name: GHOST_PASSWORD
      valueFrom:
        secretKeyRef:
          key: ghost-password
          name: ghost
    - name: GHOST_EMAIL
      value: user@example.com
    - name: GHOST_BLOG_TITLE
      value: User's Blog
    - name: GHOST_ENABLE_HTTPS
      value: "no"
    - name: GHOST_EXTERNAL_HTTP_PORT_NUMBER
      value: "80"
    - name: GHOST_EXTERNAL_HTTPS_PORT_NUMBER
      value: "443"
    - name: GHOST_SKIP_BOOTSTRAP
      value: "no"
    image: docker.io/bitnami/ghost:5.82.2-debian-12-r0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 6
      httpGet:
        path: /
        port: http
        scheme: HTTP
      initialDelaySeconds: 120
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    name: ghost
    ports:
    - containerPort: 2368
      name: http
      protocol: TCP
    readinessProbe:
      failureThreshold: 6
      httpGet:
        path: /
        port: http
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 5
      successThreshold: 1
      timeoutSeconds: 3
    resources:
      limits:
        cpu: 375m
        ephemeral-storage: 50Mi
        memory: 384Mi
      requests:
        cpu: 250m
        ephemeral-storage: 50Mi
        memory: 256Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      privileged: false
      readOnlyRootFilesystem: true
      runAsGroup: 1001
      runAsNonRoot: true
      runAsUser: 1001
      seLinuxOptions: {}
      seccompProfile:
        type: RuntimeDefault
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /opt/bitnami/ghost
      name: empty-dir
      subPath: app-base-dir
    - mountPath: /.ghost
      name: empty-dir
      subPath: app-tmp-dir
    - mountPath: /tmp
      name: empty-dir
      subPath: tmp-dir
    - mountPath: /bitnami/ghost
      name: ghost-data
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - args:
    - -ec
    - |
      #!/bin/bash

      . /opt/bitnami/scripts/liblog.sh

      info "Copying base dir to empty dir"
      # In order to not break the application functionality (such as upgrades or plugins) we need
      # to make the base directory writable, so we need to copy it to an empty dir volume
      cp -r --preserve=mode /opt/bitnami/ghost /emptydir/app-base-dir
    command:
    - /bin/bash
    image: docker.io/bitnami/ghost:5.82.2-debian-12-r0
    imagePullPolicy: IfNotPresent
    name: prepare-base-dir
    resources:
      limits:
        cpu: 375m
        ephemeral-storage: 50Mi
        memory: 384Mi
      requests:
        cpu: 250m
        ephemeral-storage: 50Mi
        memory: 256Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      privileged: false
      readOnlyRootFilesystem: true
      runAsGroup: 1001
      runAsNonRoot: true
      runAsUser: 1001
      seLinuxOptions: {}
      seccompProfile:
        type: RuntimeDefault
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /emptydir
      name: empty-dir
  nodeName: gk3-autopilot-cluster-1-pool-2-334409ca-m8xl
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: gke.io/optimize-utilization-scheduler
  securityContext:
    fsGroup: 1001
    fsGroupChangePolicy: Always
    seccompProfile:
      type: RuntimeDefault
  serviceAccount: ghost
  serviceAccountName: ghost
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: kubernetes.io/arch
    operator: Equal
    value: amd64
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: empty-dir
  - name: ghost-data
    persistentVolumeClaim:
      claimName: ghost
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-04-28T03:51:03Z"
    status: "False"
    type: PodReadyToStartContainers
  - lastProbeTime: null
    lastTransitionTime: "2024-04-28T03:50:16Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2024-04-28T03:49:28Z"
    reason: PodFailed
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2024-04-28T03:49:28Z"
    reason: PodFailed
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2024-04-28T03:49:28Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://abde3e651ad72fdce34ef772f743a8b14cf9b007774e8086eafc1d95bbe9caba
    image: docker.io/bitnami/ghost:5.82.2-debian-12-r0
    imageID: docker.io/bitnami/ghost@sha256:8ac0be9f536e22b9ba2c9e6109673e659b1cf4229b07afda1dad86f714d4f8ac
    lastState: {}
    name: ghost
    ready: false
    restartCount: 0
    started: false
    state:
      terminated:
        containerID: containerd://abde3e651ad72fdce34ef772f743a8b14cf9b007774e8086eafc1d95bbe9caba
        exitCode: 137
        finishedAt: "2024-04-28T03:51:02Z"
        reason: Error
        startedAt: "2024-04-28T03:50:16Z"
  hostIP: 10.128.0.7
  hostIPs:
  - ip: 10.128.0.7
  initContainerStatuses:
  - containerID: containerd://d3056e6960a0d796f06468d1e24e7aaae5f9b3b520c574e338851c4555d10ae7
    image: docker.io/bitnami/ghost:5.82.2-debian-12-r0
    imageID: docker.io/bitnami/ghost@sha256:8ac0be9f536e22b9ba2c9e6109673e659b1cf4229b07afda1dad86f714d4f8ac
    lastState: {}
    name: prepare-base-dir
    ready: true
    restartCount: 0
    started: false
    state:
      terminated:
        containerID: containerd://d3056e6960a0d796f06468d1e24e7aaae5f9b3b520c574e338851c4555d10ae7
        exitCode: 0
        finishedAt: "2024-04-28T03:50:15Z"
        reason: Completed
        startedAt: "2024-04-28T03:49:41Z"
  message: 'Pod ephemeral local storage usage exceeds the total limit of containers
    50Mi. '
  phase: Failed
  podIP: 10.8.0.76
  podIPs:
  - ip: 10.8.0.76
  qosClass: Burstable
  reason: Evicted
  startTime: "2024-04-28T03:49:28Z"

Kubernetes on Docker Desktop

Running the same chart (bitnami/ghost 20.0.3) with the chart defaults on a Docker Desktop Kubernetes cluster also fails, but with different errors:

kubectl get all,pvc -n ghost
NAME                        READY   STATUS             RESTARTS        AGE
pod/ghost-5d6ddffdd-nxr2z   0/1     CrashLoopBackOff   5 (72s ago)     12m
pod/ghost-mysql-0           0/1     Running            4 (2m11s ago)   12m

NAME                           TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
service/ghost                  LoadBalancer   10.110.103.164   localhost     80:31652/TCP   12m
service/ghost-mysql            ClusterIP      10.98.157.222    <none>        3306/TCP       12m
service/ghost-mysql-headless   ClusterIP      None             <none>        3306/TCP       12m

NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ghost   0/1     1            0           12m

NAME                              DESIRED   CURRENT   READY   AGE
replicaset.apps/ghost-5d6ddffdd   1         1         0       12m

NAME                           READY   AGE
statefulset.apps/ghost-mysql   0/1     12m

NAME                                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/data-ghost-mysql-0   Bound    pvc-48214596-b603-48d5-9eb0-02bd23c4f7eb   8Gi        RWO            hostpath       12m
persistentvolumeclaim/ghost                Bound    pvc-174da2a1-ffbe-4c34-bbb8-65236246beea   8Gi        RWO            hostpath       12m

ghost pod:

kubectl logs -n ghost ghost-5d6ddffdd-nxr2z 

ghost 12:33:24.29 INFO  ==> 
ghost 12:33:24.29 INFO  ==> Welcome to the Bitnami ghost container
ghost 12:33:24.29 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
ghost 12:33:24.29 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
ghost 12:33:24.29 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
ghost 12:33:24.29 INFO  ==>
ghost 12:33:24.30 INFO  ==> Configuring libnss_wrapper
ghost 12:33:24.31 INFO  ==> Validating settings in MYSQL_CLIENT_* env vars
ghost 12:33:24.38 WARN  ==> You set the environment variable ALLOW_EMPTY_PASSWORD=yes. For safety reasons, do not use this flag in a production environment.
ghost 12:33:24.39 INFO  ==> Ensuring Ghost directories exist
ghost 12:33:24.39 INFO  ==> Trying to connect to the database server
ghost 12:34:24.58 ERROR ==> Could not connect to the database

mysql pod:

kubectl logs -n ghost ghost-mysql-0
mysql 12:28:45.37 INFO  ==> 
mysql 12:28:45.37 INFO  ==> Welcome to the Bitnami mysql container
mysql 12:28:45.37 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
mysql 12:28:45.37 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
mysql 12:28:45.37 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
mysql 12:28:45.37 INFO  ==>
mysql 12:28:45.38 INFO  ==> ** Starting MySQL setup **
mysql 12:28:45.39 INFO  ==> Validating settings in MYSQL_*/MARIADB_* env vars
mysql 12:28:45.39 INFO  ==> Initializing mysql database
mysql 12:28:45.40 WARN  ==> The mysql configuration file '/opt/bitnami/mysql/conf/my.cnf' is not writable. Configurations based on environment variables will not be applied for this file.
mysql 12:28:45.40 INFO  ==> Using persisted data
mysql 12:28:45.49 INFO  ==> Running mysql_upgrade
mysql 12:28:45.49 INFO  ==> Starting mysql in background
2024-04-28T12:28:45.903818Z 0 [Warning] [MY-011068] [Server] The syntax 'skip_slave_start' is deprecated and will be removed in a future release. Please use skip_replica_start instead.
2024-04-28T12:28:45.903921Z 0 [Warning] [MY-010918] [Server] 'default_authentication_plugin' is deprecated and will be removed in a future release. Please use authentication_policy instead.
2024-04-28T12:28:45.903936Z 0 [System] [MY-010116] [Server] /opt/bitnami/mysql/bin/mysqld (mysqld 8.0.36) starting as process 47
2024-04-28T12:28:45.906062Z 0 [Warning] [MY-013242] [Server] --character-set-server: 'utf8' is currently an alias for the character set UTF8MB3, but will be an alias for UTF8MB4 in a future release. Please consider using UTF8MB4 in order to be unambiguous.
2024-04-28T12:28:45.913967Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2024-04-28T12:28:47.995906Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2024-04-28T12:28:48.306612Z 4 [System] [MY-013381] [Server] Server upgrade from '80036' to '80036' started.
/opt/bitnami/scripts/libos.sh: line 346:    47 Killed                  "$@" > /dev/null 2>&1

javajon added the tech-issues label on Apr 28, 2024
github-actions bot added the triage label on Apr 28, 2024
javajon changed the title from "[gitnami/ghost] Pods fail on init with GKE Autopilot and K8s Docker Desktop" to "[bitnami/ghost] Pods fail on init with GKE Autopilot and K8s Docker Desktop" on Apr 28, 2024

@javsalgar (Contributor)

Hi,

About this error

/opt/bitnami/scripts/libos.sh: line 346:    47 Killed                  "$@" > /dev/null 2>&1

It seems to me that the resource preset on your platform is not enough and a higher limit is required (which is strange, as we test this chart on all major k8s distros). Could you try setting higher resource limits in the mysql.resources section? Another option is mysql.resourcesPreset, but this is not recommended in production.
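
As a rough illustration of the preset option, the override could be written to a values file and passed to helm upgrade. This is only a sketch under assumptions: the file name mysql-resources-values.yaml is hypothetical, and the key path mysql.primary.resourcesPreset is assumed from the mysql.primary layout used for the resource overrides later in this thread, so it is worth checking against the chart's values.yaml for the version in use.

```shell
# Write a values override that selects a larger resources preset for MySQL.
# Assumption: the MySQL subchart reads the preset from mysql.primary.resourcesPreset.
cat > mysql-resources-values.yaml <<'EOF'
mysql:
  primary:
    resourcesPreset: small
EOF

# Applying the override needs a live cluster, so it is left commented out here:
# helm upgrade --namespace ghost ghost oci://registry-1.docker.io/bitnamicharts/ghost \
#   --reuse-values -f mysql-resources-values.yaml
```

Explicit requests and limits under the resources key are the option suggested for production; the preset is only a shortcut.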

javajon (Author) commented May 1, 2024

While GKE clusters are most likely on your list of major k8s distros, perhaps the testing is done on a GKE "Standard" cluster and the "GKE Autopilot" variant has not been tried. My earlier testing was done without a sufficient understanding of how ephemeral-storage behaves as a resource with requests and limits.

Following your suggestion, I overrode the chart defaults with these values, which allow ample storage:

# micro
resources:
  limits:
    cpu: 375m
    ephemeral-storage: 5Gi
    memory: 384Mi
  requests:
    cpu: 250m
    ephemeral-storage: 2Gi
    memory: 256Mi    

# micro
volumePermissions:
  resources:
    limits:
      cpu: 375m
      ephemeral-storage: 5Gi
      memory: 384Mi
    requests:
      cpu: 250m
      ephemeral-storage: 2Gi
      memory: 256Mi    

# small
mysql:
  primary:
    resources:
      limits:
        cpu: 700m
        ephemeral-storage: 5Gi
        memory: 768Mi
      requests:
        cpu: 500m
        ephemeral-storage: 2Gi
        memory: 512Mi 

The MySQL pod appears to be healthy and logging appropriately.

However, the ghost pod repeatedly tries to start and is evicted with this error:

Events:
  Type     Reason               Age   From                                   Message
  ----     ------               ----  ----                                   -------
  Normal   Scheduled            17m   gke.io/optimize-utilization-scheduler  Successfully assigned ghost/ghost-8fcc8fbf5-ztmdt to gk3-autopilot-cluster-1-pool-2-9a2f203f-hql9
  Normal   Pulled               17m   kubelet                                Container image "docker.io/bitnami/ghost:5.82.5-debian-12-r0" already present on machine
  Normal   Created              17m   kubelet                                Created container prepare-base-dir
  Normal   Started              17m   kubelet                                Started container prepare-base-dir
  Normal   Pulled               16m   kubelet                                Container image "docker.io/bitnami/ghost:5.82.5-debian-12-r0" already present on machine
  Normal   Created              16m   kubelet                                Created container ghost
  Normal   Started              16m   kubelet                                Started container ghost
  Warning  Evicted              16m   kubelet                                Pod ephemeral local storage usage exceeds the total limit of containers 50Mi.
  Normal   Killing              16m   kubelet                                Stopping container ghost
  Warning  ExceededGracePeriod  16m   kubelet                                Container runtime did not kill the pod within specified grace period.

Despite asking for 2Gi, there appears to be a hard limit on the GKE node, which perhaps forces the Pods to have a 50Mi ceiling. The Pod description reflects this 50Mi cap:

    Limits:
      cpu:                375m
      ephemeral-storage:  50Mi
      memory:             384Mi
    Requests:
      cpu:                250m
      ephemeral-storage:  50Mi
      memory:             256Mi

My understanding is there are a few solutions:

  1. Adjust the cluster's limit on ephemeral-storage. Since we cannot adjust the Autopilot criteria, the option would be to use a different cluster type, such as GKE Standard, or another cluster flavor.

  2. Change the Ghost code so that less data is written to ephemeral-storage. I believe this also includes emptyDir volumes and any other files written to the container or pod that are not mounted to a PVC. For me and for the Bitnami chart authors, changing the Ghost code is not an immediate option. Perhaps there is a Ghost environment variable that reduces the verbosity of the data being produced?

  3. Mount whatever output exceeds the 50Mi limit on a PVC. So far I have not investigated which files are growing so large or, if they are going to emptyDir, how to mount those on a PVC.
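
To narrow down options 2 and 3, it may help to measure which paths are approaching the limit. Disk-backed emptyDir volumes do count toward a pod's ephemeral-storage usage, and in the pod spec above the init container copies /opt/bitnami/ghost into one. The helper below is a hypothetical sketch (the name usage_exceeds_mib is not from the chart). It assumes du and cut are available in the container and that the pod stays up long enough to run a command in it.

```shell
# usage_exceeds_mib DIR LIMIT_MIB
# Succeeds (exit 0) when DIR's disk usage is above LIMIT_MIB mebibytes.
# Hypothetical helper; du -sk reports usage in 1 KiB blocks.
usage_exceeds_mib() {
  dir=$1
  limit_mib=$2
  used_kib=$(du -sk "$dir" | cut -f1)
  [ "$used_kib" -gt $((limit_mib * 1024)) ]
}

# Inside a running ghost container, the emptyDir-backed mount points from the
# pod spec could be checked against the 50Mi limit, for example:
# for d in /opt/bitnami/ghost /.ghost /tmp; do
#   usage_exceeds_mib "$d" 50 && echo "$d alone exceeds 50Mi"
# done
```

If a single path is already above the limit on its own, lowering log verbosity alone is unlikely to be enough, and either a larger limit or a PVC-backed mount for that path would be needed.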

However, I'm confused because I was running a Helm chart for Ollama and was hitting the exact same eviction error related to ephemeral-storage. I increased the chart resource defaults with this:

resources:
  limits:
    ephemeral-storage: 5Gi
  requests:
    ephemeral-storage: 5Gi
    memory: 2Gi

and the problem went away: Ollama is healthy, with the Pod reporting ephemeral-storage at 5Gi as requested:

   Requests:
      cpu:                308m
      ephemeral-storage:  5Gi
      memory:             2Gi

This tells me that either I do not understand Autopilot or I am wrong to assume it is what limits the ephemeral-storage request. Why would the Bitnami Ghost deployment be capped at 50Mi, yet this other chart has no problem requesting 5Gi?

Thoughts?

@javsalgar (Contributor)

Hi,

This is weird, as we do not set any kind of cap in the resources. Could you confirm that the rendered YAML indeed sets ephemeral-storage to 5Gi? The 50Mi could be related to the resources section not being rendered correctly.
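
One way to run that check, sketched under assumptions: the helper name show_ephemeral_storage and the file name my-values.yaml are hypothetical, and the release and namespace are both assumed to be ghost, as in the commands earlier in this issue. The helper only filters whatever manifest is piped into it.

```shell
# Hypothetical helper: list every ephemeral-storage line, with its line number,
# from a rendered manifest read on stdin.
show_ephemeral_storage() {
  grep -n 'ephemeral-storage'
}

# Against the deployed release (needs cluster access, so commented out here):
# helm get manifest ghost --namespace ghost | show_ephemeral_storage
#
# The user-supplied values stored with the release can be listed with:
# helm get values ghost --namespace ghost
#
# Or render locally with the override file (my-values.yaml is a placeholder):
# helm template ghost oci://registry-1.docker.io/bitnamicharts/ghost \
#   --version 20.0.3 -f my-values.yaml | show_ephemeral_storage
```

If the rendered manifest shows 5Gi but the running Pod still reports 50Mi, the difference is being introduced after rendering. If the manifest itself shows 50Mi, the override is not reaching the template.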

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions bot added the stale label on May 22, 2024

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

bitnami-bot closed this as not planned (won't fix, can't repro, duplicate, stale) on May 27, 2024