This repository has been archived by the owner on Feb 22, 2022. It is now read-only.

[stable/nextcloud] Image stuck at Initializing NextCloud... when PVC is attached #22920

Closed
mikeyGlitz opened this issue Jun 24, 2020 · 16 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@mikeyGlitz

Describe the bug

When the helm chart is bringing up NextCloud, the application does not get past the log message

Initializing Nextcloud 17.0.7...

Version of Helm and Kubernetes:


helm: v3.2.1
kubernetes: v1.18.4+k3s1

Which chart:

stable/nextcloud

What happened:

Namespace is created.
Helm creates persistent-volume-claim
Helm instantiates MariaDB using bitnami/mariadb chart
Helm instantiates Nextcloud container
Nextcloud container starts
Nextcloud container does not get past

Initializing Nextcloud 17.0.7...

What you expected to happen:

Nextcloud was supposed to finish initialization
Nextcloud files were supposed to be copied with correct permissions to the PVC

How to reproduce it (as minimally and precisely as possible):

Install the charts with the following commands:

helm install nfs stable/nfs-client-provisioner --namespace=nas \
  --set nfs.server=x.x.x.x --set nfs.path=/mnt/external

helm install files stable/nextcloud -f values.yaml --namespace=nextcloud

values.yaml

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: traefik
    cert-manager.io/cluster-issuer: cluster-issuer
    traefik.ingress.kubernetes.io/redirect-entry-point: https
    traefik.frontend.passHostHeader: "true"
  tls:
    - secretName: nextcloud-app-tls
      hosts:
        - files.haus.net
nextcloud:
  host: files.haus.net
  username: admin
  password: P@$$w0rd!
internalDatabase:
  enabled: false
mariadb:
  enabled: yes
  password: P@$$w0rd!
  user: nextcloud
  name: nextcloud
persistence:
  enabled: yes
  storageClass: nfs-client
  size: 1Ti
@11jwolfe2

11jwolfe2 commented Jun 29, 2020

I have also been trying to get this install to work with a PV and PVC, with no luck. If I do it without a PV and PVC it works, but as soon as I enable the PV it says the nextcloud directory isn't found, so I create the directory. Then it says "Error: failed to create subPath directory for volumeMount "nextcloud-data" of container "nextcloud"". Does anyone have any ideas about this?

@derdrdirk

I am having the same issue. I also use nfs-client as the storageClass, which might be the cause of this bug? IIRC I used a manually created PV some time back and it worked.

Have you figured out how to make this work?

@almahmoud

almahmoud commented Jun 30, 2020

Not sure if we are having the same issue, but I will detail my investigation so far into using persistence.existingClaim, in case it helps people progress in their own investigations and/or the context helps someone more knowledgeable provide some help, as I have only worked with k8s for a year or so.

From what I could see, the container creation process errors out with:

Error: failed to start container "nextcloud": Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"/var/lib/kubelet/pods/49c19090-14d6-4bee-b774-ca24b0ddd259/volume-subpaths/jun30third-nextcloud-data-pv/nextcloud/0\\\" to rootfs \\\"/var/lib/docker/overlay2/40dca10bcad3a57d61d35d40d0bd897f6d2322c3a5d9f615d2a90a38d7fe4cd5/merged\\\" at \\\"/var/lib/docker/overlay2/40dca10bcad3a57d61d35d40d0bd897f6d2322c3a5d9f615d2a90a38d7fe4cd5/merged/var/www\\\" caused \\\"no such file or directory\\\"\"": unknown

I looked on the node at the time of the directory creation, and a few things to note:

  • the source directory always exists and the path is correct
  • the destination directory up to (.*)/merged is created when the container is being spun up, but I could never see the merged directory inside it (I didn't have the container ID beforehand, so I relied on watch commands and manually looking on the node; I can't guarantee it was never there, only that I never saw it)

The only lead I've found so far as to why this might be happening is kubernetes/kubernetes#61545 (comment), and the following comment links kubernetes/kubernetes#61563 (comment). My guess is that this is related to the second issue in the last comment (i.e. kubernetes/kubernetes#61545), given that the config mounts are nested inside the directory mount. However, since the error is on subpath /nextcloud/0 of the container (which I have verified is the root subpath), this might not be true, but it is my best lead so far.

I'm currently poking at this by manually changing the specifications to see if any configuration works (i.e. trying different variations of the mount path nesting to see if I can get it to start up manually before figuring out how to correct the chart). In the meantime, if anyone else finds a solution and/or it seems I'm going down the wrong trail, please let me know!

Update: it is not the configmap causing this in my case, it's the nested mounts: https://github.com/helm/charts/blob/master/stable/nextcloud/templates/deployment.yaml#L289. Additionally, the problem only appears after the first restart: it seems the mounting works the first time, but once things get written to the volumes and the container restarts, the bind mounts fail for the new container with the above error. This problem might be specific to our storage class (we're using an RClone CSI driver which FUSE-mounts an S3 bucket) and different from yours, although I haven't tried it with an NFS layer on top yet to confirm. This does seem to be different from what you're seeing, though (sorry for hijacking your issue).

In case this comes up for anyone else: the current workaround is keeping only the root directory mount (which is enough to back up everything else, as the other paths are nested inside it), and that seems to fix the problem.
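
For illustration, a rough sketch of what that workaround looks like against the chart's deployment template (the volume name, mount path, and subPath below are assumptions based on the linked template, not verbatim chart output):

volumeMounts:
  # keep only the root mount; everything else lives underneath it
  - name: nextcloud-data      # assumed volume name
    mountPath: /var/www
    subPath: root
  # nested mounts such as /var/www/html, /var/www/html/data and
  # /var/www/html/config are dropped; they sit under the root mount anyway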

@11jwolfe2

Okay, I got it working! I am using an OpenMediaVault NFS share for all of my persistent volumes. I set the shares up with the following options, and it now works without any issues using the regular helm install, no extra steps required.

Settings for the NFS share (an example /etc/exports entry is sketched below):

  • rw,no_root_squash,insecure,async,no_subtree_check,anonuid=1000,anongid=1000
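
For illustration, that option string corresponds to an /etc/exports entry along these lines (the export path and client range here are placeholders, not values taken from this thread):

/srv/nfs/kubernetes  10.0.0.0/8(rw,no_root_squash,insecure,async,no_subtree_check,anonuid=1000,anongid=1000)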

@almahmoud

almahmoud commented Jul 1, 2020

It also works with nfs-server-provisioner (https://github.com/helm/charts/tree/master/stable/nfs-server-provisioner) with expected values.

Specific values we're using

helm install nfs-provisioner stable/nfs-server-provisioner \
    --namespace myns \
    --set persistence.enabled=true \
    --set persistence.storageClass="ebs" \
    --set persistence.size=100Gi \
    --set storageClass.create=true \
    --set storageClass.reclaimPolicy="Delete" \
    --set storageClass.allowVolumeExpansion=true

and NextCloud snippet:

persistence:
  enabled: true
  storageClass: nfs
  accessMode: "ReadWriteMany"

I'll open a separate issue for the existingClaim problem

@mikeyGlitz
Author

Okay, I got it working! I am using an OpenMediaVault NFS share for all of my persistent volumes. I set the shares up with the following options, and it now works without any issues using the regular helm install, no extra steps required.

Settings for nfs share.

  • rw,no_root_squash,insecure,async,no_subtree_check,anonuid=1000,anongid=1000

Tried changing the line in my /etc/exports and it didn't fix the problem.

@mikeyGlitz
Author

mikeyGlitz commented Jul 4, 2020

Using the following snippets:

nfs-client-provisioner.values.yaml

nfs:
  mountOptions:
    - nfsvers=4
  server: 172.16.0.1
  path: /mnt/external

I updated my nextcloud values with the new value persistence.accessMode=ReadWriteMany.
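
For reference, a minimal sketch of the updated persistence block (the other values mirror the values.yaml from the issue description; the accessMode line is the addition):

persistence:
  enabled: true
  storageClass: nfs-client
  accessMode: ReadWriteMany
  size: 1Ti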

Also didn't work.

I have the following directories in my volume:

drwxrwxrwx 9 root     root 4096 Jul  4 01:04 ./
drwxr-xr-x 7 root     root 4096 Jul  4 01:09 ../
drwxrwxrwx 2 root     root 4096 Jul  4 01:04 config/
drwxrwxrwx 2 root     root 4096 Jul  4 01:04 custom_apps/
drwxrwxrwx 2 root     root 4096 Jul  4 01:04 data/
drwxrwxrwx 8 www-data root 4096 Jul  4 01:08 html/
drwxrwxrwx 4 root     root 4096 Jul  4 01:04 root/
drwxrwxrwx 2 root     root 4096 Jul  4 01:04 themes/
drwxrwxrwx 2 root     root 4096 Jul  4 01:04 tmp/

@stale

stale bot commented Aug 3, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 3, 2020
@tomhouweling1987

Got the same problem. Tested with versions 17.0.0-apache and 19.0.1-apache. Also seeing that the dirs are owned by root:root.
When we deploy without a PVC, the installation works.

@stale stale bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 6, 2020
@jesussancheztellomm

jesussancheztellomm commented Aug 11, 2020

Using nfs-client-provisioner works, but the main problem is that the initial rsync takes around 5 minutes to complete (at least in my tests using GCP Filestore). You can look at the entrypoint.sh file:

rsync -rlDog --chown www-data:root --delete --exclude-from=/upgrade.exclude /usr/src/nextcloud/ /var/www/html/

If you disable the readiness and liveness probes in the values, it works.
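
A hedged sketch of that values change, assuming the chart exposes livenessProbe.enabled and readinessProbe.enabled toggles (check the values.yaml of your chart version before relying on these names):

# assumed value names; verify against the chart's values.yaml
livenessProbe:
  enabled: false
readinessProbe:
  enabled: false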

❯ k logs nextcloud-5756597dbc-nhg5m
Initializing nextcloud 17.0.8.1 ...
Initializing finished
New nextcloud instance
Installing with PostgreSQL database
starting nextcloud installation
Nextcloud was successfully installed
setting trusted domains…
System config value trusted_domains => 1 set to string XXXXXXXX
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 10.192.149.41. Set the 'ServerName' directive globally to suppress this message
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 10.192.149.41. Set the 'ServerName' directive globally to suppress this message
[Tue Aug 11 08:54:50.097547 2020] [mpm_prefork:notice] [pid 1] AH00163: Apache/2.4.38 (Debian) PHP/7.3.21 configured -- resuming normal operations
[Tue Aug 11 08:54:50.097621 2020] [core:notice] [pid 1] AH00094: Command line: 'apache2 -D FOREGROUND'

I've tried some alternatives to that rsync, but since there are a lot of small files to copy, I haven't found any improvement.

Any ideas?

@timtorChen

timtorChen commented Aug 17, 2020

The log looks stuck at Initializing Nextcloud 17.0.7... because the rsync process is extremely slow (for my local NFS it is about 1.5 MB/s; you can show the progress with rsync --info=progress2). Even worse, the liveness probe will continuously fail and the pod eventually goes into CrashLoopBackOff.

As a workaround, like jesussancheztellomm, I disable the liveness probe for the first installation and re-enable it after the installation finishes.

Maybe we can refer to nextcloud/docker#968.
It will not solve the problem of slow NFS transfer speed (I still have no idea why it is so slow...), but a stateless application image may remove the rsync step entirely.

@tomhouweling1987

@timtorChen I can confirm: when I disabled the liveness probe it took 11 minutes to sync. I also tried it with an S3 storage backend, and it took just seconds to sync.

So I looked deeper into my NFS setup, and we are using sync instead of async because we do not want to lose any data. I didn't test it with an async connection.

@billimek
Collaborator

The nextcloud chart has migrated to a new repo. Can you please raise the issue over there? https://github.com/nextcloud/helm

@somerandow

Opened an issue over on the new repo and tried to summarize some of the info from this discussion.

@stale

stale bot commented Oct 3, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 3, 2020
@stale

stale bot commented Oct 24, 2020

This issue is being automatically closed due to inactivity.

@stale stale bot closed this as completed Oct 24, 2020