
[BUG] cannot mount nfs shares from inside pods #1109

Open
fragolinux opened this issue Jul 23, 2022 · 16 comments
Labels
bug Something isn't working

Comments

@fragolinux

fragolinux commented Jul 23, 2022

What did you do

I initially tested the OpenEBS nfs-provisioner on top of k3d's default local-path storage class. PVCs were created, but pods could not mount them, failing with "not permitted" or "not supported". I could mount the shares from inside the OpenEBS NFS-sharing pods, even between them (a pod could mount its own shares AND the shares of the other pod, which was sharing a different PVC), but NO other pods could mount them; they all remained in ContainerCreating state, with the errors below in the events.

So I tried a different solution: an NFS server Docker container running on my host machine, connecting to it via the nfs-subdir-external-provisioner, with identical results. It seems I cannot get an RWX volume on k3d right now, whatever approach I take. Tested on both my dev machine (MacBook Pro, latest Big Sur) AND on an Ubuntu 22.04 VM (with, of course, the nfs-common package installed on it).

  • How was the cluster created?
docker network create --subnet="172.22.0.0/16" --gateway="172.22.0.1" "internalNetwork"

k3d cluster create test --network internalNetwork

mkdir -p ~/nfsshare

docker run -d --net=internalNetwork -p 2049:2049 --name nfs --privileged -v ~/nfsshare:/nfsshare -e SHARED_DIRECTORY=/nfsshare itsthenetwork/nfs-server-alpine:latest

helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/

helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --set nfs.server=host.k3d.internal \
  --set nfs.path=/

The pod stays in ContainerCreating state, and in the events I get:

Message:             MountVolume.SetUp failed for volume "nfs-subdir-external-provisioner-root" : mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs host.k3d.internal:/ /var/lib/kubelet/pods/ddeba612-5e1c-4ae6-8068-f641f42706ca/volumes/kubernetes.io~nfs/nfs-subdir-external-provisioner-root
Output: mount: mounting host.k3d.internal:/ on /var/lib/kubelet/pods/ddeba612-5e1c-4ae6-8068-f641f42706ca/volumes/kubernetes.io~nfs/nfs-subdir-external-provisioner-root failed: Not supported

So, let's try from an Ubuntu pod:

kubectl run ubuntu --image=ubuntu -- sleep infinity
# shell inside, then:
apt update
apt install nfs-common -y
mkdir t
mount -t nfs host.k3d.internal:/ t # host is correctly resolved using dig...
mount.nfs: Operation not permitted
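
As an aside worth noting here: even with nfs-common installed, mount(2) from inside a pod requires CAP_SYS_ADMIN, which plain pods don't have, so "Operation not permitted" is expected unless the pod runs privileged. A debug pod along these lines (names are illustrative) separates the pod-permission problem from the missing-NFS-client problem in the node image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nfs-debug            # illustrative name
spec:
  containers:
    - name: debug
      image: ubuntu
      command: ["sleep", "infinity"]
      securityContext:
        privileged: true     # grants CAP_SYS_ADMIN so mount -t nfs is permitted
```

If the mount still fails from a privileged pod, the cause is the node image (no NFS support), not pod permissions.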

Test from the host to see if the share works: it does.

mkdir t0
sudo mount -t nfs localhost:/ t0
touch t0/aaa
ls t0 # aaa exists
ls ~/nfsshare # and is visible in the share
touch ~/nfsshare/bbb
ls ~/nfsshare # bbb exists in share
ls t0 # and is visible in local folder

What did you expect to happen

The share should be mountable, so that RWX volumes can be created.
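
For context, the goal is a claim like the following, which should bind and then be mountable from several pods at once (the claim name is illustrative; the class name is assumed to be nfs-subdir-external-provisioner's chart default):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-rwx               # illustrative name
spec:
  accessModes:
    - ReadWriteMany            # RWX: the whole point of using NFS here
  storageClassName: nfs-client # assumed default class name of the provisioner chart
  resources:
    requests:
      storage: 1Gi
```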

Which OS & Architecture

  • output of k3d runtime-info
arch: x86_64
cgroupdriver: systemd
cgroupversion: "2"
endpoint: /var/run/docker.sock
filesystem: extfs
name: docker
os: Ubuntu 22.04 LTS
ostype: linux
version: 20.10.12

Which version of k3d

  • output of k3d version
k3d version v5.4.4
k3s version v1.23.8-k3s1 (default)

Which version of docker

  • output of docker version and docker info
Client:
 Version:           20.10.12
 API version:       1.41
 Go version:        go1.17.3
 Git commit:        20.10.12-0ubuntu4
 Built:             Mon Mar  7 17:10:06 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.3
  Git commit:       20.10.12-0ubuntu4
  Built:            Mon Mar  7 15:57:50 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.6.6
  GitCommit:        10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc:
  Version:          1.1.3
  GitCommit:        v1.1.3-0-g6724737f
 docker-init:
  Version:          0.19.0
  GitCommit:
Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 5
  Running: 5
  Paused: 0
  Stopped: 0
 Images: 24
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: v1.1.3-0-g6724737f
 init version:
 Security Options:
  apparmor
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.15.0-41-generic
 Operating System: Ubuntu 22.04 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 3.827GiB
 Name: lima-ubuntu
 ID: EMKE:7WSJ:7M3C:3JFT:HS7J:MNZJ:SQKG:Y3RY:2S7Q:PMMO:DHRL:6K3P
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
@fragolinux fragolinux added the bug Something isn't working label Jul 23, 2022
@fragolinux
Author

Update: I tried the Ganesha nfs-provisioner too, same setup as above. Even that fails to create usable NFS shares. My pods are now in a different state (CreateContainerConfigError), and I get this in the events:

Message:             MountVolume.SetUp failed for volume "pvc-8352d87c-c342-4777-892f-ef94f02d8ded" : mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs -o vers=4 10.43.8.160:/export/pvc-8352d87c-c342-4777-892f-ef94f02d8ded /var/lib/kubelet/pods/b56c8042-25aa-4d7b-9045-2b0827c03c8d/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded
Output: mount: mounting 10.43.8.160:/export/pvc-8352d87c-c342-4777-892f-ef94f02d8ded on /var/lib/kubelet/pods/b56c8042-25aa-4d7b-9045-2b0827c03c8d/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded failed: Stale file handle

Everything else is the same; the StorageClass is always "nfs". This is my HelmRelease:

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: ${release}
  namespace: flux-system
spec:
  chart:
    spec:
      chart: nfs-server-provisioner # nfs-provisioner
      version: "1.4.0" # "0.9.0"
      sourceRef:
        kind: HelmRepository
        name: ${release}
  interval: 1h0m0s
  releaseName: ${release}
  targetNamespace: ${namespace}

  values:
    # see https://artifacthub.io/packages/helm/kvaps/nfs-server-provisioner
    persistence:
      enabled: true
      storageClass: "local-path"
      size: 1Gi
    storageClass:
      defaultClass: false
      name: nfs
      reclaimPolicy: Retain
      mountOptions:
        - "vers=4"

@fragolinux
Author

I just tried the Rook NFS provisioner as well; that too does not work on k3d 5.4.4, with these errors:

Unable to attach or mount volumes: unmounted volumes=[rook-nfs-vol], unattached volumes=[rook-nfs-vol kube-api-access-tv299]: timed out waiting for the condition

Message:             MountVolume.SetUp failed for volume "pvc-8352d87c-c342-4777-892f-ef94f02d8ded" : mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs -o vers=4 10.43.8.160:/export/pvc-8352d87c-c342-4777-892f-ef94f02d8ded /var/lib/kubelet/pods/a068abfc-8347-4f76-87b2-f9269b37c0db/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded
Output: mount: mounting 10.43.8.160:/export/pvc-8352d87c-c342-4777-892f-ef94f02d8ded on /var/lib/kubelet/pods/a068abfc-8347-4f76-87b2-f9269b37c0db/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded failed: Stale file handle

PVCs are created and bound as expected, but pods cannot mount them and write.

@maoxuner

@fragolinux

It's not a bug in k3d but a limitation of the k3s Docker image.

The k3s Docker image is built from scratch with no NFS support (see its Dockerfile).

As a result, neither the k3s node container nor the pods inside the node can mount NFS.

There is a workaround: rebase the k3s image onto Alpine and install nfs-utils.

FROM alpine:latest

RUN set -ex; \
    apk add --no-cache iptables ip6tables nfs-utils; \
    echo 'hosts: files dns' > /etc/nsswitch.conf

COPY --from=rancher/k3s:v1.24.3-k3s1 /bin /opt/k3s/bin

VOLUME /var/lib/kubelet
VOLUME /var/lib/rancher/k3s
VOLUME /var/lib/cni
VOLUME /var/log

ENV PATH="$PATH:/opt/k3s/bin:/opt/k3s/bin/aux"
ENV CRI_CONFIG_FILE="/var/lib/rancher/k3s/agent/etc/crictl.yaml"

ENTRYPOINT ["/opt/k3s/bin/k3s"]
CMD ["agent"]

Build it yourself or have a look at mine: maoxuner/k3s (not maintained frequently).

I don't know how to patch nfs-utils into the official k3s image. If anyone knows, please tell me.

@fragolinux
Author

@pawmaster I tried that, but it didn't work for me... hints?

k3d cluster create test -i maoxuner/k3s:v1.24.3-k3s1
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-test'
INFO[0000] Created image volume k3d-test-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-test-tools'
INFO[0001] Creating node 'k3d-test-server-0'
INFO[0001] Creating LoadBalancer 'k3d-test-serverlb'
INFO[0001] Using the k3d-tools node to gather environment information
INFO[0001] Starting new tools node...
INFO[0001] Starting Node 'k3d-test-tools'
INFO[0003] Starting cluster 'test'
INFO[0003] Starting servers...
INFO[0003] Starting Node 'k3d-test-server-0'
ERRO[0003] Failed Cluster Start: Failed to start server k3d-test-server-0: Node k3d-test-server-0 failed to get ready: error waiting for log line `k3s is up and running` from node 'k3d-test-server-0': stopped returning log lines
ERRO[0003] Failed to create cluster >>> Rolling Back
INFO[0003] Deleting cluster 'test'
INFO[0004] Deleting cluster network 'k3d-test'
INFO[0004] Deleting 2 attached volumes...
WARN[0004] Failed to delete volume 'k3d-test-images' of cluster 'test': failed to find volume 'k3d-test-images': Error: No such volume: k3d-test-images -> Try to delete it manually
FATA[0004] Cluster creation FAILED, all changes have been rolled back!

@fragolinux
Author

I created a similar image, based on the version I need (1.22), and I have the same issues... is something missing in the Dockerfile?

k3d cluster create test -i ghcr.io/ecomind/k3s-nfs:1.22.12-k3s1
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-test'
INFO[0000] Created image volume k3d-test-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-test-tools'
INFO[0001] Creating node 'k3d-test-server-0'
INFO[0001] Creating LoadBalancer 'k3d-test-serverlb'
INFO[0001] Using the k3d-tools node to gather environment information
INFO[0001] Starting new tools node...
INFO[0001] Starting Node 'k3d-test-tools'
INFO[0003] Starting cluster 'test'
INFO[0003] Starting servers...
INFO[0003] Starting Node 'k3d-test-server-0'
ERRO[0003] Failed Cluster Start: Failed to start server k3d-test-server-0: Node k3d-test-server-0 failed to get ready: error waiting for log line `k3s is up and running` from node 'k3d-test-server-0': stopped returning log lines
ERRO[0003] Failed to create cluster >>> Rolling Back
INFO[0003] Deleting cluster 'test'
INFO[0004] Deleting cluster network 'k3d-test'
INFO[0004] Deleting 2 attached volumes...
WARN[0004] Failed to delete volume 'k3d-test-images' of cluster 'test': failed to find volume 'k3d-test-images': Error: No such volume: k3d-test-images -> Try to delete it manually
FATA[0004] Cluster creation FAILED, all changes have been rolled back!

@maoxuner

maoxuner commented Aug 26, 2022

I've run into the same issue before. I tried cleaning up all resources (images, containers, volumes, networks) and then creating the cluster. I repeated this again and again and finally succeeded, but I don't know what happened. That's why I'm looking for some way to patch the original image.

@fragolinux
Author

@pawmaster I think I fixed it... take a look at my repo: I just left the paths as in the original image (no /opt...), and the image now comes up with no problem. Now let's see if NFS works :D

Try: k3d cluster create test -i ghcr.io/ecomind/k3s-nfs:1.22.13-k3s1

@maoxuner

maoxuner commented Aug 27, 2022

@fragolinux It's not good practice to override the Alpine binaries directly with the ones from the original (scratch-based) image; there may be incompatibilities between the binaries.

A better way would be to replace all the binaries with Alpine packages, but I couldn't find packages providing files such as /bin/aux/xtables. That's why I copied all the bin files to /opt/k3s/bin.

Anyway, if it works, it's still a good approach.


By the way, do you know of any method to back up and restore clusters (multiple nodes) created by k3d? I've tried backing up /var/lib/rancher/k3s/server/db (SQLite by default), but a new cluster can't restore from it.

marcoaraujojunior added a commit to marcoaraujojunior/k3s-docker that referenced this issue Oct 22, 2022
a template creating a Dockerfile to allow use nfs
@jlian

jlian commented Jun 29, 2023

Hey, I got NFS to work in k3d in a GitHub Codespace based on the info in this thread.

https://github.com/jlian/k3d-nfs

It's mostly the same as @marcoaraujojunior's commit marcoaraujojunior/k3s-docker@914c6f8, with touch /run/openrc/softlevel added to the entrypoint script, plus figuring out that you need to set export K3D_FIX_CGROUPV2=false so that the entrypoint isn't overridden.

Try it with:

export K3D_FIX_CGROUPV2=false
k3d cluster create -i ghcr.io/jlian/k3d-nfs:v1.25.3-k3s1

@iwilltry42
Member

@jlian, instead of disabling k3d's entrypoints (there are actually multiple), just add your script to the list by placing it at /bin/k3d-entrypoint-*.sh, replacing the * with the name of your script.
This will make k3d execute it alongside the other entrypoint scripts.
I hope that we can expose this more easily using the lifecycle hooks at some point.
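
Concretely, a sketch of what that could look like when building a custom node image (the script name and its contents are illustrative; starting rpcbind and rpc.statd mirrors what the NFS-enabled images in this thread need, per the rpc.statd error reported later):

```shell
#!/bin/sh
# Generate an extra k3d entrypoint script. k3d's main entrypoint executes every
# /bin/k3d-entrypoint-*.sh it finds, so no ENTRYPOINT override is needed.
cat > k3d-entrypoint-nfs.sh <<'EOF'
#!/bin/sh
# OpenRC expects this marker file before services can be started
mkdir -p /run/openrc
touch /run/openrc/softlevel
# statd is required for NFS remote locking; rpcbind must be up first
rpcbind
rpc.statd
EOF
chmod +x k3d-entrypoint-nfs.sh
# In the Dockerfile, it would then be copied into place, e.g.:
#   COPY k3d-entrypoint-nfs.sh /bin/k3d-entrypoint-nfs.sh
```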

jlian added a commit to jlian/k3d-nfs that referenced this issue Jun 30, 2023
@jlian

jlian commented Jun 30, 2023

@iwilltry42 Ok, thanks, got it to work. Now it just needs: k3d cluster create -i ghcr.io/jlian/k3d-nfs:v1.25.3-k3s1

It took me a while to find the entrypoint logs in /var/log, as opposed to the Docker container logs. I also noticed that this custom-entrypoint method doesn't work on k3d v4 and older.

@ryan-mcd

ryan-mcd commented Jul 9, 2023

I am not currently seeing the issues @jlian is experiencing.

I have created a repository with the latest images from the 1.25, 1.26, and 1.27 channels, as well as the "stable" channel at https://github.com/ryan-mcd/k3s-containers

Feel free to utilize these images.

@dcpc007

dcpc007 commented Jul 27, 2023

Wow!!! Thanks!!!

I lost 5 hours on my first test trying to create an NFS share for my pods on a Synology!
And it was all k3d's fault!

+1 to fix this; even if k3d's purpose is mainly testing, not having NFS for storage is a strange pain!

Any idea how to flag this to the k3d company more directly?

Thanks @jlian for the 1.25 image; no 1.26 or later?
(And thanks for Gloomhaven; I'm starting with Jaws of the Lion, maybe I'll try to adapt it for those scenarios!)

@jlian

jlian commented Nov 3, 2023

@ryan-mcd you got NFS to work in Codespaces without using OpenRC? It's been a while, but I remember that when I first tried it without OpenRC it kept failing. Can you show me which part of your Dockerfile makes it work?

EDIT: hmm, I tried your image and it didn't work for me; my pod gets FailedMount with MountVolume.SetUp failed for volume "pvc-***" : mount failed: exit status 32 and

Mounting command: mount
Mounting arguments: -t nfs -o vers=3 10.43.210.51:/export/pvc-*** /var/lib/kubelet/pods/***/volumes/kubernetes.io~nfs/pvc-***
Output: mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.

@iwilltry42
Member

> Wow!!! Thanks!!!
>
> I lost 5 hours on my first test trying to create an NFS share for my pods on a Synology! And it was all k3d's fault!
>
> +1 to fix this; even if k3d's purpose is mainly testing, not having NFS for storage is a strange pain!
>
> Any idea how to flag this to the k3d company more directly?
>
> Thanks @jlian for the 1.25 image; no 1.26 or later? (And thanks for Gloomhaven; I'm starting with Jaws of the Lion, maybe I'll try to adapt it for those scenarios!)

@dcpc007 there is no company behind k3d. There is SUSE Rancher behind K3s, though, which is what runs inside k3d, so feel free to open issues/PRs on https://github.com/k3s-io/k3s or ask them via Slack.
Or, if you want, you can try to come up with an automated workflow that builds k3d-specific images of K3s that include extra features (like NFS support) that may not be included upstream.

@ryan-mcd

ryan-mcd commented Nov 3, 2023

> @ryan-mcd you got NFS to work in Codespaces without using OpenRC? It's been a while, but I remember that when I first tried it without OpenRC it kept failing. Can you show me which part of your Dockerfile makes it work?
>
> EDIT: hmm, I tried your image and it didn't work for me; my pod gets FailedMount with MountVolume.SetUp failed for volume "pvc-***" : mount failed: exit status 32 and
>
> Mounting command: mount
> Mounting arguments: -t nfs -o vers=3 10.43.210.51:/export/pvc-*** /var/lib/kubelet/pods/***/volumes/kubernetes.io~nfs/pvc-***
> Output: mount.nfs: rpc.statd is not running but is required for remote locking.
> mount.nfs: Either use '-o nolock' to keep locks local, or start statd.

I don't use Codespaces; perhaps that's why I didn't have an issue without OpenRC. In my local environment it worked fine without it, so I didn't include it. I can certainly add it back.

Which version were you planning/attempting to use?
