Kubernetes Service not working with NFS between two pods #74266

Closed
CodeCorrupt opened this issue Feb 19, 2019 · 11 comments

@CodeCorrupt commented Feb 19, 2019

What happened: Due to what I can only assume is an issue with K8s Services, I'm unable to mount an NFS share exported by a server pod from a client pod

What you expected to happen: mount -t nfs nfs-service:/exports /mnt/test to successfully mount

How to reproduce it (as minimally and precisely as possible):
I've copied below what I posted on Super User and Stack Overflow, where I got no responses.

When I try to mount /exports from the nfs-server, I get an error about it being write-protected, even though (rw) is specified in /etc/exports. After testing, I believe it's related to the way Kubernetes Services handle low port numbers.

First I created a GCP Disk to mount as a PV
gcloud compute disks create --size=50GB --zone=us-central1-a nfs-disk

Then I kubectl apply -f the following two configs.

NFS_Deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nfs-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-server
  template:
    metadata:
      labels:
        app: nfs-server
    spec:
      containers:
      - name: nfs-server
        image: gcr.io/google_containers/volume-nfs:0.8
        ports:
          - name: nfs
            containerPort: 2049
          - name: mountd
            containerPort: 20048
          - name: rpcbind
            containerPort: 111
        securityContext:
          privileged: true
        volumeMounts:
          - mountPath: /exports
            name: nfs-disk
      volumes:
        - name: nfs-disk
          gcePersistentDisk:
            pdName: nfs-disk
            fsType: ext4

NFS_Service.yaml

apiVersion: v1
kind: Service
metadata:
  name: nfs-service
spec:
  clusterIP: 10.27.254.55
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111
  selector:
    app: nfs-server

This produces the following /etc/exports inside the nfs-server container:

[root@nfs-server-85546b8c7b-jlzc5 /]# cat /etc/exports
/exports *(rw,fsid=0,insecure,no_root_squash)
/ *(rw,fsid=0,insecure,no_root_squash)

I can successfully mount /exports from localhost on the nfs-server pod, i.e.:
mount -t nfs localhost:/exports /test/
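
For anyone reproducing this, the export and the local mount can be double-checked from inside the server pod with something like the following sketch; the pod name is taken from the output above, and showmount assumes the nfs-utils tooling present in the volume-nfs image:

# exec into the running nfs-server pod (name will differ per deployment)
kubectl exec -it nfs-server-85546b8c7b-jlzc5 -- /bin/bash

# list what the server believes it is exporting
showmount -e localhost

# mount the export locally and confirm it is writable
mkdir -p /test
mount -t nfs localhost:/exports /test
touch /test/write-check && ls -l /test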

After this, I run a temporary container just to test with the following command
kubectl run --rm -it --image=markeijsermans/debug:kitchen-sink debug /bin/bash

  • The image being used is just a troubleshooting image I found. I have the exact same issue with every image I tried, including Ubuntu and Fedora.

When I try to mount the NFS export inside the debug container (in the same cluster), I get the following errors:

(22:54 debug-7dcc5cd59f-496fd:/) mkdir /test 
(22:54 debug-7dcc5cd59f-496fd:/) chmod -R 777 /test/
(22:54 debug-7dcc5cd59f-496fd:/) mount -t nfs nfs-service.default.svc.cluster.local:/exports /test/
mount: nfs-service.default.svc.cluster.local:/exports is write-protected, mounting read-only
mount: cannot mount nfs-service.default.svc.cluster.local:/exports read-only
(32 22:54 debug-7dcc5cd59f-496fd:/) mount -t nfs 10.27.254.55:/exports /test/                                 
mount: 10.27.254.55:/exports is write-protected, mounting read-only
mount: cannot mount 10.27.254.55:/exports read-only
(32 22:55 debug-7dcc5cd59f-496fd:/) mount -t nfs nfs-service:/exports /test/                                  
mount: nfs-service:/exports is write-protected, mounting read-only
mount: cannot mount nfs-service:/exports read-only
(32 22:55 debug-7dcc5cd59f-496fd:/) 

^^^^ This is the issue I'm trying to solve

Curiously, I'm able to use the NFS server as a PV and PVC. To demonstrate, the following YAML deploys and can be used successfully:

NFS_Volume.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfs-service.default.svc.cluster.local
    path: "/"

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 10Gi

This can then be proven and tested with the following ReplicationController:

NFS_Test_PV.yaml

# This mounts the nfs volume claim into /mnt and continuously
# overwrites /mnt/index.html with the time and hostname of the pod.

apiVersion: v1
kind: ReplicationController
metadata:
  name: nfs-busybox
spec:
  replicas: 2
  selector:
    name: nfs-busybox
  template:
    metadata:
      labels:
        name: nfs-busybox
    spec:
      containers:
      - image: busybox
        command:
          - sh
          - -c
          - 'while true; do date > /mnt/index.html; hostname >> /mnt/index.html; sleep $(($RANDOM % 5 + 5)); done'
        imagePullPolicy: IfNotPresent
        name: busybox
        volumeMounts:
          - name: nfs
            mountPath: "/mnt"
      volumes:
      - name: nfs
        persistentVolumeClaim:
          claimName: nfs
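
As a quick sanity check (not part of the original report), the shared writes can be read back from either replica, for example:

# list the busybox pods created by the ReplicationController
kubectl get pods -l name=nfs-busybox

# read the file both replicas keep overwriting (substitute a real pod name)
kubectl exec nfs-busybox-<pod-suffix> -- cat /mnt/index.html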

All of this leads me to believe that there is something going on with Services that (my best guess) drops traffic on low port numbers, like RPC.

Anything else we need to know?: I understand this might be better asked on Stack Overflow, but I have tried to find answers elsewhere to no avail. This is a last-ditch effort to get an answer.

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.7", GitCommit:"0c38c362511b20a098d7cd855f1314dad92c2780", GitTreeState:"clean", BuildDate:"2018-08-20T10:09:03Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.11-gke.1", GitCommit:"5c4fddf874319c9825581cc9ab1d0f0cf51e1dc9", GitTreeState:"clean", BuildDate:"2018-11-30T16:18:58Z", GoVersion:"go1.9.3b4", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: Google Kubernetes Engine Master version 1.10.11-gke.1
  • OS (e.g. from /etc/os-release):
    Node version 1.10.11-gke.1
    Node image Container-Optimized OS (cos)
    Machine type n1-standard-4 (4 vCPUs, 15 GB memory)
  • Kernel (e.g. uname -a): ???
  • Install tools: ???
  • Others: ???
@CodeCorrupt (Author) commented Feb 19, 2019

@kubernetes/sig-network-bugs

@k8s-ci-robot added sig/network and removed needs-sig labels Feb 19, 2019

@k8s-ci-robot (Contributor) commented Feb 19, 2019

@CodeCorrupt: Reiterating the mentions to trigger a notification:
@kubernetes/sig-network-bugs

In response to this:

@kubernetes/sig-network-bugs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@YoubingLi (Contributor) commented Feb 21, 2019

I don't think this is a bug.

NFS doesn't allow you to export a nested (re-exported) filesystem.

The container mounts a host-based path at /exports, then exports that /exports path from inside the container. You cannot mount the container's /exports anywhere else.

It is the same as the following scenario:

1. HostA exports path1.
2. HostB mounts HostA:path1 as path2, then exports path2.
3. NFS doesn't allow you to mount HostB:path2 on HostC; an "access denied by server" message would show up.
@CodeCorrupt (Author) commented Feb 21, 2019

@YoubingLi Thank you for that explanation! That's the best answer I've gotten by far.

I'm curious now: why is it that I can use that share as a Persistent Volume, then? What is the difference between me mounting the NFS share inside a container, and Kubernetes mounting that same NFS share internally to create a PV/PVC? Also, why am I able to mount the same NFS share via localhost in the pod running the nfs-server?

There seems to be something special that creating an NFS PV/PVC (as in the config I shared above) does that I can't replicate when just trying to mount inside another pod.

@thockin self-assigned this Mar 7, 2019

@thockin (Member) commented Mar 21, 2019

ping to @thockin

@thockin (Member) commented Mar 22, 2019

There really is not anything special about how volumes are mounted directly within a Pod vs. through a PVC.

To debug, here's what I would try (rough commands for 2 and 3 are sketched after the list):

  1. Make sure your test client is running privileged -- mount requires more than just a root UID.

  2. Can a pod client mount the NFS share from the server Pod's IP address (instead of the service address)?

  3. On the Node where the server pod is running, you can inspect the various mount modes via /proc/mounts. Make sure it's read-write everywhere.
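
For illustration, steps 2 and 3 might look roughly like this (a sketch; the pod IP and mount point are placeholders, not taken from this issue):

# 2. find the server pod's IP and mount from it directly, bypassing the Service
kubectl get pod -l app=nfs-server -o wide
mount -t nfs <pod-ip>:/exports /test

# 3. on the node running the server pod, confirm nothing relevant is mounted read-only
grep -E 'exports|nfs' /proc/mounts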

This does not seem like a network issue at all, and I doubt very much it is a bug in Kubernetes. If you have evidence otherwise, PLEASE reopen and I will dig deeper.

@thockin closed this Mar 22, 2019

@CodeCorrupt (Author) commented Mar 23, 2019

I should have updated the issue a while ago. You're correct about starting the pod with privileged. That fixed the issue for me.

Thank you all for the help!

@CodeCorrupt (Author) commented Apr 26, 2019

@mlensment Why are you trying to telnet? telnet uses port 23, which you don't have forwarded, and it needs the telnet daemon running on the other end.

My solution was to make sure the client pod you're trying to mount from is running in privileged mode.

@mlensment commented Apr 26, 2019

@CodeCorrupt I deleted my previous comment; my service was defined incorrectly. My issue was not related to this topic.

@johnmcdowall commented May 12, 2019

@CodeCorrupt Any chance you could post a full configuration? I'm running into the same issue just following the official NFS example, and I've tried privileged mode and still can't get it to work.

@CodeCorrupt (Author) commented May 12, 2019

@johnmcdowall Sure!

NFSDeployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nfs
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nfs
  template:
    metadata:
      labels:
        app: nfs
    spec:
      containers:
      - name: nfs
        image: gcr.io/google_containers/volume-nfs:0.8
        ports:
          - name: nfs
            containerPort: 2049
          - name: mountd
            containerPort: 20048
          - name: rpcbind
            containerPort: 111
        securityContext:
          privileged: true
        volumeMounts:
          - mountPath: /exports
            name: nfs-disk
      volumes:
        - name: nfs-disk
          gcePersistentDisk:
            pdName: varnost-nfs-disk
            fsType: ext4

(For the volume, you can use whatever you like; I've just been using a GCP disk.)
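
For example, if you are not on GCP, a minimal sketch would be to swap the gcePersistentDisk volume in NFSDeployment.yaml for an emptyDir just to test (the data won't survive pod restarts):

      volumes:
        - name: nfs-disk
          emptyDir: {}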

NFSService.yaml

apiVersion: v1
kind: Service
metadata:
  name: nf-service
spec:
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111
  selector:
    app: nfs

Debugging.yaml

apiVersion: v1
kind: Pod
metadata:
  name: debug
spec:
  containers:
  - name: debug
    image: ubuntu
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 10; done;" ]
    securityContext:
      privileged: true

From within the debug container, you should be able to mount the NFS directory (assuming the NFS server itself is configured to export the correct folder/volume).
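
For completeness, the mount itself from inside that debug pod would look something like the sketch below, using the nf-service name from NFSService.yaml above (assumption: the stock ubuntu image needs the NFS client tools installed first):

kubectl exec -it debug -- /bin/bash

# inside the pod: install the NFS client tools (assumption: not preinstalled on ubuntu)
apt-get update && apt-get install -y nfs-common

mkdir -p /mnt/nfs
mount -t nfs nf-service:/exports /mnt/nfs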
