
NFS example does not work on google container engine. #24687

Closed
arenoir opened this issue Apr 22, 2016 · 63 comments
Labels: area/example, kind/documentation, priority/backlog, sig/storage

Comments


arenoir commented Apr 22, 2016

The NFS example at /examples/nfs does not work on Google Container Engine.

The nfs-server runs but the nfs-busybox won't mount the PersistentVolumeClaim.

The busybox pod errors out trying to mount the persistent volume claim.

Output: mount.nfs: Connection timed out

The nfs-pv uses the nfs-server service IP. Both the persistent volume and the persistent volume claim are bound.

I did notice the NFS server logged a warning:

rpcinfo: can't contact rpcbind: : RPC: Unable to receive; errno = Connection refused

I have tried exposing additional ports (111 TCP/UDP and 2049 UDP), but that had no effect.
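
For reference, this is roughly what I mean by the extra ports; a sketch of the nfs-server Service, not my exact manifest, with illustrative port names:

apiVersion: v1
kind: Service
metadata:
  name: nfs-server
spec:
  selector:
    role: nfs-server
  ports:
  # original NFS port from the example
  - name: nfs-tcp
    port: 2049
    protocol: TCP
  # additional ports I tried exposing
  - name: nfs-udp
    port: 2049
    protocol: UDP
  - name: rpcbind-tcp
    port: 111
    protocol: TCP
  - name: rpcbind-udp
    port: 111
    protocol: UDP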

Please help.

#kubectl describe service nfs-server
Name:           nfs-server
Namespace:      default
Labels:         <none>
Selector:       role=nfs-server
Type:           ClusterIP
IP:         10.19.247.137
Port:           <unset> 2049/TCP
Endpoints:      10.16.3.4:2049
Session Affinity:   None
No events.
#kubectl describe pv nfs
Name:       nfs
Labels:     <none>
Status:     Bound
Claim:      default/nfs
Reclaim Policy: Retain
Access Modes:   RWX
Capacity:   1Mi
Message:    
Source:
    Type:   NFS (an NFS mount that lasts the lifetime of a pod)
    Server: 10.19.247.137
    Path:   /
    ReadOnly:   false
#kubectl describe pvc nfs
Name:       nfs
Namespace:  default
Status:     Bound
Volume:     nfs
Labels:     <none>
Capacity:   1Mi
Access Modes:   RWX
#kubectl describe pod nfs-server-e71xs
Name:       nfs-server-e71xs
Namespace:  default
Node:       gke-fieldphone-32335ca1-node-9o0q/10.128.0.4
Start Time: Fri, 22 Apr 2016 13:39:57 -0700
Labels:     role=nfs-server
Status:     Running
IP:     10.16.3.4
Controllers:    ReplicationController/nfs-server
Containers:
  nfs-server:
    Container ID:   docker://d0f11148b09986163c73baf525d57b4a59b3bce149f1776f117adcb444993a5c
    Image:      gcr.io/google_containers/volume-nfs
    Image ID:       docker://3f8217a3a8f1e891612aece9cbf8b8defeb1f1ffa39836ebb7de5e03139f56a7
    Port:       2049/TCP
    QoS Tier:
      cpu:  Burstable
      memory:   BestEffort
    Requests:
      cpu:      100m
    State:      Running
      Started:      Fri, 22 Apr 2016 13:39:59 -0700
    Ready:      True
    Restart Count:  0
    Environment Variables:
Conditions:
  Type      Status
  Ready     True 
Volumes:
  default-token-szz1v:
    Type:   Secret (a volume populated by a Secret)
    SecretName: default-token-szz1v
Events:
  FirstSeen LastSeen    Count   From                        SubobjectPath           Type        Reason      Message
  --------- --------    -----   ----                        -------------           --------    ------      -------
  29m       29m     1   {default-scheduler }                                Normal      Scheduled   Successfully assigned nfs-server-e71xs to gke-fieldphone-32335ca1-node-9o0q
  29m       29m     1   {kubelet gke-fieldphone-32335ca1-node-9o0q} spec.containers{nfs-server} Normal      Pulling     pulling image "gcr.io/google_containers/volume-nfs"
  29m       29m     1   {kubelet gke-fieldphone-32335ca1-node-9o0q} spec.containers{nfs-server} Normal      Pulled      Successfully pulled image "gcr.io/google_containers/volume-nfs"
  29m       29m     1   {kubelet gke-fieldphone-32335ca1-node-9o0q} spec.containers{nfs-server} Normal      Created     Created container with docker id d0f11148b099
  29m       29m     1   {kubelet gke-fieldphone-32335ca1-node-9o0q} spec.containers{nfs-server} Normal      Started     Started container with docker id d0f11148b099
#kubectl describe pod nfs-busybox-fu4el
Name:       nfs-busybox-fu4el
Namespace:  default
Node:       gke-fieldphone-32335ca1-node-00f2/10.128.0.9
Start Time: Fri, 22 Apr 2016 13:49:25 -0700
Labels:     name=nfs-busybox
Status:     Pending
IP:     
Controllers:    ReplicationController/nfs-busybox
Containers:
  busybox:
    Container ID:   
    Image:      busybox
    Image ID:       
    Port:       
    Command:
      sh
      -c
      while true; do date > /mnt/index.html; hostname >> /mnt/index.html; sleep $(($RANDOM % 5 + 5)); done
    QoS Tier:
      cpu:  Burstable
      memory:   BestEffort
    Requests:
      cpu:      100m
    State:      Waiting
      Reason:       ContainerCreating
    Ready:      False
    Restart Count:  0
    Environment Variables:
Conditions:
  Type      Status
  Ready     False 
Volumes:
  nfs:
    Type:   PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  nfs
    ReadOnly:   false
  default-token-szz1v:
    Type:   Secret (a volume populated by a Secret)
    SecretName: default-token-szz1v
Events:
  FirstSeen LastSeen    Count   From                        SubobjectPath   Type        Reason          Message
  --------- --------    -----   ----                        -------------   --------    ------          -------
  19m       14m     21  {default-scheduler }                        Warning     FailedScheduling    PersistentVolumeClaim is not bound: "nfs"
  13m       13m     4   {default-scheduler }                        Warning     FailedScheduling    PersistentVolumeClaim 'default/nfs' is not in cache
  13m       13m     1   {default-scheduler }                        Normal      Scheduled       Successfully assigned nfs-busybox-fu4el to gke-fieldphone-32335ca1-node-00f2
  11m       1m      5   {kubelet gke-fieldphone-32335ca1-node-00f2}         Warning     FailedMount     Unable to mount volumes for pod "nfs-busybox-fu4el_default(ce04b0d3-08ca-11e6-a6a8-42010af000bb)": Mount failed: exit status 32
Mounting arguments: 10.19.247.137:/ /var/lib/kubelet/pods/ce04b0d3-08ca-11e6-a6a8-42010af000bb/volumes/kubernetes.io~nfs/nfs nfs []
Output: mount.nfs: Connection timed out


  11m   1m  5   {kubelet gke-fieldphone-32335ca1-node-00f2}     Warning FailedSync  Error syncing pod, skipping: Mount failed: exit status 32
Mounting arguments: 10.19.247.137:/ /var/lib/kubelet/pods/ce04b0d3-08ca-11e6-a6a8-42010af000bb/volumes/kubernetes.io~nfs/nfs nfs []
Output: mount.nfs: Connection timed out
@erinboyd

What version of NFS are you using?

@erinboyd

Can you also include your exports from the NFS server?


arenoir commented Apr 22, 2016

@erinboyd I don't know; whatever version is running in the Google image from the example, gcr.io/google_containers/volume-nfs.

I can't find the source for Google's Docker images.

This is a vanilla setup taken straight out of the documentation; it should "just work".


arenoir commented Apr 22, 2016

According to the README it is exporting /mnt/data as /.

The server exports /mnt/data directory as / (fsid=0). The directory contains dummy index.html. Wait until the pod is running by checking kubectl get pods -lrole=nfs-server.
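
For context, an fsid=0 export of that directory would typically look something like the line below in /etc/exports; this is illustrative only, not necessarily the exact configuration baked into the image:

# illustrative export entry; the options are an assumption, not the image's actual exports file
/mnt/data *(rw,fsid=0,insecure,no_root_squash)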


arenoir commented Apr 25, 2016

@erinboyd I have tried a couple of other Docker images in place of gcr.io/google_containers/volume-nfs without any success. Do you have, or know of, a working nfs-server Docker implementation?

This is the last piece keeping me from moving my setup from colocation to Google Cloud.


xidui commented Apr 26, 2016

+1
I also had the same problem.


xidui commented Apr 26, 2016

here is my output:

[root@kubernetes-master nfs]# kubectl get pod
NAME                READY     STATUS              RESTARTS   AGE
nfs-busybox-30a30   0/1       ContainerCreating   0          9m
nfs-busybox-8dw8q   0/1       ContainerCreating   0          9m
nfs-server-0e7rq    1/1       Running             0          4s
[root@kubernetes-master nfs]# kubectl logs nfs-server-0e7rq
Serving /exports
Serving /
rpcinfo: can't contact rpcbind: : RPC: Unable to receive; errno = Connection refused
Starting rpcbind
NFS started
[root@kubernetes-master ~]# kubectl describe po nfs-busybox-8dw8q
Name:                           nfs-busybox-8dw8q
Namespace:                      default
Image(s):                       busybox
Node:                           node-2-slave-1/107.170.32.151
Start Time:                     Mon, 25 Apr 2016 22:29:56 -0400
Labels:                         name=nfs-busybox
Status:                         Pending
Reason:
Message:
IP:
Replication Controllers:        nfs-busybox (2/2 replicas created)
Containers:
  busybox:
    Container ID:
    Image:              busybox
    Image ID:
    QoS Tier:
      cpu:              BestEffort
      memory:           BestEffort
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Environment Variables:
Conditions:
  Type          Status
  Ready         False
Volumes:
  nfs:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  nfs
    ReadOnly:   false
  default-token-pjwpm:
    Type:       Secret (a secret that should populate this volume)
    SecretName: default-token-pjwpm
Events:
  FirstSeen     LastSeen        Count   From                            SubobjectPath   Reason          Message
  ─────────     ────────        ─────   ────                            ─────────────   ──────          ───────
  16m           16m             1       {scheduler }                                    Scheduled       Successfully assigned nfs-busybox-8dw8q to node-2-slave-1
  16m           6s              92      {kubelet node-2-slave-1}                        FailedMount     Unable to mount volumes for pod "nfs-busybox-8dw8q_default            status 32
Mounting arguments: 10.254.207.116:/ /var/lib/kubelet/pods/bd8e9ba7-0b56-11e6-a2e5-0401cb31c801/volumes/kubernetes.io~nfs/nfs nfs []
Output: Job for rpc-statd.service failed because the control process exited with error code. See "systemctl status rpc-statd.service" and "journalctl -xe" for det
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified


  16m   6s      92      {kubelet node-2-slave-1}                FailedSync      Error syncing pod, skipping: Mount failed: exit status 32
Mounting arguments: 10.254.207.116:/ /var/lib/kubelet/pods/bd8e9ba7-0b56-11e6-a2e5-0401cb31c801/volumes/kubernetes.io~nfs/nfs nfs []
Output: Job for rpc-statd.service failed because the control process exited with error code. See "systemctl status rpc-statd.service" and "journalctl -xe" for det
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified


xidui commented Apr 26, 2016

@arenoir I have solved this problem; see if my situation also applies to you:

I found that there are some problems with the image gcr.io/google_containers/volume-nfs.
Check whether there is a /mnt/data directory in the nfs-server container.

The Dockerfile says that it copies index.html to /mnt/data/index.html, but after executing kubectl exec -it nfs-server bash and running ls /mnt, I found there is no data directory inside. So I rebuilt the image and used the new one. After that, the issue was solved.
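
A hypothetical Dockerfile fragment for such a rebuild; the base image, packages, and script name are assumptions rather than the actual upstream Dockerfile, the point is just to make sure /mnt/data and its index.html exist in the image:

# hypothetical rebuild sketch; not the upstream Dockerfile
FROM centos:7
RUN yum -y install nfs-utils && yum clean all
RUN mkdir -p /exports /mnt/data
COPY index.html /mnt/data/index.html
COPY run_nfs.sh /usr/local/bin/run_nfs.sh
EXPOSE 2049/tcp 20048/tcp 111/tcp 111/udp
ENTRYPOINT ["/usr/local/bin/run_nfs.sh", "/exports", "/mnt/data"]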


arenoir commented Apr 28, 2016

@xidui, thanks for the info... I was not able to get the gcr.io/google_containers/volume-nfs image to work. I ended up using the image jsafrane/nfsexporter. All's well that ends well.

@mml mml added kind/documentation Categorizes issue or PR as related to documentation. sig/storage Categorizes an issue or PR as relevant to SIG Storage. team/cluster labels May 2, 2016

mml commented May 2, 2016

@arenoir it looks like you got things working, but I believe we should still correct our docs, at least.


ekozan commented May 3, 2016

I think somebody has pushed the image built from PR #22665 to gcr.io/google_containers/volume-nfs.

That's why it no longer works. @rootfs


rootfs commented May 3, 2016

@pwittrock ^^^

@bgrant0607

cc @kubernetes/examples


liubin commented Jun 14, 2016

The same error (in Vagrant):

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.4", GitCommit:"3eed1e3be6848b877ff80a93da3785d9034d0a4f", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.4", GitCommit:"3eed1e3be6848b877ff80a93da3785d9034d0a4f", GitTreeState:"clean"}

And the nfs-server error:

2016-06-14T10:57:10.776492975Z Serving /exports
2016-06-14T10:57:10.780350890Z Serving /
2016-06-14T10:57:10.878049309Z rpcinfo: can't contact rpcbind: : RPC: Unable to receive; errno = Connection refused
2016-06-14T10:57:10.887171106Z Starting rpcbind
2016-06-14T10:57:11.143431556Z NFS started


rootfs commented Jun 14, 2016

Note, the image is updated per #22665.

The example works on GCE/GKE, AWS and Cinder.


klaus commented Jun 23, 2016

I am also getting this output, but the NFS sample still works for me in the GCE cloud installation:

Output: mount.nfs: Connection timed out

It might be interesting to know that it reproducibly does NOT work in a local k8s-in-Docker installation based on Debian. I am wondering if this might be a useful hint? Locally I am using HostPath PVs and am connected to a network manager running on my laptop.

What points me in the network direction rather than at HostPath is that when I start the nfs-common service on my local machine while running the NFS example, I eventually get a full lockup of my box, with something like "CPU locked for more than 22s":

Jun 22 16:32:27 vex kernel: [ 1240.971949] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [rpc.nfsd:27497]
Jun 22 16:32:27 vex kernel: [ 1240.971952] Modules linked in: xt_nat(E) xt_recent(E) xt_mark(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) xt_comment(E) veth(E) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) rfcomm(E) fuse(E) x
t_conntrack(E) ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) xt_addrtype(E) iptable_filter(E) ip_tables(E) x_tables(E) br_netfilter(E) nf_nat(E) nf_conntrack(E
) bridge(E) stp(E) llc(E) pci_stub(E) vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) overlay(E) bnep(E) cpufreq_powersave(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) fscache(E) sunrpc(E) nls_utf8(E) nl
s_cp437(E) vfat(E) fat(E) dm_crypt(E) wl(POE) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) arc4(E) btusb(E) kvm_intel(E) kvm(E) btrtl(E) btbcm(E) btintel(E) bluetooth(E) iTCO_wdt(E) iTCO_vendor_support
(E) evdev(E) irqbypass(E) ath9k(E) ath9k_common(E) ath9k_hw(E) dcdbas(E) ath(E) mac80211(E) cfg80211(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_
intel(E) snd_hda_intel(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) sg(E) snd_pcm(E) snd_timer(E) pcspkr(E) rfkill(E) snd(E) mei_me(E) 8250_fintek(E) battery(E) mei(E) soundcore(E) lpc_ich(E) mfd_core(E) efi_pstore(E) ie31
200_edac(E) edac_core(E) i2c_i801(E) video(E) shpchp(E) serio_raw(E) tpm_tis(E) efivars(E) tpm(E) processor(E) button(E) amdgpu(E) parport_pc(E) ppdev(E) lp(E) parport(E) efivarfs(E) autofs4(E) ext4(E) ecb(E) crc16(E) jbd2(E) 
crc32c_generic(E) mbcache(E) dm_mod(E) raid1(E) md_mod(E) hid_generic(E) usbhid(E) hid(E) sr_mod(E) cdrom(E) sd_mod(E) crc32c_intel(E) ahci(E) aesni_intel(E) libahci(E) radeon(E) xhci_pci(E) i2c_algo_bit(E) xhci_hcd(E) libata(
E) aes_x86_64(E) ehci_pci(E) glue_helper(E) lrw(E) gf128mul(E) ehci_hcd(E) ablk_helper(E) cryptd(E) e1000e(E) drm_kms_helper(E) psmouse(E) scsi_mod(E) ttm(E) ptp(E) usbcore(E) pps_core(E) drm(E) usb_common(E) thermal(E) fjes(E
)
Jun 22 16:32:27 vex kernel: [ 1240.972022] CPU: 1 PID: 27497 Comm: rpc.nfsd Tainted: P           OEL  4.6.0-1-amd64 #1 Debian 4.6.1-1
Jun 22 16:32:27 vex kernel: [ 1240.972022] Hardware name: Dell Inc. OptiPlex 7010/0GY6Y8, BIOS A16 09/09/2013
Jun 22 16:32:27 vex kernel: [ 1240.972023] task: ffff8800a2de6080 ti: ffff8804c6364000 task.ti: ffff8804c6364000
Jun 22 16:32:27 vex kernel: [ 1240.972025] RIP: 0010:[<ffffffff8109a138>]  [<ffffffff8109a138>] blocking_notifier_chain_register+0x38/0x90
Jun 22 16:32:27 vex kernel: [ 1240.972029] RSP: 0018:ffff8804c6367dd8  EFLAGS: 00000246
Jun 22 16:32:27 vex kernel: [ 1240.972030] RAX: ffffffffc0c90cd0 RBX: ffffffff81add020 RCX: 0000000000000000
Jun 22 16:32:27 vex kernel: [ 1240.972031] RDX: ffffffffc0c90cd8 RSI: ffffffffc0bf7810 RDI: ffffffff81add020
Jun 22 16:32:27 vex kernel: [ 1240.972032] RBP: ffffffffc0bf7810 R08: 0000000000000005 R09: ffff8804abffc500
Jun 22 16:32:27 vex kernel: [ 1240.972032] R10: ffff8804c1fbb200 R11: 0000000000000000 R12: ffff8800a2e661c0
Jun 22 16:32:27 vex kernel: [ 1240.972033] R13: ffff8804c1fbb200 R14: 0000000000000000 R15: ffff8804c1fbb200
Jun 22 16:32:27 vex kernel: [ 1240.972034] FS:  00007fe8678fe840(0000) GS:ffff88061dc80000(0000) knlGS:0000000000000000
Jun 22 16:32:27 vex kernel: [ 1240.972035] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 22 16:32:27 vex kernel: [ 1240.972036] CR2: 00007fcb61036000 CR3: 00000004ac007000 CR4: 00000000001406e0
Jun 22 16:32:27 vex kernel: [ 1240.972036] Stack:
Jun 22 16:32:27 vex kernel: [ 1240.972037]  ffff8804690ac200 ffff8800a2e661c0 ffffffffc0bebe3d 0000000000000002
Jun 22 16:32:27 vex kernel: [ 1240.972039]  ffff8800a2e661c0 0000000000000000 ffff8804c1fbb200 ffffffffc0c570e7
Jun 22 16:32:27 vex kernel: [ 1240.972040]  ffffffffc0c58510 ffff8804e1d5c008 ffff8800a2e661c0 ffffffffc0c58510
Jun 22 16:32:27 vex kernel: [ 1240.972041] Call Trace:
Jun 22 16:32:27 vex kernel: [ 1240.972046]  [<ffffffffc0bebe3d>] ? lockd_up+0x11d/0x330 [lockd]
Jun 22 16:32:27 vex kernel: [ 1240.972053]  [<ffffffffc0c570e7>] ? nfsd_svc+0x1c7/0x2a0 [nfsd]
Jun 22 16:32:27 vex kernel: [ 1240.972058]  [<ffffffffc0c58510>] ? write_leasetime+0x80/0x80 [nfsd]
Jun 22 16:32:27 vex kernel: [ 1240.972062]  [<ffffffffc0c58510>] ? write_leasetime+0x80/0x80 [nfsd]
Jun 22 16:32:27 vex kernel: [ 1240.972065]  [<ffffffffc0c58596>] ? write_threads+0x86/0xe0 [nfsd]
Jun 22 16:32:27 vex kernel: [ 1240.972068]  [<ffffffff81176862>] ? get_zeroed_page+0x12/0x40
Jun 22 16:32:27 vex kernel: [ 1240.972070]  [<ffffffff812186b0>] ? simple_transaction_get+0xa0/0xb0
Jun 22 16:32:27 vex kernel: [ 1240.972074]  [<ffffffffc0c57b33>] ? nfsctl_transaction_write+0x43/0x70 [nfsd]
Jun 22 16:32:27 vex kernel: [ 1240.972076]  [<ffffffff811f1374>] ? vfs_write+0xa4/0x1a0
Jun 22 16:32:27 vex kernel: [ 1240.972077]  [<ffffffff811f2762>] ? SyS_write+0x52/0xc0
Jun 22 16:32:27 vex kernel: [ 1240.972080]  [<ffffffff815c65b6>] ? system_call_fast_compare_end+0xc/0x96
Jun 22 16:32:27 vex kernel: [ 1240.972081] Code: 74 4a 55 53 48 89 fb 48 89 f5 e8 04 aa 52 00 48 8b 43 28 48 8d 53 28 48 85 c0 74 1c 8b 4d 10 3b 48 10 7e 07 eb 12 39 48 10 7c 0d <48> 8d 50 08 48 8b 40 08 48 85 c0 75 ee 48 89 45 08 48 89 2a 48 

@pwittrock

cc @fabioy


sijnc commented Jun 27, 2016

I got this example working; however, while testing the busybox RC (replicas=2) I noticed that both busybox pods seemed to write to the file at the same time (first entry), whereas all other cat commands show only one host, like the second one. Is this normal?

[root@nfs-web-363353589-qausi exports]# cat index.html
Mon Jun 27 22:53:42 UTC 2016
nfs-web-busybox-tf95c
nfs-web-busybox-wgxl4
[root@nfs-web-363353589-qausi exports]# cat index.html
Mon Jun 27 22:54:38 UTC 2016
nfs-web-busybox-wgxl4


rootfs commented Jun 28, 2016

@sijnc yes, that's how the example works. There are two replicas writing to the NFS share. What you see from curl is the current snapshot of the file.


klaus commented Jun 28, 2016

I now got the example working, though I still think my comment is not fully invalidated by this.

I am running a k8s 1.4.2 local cluster via Docker. This oddly pulls in quite old components; in particular, the DNS component is outdated and not working, and it throws a lot of messages. I did not initially notice them, as the cluster was otherwise behaving quite well. So really, NFS was the first service that fully depended on a perfectly working DNS server. Two changes were needed (roughly the commands sketched after this list):

  • I changed gcr.io/google-containers/kubedns-amd64:1.2 to 1.3 via kubectl edit rc kube-dns-v13 --namespace kube-system (I needed to restart the whole cluster for this to take effect).
  • Second, I had to install nfs-common (sudo apt-get install nfs-common) on the host too. Immediately after that, the volumes got mounted and the busybox containers started. Amazingly, I initially had to purge all NFS-related setup on the host machine to prevent the kernel lockups described above.
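
The command sketch, roughly; exact resource names may differ on other local setups:

# bump the kube-dns image from kubedns-amd64:1.2 to :1.3, then restart the cluster
kubectl edit rc kube-dns-v13 --namespace kube-system
# install the NFS client utilities on the host
sudo apt-get install nfs-common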

so, with caveats, the example is working ...


jingxu97 commented Jan 5, 2017 via email

@pmblatino

@jingxu97 Thanks a lot for pointing those lines out. Just thought of letting people know that it doesn't work on GCE Kubernetes 1.4.6, only on 1.4.7, and when listing this NFS disk (df -h) in the pod it shows the size of the original disk and not the size set on the nfs-pvc.


ghost commented Jan 25, 2017

Just to note that I was able to use the NFS example on GKE node version 1.4.8, but only when using the container-vm node image. If I try to use the gci node image it doesn't work, giving this information in the event history of the pod:

  18m		5m		7	{kubelet gke-qamar-n1-standard2-55a0bb05-wcqw}			Warning		FailedMount	Unable to mount volumes for pod "frontoffice-rc-39.0-u7vp7_default(ebd46ce9-e253-11e6-b119-42010a84013c)": timeout expired waiting for volumes to attach/mount for pod "frontoffice-rc-39.0-u7vp7"/"default". list of unattached/unmounted volumes=[nfsvol]
  18m		5m		7	{kubelet gke-qamar-n1-standard2-55a0bb05-wcqw}			Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "frontoffice-rc-39.0-u7vp7"/"default". list of unattached/unmounted volumes=[nfsvol]
  20m		4m		15	{kubelet gke-qamar-n1-standard2-55a0bb05-wcqw}			Warning		FailedMount	MountVolume.SetUp failed for volume "kubernetes.io/nfs/ebd46ce9-e253-11e6-b119-42010a84013c-client-nfs-pv" (spec.Name: "client-nfs-pv") pod "ebd46ce9-e253-11e6-b119-42010a84013c" (UID: "ebd46ce9-e253-11e6-b119-42010a84013c") with: mount failed: exit status 32
Mounting command: /home/kubernetes/bin/mounter
Mounting arguments: 10.100.245.245:/exports /var/lib/kubelet/pods/ebd46ce9-e253-11e6-b119-42010a84013c/volumes/kubernetes.io~nfs/client-nfs-pv nfs []
Output: Running mount using a rkt fly container
run: group "rkt" not found, will use default gid when rendering images
mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
mount.nfs: an incorrect mount option was specified


jingxu97 commented Jan 25, 2017 via email


avra911 commented Jan 25, 2017

@jingxu97, using path '/' gives a different error: Invalid specification: destination can't be '/', while using gcr.io/google_containers/volume-nfs:0.8.

For the client container, the error is still the same:

Running mount using a rkt fly container run: group "rkt" not found, 
will use default gid when rendering images mount.nfs: 
Failed to resolve server magento-nfs: Temporary failure in name resolution

@jingxu97

@avra911 The following are the changes I made to make the /examples/nfs test work (both edits are sketched below):

Edit examples/volumes/nfs/nfs-pv.yaml:
change the last line to path: "/"

Edit examples/volumes/nfs/nfs-server-rc.yaml:
change the image to the one that enables NFSv4:
image: gcr.io/google_containers/volume-nfs:0.8
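
A sketch of both edits; the surrounding fields are paraphrased, not copied verbatim from the example files:

# examples/volumes/nfs/nfs-pv.yaml (tail of the spec)
  nfs:
    server: 10.19.247.137   # the nfs-server Service IP, as in the original report above
    path: "/"

# examples/volumes/nfs/nfs-server-rc.yaml (container image)
    spec:
      containers:
      - name: nfs-server
        image: gcr.io/google_containers/volume-nfs:0.8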

Your error message contains Temporary failure in name resolution. Are you using an IP in the volume spec for the client pod? You could also share your YAML file with us so we can take a look.
Please let me know if you have any problems with these changes. Thanks!


avra911 commented Jan 27, 2017

Hello,

I will try again today with exactly the files from the example. If it works, I will double-check my files and come back here for more help.

@jingxu97, quick question: does it matter how the cluster is created? I am starting with gcloud container clusters create my-project --zone europe-west1-d --disk-size=40 --num-nodes=1 --machine-type n1-highcpu-2. I haven't tried on a cluster created with kube-up.sh; maybe rkt support is missing on the host.

EDITED: The files from the example work out of the box, without any modifications to the image or path. Tested on 1.5.2 (server and client) on a cluster created with cluster/kube-up.sh and using the rkt runtime.

Thank you!


wstrange commented Feb 7, 2017

Also having issues. I have made the suggested edits above, but the busybox pod cannot mount the PVC:

8b6f997-ed66-11e6-88e4-08002702efd7-nfs" (spec.Name: "nfs") pod "38b6f997-ed66-11e6-88e4-08002702efd7" (UID: "38b6f997-ed66-11e6-88e4-08002702efd7") with: mount failed: exit status 32
Mounting command: mount
Mounting arguments: 10.0.0.212:/ /var/lib/kubelet/pods/38b6f997-ed66-11e6-88e4-08002702efd7/volumes/kubernetes.io~nfs/nfs nfs []
Output: mount.nfs: an incorrect mount option was specified

This is on minikube 0.16, kube 1.5.2

Also - is there a tracking bug for

@jingxu97

@avra911 sorry that I missed your message. It does not matter how the cluster is created, I think.

@jingxu97

All users: NFSv3 is now also supported on GKE. Please give it a try and let us know if there is any problem. Thanks!


ahmetb commented Jun 27, 2017

We moved the examples to their own repo (https://github.com/kubernetes/examples) for further maintenance. However, we encourage such popular examples to host their own repo for maintenance, and/or be converted into Helm charts.

It also looks like this issue is now fixed. @jingxu97, should we close it now?

@kwiesmueller

@jingxu97 We just migrated to the new COS due to the deprecation of container-vm, and I can confirm that NFS is still not working for us. We keep getting the Output: mount.nfs: Connection timed out error on an up-to-date cluster.


jingxu97 commented Jul 5, 2017

@kwiesmueller Could you please provide more details about your NFS setup so that we can help figure out what the problem is? Thanks!

@kwiesmueller

@jingxu97 Sure thing!
The cluster is a default GCE project on 1.6.4 using COS.
The NFS server is this:

...
containers:
      - name: nfs-server
        image: gcr.io/google-samples/nfs-server:1.1
        imagePullPolicy: IfNotPresent
        ports:
        - name: nfs
          containerPort: 2049
        - name: mountd
          containerPort: 20048
        - name: rpcbind
          containerPort: 111
        securityContext:
          privileged: true
...

The Volume for the Server is a Google PD.

The Clients are using the Server like this:

volumes:
      - name: file-store
        nfs:
          server: 10.55.254.247
          path: '/exports'

Oh and the NFS Server IP is fixed in the Service:

spec:
  type: ClusterIP
  clusterIP: 10.55.254.247
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111

Oh, and there are no errors on the server side, and I cannot find any in the Google logs either...
But the Pods return this:

Warning		FailedMount	MountVolume.SetUp failed for volume "kubernetes.io/nfs/ef6d92c5-619f-11e7-a897-42010a84016a-file-store" (spec.Name: "file-store") pod "ef6d92c5-619f-11e7-a897-42010a84016a" (UID: "ef6d92c5-619f-11e7-a897-42010a84016a") with: mount failed: exit status 1
Mounting command: /home/kubernetes/containerized_mounter/mounter
Mounting arguments: 10.55.254.247:/exports /var/lib/kubelet/pods/ef6d92c5-619f-11e7-a897-42010a84016a/volumes/kubernetes.io~nfs/file-store nfs []
Output: Mount failed: Mount failed: exit status 32
Mounting command: chroot
Mounting arguments: [/home/kubernetes/containerized_mounter/rootfs mount -t nfs 10.55.254.247:/exports /var/lib/kubelet/pods/ef6d92c5-619f-11e7-a897-42010a84016a/volumes/kubernetes.io~nfs/file-store]
Output: mount.nfs: Connection timed out


jingxu97 commented Jul 5, 2017

Could you try using gcr.io/google_containers/volume-nfs:0.8 as the NFS server container image, and also use path: '/' in the client volumes? This makes sure NFSv4 is used.
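
In other words, for your setup the client volume would look roughly like this; the name and IP are taken from your manifests above:

volumes:
- name: file-store
  nfs:
    server: 10.55.254.247   # the fixed ClusterIP of the nfs-server Service
    path: "/"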

@kwiesmueller

Will try it first thing tomorrow, thanks!

@kwiesmueller

Getting this on the NFS Pod:

2017-07-06T12:37:45.365208494Z Serving /exports
2017-07-06T12:37:45.365990557Z Serving /
2017-07-06T12:37:47.626999272Z Starting rpcbind
2017-07-06T12:37:47.637040428Z /usr/local/bin/run_nfs.sh: line 18:     9 Killed                  /usr/sbin/rpcinfo 127.0.0.1 > /dev/null
2017-07-06T12:37:48.235323386Z exportfs: / does not support NFS export
2017-07-06T12:37:52.52793174Z NFS started

The client pod now only gives a timeout / NFS error:

Events:
  FirstSeen	LastSeen	Count	From							SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----							-------------	--------	------		-------
  2m		2m		1	default-scheduler						Normal		Scheduled	Successfully assigned app-1675861351-xv4qc to gke-cluster-1-default-pool-8e381c1c-4l61
  21s		21s		1	kubelet, gke-cluster-1-default-pool-8e381c1c-4l61		Warning		FailedMount	Unable to mount volumes for pod "app-1675861351-xv4qc_gartentechnik-com-test(f7006f38-6247-11e7-a897-42010a84016a)": timeout expired waiting for volumes to attach/mount for pod "gartentechnik-com-test"/"app-1675861351-xv4qc". list of unattached/unmounted volumes=[file-store]
  21s		21s		1	kubelet, gke-cluster-1-default-pool-8e381c1c-4l61		Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "gartentechnik-com-test"/"app-1675861351-xv4qc". list of unattached/unmounted volumes=[file-store]

@kwiesmueller

NFS Deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: fileserver
  namespace: {{ "NAMESPACE" | env }}
  labels:
    app: fileserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fileserver-worker
  template:
    metadata:
      labels:
        app: fileserver-worker
    spec:
      containers:
      - name: fileserver-server
        image: 'gcr.io/google_containers/volume-nfs:0.8'
        imagePullPolicy: IfNotPresent
        ports:
        - name: nfs
          containerPort: 2049
        - name: mountd
          containerPort: 20048
        - name: rpcbind
          containerPort: 111
        securityContext:
          privileged: true
        resources:
          limits:
            cpu: 200m
            memory: 100Mi
          requests:
            cpu: 10m
            memory: 10Mi
        volumeMounts:
          - mountPath: /exports
            name: files-store
        livenessProbe:
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 2049
          timeoutSeconds: 2
        readinessProbe:
          failureThreshold: 1
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 2049
          timeoutSeconds: 2
      volumes:
      - name: files-store
        gcePersistentDisk:
          fsType: "ext4"
          pdName: "{{ "NFS_PD_NAME" | env }}"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: cloud.google.com/gke-preemptible
                operator: DoesNotExist

Client:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: app
  namespace: {{ "NAMESPACE" | env }}
  labels:
    app: app
spec:
  replicas: 2
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: app-worker
  template:
    metadata:
      labels:
        app: app-worker
    spec:
      containers:
      - name: app-apache
        image: 'eu.gcr.io/project/app:{{ "VERSION" | env }}'
        imagePullPolicy: IfNotPresent
        # command: ["tail", "-f", "/var/log/dpkg.log"]
        ports:
        - name: gt
          containerPort: 8080
        - name: api
          containerPort: 8085
        resources:
          limits:
            # cpu: 4
            memory: 2Gi
          requests:
            cpu: 0.1
            memory: 0.5Gi
        volumeMounts:
          - mountPath: /mnt
            name: file-store
      volumes:
      - name: file-store
        nfs:
          server: {{ "NFS_SERVER_IP" | env }}
          path: /
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: cloud.google.com/gke-preemptible
                operator: DoesNotExist


avra911 commented Jul 6, 2017 via email

@kwiesmueller

@avra911 No, I cannot mount / on my PD, obviously... I was just following @jingxu97's advice to switch the client path.

@kwiesmueller

Never mind... it's running now.
NFSv4 works and the new container is good. @jingxu97, where could one find the Dockerfile for this one?
And the last issue, for which I posted the manifests above, was that NFS did not have enough resources and then killed itself.
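
For anyone hitting the same thing: the server Deployment above only allowed 200m CPU / 100Mi memory, and giving the NFS server container more headroom along these lines avoided that (the exact values are illustrative, not a recommendation):

        resources:
          limits:
            cpu: 500m
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 128Mi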

erik777 commented Jul 6, 2017

Thanks @jingxu97. I got it working with your client-side method instead of the PVC from the example, which never matched the PV.

Do you know if there is a way to provide input on k8s direction to help with prioritization? I'd like to see k8s provide a purely dynamic solution, including elastic scaling, with no single points of failure, for a horizontally clustered database. This NFS solution, with a static IP in the client YAML pointing to a single point of failure (the node running nfs-server), appears to be a work-around until that can be achieved.

Setting the NFS patch solution aside, I love k8s and am very hopeful its direction will address these use cases.



jingxu97 commented Jul 7, 2017

@erik777 You mentioned the PVC never matches the PV. Do you know the reason for it?

I am not sure I understand your second question. Could you please give me more details? Thanks!


erik777 commented Jul 7, 2017

@jingxu97 I could not figure out the reason. It does not produce an error. Is there a way to diagnose the matching logic of PVCs on GKE?

I think I found the answer to my second question, how to contribute to direction: https://github.com/kubernetes/community

Heck, I even found Priority column in this spreadsheet. lol


msau42 commented Aug 5, 2017

The NFS example should now be updated to work on the latest version of K8s. Can this be closed?


jingxu97 commented Aug 5, 2017

Yes, it is fixed by kubernetes/examples#30. Closing this issue.

@jingxu97 jingxu97 closed this as completed Aug 5, 2017