NFSv3 mount hangs in 3510.2.1 NFSv4.1 works fine #1444

Open
kthommandra opened this issue May 15, 2024 · 6 comments
Labels
kind/bug Something isn't working

Comments

@kthommandra

Description

sudo mount -t nfs -overs=3 NFSSERVER:/SHARE /tmp/mountdir

The above command hangs. Dumping /proc/PID/stack for the hung mount process shows the following:

[<0>] lockd_up+0x1f/0x2b0 [lockd]
[<0>] nlmclnt_init+0x28/0xc0 [lockd]
[<0>] nfs_start_lockd+0xdd/0x120 [nfs]
[<0>] nfs_init_server.isra.0+0x1f3/0x350 [nfs]
[<0>] nfs_create_server+0x7f/0x220 [nfs]
[<0>] nfs3_create_server+0xc/0x60 [nfsv3]
[<0>] nfs_try_get_tree+0x12c/0x210 [nfs]
[<0>] vfs_get_tree+0x22/0xc0
[<0>] path_mount+0x469/0xa40
[<0>] __x64_sys_mount+0x107/0x140
[<0>] do_syscall_64+0x38/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x61/0xcb
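For anyone trying to capture the same information, here is a minimal sketch of how such a dump can be collected. It assumes the hung process is the `mount.nfs` helper that mount(8) runs for `-t nfs`; the function names `dump_stacks` and `stuck_in_lockd_up` are made up for illustration. Reading /proc/PID/stack normally requires root.

```shell
#!/bin/sh
# Print the kernel stack of every process whose command name matches $1.
# Default name is mount.nfs (assumption: that is the process that hangs).
dump_stacks() {
    name="${1:-mount.nfs}"
    for pid in $(pgrep -x "$name"); do
        echo "== PID $pid =="
        cat "/proc/$pid/stack"
    done
}

# Return 0 if a saved stack dump (file path in $1) contains a lockd_up frame,
# as in the trace above.
stuck_in_lockd_up() {
    grep -q 'lockd_up+' "$1"
}
```

From a root shell, `dump_stacks > /tmp/stack.txt` followed by `stuck_in_lockd_up /tmp/stack.txt && echo "waiting in lockd_up"` would flag the same condition.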

rpc-statd.service is running on the node.

If I re-run the above command with -o nolock then the mount succeeds.

If I use NFSv4 (different NFS server and share) then the mount succeeds.

I tested the same NFSv3 mount on one of the older nodes in our cluster, which runs 2983.2.1, and the v3 mount works fine there.

Impact

This bug prevents mounting NFSv3 shares. For one of our NFS shares we are required to use v3 only.

Environment and steps to reproduce

  1. sudo mkdir /tmp/nfsv3
  2. sudo mount -t nfs -overs=3 nfs-server:/nfs-share /tmp/nfsv3

The command stalls, but it can be interrupted with Ctrl-C.
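To make the reproduction scriptable instead of waiting on a stalled terminal, the mount can be wrapped in a deadline. This is only a sketch: `classify_run` is a hypothetical helper name, it relies on GNU coreutils `timeout` (which exits with status 124 when the deadline expires), and it assumes the hung mount reacts to the SIGTERM that `timeout` sends the same way it reacts to Ctrl-C.

```shell
#!/bin/sh
# Run a command with a deadline and classify the result.
# Prints "ok", "hang" (deadline expired), or "fail:<exit status>".
# Usage: classify_run SECONDS COMMAND [ARGS...]
classify_run() {
    secs="$1"; shift
    status=0
    timeout "$secs" "$@" >/dev/null 2>&1 || status=$?
    case "$status" in
        0)   echo "ok" ;;
        124) echo "hang" ;;          # GNU timeout: deadline expired
        *)   echo "fail:$status" ;;
    esac
}

# Example, from a root shell (server and share names as in the steps above):
#   classify_run 60 mount -t nfs -o vers=3 nfs-server:/nfs-share /tmp/nfsv3
#   classify_run 60 mount -t nfs -o vers=3,nolock nfs-server:/nfs-share /tmp/nfsv3
```

Running both variants on the same node separates a general v3 failure from one that only appears when locking is enabled.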

Expected behavior

Mounting with NFSv3 should succeed.

Additional information

@kthommandra kthommandra added the kind/bug Something isn't working label May 15, 2024
@kthommandra
Author

Could someone look into this?

@kthommandra
Author

bump

@tormath1
Contributor

@kthommandra hello, I can't reproduce the issue via a Kubernetes deployment. I tried with two different kernel versions:

$ uname -r
5.15.154-flatcar
$ kubectl exec -ti test-pod-1 -- mount | grep nfs
10.109.64.37:/export/pvc-61c8ad64-c700-4bdb-b996-586ef8bf59f0 on /test type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.109.64.37,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.109.64.37)
$ uname -r
6.6.30-flatcar
$ kubectl exec -ti pods/test-pod-1 -- mount | grep nfs
10.101.210.186:/export/pvc-10854062-5dc8-4dfe-99c3-62dedcf956ee on /test type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.101.210.186,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.101.210.186)
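When comparing mounts across nodes, it can help to pull individual options out of these lines rather than reading them by eye. The helper below is a small sketch; `mount_opt` is a name made up here, and it assumes the `mount` output format shown above, with the options in a trailing parenthesised, comma-separated list.

```shell
#!/bin/sh
# Print the value of option $2 from a single line of mount(8) output ($1).
# Prints nothing if the option is absent or has no "=value" part.
mount_opt() {
    printf '%s\n' "$1" \
        | sed -n 's/.*(\(.*\))$/\1/p' \
        | tr ',' '\n' \
        | sed -n "s/^$2=//p"
}

# Example:
#   line=$(mount | grep ' type nfs ' | head -n 1)
#   mount_opt "$line" vers
#   mount_opt "$line" local_lock
```

As far as nfs(5) describes it, `local_lock=none` on a mount without `nolock` means NLM locking is in use, so this is a quick way to check that a non-reproducing setup follows the same locking path as the failing one.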

Did you try with newer Flatcar versions?

@ader1990

Hello @kthommandra,

In addition to @tormath1, I have also tried to reproduce the behaviour using:

There were no hangs in my testing.

@kthommandra, can you please elaborate on how you installed the NFS server, or which NFS hardware box you are using?
Maybe there are logs on the NFS box that could help us debug this?

Thanks.

@kthommandra
Author

kthommandra commented Jun 11, 2024

> Did you try with newer Flatcar versions?

I have not tried newer Flatcar versions yet. Will do.
The kernel version in 3510.2.1 is 5.15.106-flatcar. Can you try with this version?
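To keep release and kernel versions paired when reporting results from different nodes, something like the following can be run on each one. It is a sketch only: `node_versions` is a hypothetical helper, and it assumes the node's os-release file carries a plain `VERSION=` line.

```shell
#!/bin/sh
# Print "<OS release> <kernel release>" for the current node.
# $1: path to an os-release file (defaults to /etc/os-release).
node_versions() {
    release=$(sed -n 's/^VERSION=//p' "${1:-/etc/os-release}" | tr -d '"')
    echo "$release $(uname -r)"
}

# Example:
#   node_versions
```

Going by the numbers above, an affected node should report 3510.2.1 together with 5.15.106-flatcar.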

@kthommandra
Author

kthommandra commented Jun 11, 2024

> Can you please elaborate on how you installed the NFS server, or which NFS hardware box you are using? Maybe there are logs on the NFS box that could help us debug this?

We are using an NFS appliance, not a Linux NFS server.
We are working with the vendor to collect logs.

Projects
Status: 📝 Needs Triage