Unable to use NFS shares with Zpool #12509
Comments
Right, I was looking at that, then got sidetracked by something catastrophic. I guess I'll go back to looking at it... Aside: I wouldn't call Debian 11 with a Debian 9 kernel Debian 11...
I would have to agree, but due to driver issues with the host bus adapter that interfaces with the disks in this pool, I am unable to upgrade to kernels above the installed one. I haven't been able to diagnose it, as it cripples my ZFS pool.
Hey, check this out. So apparently a workaround is {r,w}size=131072, for now. Because if I mount with those (I was mounted with =256k before)... Still digging...

e: 1c2358c seems to be the rotten commit. That's unfortunate...

e2: So let's try to narrow down where it's broken - it works on 5.10.48, broken on 4.9.0-12-amd64, broken on 4.14.100, broken on 4.14.232 - gonna test it on 4.19.0-17-amd64 and then examine the config.logs to see what the differing codepaths involved might be...

FYI @behlendorf it seems like 83b91ae is still pretty broken on at least 4.14 and 4.9. (In particular, zfs_uiomove_iter is spitting back EFAULT on what I presume to be the second iteration, since it always appears to follow the pattern of "one call succeeds, one immediately following call fails with EFAULT".)

I'm going back to printf debugging after spending far too much time trying and failing to convince systemtap to look at the struct definitions...
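For anyone who wants to try that workaround, here's a hedged sketch of the remount; the server name, export path, and mount point are placeholders, and the relevant part is just capping rsize/wsize at 131072:

```sh
# Remount the NFS share with 128K rsize/wsize instead of 256K or larger;
# "server:/mediaStorage/Files" and "/mnt/files" are placeholders.
mount -t nfs -o rsize=131072,wsize=131072 server:/mediaStorage/Files /mnt/files
```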
It seems like what happens when I try doing bs=1M iflag=direct over NFS with [rw]size=1048576, on my Debian 9 and 10 testbeds, is something like:
If we hardcode size to 64k, then we don't run into this on 4.9 or 4.19. If we hardcode size to 128k, then we run into it on both. The next experiment is going to be patching all that debug information into the pre-1c2358c tree to see how it functioned, running this on e.g. 5.10 to see how it looks there, and then looking at how this behaves for local accesses. I'm tempted to just suggest not returning EAGAIN if we read any bytes, but I don't feel confident that won't have some side effect somewhere...
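For reference, the read pattern being described is just a direct-I/O dd from the NFS mount; a sketch, assuming the share is mounted at /mnt/files with rsize=1048576 and contains some large testfile (both placeholders):

```sh
# 1M O_DIRECT reads from the NFS mount; on the affected kernels this
# stalls after the first partial (64K) copy instead of completing.
dd if=/mnt/files/testfile of=/dev/null bs=1M iflag=direct
```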
Currently, dmu_read_uio_dnode can read 64K of a requested 1M in one loop, get EFAULT back from zfs_uiomove() (because the iovec only holds 64k), and return EFAULT, which turns into EAGAIN on the way out. EAGAIN gets interpreted as "I didn't read anything", the caller tries again without consuming the 64k we already read, and we're stuck. This apparently works on newer kernels because the caller which breaks on older Linux kernels by happily passing along a 1M read request and a 64k iovec just requests 64k at a time. With this, we now won't return EFAULT if we got a partial read.

Fixes: openzfs#12509
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Currently, dmu_read_uio_dnode can read 64K of a requested 1M in one loop, get EFAULT back from zfs_uiomove() (because the iovec only holds 64k), and return EFAULT, which turns into EAGAIN on the way out. EAGAIN gets interpreted as "I didn't read anything", the caller tries again without consuming the 64k we already read, and we're stuck. This apparently works on newer kernels because the caller which breaks on older Linux kernels by happily passing along a 1M read request and a 64k iovec just requests 64k at a time. With this, we now won't return EFAULT if we got a partial read.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes openzfs#12370
Closes openzfs#12509
Closes openzfs#12516
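For anyone following along, a quick hedged way to check which OpenZFS userland and kernel-module builds are actually in use, so you can tell whether you are running a build that contains the commits above:

```sh
# Print the userland tool version and the loaded kernel-module version.
zfs version

# Alternatively, query the loaded module directly.
cat /sys/module/zfs/version
```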
System information
Describe the problem you're observing
I have a single raidz2 pool that, after upgrading to the latest version, no longer works properly with NFS shares. This affects both nfs-server and sharenfs= in ZFS. The issue is that when I share a folder or dataset with NFS, I cannot copy or move files FROM the share. I am, however, able to read and write to the share 'normally', although it's slow.
If I copy a file from the share, the process basically hangs: it creates a zero-byte file, never completes the operation, and no data is moved. This does not occur on ZFS version 0.7 from Debian's repos. This also affects newly created pools.
I did find the discussion below, which sounded like my issue:
#12370
Here is the output from zfs get all mediaStorage/Files:
https://pastebin.com/NffcaeUs
Describe how to reproduce the problem
Compile ZFS 2.1.0-1 using these scripts:
https://github.com/kneutron/ansitest/blob/master/debian-compile-zfs--boojum.sh
https://github.com/kneutron/ansitest/blob/master/ubuntu_zfs_build_install.sh
on a Debian machine. Create a pool and share it with nfs-server or sharenfs=. Mount the share on another machine on the network, or locally via 127.0.0.1, and try to copy a file from the share to the local machine.
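A hedged sketch of those steps; the pool name, disks, dataset, mount point, and file names are all placeholders:

```sh
# Create a raidz2 test pool and a dataset, and export it over NFS
# via the sharenfs property (an NFS server must be installed).
zpool create testpool raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde
zfs create testpool/share
zfs set sharenfs=on testpool/share

# Put a file on the share, mount the export over loopback, and try
# to copy the file back out; on affected builds the copy hangs and
# leaves a zero-byte file at the destination.
cp /some/large/file /testpool/share/
mkdir -p /mnt/nfs
mount -t nfs 127.0.0.1:/testpool/share /mnt/nfs
cp /mnt/nfs/file /tmp/
```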