SEEK_DATA fails randomly #11900
@zmedico fyi
Any chance #11910 is related?
See https://bugs.chromium.org/p/chromium/issues/detail?id=1198679 for more of the nitty-gritty debugging.
@ismell it's not clear to me from the comment above, but does the patch you posted avoid the issue in your testing? One plausible way you might try to reproduce this is to write to the file while performing the [...]
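For what it's worth, that write-while-seeking experiment can be sketched as follows. This is a hedged stand-in in Python rather than portage's C extension; the function name, buffer sizes, and iteration count are all mine:

```python
import errno
import os
import threading

def seek_data_under_writes(path, iterations=200):
    """Repeatedly ask SEEK_DATA for the first data offset while a second
    thread appends to the file; return how many calls failed to see the
    data known to be at offset 0."""
    with open(path, "wb") as f:
        f.write(b"x" * 4096)          # guarantee data at offset 0

    stop = threading.Event()

    def writer():
        with open(path, "ab") as f:
            while not stop.is_set():
                f.write(b"y" * 4096)
                f.flush()

    t = threading.Thread(target=writer)
    t.start()
    failures = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        for _ in range(iterations):
            try:
                if os.lseek(fd, 0, os.SEEK_DATA) != 0:
                    failures += 1     # data at offset 0 was not reported
            except OSError as e:
                if e.errno == errno.ENXIO:
                    failures += 1     # the spurious ENXIO from this issue
                else:
                    raise
    finally:
        stop.set()
        t.join()
        os.close(fd)
    return failures
```

On a filesystem behaving correctly this should always return 0, since there is always data at offset 0; a nonzero count would indicate the race.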
No, the patch above was just to confirm that the pointer was at the start of the file. I haven't yet been able to find an easy way to reproduce the problem. There should be no writers to the input file at this point. I'll keep working on trying to find an easy way to reproduce it. Is there anything else I should be tracing that would help?
One thing you could try, to rule out the dirty data case, would be to add an fsync().
Thanks for the tip. So I wrote a script that calls [...]. No patch: [...] With fsync: [...]

diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 352342c06..004b8618a 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -140,8 +140,10 @@ do_lseek_data(int fd_out, int fd_in, off_t *off_out) {
/* Use lseek SEEK_DATA/SEEK_HOLE for sparse file support,
* as suggested in the copy_file_range man page.
*/
- off_t offset_data, offset_hole;
+ off_t offset_data, offset_hole, current;
+ fsync(fd_in);
+ current = lseek(fd_in, 0, SEEK_CUR);
offset_data = lseek(fd_in, *off_out, SEEK_DATA);
if (offset_data < 0) {
if (errno == ENXIO) {

It looks like adding [...]
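In Python terms, the workaround being probed by that diff amounts to flushing the input fd before the seek. A minimal sketch, with the caveat that the helper name is mine and not portage's API:

```python
import errno
import os

def first_data_offset(fd, offset=0, sync_first=True):
    """Return the first data offset at or after `offset`, or None when
    lseek reports ENXIO (no data at or after `offset`)."""
    if sync_first:
        os.fsync(fd)  # flush dirty pages so the filesystem's view is current
    try:
        return os.lseek(fd, offset, os.SEEK_DATA)
    except OSError as e:
        if e.errno == errno.ENXIO:
            return None  # offset is at/past EOF, or only holes remain
        raise
```

The fsync() is the interesting knob here: if the spurious ENXIO disappears with `sync_first=True`, that points at dirty, not-yet-synced data confusing the hole-reporting path.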
@ismell if you're comfortable rebuilding ZFS for testing, I could provide a debug patch which should give us a little more insight.
I'm sure I can figure it out :)
In that case, would you mind testing commit f73e0a6 from the https://github.com/behlendorf/zfs/tree/zfs-holey branch? It implements a possible fix by closing a small race. It's only been lightly tested, since I wasn't able to reproduce the issue.
Sorry for the long cycle time; I was waiting until I was forced to reboot my computer. So I tried [...]. This normally reproduced within a few iterations. So maybe it was fixed already?
To be clear, you're saying that even without the patch you're unable to reproduce the issue now? There weren't any other recent changes in this area which I would have expected to fix this.
The dn_struct_rwlock lock which prevents the dnode structure from changing should be held over the dirty check and the subsequent call to dnode_next_offset(). This ensures that a clean dnode can't be dirtied before the data/hole is located. Furthermore, refactor the code to provide a dnode_is_dirty() helper function which performs the multilist_link_active() check while holding the correct multilist lock. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue openzfs#11900
So I went back to the original version: [...] It looks like I can't reproduce the issue anymore... I'm wondering if rebooting helped clear whatever bad state I was in...
I've started hitting a similar issue, which manifests as [...]. More precisely:
I eventually somehow figured I should try downgrading coreutils from 9.0 to 8.32 (-r1 in Gentoo), which made the issue disappear: rebuilding Go worked! coreutils 9.0 changed some default behaviours around reflinking (which ZFS doesn't have) and sparse files, but I tried forcing them off in the [...]. I then decided to bisect and arrived at: [...]

I've straced Portage (just for the lseek syscall) while installing Go and it is filled with: [...]

TL;DR: I seem to be hitting this bug as a result of changes in coreutils which expose the problem. Please let me know if you feel this is a different issue and I should open a new bug.

System information (emerge --info):

Portage 3.0.23 (python 3.10.0-candidate-2, default/linux/amd64/17.1/hardened, gcc-11.2.0, glibc-2.33-r7, 5.14.7-adry x86_64)
=================================================================
System uname: Linux-5.14.7-adry-x86_64-11th_Gen_Intel-R-_Core-TM-_i7-11800H_@_2.30GHz-with-glibc2.33
KiB Mem: 16105692 total, 1991680 free
KiB Swap: 0 total, 0 free
Timestamp of repository gentoo: Sun, 26 Sep 2021 20:51:27 +0000
Head commit of repository gentoo: fe49ef8e5b7b42f9fcb1f7bf36fbf4c571b8b120

EDIT: Interestingly, upstream made a kind-of-related remark in a commit after the release. Upstream commit mentioning ZFS tests: cp/sparse-perf: make more robust and add zfs comments
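The relevant coreutils 9.0 change boils down to detecting sparseness with an lseek(2) extent walk instead of the old heuristics. The core of such a walk looks roughly like this; it is a sketch of the general technique, not coreutils' actual code:

```python
import errno
import os

def data_extents(path):
    """Yield (start, end) byte ranges of the file that contain data,
    the way a SEEK_DATA/SEEK_HOLE sparse copy would enumerate them."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    return  # only a trailing hole remains
                raise
            end = os.lseek(fd, start, os.SEEK_HOLE)
            yield (start, end)
            offset = end
    finally:
        os.close(fd)
```

A spurious ENXIO at offset 0, as reported in this issue, makes a walk like this yield nothing at all, which is exactly how a sparse-aware cp can end up writing an empty destination file.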
@behlendorf I've tried applying the patch you suggested to ZFS 2.1.1, but unfortunately, while it applies, it ends up failing at build time. Is there any chance you could please rebase it? The failure looks like: [...]
@thesamesam thanks for continuing the investigation of this issue. I've rebased my https://github.com/behlendorf/zfs/tree/zfs-holey branch and resolved the build failure (commit 1722451). If you can give it a try, that would be great.
@behlendorf Thanks! (Note: I hadn't been able to reproduce this on other ZFS machines with pretty similar setups, so I/O speed might be a factor, which I guess is obvious given it's racy.) Applied to 2.1.1 and it built fine (rebuilt initramfs and so on, and rebooted), but no luck: same corrupted [...]. I'm happy to enable more debugging as necessary, especially given I know this is tricky when you can't reproduce it on your end. Here's a partial strace during this cp line from the ebuild (I've just picked out the bits around copying the corrupted binaries, but I can get you more if it's useful): [...] EDIT: I did out of curiosity try [...]
openzfs/zfs#11900 Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
That's very surprising. When set, ZFS should always report the correct holes, at the cost of performance. Were you able to reproduce this behavior with both the unmodified and the patched code?
Yeah, I just tried again unpatched and it happens regardless of the value of the toggle (0 or 1). Happy to try enabling more debugging if you can think of a way to toggle it on/off easily during the merge.
Looking at the strace output, what appears to be happening is: [...]

The story is similar for [...].
Ah, it gets renamed during the build process: [...] Here's the output for 10 lines after every instance of the path being mentioned: [...] You're welcome to the full 550MB of strace spew if you want it, rather than me trying to clumsily filter, but I appreciate that's no holiday for you either!
The dn_struct_rwlock lock which prevents the dnode structure from changing should be held over the dirty check and the subsequent call to dnode_next_offset(). This ensures that a clean dnode can't be dirtied before the data/hole is located. Furthermore, refactor the code to provide a dnode_is_dirty() helper function which checks the dnode for any dirty records to determine its dirtiness. Finally, hold the znode's range lock to prevent any raciness with concurrent reads or writes. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue openzfs#11900
@thesamesam would you mind trying this updated patch with your test setup: 6338227. It would be great if you could try it with [...]
I absolutely cannot reproduce it any more with the patch in #12745, in any case I've tested.
Same here: no issues with master + #12745. No opinion on the default, although it may be worth revisiting it once things have calmed down.
For sure we need tests to cover these scenarios going forward, as they are hard to reproduce on live systems.
Bit of an out-there idea: it would be interesting to see if we could get a CI box fast enough to try doing emerges (in Gentoo) that could hit such problems. Of course, a more isolated test case would be great, even if it were just calling cp in a loop.
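A self-contained version of that loop idea might look like the sketch below. Note the assumptions: shutil.copyfile is only a stand-in here (it does not do SEEK_DATA-based sparse detection the way coreutils cp --sparse=auto does, which is what actually triggers the bug), and the function name is mine:

```python
import filecmp
import os
import shutil
import tempfile

def copy_compare_loop(src, iterations=20):
    """Copy `src` repeatedly and byte-compare each copy; return the
    iteration index of the first mismatch, or None if all copies match."""
    with tempfile.TemporaryDirectory() as d:
        dst = os.path.join(d, "copy")
        for i in range(iterations):
            shutil.copyfile(src, dst)
            if not filecmp.cmp(src, dst, shallow=False):
                return i  # corruption appeared on this iteration
            os.unlink(dst)
    return None
```

A real regression test would swap the copy step for an invocation of coreutils cp (or the SEEK_DATA extent walk itself) and run on ZFS under concurrent write load.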
It would be very interesting to work out what actually happens in the emerge case and build a test case around that, rather than just guesstimating what could be wrong in the code and fixing every possible slip, probably introducing regressions.
I presume you mean something much more constrained than what I'm taking from your comment; otherwise you seem to be proposing solving a somewhat general open AI problem in order to avoid this regression.
I mean that if we know where the bug is in the code, we won't go "fixing" other parts that merely look like the source of the bug (of course we could find bugs in other places, but they would be irrelevant to the one being hunted) and thereby introduce regressions if we are wrong. As a bonus, we would have a real-world test case to implement, instead of guesstimating what a test case for this bug should look like.
When using lseek(2) to report data/holes memory mapped regions of the file were ignored. This could result in incorrect results. To handle this zfs_holey_common() was updated to asynchronously writeback any dirty mmap(2) regions prior to reporting holes. Additionally, while not strictly required, the dn_struct_rwlock is now held over the dirty check to prevent the dnode structure from changing. This ensures that a clean dnode can't be dirtied before the data/hole is located. The range lock is now also taken to ensure the call cannot race with zfs_write(). Furthermore, the code was refactored to provide a dnode_is_dirty() helper function which checks the dnode for any dirty records to determine its dirtiness. Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue openzfs#11900 Closes openzfs#12724
In addition to flushing memory mapped regions when checking holes, commit de198f2 modified the dirty dnode detection logic to check the dn->dn_dirty_records instead of the dn->dn_dirty_link. Relying on the dirty record has not been reliable; switch back to the previous method. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #11900 Closes #12745
take two Bug: https://bugs.gentoo.org/815469 openzfs/zfs#11900 Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
This can probably be closed now, unless someone's reproduced it since the fixes?
This reverts commit 4b3133e671b958fa2c915a4faf57812820124a7b upstream. See #14753 - possible corruption again, very similar symptoms to the nightmare that was #11900 and same area of code. We can safely revert it as it's an optimisation rather than a bugfix in itself. Keen Gentoo users may remember the following commits (in Gentoo): - 8e5626d - 9fb275f - 95c250a Bug: openzfs/zfs#14753 Bug: openzfs/zfs#11900 Signed-off-by: Sam James <sam@gentoo.org>
System information
Describe the problem you're observing
lseek + SEEK_DATA is randomly returning -ENXIO.

Code showing the problem: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/third_party/portage_tool/src/portage_util_file_copy_reflink_linux.c;l=145
I patched it with the following: [...]

On a successful run, this is what the strace looks like (depthcharge.elf): [...]

When a failure happens, this is what it looks like (netboot.elf): [...]
man lseek shows: [...]

Describe how to reproduce the problem
I don't have a simple way to reproduce this yet. It happens as part of a portage build. I have attached a full strace log. The problem shows up randomly, so running the same command multiple times will cause different files to fail to copy.

Include any warning/errors/backtraces from the system logs
depthcharge-strace-patch1.log
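Tying the man-page rule to the failure above: ENXIO from SEEK_DATA is only legitimate when the offset is at or past EOF, so a check like the following encodes the invariant this bug violates. This is a sketch; the function name is mine, and a fully sparse (all-hole) file is deliberately ignored here:

```python
import errno
import os

def seek_data_invariant_holds(path):
    """SEEK_DATA at offset 0 must find data in any non-empty, non-sparse
    file, so ENXIO there is only acceptable for an empty file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        try:
            os.lseek(fd, 0, os.SEEK_DATA)
            return True               # found data, as expected
        except OSError as e:
            if e.errno != errno.ENXIO:
                raise
            # ENXIO is legitimate only when there genuinely is no data
            # (a fully sparse file would also be legitimate; ignored here).
            return size == 0
    finally:
        os.close(fd)
```

Run against a freshly built netboot.elf, a check like this returning False would demonstrate the spurious ENXIO without needing the full portage build.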